A while ago, we submitted a project description of our Digital History project Dutch Ships and Sailors to the DHCommons journal. This week the first issue of the journal was published, containing our paper “The Dutch Ships and Sailors project”.
This is a nice companion piece to the more technical description of the dataset, which was published in the proceedings of ISWC 2014. The new paper focuses more on the general setup of the project and on its considerations and innovations from a historical point of view.
Since submission of this ‘mid-term project description’, the DSS data cloud has been expanding, and the ‘development’ version of the triple store now hosts six datasets thanks to the work of Jeroen Entjes (see the datacloud figure).
[This post was written by Jeroen Entjes and describes his MSc thesis research]
Dutch maritime supremacy during the Dutch Golden Age has had a profound influence on the modern Netherlands and possibly other places around the globe. As such, much historical research has been done on the matter, facilitated by the thorough records that many ports kept of their shipping. As more of these records are digitized, new ways of exploring the data become possible.
Screenshot showing an entry from the Elbing website
This master project uses one such way. Building on the Dutch Ships and Sailors project, digitized maritime datasets have been converted to RDF and published as Linked Data. Linked Data refers to structured data on the web that is published and interlinked according to a set of standards. The conversion was based on requirements for the data, drawn up together with historians from the Huygens ING Institute, which provided the datasets. The datasets chosen were those of Archangel and Elbing, as these offer information on the Dutch Baltic trade, the cradle of the Dutch merchant navy that sailed the world during the Dutch Golden Age.
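To give an impression of what such a conversion involves, here is a minimal sketch in Python that turns one tabular record into RDF triples in N-Triples syntax. It is an illustration only: the namespaces, the `Voyage` class, the property names, and the example record are all assumptions made for this sketch, not the URIs, vocabulary, or tooling actually used in the project.

```python
from urllib.parse import quote

# Hypothetical namespaces: the real DSS URIs and vocabulary may differ.
BASE = "http://example.org/dss/elbing/"
VOCAB = "http://example.org/dss/vocab/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def literal(value):
    """Serialize a Python string as an N-Triples plain literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def record_to_ntriples(record):
    """Turn one voyage record (a dict) into a list of N-Triples lines."""
    subject = f"<{BASE}voyage/{quote(str(record['id']))}>"
    lines = [f"{subject} <{RDF_TYPE}> <{VOCAB}Voyage> ."]
    for key, value in record.items():
        if key == "id" or value in (None, ""):
            continue  # skip the identifier and empty cells
        lines.append(f"{subject} <{VOCAB}{quote(key)}> {literal(str(value))} .")
    return lines


if __name__ == "__main__":
    # Invented example record, not taken from the Elbing or Archangel data.
    example = {"id": 42, "shipName": "De Hoop", "year": "1745", "cargo": ""}
    print("\n".join(record_to_ntriples(example)))
```

Each non-empty column becomes one triple about the voyage, so the same function can be applied row by row to a whole table. In practice the column names would be mapped to terms from a shared vocabulary rather than used directly.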
Along with requirements for the data, the historians were also interviewed to gather research questions that combined datasets could help solve. The goal of this research was to see if additional datasets could be linked to the existing Dutch Ships and Sailors cloud and if such a conversion could help solve the research questions the historians were interested in.
Data visualization showing shipping volume of different datasets.
As part of this research, the datasets have been converted to RDF and published as Linked Data as an addition to the Dutch Ships and Sailors cloud, and a set of interactive data visualizations has been made to answer the historians’ research questions. Based on the conversion, a set of recommendations is given on how to convert new datasets and add them to the Dutch Ships and Sailors cloud. All data representations and conversions have been evaluated by historians to assess their effectiveness.
The data visualizations can be found at http://www.entjes.nl/jeroen/thesis/. Jeroen’s thesis can be found here: MSc thesis Jeroen Entjes
This year’s third issue of E-Data and Research magazine features an article about the Dutch Ships and Sailors project. The article (in Dutch) describes how our project provides new ways of interacting with Dutch maritime data. So far, four datasets are present in the DSS data cloud, but we are currently extending the cloud with two new datasets. More on that later…
In the same issue, there is an article about the workshop around newspaper data as provided by the National Library. This includes a picture of me presenting the DIVE project.
You can read these articles and much more in the June 2015 issue of E-Data and Research. Back issues are available at www.edata.nl.
For the LODLAM challenge, I submitted this entry on Dutch Ships and Sailors, which now includes a more user-friendly interface. You can watch me explain the project and demonstrate the new interface in this five-minute video. You can also vote for this project by clicking “like” at the entry page.
Who knew publishing Open Data could be so rewarding? The good people at DANS sent me a cake because I was the first to publish research data under Open Access (read more about this on OpenAccess.nl). This data was the result of a very small research project.
The goal of the “Diepere Maritieme Data” (DMD) project was to enrich the CLARIN Dutch Ships and Sailors (DSS) Linked Data cloud with links from DSS records to scans of the original archival documents from which the data was digitized. Specifically, we enriched the subset “Noordelijke Monsterrollen Database” (Northern Muster Rolls Databases), created by historian Jurjen Leinenga, which was converted to an RDF dataset within the DSS project (Persistent Identifier: urn:nbn:nl:ui:13-czhm-ug URL: https://easy.dans.knaw.nl/ui/datasets/id/easy-dataset:57617).
[This post was written by Andrea Bravo Balado and is cross-posted at her own blog. It describes her MSc project, which I supervised]
Linking historical datasets and making them available on the Web has increasingly become a subject of research in the field of digital humanities. In the Netherlands, history is intimately tied to maritime activity, which has been essential to the economic, social and cultural development of Dutch society. As such an important sector, it has been well documented by shipping companies, governments, newspapers and other institutions.
In this master project we assume that, given the importance of maritime activity in everyday life in the 19th and 20th centuries, announcements of ship departures and arrivals, as well as mentions of accidents or other events, can be found in newspapers.
We have taken a two-stage approach: first, a heuristic-based method for record linkage, and then machine-learning algorithms for article classification, used for filtering in combination with domain features. Evaluation of the linking method has shown that certain domain features were indicative of mentions of ships in newspapers. Moreover, the classifiers achieved near-perfect precision in predicting ship-related articles.
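As a rough illustration of how such a two-stage pipeline can be put together, consider the following minimal Python sketch. The features (ship name, captain name, publication date close to departure), the weights, and the threshold are assumptions chosen for the example and are not the ones used in the thesis. The `is_ship_related` parameter is a hypothetical placeholder for a trained article classifier.

```python
from datetime import date, timedelta


def link_score(ship, article, window_days=14):
    """Score one ship/article pair using simple, assumed domain features."""
    text = article["text"].lower()
    score = 0
    if ship["name"].lower() in text:
        score += 2  # the ship's name is mentioned
    if ship.get("captain") and ship["captain"].lower() in text:
        score += 2  # the captain's name is mentioned
    if abs(article["date"] - ship["departure"]) <= timedelta(days=window_days):
        score += 1  # published close to the recorded departure
    return score


def candidate_links(ships, articles, threshold=3, is_ship_related=None):
    """Stage one: heuristic linkage. Stage two: optional classifier filter."""
    links = []
    for ship in ships:
        for article in articles:
            score = link_score(ship, article)
            if score < threshold:
                continue  # heuristic evidence too weak
            if is_ship_related is not None and not is_ship_related(article):
                continue  # classifier says the article is not about shipping
            links.append((ship["id"], article["id"], score))
    return links


if __name__ == "__main__":
    # Invented records for illustration only.
    ships = [{"id": "s1", "name": "Maria", "captain": "Visser",
              "departure": date(1850, 5, 1)}]
    articles = [
        {"id": "a1", "text": "Vertrokken: het schip Maria, kapt. Visser",
         "date": date(1850, 5, 3)},
        {"id": "a2", "text": "Maria werd geboren", "date": date(1900, 1, 1)},
    ]
    print(candidate_links(ships, articles))
```

A common ship name such as "Maria" matches many unrelated articles on its own, which is why the sketch requires additional evidence before proposing a link and leaves room for a classifier to filter the remaining candidates.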
Enriching historical ship records with links to newspaper archives is significant for the digital history community, since it connects two datasets that would otherwise have required extensive annotation work and many hours of manual effort to align. Our work is part of the Dutch Ships and Sailors Linked Data Cloud project. Check out Andrea’s thesis [pdf].
A bit belated, but on April 2nd, we organized a ‘datathon’: a one-day event where we sat down with pizzas and laptops to link up multiple digital history datasets. The Dutch Ships and Sailors data was one of those datasets. Other participants included Niels Ockeloen from the BiographyNet project, Albert Merono from the CEDAR project and Chris Dijkshoorn, who brought Linked Data from the Naturalis and Rijksmuseum collections. External participants included Ivo Zandhuis from gemeentegeschiedenis.nl.
Linking and eating pizza
The end results are promising:
- 5000+ links from people in the BiographyNet RDF data to people in the Rijksmuseum RDF data
- 2 links from Dutch Ships and Sailors to Rijksmuseum collections
- 61 links from Dutch Ships and Sailors ranks to CEDAR HISCO ‘occupation’ URIs
- 1320 links of CEDAR municipalities (by Amsterdamse Code) to gemeentegeschiedenis.nl municipalities
- 33 links of ICONCLASS (used by Rijksmuseum) to HISCO occupations
We hope to expand this datacloud in the near future and show the added value of such an interconnected digital history cloud for historical research and the general public. You can read more at Albert’s blog or on Ivo Zandhuis’ blog, Hic Sunt Leones.
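For readers curious what producing links like those listed above might look like in code, here is a minimal Python sketch that proposes links between two vocabularies by matching normalized labels. This is not necessarily how the links were made at the datathon. The example URIs and labels are invented, and `skos:closeMatch` is just one reasonable choice of linking predicate.

```python
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"


def normalize(label):
    """Lowercase a label and collapse whitespace so near-identical labels match."""
    return " ".join(label.lower().split())


def propose_links(source, target, predicate=SKOS_CLOSE_MATCH):
    """Return N-Triples lines linking source URIs to target URIs with equal labels.

    Both arguments are dicts mapping a URI to its label.
    """
    index = {}
    for uri, label in target.items():
        index.setdefault(normalize(label), []).append(uri)
    triples = []
    for uri, label in source.items():
        for match in index.get(normalize(label), []):
            triples.append(f"<{uri}> <{predicate}> <{match}> .")
    return triples


if __name__ == "__main__":
    # Invented URIs and labels, for illustration only.
    ranks = {"http://example.org/dss/rank/matroos": "Matroos "}
    occupations = {
        "http://example.org/hisco/1": "matroos",
        "http://example.org/hisco/2": "kok",
    }
    print("\n".join(propose_links(ranks, occupations)))
```

Links proposed this way are only candidates. Exact label matching misses spelling variants and can produce false matches, so manual review by a domain expert remains important.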
The e-history Linked Data cloud visualized
Last week saw the kickoff of the new Clarin NL-funded project “Dutch Ships and Sailors” (*). This project will run for one year and gives me the opportunity to work with historians from both VU and Huygens ING on applying Linked Data principles to Dutch maritime-historical data. From the official description:
As a sea-faring nation, a large portion of Dutch history is found on the water. However, much of the digitized historical source material is still scattered across many databases and archives. This curation and demonstrator project aims to bring together the rich maritime historical data preserved in the many different databases. We propose a (semantic) web-based infrastructure that will house various maritime-historical datasets. We will provide a tool chain and methodology for converting legacy datasets. The infrastructure includes common vocabularies to normalize and enrich existing data. Links are established between the datasets and to other relevant datasets on the Web. Although the infrastructure will be set up to facilitate 25+ identified datasets, we initially populate the infrastructure with four selected datasets. These will allow us to investigate two case studies in order to answer the historical research question “To what extent did patterns of shipping and recruitment in the Dutch maritime sector change over the course of the 18th and 19th centuries?”
(*) The project’s official title is Dutch Ships and Seamen, but we think this one is potentially less problematic 🙂