Linked WW II Data made at the OpenCultuurData Hackathon

Michiel and me presenting the result at the hackathon

For OpenCultuurData, I assisted NIOD (Dutch Institute for War Documentation) as an ‘Open Data coach’. For the hackathon, organised on 16 June 2012 by hackdeoverheid, NIOD published part of its image archive, Beeldbank WO2, as open data (see also their datablog). The dataset contains 140,000 images about WW II as well as their metadata. It is accessible through OAI-PMH.
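
For readers who have not used OAI-PMH before: harvesting comes down to repeated HTTP GET requests that carry a `verb` parameter. The snippet below is a minimal sketch of building such requests. The base URL is a placeholder rather than the actual NIOD endpoint, and `oai_dc` is assumed as the metadata format.

```python
from urllib.parse import urlencode

# Placeholder endpoint: an assumption for illustration, not the real NIOD URL.
BASE_URL = "http://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    The first request names a metadata format; follow-up requests carry only
    the resumptionToken returned by the previous response.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


first_page = list_records_url(BASE_URL)
next_page = list_records_url(BASE_URL, resumption_token="token-from-response")
```

A harvester would fetch `first_page`, read the resumption token from the response, and keep requesting pages until no token is returned.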

Also for OpenCultuurData, the ‘Nationaal Comité 4 en 5 mei’ (VVM) presented their database about war monuments as open data (again, see their datablog). This database (available as an XML datadump) contains 3,500 monuments, most of which are related to WW II, including the Dam Square Monument.

For the hackathon of 16 June, Michiel Hildebrand and I decided to take these two datasets and convert them to ‘five star linked data’.

Conversion

For the conversion, we used the XML to RDF tool included in Cliopatria, VU’s semantic toolset. Using a few rewriting rules, we converted the OAI XML of NIOD’s Beeldbank WO2 as well as the XML of 4en5mei to RDF.
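
To give an idea of what such a conversion does, here is a rough sketch in plain Python. It is not the Cliopatria rewrite rules themselves. The sample record is made up, and building the subject URI from `dc:identifier` is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# A made-up record for illustration; real Beeldbank WO2 records differ.
SAMPLE_RECORD = """
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier>12345</dc:identifier>
  <dc:title>Example photograph</dc:title>
  <dc:subject>Liberation</dc:subject>
</record>
"""


def record_to_triples(xml_text, base_uri):
    """Turn each dc:* child element into a (subject, predicate, literal) triple."""
    root = ET.fromstring(xml_text)
    # Assumption: the record identifier is used to mint the subject URI.
    identifier = root.findtext("{%s}identifier" % DC_NS)
    subject = base_uri + identifier
    triples = []
    for child in root:
        # ElementTree spells qualified tags as "{namespace}localname".
        if child.tag.startswith("{%s}" % DC_NS) and child.text:
            local_name = child.tag.split("}", 1)[1]
            triples.append((subject, DC_NS + local_name, child.text.strip()))
    return triples


triples = record_to_triples(SAMPLE_RECORD, "http://purl.org/collection/nl/niod/")
```

Each XML field becomes one triple, which is why a record-oriented dump of this size ends up as a few million triples.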

  • The NIOD data consists of 2,097,214 RDF triples, using 15 predicates, most of which are Dublin Core metadata fields. The image records are annotated with concepts from the NIOD thesaurus, which is currently under development within the Verrijkt Koninkrijk project.
  • The VVM data set contains 122,233 RDF triples and uses 37 predicates, most of which are specific to the dataset. We mapped these predicates to Dublin Core using subProperty predicates (for example, the 4en5mei:artist predicate is mapped to dc:creator). To be able to map address locations to other data sources, we upgraded addresses from literals to SKOS concepts.
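
The two modelling steps for the VVM data can be sketched as follows. This is an illustration under stated assumptions, not the actual conversion: the property URIs under the `viervijfmei` namespace, the way concept URIs are minted, and the sample values are all hypothetical.

```python
from collections import namedtuple
from urllib.parse import quote

Literal = namedtuple("Literal", "value")  # marks plain-text values; URIs are str

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS_CONCEPT = "http://www.w3.org/2004/02/skos/core#Concept"
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# Hypothetical property URIs; the real dataset may name them differently.
VVM = "http://purl.org/collection/nl/viervijfmei/"
VVM_ARTIST = VVM + "artist"
VVM_ADDRESS = VVM + "address"


def add_superproperty_triples(triples, subproperty_of):
    """Apply the subproperty rule once: (s p o) and (p subPropertyOf q) give (s q o)."""
    inferred = [(s, subproperty_of[p], o) for s, p, o in triples if p in subproperty_of]
    return triples + inferred


def upgrade_addresses(triples, address_predicate, concept_base):
    """Replace literal addresses with SKOS concept URIs that keep the text as prefLabel."""
    result, seen = [], set()
    for s, p, o in triples:
        if p == address_predicate and isinstance(o, Literal):
            concept = concept_base + quote(o.value.lower().replace(" ", "_"))
            result.append((s, p, concept))
            if concept not in seen:
                seen.add(concept)
                result.append((concept, RDF_TYPE, SKOS_CONCEPT))
                result.append((concept, SKOS_PREF_LABEL, o))
        else:
            result.append((s, p, o))
    return result


# Made-up sample data.
monument = VVM + "monument/1"
data = [(monument, VVM_ARTIST, Literal("Example Artist")),
        (monument, VVM_ADDRESS, Literal("Dam"))]

upgraded = upgrade_addresses(data, VVM_ADDRESS, VVM + "address/")
with_dc = add_superproperty_triples(upgraded, {VVM_ARTIST: DC_CREATOR})
```

Once an address is a concept with its own URI, it can be the subject or object of mapping links, which a literal cannot.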

Links

We semi-automatically produced the following links:

  • VVM city and community relations to GeoNames instances (4,124 links)
  • VVM address relations to Amsterdam Museum thesaurus concepts (77 links)
  • NIOD thesaurus concepts to Amsterdam Museum concepts (488 links)
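
One common way to generate link candidates semi-automatically is exact matching on normalised labels, followed by a manual check. The sketch below shows that idea only; it is not necessarily the procedure used for the links listed here, and all URIs and labels in it are made up.

```python
def normalise(label):
    """Lower-case and collapse whitespace so trivially different labels compare equal."""
    return " ".join(label.lower().split())


def candidate_links(source_labels, target_labels):
    """Return (source_uri, target_uri) pairs whose normalised labels are identical.

    Both arguments map a URI to its label. The output is a list of candidates
    for a person to review, not a set of validated links.
    """
    targets_by_label = {}
    for uri, label in target_labels.items():
        targets_by_label.setdefault(normalise(label), []).append(uri)
    links = []
    for uri, label in source_labels.items():
        for target_uri in targets_by_label.get(normalise(label), []):
            links.append((uri, target_uri))
    return links


# Made-up URIs and labels, for illustration only.
monument_places = {
    "http://example.org/vvm/place/1": "Amsterdam ",
    "http://example.org/vvm/place/2": "Nowhere",
}
gazetteer = {"http://example.org/geo/100": "amsterdam"}

links = candidate_links(monument_places, gazetteer)
```

Label matching of this kind produces false positives for ambiguous place names, so the candidates still need to be checked by hand.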

This Linked Data graph figure shows the two datasets, plus the vocabularies and datasets they link to.

In a previous effort, we produced links between the NIOD thesaurus and a) Cornetto and b) Dutch AAT. The result is shown in the mini-datacloud figure below.

URIs and access

For the datasets, we used PURL URIs. This is mainly a matter of convenience since we do not have direct access to either the NIOD or the VVM web servers. We used the basenames http://purl.org/collection/nl/niod/ and http://purl.org/collection/nl/viervijfmei/. HTTP requests are forwarded to a running instance of Cliopatria at http://semanticweb.cs.vu.nl/pvb, where a SPARQL endpoint can also be found.
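
Querying a SPARQL endpoint over HTTP could look roughly like the sketch below. The endpoint URL is a placeholder, since the exact path on the Cliopatria server is not given here, and the use of `dc:title` from the Dublin Core elements namespace is an assumption about the data.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder: substitute the real SPARQL endpoint of the Cliopatria instance.
ENDPOINT = "http://example.org/sparql"

# Assumption: records carry a dc:title in the Dublin Core elements namespace.
QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?item ?title WHERE {
  ?item dc:title ?title .
} LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Prepare (but do not send) a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
# Passing `request` to urllib.request.urlopen would send it; that step is left out here.
```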

Below is a list of example URIs:

The link between a 4en5mei monument and an Amsterdam Museum object, through a mapped address concept.

Status and next steps

This represents only a first effort to make these datasets linked open data. Some issues that we will look at in the near future are:
  • Link evaluation: none of the links were validated, so there is no guarantee of their quality.
  • More links: More possibilities for connecting the datasets remain. These include the enrichment of BeeldbankWO2 dc:coverage fields (to GeoNames) and mappings to Rijksmonumenten, Stadsarchief etc.
  • The NIOD data now lives on two separate Cliopatria servers (one associated with Amsterdam culture data and one with Verrijkt Koninkrijk). These should be merged.
  • We are also looking at use cases for applications that will use this linked data. We hope to submit one to the OpenCultuurData challenge.

Best Poster Award at ESWC 2012!

The ESWC 2012 conference ended with a bang for me and the rest of the W4RA team: We were awarded the Best Poster Award for our poster “Bringing the Web of Data to Developing Countries: Linked Market Data in the Sahel”. You can find the poster abstract here and the poster itself here. This is especially nice for two reasons.

First of all, we spent this year’s VU Semantic Web Outing learning about designing good posters, and afterwards we took this poster as an example. The winning poster is very much a collaborative effort of everybody at the VU Semantic Web groups. Thank you all for your effort.

Secondly, a goal of the poster is to get a community of peers interested in issues and applications of Linked Data for “Warm Countries”. I hope that this prize and the publicity will help realize this.

More information about how you can access the Linked Market data can be found elsewhere on this blog and on the W4RA blog. Want to join our effort and support Linked Data for Development? Contact us or keep an eye on our blog worldwidesemanticweb.wordpress.com.