SPARQL Queries for Verrijkt Koninkrijk

[Update: the links have been updated.] In this post, I list a number of SPARQL queries that show how external sources can be used to provide enriched access to the Verrijkt Koninkrijk text. The queries accompany a two-page abstract entitled “Enriched Access to a Large War Historical Text using the Back of the Book Index” that I submitted to SWAIE 2012 (the Semantic Web and Information Extraction workshop), which I will be attending.

These queries use the back-of-the-book index, which has been converted to SKOS and subsequently aligned with a number of data sources.

The queries can be entered in the interactive SPARQL interface of the Verrijkt Koninkrijk semantic server, which can be found at http://semanticweb.cs.vu.nl/verrijktkoninkrijk/flint/ (login: sparqltester, password: sparqltester).

Query1: GeoNames. Get all paragraphs containing references to a place in the Dutch province “Noord Holland”:

PREFIX niod: <http://purl.org/collections/nl/niod/>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?subj ?bc ?par
WHERE {
  ?subj <http://www.geonames.org/ontology#parentADM1> <http://sws.geonames.org/2749879/> .
  ?bc skos:closeMatch ?subj .
  ?bc skos:inScheme niod:BotBScheme .
  ?bc niod:pageRef ?pr .
  ?pr niod:parRef ?par .
}
LIMIT 100

Edit 3 Oct: I continued experimenting with some other SPARQL queries and used Willem van Hage and Tomi Kauppinen’s excellent SPARQL package for R to do some quick-and-dirty statistical analysis. I used a variant of the query above, but with the province as a variable, and put the results in a pie chart showing Loe de Jong’s mentions of places in each of the twelve provinces of the Netherlands.

Frequencies of page references to places in each of the twelve provinces in "Het Koninkrijk"
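
For readers who want to reproduce the chart, the province variant could look roughly like the sketch below. It is not the exact query I ran: the gn: prefix, the variable names ?province and ?pageRefs, and the use of COUNT and GROUP BY are assumptions, and the aggregation only works if the endpoint supports SPARQL 1.1. If it does not, drop the COUNT and GROUP BY, select ?province and ?pr, and do the counting in R.

```sparql
PREFIX niod: <http://purl.org/collections/nl/niod/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX gn:   <http://www.geonames.org/ontology#>

# Count page references per province (assumes SPARQL 1.1 aggregates).
SELECT ?province (COUNT(?pr) AS ?pageRefs)
WHERE {
  # The fixed Noord Holland URI from Query1 is replaced by a variable.
  ?subj gn:parentADM1 ?province .
  ?bc skos:closeMatch ?subj .
  ?bc skos:inScheme niod:BotBScheme .
  ?bc niod:pageRef ?pr .
}
GROUP BY ?province
ORDER BY DESC(?pageRefs)
```

Note that an index concept matched to more than one place in the same province is counted once per match, so the totals are an upper bound on the number of distinct page references.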

And if you replace the predicate ‘parentADM1’ with ‘parentADM2’, you get the frequencies for the individual municipalities:

Frequencies of page references to municipalities in "Het Koninkrijk"

I will leave the historical interpretation of these charts to the reader. Note, however, that a major disclaimer is needed: there are numerous errors in the data, including OCR errors and concept mapping errors. I am sure that the municipality ‘Berkelland’ is not as important as it now seems. Also, the data should be normalized by province size to give a better idea of what is going on.
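
One possible way to do that normalization directly in the query is sketched below, taking population as the measure of province size. This is an untested assumption on several counts: it presumes that the GeoNames data loaded in the server includes gn:population for the provinces, that the endpoint supports SPARQL 1.1 aggregates and expressions, and that population (rather than, say, area) is the right notion of size. The variable names are made up for illustration.

```sparql
PREFIX niod: <http://purl.org/collections/nl/niod/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX gn:   <http://www.geonames.org/ontology#>
PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>

# Page references per inhabitant, per province.
# Assumes gn:population is available for each province resource.
SELECT ?province ?population
       (COUNT(?pr) AS ?pageRefs)
       (COUNT(?pr) / xsd:decimal(?population) AS ?refsPerInhabitant)
WHERE {
  ?subj gn:parentADM1 ?province .
  ?province gn:population ?population .
  ?bc skos:closeMatch ?subj .
  ?bc skos:inScheme niod:BotBScheme .
  ?bc niod:pageRef ?pr .
}
GROUP BY ?province ?population
ORDER BY DESC(?refsPerInhabitant)
```

The explicit cast to xsd:decimal is there in case the population values are stored as plain literals rather than typed numbers.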

The point, however, is that, given the linked data, these analyses are ridiculously easy to perform with SPARQL and R.

Query2: NIOD Thesaurus Beeldbank WO2. Get all combinations of BBWO2 images and paragraphs:

PREFIX niod: <http://purl.org/collections/nl/niod/>
PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?img ?par
WHERE {
  ?object dc:subject ?subj ;
          dc:relation ?img .
  ?subj skos:inScheme niod:ConceptScheme .
  ?subj skos:exactMatch ?bc .
  ?bc skos:inScheme niod:BotBScheme .
  ?bc niod:pageRef ?pr .
  ?pr niod:parRef ?par .
}
LIMIT 100
