CultuurLINK Linking Award

Happy and surprised to find the first (and so far only) CultuurLINK Linking Award in my mailbox yesterday! I checked with the nice people over at Spinque.com and it turns out it was a token of appreciation for being a prolific CultuurLINK user 🙂

I think the vocabulary alignment tool is great and easy to work with, so I can recommend it to anyone with a SKOS vocabulary who wants to match it with any of the major cultural thesauri in the ‘Hub’. Thanks to the people at Spinque for the great tool and the nice gesture!

Linked Data for International Aid Transparency Initiative

In August 2013, VU MSc student Kasper Brandt finished his thesis on developing, implementing and testing a Linked Data model for the International Aid Transparency Initiative (IATI). Now, more than a year later, that work has been accepted for publication in the Journal on Data Semantics. We are very happy with this excellent result.

Model fragment

IATI is a multi-stakeholder initiative that seeks to improve the transparency of development aid and to that end developed an open standard for the publication of aid information. Hundreds of NGOs and governments have registered with the IATI registry and publish their aid activities in this XML standard. Taking the IATI model as input, we have created a Linked Data model based on requirements elicited from qualitative interviews, using an iterative requirements engineering methodology. We have converted the IATI open data from the central registry to Linked Data and linked it to various other datasets, such as World Bank indicators and DBpedia information. This dataset is made available for re-use at http://semanticweb.cs.vu.nl/iati .
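To give a feel for what such a conversion involves, here is a minimal sketch in Python that turns a tiny IATI-style XML fragment into N-Triples. It is not the conversion pipeline from the thesis: the `http://example.org/iati/` namespace, the property names and the sample activity are all made up for illustration, and only a few elements of the XML standard (`iati-activity`, `iati-identifier`, `title`, `recipient-country`) are handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace: the real dataset uses its own URIs and model.
BASE = "http://example.org/iati/"

# Made-up sample activity in a simplified IATI-style layout.
SAMPLE_XML = """
<iati-activities>
  <iati-activity>
    <iati-identifier>XX-1-EXAMPLE-001</iati-identifier>
    <title>Example water project</title>
    <recipient-country code="BI"/>
  </iati-activity>
</iati-activities>
"""


def activity_to_triples(activity):
    """Turn one <iati-activity> element into a list of N-Triples lines."""
    identifier = activity.findtext("iati-identifier").strip()
    subject = f"<{BASE}activity/{identifier}>"
    triples = []

    title = activity.findtext("title")
    if title:
        # Escape backslashes and quotes so the literal stays valid N-Triples.
        escaped = title.strip().replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} <{BASE}def/title> "{escaped}" .')

    # Each recipient country becomes a link to a (hypothetical) country URI,
    # which is the kind of hook that other datasets can be linked to.
    for country in activity.findall("recipient-country"):
        code = country.get("code")
        if code:
            triples.append(
                f"{subject} <{BASE}def/recipientCountry> <{BASE}country/{code}> ."
            )
    return triples


def convert(xml_text):
    """Convert all activities in an XML document to N-Triples lines."""
    root = ET.fromstring(xml_text.strip())
    lines = []
    for activity in root.iter("iati-activity"):
        lines.extend(activity_to_triples(activity))
    return lines


if __name__ == "__main__":
    for line in convert(SAMPLE_XML):
        print(line)
```

Giving countries their own URIs instead of plain codes is what makes the later linking step possible: a country URI can be connected to the corresponding resource in another dataset.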

Screenshot of an application bringing together information from multiple datasets

To demonstrate the added value of this Linked Data approach, we have created several applications which combine the information from the IATI dataset and the datasets it was linked to. These show that creating Linked Data for the IATI dataset and linking it to other datasets gives valuable new insights into aid transparency. Based on actual information needs of IATI users, we were able to show that linking IATI data adds significant value to the data and can fulfill those needs.

A draft of the paper can be found here.

DIVE wins 3rd prize in Semantic Web Challenge!

During last week’s International Semantic Web Conference (ISWC2014) in Riva del Garda, the DIVE team presented a demonstration prototype of the DIVE tool (which you can play around with live at http://dive.beeldengeluid.nl). We submitted DIVE to the Open Track of the yearly Semantic Web Challenge for SW tools and applications. Initially, we were invited to give a poster presentation on the first day of the conference and after very positive reviews, we progressed to the challenge final.

Challenge 3rd place certificate

For this final we were asked to present the tool and give a live demonstration in front of the ISWC2014 crowd. Apparently the jury appreciated the effort, since DIVE was awarded the third prize. The prize included a nice certificate as well as $1000, sponsored by Elsevier.

This was a real team effort, but I think much of the praise goes to our partners at Frontwise. They built a very cool, very responsive and intuitive User Experience on top of our SPARQL endpoint. Great work! Also thanks to the people at Beeld en Geluid and KB for their assistance with delivering data in a timely fashion and of course the people at VU for their enrichment of the data. Great teamwork everyone! Embedded below you find the poster and the presentation. The paper is found here.

The presentation:

The poster:

Master Project Andrea Bravo Balado: Linking Historical Ship Records to Newspaper Archives

[This post was written by Andrea Bravo Balado and is cross-posted at her own blog. It describes her MSc project, supervised by me.]

Linking historical datasets and making them available for the Web has increasingly become a subject of research in the field of digital humanities. In the Netherlands, history is intimately related to maritime activity, which has been essential in the development of economic, social and cultural aspects of Dutch society. As such an important sector, it has been well documented by shipping companies, governments, newspapers and other institutions.

Photo: janwillemsen, historic ships in Rotterdam (on Flickr)

In this master project we assume that, given the importance of maritime activity in everyday life in the 19th and 20th centuries, announcements of the departures and arrivals of ships, or mentions of accidents or other events, can be found in newspapers.

We have taken a two-stage approach: first, a heuristic-based method for record linkage and then machine-learning algorithms for article classification, to be used for filtering in combination with domain features. Evaluation of the linking method has shown that certain domain features were indicative of mentions of ships in newspapers. Moreover, the classifier methods scored near-perfect precision in predicting ship-related articles.
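As an illustration of what the first, heuristic stage could look like, here is a minimal Python sketch. It is not the method from the thesis: the two heuristics (the ship name occurs in the article text, and the article was published within a fixed window around the recorded event) and all field names such as `ship_name`, `event_date` and `published` are assumptions made for this example only.

```python
from datetime import date, timedelta


def candidate_links(ship_records, articles, window_days=14):
    """Return (record id, article id) pairs that pass two simple heuristics.

    Heuristic 1: the ship name appears in the article text (case-insensitive).
    Heuristic 2: the article was published within `window_days` of the
    recorded event (for example a departure or an arrival).
    """
    window = timedelta(days=window_days)
    links = []
    for record in ship_records:
        name = record["ship_name"].lower()
        for article in articles:
            if name not in article["text"].lower():
                continue
            if abs(article["published"] - record["event_date"]) <= window:
                links.append((record["id"], article["id"]))
    return links


if __name__ == "__main__":
    # Made-up example data.
    records = [
        {"id": "r1", "ship_name": "Johanna", "event_date": date(1850, 3, 1)},
    ]
    articles = [
        {"id": "a1", "text": "Vertrokken: de Johanna, naar Batavia.",
         "published": date(1850, 3, 4)},
        {"id": "a2", "text": "Mejuffrouw Johanna gaf een concert.",
         "published": date(1862, 7, 9)},
    ]
    print(candidate_links(records, articles))
```

Name matching alone produces many false candidates, because ship names are often also ordinary names of people or places. That is where a second stage comes in: a classifier that predicts whether an article is about shipping at all can be used to filter the candidate list.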

Enriching historical ship records with links to newspaper archives is significant for the digital history community, since it connects two datasets that would otherwise have required extensive annotation work and many hours to align. Our work is part of the Dutch Ships and Sailors Linked Data Cloud project. Check out Andrea’s thesis [pdf].

Master project Rianne Nieland: Talking to Linked Data

[This post was written by Rianne Nieland. It describes her MSc project, supervised by me.]

Many people in developing countries cannot access information on the Web, because they have no Internet access and often have low literacy. A solution could be to provide voice-based access to data on the Web by using the GSM network.

In my master project I have investigated how to make general-purpose datasets efficiently available using voice interfaces for GSM. To achieve this, I have developed two voice interfaces, one for Wikipedia and one for DBpedia. The two interfaces use different kinds of input data, namely normal web data and Linked Data, so that the two can be compared.

To develop the two voice interfaces, I first elicited requirements from the literature and then developed a user interface and conversion algorithms for Wikipedia and DBpedia concepts. Users then evaluated both voice interfaces in user tests, so that they could be compared on speed, error rate and usability.

[Rianne’s thesis presentation slides can be found on Slideshare and are embedded below. Her thesis is attached here: Eindversie-Paper-Rianne-Nieland-2057069]

 

VU E-history datathon links up multiple datasets

A bit belated, but on April 2nd we organized a ‘datathon’: a one-day event where we sat down with pizzas and laptops to link up multiple digital history datasets. The Dutch Ships and Sailors data was one of those datasets. Other participants included Niels Ockeloen from the BiographyNet project, Albert Merono from the CEDAR project and Chris Dijkshoorn, who brought Linked Data from the Naturalis and Rijksmuseum collections. External participants included Ivo Zandhuis from gemeentegeschiedenis.nl.

Linking and eating pizza

The end results are promising:

  • 5000+ links from people in the BiographyNet RDF data to people in the Rijksmuseum RDF data.
  • 2 links from Dutch Ships and Sailors to Rijksmuseum collections.
  • 61 links from Dutch Ships and Sailors ranks to CEDAR HISCO ‘occupation’ URIs.
  • 1320 links of CEDAR municipalities (by Amsterdamse Code) to gemeentegeschiedenis.nl municipalities
  • 33 links of ICONCLASS (used by Rijksmuseum) to HISCO occupations

We hope to expand this data cloud in the near future and show the added value of such an interconnected digital history cloud for historical research and the general public. You can read more at Albert’s blog or on Ivo Zandhuis’ blog, Hic Sunt Leones.

The e-history Linked Data cloud visualized

DownScale 2013 workshop

DOWNSCALE 2013, the 2nd international workshop on downscaling the Semantic Web, was held on 19 September 2013 in Geneva, Switzerland, co-located with the Open Knowledge Conference 2013. The workshop seeks to provide first steps in exploring appropriate requirements, technologies, processes and applications for the deployment of Semantic Web technologies in constrained scenarios, taking into consideration local contexts. One example is making Semantic Web platforms usable with limited computing power and limited Internet access, through context-specific interfaces.

Downscale group picture

The workshop accepted three full papers after peer review and featured five invited abstracts. In his keynote speech, Stephane Boyera of SBC4D gave a very nice overview of the potential use of the Semantic Web for Social & Economic Development. The accepted papers and abstracts can be found in the downscale2013 proceedings, which will also appear as part of the OKCon 2013 Open Book.

 

We broadcast the whole workshop live on the web, and you can actually watch the whole thing (or fragments) via the embedded videos below.


 

After the presentations, we had fruitful discussions about the main aspects of ‘downscaling’. The consensus seemed to be that downscaling involves the investigation and usage of Semantic Web technologies and Linked Data principles to allow for data, information and knowledge sharing in circumstances where ‘mainstream’ SW and LD are not feasible or simply do not work. These circumstances can arise from cultural, technical or physical limitations, and these limitations can be natural or artificial.

The figure illustrates a first attempt to come to a common architecture. It includes three aspects that need to be considered when thinking about data sharing in exceptional circumstances:

  1. Hardware/Infrastructure. This aspect includes issues with connectivity, low-resource hardware, unavailability, etc.
  2. Interfaces. This concerns the design and development of appropriate interfaces with respect to illiteracy of users or their specific usage. Building human-usable interfaces is a more general issue for Linked Data.
  3. Pragmatic semantics. Developing LD solutions that consider which information is relevant in which (cultural) circumstances is crucial to their success. This might include filtering of information, etc.

The right side of the picture illustrates the downscaling stack.

CSWS2013 summer school and keynote in Shanghai

Shanghai

Last week, Knud Moeller from datalysator and I were invited to give a set of lectures about Linked Data in the CSWS 2013 summer school in Shanghai, China. As far as we are concerned, the summer school was a success. About 60 students received three mornings' worth of lectures about the principles and practice of Linked Data from the two of us. In the afternoon, they heard talks about Semantic Web efforts from the likes of Baidu and Google.

Interested students

Because Twitter, Facebook, Slideshare and WordPress are unavailable or unreachable in China, the lecture material is online as PDFs through an HTML page at my VU homepage.

I also had the honour of giving a keynote speech about Linked Data for Cultural Heritage and Digital History in the main conference. Those slides can be found on Slideshare.

Verrijkt Koninkrijk at the Soeterbeeck E-humanities workshop

The Soeterbeeck monastery with two e-humanists

Last week, I presented our work on the Verrijkt Koninkrijk project at the E-humanities workshop in the Soeterbeeck monastery, which was organised by the University of Nijmegen and the e-humanities group of KNAW.

It was a very pleasant get-together with some nice talks and hands-on sessions. Alice Dijkstra from NWO presented a number of opportunities for getting funding for e-humanities projects. She mentioned some obvious candidates (vernieuwingsimpuls, …) and some less obvious ones (the hopefully upcoming CLARIAH programme, which would continue CLARIN and DARIAH).

The two hands-on sessions were nice, but they showed a more general issue with e-humanities: ‘nice tools’ are being developed, but these tools remain solutions to a single problem. Next to that, they are nice either from a computer science or from a historical science viewpoint, but it is hard to do exciting computer science and historical science at the same time. This is reinforced by the fact that historical scientists rarely know what type of tools they want at the beginning of a project. A more interactive and cyclical approach makes sense for both parties. The BiographyNet idea of putting researchers from different backgrounds in the same room would be one solution. The other, in my view, is the development of more general-purpose query environments.

In my poster presentation I showed how I tried to do that with Verrijkt Koninkrijk, and I think a more or less generic data analysis interface is also a good idea.

You can download the VK poster abstract as well as the actual poster.

Links to some of the web demos we tried:
http://collatex.net/demo/
http://voyeurtools.org/?skin=scatter
http://eccentricity.org/delta3d/

Dutch Ships and Sailors project started

A very unofficial DSS logo I made

Last week saw the kickoff of the new CLARIN-NL-funded project “Dutch Ships and Sailors” (*). This project will run for one year and gives me the opportunity to work with historians from both VU and Huygens ING on applying Linked Data principles to Dutch maritime-historical data. From the official description:

As a sea-faring nation, a large portion of Dutch history is found on the water. However, much of the digitized historical source material is still scattered across many databases and archives. This curation and demonstrator project aims to bring together the rich maritime historical data preserved in the many different databases. We propose a (semantic) web-based infrastructure that will house various maritime-historical datasets. We will provide a tool chain and methodology for converting legacy datasets. The infrastructure includes common vocabularies to normalize and enrich existing data. Links are established between the datasets and to other relevant datasets on the Web. Although the infrastructure will be set up to facilitate 25+ identified datasets, we initially populate the infrastructure with four selected datasets. These will allow us to investigate two case studies in order to answer the historical research question “To what extent did patterns of shipping and recruitment in the Dutch maritime sector change over the course of the 18th and 19th centuries?”

(*) The project’s official title is Dutch Ships and Seamen, but we think this is potentially less problematic 🙂