This is my old blog. I have merged this blog with my homepage. Therefore, new updates will no longer appear on this blog, but on my homepage: http://victordeboer.com. All blog posts are copied to this new page. To not break any existing links, this blog will remain online for archival purposes.
Our paper “Evaluating Unsupervised Thesaurus-based Labeling of Audiovisual Content in an Archive Production Environment” was accepted for publication in the International Journal on Digital Libraries (IJDL). This paper, co-authored with Roeland Ordelman and Josefien Schuurman, reports on a series of information extraction experiments carried out at the Netherlands Institute for Sound and Vision (NISV). Specifically, we report on a two-stage evaluation of unsupervised labeling of audiovisual content using subtitles. We look at how such an approach can provide acceptable results given requirements with respect to archival quality, authority and service levels to external users.
For this, we developed a text extraction pipeline (TESS), pictured here, which extracts key terms and matches them to the NISV thesaurus, the GTAA. This journal paper is an extended version of the paper previously accepted at the TPDL conference. Here we additionally provide an analysis of the term extraction after it was taken into production, focusing on performance variation with respect to term types and television programs. Having implemented the procedure in our production workflow allows us to gradually develop the system further and to assess the effect of the transformation from manual to automatic annotation from an end-user perspective.
The paper will appear on the Journal site shortly. A final draft version of the paper can be found here: deboer_ijdl2016evaluating_draft [PDF].
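To give a rough idea of what "extract key terms from subtitles and match them to a thesaurus" means, here is a minimal sketch in Python. It is not the TESS implementation: the frequency threshold, the stopword list, the tiny thesaurus and its identifiers, and the function names `extract_terms` and `match_to_thesaurus` are all assumptions made up for illustration. The real GTAA is far larger and the actual pipeline is described in the paper.

```python
import re
from collections import Counter

# Hypothetical mini-thesaurus: normalized label -> (concept id, term type).
# These entries and ids are invented for illustration; they are not GTAA data.
THESAURUS = {
    "amsterdam": ("gtaa:0001", "location"),
    "elections": ("gtaa:0002", "subject"),
    "parliament": ("gtaa:0003", "subject"),
}

# A deliberately tiny stopword list (assumption, not from the paper).
STOPWORDS = {"the", "in", "of", "a", "and", "for", "to", "are", "is", "on"}


def extract_terms(subtitles, min_count=2):
    """Return candidate terms that occur at least min_count times."""
    tokens = re.findall(r"[a-z]+", subtitles.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return {term: n for term, n in counts.items() if n >= min_count}


def match_to_thesaurus(terms, thesaurus=THESAURUS):
    """Keep only the terms that have an exact label match in the thesaurus."""
    labels = []
    for term, n in terms.items():
        if term in thesaurus:
            concept_id, term_type = thesaurus[term]
            labels.append((concept_id, term, term_type, n))
    # Most frequent first, ties broken alphabetically by term.
    return sorted(labels, key=lambda label: (-label[3], label[1]))


subtitles = (
    "The elections in Amsterdam are close. "
    "Amsterdam votes for parliament. Elections matter."
)
labels = match_to_thesaurus(extract_terms(subtitles))
for concept_id, term, term_type, n in labels:
    print(concept_id, term, term_type, n)
```

In this toy setup "parliament" is in the thesaurus but occurs only once, so the threshold filters it out. Choosing such thresholds per term type is the kind of trade-off an evaluation like the one in the paper has to look at.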
Around 40 students joined this year’s “bachelor’s for a day” for the VU IMM programme. As in previous years, I gave a 45-minute lecture and built a hands-on session around “The Social Web”. Each year I do a non-scientific survey of Social Web use among the (mostly) 17-year-old attendees. This year’s outcome:
- Everybody still uses Facebook (even though for the last couple of years there have been some murmurs about abandoning it)
- Everybody uses WhatsApp. No surprise there
- More than half of the students use Snapchat.
- About 1/4 of students use LinkedIn.
- About 1/8 of students actively use Twitter (one post in the last 3 months)
- Most students have heard of Hyves, but no one ever used it
- Almost no one has heard of Second Life 🙂
- No one has heard of Schoolbank.nl
You can find my slides below. The hands-on session can be found here.