ICTIR 2015 paper online

“Entity Linking in Queries: Tasks and Evaluation,” an upcoming ICTIR 2015 paper by Faegheh Hasibi, Svein Erik Bratsberg, and me, is now available online. The resources developed in this study have also been made publicly available.

Annotating queries with entities is one of the core problem areas in query understanding. While seemingly similar, entity linking in queries differs from entity linking in documents and requires a methodological departure due to the inherent ambiguity of queries. We differentiate between two specific tasks, semantic mapping and interpretation finding, discuss current evaluation methodology, and propose refinements. We examine publicly available datasets for these tasks and introduce a new manually curated dataset for interpretation finding. To further deepen the understanding of the differences between the tasks, we present a set of approaches for effectively addressing them and report on experimental results.

Two fully funded PhD positions available

I have two fully funded PhD positions available in the context of the FAETE project.
The positions are for three years and come with no teaching duties! (There is also the possibility of an extension to four years with 25% compulsory duties.) The starting date can be as early as September 2015, but no later than January 2016.

Further details and the application form are available on Jobbnorge.
Application deadline: Aug 3, 2015

Evaluating document filtering systems over time

Performance of three systems over time. Systems A and B degrade, while System C improves over time, but they all have the same average performance over the entire period. We express the change in system performance using the derivative of the fitted line (in orange) and compare performance at what we call the “estimated end-point” (the large orange dots).

Our IPM paper “Evaluating document filtering systems over time,” with Tom Kenter and Maarten de Rijke as co-authors, is available online. In this paper we propose a framework for measuring the performance of document filtering systems. Such systems have, up to now, been evaluated in terms of traditional metrics like precision, recall, MAP, nDCG, F1, and utility. We argue that these metrics lack support for the temporal dimension of the task. We propose a time-sensitive way of measuring performance by employing trend estimation. In short, performance is calculated per batch, a trend line is fitted to the results, and the estimated performance of systems at the end of the evaluation period is used to compare them. To demonstrate the results of our proposed evaluation methodology, we analyze the runs submitted to the Cumulative Citation Recommendation task of the 2012 and 2013 editions of the TREC Knowledge Base Acceleration track, and show that important new insights emerge.
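The idea behind the trend-based comparison can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes per-batch scores (e.g., F1) for two hypothetical systems with identical averages, fits a straight line, and compares the fitted values at the last batch (the “estimated end-point”).

```python
import numpy as np

def endpoint_score(batch_scores):
    """Fit a straight line to per-batch scores and return its value
    at the final batch (the "estimated end-point") and its slope."""
    t = np.arange(len(batch_scores))
    slope, intercept = np.polyfit(t, batch_scores, 1)
    return slope * t[-1] + intercept, slope

# Two hypothetical systems with the same average score (0.5):
a = [0.7, 0.6, 0.5, 0.4, 0.3]   # degrading over time
c = [0.3, 0.4, 0.5, 0.6, 0.7]   # improving over time

end_a, slope_a = endpoint_score(a)   # end-point 0.3, negative slope
end_c, slope_c = endpoint_score(c)   # end-point 0.7, positive slope
```

Averaged over the whole period the two systems are indistinguishable; the fitted end-points make the difference in their trajectories explicit.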


The Exploiting Semantic Annotations in Information Retrieval (ESAIR) workshop series aims to advance the general research agenda on the problem of creating and exploiting semantic annotations. The eighth edition of ESAIR, with a renewed set of organizers, sets its focus on applications. We invite presentations of prototype systems in a dedicated “Annotation in Action” demo track, in addition to the regular research and position paper contributions. A Best Demonstration Award, sponsored by Google, will be presented to the authors of the most outstanding demo at the workshop.

Submissions: regular research papers (4+ pages), position papers (2+1 pages), demo papers (4+ pages)
Deadline: July 2nd

The workshop also offers a track for papers that were not accepted at the main conference to be considered for presentation at the workshop; the deadline for these contributions is July 8. In this case, authors are required to attach the reviews their paper received, to facilitate the decision process.

See the workshop’s homepage for details.

4M NOK funding from ToppForsk-UiS

I have received 4M NOK in external funding (~440K EUR) from the ToppForsk programme to work on the FAETE project (From Answering Engines to Task-completion Engines). ToppForsk is a research programme for outstanding young researchers, put in place by the University of Stavanger in cooperation with Universitetsfondet. The official announcement with more details about the programme and its 2015 winners can be found here (in Norwegian). The project, with a total budget of 9.7M NOK (over 1M EUR), will allow me to spend 60% of my time on research for the next four years and will fund two PhD students (announcements will follow in due time).

PhD positions in IR

The University of Stavanger invites applications for up to three doctorate scholarships in Information Technology at the Faculty of Science and Technology, in the Department of Electrical Engineering and Computer Science, beginning September 1, 2015.

There are 15 projects offered in total, which include two IR projects supervised by me:

#8. Living labs for information retrieval

Living labs is a new evaluation paradigm for information retrieval (IR), where the idea is to perform experiments in situ, with real users doing real tasks using real-world applications. This type of evaluation is standard practice in (large) industrial research labs, but is only now becoming available to academic researchers [1,2]. Despite recent developments, there are still numerous challenges to be overcome, including living labs architecture and design, hosting, maintenance, security, privacy, participant recruiting, and the development of usage scenarios and tasks. The focus of this project is on developing and employing the living labs evaluation paradigm for IR. The PhD candidate will contribute to the understanding of online evaluation and how to generalize across different use-cases.

[1] http://living-labs.net/
[2] http://www.clef-newsreel.org/

#9. Answering complex queries

Web search engines have become remarkably effective in providing appropriate answers to queries that are issued frequently. However, when it comes to complex information needs, often formulated as natural language questions, responses become much less satisfactory (e.g., “Which European universities have active Nobel laureates?”). Manual effort is often required to collect and synthesize information from multiple sources, a process that may involve a series of filtering, sorting, and aggregation steps. The goal of this project is to investigate how to improve query understanding and answer retrieval for complex queries, using massive volumes of unstructured data in combination with knowledge bases.

Details and application instructions can be found here.
Application deadline: February 24, 2015.

Important: Feel free to contact me directly for more information regarding the projects. However, applications need to be submitted on jobbnorge.no (i.e., do not send them to me by email). Also, don’t forget to indicate which projects you are applying for, in order of preference.

Digital exam

The University of Stavanger is one of the higher education institutions in Norway now working on a pilot project for digital examination. My web programming and interaction design course will be among the first (and the very first within the Faculty of Science and Technology) to have a digital exam, already this semester.

The latest issue of Teknisk Ukeblad featured an article on this topic where I was also interviewed about my reasons for having a digital, as opposed to paper-based, exam.

The highlighted quote, in free translation, reads: “The point of the exam is that it should be as close to a real work situation as possible. Students will not be programming on paper when they start their jobs.” Digital exams also allow certain websites to be used as reference material, in addition to textbooks.

Read the whole article here (in Norwegian).

Living Labs developments

There have been a number of developments over the past months around our living labs for IR evaluation efforts.

We had a very successful challenge workshop in Amsterdam in June, thanks to the support we received from ELIAS, ESF, and ILPS. The scientific report summarizing the event is available online.

There are many challenges associated with operationalizing a living labs benchmarking campaign. Chief among these are incorporating results from experimental search systems into live production systems, and obtaining sufficiently many impressions from relatively low-traffic sites. We propose that frequent (head) queries can be used to generate result lists offline, which are then interleaved with results of the production system for live evaluation. The choice of head queries is critical because (1) it removes the harsh requirement of providing rankings in real time for query requests and (2) it ensures that experimental systems receive enough impressions, on the same set of queries, for a meaningful comparison. This idea is described in detail in an upcoming CIKM’14 short paper: Head First: Living Labs for Ad-hoc Search Evaluation.
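To make the interleaving step concrete, here is a sketch of team-draft interleaving, one standard method for merging two rankings for online comparison (the exact scheme used in the living labs setup may differ, and the document IDs below are made up):

```python
import random

def team_draft_interleave(prod, exp, k, rng=None):
    """Team-draft interleaving: the two rankings alternately 'draft'
    their highest-ranked document not yet in the combined list; a coin
    flip breaks ties in drafting order. Clicks on a team's documents
    later count as wins for that team."""
    rng = rng or random.Random(0)
    interleaved = []
    team_of = {}              # doc -> which ranking contributed it
    n_prod = n_exp = 0
    while len(interleaved) < k:
        # The team with fewer picks so far drafts next (coin flip on ties).
        prod_turn = n_prod < n_exp or (n_prod == n_exp and rng.random() < 0.5)
        source, team = (prod, 'prod') if prod_turn else (exp, 'exp')
        doc = next((d for d in source if d not in team_of), None)
        if doc is None:       # drafting team exhausted; stop here
            break
        team_of[doc] = team
        interleaved.append(doc)
        if team == 'prod':
            n_prod += 1
        else:
            n_exp += 1
    return interleaved, team_of

# Hypothetical head query: live production ranking vs. an offline
# ranking produced by an experimental system.
production   = ['a', 'b', 'c', 'd']
experimental = ['b', 'e', 'a', 'f']
ranking, teams = team_draft_interleave(production, experimental, k=4)
```

Because result lists for head queries can be generated offline, the interleaved list can be precomputed and served instantly, which is exactly what removes the real-time ranking requirement mentioned above.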

A sad but newsworthy development is that our CIKM’14 workshop was cancelled. We had planned to organize a living labs challenge as part of the workshop, and that challenge cannot be run as originally intended. But now we have something much better.

Living Labs for IR Evaluation (LL4IR) will run as a Lab at CLEF 2015 under the tagline “Give us your ranking, we’ll have it clicked!” The first edition of the lab will focus on three specific use-cases: (1) product search (on an e-commerce site), (2) local domain search (on a university’s website), and (3) web search (through a commercial web search engine). See further details here.

Postdoc position in Semantic Entity Search

About half a year ago I advertised a PhD position in Semantic Entity Search. There were no eligible candidates, so the position has been converted to a 2-year Postdoc position.

There is good flexibility topic-wise—as long as it’s about entities and semantics :)
Please feel free to contact me with any questions or for further information.

Details and application instructions can be found here.

You will notice that there are a number of projects advertised. It’s a department-funded position, so “may the best applicant win” is the name of the game: the strongest candidate will be offered the position, irrespective of the project chosen.

Starting date: from Sept 1, 2014.
Application deadline: June 22, 2014.