TREC Entity 2010 draft guidelines

The draft guidelines for the 2010 edition of the track have been posted on the track’s website.

In 2010, Related Entity Finding (REF) runs as the main task of the track. A number of changes have been made relative to the previous edition. We have also attempted to clarify open issues, such as what is, and what is not, an entity homepage.
In addition, the track introduces a second challenge, entity list completion (ELC), which will run as a pilot task.

Your feedback is not only welcome, but encouraged! Post it as a comment on the guidelines page or send it to the mailing list.

Best @INEX2009 Entity ranking

The UvA ISLA team (consisting of me, Marc Bron, Maarten de Rijke, and Wouter Weerkamp) achieved top performance at the Entity Ranking (XER) track at INEX 2009, on both tasks (entity ranking and list completion). Our submission employed a slightly tweaked variation of the best-performing models described in our paper entitled Category-based Query Modeling for Entity Search; this work will be presented at ECIR at the end of this month (the paper is available online).

Although we did really well at INEX, our achievement is somewhat weakened by the fact that only 5 teams (including us) participated in the XER track. The number of participating teams was 8 in 2007, 6 in 2008, and 5 in 2009. So, what’s the future of INEX-XER (or of INEX, for that matter)?

Update Apr 19, 2010: the paper describing our approach @INEX is available online.

Two evaluation campaigns related to entity/expert search

The CLEF 2010 labs will feature two evaluation campaigns that are potentially of interest to people working in the area of entity/people/expert search.

The third WePS Evaluation Workshop (WePS3) focuses on two tasks related to web entity search:

  • Task 1: Clustering and Attribute Extraction for Web People Search.
    Given a set of web search results for a person name, the task is to cluster the pages according to the different people sharing the name and extract certain biographical attributes for each person. [details]
  • Task 2: Name ambiguity resolution for Online Reputation Management.
    Given a set of Twitter entries containing an (ambiguous) company name, and given the home page of the company, the task is to discriminate entries that do not refer to the company. Entries will be given in two languages: English and Spanish. [details]

The Cross-lingual Expert Search (CriES) workshop addresses the problem of multilingual expert search in social media environments. The workshop also includes a pilot challenge, which is very much like the expert finding task at the TREC Enterprise track: given a document collection and a query topic, return a ranked list of people who are likely to be experts on the topic. However, the document collection is a multilingual social environment (Yahoo! Answers) and topics come in four different languages (English, German, French, and Spanish).