The 3rd Semantic Search workshop (SemSearch2010) was held on Monday in conjunction with the WWW2010 conference in Raleigh, NC, USA.
This post is about the highlights of the workshop, with some personal comments at the end. For more information, check the post by Christian Grant, Jeff Dalton, and the #semsearch2010 hashtag on Twitter.
TREC Entity: overview of 2009 and plans for 2010
The Entity track overview paper has been added to the TREC 2009 online Proceedings [direct link to the pdf].
The track continues in 2010. An overview of what happened at the 2009 TREC conference (entity-wise), along with plans for the 2010 edition, has been published on the track’s website. There is some discussion on the mailing list too.
Best @INEX2009 Entity ranking
The UvA ISLA team (consisting of me, Marc Bron, Maarten de Rijke, and Wouter Weerkamp) achieved top performance at the Entity Ranking (XER) track at INEX 2009, on both tasks (entity ranking and list completion). Our submission employed a slightly tweaked variation of the best performing models we describe in our paper entitled Category-based Query Modeling for Entity Search; this work will be presented at ECIR at the end of this month (the paper is available online).
Although we did really well at INEX, our achievement is somewhat weakened by the fact that only 5 teams (including us) participated in the XER track. The number of participating teams was 8 in 2007, 6 in 2008, and 5 in 2009. So, what’s the future of INEX-XER (or INEX for that matter)?
Update Apr 19, 2010: the paper describing our approach @INEX is available online.
Two evaluation campaigns related to entity/expert search
The CLEF 2010 labs will feature two evaluation campaigns that are potentially of interest to people working in the area of entity/people/expert search.
The third WePS Evaluation Workshop (WePS3) focuses on two tasks related to web entity search:
- Task 1: Clustering and Attribute Extraction for Web People Search.
Given a set of web search results for a person name, the task is to cluster the pages according to the different people sharing the name and extract certain biographical attributes for each person. [details]
- Task 2: Name ambiguity resolution for Online Reputation Management.
Given a set of Twitter entries containing an (ambiguous) company name, and given the home page of the company, the task is to discriminate entries that do not refer to the company. Entries will be given in two languages: English and Spanish. [details]
The Cross-lingual Expert Search (CriES) workshop addresses the problem of multilingual expert search in social media environments. The workshop also includes a pilot challenge, which is very much like the expert finding task at the TREC Enterprise track: given a document collection and a query topic, return a ranked list of people who are likely to be experts on the topic. However, the document collection comes from a multilingual social environment (Yahoo! Answers), and topics come in 4 different languages (English, German, French, Spanish).
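To make the shape of the expert finding task concrete, here is a minimal sketch in Python. It is not the CriES baseline or any participant's method: the function name `rank_experts`, the (author, text) input format, and the simple term-overlap score are all assumptions chosen for illustration, and the sketch ignores the cross-lingual aspect entirely (query and documents are assumed to be in the same language).

```python
from collections import defaultdict


def rank_experts(documents, query, top_k=10):
    """Rank candidate experts for a query topic.

    documents: iterable of (author, text) pairs (hypothetical input format).
    query: the topic as a plain string.
    Returns a list of (author, score) pairs, best first.
    """
    query_terms = set(query.lower().split())
    scores = defaultdict(float)

    for author, text in documents:
        tokens = text.lower().split()
        if not tokens:
            continue
        # Toy relevance score: fraction of document tokens that are query terms.
        matches = sum(1 for token in tokens if token in query_terms)
        # Each document contributes its score to the person associated with it.
        scores[author] += matches / len(tokens)

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [(author, score) for author, score in ranked if score > 0][:top_k]


if __name__ == "__main__":
    docs = [
        ("alice", "entity search ranking models"),
        ("bob", "holiday photos"),
        ("alice", "search evaluation"),
        ("carol", "search engines"),
    ]
    for author, score in rank_experts(docs, "entity search"):
        print(author, round(score, 2))
```

The idea is simply that people accumulate evidence from the documents they are associated with; a real system would replace the toy score with a proper retrieval model and would need some form of translation or language-independent representation to handle topics in four languages.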
Work hard, relax hard!
This picture gives you a bit of an idea of what my week in the run-up to the SIGIR deadline looked like (I also have a nice collection of energy drink cans at the office). But that is finally over, and I managed to get my submissions in on time. That means I’m done with the work hard part; now it’s time for some serious relaxing and recharging of batteries.
I’ll be on vacation until Feb 10, and this time, I won’t be checking my email. If anything urgent comes up… it will have to wait; I’ll deal with it when I’m back.
I’m off to relax!