TREC 2008, 2009
The ILPS group of the University of Amsterdam participated in three tracks at TREC 2008: blog, enterprise, and relevance feedback. The working notes paper describing our approaches is available online.
Results for the Enterprise track were not available at the time of writing, so the paper reports only the runs we submitted.
At the conference (November 19-21, Gaithersburg, USA), I will present some interesting findings we came across concerning the expert finding task. I hope that the title of my talk sounds promising: Now that you’ve bought into Model 2, we’ll tell you why to get Model 1.
After four successful years, the Enterprise track is coming to an end. Personally, I am extremely grateful to the TRECENT organizers (Peter Bailey, Nick Craswell, Ian Soboroff, Paul Thomas, and Arjen P. de Vries, strictly in alphabetical order) for coordinating the track and making this platform available to the research community!
A new track, Entity Ranking, will run from 2009; I’m co-organizing it with Arjen P. de Vries, Paul Thomas, and Thijs Westerveld. I’m not supposed to share details about it at this point, but it’ll have something to do with searching for entities (such as people) in web data. We hope to attract participants who would have performed expert finding at the Enterprise track… specifics should follow after TREC.
Thesis resources #1: CSIRO candidates and associations
As promised before, it’s now time to start sharing some resources that I obtained during my thesis work. This first release contains two CSIRO-related items: the list of CSIRO candidates (e-mail addresses) and a list of document-candidate associations.
I was actually keen to make these available before the submission deadline for the Expert Search runs at the TREC 2008 Enterprise track. These lists are, of course, far from perfect, but they worked quite well for me. If you have comments, suggestions, improved versions, etc., feel free to contact me!
The files are available at the same place as the CERC collection (so you’ll need the same username and password): http://es.csiro.au/cerc/data/balog. Thanks to Paul Thomas for arranging the hosting!
TREC Enterprise preparations
While preparing for the TREC conference, I spent quite some time this week running additional experiments on the CSIRO collection. There were difficulties along the way, but in the end the road led to compelling results and some lessons learned. I will present our group’s results next Friday at TREC and will, of course, put the slides online after the talk.
While the problem of expertise management is not new in the Knowledge Management field, it was the introduction of the Expert search task at TREC 2005 that attracted the attention of the IR community. This interest increased in 2006, and I expect it has grown further this year. When I started my PhD in September 2005, I immediately began working on expert finding, and looking back over the past two years, I feel I bet on the right horse.
Turning back to the TREC Enterprise track: the Discussion search task, featured in 2005 and 2006, received much less interest. In fact, it was replaced with a Document search task this year. While document search may be considered less innovative, I do find the new task interesting for the following reasons:
- the (new) CSIRO collection is quite different from both the W3C and the WWW settings;
- topics are broader than in the case of W3C, and topic definitions include example documents; and last but not least,
- the same set of topics is used in document and expert search.
I am looking forward to hearing how others tackled this year’s Enterprise search tasks and what the organizers’ plans are for next year, and of course I am curious to see how our approaches performed compared to those of other participants. While W3C became the de facto standard evaluation set for expert search, it represents only one type of intranet. Therefore, I hope that forthcoming publications on expert search will include results using CSIRO and the TREC 2007 platform too.