Krisztian Balog

LHD-11 Call for papers

January 11, 2011 by krisztianbalog

Workshop on Discovering Meaning On the Go in Large Heterogeneous Data 2011 (LHD-11)

Held at the Twenty-second International Joint Conference on Artificial Intelligence (IJCAI-11), July 16, 2011, Barcelona, Spain.

This workshop is designed to bring together people from different fields working in the area of dynamic matching, interpretation, and integration of heterogeneous data, so that ideas, techniques and problems can be shared and discussed in a broad context. A key part of this aim is attracting those from industry as well as those from academia.

In order to interact successfully in an open and heterogeneous environment, it is necessary to be able to dynamically and adaptively integrate data from other systems “on the go”. This may not be a precise process but a matter of finding a good enough understanding to allow interaction to proceed successfully. With the advent of the Web, there are massive amounts of information available online that can assist in this task, but this information is often chaotically organised, stored in a wide variety of data formats, and difficult to interpret.

~~Deadline for abstract submission: March 14, 2011~~
Update: Submission deadline extended to April 4th, 2011


TREC 2010 summary

November 29, 2010 by krisztianbalog

The 19th Text REtrieval Conference (TREC) took place at the “usual” time and place: Gaithersburg, MD, in the second half of November. Seven tracks ran in 2010: Blog, Chemical IR, Entity, Legal, Relevance Feedback, Session, and Web.
The Entity track was very popular, both in terms of the number of participants and the number of posters presented. The proposed approaches were highly diverse, which made the presentations very interesting. Rather than repeat myself, I refer readers to the posts on the Entity website for the conference summary and plans for 2011.
As for TREC 2011, the Chemical IR, Entity, Session, Legal, and Web tracks will continue. The Blog track will become a new Microblog track, which will investigate social search, especially search over Twitter data. Two more new tracks will be added: Crowdsourcing (as a means of evaluation) and Medical records (content-based access to the free text fields of medical records, e.g., find patients with disease X treated with Y). Finally, CMU is planning another Web crawl, a successor to ClueWeb09; one idea is to have a smaller set of pages, but crawled regularly over a period of time.

Hadoop Hackathon @SARA

November 4, 2010 by krisztianbalog

SARA is organizing a kick-off meeting for its Proof-of-Concept Hadoop service on Dec 7, 2010 at the Science Park, Amsterdam. A major part of the event will be a “hackathon”, a hands-on introduction to Hadoop, with the support of two Hadoop experts: Edgar Meij and Djoerd Hiemstra. It’s a good opportunity to learn about Hadoop and play with it on existing datasets (for example Wikipedia, ENRON, or White House access records), or on a case of your choice.

CLEF 2010 Keynote 2

September 21, 2010 by krisztianbalog

The second CLEF 2010 keynote, entitled Retrieval Evaluation in Practice, was given by Ricardo Baeza-Yates. As with yesterday’s keynote, here are my raw notes from the lecture.

CLEF 2010 Keynote 1

September 20, 2010 by krisztianbalog

I am at the CLEF conference this week. Here are my raw, unedited notes from the first keynote, IR Between Science and Engineering, and the Role of Experimentation, by Norbert Fuhr.