SIGIR ’25 contributions

I’m happy to share that I’ll be attending SIGIR ’25, which is shaping up to be a busy and exciting event.

Accepted papers:

  • “Rankers, Judges, and Assistants: Towards Understanding the Interplay of LLMs in Information Retrieval Evaluation” — perspectives paper with Don Metzler and Zhen Qin [PDF]
  • “GINGER: Grounded Information Nugget-Based Generation of Responses” — short paper with W. Łajewska [PDF]
  • “MultiConAD: A Unified Multilingual Conversational Dataset for Early Alzheimer’s Detection” — resource paper with Arezo Shakeri and Mina Farmanbar [PDF]

In addition to the papers, I’ll also be giving a tutorial, together with Nolwenn Bernard, Saber Zerhoudi, and ChengXiang Zhai, on “Theory and Toolkits for User Simulation in the Era of Generative AI: User Modeling, Synthetic Data Generation, and System Evaluation” [website]. The tutorial covers key simulation methodologies, with a particular focus on recent advancements leveraging LLMs. Crucially, we will also provide practical guidance, highlighting relevant toolkits, libraries, and datasets available to researchers and practitioners.

Finally, I’m co-organizing the Second SIGIR Workshop on Simulations for Information Access (Sim4IA 2025) together with Philipp Schaer, Christin Katharina Kreutz, Timo Breuer, and Andreas Konstantin Kruff [website]. The workshop features a keynote, invited tech talks, a panel discussion, and (micro) shared tasks for simulating interactions with a traditional search engine or a conversational assistant.

If you’re attending the conference, please come say hello, drop into the tutorial or workshop, or reach out ahead of time—I’d love to connect.

User Simulation tutorial at AAAI’24 and WWW’24

Together with ChengXiang Zhai, I will be giving our user simulation tutorial at AAAI’24 and WWW’24, customized to the respective audiences. The AAAI’24 edition adopts a broader perspective on user simulation for evaluating an interactive AI system and focuses more on simulation algorithms and techniques that are well connected with various sub-fields of AI, such as machine learning and agent-based systems. In contrast, the WWW’24 edition places more emphasis on applications of user simulation for evaluating Web information access systems, including click modeling and the application of simulation techniques in e-commerce.

Event reports

The June 2022 issue of SIGIR Forum reports on two events that I co-organized:

TREC 2010 summary

The 19th Text REtrieval Conference (TREC) took place at the “usual” time and place: Gaithersburg, MD, in the second half of November. Seven tracks ran in 2010: Blog, Chemical IR, Entity, Legal, Relevance Feedback, Session, and Web.

The Entity track was very popular, both in terms of the number of participants and the number of posters presented. The proposed approaches were highly diverse, which made the presentations very interesting. I don’t want to repeat myself, so I refer to the posts on the Entity website for the conference summary and plans for 2011.

As for TREC 2011, the Chemical IR, Entity, Session, Legal, and Web tracks will continue. The Blog track will migrate to a new Microblog track and will investigate social search, especially search over Twitter data. Two more new tracks will be added: Crowdsourcing (as a means of evaluation) and Medical records (content-based access to the free text fields of medical records, e.g., find patients with disease X treated with Y). Finally, CMU is planning another Web crawl, a successor to ClueWeb09; one idea is to have a smaller set of pages, but crawled regularly over a period of time.

CLEF 2010 Keynote 2

The second CLEF 2010 keynote, entitled Retrieval Evaluation in Practice, was given by Ricardo Baeza-Yates. As with yesterday’s keynote, here are my raw notes from the lecture.
