EACL paper featured on the Google Research Blog

I’m excited to share that our EACL 2026 paper has been featured on the Google Research Blog!

We explore how to move beyond simple performance metrics to ensure that simulated users actually behave like real ones, and we introduce a unique dual-agent data collection protocol that enables counterfactual validation. We also publicly release a new dataset of 4k+ human-AI shopping conversations.

Read the full deep-dive here: https://research.google/blog/convapparel-measuring-and-bridging-the-realism-gap-in-user-simulators/

CACM Opinion piece available online

I’m happy to share that our latest opinion piece, “The Indispensable Role of User Simulation in the Pursuit of AGI,” is now available in Communications of the ACM.

In this article, we argue that the path to Artificial General Intelligence (AGI) is currently blocked by two major bottlenecks: the lack of scalable evaluation and the scarcity of high-quality interaction data. We propose that user simulation is not just a helpful tool, but a critical catalyst for overcoming these challenges.

Read the full piece here: https://cacm.acm.org/opinion/the-indispensable-role-of-user-simulation-in-the-pursuit-of-agi/

EACL’26 and ECIR’26 papers

I’m excited to share some recent research we’ve been doing in the areas of user simulation, recommender systems, and explainability. The following papers will be presented at the upcoming EACL and ECIR conferences. Importantly, all these papers come with publicly available resources!

SIGIR’25 contributions

I’m happy to share that I’ll be attending SIGIR’25, which is shaping up to be a busy and exciting event.

Accepted papers:

  • “Rankers, Judges, and Assistants: Towards Understanding the Interplay of LLMs in Information Retrieval Evaluation” — perspectives paper with Don Metzler and Zhen Qin [PDF]
  • “GINGER: Grounded Information Nugget-Based Generation of Responses” — short paper with W. Łajewska [PDF]
  • “MultiConAD: A Unified Multilingual Conversational Dataset for Early Alzheimer’s Detection” — resource paper with Arezo Shakeri and Mina Farmanbar [PDF]

In addition to the papers, I’ll also be giving a tutorial, together with Nolwenn Bernard, Saber Zerhoudi, and ChengXiang Zhai, on “Theory and Toolkits for User Simulation in the Era of Generative AI: User Modeling, Synthetic Data Generation, and System Evaluation” [website]. The tutorial covers key simulation methodologies, with a particular focus on recent advancements leveraging LLMs. Crucially, we will also provide practical guidance, highlighting relevant toolkits, libraries, and datasets available to researchers and practitioners.

Finally, I’m co-organizing the Second SIGIR Workshop on Simulations for Information Access (Sim4IA 2025) together with Philipp Schaer, Christin Katharina Kreutz, Timo Breuer, and Andreas Konstantin Kruff [website]. The workshop features a keynote, invited tech talks, a panel discussion, and (micro) shared tasks for simulating interactions with a traditional search engine or a conversational assistant.

If you’re attending the conference, please come say hello, drop into the tutorial or workshop, or reach out ahead of time—I’d love to connect.

PhD position in Large Language Models for Recommendation

I have a PhD position in Large Language Models for Recommendation, funded by the NorwAI research-based innovation center.

The proposed PhD project aims to advance the field of personalized recommender systems by harnessing the natural language reasoning capabilities of large language models (LLMs). The research will focus on three key areas:

  1. developing methods to construct natural language user interest profiles that enhance transparency and provide user control over recommendations; 
  2. designing conversational recommendation systems that utilize LLMs to effectively elicit user preferences and generate tailored responses, including both recommendations and explanations; and
  3. developing approaches to mitigate limitations of LLMs through retrieval-augmented generation (RAG) and tool use.

Overall, this project seeks to push the boundaries of how LLMs can be applied to create more intuitive, responsive, and user-centered recommendation systems.

See the details on Jobbnorge. The application deadline is Oct 31.