I’m excited to share some recent research we’ve been doing in the areas of user simulation, recommender systems, and explainability. The following papers will be presented at the upcoming EACL and ECIR conferences. Importantly, all these papers come with publicly available resources!
- ConvApparel: A Benchmark Dataset and Validation Framework for User Simulators in Conversational Recommenders (EACL full paper, with Google colleagues O. Meshi, S. Goldman, A. Caciularu, G. Tennenholtz, J. Jeong, A. Globerson, and C. Boutilier) — This work proposes a comprehensive validation framework for user simulators, combining statistical alignment, a human-likeness score, and counterfactual validation.
- Trust Me on This: A User Study of Trustworthiness for RAG Responses (ECIR short paper, with W. Łajewska) — This study investigates how different types of explanations can influence user trust in a RAG setting.
- UserSimCRS v2: Simulation-Based Evaluation for Conversational Recommender Systems (ECIR resource paper, with N. Bernard) — This paper presents significant extensions to the UserSimCRS toolkit, including LLM-based simulators, support for a wider range of CRSs and datasets, and new evaluation metrics and utilities.
- Sim4IA-Bench: A User Simulation Benchmark Suite for Next Query and Utterance Prediction (ECIR resource paper, with A. K. Kruff, C. K. Kreutz, T. Breuer, and P. Schaer) — This work presents a user simulation benchmark that grew out of the micro shared tasks we ran at the Sim4IA workshop @SIGIR2025.
- SciNUP: Natural Language User Interest Profiles for Scientific Literature Recommendation (ECIR resource paper, with M. Arustashvili) — This paper introduces a synthetic dataset for NL profile-based recommendation in the scholarly domain.