PhD Thesis

People Search in the Enterprise

PhD Thesis, University of Amsterdam, 2008
ISBN: 978-90-9023247-8

download PDF [3.7 MB]


Within an organizational setting, it is natural to look not only for documents, but also for entities: answers, services, objects, … people! The work described in this thesis focuses on core algorithms for two information access tasks: expert finding and profiling.

The main contribution of the thesis is a generative probabilistic modeling framework that captures the expert finding and profiling tasks in a uniform way. On top of this general framework, two main families of models are introduced by adapting generative language modeling techniques for document retrieval in a transparent and theoretically sound way.

Throughout the thesis we evaluate and compare these models across different organizational settings, and systematically explore and analyze the experimental results obtained. Through a series of examples we demonstrate that these models can incorporate and exploit the special characteristics and features of various organizational settings. Finally, we provide further examples that illustrate the generic nature of these models, applying them to find associations between topics and entities other than people.

In the news

My thesis work has received press coverage around the world.


I received the Best Doctoral Consortium Paper Award at the ACM SIGIR Conference in 2007 for my dissertation topic, and the Victorine van Schaickprijs in 2009 for my PhD thesis.


The contributions of the thesis include a collection of resources, comprising both software code and data. These will be made available in several releases.

The resources available so far are:

Additional resources that may end up here at some point:

  • lists of document-candidate associations for the W3C collection;
  • baseline runs reported in the thesis in TREC format, along with the corresponding EARS configuration settings.