PhD thesis online

My PhD thesis, titled People Search in the Enterprise, is now available online. Contact me if you would like a paperback copy!

The contributions of the thesis include a collection of resources: software code as well as data. These will be published in several releases, starting very soon…

Thesis approved

I am happy to announce that my PhD thesis titled People Search in the Enterprise has been approved by the committee. The public PhD defense will take place on the 30th of September, 2008.
The final version of the thesis is planned to be made available online in early July 2008.

Thesis completed

I am happy to announce that my thesis titled People Search in the Enterprise has been completed and submitted to the committee.

The thesis focuses on two main expertise retrieval tasks: (1) expert finding, which identifies a list of people who are knowledgeable about a given topic (“Who are the experts on topic X?”), and (2) expert profiling, which returns a list of topics that a person is knowledgeable about (“What topics does person Y know about?”). In the thesis, expertise retrieval is approached as an association finding task between people and topics.

The main contribution of the thesis is a generative probabilistic modeling framework that captures the expert finding and profiling tasks in a uniform way. On top of this general framework, two main families of models are introduced by adapting generative language modeling techniques for document retrieval in a transparent and theoretically sound way.
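
To make the association-finding view a bit more concrete, here is a minimal Python sketch. It is not the model from the thesis. It only illustrates the general idea that a single person-topic association score, estimated with a smoothed document language model, can serve both tasks. The toy documents, the names in `AUTHORS`, the uniform weighting of a person's documents, the smoothing weight `LAMBDA`, and all function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy data: documents and the people associated with them.
DOCS = {
    "d1": "language models for information retrieval",
    "d2": "expert finding with language models",
    "d3": "semantic web ontologies",
}
AUTHORS = {"d1": ["alice"], "d2": ["alice", "bob"], "d3": ["bob", "carol"]}
LAMBDA = 0.5  # assumed smoothing weight, not a value from the thesis

TOKENS = {doc: text.split() for doc, text in DOCS.items()}
COLLECTION = Counter(t for toks in TOKENS.values() for t in toks)
COLLECTION_LEN = sum(COLLECTION.values())


def p_term_given_doc(term, doc):
    """Document language model, smoothed with collection statistics."""
    toks = TOKENS[doc]
    p_doc = toks.count(term) / len(toks)
    p_collection = COLLECTION[term] / COLLECTION_LEN
    return (1 - LAMBDA) * p_doc + LAMBDA * p_collection


def p_topic_given_doc(topic, doc):
    """Probability that the document's language model generates the topic terms."""
    p = 1.0
    for term in topic.split():
        p *= p_term_given_doc(term, doc)
    return p


def association(topic, person):
    """Person-topic score: average over the person's documents (uniform weights assumed)."""
    docs = [d for d, people in AUTHORS.items() if person in people]
    return sum(p_topic_given_doc(topic, d) for d in docs) / len(docs)


def find_experts(topic):
    """Expert finding: rank people for a given topic."""
    people = {p for ps in AUTHORS.values() for p in ps}
    return sorted(people, key=lambda p: association(topic, p), reverse=True)


def profile(person, topics):
    """Expert profiling: rank candidate topics for a given person."""
    return sorted(topics, key=lambda t: association(t, person), reverse=True)


if __name__ == "__main__":
    print(find_experts("language models"))
    print(profile("carol", ["language models", "semantic web"]))
```

Both `find_experts` and `profile` rank by the same `association` function, which is the sense in which the two tasks can be handled uniformly. In this sketch the profiling scores are only comparable across topics with the same number of terms; a real system would need a more careful estimate.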

Throughout the thesis we extensively evaluate and compare these baseline models across different organizational settings, and systematically explore and analyze the experimental results. We show that our baseline models are robust and deliver very competitive performance.

Through a series of examples we demonstrate that our generic models are able to incorporate and exploit special characteristics and features of test collections and/or the organizational settings that they represent. Additionally, we address a number of related tasks, including finding similar experts, mining contact details of people, and enterprise document search.

Finally, we provide further examples that illustrate the generic nature of our baseline models and apply them to find associations between topics and entities other than people.

Assuming that the committee approves it, the thesis will be printed in early June 2008.