Research Theme: Language Modeling

This research theme focuses on the development and application of (large and small) language models for information access, information extraction, and evaluation.

Keynotes, invited talks, and lectures

A Simple Introduction to Word Embeddings

Northeastern University
Seattle, USA, April 2016
SlideShare | PPT

Invited participation

Dagstuhl Seminar on Retrieval-Augmented Generation – The Future of Search?, September 2025
Future of Information Retrieval Research in the Age of Generative AI (report), by Computing Community Consortium, July 2024

Workshop organization

VulGen: International Workshop on Vulnerabilities in Generative Systems for Information Retrieval, SIGIR, July 2026
LLM4Eval: Large Language Model for Evaluation in IR, SIGIR, July 2025
LLM4Eval: Large Language Model for Evaluation in IR, WSDM, March 2025
LLM4Eval: Large Language Model for Evaluation in IR, SIGIR, July 2024

Publications

Judging the Judges: A Collection of LLM-Generated Relevance Judgements

Hossein A. Rahmani, Clemencia Siro, Mohammad Aliannejadi, Nick Craswell, Charles L. A. Clarke, Guglielmo Faggioli, Bhaskar Mitra, Paul Thomas, and Emine Yilmaz
Preprint, 2025
PDF | ArXiv

Retrieval-Augmented Generation – The Future of Search? (Dagstuhl Perspectives Workshop 25391)

Matthias Hagen, Josiane Mothe, Smaranda Muresan, Martin Potthast, Min Zhang, Benno Stein, Qingyao Ai, Mohammad Aliannejadi, Liesbeth Allein, Avishek Anand, Sophia Althammer, Nolwenn Bernard, Arjen P. de Vries, Niklas Deckers, Gianluca Demartini, Laura Dietz, Carsten Eickhoff, Maik Fröbe, Norbert Fuhr, Marcel Gohsen, Michael Granitzer, Faegheh Hasibi, Sebastian Heineking, Djoerd Hiemstra, Adam Jatowt, Abhinav Joshi, Johannes Kiesel, Wojciech Kusa, Sean MacAvaney, Bhaskar Mitra, Jian-Yun Nie, Heather O’Brien, Birte Platow, Mark Sanderson, Harrisen Scells, Damiano Spina, Johanne Trippas, Stefan Voigt, and Guido Zuccon
Dagstuhl Reports (to appear), 2025
PDF

Towards Understanding Bias in Synthetic Data for Evaluation

Hossein A. Rahmani, Varsha Ramineni, Nick Craswell, Bhaskar Mitra, and Emine Yilmaz
In proc. ACM CIKM, 2025
Publication | PDF | ArXiv

LLM4Eval: Large Language Model for Evaluation in IR

Clemencia Siro, Hossein A. Rahmani, Mohammad Aliannejadi, Nick Craswell, Charles L. A. Clarke, Guglielmo Faggioli, Bhaskar Mitra, Paul Thomas, and Emine Yilmaz
In proc. ACM SIGIR, 2025
Publication | PDF

JudgeBlender: Ensembling Automatic Relevance Judgments

Hossein A. Rahmani, Emine Yilmaz, Nick Craswell, and Bhaskar Mitra
In proc. ACM TheWebConf, 2025
Publication | PDF | ArXiv

SynDL: A Large-Scale Synthetic Test Collection for Passage Retrieval

Hossein A. Rahmani, Xi Wang, Emine Yilmaz, Nick Craswell, Bhaskar Mitra, and Paul Thomas
In proc. ACM TheWebConf, 2025
Publication | PDF | ArXiv

LLM4Eval@WSDM 2025: Large Language Model for Evaluation in Information Retrieval

Hossein A. Rahmani, Clemencia Siro, Mohammad Aliannejadi, Nick Craswell, Charles L. A. Clarke, Guglielmo Faggioli, Bhaskar Mitra, Paul Thomas, and Emine Yilmaz
In proc. ACM WSDM, 2025
Publication | PDF

LLMJudge: LLMs for Relevance Judgments

Hossein A. Rahmani, Emine Yilmaz, Nick Craswell, Bhaskar Mitra, Paul Thomas, Charles L. A. Clarke, Mohammad Aliannejadi, Clemencia Siro, and Guglielmo Faggioli
In proc. LLM4Eval: The First Workshop on Large Language Models for Evaluation in Information Retrieval, ACM SIGIR, 2024
Publication | PDF | ArXiv

Proceedings of The First Workshop on Large Language Models for Evaluation in Information Retrieval (LLM4Eval 2024)

Clemencia Siro, Mohammad Aliannejadi, Hossein A. Rahmani, Nick Craswell, Charles L. A. Clarke, Guglielmo Faggioli, Bhaskar Mitra, Paul Thomas, and Emine Yilmaz
Proceedings

Report on the 1st Workshop on Large Language Model for Evaluation in Information Retrieval (LLM4Eval 2024) at SIGIR 2024

Hossein A. Rahmani, Clemencia Siro, Mohammad Aliannejadi, Nick Craswell, Charles L. A. Clarke, Guglielmo Faggioli, Bhaskar Mitra, Paul Thomas, and Emine Yilmaz
In ACM SIGIR Forum, 2024
Publication | PDF | ArXiv

LLM4Eval: Large Language Model for Evaluation in IR

Hossein A. Rahmani, Clemencia Siro, Mohammad Aliannejadi, Nick Craswell, Charles L. A. Clarke, Guglielmo Faggioli, Bhaskar Mitra, Paul Thomas, and Emine Yilmaz
In proc. ACM SIGIR, 2024
Publication | PDF

Synthetic Test Collections for Retrieval Evaluation

Hossein A. Rahmani, Nick Craswell, Emine Yilmaz, Bhaskar Mitra, and Daniel Campos
In proc. ACM SIGIR, 2024
Publication | PDF | ArXiv

Large Language Models can Accurately Predict Searcher Preferences

Paul Thomas, Seth Spielman, Nick Craswell, and Bhaskar Mitra
In proc. ACM SIGIR, 2024
Publication | PDF | ArXiv

Learning to Extract Structured Entities Using Language Models

Haolun Wu, Ye Yuan, Liana Mikaelyan, Alexander Meulemans, Xue Liu, James Hensman, and Bhaskar Mitra
In proc. EMNLP, 2024
Publication | PDF | ArXiv

Bhaskar Mitra | ভাস্কর মিত্র
