Pretrained Transformer Language Models for Search - part 1
Blog post from Vespa
Pre-trained Transformer language models, in particular BERT, have significantly advanced text ranking and search, as illustrated by their impact on the MS Marco Passage Ranking dataset. This blog series explores how Vespa.ai employs these models in multi-phase retrieval and ranking pipelines, achieving near state-of-the-art results with compact models that outperform larger ensemble models. The series introduces three key methods for applying Transformers to text ranking: representation-based ranking, all-to-all interaction models, and late interaction models exemplified by ColBERT. All of these models must be fine-tuned on training data for the retrieval or ranking task at hand.

The series also emphasizes the shift in information retrieval towards neural methods, often called the "BERT revolution," and highlights the importance of efficient retrieval strategies, such as hybrid dense-sparse retrieval, for keeping the computational cost of multi-stage pipelines manageable. Throughout, the MS Marco dataset serves as the benchmark for evaluating these approaches, with mean reciprocal rank at 10 (MRR@10) as the primary metric.
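To make the three method families concrete, here is a minimal PyTorch/Hugging Face sketch of how each one scores a query-passage pair. The function names, tensor shapes, and the example cross-encoder checkpoint are illustrative assumptions, not the implementation the series itself describes.

```python
# Illustrative sketches only: names, shapes, and the model checkpoint below are
# assumptions for demonstration, not Vespa's implementation.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification


def representation_score(q_vec: torch.Tensor, p_vec: torch.Tensor) -> torch.Tensor:
    # Representation-based ranking: query and passage are each encoded into a
    # single vector (passages typically offline), and relevance is a cheap
    # similarity such as the dot product.
    return torch.dot(q_vec, p_vec)


def late_interaction_score(q_terms: torch.Tensor, p_terms: torch.Tensor) -> torch.Tensor:
    # ColBERT-style late interaction: per-term embeddings for the query (n_q x d)
    # and the passage (n_p x d). For each query term take the maximum similarity
    # over passage terms (MaxSim), then sum over query terms.
    sim = q_terms @ p_terms.T              # (n_q, n_p) term-to-term similarities
    return sim.max(dim=1).values.sum()


def cross_encoder_score(query: str, passage: str,
                        model_name: str = "cross-encoder/ms-marco-MiniLM-L-6-v2") -> float:
    # All-to-all interaction: query and passage are fed through the Transformer
    # together, so every query token attends to every passage token. This is the
    # most accurate but also the most expensive, since the full model runs per pair.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    inputs = tokenizer(query, passage, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**inputs).logits.squeeze().item()
```

The ordering roughly trades efficiency for accuracy, which is why all-to-all interaction is usually reserved for re-ranking a small candidate set produced by a cheaper first phase in a multi-stage pipeline.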
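Since MRR@10 is the metric used throughout, a small sketch of its computation may help; the input structures here are hypothetical.

```python
def mrr_at_10(ranked_ids_per_query, relevant_ids_per_query):
    # Mean reciprocal rank at cutoff 10: for each query, take 1/rank of the first
    # relevant passage among the top 10 results (0 if none), then average over queries.
    total = 0.0
    for ranked, relevant in zip(ranked_ids_per_query, relevant_ids_per_query):
        for rank, passage_id in enumerate(ranked[:10], start=1):
            if passage_id in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_ids_per_query)
```

For example, a query whose first relevant passage appears at rank 3 contributes 1/3 to the average.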