
Pretrained Transformer Language Models for Search - part 3

Blog post from Vespa

Post Details
Company: Vespa
Date Published: -
Author: Jo Kristian Bergum
Word Count: 2,014
Language: English
Hacker News Points: -
Summary

The blog post explores using the ColBERT model in a multi-phase retrieval and ranking pipeline built on Vespa.ai to improve search results. By implementing ColBERT, a contextualized late-interaction model, as a re-ranking phase over a dense retriever, the post demonstrates close to state-of-the-art ranking accuracy on the MS MARCO Passage ranking dataset using a compact transformer model with only 22 million parameters. ColBERT scores relevancy with a MaxSim function that computes the cosine similarity between query and document token embeddings, enabling efficient re-ranking of the documents retrieved in the first phase. The post also discusses the technical details of Vespa's implementation, including tensor fields for storing the token embeddings, the ONNX format for representing the query encoder, and the ability to trade accuracy against cost by adjusting retrieval settings. It concludes by hinting at the next entry in the series, which will introduce a cross-encoder model for an additional re-ranking step.
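
As a rough illustration of the MaxSim scoring described above, the sketch below computes a relevancy score for a single document from its query and document token embeddings. The function name, array shapes, and the assumption that embeddings are L2-normalized are illustrative choices, not Vespa's or ColBERT's actual API.

import numpy as np

def maxsim_score(query_embeddings: np.ndarray, doc_embeddings: np.ndarray) -> float:
    """Illustrative MaxSim relevancy score (a sketch, not Vespa's implementation).

    Assumes both matrices hold L2-normalized token embeddings, so the dot
    product of two rows equals their cosine similarity.
      query_embeddings: shape (num_query_tokens, dim)
      doc_embeddings:   shape (num_doc_tokens, dim)
    """
    # Cosine similarity between every query token and every document token.
    similarities = query_embeddings @ doc_embeddings.T
    # For each query token, keep its best-matching document token,
    # then sum those maxima to obtain the document-level score.
    return float(similarities.max(axis=1).sum())

# Hypothetical usage: re-rank first-phase candidates by their MaxSim score.
# reranked = sorted(candidates,
#                   key=lambda d: maxsim_score(query_emb, d.token_embeddings),
#                   reverse=True)

In the pipeline described by the post, this kind of scoring is applied only to the documents surviving the first retrieval phase, which is what keeps the late-interaction step affordable.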