Improving Product Search with Learning to Rank - part one
Blog post from Vespa
In this post, the author explores the shift in product search ranking brought about by pre-trained language models such as BERT, which have outperformed traditional statistical methods when training data is plentiful. The focus is on using learning-to-rank techniques to improve product search, demonstrated on a large Amazon dataset of complex search queries with relevance judgments. The post evaluates several zero-shot baseline ranking models: traditional lexical models such as BM25 and Vespa's nativeRank, semantic models built on dense vector embeddings, and hybrid models that combine lexical and semantic signals. The results show that dense vector models struggle in zero-shot settings, while lexical methods remain strong baselines. The author notes that future posts will cover training more sophisticated models on the labeled dataset, including semantic similarity models, gradient boosting, and ensemble models within the Vespa platform.
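To make the hybrid idea concrete, below is a minimal Python sketch that fuses a hand-rolled BM25 score with a dense-vector cosine similarity through linear interpolation. The toy corpus, query, three-dimensional "embeddings", and the `alpha` fusion weight are all illustrative assumptions, not taken from the post or the Amazon dataset; the post itself presumably expresses such ranking inside Vespa rank profiles rather than in standalone Python.

```python
import math
from collections import Counter

# Sketch of hybrid ranking: combine a lexical BM25 score with a
# dense-vector cosine similarity. All data below is toy data,
# not from the post's Amazon dataset.

def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score one document (a list of tokens) against a query with classic BM25."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        freq = tf[term]
        norm = freq * (k1 + 1) / (freq + k1 * (1 - b + b * len(doc_terms) / avg_len))
        score += idf * norm
    return score

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def hybrid_score(lexical, semantic, alpha=0.5):
    # Linear interpolation is one common fusion scheme; the post's
    # exact formula may differ.
    return alpha * lexical + (1 - alpha) * semantic

corpus = [
    "wireless noise cancelling headphones".split(),
    "wired earbuds with microphone".split(),
]
query = "noise cancelling headphones".split()

# Toy 3-d vectors standing in for real encoder outputs, which would
# have hundreds of dimensions in practice.
query_emb = [0.9, 0.1, 0.2]
doc_embs = [[0.8, 0.2, 0.1], [0.1, 0.9, 0.3]]

for doc, emb in zip(corpus, doc_embs):
    lex = bm25_score(query, doc, corpus)
    sem = cosine(query_emb, emb)
    print(" ".join(doc), "->", round(hybrid_score(lex, sem), 3))
```

In a zero-shot setting like the one the post evaluates, the fusion weight cannot be tuned on labeled data, which is part of why the post finds purely lexical baselines so competitive.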