Improving Product Search with Learning to Rank - part two
Blog post from Vespa
Continuing the series on improving product search with learning to rank, this post explores how labeled relevance judgments can be used to train deep neural ranking models, focusing on cross-encoder and bi-encoder methods built on pre-trained Transformer language models. The dataset is split into training and development sets to avoid overfitting, and model quality is measured with the Normalized Discounted Cumulative Gain (NDCG) metric.

The two architectures differ in how they process the input: the cross-encoder feeds the query and the product text through the Transformer jointly, while the bi-encoder encodes them independently into a shared embedding space. Both models improve over zero-shot baselines, but the cross-encoder delivers the better ranking accuracy.

The post also weighs model accuracy against deployment cost, discusses which product fields to use as model input, and details how both models are implemented in Vespa for efficient product ranking. It closes by pointing to future work that combines the neural models with lexical and statistical ranking features.
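Since NDCG is the evaluation metric throughout, a minimal sketch of how it is computed may help. The relevance grades below are hypothetical, and the mapping from judgment labels to numeric gains is an assumption for illustration, not taken from the post:

```python
import math

def dcg(grades):
    """Discounted cumulative gain: the grade at rank i is discounted by log2(i + 2)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(grades))

def ndcg_at_k(ranked_grades, k=10):
    """NDCG@k: DCG of the produced ranking divided by DCG of the ideal ordering."""
    ideal = dcg(sorted(ranked_grades, reverse=True)[:k])
    return dcg(ranked_grades[:k]) / ideal if ideal > 0 else 0.0

# Hypothetical graded judgments for one query, listed in model-ranked order.
print(round(ndcg_at_k([4, 0, 3, 1, 2], k=5), 4))
```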
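To make the architectural difference concrete, here is a hedged sketch of cross-encoder scoring using the sentence-transformers library. The checkpoint name and example texts are placeholders; the post fine-tunes its own cross-encoder on the labeled judgments rather than using this off-the-shelf model:

```python
from sentence_transformers import CrossEncoder

# Placeholder checkpoint, not the model trained in the post.
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "espresso machine with milk frother"
products = [
    "Stainless steel espresso maker with built-in steam wand",
    "12-cup programmable drip coffee machine",
]

# Each (query, product) pair runs through the Transformer jointly, so the
# score captures full term-level interaction -- accurate, but costly, since
# every candidate requires a forward pass at query time.
scores = model.predict([(query, p) for p in products])
for product, score in sorted(zip(products, scores), key=lambda x: -x[1]):
    print(f"{score:.3f}  {product}")
```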
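The bi-encoder, by contrast, encodes queries and products independently, which is what makes it cheaper to deploy: product embeddings can be computed once offline and indexed. Again a sketch with a placeholder checkpoint, not the bi-encoder trained in the post:

```python
from sentence_transformers import SentenceTransformer, util

# Placeholder checkpoint, not the model trained in the post.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

products = [
    "Stainless steel espresso maker with built-in steam wand",
    "12-cup programmable drip coffee machine",
]

# Product embeddings do not depend on the query, so they can be computed
# offline and stored (e.g. in a Vespa tensor field) ahead of serving.
product_embs = model.encode(products, convert_to_tensor=True)
query_emb = model.encode("espresso machine with milk frother", convert_to_tensor=True)

# At query time, ranking reduces to a similarity computation in embedding space.
scores = util.cos_sim(query_emb, product_embs)[0]
for product, score in sorted(zip(products, scores.tolist()), key=lambda x: -x[1]):
    print(f"{score:.3f}  {product}")
```

This independence is the core of the accuracy-versus-cost trade-off the post discusses: the bi-encoder gives fast, precomputable scoring, while the cross-encoder pays per-query inference cost for its superior ranking quality.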