Fine-Tuning Sparse Embeddings for E-Commerce Search | Part 2: Training SPLADE on Modal
Blog post from Qdrant
This article, part of a series on fine-tuning sparse embeddings for e-commerce search, focuses on training the SPLADE model on Amazon's ESCI dataset using Modal's serverless GPUs. The ESCI dataset is notable for its graded relevance labels, which let the model learn nuanced product matches by treating both exact and substitute products as relevant during training. SPLADE's performance also depends on careful product text formatting, using specific field tokens to preserve lexical signals.

The training process builds a SparseEncoder from a DistilBERT base model and optimizes query and product embeddings with a contrastive loss combined with sparsity regularization.

The article emphasizes Modal's infrastructure for efficient training, highlighting persistent storage for managing checkpoints and detached runs for preventing data loss. It also warns against replacing transformers with static embeddings, which led to poor results because it discards the contextual understanding essential for e-commerce queries.