Unpacking Semantic Caching at Walmart
Blog post from Portkey
Rohit Chatter, Chief Software Architect at Walmart Tech Global, joined a fireside chat with the LLMs in Prod community to discuss Walmart's adoption of Generative AI and semantic caching in its retail operations.

The conversation traced Walmart's shift from traditional NLP to Generative AI models such as the BERT-based MiniLM-L6-v2, which improve e-commerce search by handling complex, contextually relevant product groupings and strengthening product recommendations. Walmart fine-tunes models such as MiniLMv2 and T0 on customer engagement data to sharpen search relevance for ambiguous queries, and relies on techniques like Approximate Nearest Neighbour (ANN) search for relevance matching.

Walmart also applies semantic caching, clustering queries by conceptual similarity rather than exact text, and reports a cache hit rate of roughly 50%. Remaining challenges include reducing search latency, and planned work spans personalization, voice search, and visual search. The session underscored Walmart's strategies for improving customer experience and the long-term ROI of its Generative AI investments.
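To make the semantic-caching idea concrete, here is a minimal Python sketch that embeds queries with a MiniLM sentence-embedding model and uses an approximate-nearest-neighbour index (FAISS here) to decide whether a new query is conceptually close enough to a cached one. The model name all-MiniLM-L6-v2, the 0.9 similarity threshold, and the helper functions are illustrative assumptions, not details from the talk.

```python
# Minimal semantic-cache sketch: embed incoming queries, look up the
# nearest previously seen query with an ANN index, and serve the cached
# result when similarity clears a threshold. Model, threshold, and
# structure are illustrative assumptions, not Walmart's implementation.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed MiniLM variant
DIM = 384          # embedding width of all-MiniLM-L6-v2
THRESHOLD = 0.9    # assumed cosine-similarity cutoff for a cache hit

index = faiss.IndexFlatIP(DIM)   # inner product == cosine on unit vectors
cached_results: list[str] = []   # results aligned with index rows


def embed(query: str) -> np.ndarray:
    # Normalised embeddings make inner product equal cosine similarity.
    vec = model.encode([query], normalize_embeddings=True)
    return np.asarray(vec, dtype="float32")


def lookup(query: str) -> str | None:
    """Return a cached result if a conceptually similar query was seen."""
    if index.ntotal == 0:
        return None
    scores, ids = index.search(embed(query), k=1)
    if scores[0][0] >= THRESHOLD:
        return cached_results[ids[0][0]]
    return None


def store(query: str, result: str) -> None:
    index.add(embed(query))
    cached_results.append(result)


# Usage: "cheap running shoes" and "affordable sneakers for jogging"
# should cluster together and share one cached response.
store("cheap running shoes", "<search results for running shoes>")
print(lookup("affordable sneakers for jogging"))  # cache hit if similar enough
```

Normalising the embeddings lets an inner-product index double as cosine similarity, which is a common way ANN libraries approximate "conceptual similarity" at scale; in production, a flat index would typically be swapped for an approximate one (e.g. HNSW) to keep lookup latency low.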