
Embedding Inference at Scale for RAG Applications with Ray Data and Milvus

Blog post from Zilliz

Post Details
Company: Zilliz
Date Published: -
Author: Christy Bergman and Cheng Su
Word Count: 1,761
Language: English
Hacker News Points: -
Summary

This blog discusses building Retrieval Augmented Generation (RAG) applications with open-source tools, namely Ray Data and Milvus. The authors highlight the performance boost Ray Data provides during the embedding step, where text is transformed into vectors: using just four workers on a Mac M2 laptop with 16 GB of RAM, Ray Data proved about 60 times faster than Pandas at this step. The blog presents an open-source RAG stack consisting of the BGE-M3 embedding model, Ray Data for fast, distributed embedding inference, and the Milvus or Zilliz Cloud vector database. The authors provide a step-by-step guide to setting up these tools and generating embeddings from a Kaggle IMDB movie-poster dataset. Finally, the blog covers the bulk-import features of Milvus and Zilliz Cloud for efficiently batch-loading vector data into a vector database.
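The batch-embedding pattern the summary describes can be illustrated with a minimal, stdlib-only sketch. In the real pipeline, Ray Data's `map_batches()` would invoke a stateful callable like this across parallel workers, and the embedding function would be the BGE-M3 model; here `fake_embed`, `Embedder`, and `run_pipeline` are illustrative stand-ins, not code from the blog post.

```python
import hashlib


def fake_embed(text: str, dim: int = 8) -> list[float]:
    """Deterministic stand-in for a real embedding model (illustrative only)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]


class Embedder:
    """Stateful callable, the shape Ray Data expects for map_batches.

    Loading the model once in __init__ (rather than per row) is what makes
    the distributed version fast: each worker pays the load cost only once,
    then processes many batches.
    """

    def __init__(self, dim: int = 8):
        # A real __init__ would load the BGE-M3 model here.
        self.dim = dim

    def __call__(self, batch: dict[str, list]) -> dict[str, list]:
        # Receive a column-oriented batch, add an embedding column.
        batch["vector"] = [fake_embed(t, self.dim) for t in batch["text"]]
        return batch


def run_pipeline(rows: list[str], batch_size: int = 2) -> list[dict]:
    """Serial analogue of ds.map_batches(Embedder, batch_size=...)."""
    embedder = Embedder()
    out = []
    for i in range(0, len(rows), batch_size):
        batch = embedder({"text": rows[i : i + batch_size]})
        out.extend(
            {"text": t, "vector": v}
            for t, v in zip(batch["text"], batch["vector"])
        )
    return out
```

The resulting `{"text", "vector"}` rows are the shape a vector database expects; in the stack the post describes, they would be written to Milvus or staged as files for Zilliz Cloud's bulk import rather than inserted row by row.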