Building a RAG Batch Inference Pipeline with Anyscale and Union

Post Details

Company

Anyscale

Date Published

Sept. 12, 2024

Author

Kevin Su and Kai-Hsun Chen

Word Count

1,665

Company Posts That Month

4

Language

English

Hacker News Points

-

Source URL

www.anyscale.com/blog/anyscale-union-batch-inference-pipeline

Summary

This blog post showcases the versatility of Ray, an open-source unified compute framework, by using it for embedding generation and LLM batch inference in two Flyte pipelines. Flyte is an open-source orchestrator for building production-grade data and machine learning pipelines. The post highlights the value of pairing a unified distributed computation framework like Ray with a workflow orchestrator like Flyte to manage AI/ML workloads. Anyscale, built by the creators of Ray, gives developers a seamless way to deploy AI/ML workloads at scale. Union, built by the technical founding team behind Flyte, abstracts away the infrastructure and provides a turnkey system that lets ML engineers and data scientists focus on their tasks. The post then walks through the two pipelines. The first generates embeddings with Ray Data and saves them to cloud storage shared by Union and Anyscale. The second monitors GitHub issues in Flyte repositories and uses the Anyscale Platform to serve an LLM with RAG, which performs batch inference and replies to those issues.
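The data flow between the two pipelines can be illustrated with a minimal, standard-library-only sketch. It is not the code from the post and uses neither Ray nor Flyte. The hashing-based `toy_embed` function, the `embeddings.json` file name, the local temporary directory standing in for shared cloud storage, and the prompt layout are all assumptions made for illustration. In a real deployment, the embedding step would presumably run as a distributed Ray Data job and each function would be wrapped as an orchestrated task.

```python
import hashlib
import json
import math
import tempfile
from pathlib import Path

DIM = 64  # size of the toy embedding vectors (illustrative assumption)


def toy_embed(text: str) -> list[float]:
    """Hash each token into a fixed-size bag-of-words vector, then L2-normalize.

    Hypothetical stand-in for a real embedding model.
    """
    vec = [0.0] * DIM
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_index(docs: list[str], store: Path) -> Path:
    """Pipeline 1: embed every document and persist the result to shared storage."""
    records = [{"text": d, "embedding": toy_embed(d)} for d in docs]
    path = store / "embeddings.json"  # hypothetical file name
    path.write_text(json.dumps(records))
    return path


def retrieve(query: str, records: list[dict], k: int = 1) -> list[str]:
    """Return the k stored texts with the highest cosine similarity to the query."""
    q = toy_embed(query)

    def score(record: dict) -> float:
        # Vectors are unit length, so the dot product is the cosine similarity.
        return sum(a * b for a, b in zip(q, record["embedding"]))

    ranked = sorted(records, key=score, reverse=True)
    return [r["text"] for r in ranked[:k]]


def build_prompts(issues: list[str], index_path: Path, k: int = 1) -> list[str]:
    """Pipeline 2: load the stored embeddings and assemble one RAG prompt per issue."""
    records = json.loads(index_path.read_text())
    prompts = []
    for issue in issues:
        context = "\n".join(retrieve(issue, records, k))
        prompts.append(f"Context:\n{context}\n\nIssue:\n{issue}\n\nReply:")
    return prompts


if __name__ == "__main__":
    docs = [
        "Ray Data applies a function to batches of rows in parallel.",
        "Flyte tasks can cache their outputs between workflow runs.",
    ]
    with tempfile.TemporaryDirectory() as tmp:
        index_path = build_index(docs, Path(tmp))
        for prompt in build_prompts(["How do Flyte tasks cache outputs?"], index_path):
            print(prompt)
```

The resulting list of prompts is what a batch inference step would send to the served LLM. Persisting the index in the first function and reloading it in the second mirrors the handoff through shared storage that the summary describes.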

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Vector Search	19	3,675	269	79	+77%
LLM	6	3,889	441	129	+7%
RAG	5	1,936	254	78	-19%
Serverless	1	647	170	80	+31%