Harvey partners with Voyage to build custom legal embeddings

Post Details

Company

Voyage AI

Date Published

July 31, 2024

Author

Voyage AI

Word Count

563

Company Posts That Month

1

Language

English

Hacker News Points

-

Post removed?

No

Source URL

blog.voyageai.com/2024/07/31/harvey-partners-with-voyage-to-build-custom-legal-embeddings

Summary

Retrieval-augmented generation (RAG) systems, which underpin many real-world large language model (LLM) applications, depend on embeddings to retrieve text by semantic meaning rather than by exact keywords. Standard embeddings are trained on general data and often fail in specialized fields such as law, where relevant and irrelevant passages can look very similar. Voyage AI, led by Stanford's Tengyu Ma, specializes in building embedding models customized for specific domains. Working with Harvey, Voyage AI started from its voyage-law-2 model and fine-tuned it on more than 20 billion tokens of US case law together with expert annotations. The resulting custom model, voyage-law-2-harvey, reduced irrelevant results by nearly 25% compared to other leading models, and its lower embedding dimensionality also reduces storage and latency. Harvey plans to keep working with Voyage AI on additional custom embedding models, for legal and other domains, to further improve enterprise search and RAG systems.
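To make the retrieval step in the summary concrete, here is a minimal Python sketch of embedding-based retrieval using cosine similarity, along with a rough estimate of how index size scales with embedding dimensionality. It does not call any Voyage AI API. The three-dimensional vectors, the document names, and the helper functions (cosine_similarity, top_k, index_size_bytes) are all hypothetical and chosen only for illustration; a real system would obtain vectors from an embedding model and would typically use a vector database.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

def index_size_bytes(num_vectors, dims, bytes_per_value=4):
    """Raw storage for dense vectors, assuming float32 values by default."""
    return num_vectors * dims * bytes_per_value

# Hypothetical toy embeddings; real ones have hundreds or thousands of dimensions.
DOCS = {
    "contract_clause": [0.9, 0.1, 0.0],
    "tort_opinion": [0.1, 0.9, 0.1],
    "cooking_recipe": [0.0, 0.1, 0.9],
}
QUERY = [0.8, 0.2, 0.0]  # stands in for an embedded legal question

results = top_k(QUERY, DOCS, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")

# Halving the dimensionality halves raw vector storage (dimension counts are illustrative).
print(index_size_bytes(1_000_000, 1024), index_size_bytes(1_000_000, 512))
```

In this toy setup the query ranks "contract_clause" first because its vector points in nearly the same direction. The storage helper shows why a lower-dimensional embedding helps: raw index size grows linearly with the number of dimensions, and similarity computation does as well.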

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Vector Search	22	1,644	222	91	+2%
RAG	4	1,642	187	75	+52%
LLM	2	4,157	383	131	+53%
