Domain-Specific Embeddings and Retrieval: Legal Edition (voyage-law-2)

Blog post from Voyage AI

Post Details
Company
Date Published
Author
Voyage AI
Word Count: 1,180
Language: English
Hacker News Points: -
Summary

The newly released voyage-law-2 is a domain-specific embedding model optimized for legal document retrieval. It significantly outperforms general-purpose models such as OpenAI v3 large in legal contexts. Trained on a large corpus of legal documents, it supports a 16K-token context length and performs well on long-context retrieval. Across eight legal retrieval datasets, voyage-law-2 ranked first on seven, with improvements of more than 10% over competing models on LeCaRDv2, LegalQuAD, and GerDaLIR. The model was also trained on data from other domains to broaden its applicability beyond law, and it surpasses OpenAI v3 large on retrieval tasks across 34 datasets in eight categories, including technical documentation, finance, and medicine.
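
For readers unfamiliar with how an embedding model is used for retrieval, the following is a minimal sketch of the general pattern: embed the documents and the query, then rank documents by cosine similarity. It is not taken from the blog post. The names `rank_documents` and `toy_embed` are hypothetical, and `toy_embed` is a keyword-count stand-in so the example runs offline; in practice it would be replaced by a call to an embedding service serving voyage-law-2 (for example through Voyage AI's Python client, whose exact usage should be checked against its documentation).

```python
import math
import re
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query: str,
    documents: Sequence[str],
    embed: Callable[[Sequence[str]], List[Vector]],
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Return the top_k (score, document) pairs, highest similarity first."""
    doc_vectors = embed(documents)
    query_vector = embed([query])[0]
    scored = [
        (cosine_similarity(query_vector, vec), doc)
        for doc, vec in zip(documents, doc_vectors)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def toy_embed(texts: Sequence[str]) -> List[Vector]:
    """Hypothetical stand-in for a real embedding model.

    Counts occurrences of a few legal keywords. A real system would
    return dense vectors from a model such as voyage-law-2 instead.
    """
    vocab = ["contract", "breach", "damages", "patent", "tenant", "lease"]
    vectors = []
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())
        vectors.append([float(tokens.count(word)) for word in vocab])
    return vectors


documents = [
    "The tenant must pay rent under the lease.",
    "Damages for breach of contract are limited.",
    "The patent covers a new battery design.",
]
query = "What damages follow a breach of contract?"

for score, doc in rank_documents(query, documents, toy_embed, top_k=2):
    print(f"{score:.3f}  {doc}")
```

Passing the embedding function as a parameter keeps the ranking logic independent of any particular model, which is also how two models could be compared on the same documents and queries.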