Company
Date Published
Author
Suleman Kazi & Vivek Sourabh
Word count
2018
Language
English
Hacker News points
None

Summary

Vectara has released Boomerang, its new state-of-the-art multilingual retrieval model, designed to improve search and generative AI use cases. The model generalizes well and can embed text in hundreds of languages. In comparisons with other embedding models, Boomerang performs well on English datasets, although some models outperform it in specific domains. However, it consistently outperforms many open-source models in multilingual and cross-lingual settings. A design partner case study shows significant retrieval gains for Vectara's customers with the new model: relative improvements of 54% in Precision@1 and 39% in Recall@20 over the legacy model. Boomerang is now available on the Vectara platform, and users can try it by creating a new corpus or selecting it as the encoder when uploading data.
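
For readers unfamiliar with the metrics cited in the case study, the following is a minimal sketch of how Precision@k, Recall@k, and a relative improvement are commonly computed. It is not Vectara's evaluation code. The function names, document IDs, and scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Set


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / len(relevant)


def relative_improvement(new: float, old: float) -> float:
    """Relative gain of `new` over `old`, e.g. 0.54 means 54% relative."""
    return (new - old) / old


# Hypothetical ranking and relevance labels for a single query.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}

print(precision_at_k(ranked, relevant, 1))  # 0.0 (the top result is not relevant)
print(recall_at_k(ranked, relevant, 2))     # 0.5 (one of two relevant docs in the top 2)

# Hypothetical scores: a legacy model at 0.50 and a new model at 0.77
# would correspond to a 54% relative improvement.
print(round(relative_improvement(0.77, 0.50), 2))  # 0.54
```

Note that a relative improvement is measured against the legacy model's score, so a 54% relative gain in Precision@1 is not the same as a gain of 54 percentage points.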