Cohere's Embed v3 models are now accessible through Elastic's Inference API with the release of Elasticsearch 8.13, enabling businesses to efficiently create and index embeddings for vector and hybrid search across their documents. The integration lets developers add Cohere embeddings to an index with a single API call via Elastic's ingest pipelines, eliminating the need to self-host an embedding model. Both float and int8 (byte) embeddings are supported, and native compression means an int8 embedding occupies a quarter of the space of its float equivalent, cutting storage costs by roughly 75% without compromising search quality. Cohere's embedding models offer performance competitive with OpenAI's at lower storage expense, and the integration aims to make semantic search more accessible within Elastic's platform.
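A minimal sketch of the end-to-end flow, issuing raw REST calls with Python's requests library: the endpoint paths follow the 8.13 Inference API, but the cluster URL, credentials, inference ID (cohere_embeddings), index name, field names, and the exact service-settings keys here are illustrative assumptions and may need adjusting against the official docs.

```python
import requests

ES = "https://localhost:9200"      # assumption: a local 8.13 cluster
AUTH = ("elastic", "<password>")   # placeholder credentials; TLS options omitted

# 1. Create an inference endpoint backed by Cohere Embed v3.
#    "embedding_type": "byte" requests int8 embeddings (4x smaller than float).
requests.put(
    f"{ES}/_inference/text_embedding/cohere_embeddings",
    auth=AUTH,
    json={
        "service": "cohere",
        "service_settings": {
            "api_key": "<cohere-api-key>",
            "model_id": "embed-english-v3.0",
            "embedding_type": "byte",
        },
    },
)

# 2. An ingest pipeline that calls the endpoint for every incoming document,
#    writing the embedding of the "text" field into "text_embedding".
requests.put(
    f"{ES}/_ingest/pipeline/cohere_embeddings",
    auth=AUTH,
    json={
        "processors": [
            {
                "inference": {
                    "model_id": "cohere_embeddings",
                    "input_output": {
                        "input_field": "text",
                        "output_field": "text_embedding",
                    },
                }
            }
        ]
    },
)

# 3. An index whose vector field stores the compressed int8 values.
requests.put(
    f"{ES}/cohere-docs",
    auth=AUTH,
    json={
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "text_embedding": {
                    "type": "dense_vector",
                    "dims": 1024,            # Embed v3 English dimensionality
                    "element_type": "byte",  # int8 storage
                },
            }
        }
    },
)

# 4. Index a document through the pipeline; the embedding is added automatically,
#    which is the "single API call" the integration advertises.
requests.post(
    f"{ES}/cohere-docs/_doc?pipeline=cohere_embeddings",
    auth=AUTH,
    json={"text": "Elasticsearch 8.13 integrates Cohere Embed v3."},
)

# 5. kNN search: the query text is embedded with the same endpoint at query time.
resp = requests.post(
    f"{ES}/cohere-docs/_search",
    auth=AUTH,
    json={
        "knn": {
            "field": "text_embedding",
            "query_vector_builder": {
                "text_embedding": {
                    "model_id": "cohere_embeddings",
                    "model_text": "Which release added the Cohere integration?",
                }
            },
            "k": 5,
            "num_candidates": 50,
        }
    },
)
print(resp.json()["hits"]["hits"])
```

Because the inference endpoint is referenced by ID in both the ingest pipeline and the kNN query, documents and queries are guaranteed to be embedded by the same model, which is what makes the hybrid and vector search results comparable.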