Pinecone has introduced Inference, an API designed to streamline AI workflows by providing easy access to embedding and reranking models hosted on Pinecone's infrastructure. The service complements Pinecone's vector database, letting developers embed, manage, and retrieve vector data through a single API and reducing the number of tools they must integrate and operate.

Inference is in public preview and initially supports the multilingual-e5-large embedding model, chosen for its open-source license, multilingual capabilities, and strong performance across languages. By handling model selection and hosting, the API frees developers to focus on prototyping and deploying AI applications. Developers who prefer other models are not locked in: Pinecone's vector database continues to accept dense embeddings of up to 20,000 dimensions from any model provider. The service is available to all users, with pricing tiers based on usage, and Pinecone plans to expand it with additional models and features in the near future.
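To illustrate the workflow, here is a minimal sketch of generating embeddings with the hosted multilingual-e5-large model via the Pinecone Python SDK. It assumes the SDK's `inference.embed` method as documented around the preview; exact package versions, parameter names, and the placeholder API key are assumptions, not details from the announcement.

```python
# pip install pinecone  (SDK surface may vary by version)
from pinecone import Pinecone

# Hypothetical placeholder credentials.
pc = Pinecone(api_key="YOUR_API_KEY")

# Embed a batch of passages with the hosted multilingual-e5-large model.
# "input_type" distinguishes documents ("passage") from search queries ("query").
embeddings = pc.inference.embed(
    model="multilingual-e5-large",
    inputs=["Quarterly revenue grew 12% year over year."],
    parameters={"input_type": "passage", "truncate": "END"},
)

# multilingual-e5-large produces 1024-dimensional dense vectors.
print(len(embeddings[0].values))
```

The resulting vectors can then be upserted into a Pinecone index and queried through the same client, which is the single-API consolidation the announcement emphasizes.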