Introducing reranking to Pinecone Inference to simplify building accurate AI

Blog post from Pinecone

Post Details
Author: Xian Huang
Word Count: 743
Language: English
Summary

Pinecone Inference has added reranking to its API. A reranker scores documents by semantic relevance to a query so that only the most relevant ones are kept, which improves the accuracy of AI applications and lowers their cost. The feature is in public preview and supports the bge-reranker-v2-m3 model. In vector retrieval systems such as RAG applications, reranking filters the retrieved documents before they reach the language model, which helps reduce hallucination, cuts the compute required, and improves overall accuracy. Because fewer documents are passed on, reranking can reduce input token costs by up to 85% when used with models like GPT-4. Developers can embed, manage, query, and rerank data through a single API, which simplifies the AI development stack and reduces the need for multiple tools and infrastructure. Reranking is free during the public preview until August 31, 2024, after which it will cost $0.002 per request.
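
To make the rerank-then-filter idea concrete, here is a minimal sketch in Python. It does not call Pinecone Inference or the bge-reranker-v2-m3 model. The scorer `toy_relevance` is a hypothetical word-overlap stand-in for a real reranker, and `approx_tokens` is an assumed, crude token estimate based on whitespace word counts. The names `ScoredDoc`, `rerank`, and `input_token_savings` are also illustrative and do not come from the post.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ScoredDoc:
    text: str
    score: float


def toy_relevance(query: str, doc: str) -> float:
    """Hypothetical stand-in for a reranker: share of query words present in the doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def rerank(
    query: str,
    docs: List[str],
    top_n: int,
    score_fn: Callable[[str, str], float] = toy_relevance,
) -> List[ScoredDoc]:
    """Score every candidate against the query and keep the top_n highest-scoring ones."""
    scored = [ScoredDoc(text=d, score=score_fn(query, d)) for d in docs]
    scored.sort(key=lambda s: s.score, reverse=True)
    return scored[:top_n]


def approx_tokens(text: str) -> int:
    """Assumed proxy for token count: number of whitespace-separated words."""
    return len(text.split())


def input_token_savings(docs: List[str], kept: List[ScoredDoc]) -> float:
    """Fraction of (approximate) input tokens removed by filtering to the kept documents."""
    before = sum(approx_tokens(d) for d in docs)
    after = sum(approx_tokens(k.text) for k in kept)
    return 1 - after / before if before else 0.0


# Candidates as they might come back from a first-stage vector search (made-up data).
candidates = [
    "pinecone inference adds a rerank endpoint",
    "the weather in paris is mild in spring",
    "rerank models score documents against a query",
    "bananas are rich in potassium",
]

top = rerank("rerank documents query", candidates, top_n=2)
savings = input_token_savings(candidates, top)

for doc in top:
    print(f"{doc.score:.2f}  {doc.text}")
print(f"approximate input token savings: {savings:.0%}")
```

In a real pipeline the scorer would be a hosted reranking model, and the savings would depend on how many retrieved documents are dropped before the prompt is built. The figure this toy example prints illustrates the arithmetic only and is not a measurement.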