Introducing Qdrant Cloud Inference

Post Details

Company

Qdrant

Date Published

July 15, 2025

Author

Daniel Azoulai

Word Count

539

Language

English

Hacker News Points

-

Source URL

qdrant.tech/blog/qdrant-cloud-inference-launch

Summary

Qdrant Cloud Inference is a newly launched service that lets users generate, store, and index embeddings for text and images with a single API call, simplifying workflows and speeding up application development for use cases such as RAG, multimodal search, and hybrid search. Because model inference runs inside Qdrant Cloud, users no longer need separate inference infrastructure or manual data pipelines, which reduces complexity, latency, and network costs. The service offers several curated models for different search applications, and it uniquely supports OpenAI CLIP-style models for multimodal tasks. Paid Qdrant Cloud users receive a monthly allocation of free tokens to ease onboarding and development, and inference is enabled automatically for all paid clusters running the appropriate software version. The result is a more streamlined way to build AI applications without additional tools or APIs.
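
To make the "single API call" idea in the summary concrete, here is a minimal sketch of what such a request body might look like: each point carries raw text and a model name in place of a precomputed numeric vector, so the embedding step happens server-side. The helper name `build_upsert_body`, the exact field layout, and the model name `example/embedding-model` are assumptions for illustration and are not taken from the post; consult the Qdrant documentation for the actual request format and supported models.

```python
import json


def build_upsert_body(points, model):
    """Build an upsert request body where each vector is text plus a model name.

    `points` is an iterable of (id, text) pairs. The field layout below is an
    assumed shape for illustration, not a confirmed schema.
    """
    return {
        "points": [
            {
                "id": point_id,
                # Raw text and model name stand in for a numeric vector;
                # the service is expected to embed the text on ingestion.
                "vector": {"text": text, "model": model},
                # Keep the original text as payload so it can be returned with results.
                "payload": {"text": text},
            }
            for point_id, text in points
        ]
    }


# "example/embedding-model" is a hypothetical placeholder, not a real model name.
body = build_upsert_body(
    [(1, "Qdrant is a vector database")],
    model="example/embedding-model",
)
print(json.dumps(body, indent=2))
```

The sketch only constructs the payload and sends nothing over the network, so it runs without a cluster or API key. The point of the design is that the client never handles embedding vectors itself.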