
OpenSearch vs LanceDB for Vector Search: Query Cost and Infrastructure

Blog post from LanceDB

Post Details
Company: LanceDB
Date Published:
Author: Justin Miller
Word Count: 3,782
Language: English
Hacker News Points: -
Summary

Choosing between OpenSearch and LanceDB for vector search is a tradeoff between a distributed search service and an embedded library, each with distinct infrastructure and cost implications. OpenSearch runs as a distributed cluster with full-text search, security, and other features; it keeps the vectors and the HNSW graph in RAM backed by EBS, while the images themselves are stored in S3. LanceDB instead stores everything in S3 in a columnar file format and pulls index pages into memory on demand, so its footprint scales with queries per second (QPS) rather than corpus size, which can translate into lower costs.

The benchmark workload is 287,360 images from the COCO 2017 dataset, embedded into 1152-dimensional vectors. For this workload LanceDB is generally more cost-effective, thanks to its reliance on S3 for storage and its ability to scale with demand. The key cost driver is how each system stores and accesses the vector index: OpenSearch's costs scale with RAM usage, while LanceDB's scale with QPS and S3 GET requests.

Operational complexity also differs: OpenSearch offers a broader feature set, while LanceDB focuses on efficient vector search. The choice between them should weigh recall targets, latency requirements, and whether additional features such as full-text search and security are actually needed.
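The RAM-versus-S3 cost driver can be made concrete with a back-of-the-envelope calculation from the numbers in the post: the raw vectors alone for this corpus occupy roughly 1.2 GiB, which OpenSearch must hold in RAM (HNSW graph overhead comes on top of that), while LanceDB leaves the same data in S3 and pages in only what a query touches. A minimal sketch, assuming float32 embeddings and not modeling index-structure overhead:

```python
# Back-of-the-envelope RAM footprint for the raw vectors in this workload.
# Corpus size and dimensionality come from the post; float32 storage is an
# assumption, and HNSW graph overhead is deliberately not modeled.

NUM_VECTORS = 287_360        # COCO 2017 images embedded in the benchmark
DIMS = 1152                  # embedding dimensionality from the post
BYTES_PER_FLOAT32 = 4        # assumed storage per vector component

raw_bytes = NUM_VECTORS * DIMS * BYTES_PER_FLOAT32
raw_gib = raw_bytes / 2**30

print(f"Raw vector storage: {raw_bytes:,} bytes (~{raw_gib:.2f} GiB)")
# → Raw vector storage: 1,324,154,880 bytes (~1.23 GiB)
```

Under OpenSearch's model this ~1.2 GiB (plus graph overhead) is a fixed RAM cost that grows with the corpus; under LanceDB's model it is S3 storage, and the per-query cost is driven by how many index pages a search fetches.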