Building a High-Performance Entity Matching Solution with Qdrant - Rishabh Bhardwaj | Vector Space Talks
Blog post from Qdrant
Rishabh Bhardwaj, a data engineer at HRS Group, shared insights into building a high-performance hotel entity matching solution with Qdrant, an open-source vector database, during a Vector Space Talk with Demetrios Brinkmann. The project first experimented with Postgres, but Qdrant delivered better speed and recall, aided by its Hierarchical Navigable Small World (HNSW) index. The solution addresses data inconsistency, duplication, and real-time processing challenges. It creates embeddings with the MiniLM model, chosen because it balances speed and accuracy. Geo-filtering restricts matching by hotel location so that results stay accurate, and GDPR compliance is maintained through secure infrastructure. The project evolved from an MVP built on Postgres to a scalable architecture on AWS services, with significant improvements in both performance and resource usage.
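To make the matching idea concrete, here is a minimal, standard-library-only sketch of "filter by location, then rank by vector similarity". It is not the HRS implementation. Character-trigram counts stand in for MiniLM embeddings, a brute-force loop stands in for Qdrant's HNSW search, and the hotel records, the `match_hotel` helper, and the 1 km radius are all hypothetical values chosen for illustration.

```python
import math
from collections import Counter


def trigram_vector(text):
    """Toy stand-in for a sentence embedding: sparse character-trigram counts."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(h))


def match_hotel(query, catalog, radius_km=1.0):
    """Return (record, score) for the most similar name within radius_km, or None."""
    query_vec = trigram_vector(query["name"])
    best = None
    for record in catalog:
        # Geo-filter first: skip hotels outside the radius around the query.
        distance = haversine_km(query["lat"], query["lon"], record["lat"], record["lon"])
        if distance > radius_km:
            continue
        # Then rank the remaining candidates by name similarity.
        score = cosine(query_vec, trigram_vector(record["name"]))
        if best is None or score > best[1]:
            best = (record, score)
    return best


# Hypothetical catalog: two similarly named hotels in different cities.
catalog = [
    {"id": 1, "name": "Grand Plaza Hotel Berlin", "lat": 52.5200, "lon": 13.4050},
    {"id": 2, "name": "Grand Plaza Hotel", "lat": 48.1351, "lon": 11.5820},
    {"id": 3, "name": "City Hostel Berlin", "lat": 52.5205, "lon": 13.4060},
]
query = {"name": "Hotel Grand Plaza, Berlin", "lat": 52.5201, "lon": 13.4049}

result = match_hotel(query, catalog)
```

In this toy data the geo-filter drops the similarly named Munich record before any similarity is computed, which is the role the summary assigns to geo-filtering. In Qdrant itself, the analogous setup would attach coordinates as point payload and combine the vector search with a geo-radius filter condition, with the exact client calls depending on the client version in use.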