Kreuzberg & SurrealDB: from unstructured documents to hybrid retrieval
Blog post from SurrealDB
Kreuzberg and SurrealDB together provide a single pipeline for building document search and retrieval systems. Kreuzberg, a document intelligence framework, handles extraction, chunking, and embedding for over 88 document formats. SurrealDB, a multi-model database, handles storage, indexing, and search, including both keyword and semantic search. The kreuzberg-surrealdb connector ties the two together: it sets up the schema automatically and deduplicates content during ingestion, so documents are searchable as soon as they are ingested. Because everything runs in one system, there is no need for separate tools for storage, keyword search, and vector search. The integration supports BM25 keyword search as well as hybrid search, which uses HNSW vector indexes for semantic matching and Reciprocal Rank Fusion to merge the ranked results.
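To make the hybrid search step more concrete, here is a minimal Python sketch of Reciprocal Rank Fusion, the technique named above for merging a keyword ranking with a vector ranking. It is an illustration of the general algorithm, not the connector's or SurrealDB's actual implementation. The function name `reciprocal_rank_fusion`, the document IDs, and the smoothing constant `k = 60` (a commonly used default) are all assumptions made for this example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results: one list from a BM25 keyword query,
# one from a vector similarity query over the same collection.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
```

Because RRF works only on rank positions, it does not need BM25 scores and vector distances to be on a comparable scale, which is one reason it is a common choice for combining keyword and semantic results.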