Introducing the New Weaviate Confluent Apache Kafka® Connector: Real-Time Vector Data Pipelines Made Easy
Blog post from Weaviate
The Weaviate Sink Confluent Apache Kafka Connector is a new tool that integrates Kafka with Weaviate’s vector database. It ingests Kafka messages in real time and transforms them into structured Weaviate objects for retrieval and generative AI applications. The connector supports real-time streaming, full CRUD operations, and integrated vectorization, includes error handling and retry logic, and accepts JSON, Avro, and Protobuf message formats. It is verified by Confluent and designed to be scalable and secure, so users can add vector-based indexing to existing pipelines without writing custom code. The connector works by subscribing to Kafka topics, processing each message through a configurable pipeline, and pushing the resulting data to Weaviate through its gRPC API. Configuration is managed through a JSON file, which simplifies deployment in environments such as Confluent Cloud and Weaviate Cloud.
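To make the JSON-based configuration more concrete, here is a minimal Python sketch that assembles a connector configuration and prints it as JSON. The outer `name`/`config` shape and the keys `connector.class`, `tasks.max`, `topics`, and `value.converter` are standard Kafka Connect settings. Everything else is an assumption made for illustration: the connector class name, the `weaviate.connection.url` and `collection.mapping` keys, the helper `build_sink_config`, and all example values are hypothetical and should be checked against the connector's own documentation.

```python
import json


def build_sink_config(name, topics, weaviate_url, collection):
    """Build a Kafka Connect style configuration for a Weaviate sink.

    The Weaviate-specific keys below are hypothetical placeholders,
    not confirmed property names of the real connector.
    """
    return {
        "name": name,
        "config": {
            # Standard Kafka Connect settings.
            "connector.class": "io.weaviate.connector.WeaviateSinkConnector",  # assumed class name
            "tasks.max": "1",
            "topics": ",".join(topics),
            "value.converter": "org.apache.kafka.connect.json.JsonConverter",
            "value.converter.schemas.enable": "false",
            # Assumed Weaviate-specific settings (hypothetical key names).
            "weaviate.connection.url": weaviate_url,
            "collection.mapping": collection,
        },
    }


# Example values are placeholders, not real endpoints.
config = build_sink_config(
    name="weaviate-sink-example",
    topics=["products", "reviews"],
    weaviate_url="https://example.weaviate.cloud",
    collection="Product",
)

print(json.dumps(config, indent=2))
```

Writing the printed JSON to a file gives the kind of configuration file the post describes. Kafka Connect expects property values as strings, which is why `tasks.max` is `"1"` rather than the integer `1`.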