All blog post summaries for Zilliz


2024

Why Milvus Makes Building RAG Easier, Faster, and More Cost-Efficient

Date published
May 17, 2024

Author(s)
By Ken Zhang

Language
English

Word count
1185

Hacker News points
None found.

What's this blog post about?

Milvus, an open-source vector database, enhances the development of Retrieval Augmented Generation (RAG) applications by streamlining processes and improving efficiency. Its integration with popular embedding models simplifies text transformation into searchable vectors, while its hybrid search capability supports multimodal data retrieval. Additionally, Milvus offers a cost-effective solution for managing large knowledge bases through minimizing memory consumption, implementing tiered data storage, and leveraging intelligent caching and data-sharding techniques. Overall, Milvus helps developers build faster, more accurate, and cost-efficient RAG applications.

Zilliz Achieves AWS Generative AI Competency Partner Designation, Driving Innovation in AI Solutions

Date published
May 16, 2024

Author(s)
By Sachi Tolani

Language
English

Word count
334

Hacker News points
None found.

What's this blog post about?

Zilliz has achieved AWS Generative AI Competency Partner designation, demonstrating its commitment to advancing generative AI technologies. As an AWS Differentiated Partner, Zilliz provides critical infrastructure and best practices for implementing transformative generative AI applications such as image retrieval, video analysis, NLP, recommendation engines, customized search, intelligent customer service, fraud detection, and more. The company's expertise combined with the scalability, performance, and security of AWS Cloud enables customers to unlock new possibilities and gain a competitive edge in their industries. Zilliz remains dedicated to fostering collaboration, knowledge sharing, and responsible AI practices while working closely with AWS and its customers to shape the future of generative AI.

Running Llama 3, Mixtral, and GPT-4o

Date published
May 15, 2024

Author(s)
By Christy Bergman

Language
English

Word count
1801

Hacker News points
None found.

What's this blog post about?

This blog post discusses various ways to run the generation ("G") part of Retrieval Augmented Generation (RAG) using different models and inference endpoints. The author provides step-by-step instructions on how to use Llama 3 from Meta, Mixtral from Mistral, and the newly announced GPT-4o from OpenAI. They also cover running these models locally or through Anyscale, OctoAI, and Groq endpoints. Additionally, the author explains how to evaluate answers using Ragas and provides a summary table of results for each model endpoint. The conclusion emphasizes the importance of weighing answer quality, latency, and cost when choosing a model and inference endpoint for the generation part of RAG.
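
A practical takeaway from comparing these providers is that they all expose OpenAI-compatible chat-completions endpoints, so switching mostly means changing the base URL and model name. A minimal sketch of building such a request for the generation step (the base URL and model name below are illustrative assumptions, not values from the post):

```python
# Sketch: build a chat-completions request for an OpenAI-compatible endpoint.
# Base URL and model name are illustrative assumptions.

def build_chat_request(base_url: str, model: str, question: str, context: str) -> tuple[str, dict]:
    """Return the request URL and JSON payload for the generation step of RAG."""
    url = base_url.rstrip("/") + "/chat/completions"
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "Answer using only the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
        "temperature": 0.0,  # deterministic answers make RAG evaluation easier
    }
    return url, payload

# The same function serves any OpenAI-compatible provider:
url, payload = build_chat_request(
    "https://api.groq.com/openai/v1", "llama3-70b-8192",
    "What is Milvus?", "Milvus is an open-source vector database.",
)
```

Sending it is then one `requests.post(url, json=payload, headers=...)` call with the provider's API key.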

Harnessing Generative Feedback Loops in AI Systems with Milvus

Date published
May 10, 2024

Author(s)
By Uppu Rajesh Kumar

Language
English

Word count
2958

Hacker News points
None found.

What's this blog post about?

Milvus, an open-source vector database designed to store, index, and search massive amounts of vector data in real-time, can be integrated with LLMs in a generative feedback loop. This allows for continuous learning and improvement of the AI system. Feedback loops are crucial in ensuring the ongoing refinement of model outputs in AI systems, offering benefits such as adaptability to new data, reduced bias and errors, personalized model outputs, and enhanced creativity and innovation. Milvus's features make it suitable for enhancing the data-handling capabilities of LLMs, particularly in scenarios where feedback loops are used to refine predictive and generative accuracies.

Exploring DSPy and Its Integration with Milvus for Crafting Highly Efficient RAG Pipelines

Date published
May 9, 2024

Author(s)
By David Wang

Language
English

Word count
2043

Hacker News points
None found.

What's this blog post about?

DSPy is a programmatic framework designed to optimize prompts and weights in language models (LMs), particularly in use cases where you integrate LMs across multiple pipeline stages. It provides various composable and declarative modules for instructing LMs in Pythonic syntax. Unlike traditional prompt engineering techniques that rely on manually crafting and tweaking prompts, DSPy learns from query-answer examples and uses that learning to generate optimized prompts for more tailored results. This allows for the dynamic reassembly of the entire pipeline, explicitly tailored to the nuances of your task, eliminating the need for ongoing manual prompt adjustments. Milvus has been integrated into the DSPy workflow as a retrieval module in the form of the MilvusRM client, making it easier to implement a fast and efficient RAG pipeline. In this demonstration, we'll build a simple RAG application using GPT-3.5 (gpt-3.5-turbo) for answer generation. We use Milvus as the vector store through MilvusRM and DSPy to configure and optimize the RAG pipeline.

Milvus Reference Architectures

Date published
May 9, 2024

Author(s)
By Christy Bergman

Language
English

Word count
1017

Hacker News points
None found.

What's this blog post about?

This blog discusses resource allocation for Milvus, an open-source vector database. It provides reference architectures based on specific numbers of users or requests per second (RPS) and different mixes of READ and WRITE operations. The article emphasizes the importance of understanding workload characteristics to determine Milvus' computational power and memory requirements. It also outlines a method for estimating resource needs, load testing, benchmarking, and concludes with recommendations for resource allocation based on data size and QPS requirements.
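
The resource-estimation step described above can be approximated with a back-of-the-envelope calculation: raw vector data takes four bytes per float32 dimension, and an in-memory index adds overhead on top. A minimal sketch (the 1.5x overhead factor is a rough assumption for graph-based indexes, not a figure from the post):

```python
def estimate_memory_bytes(num_vectors: int, dim: int, overhead: float = 1.5) -> int:
    """Rough memory estimate for an in-memory vector index.

    Raw data is num_vectors * dim * 4 bytes (float32); graph-based indexes
    such as HNSW add link storage on top, modeled here by `overhead`.
    """
    raw = num_vectors * dim * 4
    return int(raw * overhead)

# 10 million 768-dimensional vectors:
est = estimate_memory_bytes(10_000_000, 768)
print(f"{est / 2**30:.1f} GiB")  # → 42.9 GiB
```

Real sizing should still be validated with load testing, as the article recommends, since overhead varies by index type and parameters.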

Revolutionizing Search with Zilliz and Azure OpenAI

Date published
May 2, 2024

Author(s)
By Daniella Pontes

Language
English

Word count
1910

Hacker News points
None found.

What's this blog post about?

Zilliz and Azure OpenAI have integrated to redefine similarity and semantic search, offering remarkable speed, intelligence, and safeguards. The collaboration combines Azure OpenAI's advanced generative AI capabilities with Zilliz's scalable search solutions, enhancing AI search functionalities and data retrieval. This partnership enables seamless integration of AI models and scalable search solutions for developers. Zilliz is a specialized data management system optimized for managing high-dimensional vector data on a large scale, while Azure OpenAI provides additional features like private networking, regional availability, and responsible AI content filtering. The integration of these technologies offers robust data storage, sophisticated indexing options, and comprehensive similarity metrics and retrieval mechanisms, enabling developers to create scalable and efficient AI-driven search solutions.

Hybrid Search with Milvus

Date published
April 30, 2024

Author(s)
By Stephen Batifol

Language
English

Word count
1100

Hacker News points
None found.

What's this blog post about?

Milvus 2.4 introduces multi-vector search and hybrid search capabilities, allowing simultaneous queries across multiple vector fields and integrating the results with re-ranking strategies. Hybrid search conducts searches across several vector fields within the same dataset and fuses the ranked results. This tutorial demonstrates how to leverage Milvus's hybrid search capabilities using the eSci dataset and the BGE-M3 model. The steps include preparing the dataset, generating embeddings with BGE-M3, setting up a Milvus collection, inserting data into the collection, and executing hybrid searches in Milvus.
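
One of the re-ranking strategies Milvus offers for fusing per-field result lists is Reciprocal Rank Fusion (RRF). The fusion step itself is simple enough to sketch in pure Python (k=60 is the conventional RRF constant, assumed here rather than taken from the post):

```python
def rrf_fuse(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: score each id by the sum of 1/(k + rank) over lists."""
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first
    return sorted(scores, key=scores.get, reverse=True)

# Dense and sparse searches return different orderings; RRF merges them:
dense = ["doc2", "doc1", "doc3"]
sparse = ["doc2", "doc4", "doc1"]
fused = rrf_fuse([dense, sparse])
```

In Milvus 2.4 itself, the equivalent fusion is requested via the `RRFRanker` passed to a hybrid search call rather than computed by hand.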

Vector Databases Are the Base of RAG Retrieval

Date published
April 28, 2024

Author(s)
By Ken Zhang

Language
English

Word count
1523

Hacker News points
None found.

What's this blog post about?

Implementing Retrieval Augmented Generation (RAG) technology in chatbots can significantly enhance customer support by combining large language models with knowledge stored in vector databases from various fields. RAG systems consist of two core components: the Retriever and the Generator, which work synergistically to handle complex queries effectively. Compared to traditional LLMs, RAG offers several advantages such as reduced hallucination issues, enhanced data privacy and security, and real-time information retrieval. While advancements in LLMs also address these challenges, RAG remains a robust, reliable, and cost-effective solution due to its transparency, operability, and private data management capabilities. RAG technology is often integrated with vector databases, leading to the development of popular solutions like the CVP stack. Vector databases are favored in RAG implementations for their efficient similarity retrieval capabilities, superior handling of diverse data types, and cost-effectiveness. Ongoing engineering optimizations aim to enhance the retrieval quality of vector databases by improving precision, response speed, multimodal data handling, and interpretability. As demand for RAG applications grows across various industries, RAG technology will continue to evolve and revolutionize information retrieval and knowledge acquisition processes.

The Landscape of Open Source Licensing in AI: A Primer on LLMs and Vector Databases

Date published
April 28, 2024

Author(s)
By Emily Kurze

Language
English

Word count
1467

Hacker News points
None found.

What's this blog post about?

This guide provides an overview of open-source licensing in the context of AI technology, specifically vector databases and large language models (LLMs). Open source allows creators to make software or hardware available for free, often developed and maintained by community efforts. Understanding different license types is crucial, as changes can significantly impact companies and businesses that rely on open-source software. The benefits of open-source vector databases and LLMs include rapid prototyping, increased trust and transparency, and reduced costs for developers. Various types of licenses exist, including permissive licenses (e.g., MIT License), copyleft licenses (e.g., GNU General Public License), weak copyleft licenses (e.g., GNU Lesser General Public License), non-commercial licenses (e.g., Creative Commons Non-Commercial License), and public domain releases. Key organizations like the Open Source Initiative, Free Software Foundation, and Apache Software Foundation govern open-source licensing standards. The degrees of openness in different licensing models influence collaboration, innovation, and transparency in AI development. Licensing plays a vital role in shaping the trajectory of AI technologies by governing accessibility, adaptability, and equitable distribution.

Ensuring Data Privacy in AI Search with Langchain and Zilliz Cloud

Date published
April 27, 2024

Author(s)
By Antony G.

Language
English

Word count
1330

Hacker News points
None found.

What's this blog post about?

LangChain and Zilliz Cloud offer an effective combination to create AI-powered search systems. These systems use natural language processing (NLP) and machine learning algorithms to enhance the accuracy and relevance of information retrieval across business-specific data. With the rise of generative models, AI-powered search applications have become more prominent compared to traditional search methods. However, ensuring user privacy in these applications is critical due to ethical and legal implications. The integration of LangChain with Zilliz Cloud allows for the creation of custom search engines that prioritize data privacy while offering tailored solutions based on specific needs and data. Both tools provide robust frameworks for ensuring privacy and safety when utilizing large language models (LLMs), effectively preventing private data misuse and generating harmful or unethical content.

Practical Tips and Tricks for Developers Building RAG Applications

Date published
April 27, 2024

Author(s)
By James Luan

Language
English

Word count
2804

Hacker News points
None found.

What's this blog post about?

Vector search is a technique used in data retrieval for RAG applications and information retrieval systems to find items or data points that are similar or closely related to a given query vector. While many vector database providers market their capabilities as easy, user-friendly, and simple, building a scalable real-world application requires considering various factors beyond coding, including search quality, scalability, availability, multi-tenancy, cost, and security. To effectively deploy a vector database in a production RAG application with Milvus, follow these best practices: design an effective schema, plan for scalability, select the optimal index, and fine-tune performance.

Demystifying the Milvus Sizing Tool

Date published
April 26, 2024

Author(s)
By Christy Bergman

Language
English

Word count
658

Hacker News points
None found.

What's this blog post about?

Milvus is an open source vector database that enables efficient search over large amounts of data. When deploying Milvus, it's crucial to select the optimal configuration to ensure efficient performance and resource utilization. Key points to consider include index selection, balancing memory usage, disk space, cost, speed, and accuracy; segment size and deployment configuration; and additional customization options available in the Enterprise version of Zilliz Cloud. The Milvus sizing tool provides a starting point for these configurations, but users should also consider their specific needs and requirements when choosing an index algorithm or segment size.

An Overview of Milvus Storage System and Techniques to Evaluate and Optimize Its Performance

Date published
April 24, 2024

Author(s)
By Fendy Feng and Jay Zhu

Language
English

Word count
1593

Hacker News points
None found.

What's this blog post about?

This guide explores Milvus, an open-source vector database known for its horizontal scalability and fast performance. At the core of Milvus lies its robust storage system, which comprises meta storage, log broker, and object storage. The architecture is organized into four key layers: access layer, coordinator service, worker nodes, and storage. Milvus uses three main storage components to ensure data integrity and availability: meta storage (etcd), object storage (MinIO), and a log broker (Pulsar or Kafka). To evaluate and optimize the performance of Milvus storage, it is crucial to monitor disk write latency, I/O throughput, and disk drive performance. The guide provides recommendations for selecting appropriate block storage options from various cloud providers and offers strategies to enhance MinIO's throughput performance by using SSD or NVMe-type drives.

RAG Without OpenAI: BentoML, OctoAI and Milvus

Date published
April 23, 2024

Author(s)
By Yujian Tang

Language
English

Word count
2820

Hacker News points
None found.

What's this blog post about?

This tutorial demonstrates how to build retrieval augmented generation (RAG) applications using large language models (LLMs) without relying on OpenAI. The process involves serving embeddings with BentoML, inserting data into a vector database for RAG, setting up an LLM for RAG, and providing instructions to the LLM. Key components include BentoML for serving embeddings, OctoAI for accessing open-source models, and Milvus as the vector database. The example uses BentoML's Sentence Transformers Embeddings repository, a local Milvus instance using Docker Compose, and the Nous Hermes fine-tuned Mixtral model from OctoAI for RAG.

Kickstart Your Local RAG Setup: A Beginner's Guide to Using Llama 3 with Ollama, Milvus, and Langchain

Date published
April 19, 2024

Author(s)
By Stephen Batifol

Language
English

Word count
844

Hacker News points
None found.

What's this blog post about?

This guide provides a beginner's approach to setting up a Retrieval Augmented Generation (RAG) system using Ollama, Llama 3, Milvus, and Langchain. The RAG technique enhances large language models (LLMs) by integrating additional data sources. In this tutorial, we will build a question-answering chatbot that can answer questions about specific information. Key components of the setup include indexing data using Milvus, retrieval and generation with Llama 3, and interaction with data using Langchain. The guide assumes familiarity with Docker and Docker Compose, as well as installation of Milvus Standalone, Ollama, and other necessary tools.
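
The "retrieval and generation" step in such a pipeline ultimately reduces to stuffing retrieved chunks into a prompt for Llama 3. A minimal sketch of that assembly step (the documents and template are illustrative; in the tutorial, Langchain handles this wiring between Milvus and Ollama):

```python
def build_rag_prompt(retrieved_docs: list[str], question: str) -> str:
    """Assemble the prompt the chain would pass to the LLM (e.g., Llama 3 via Ollama)."""
    context = "\n\n".join(f"- {doc}" for doc in retrieved_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Documents as Milvus might return them for a query:
docs = ["Milvus is an open-source vector database.",
        "Milvus Standalone can run locally via Docker Compose."]
prompt = build_rag_prompt(docs, "How can I run Milvus locally?")
# The prompt would then be sent to the model, e.g., through Ollama's local API.
```

Langchain's prompt templates generalize this pattern; the sketch just makes the data flow visible.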

Milvus Server Docker Installation and Packaging Dependencies

Date published
April 17, 2024

Author(s)
By Christy Bergman

Language
English

Word count
682

Hacker News points
None found.

What's this blog post about?

Milvus is an open-source vector database with significant traction in Generative AI and RAG use cases. It offers flexible deployment options, including local and cloud (Zilliz) services. The main dependencies for the Milvus standalone server include FAISS, etcd, Pulsar/Kafka, Tantivy, RocksDB, MinIO/S3/GCS/Azure Blob Storage, Kubernetes, StorageClass, Persistent Volumes, Prometheus, and Grafana. The Docker image for the Milvus standalone container is around 300 MB. Milvus has a frequent release cycle, with approximately one major release per month. Six SDKs are available: Python, Node, Go, C#, Java, and Ruby. Understanding these details can help organizations better plan and prepare for integrating Milvus into their technology stack.

Emerging Trends in Vector Database Research and Development

Date published
April 16, 2024

Author(s)
By Li Liu

Language
English

Word count
2159

Hacker News points
None found.

What's this blog post about?

The future of vector databases is closely tied to the evolution of product requirements and user demands. Key areas of development include cost-efficiency, hardware advancements, collaboration with advanced machine learning models, prioritizing retrieval accuracy, optimizing for offline use cases, expanding feature sets for diverse industries, and more. As AI continues to mature, these advancements will enable vector databases to support a broader range of applications across various sectors, enhancing their overall functionality and versatility in production environments.

The Evolution and Future of AI and Its Influence on Vector Databases: Insights from Charles, CEO of Zilliz

Date published
April 15, 2024

Author(s)
By Charles Xie

Language
English

Word count
1604

Hacker News points
None found.

What's this blog post about?

Charles Xie, CEO of Zilliz, discusses the evolution and future of AI and its influence on vector databases. He highlights how Zilliz developed Milvus, a vector database, before the advent of large language models (LLMs), emphasizing the importance of data management for unstructured data. The article also explores the transition from enterprise-centric to democratized AI, as well as the significance of vector databases in the age of Foundation Models and LLMs. Furthermore, it delves into the role of Milvus 3.0 in enhancing retrieval accuracy for RAG systems and how ChatGPT and vector databases complement each other in semantic search. Lastly, Xie shares his vision for Affordable General Intelligence within five years, aiming to make AI solutions accessible to all individuals and businesses.

Embedding Inference at Scale for RAG Applications with Ray Data and Milvus

Date published
April 12, 2024

Author(s)
By Christy Bergman and Cheng Su

Language
English

Word count
1761

Hacker News points
None found.

What's this blog post about?

This blog discusses building Retrieval Augmented Generation (RAG) applications with open-source tools such as Ray Data and Milvus. The author highlights the performance boost achieved using Ray Data during the embedding step, where data is transformed into vectors. Using just four workers on a Mac M2 laptop with 16GB RAM, Ray Data was found to be 60 times faster than Pandas. The blog also presents an open-source RAG stack that includes the BGE-M3 embedding model, Ray Data for fast, distributed embedding inference, and the Milvus or Zilliz Cloud vector database. The author provides a step-by-step guide on setting up these tools and using them to generate embeddings from an IMDB poster dataset downloaded from Kaggle. Additionally, the blog discusses the benefits of the bulk import features in Milvus and Zilliz Cloud for efficient batch loading of vector data into a vector database.
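
Ray Data's speedup comes from mapping an embedding function over fixed-size batches in parallel. The batching pattern can be sketched in pure Python (the toy "embedding" below is a stand-in for a real model such as BGE-M3, and the Ray call shown in the comment is where the real pipeline parallelizes the work):

```python
def batched(items: list[str], batch_size: int):
    """Yield fixed-size batches: the unit of work distributed across workers."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def embed_batch(batch: list[str]) -> list[list[float]]:
    """Stand-in embedding; real code would run a model such as BGE-M3 here."""
    return [[float(len(text)), float(sum(map(ord, text)) % 97)] for text in batch]

texts = [f"movie plot {i}" for i in range(10)]
vectors = [vec for batch in batched(texts, batch_size=4) for vec in embed_batch(batch)]

# With Ray Data, the same shape runs in parallel across workers, roughly:
#   ray.data.from_items(texts).map_batches(embed_batch, batch_size=4)
```

Because each batch is independent, adding workers scales the embedding step almost linearly, which is the effect the benchmark in the post measures.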

Monitoring Milvus with Grafana and Loki

Date published
April 11, 2024

Author(s)
By Stephen Batifol

Language
English

Word count
1333

Hacker News points
None found.

What's this blog post about?

This guide provides step-by-step instructions on setting up Grafana and Loki to effectively monitor Milvus deployments. Milvus is a distributed vector database designed for storing, indexing, and managing massive embedding vectors. Grafana is an open-source platform for monitoring and observability, while Loki pairs with Grafana as a log aggregation system. Together, they offer a solid monitoring setup for Milvus and beyond. The prerequisites include Docker, Kubernetes, Helm, and kubectl. After setting up the K8s cluster, users can deploy Grafana and Loki using Helm. Finally, configure Grafana data sources and dashboard to visualize and query logs effectively.
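
The Helm-based installation the guide walks through can be sketched as follows. Chart names, values, and the resulting service name may differ between chart versions, so treat this as an outline to check against the current Grafana charts rather than exact commands:

```shell
# Add the Grafana chart repository (hosts both the Grafana and Loki charts)
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

# Install Loki with Grafana enabled in one stack
helm install loki grafana/loki-stack \
  --set grafana.enabled=true \
  --namespace monitoring --create-namespace

# Verify the pods are running, then port-forward Grafana to log in
kubectl get pods -n monitoring
kubectl port-forward -n monitoring svc/loki-grafana 3000:80
```

Once Grafana is reachable, the remaining steps from the guide are UI work: add Loki as a data source and build a dashboard that queries the Milvus pod logs.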

The Cost of Open Source Vector Databases: An Engineer’s Guide to DIY Pricing

Date published
April 8, 2024

Author(s)
By Steffi Li

Language
English

Word count
1764

Hacker News points
None found.

What's this blog post about?

The cost of open source vector databases can be complex and challenging to quantify. Engineers often start projects using free software like Milvus, but hardware costs soon arise. Running a distributed database requires setting up dependencies such as Kafka or Pulsar for WAL, etcd for metadata storage, and Kubernetes for orchestration. Additional costs include load balancers, monitoring and logging tools, EC2 instances for worker nodes, and storage solutions like S3 or Azure Blob. Some aspects of running an open-source vector database are difficult to quantify, such as capacity planning, setup phase tasks, routine maintenance, troubleshooting latency issues, and disaster recovery plans. Other costs include time to market, engineering morale and retention, and risk mitigation. To assess costs in vector database management, performance tests should be conducted to gather data on how the database handles real-life workloads. Optimizing for cost involves adopting dynamic scaling, adjusting recall accuracy, latency, and throughput according to project needs, and using MMap to store less data in memory. The decision of how to manage a vector database ultimately comes down to comparing these costs and choosing the most cost-effective option.

Redis tightens its license: How can an OSS company survive in the Cloud Era

Date published
April 5, 2024

Author(s)
By James Luan

Language
English

Word count
1076

Hacker News points
None found.

What's this blog post about?

Redis, a popular open-source database, has transitioned from the BSD license to the Server Side Public License (SSPLv1), causing some controversy. This change may lead to multiple Linux distributors dropping Redis from their codebases, but alternative options like Valkey and Microsoft's Garnet are available. The shift in open-source licensing has been driven by cloud computing's impact on the traditional business model of open-source software companies. Some open-source projects have adopted more restrictive licenses to protect their profits, while others continue to offer permissive licenses and focus on commercial services. Companies like Zilliz are finding new ways to balance open source and commercialization by offering unique capabilities in their managed services while maintaining compatibility with the open-source API.

The Evolution and Future of Vector Databases: Insights from Charles, CEO of Zilliz

Date published
April 4, 2024

Author(s)
By Charles Xie

Language
English

Word count
1737

Hacker News points
None found.

What's this blog post about?

Charles, CEO of Zilliz, discusses the evolution and future of vector databases in AI applications. He explains that vector databases are designed to manage and query unstructured data like images, videos, and natural language through deep learning algorithms and semantic queries. They are widely used in recommendation systems, chatbots, and semantic search. The current landscape of vector databases includes purpose-built ones like Milvus, traditional databases with a vector search plugin like Elasticsearch, lightweight vector databases like Chroma, and more technologies with vector search capabilities like FAISS. Charles shares insights into building the Milvus vector database system, emphasizing its support for heterogeneous computing, both vertical and horizontal scalability, and a smooth developer experience from prototyping to production. He also provides guidance on choosing the right vector database for businesses based on performance requirements and projected data volume growth. Charles predicts that future vector databases will extend their capabilities beyond similarity-based search to include exact search or matching, as well as support additional vector computing workloads like clustering and classification.

Building a Tax Appeal RAG with Milvus, LlamaIndex, and GPT

Date published
April 3, 2024

Author(s)
By Ash Naik

Language
English

Word count
794

Hacker News points
None found.

What's this blog post about?

A group of four strangers, including a Product Manager, full-stack developers, and an AI enthusiast, came together during a monthly Hackathon in Seattle to build SaveHaven, a Retrieval Augmented Generation (RAG) app that helps individuals contest property and income tax assessments by leveraging technologies like LlamaIndex, Milvus, and GPT from OpenAI. By automating data collection and analysis from public records, the app simplifies the tax appeal process for the general public. The team's experience serves as an example for future entrepreneurs building meaningful innovations with GenAI technologies.

An LLM Powered Text to Image Prompt Generation with Milvus

Date published
April 2, 2024

Author(s)
By Werner Oswald

Language
English

Word count
797

Hacker News points
None found.

What's this blog post about?

The author discovered their love for open-source image-generating AI systems and started searching through webpages to find cool images and the prompts that made them. They used those prompts to make their own images, but it took a lot of time. To speed up the process, they downloaded millions of prompts and put them into a Milvus vector database. The system was able to fetch similar results based on simple prompts entered into a UI. Users found that the system produced better results than their regular prompts did. The author chose Milvus for performance reasons, as it was five times faster than pgvector with almost the same code. They also added instructions telling the LLM that it was a prompt engineer and provided some example conversation history to get it to start producing wonderful images. The next step is to add the same function for negative prompts, which tell the model what to avoid and can further improve the generated images.

JSON and Metadata Filtering in Milvus

Date published
March 26, 2024

Author(s)
By Christy Bergman

Language
English

Word count
1140

Hacker News points
None found.

What's this blog post about?

JSON, or JavaScript Object Notation, is a flexible data format used for storage and transmission. It employs key-value pairs adaptively, making it ideal for NoSQL databases and API results. Milvus Client, a wrapper around the Milvus collection object, uses a flexible JSON "key":value format to allow schema-less data definitions. This makes it faster and less error-prone than defining a full schema upfront. Under this schema-less approach, only the id and vector fields are defined up front, with the remaining fields determined flexibly when the data is inserted into Milvus. JSON data can be uploaded directly into Milvus, which also supports metadata filtering on JSON fields and JSON array data types.
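
Metadata filtering over JSON fields is expressed as a boolean filter string. A minimal sketch of a schema-less row and its filter (the field names and the `demo` collection are made up for illustration, and the pymilvus calls are shown as comments because they need a running Milvus instance):

```python
# A row in the schema-less format: id and vector, plus arbitrary extra keys.
row = {
    "id": "imdb-001",
    "vector": [0.1, 0.2, 0.3, 0.4],
    "metadata": {"year": 2021, "genre": "drama", "tags": ["indie", "festival"]},
}

def year_genre_filter(min_year: int, genre: str) -> str:
    """Build a Milvus filter expression over keys inside a JSON field."""
    return f'metadata["year"] >= {min_year} and metadata["genre"] == "{genre}"'

expr = year_genre_filter(2020, "drama")

# With pymilvus, this would be used roughly as:
#   client = MilvusClient(uri="http://localhost:19530")
#   client.insert("demo", data=[row])
#   client.search("demo", data=[query_vec], filter=expr, output_fields=["metadata"])
```

The filter syntax indexes into the JSON field with bracketed keys, so new metadata keys can be added per row without any schema change.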

Community and Open Source Contributions in Vector Databases

Date published
March 26, 2024

Author(s)
By Stephen Batifol

Language
English

Word count
1077

Hacker News points
None found.

What's this blog post about?

Vector databases, designed to store high-dimensional data points, are particularly useful in handling unstructured data such as image recognition, natural language processing, and recommendation systems. The open-source nature of many vector database projects allows for diverse contributions from various individuals and organizations, fostering innovation and transparency. Open source also promotes accessibility, enabling a wider range of projects and innovations. Community collaboration is crucial in the development of vector databases, with knowledge sharing and inclusive participation playing significant roles. Resources such as well-maintained documentation, chat channels, hackathons, meetups, and conferences contribute to fostering a sense of community and driving innovation. Contributing to open-source vector databases involves finding contribution opportunities, engaging with the community, and understanding that contributions are not limited to coding. Success stories include improvements in scalability, performance, usability, and accessibility of vector databases due to open-source contributions and active community engagement. Challenges faced by these projects include managing a high volume and variety of contributions and balancing diverse interests and visions. However, with robust systems for tracking, reviewing, and integrating contributions, as well as transparent decision-making processes, these challenges can be addressed effectively. In conclusion, the open-source model has proven to be a driving force in advancing vector databases, breaking down barriers, and democratizing access to cutting-edge technology. The diverse community of contributors ensures that these tools are continually improving in terms of robustness, efficiency, and versatility.

Milvus 2.4 Unveils CAGRA: Elevating Vector Search with Next-Gen GPU Indexing

Date published
March 20, 2024

Author(s)
By Li Liu

Language
English

Word count
1587

Hacker News points
None found.

What's this blog post about?

Milvus 2.4 introduces CAGRA (CUDA Anns GRAph-based), a GPU-based graph index that significantly enhances vector search performance. Leveraging the parallel capabilities of GPUs, CAGRA offers improved efficiency in both small and large batch queries compared to traditional methods like HNSW. Additionally, CAGRA accelerates index building by approximately 10 times. The integration of CAGRA into Milvus marks a significant milestone in overcoming challenges associated with GPU-based vector search algorithms and sets the stage for future advancements in high recall, low latency, cost efficiency, and scalability in vector search.
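
From a user's perspective, adopting CAGRA is mostly a matter of index parameters. A sketch of what those look like (the degree values are illustrative defaults from the CAGRA literature rather than recommendations from the post, and the `create_index` call is commented out because it needs a GPU-enabled Milvus deployment):

```python
# GPU_CAGRA index parameters: a denser intermediate kNN graph is built first,
# then pruned down to `graph_degree` edges per node for search.
index_params = {
    "index_type": "GPU_CAGRA",
    "metric_type": "L2",
    "params": {
        "intermediate_graph_degree": 64,  # degree of the initial graph
        "graph_degree": 32,               # degree after pruning
    },
}

# With pymilvus, applied to a collection roughly as:
#   collection.create_index(field_name="embedding", index_params=index_params)
```

Larger graph degrees trade index size and build time for recall, the same knob HNSW users tune with its connectivity parameters.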

What’s New in Milvus 2.4.0?

Date published
March 20, 2024

Author(s)
By Steffi Li

Language
English

Word count
629

Hacker News points
None found.

What's this blog post about?

Milvus 2.4, a significant update to search capabilities for large datasets, has been released. This version accelerates search and moves Milvus toward a unified search platform capable of serving diverse search use cases with exceptional speed and precision. Key highlights include support for NVIDIA's CAGRA Index, Multi-vector Search, Grouping Search, beta support for sparse vector embeddings, and other key enhancements. These updates significantly boost Milvus's performance and versatility for complex data operations.

Build Real-time GenAI Applications with Zilliz Cloud and Confluent Cloud for Apache Flink®

Date published
March 19, 2024

Author(s)
Jiang Chen

Language
English

Word count
762

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud has partnered with Confluent to unlock semantic search for real-time updates powered by Apache Kafka, Apache Flink, and the Milvus vector database. The new cloud-native, serverless Apache Flink service is now available directly alongside cloud-native Apache Kafka on Confluent's fully managed data streaming platform. This integration enables users to easily build high-quality, reusable data streams for real-time GenAI applications. By leveraging Kafka and Flink as a unified platform, teams can connect to data sources across any environment, clean and enrich data streams on the fly, and deliver them in real-time to the Milvus vector database for efficient semantic search or recommendation.

Using Similarity Search - How Not to Lose Meetup Content on the Internet

Date published
March 19, 2024

Author(s)
Stephen Batifol

Language
English

Word count
1207

Hacker News points
None found.

What's this blog post about?

The author discusses the problem of losing valuable content from Meetup events and how similarity search techniques can be used to address this issue. They introduce Milvus, an open-source vector database that excels in managing complex data landscapes, and SentenceTransformers, a Python framework for generating text embeddings. The author demonstrates how to use these tools to create a system that searches for similar content within Meetup descriptions. By using OpenAI GPT-3.5-turbo to summarize the content of Meetups, they aim to improve search results by reducing noise in event descriptions.
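The core retrieval step the post describes can be sketched in a few lines: event descriptions become vectors, and a query returns the nearest one by cosine similarity. Real embeddings would come from SentenceTransformers and be stored in Milvus; the three-dimensional vectors and event titles below are made up for the sketch.

```python
import math

# Toy illustration of similarity search over Meetup descriptions:
# each description maps to a vector, and the query returns the closest.
events = {
    "Berlin Milvus meetup on vector search": [0.9, 0.1, 0.0],
    "Kubernetes operations workshop": [0.1, 0.8, 0.3],
    "Intro to RAG with open-source LLMs": [0.7, 0.2, 0.4],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

query = [0.85, 0.15, 0.1]  # pretend embedding of "vector database meetup"
best = max(events, key=lambda title: cosine(query, events[title]))
print(best)  # the Milvus meetup scores highest for this query
```

In the actual pipeline, Milvus performs this nearest-neighbor step at scale over the summarized event descriptions.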

RAG Evaluation Using Ragas

Date published
March 18, 2024

Author(s)
Christy Bergman

Language
English

Word count
1018

Hacker News points
None found.

What's this blog post about?

Retrieval Augmented Generation (RAG) is an approach to building AI-powered chatbots that answer questions grounded in data the model was not necessarily trained on, such as an organization's own documents. However, natural language retrieval accuracy remains low, so RAG parameters need to be tuned through experiments before deployment. Large Language Models (LLMs) are increasingly used as judges for modern RAG evaluation, automating and speeding up assessment while offering scalability and saving the time and cost of manual human labeling. Two primary flavors of LLM-as-judge for RAG evaluation are MT-Bench and Ragas, with the latter emphasizing automation and scalability. The key data points needed for a Ragas evaluation are the question, the retrieved contexts, the generated answer, and the ground-truth answer.
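The four data points the post lists map directly onto one evaluation record. Below is a minimal sketch of that record; the field names mirror the Ragas dataset schema as documented, while the question and texts are invented for illustration.

```python
# One Ragas evaluation example: question, retrieved contexts,
# generated answer, and ground-truth answer. Content is invented.
eval_record = {
    "question": "What vector index does Milvus 2.4 add?",
    "contexts": [
        "Milvus 2.4 introduces CAGRA, a GPU-based graph index.",
    ],
    "answer": "Milvus 2.4 adds the GPU-based CAGRA index.",
    "ground_truth": "CAGRA, a GPU graph index.",
}

# A full run would wrap a list of such records in a datasets.Dataset and
# pass it to ragas.evaluate() with metrics such as faithfulness and
# answer correctness; that step needs LLM API keys, so it is only noted here.
required = {"question", "contexts", "answer", "ground_truth"}
assert required <= set(eval_record)
```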

Building an AI-driven Car Repair Assistant with Milvus and the OpenAI LLM

Date published
March 13, 2024

Author(s)
Lin Liu

Language
English

Word count
467

Hacker News points
None found.

What's this blog post about?

The AI-driven car repair assistant project aims to create an interactive and reliable platform for drivers seeking automotive advice and solutions. Using the OpenAI LLM model with Milvus vector database, the product combines user inputs with AI capabilities to provide relevant diagnostic suggestions. This tool hopes to revolutionize car maintenance and repair in today's digital world by refining the process of identifying car issues and broadening access to expert advice.

Zilliz Cloud Now Available on Azure Marketplace

Date published
March 11, 2024

Author(s)
Steffi Li

Language
English

Word count
414

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud is now available on Azure Marketplace, following its successful integration into AWS and GCP marketplaces. This expansion simplifies subscription management and billing, allowing smoother integration into developers' existing Azure workflows. Getting started with Zilliz Cloud on Azure Marketplace involves searching for "Zilliz Cloud," subscribing, configuring the project and SaaS details, linking the Azure Marketplace subscription with a Zilliz Cloud account, and setting Azure Marketplace as the payment method. This integration enables developers to easily incorporate Zilliz Cloud's powerful capabilities into their AI projects.

Building an AI Agent for RAG with Milvus and LlamaIndex

Date published
March 11, 2024

Author(s)
Yujian Tang

Language
English

Word count
1380

Hacker News points
None found.

What's this blog post about?

In 2023, large language models (LLMs) gained immense popularity, leading to the development of two main types of LLM applications: retrieval augmented generation (RAG) and AI agents. RAG involves using a vector database like Milvus to inject contextual data, while AI Agents use LLMs to utilize other tools. This article combines these two concepts by building an AI Agent for RAG using Milvus and LlamaIndex. The tech stack includes Milvus, LlamaIndex, and OpenAI (or alternatively OctoAI or HuggingFace). The process involves spinning up Milvus, loading data into it via LlamaIndex, creating query engine tools for the AI Agent, and finally building the AI Agent for RAG. This architecture allows an AI Agent to perform RAG on documents by providing it with the necessary tools for querying a vector database.

Stephen Batifol - Why I Joined Zilliz

Date published
March 6, 2024

Author(s)
Stephen Batifol

Language
English

Word count
371

Hacker News points
None found.

What's this blog post about?

Stephen Batifol, Developer Advocate at Zilliz in Berlin, is organizing events and creating content to help people understand and use Milvus. With experience as an Android developer, data scientist, machine learning engineer, and now a developer advocate, he has always aimed to simplify the work of data scientists and software engineers. His interest in open-source projects led him to Zilliz, where he is excited to build a community from scratch and engage with people at various events. Batifol plans to immerse himself technically in the domain, start a new Meetup series in Berlin, and release open-source projects soon. He encourages interested candidates to join Zilliz as Developer Advocates across different regions.

Will Retrieval Augmented Generation (RAG) Be Killed by Long-Context LLMs?

Date published
March 5, 2024

Author(s)
James Luan

Language
English

Word count
1858

Hacker News points
None found.

What's this blog post about?

Google's Gemini 1.5, an LLM capable of handling contexts of up to 10 million tokens, and OpenAI's Sora, a text-to-video model, have sparked discussions about the future of AI, particularly the role and potential demise of Retrieval Augmented Generation (RAG). Gemini 1.5 Pro supports ultra-long contexts of up to 10 million tokens and multimodal data processing. In a "needle-in-a-haystack" evaluation, Gemini 1.5 Pro achieves 100% recall up to 530,000 tokens and maintains over 99.7% recall up to 1M tokens; even with a super-long document of 10M tokens, the model retains an impressive 99.2% recall rate. While Gemini excels at managing extended contexts, it grapples with persistent challenges the post encapsulates as the 4Vs: Velocity, Value, Volume, and Variety. These include the difficulty of achieving sub-second response times over extensive contexts, the considerable inference costs of generating high-quality answers in long contexts, the vastness of unstructured data that an LLM may not adequately capture, and the diverse range of structured data. Strategies for optimizing RAG effectiveness include enhancing long-context understanding, using hybrid search for improved search quality, and leveraging advanced technologies to enhance RAG's performance. The RAG framework is still a linchpin for the sustained success of AI applications: its provision of long-term memory for LLMs proves indispensable for developers seeking an optimal balance between query quality and cost-effectiveness.

Using Your Vector Database as a JSON (or Relational) Datastore

Date published
March 4, 2024

Author(s)
Frank Liu

Language
English

Word count
1436

Hacker News points
None found.

What's this blog post about?

The blog post discusses the use of vector databases, such as Milvus or Zilliz Cloud, as a JSON (or relational) datastore. It explains how to create a collection in Milvus and perform CRUD operations on JSON data stored within it. The author demonstrates querying, updating, and deleting records using Python code snippets. Additionally, the post introduces a package called milvusmongo that implements basic CRUD functionality across collections using Milvus as the underlying database instead of MongoDB. The author emphasizes that vector databases are not meant to replace NoSQL databases or lexical text search engines but can be used as an efficient data store for solo developers and small teams, with the option to optimize infrastructure usage later as they grow.
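The pattern the post describes — JSON records queried through a filter expression, the way Milvus's `query(expr=...)` works — can be sketched in plain Python. The tiny evaluator below handles only one expression form and is a hypothetical stand-in; Milvus's real expression language and its `insert`/`query`/`delete` APIs are far richer.

```python
# In-memory sketch of CRUD over JSON-style records with a filter "query",
# mimicking the shape of Milvus's query(expr=...) pattern from the post.
records = [
    {"pk": 1, "title": "intro", "meta": {"views": 120}},
    {"pk": 2, "title": "deep dive", "meta": {"views": 45}},
]

def query(docs, field, op, value):
    # hypothetical helper: supports only ">" and "==" on a top-level field
    ops = {">": lambda a, b: a > b, "==": lambda a, b: a == b}
    return [d for d in docs if ops[op](d[field], value)]

records.append({"pk": 3, "title": "update notes", "meta": {"views": 0}})  # create
hits = query(records, "pk", ">", 1)                                       # read
records[0]["meta"]["views"] += 1                                          # update
records = [d for d in records if d["pk"] != 2]                           # delete
print([d["pk"] for d in hits])  # pks matching the read filter: [2, 3]
```

The point of the post is that for solo developers and small teams, this style of JSON storage can live directly in the vector database rather than a separate NoSQL store.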

Zilliz Cloud Introduces BYOC for Greater Data Sovereignty and Compliance

Date published
Feb. 29, 2024

Author(s)
Steffi Li

Language
English

Word count
820

Hacker News points
None found.

What's this blog post about?

Zilliz has introduced Zilliz Cloud Bring Your Own Cloud (BYOC) to provide greater data sovereignty and compliance. This solution allows customers to use Zilliz Cloud's managed services while keeping their data within their private network. The architecture of Zilliz Cloud BYOC is built on two main pillars: the Data Plane, which encompasses all essential components for data collection, management, and query processing; and the Control Plane, responsible for deployment, management, and seamless coordination across all instances of the Zilliz Data Plane. The new deployment model in BYOC lets customers deploy the Data Plane within their own Virtual Private Cloud (VPC) while the Control Plane remains managed by Zilliz. This setup offers benefits such as data security and compliance, fine-grained control, and cost savings. Security measures include adherence to the Principle of Least Privilege, controlled access for software updates, and data plane access restrictions. BYOC is currently available on AWS with plans to expand to other cloud providers in the future.

How Sohu Enhances Personalized News Recommendation with Milvus

Date published
Feb. 28, 2024

Author(s)
Fendy Feng

Language
English

Word count
650

Hacker News points
None found.

What's this blog post about?

Sohu, a NASDAQ-listed company, partnered with Milvus to enhance its news recommendation system. The outdated legacy vector search stack in the recommender system was struggling to deliver real-time, personalized news due to slow retrieval and scalability issues. Milvus, an open-source vector database, provided a solution for handling large datasets and improving classification accuracy of short-text news articles. Sohu News integrated Milvus into its recommender system using a dual-tower structure and achieved a 10x faster vector retrieval speed and significantly improved recommendation accuracy. The collaboration with Milvus has transformed the user experience by offering more personalized and engaging news content.

Finding the Right Fit: Automatic Embeddings Support for AI Retrieval (RAG) in Zilliz Cloud Pipelines from OSS, VoyageAI, and OpenAI

Date published
Feb. 27, 2024

Author(s)
Christy Bergman

Language
English

Word count
1579

Hacker News points
None found.

What's this blog post about?

This blog post discusses the use of embedding models in Retrieval Augmented Generation (RAG) applications. RAG is an approach used to enhance question-answering bots by integrating domain knowledge into AI's knowledge base. The process involves using embedding models to generate vector embeddings of chunks of text from all documents, followed by indexing and search using the same embedding model. Finally, a large language model (LLM) generates an answer based on the given domain knowledge. The most common type of embedding model is SBERT (Sentence-BERT), which specializes in understanding complete sentences. The HuggingFace MTEB Leaderboard provides a list of embedding models sorted by retrieval performance, making it easier for developers to choose the best model for their needs. Zilliz Cloud Pipelines support various embedding models, including BAAI/bge-base-en(or zh)-v1.5, VoyageAI's voyage-2 and voyage-code-2, and OpenAI's text-embedding-3-small(or large). Each model has its advantages and is best suited for different use cases. In conclusion, embedding models play a crucial role in enhancing AI retrieval capabilities by integrating domain knowledge into the AI's knowledge base. The choice of an appropriate embedding model depends on factors such as context length, embedding dimensions, and specific use case requirements.

Building RAG Apps Without OpenAI - Part Two: Mixtral, Milvus and OctoAI

Date published
Feb. 26, 2024

Author(s)
Yujian Tang

Language
English

Word count
2044

Hacker News points
None found.

What's this blog post about?

This blog discusses building Retrieval Augmented Generation (RAG) applications without using OpenAI's GPT models. The authors demonstrate how to build RAG apps with Mixtral, Milvus, and OctoAI. They also provide an overview of the tools involved in this process: Mixtral as the LLM, Milvus as the vector database, OctoAI for serving the LLM and embedding model, and LangChain as the orchestrator. The tutorial covers setting up RAG tools, loading data into a vector database, querying data with OctoAI and Mixtral, and leveraging Mixtral's multilingual capabilities.

Exploring Multimodal Embeddings with FiftyOne and Milvus

Date published
Feb. 23, 2024

Author(s)
Yujian Tang

Language
English

Word count
1514

Hacker News points
None found.

What's this blog post about?

This tutorial explores the concept of multimodal embeddings using open-source tools like Voxel51 and Milvus. It covers the meaning of "multimodal", how Milvus handles multimodal embeddings, examples of multimodal models, and how to use FiftyOne and Milvus for multimodal embedding exploration. The tutorial uses Fashion MNIST dataset with CLIP-VIT model from OpenAI to demonstrate the process. It also discusses how to further customize FiftyOne for data exploration with Milvus and provides a summary of exploring multimodal embeddings with Voxel51 and Milvus.

Building Zilliz Cloud in 18 months: Lessons learned while creating a scalable Vector Search Service on the public cloud

Date published
Feb. 16, 2024

Author(s)
James Luan

Language
English

Word count
3009

Hacker News points
None found.

What's this blog post about?

The article details the creation of Zilliz Cloud, a fully managed service powered by Milvus, the most adopted open-source vector database, developed from the ground up over eighteen months. It covers the design choices and invaluable insights gained during the journey to build this cloud service. The author emphasizes maximizing the use of mature third-party products, simplifying architecture, anticipating day 2 challenges from day 1, and focusing on cloud finops as key principles for building a successful cloud service. They also discuss the lessons learned while creating a scalable Vector Search Service on the public cloud and acknowledge the support of their users in this endeavor.

TL;DR Milvus regression in LangChain v0.1.5

Date published
Feb. 12, 2024

Author(s)
Christy Bergman

Language
English

Word count
577

Hacker News points
None found.

What's this blog post about?

A recent regression in LangChain v0.1.5 causes a "KeyError: 'pk'" error when connecting to Milvus, due to the absence of an automatically generated primary key field during insertion. The temporary workaround is to downgrade to LangChain version <= v0.1.4 until the fix is officially merged. A permanent fix, which handles cases where "pk" is not present during insertion, will ship in an upcoming update; until then, users can either downgrade their LangChain version or wait for the official fix.

Zilliz Cloud Pipelines February Release - 3rd Party Embedding Models and Usability Improvements!

Date published
Feb. 9, 2024

Author(s)
Jiang Chen

Language
English

Word count
857

Hacker News points
None found.

What's this blog post about?

Zilliz has released an update to its Cloud Pipelines, focusing on embedding models and usability improvements. The February release includes new 3rd-party embedding models from OpenAI and Voyage AI, providing a total of six options for users. Additionally, the platform now supports all dedicated vector db clusters in GCP's us-west-2 region, enhancing performance and reliability. Usability improvements include a new "Run Pipeline" page, local file upload feature, and support for running pipelines on any type of vector database cluster.

Zilliz Joins the AI Alliance: Advancing Open Innovation in AI for a Better Future

Date published
Feb. 8, 2024

Author(s)
Charles Xie

Language
English

Word count
393

Hacker News points
None found.

What's this blog post about?

Zilliz has joined the AI Alliance, a consortium promoting open innovation in AI for responsible development and safe practices. The company's journey into open source began with its vector database Milvus, which was donated to the Linux Foundation. Open-source projects foster transparency, collaboration, innovation, and accessibility. Zilliz is committed to working alongside other industry players within the AI Alliance to shape a future where AI benefits everyone and positively impacts society.

Introducing the Databricks Connector, a Well-Lit Solution to Streamline Unstructured Data Migration and Transformation

Date published
Feb. 8, 2024

Author(s)
Jiang Chen

Language
English

Word count
1107

Hacker News points
None found.

What's this blog post about?

This integration enables developers to effortlessly transfer data from Spark/Databricks to Milvus/Zilliz Cloud, whether in real-time or batch mode. By leveraging the Databricks Connector for Apache Arrow, developers can streamline their workflow and focus on building efficient and scalable AI solutions using these powerful technologies. The integration approach involves connecting Spark to Milvus through a shared filesystem such as S3 or MinIO buckets. By granting access to Spark or Databricks, the Spark job can use Milvus connectors to write data to the bucket in batch and then bulk-insert the entire collection for serving. To help developers get started quickly, we have prepared a notebook example that walks them through the streaming and batch data transfer processes with Milvus and Zilliz Cloud. This integration empowers developers to build efficient and scalable AI solutions, unlocking the full potential of these powerful technologies. For more information on this integration and its use cases, check out the official documentation for Databricks Connector for Apache Arrow.

The High-performance Vector Database Zilliz Cloud Now Available on Google Cloud Marketplace

Date published
Feb. 7, 2024

Author(s)
Steffi Li

Language
English

Word count
623

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud, a fully managed Milvus vector database that supports various AI applications, is now available on Google Cloud Marketplace. This integration simplifies billing as charges will appear directly on the developer's regular Google Cloud bill. Users can easily subscribe to the Zilliz service using their existing GCP account and access all features without upfront costs. A 100 credit bonus is available for new sign-ups, enabling developers to kickstart their journey with Zilliz Cloud on GCP.

Crafting Superior RAG for Code-Intensive Texts with Zilliz Cloud Pipelines and Voyage AI

Date published
Feb. 7, 2024

Author(s)
Jiang Chen

Language
English

Word count
694

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud Pipelines has integrated the Voyage AI embedding models, voyage-2 and voyage-code-2, which have shown outstanding performance in retrieval tasks related to source code, technical documentation, and general tasks. The incorporation of these models enhances the RAG system implemented with various embedding models for code-related tasks. Notably, when compared to other popular embedding models on code datasets, Voyage's models demonstrate significantly better retrieval capability and lead to over ten percentage point improvements in Answer Correctness and overall performance scores.

Zilliz Cloud Enhances Data Protection with More Granular RBAC

Date published
Feb. 6, 2024

Author(s)
Sarah Tang

Language
English

Word count
1157

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud has introduced enhanced Role-Based Access Control (RBAC) functionality, providing more nuanced RBAC capabilities to improve access management and data isolation. The updated system features two primary categories of roles - Operation Layer Roles and Data Layer Roles - catering to diverse developer requirements. In the operational layer, Zilliz Cloud has four predefined Organization and Project Roles: Organization Owner, Organization Member, Project Owner, and Project Member. Additionally, it offers three predefined Cluster Roles in the data layer: Admin, Read-Write, and Read-Only. Users can also create custom roles to fine-tune permissions for specific collections or operations. The enhanced RBAC capabilities are exemplified through real-world use cases such as cross-team collaboration in a medium-sized company and managing a RAG-based knowledge base. These features ensure effective data management, improved security, and efficient resource allocation.

Choosing a Vector Database: Milvus vs. Chroma

Date published
Feb. 5, 2024

Author(s)
Fendy Feng

Language
English

Word count
1401

Hacker News points
None found.

What's this blog post about?

In this comparison, we delve into the functionalities and performance of two open-source vector databases: Milvus and Chroma. We assess these platforms based on their capabilities in handling vector data storage, indexing, searching, scalability, and ecosystem support. Additionally, we examine the purpose-built features and performance trade-offs between Milvus and Chroma. Milvus is a versatile and comprehensive open-source vector database, offering extensive support for various index types, including 11 different options. It supports hybrid search operations and offers flexible in-memory and on-disk indexing configurations. Furthermore, Milvus ensures strong consistency and provides multi-language SDKs encompassing Python, Java, JavaScript, Go, C++, Node.js, and Ruby. On the other hand, Chroma is a relatively simpler vector database with a primary focus on enabling easy initiation and usage. It currently supports only the HNSW algorithm for its KNN search operations and lacks advanced features such as RBAC support. Additionally, it offers limited SDK options, primarily focusing on Python and JavaScript. While Chroma's simplicity may be adequate for specific applications, its limitations could restrict its adaptability across diverse use cases. With its comprehensive functionality and extensive feature set, Milvus emerges as a more versatile and scalable solution for addressing a broader spectrum of vector data management needs. In the upcoming Milvus 2.4 release, we plan to support the inverted index with tantivy, which promises significant enhancements to prefiltering speed. This update further solidifies Milvus as a cutting-edge open-source vector database that continues to evolve and adapt to emerging requirements in the AI ecosystem. In summary, while Chroma offers simplicity and ease of use, Milvus distinguishes itself with its comprehensive feature set, extensive index type support, and robust multi-language SDKs. 
As a result, Milvus remains a highly recommended open-source vector database for developers and organizations seeking to optimize their applications' performance, scalability, and data management capabilities. Milvus Lite, a lightweight alternative to the full Milvus version, has also been introduced. It aims to preserve the ease of initiation while retaining an extensive set of features, making it particularly useful for specific use cases such as integration into Python applications without adding extra weight or spinning up a Milvus instance in Colab or Notebook for quick experiments.

An Introduction to Milvus Architecture

Date published
Feb. 2, 2024

Author(s)
Yujian Tang

Language
English

Word count
1420

Hacker News points
None found.

What's this blog post about?

Milvus is a distributed system designed to scale vector operations, addressing the challenges of scalability in vector databases. Unlike traditional databases, vector data doesn't require complex transactions and has diverse use cases that necessitate tunable tradeoffs between performance and consistency. Some vector data operations are computationally expensive, requiring elastic resource allocation. Milvus achieves horizontal scaling through its deliberate design as a distributed system, overcoming the limitations of single-instance databases. It contains four layers: access, coordination, worker, and storage. The separation of concerns in querying, data ingestion, and indexing allows for independent scaling of each operation. Milvus ensures large-scale write consistency through sharding and supports pre-filtering metadata search to enhance efficiency. Its unique architecture provides benefits such as horizontal scaling and flexibility, making it a suitable choice for cloud-native vector databases catering to diverse use cases.

Introducing Cardinal: The Most Performant Engine For Vector Searches

Date published
Feb. 1, 2024

Author(s)
Alexandr Guzhva

Language
English

Word count
1665

Hacker News points
None found.

What's this blog post about?

Cardinal is a new vector search engine developed by Zilliz, which has demonstrated a threefold increase in performance compared to the previous version. It offers a search performance (QPS) that reaches tenfold that of Milvus. Cardinal is capable of performing brute-force search, creating and modifying ANNS indices, working with various input data formats, and filtering results during the search based on user-provided criteria. The key to Cardinal's speed lies in its algorithm optimizations, engineering optimizations, low-level optimizations, and AutoIndex feature for search strategy selection.

Nurturing Innovation: Our Approach to Feature Deployment from Open-Source Milvus to Zilliz Cloud

Date published
Jan. 30, 2024

Author(s)
James Luan

Language
English

Word count
580

Hacker News points
None found.

What's this blog post about?

James Luan, VP of Engineering at Zilliz, discusses the company's commitment to innovation and community collaboration through open-source projects like Milvus. The four essential freedoms of open source, as emphasized by Richard Stallman, guide their approach to feature deployment from Milvus to Zilliz Cloud. They follow three fundamental principles: iteration with precision, testing the waters, and quality over speed. Despite occasional delays in feature deployment, they prioritize maintaining a robust and reliable platform while encouraging community feedback for continuous improvement.

The Best Vector Database Just Got Better

Date published
Jan. 30, 2024

Author(s)
Frank Liu

Language
English

Word count
1034

Hacker News points
None found.

What's this blog post about?

In 2023, vector databases gained popularity due to the widespread adoption of ChatGPT and other large language models (LLMs). Zilliz Cloud, a vector database service, has seen increased usage in retrieval-augmented generation systems as well as various search and retrieval applications. The platform aims to help computers understand human-generated data such as text, images, bank transactions, and user behaviors. Zilliz Cloud recently introduced new features like range search, multi-tenancy & RBAC, up to 10x improved search & indexing performance, and more in response to customer demand. These enhancements have proven critical for users developing applications that require a purpose-built vector database supporting essential database features and various workloads. Three real-world use cases demonstrate the importance of these new features: efficient autonomous agents, product recommendation systems, and AI-powered drug discovery. In each case, Zilliz Cloud's performance optimizations, adaptability, and range search feature have enabled users to overcome challenges in their respective applications. The platform's ability to handle diverse data types and workloads makes it a valuable tool for developers working with vector databases.
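Of the new features, range search is the most parameter-driven, so here is a sketch of its request shape. In Milvus, a range search keeps only hits whose distance falls inside a window defined by `radius` and `range_filter`; the exact semantics depend on the metric type, and the values below are illustrative, not recommendations.

```python
# Sketch of range-search parameters as documented for Milvus.
# For L2 distance, smaller is closer: "range_filter" sets the inner
# boundary and "radius" the outer boundary of accepted distances.
search_params = {
    "metric_type": "L2",
    "params": {
        "radius": 1.0,        # exclude hits farther than this distance
        "range_filter": 0.2,  # exclude hits closer than this distance
    },
}

# Against a live collection this dict would be passed to
# collection.search(..., param=search_params); only the shape is shown here.
assert {"radius", "range_filter"} <= set(search_params["params"])
```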

New for Zilliz Cloud: Cardinal Search Engine, GCP Marketplace, Databricks Connector and More

Date published
Jan. 30, 2024

Author(s)
Steffi Li

Language
English

Word count
1575

Hacker News points
None found.

What's this blog post about?

Zilliz has introduced new features to its cloud product, enhancing vector search performance and ensuring enterprise-grade security. The latest updates include the Cardinal Search Engine, which delivers a 10x performance boost; Milvus 2.3, offering advanced vector search capabilities for production workloads; GCP Marketplace integration, simplifying budget planning, payment, and procurement processes; and the Databricks Connector, enabling data migration and transformation without custom code. Additionally, Zilliz Cloud now supports role-based access control (RBAC) across both control and data layers for enhanced security and compliance.

Sharding, Partitioning, and Segments - Getting the Most From Your Database

Date published
Jan. 29, 2024

Author(s)
Christy Bergman

Language
English

Word count
1219

Hacker News points
None found.

What's this blog post about?

This blog delves into the concepts of sharding, partitioning, and segments in distributed databases like Milvus. Sharding refers to horizontal data partitioning across multiple servers, enabling faster writing by utilizing distributed systems. Partitioning organizes data for efficient retrieval, optimizing targeted reads. Automatic partitioning is recommended as it minimizes errors and ensures optimal performance. Each shard and partition has segments of data, with growing and sealed segments being the smallest unit in Milvus for load balancing. The default segment size is 512 MB, but adjustments should only be made if there are large machine resources available.
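The core idea of sharding described above — a hash of the entity's key decides which shard receives a write, spreading inserts across servers — can be shown in a few lines. The hash function and shard count here are illustrative; Milvus's actual channel assignment is internal to the system.

```python
# Sketch of hash-based shard routing: each primary key deterministically
# maps to one of NUM_SHARDS shards, so writes spread across the cluster.
NUM_SHARDS = 4

def shard_for(pk: int) -> int:
    return hash(pk) % NUM_SHARDS  # stable for ints in CPython

writes = {}
for pk in range(12):
    writes.setdefault(shard_for(pk), []).append(pk)

for shard, pks in sorted(writes.items()):
    print(shard, pks)  # each shard receives an even slice of the writes
```

The same key always routes to the same shard, which is what makes lookups and load balancing predictable.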

Zilliz Vector Search Algorithm Dominates All Four Tracks of BigANN

Date published
Jan. 26, 2024

Author(s)
Li Liu

Language
English

Word count
1651

Hacker News points
None found.

What's this blog post about?

The BigANN challenge is an important competition in the vector search domain, fostering the development of indexing data structures and search algorithms. Zilliz's solution dominated all four tracks of BigANN 2023, achieving a remarkable up to 2.5x performance improvement. This year's BigANN introduced more significant challenges with larger datasets and complex scenarios across four tracks: filtered, out-of-distribution, sparse, and streaming variants of ANNS. Zilliz's solution is based on graph algorithms and optimizations driven by the specific characteristics of each track. The company plans to integrate these insights into their products, extending their impact on a broader range of issues.

Building RAG Apps Without OpenAI - Part One

Date published
Jan. 17, 2024

Author(s)
Yujian Tang

Language
English

Word count
1615

Hacker News points
None found.

What's this blog post about?

This post discusses the creation of a conversational Retrieval Augmented Generation (RAG) application without using OpenAI. The tech stack includes LangChain, Milvus, and Hugging Face for embedding models. The process involves setting up the conversational RAG stack, creating a conversation, asking questions, and testing the app's memory retention. The example demonstrates how to use Nebula, a conversational LLM created by Symbl AI, in place of OpenAI's GPT-3.5.

How Mozat's Stylepedia and Milvus Are Redefining Your Closet

Date published
Jan. 16, 2024

Author(s)
Fendy Feng

Language
English

Word count
564

Hacker News points
None found.

What's this blog post about?

Singapore-based tech company Mozat has developed an innovative wardrobe management approach with its app, Stylepedia. The app is designed to redefine how users engage with fashion by integrating Milvus, an open-source vector database, to power its smart image search system. This integration allows Stylepedia to manage a rapidly growing database of clothing images, respond to user queries in milliseconds, and handle user-uploaded photos with varying resolutions. By leveraging Milvus, Stylepedia offers personalized style recommendations, facilitates user connections, and enables image searches for similar clothing items.

What’s New in Milvus 2.3.4

Date published
Jan. 15, 2024

Author(s)
Steffi Li

Language
English

Word count
476

Hacker News points
None found.

What's this blog post about?

Milvus 2.3.4, the latest update of the vector database platform, introduces enhancements to improve availability and usability. The release focuses on streamlining monitoring, data import, and search efficiency. Key highlights include access logs for improved system performance insights, Parquet file support for efficient large-scale data operations, Binlog index on growing segments for faster search within expanding datasets, and other improvements such as increased collection/partition support, enhanced memory efficiency, clearer error messaging, faster data loading speeds, and better query shard balance. Developers are encouraged to visit the release notes for a comprehensive overview of all new features and enhancements in Milvus 2.3.4.

Understanding Consistency Models for Vector Databases

Date published
Jan. 11, 2024

Author(s)
Yujian Tang

Language
English

Word count
1479

Hacker News points
None found.

What's this blog post about?

Distributed systems are crucial for vector search applications, offering scalability, fault tolerance, enhanced performance, and global accessibility. Consistency is a key principle in distributed systems, ensuring that data remains accurate across all replicas. The fully distributed Milvus vector database offers Tunable Consistency through its unique architecture, allowing users to scale out data writing while maintaining consistency without additional tools. Consistency levels in Milvus include Eventual, Session, Bounded, and Strong. Eventual consistency ensures that data will eventually be consistent across all replicas, prioritizing speed over immediate data updates. Session consistency maintains up-to-date data within a single session, while Bounded Consistency forces instances and replicas to sync within a certain period. Strong consistency ensures immediate data availability but comes with increased latency. Understanding the levels of consistency is essential for building resilient, high-performing applications that utilize distributed systems.
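The trade-off between these levels can be modeled as a "guarantee timestamp" that a read must wait for. This is a simplified illustration of the concept, not Milvus's actual timestamp mechanism:

```python
# Illustrative sketch of consistency levels as read-wait requirements.
# Level names mirror those in the post; the mechanics are assumptions.

def guarantee_ts(level: str, last_write_ts: int, now: int, staleness: int = 5) -> int:
    """Return the timestamp a replica must have applied before serving a read."""
    if level == "Strong":
        return last_write_ts            # must see every write issued so far
    if level == "Bounded":
        return max(0, now - staleness)  # tolerate a fixed staleness window
    if level == "Eventually":
        return 0                        # serve immediately, no sync required
    # Session level (not modeled here) would pin the guarantee to the
    # session's own last write rather than the global one.
    raise ValueError(f"unknown level: {level}")

def can_serve(replica_applied_ts: int, required_ts: int) -> bool:
    return replica_applied_ts >= required_ts
```

A replica lagging slightly behind the latest write can serve Eventual and Bounded reads immediately, but a Strong read must wait, which is where the extra latency comes from.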

Dissecting OpenAI's Built-in Retrieval: Unveiling Storage Constraints, Performance Gaps, and Cost Concerns

Date published
Jan. 9, 2024

Author(s)
Robert Guo

Language
English

Word count
2284

Hacker News points
None found.

What's this blog post about?

OpenAI's built-in retrieval feature has storage constraints, performance gaps, and cost concerns. The current pricing model of $0.2 per GB per day is expensive compared to traditional document services like Office 365 and Google Workspace. However, the server cost for serving vectors is only about $0.30 per day, which is a bargain compared to the pricing. The architecture of OpenAI Assistants' retrieval feature has limitations such as a maximum of 20 files per assistant, a cap of 512MB per file, and a hidden limitation of 2 million tokens per file. The current architecture may not scale well enough to support larger businesses with more extensive data requirements. To address these challenges and reduce costs, the service's architecture needs to be optimized. A refined vector database solution, hybrid disk/memory vector storage, streamlining disaster recovery by pooling system data, and multi-tenancy support for diverse user base are suggested improvements. Among popular vector databases, Milvus is considered the most mature open-source option with effective separation of system and query components, isolation of query components through Resource Group feature, hybrid memory/disk architecture, and application-level multi-tenancy facilitated by RBAC and Partition features. However, no single vector database solution can comprehensively address all challenges and meet every design requirement for imminent infrastructure development. The choice of vector databases should be tailored to specific requirements to effectively navigate the complexities of optimizing OpenAI Assistants' architecture.

OpenAI RAG vs. Your Customized RAG: Which One Is Better?

Date published
Jan. 5, 2024

Author(s)
Cheney Zhang

Language
English

Word count
2134

Hacker News points
None found.

What's this blog post about?

The OpenAI Assistants' retrieval feature has been a topic of discussion in the AI community, as it incorporates Retrieval Augmented Generation (RAG) capabilities for question-answering. A comparison between OpenAI's built-in RAG and a customized RAG using Milvus shows that while the former slightly outperforms in answer similarity, the latter performs better in context precision, faithfulness, answer relevancy, and correctness. The Milvus-powered Customized RAG system also has higher Ragas Scores than OpenAI's built-in RAG. This superior performance is attributed to factors such as effective utilization of external data, better document segmentation and data retrieval, and the ability for users to adjust parameters in the customized RAG pipeline.

Demystify Benchmark Result Divergence: Milvus vs. Qdrant

Date published
Jan. 4, 2024

Author(s)
Steffi Li

Language
English

Word count
859

Hacker News points
None found.

What's this blog post about?

The blog post examines why Qdrant's published benchmark results diverge from those produced by VectorDBBench. It highlights three reasons for the differences: an outdated Milvus version used in Qdrant's testing, improper use of Milvus that relied only on Growing Segments, and benchmark-driven optimizations in Qdrant that may compromise operational flexibility in real-world scenarios. The post emphasizes the importance of trustworthy, comprehensive benchmarking for vector databases and suggests that developers consult truthful, precise benchmarks, or run their own tests against their own data, to make informed decisions when choosing a vector database.


2023

Optimizing RAG Applications: A Guide to Methodologies, Metrics, and Evaluation Tools for Enhanced Reliability

Date published
Dec. 29, 2023

Author(s)
Cheney Zhang

Language
English

Word count
1700

Hacker News points
None found.

What's this blog post about?

Optimizing Retrieval Augmented Generation (RAG) applications involves using methodologies, metrics, and evaluation tools to enhance their reliability. Three categories of metrics are used in RAG evaluations: those based on the ground truth, those without the ground truth, and those based on LLM responses. Ground truth metrics involve comparing RAG responses with established answers, while metrics without ground truth focus on evaluating the relevance between queries, context, and responses. Metrics based on LLM responses consider factors such as friendliness, harmfulness, and conciseness. Evaluation tools like Ragas, LlamaIndex, TruLens-Eval, and Phoenix can help assess RAG applications' performance and capabilities.
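As a toy example of a ground-truth-based metric, a token-overlap F1 between a generated answer and a reference answer can be computed directly. This is only an illustration of the category; tools like Ragas use LLM-assisted scoring rather than raw token overlap:

```python
# Toy ground-truth metric: F1 over shared tokens between a RAG answer
# and a reference answer. Hypothetical stand-in, not the Ragas metric.

def token_f1(prediction: str, ground_truth: str) -> float:
    pred, truth = prediction.lower().split(), ground_truth.lower().split()
    # count tokens present in both, respecting multiplicity
    common = sum(min(pred.count(t), truth.count(t)) for t in set(pred))
    if common == 0:
        return 0.0
    precision = common / len(pred)
    recall = common / len(truth)
    return 2 * precision * recall / (precision + recall)
```

Metrics without ground truth would instead score the relevance between the query, the retrieved context, and the response, which generally requires a judge model rather than string comparison.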

Harmony in Pixels: Picdmo's Leap into Seamless Photo Management with Zilliz Cloud

Date published
Dec. 27, 2023

Author(s)
Fendy Feng

Language
English

Word count
614

Hacker News points
None found.

What's this blog post about?

Picdmo, an AI-powered photo management app, sought to improve its search performance and user experience. The team initially used Milvus, an open-source vector database, but found it labor-intensive and financially burdensome. They then integrated Zilliz Cloud, a fully managed Milvus service, into their infrastructure. This resulted in response times plummeting from 8 seconds to less than 1 second, even under extreme data loads. The adoption of Zilliz Cloud brought efficient search performance, substantial time and cost savings, and responsive support from the Zilliz team. As Picdmo evolves into a comprehensive multimedia application, its collaboration with Zilliz remains crucial for future features.

How To Evaluate a Vector Database?

Date published
Dec. 26, 2023

Author(s)
Li Liu

Language
English

Word count
1363

Hacker News points
None found.

What's this blog post about?

In the data-driven world, the exponential growth of unstructured data has led to the rise of vector databases. These powerful tools specialize in storing, indexing, and searching unstructured data through high-dimensional numerical representations known as vector embeddings. They are used for building recommender systems, chatbots, and applications for searching similar images, videos, and audio. When selecting a vector database, scalability, functionality, and performance are the top three most crucial metrics to consider. Scalability is essential for accommodating growing data demands effectively, while functionality includes both vector-oriented features like support for multiple index types and database-oriented features such as Change Data Capture (CDC) and multi-tenancy support. Performance is evaluated using benchmarking tools like ANN-Benchmark and VectorDBBench, which measure recall rate, QPS, latency, and other metrics. Various vector search technologies are available beyond vector databases, including vector search libraries, lightweight vector databases, vector search plugins, and purpose-built vector databases. Each type has its strengths and weaknesses, so the choice depends on specific business needs.

What Is A Dynamic Schema?

Date published
Dec. 25, 2023

Author(s)
Yujian Tang

Language
English

Word count
1506

Hacker News points
None found.

What's this blog post about?

This post discusses database schemas, specifically focusing on vector databases and their dynamic schema feature. It explains that SQL databases have predefined schemas while NoSQL databases typically have a dynamic or schemaless schema. The Milvus vector database supports dynamic schema, allowing users to add data in JSON format without defining attributes when creating the database. The article covers how to use dynamic schema with the Milvus vector database and how the feature is implemented. It also discusses the pros and cons of dynamic schemas, such as ease of setup and flexibility but slower filtered search compared to fixed schemas.
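The mechanism can be sketched as a split between declared columns and a catch-all JSON field; Milvus stores undeclared keys in a hidden field it calls $meta. The toy version below only illustrates that split, not the real storage engine:

```python
# Sketch of dynamic schema: declared fields get their own columns, while
# undeclared keys are packed into a hidden JSON catch-all. The declared
# field names here are assumptions for demonstration.

DECLARED = {"id", "vector"}

def split_row(row: dict) -> tuple:
    fixed = {k: v for k, v in row.items() if k in DECLARED}
    dynamic = {k: v for k, v in row.items() if k not in DECLARED}
    return fixed, {"$meta": dynamic}

fixed, meta = split_row({"id": 1, "vector": [0.1, 0.2], "color": "red", "price": 9})
```

This also hints at the trade-off the post describes: filtering on a key inside the JSON catch-all is slower than filtering on a dedicated, indexed column.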

Unlocking Next-Level APK Security: Trend Micro's Journey with Milvus

Date published
Dec. 21, 2023

Author(s)
Fendy Feng

Language
English

Word count
913

Hacker News points
None found.

What's this blog post about?

Trend Micro, a global leader in cybersecurity, has integrated Milvus, an open-source vector database, into their security infrastructure to enhance APK (Android application package) security. The company initially used MySQL for APK similarity search but faced scalability issues as the dataset grew. They then shifted focus to Faiss, which excelled in speed but lacked critical features required for a production environment. Milvus addressed these challenges with seamless integration with mainstream vector index libraries and simple, intuitive APIs. The implementation of Milvus has resulted in low query latency and high data import speed, significantly enhancing Trend Micro's ability to detect and neutralize harmful APKs.

Metadata Filtering with Zilliz Cloud Pipelines

Date published
Dec. 17, 2023

Author(s)
Christy Bergman

Language
English

Word count
1014

Hacker News points
None found.

What's this blog post about?

The text discusses the use of vector databases like Milvus and Zilliz Cloud, which allow hybrid vector and scalar searches. It explains how metadata filtering can be used to perform more precise results that cater to specific needs by limiting search with certain conditions using boolean expressions on scalar fields or primary key field. The text also provides a step-by-step guide on how to create collections and pipelines in Zilliz Cloud, as well as searching via the web console or API calls.
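The idea of combining a scalar filter with vector ranking can be sketched in plain Python. The field names and the tiny in-memory "collection" are assumptions for illustration, not the Zilliz Cloud Pipelines API:

```python
# Illustrative hybrid search: filter candidates on scalar metadata first,
# then rank the survivors by vector distance.
import math

collection = [
    {"id": 1, "year": 2021, "vector": [1.0, 0.0]},
    {"id": 2, "year": 2023, "vector": [0.9, 0.1]},
    {"id": 3, "year": 2023, "vector": [0.0, 1.0]},
]

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def filtered_search(query, predicate, top_k=2):
    candidates = [row for row in collection if predicate(row)]  # scalar filter
    candidates.sort(key=lambda row: l2(query, row["vector"]))   # vector ranking
    return [row["id"] for row in candidates[:top_k]]

# the boolean expression `year >= 2023` expressed as a Python predicate
ids = filtered_search([1.0, 0.0], lambda row: row["year"] >= 2023)
```

In Milvus and Zilliz Cloud the predicate is written as a boolean expression string over scalar fields rather than a Python lambda.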

Optimizing User Experience: BIGO Leverages Milvus for Duplicate Video Removal

Date published
Dec. 14, 2023

Author(s)
Fendy Feng

Language
English

Word count
659

Hacker News points
None found.

What's this blog post about?

BIGO, the owner of short video platform Likee, has leveraged Milvus, an open-source vector database, to optimize its duplicate video removal process. With millions of daily uploads on Likee, the proliferation of duplicate videos posed a threat to content quality and user experience. Previously, BIGO used FAISS for similarity search but faced limitations in managing massive vectors. Milvus provided faster query responses and scalability, improving throughput and efficiency. The transformation involved converting new video frames into feature vectors and matching them against an extensive database of existing content using cutting-edge technologies like Kafka, deep learning models, and relational databases. BIGO plans to extend Milvus's capabilities for content moderation, restriction, and customized video services in the future.

Improving ChatGPT’s Ability to Understand Ambiguous Prompts

Date published
Dec. 12, 2023

Author(s)
Cheney Zhang

Language
English

Word count
1531

Hacker News points
None found.

What's this blog post about?

Prompt engineering techniques are being used to help large language models (LLMs) handle pronouns and other complex coreferences in retrieval augmented generation (RAG) systems. RAG combines the power of LLMs with a vector database acting as long-term memory, enhancing the accuracy of generated responses. One example is Akcio, an open source project that offers a robust question-answer system. However, implementing RAG systems introduces challenges, particularly in multi-turn conversations involving coreference resolution. Researchers are turning to LLMs like ChatGPT for coreference resolution tasks, but they occasionally produce direct answers instead of following the prompt instructions. A refined approach using few-shot prompts and Chain of Thought (CoT) methods has been developed to guide ChatGPT through coreference resolution, resulting in coherent responses.

Similarity Metrics for Vector Search

Date published
Dec. 11, 2023

Author(s)
Yujian Tang

Language
English

Word count
1490

Hacker News points
None found.

What's this blog post about?

This article discusses vector similarity search metrics and how they work. It covers three primary distance metrics: L2 or Euclidean distance, cosine similarity, and inner product. Additionally, it mentions other interesting vector similarity or distance metrics such as Hamming Distance and Jaccard Index. The article explains the concept of vectors in terms of orientation and magnitude, and how these metrics can be used to compare any data that can be vectorized. It also provides examples of when each metric should be used.
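The three primary metrics translate directly into code; a plain-Python sketch:

```python
# The three primary vector similarity metrics from the article.
import math

def l2_distance(a, b):
    """Euclidean distance: absolute displacement between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    """Dot product: rewards both alignment and magnitude."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Orientation only: inner product of the normalized vectors."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)
```

The definitions make the guidance concrete: cosine similarity ignores magnitude entirely (any positive scaling of a vector leaves it unchanged), so it suits normalized embeddings, while inner product and L2 are sensitive to vector length.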

Shaping Tomorrow: How Milvus Powers Shopee's Multimedia Ambition

Date published
Dec. 7, 2023

Author(s)
Fendy Feng

Language
English

Word count
588

Hacker News points
None found.

What's this blog post about?

In order to stay competitive in the e-commerce industry, Shopee ventured into short video services. However, they faced challenges handling vast amounts of unstructured data such as videos, images, audio, and text. Milvus emerged as a solution due to its ability to handle billions of vectors, scalability, and seamless integration with Shopee's internal ecosystem. The migration from Milvus 1.x to 2.x improved stability, scalability, and multi-replica capabilities, resulting in low-latency and high-availability retrieval services. With Milvus, Shopee has elevated its real-time search capabilities and streamlined offline data retrieval for copyright video matching and video deduplication processes.

Introducing Zilliz Cloud Pipelines: A One-Stop Service for Building AI-Powered Search

Date published
Dec. 6, 2023

Author(s)
Steffi Li

Language
English

Word count
983

Hacker News points
None found.

What's this blog post about?

Zilliz has introduced its new service, Zilliz Cloud Pipelines, which simplifies the process of creating and retrieving unstructured data as vectors. This solution is designed to empower developers in building high-quality semantic searches without requiring extensive customization or infrastructure adjustments. The platform consists of three specific pipelines: Ingestion, Search, and Deletion. Zilliz Cloud Pipelines currently focuses on semantic search in text documents but will be expanded to include image search, video copy detection, and multi-modal search capabilities in the future.

Create a Movie Recommendation Engine with Milvus and Python

Date published
Dec. 4, 2023

Author(s)
Gourav Bais

Language
English

Word count
1594

Hacker News points
None found.

What's this blog post about?

This article explains how to build a movie recommender system using the open source vector database, Milvus. The process involves setting up the environment, collecting and preprocessing data, connecting to Milvus, generating embeddings for movies, sending embeddings to Milvus, and finally recommending new movies using Milvus. By leveraging vector storage and similarity search, Milvus can help build an efficient and scalable movie recommendation system, enhancing user engagement and showcasing the role of advanced vector-based models in modern recommendation systems.

Building an Open Source Chatbot Using LangChain and Milvus in Under 5 Minutes

Date published
Nov. 29, 2023

Author(s)
Christy Bergman

Language
English

Word count
2068

Hacker News points
None found.

What's this blog post about?

This blog post demonstrates how to build an open source chatbot using LangChain and Milvus in under 5 minutes. The process involves creating a retrieval augmented generation (RAG) stack with LangChain, which allows for answering questions about custom data while reducing hallucinations. The text is grounded on factual, custom data such as product documentation to ensure accuracy. The source code for the live chatbot is available on GitHub. The blog post also explains how to use Milvus, a high-performance vector database optimized for fast storage, indexing, and searching of embeddings or vectors. OpenAI's language models like GPT series are used in this process. Overall, the RAG retrieval and question-answering chatbot on custom documents is shown to be efficient and cost-effective as it allows free calls to data almost all the time for retrieval, evaluation, and development iterations, with only a paid call to OpenAI once for the final chat generation step.

Transforming Ad Recommendations: SmartNews's Journey with Milvus

Date published
Nov. 29, 2023

Author(s)
Fendy Feng

Language
English

Word count
609

Hacker News points
None found.

What's this blog post about?

SmartNews, a leading news app, faced the challenge of optimizing ad recommendations for its highly engaged user base. The company turned to Milvus after researching solutions that could handle high-throughput and low-latency queries. Milvus's vector similarity search capabilities were instrumental in optimizing SmartNews's dynamic ad vector recall. Adopting Milvus led to more relevant ads, increasing click-through rates and driving up ad revenue. The company has upgraded its Milvus to 2.2.4 and is looking forward to leveraging new features for building even more real-time and reliable systems.

Kicking Off the Open Source Advent

Date published
Nov. 27, 2023

Author(s)
Yujian Tang

Language
English

Word count
558

Hacker News points
None found.

What's this blog post about?

The Open Source Advent is a project that aims to introduce participants to open-source software. For 25 days in December, one open-source project will be featured on social media along with a tutorial for quick start-up. Participants can earn points by starring the project's GitHub repo, creating repos using the project, and making posts tagging the company page. Extra points are awarded for writing a PR that gets merged or writing a blog about their experience. The top three scorers will receive swag packs from Zilliz and partners, as well as shoutouts on social media. Participants can join the Open Source Advent Discord Channel to submit their entries between December 26th, 2023, and January 2nd, 2024. Winners will be announced on January 8th, 2024.

Getting Started with a Milvus Connection

Date published
Nov. 24, 2023

Author(s)
Christy Bergman

Language
English

Word count
595

Hacker News points
None found.

What's this blog post about?

Milvus is an open-source vector database designed for building AI applications on unstructured data embeddings. It provides SDKs for several languages, including Python, Java, Go, and Node.js. The text outlines the steps to install and start a Milvus server, connect to it, create a collection with a schema and index, insert data into the collection, and query the collection. Additionally, it mentions using LangChain and Milvus for building chatbots in an upcoming blog post. Resources are provided to get started with Milvus and Zilliz.

How Milvus Powers Credal’s Mission for “Useful AI, Made Safe”

Date published
Nov. 22, 2023

Author(s)
Anya Sage

Language
English

Word count
1020

Hacker News points
None found.

What's this blog post about?

Credal, an enterprise AI platform, aims to make Generative AI integration safer and more accessible for businesses. Their solution focuses on seamlessly integrating data from various sources while ensuring privacy and security. At the core of their offering is Milvus, an open-source vector database that enables efficient search, filtering, and data curation capabilities. Credal's architecture prioritizes high-quality data interpretations and effective communication with GenAI models. The platform offers observability and governance tools for administrators and IT teams, including features like PII redaction, audit logging, and data access controls. Milvus's scalability and robustness make it a game-changer for Credal, enabling them to deliver "Useful AI, made safe" to businesses worldwide.

Zilliz Cloud Now Available on Microsoft Azure

Date published
Nov. 21, 2023

Author(s)
Steffi Li

Language
English

Word count
278

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud is now available on Microsoft Azure, expanding its presence across major cloud platforms including AWS and Google Cloud. This integration allows Azure-centric developers and enterprises to leverage the unique capabilities of Zilliz Cloud for vector database workloads. The move also signifies seamless access to Azure's cutting-edge AI services such as Semantic Kernel. Future enhancements include expansion into new Azure regions, availability on the Azure Marketplace, and continuous integration efforts to ensure data security and optimize application performance.

Milvus 2.3 Beta and Enterprise Upgrades on Zilliz Cloud

Date published
Nov. 21, 2023

Author(s)
Steffi Li

Language
English

Word count
460

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud has released the beta version of Milvus 2.3, introducing new features to enhance data management and querying processes for developers. The update includes Cosine similarity integration, Range Search feature, Upsert functionality, raw vector returns, JSON_CONTAINS filter, entity count, and more. Additionally, Zilliz Cloud has introduced enhanced enterprise features such as improved Role-Based Access Control (RBAC), expanded geographical options with the general availability of AWS EU Frankfurt region, and Self-Service Account and Organization Deletion feature. Furthermore, Zilliz Cloud is now available on Microsoft Azure in the azure-east-us region, completing its availability across major cloud platforms including AWS and Google Cloud. The company invites feedback from developers to shape the future of their vector database technology.

Enhancing Data Flow Efficiency: Zilliz Introduces Upsert, Kafka Connector, and Airbyte Integration

Date published
Nov. 20, 2023

Author(s)
Steffi Li

Language
English

Word count
1348

Hacker News points
None found.

What's this blog post about?

Zilliz has introduced Upsert, Kafka Connector, and Airbyte integration to enhance data flow efficiency in its vector database. Upsert simplifies the update process by inserting or updating data based on atomicity. The Kafka Connector enables real-time streaming of vector data from Confluent/Kafka into Milvus or Zilliz vector databases, enhancing capabilities for Generative AI and e-commerce recommendations. Airbyte Integration streamlines data transfer and processing in LLMs and vector databases, improving search functionality. These enhancements aim to improve search performance and streamline the entire data pipeline, making it more efficient and developer-friendly.
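Upsert semantics can be illustrated with a toy primary-key store; this is a conceptual sketch of insert-or-replace behavior, not the Zilliz API:

```python
# Sketch of upsert: one call that inserts a new row or replaces an
# existing one, keyed by primary key. A toy dict model for illustration.

store = {}

def upsert(rows):
    for row in rows:
        store[row["id"]] = row  # insert if absent, overwrite if present

upsert([{"id": 1, "text": "v1"}])
upsert([{"id": 1, "text": "v2"}, {"id": 2, "text": "new"}])
```

The value of the operation is atomicity from the caller's perspective: application code no longer needs a read-then-branch between insert and update paths.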

What’s New in Milvus 2.3.2 & 2.3.3

Date published
Nov. 20, 2023

Author(s)
Steffi Li

Language
English

Word count
544

Hacker News points
None found.

What's this blog post about?

Milvus, the vector database system, has released versions 2.3.2 and 2.3.3 with significant improvements aimed at enhancing performance and user experience. The latest updates include support for array data types, complex delete expressions, integration of TiKV for metadata storage, FP16 vector type, and vector index MMAP. Other enhancements include a rolling upgrade experience, performance optimization, upgraded CDC (Change Data Capture), bulk insert of binlog data with partition keys, and the return of binary metric types such as SUBSTRUCTURE and SUPERSTRUCTURE. The developer community's contributions have been instrumental in shaping these updates, and feedback is welcome for future enhancements.

How LangChain Implements Self Querying

Date published
Nov. 16, 2023

Author(s)
Yujian Tang

Language
English

Word count
890

Hacker News points
None found.

What's this blog post about?

LangChain, an open-source library for LLM orchestration, recently added the "Self Query" retriever. This feature allows users to query vector databases like Milvus using LangChain. The implementation of this self-query retriever is covered in lines 189 to 233 of the base.py file in the self-query folder. The only class method for the self-query base class is from_llm, which has eight specified parameters and one allowing keyword arguments (kwargs). Four required parameters are llm, vectorstore, document_contents, and metadata_field_info. Other optional parameters include structured_query_translator, chain_kwargs, enable_limit, and use_original_query. The self-query retriever implementation involves parsing the self-query parameters, creating an LLM chain, and returning a self-query retriever. This feature enables users to build simple retrieval augmented generation (RAG) applications using an LLM, vector database, and prompts to interface with the LLM.
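The translator step can be illustrated with a minimal sketch that renders structured conditions as a Milvus-style boolean expression. The tuple format below is an assumption for demonstration, not LangChain's actual structured-query classes:

```python
# Hypothetical sketch of a structured-query translator: turn
# (comparator, attribute, value) conditions into a filter expression
# a vector database could evaluate on scalar fields.

OPS = {"eq": "==", "gt": ">", "gte": ">=", "lt": "<", "lte": "<="}

def to_expr(conditions):
    parts = []
    for comparator, attribute, value in conditions:
        rendered = f'"{value}"' if isinstance(value, str) else value
        parts.append(f"{attribute} {OPS[comparator]} {rendered}")
    return " and ".join(parts)

expr = to_expr([("gte", "year", 2023), ("eq", "genre", "sci-fi")])
```

In the real retriever, the LLM produces the structured query from the user's natural-language question, and the translator emits the store-specific filter alongside the semantic query text.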

Join us at AWS re:Invent 2023

Date published
Nov. 15, 2023

Author(s)
Emily Kurze

Language
English

Word count
400

Hacker News points
None found.

What's this blog post about?

AWS re:Invent, one of the largest global cloud computing events, will take place in Las Vegas from November 27 to December 1, 2023. Zilliz invites attendees to visit their booth (#1339) and meet the team behind Milvus, a vector database solution. The event offers opportunities for innovative solution demos, problem-solving expertise, collaboration, community engagement, and swag. Attendees can also book private demos or meetings with Zilliz's experts to discuss specific projects or use cases. Additionally, users are invited to join the team for dinner to share project updates and feedback. Resources on vector databases are recommended for those interested in learning more before the event.

Grounding Our Chat Towards Data Science Results

Date published
Nov. 15, 2023

Author(s)
Yujian Tang

Language
English

Word count
940

Hacker News points
None found.

What's this blog post about?

In this tutorial, we learn how to ground our Retrieval Augmented Generation (RAG) results using LlamaIndex and citations. We start by setting up the necessary libraries and environment variables for our chatbot. Next, we define the parameters of our RAG chatbot, including the embedding model, vector database, and data abstractions. Finally, we implement citations via LlamaIndex's CitationQueryEngine module to ensure grounded results. This tutorial uses Zilliz Cloud, a fully managed and optimized version of Milvus, for persisting data across multiple projects.

Do We Still Need Vector Databases for RAG with OpenAI's Releasing of Its Built-In Retrieval?

Date published
Nov. 13, 2023

Author(s)
Jael Gu

Language
English

Word count
1281

Hacker News points
None found.

What's this blog post about?

OpenAI's built-in Retrieval feature in its Assistants API has some limitations, such as scalability constraints and lack of customization. These issues can be addressed by integrating a custom retriever powered by a vector database like Milvus or Zilliz Cloud. This approach allows developers to optimize and configure the retrieval process according to their specific needs, improving overall efficiency.

Unlock Advanced Recommendation Engines with Milvus' New Range Search

Date published
Nov. 9, 2023

Author(s)
Leon Cai

Language
English

Word count
1198

Hacker News points
None found.

What's this blog post about?

Milvus, an open-source vector database, has introduced a new feature called Range Search to enhance its similarity search capabilities. This feature allows developers to specify a distance range for relevant vectors in their searches, addressing limitations of traditional KNN searches in recommendation systems where results can be either too similar or too diverse. The technical architecture and usage guide for Range Search are outlined, along with details on when to use it over Top-K search. The feature is not limited to recommendation engines but has broader applications in areas like content matching, anomaly detection, and NLP search tasks. It is now available for public preview on Zilliz Cloud.
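The difference from top-k can be sketched as follows. The in-memory vectors and distance band are illustrative assumptions; the band roughly corresponds to the radius and range-filter search parameters described for the feature:

```python
# Sketch of range search vs. top-k: instead of the k nearest neighbors,
# return every vector whose distance falls inside a [low, high) band.
import math

vectors = {1: [0.0, 0.0], 2: [0.3, 0.4], 3: [3.0, 4.0]}

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def range_search(query, low, high):
    """Return ids whose L2 distance d satisfies low <= d < high."""
    return sorted(pk for pk, v in vectors.items() if low <= l2(query, v) < high)

# exclude near-duplicates (d < 0.1) and far-off items (d >= 1.0)
hits = range_search([0.0, 0.0], 0.1, 1.0)
```

This is exactly the recommendation-system fix the post describes: the lower bound filters out results that are too similar, and the upper bound filters out results that are too diverse.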

Zilliz at HackNC 2023

Date published
Nov. 8, 2023

Author(s)
Yujian Tang

Language
English

Word count
203

Hacker News points
None found.

What's this blog post about?

HackNC 2023, an annual hackathon hosted by the University of North Carolina at Chapel Hill, saw over 1,300 registrations and 650 participating hackers. Zilliz, the company behind the Milvus vector database, was represented at the event with a workshop and a keynote speech. The winning project, "wellSpent," is an expense-tracking app that provides users with a dynamic pie chart of their expenses, transaction lists, and various financial planning tools. Congratulations to the team behind wellSpent for their victory in the Best Use of Zilliz category.

Zilliz at CalHacks 2023

Date published
Nov. 3, 2023

Author(s)
Christy Bergman

Language
English

Word count
1137

Hacker News points
None found.

What's this blog post about?

CalHacks, a hackathon held in San Francisco from October 27-29, drew over 1,000 students from around the world. The event offered $137,650 in prize money and sponsor awards, and several innovative projects were recognized for their use of Milvus, an open-source vector database. The winning project, Second Search, used Milvus to search lecture videos by embedding video caption text into vectors and returning relevant sections based on user queries. Other notable projects included Jarvis, which described visual scenes to visually impaired users, an AI 911 agent that assessed emergency situations, and Mental Maps, a chatbot for mental well-being tracking.

Announcing Confluent's Kafka Connector for Milvus and Zilliz Cloud: Unlocking the Power of Real-Time AI

Date published
Nov. 3, 2023

Author(s)
Fendy Feng

Language
English

Word count
966

Hacker News points
None found.

What's this blog post about?

Confluent, a data streaming platform, has announced the availability of its Kafka Connector for open-source Milvus and Zilliz Cloud. This collaboration enables seamless real-time streaming of vector data from Confluent into Milvus or Zilliz Cloud, significantly enhancing real-time Generative AI powered by large language models (LLMs) like OpenAI's GPT-4. With the integration, unstructured data converted into vector streams in Confluent can flow continuously into Milvus/Zilliz, empowering developers to build applications for use cases such as real-time semantic search, image/video/audio similarity search, and retrieval augmented generation. The integration opens up possibilities across many sectors, from enhancing Generative AI with a real-time knowledge base to optimizing personalized recommendations for e-commerce platforms.

Alexandr Guzhva: Why I Joined Zilliz

Date published
Nov. 2, 2023

Author(s)
Alexandr Guzhva

Language
English

Word count
404

Hacker News points
None found.

What's this blog post about?

Alexandr Guzhva, an expert in performance optimization, joined Zilliz to help it outcompete rivals and to put his expertise to full use. With over 15 years of experience in finance and two years at Meta, he has contributed significantly to the FAISS library and has written more than 2 million lines of code. Zilliz's focus on advanced similarity search methods and its integration with NVIDIA RAFT attracted him to the company. His goal is to improve Zilliz products and contribute to Milvus OSS, potentially applying his knowledge of ANNS to time-series prediction in the future.

How Troop Uses Milvus Vector Database to Unlock the Collective Power of Retail Investors

Date published
Nov. 1, 2023

Author(s)
Anya Sage

Language
English

Word count
722

Hacker News points
None found.

What's this blog post about?

Troop, a tech company revolutionizing shareholder activism and engagement, leverages machine learning and AI technologies to enable investors to participate in corporate governance. Using the Milvus vector database, Troop built a solution that empowers individuals for collective financial activism in major corporations. The integration of Milvus enabled scalability, efficient handling of massive datasets, separation of storage and compute, rapid scaling of nodes, data partitioning, and improved semantic search capabilities. This infrastructure supports retrieval augmented generation (RAG) to process large volumes of unstructured data and build intelligent shareholder voting recommendation engines.

Evaluations for Retrieval Augmented Generation: TruLens + Milvus

Date published
Oct. 31, 2023

Author(s)
Josh Reini

Language
English

Word count
2154

Hacker News points
None found.

What's this blog post about?

This article discusses the use of vector search technologies, such as Milvus and Zilliz Cloud, in building retrieval augmented generation (RAG) applications. RAGs are question-answering applications that allow large language models (LLMs) to access a verified knowledge base for context. The article highlights various configuration choices that can affect the quality of retrieval, including data selection, embedding model, index type, amount of context retrieved, and chunk size. It also introduces TruLens, an open-source library for evaluating and tracking the performance of LLM applications like RAGs. By using TruLens to evaluate different configurations and parameters, developers can identify failure modes and find the most performant combination for their specific use case.
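
The parameter sweep the article describes, varying configuration knobs such as chunk size and amount of context retrieved and keeping the combination that scores best, can be sketched generically. The `evaluate` function below is a made-up stand-in for the feedback metrics (e.g., context relevance, groundedness) that TruLens would actually compute:

```python
from itertools import product

def evaluate(chunk_size, top_k):
    # Hypothetical scoring function for illustration only; a real run would
    # build the RAG app with these parameters and score it with TruLens feedbacks.
    return 1.0 / (abs(chunk_size - 512) + 1) + 0.1 * top_k

# Grid of candidate configurations to compare.
configs = list(product([128, 512, 1024], [1, 3, 5]))

# Keep the most performant combination for this (toy) metric.
best = max(configs, key=lambda c: evaluate(*c))
```

The value of the approach is that failure modes show up as systematically low scores for particular configurations, rather than being discovered anecdotally.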

Retrieval Augmented Generation on Notion Docs via LangChain

Date published
Oct. 30, 2023

Author(s)
Yujian Tang

Language
English

Word count
1042

Hacker News points
None found.

What's this blog post about?

This tutorial demonstrates how to build a retrieval augmented generation (RAG) type app using LangChain and Milvus. The process involves reviewing LangChain self-querying, working with Notion docs in LangChain, ingesting Notion documents, storing them in a vector database, and querying the documents. The tutorial uses LangChain for operational framework and Milvus as the similarity engine. It covers how to load and parse a Notion document into sections to query in a basic RAG architecture, with future tutorials exploring different chunking strategies, embeddings, splitting strategies, and evaluation methods.

Exploring LLM-Driven Agents in the Age of AI

Date published
Oct. 27, 2023

Author(s)
David Wang

Language
English

Word count
872

Hacker News points
None found.

What's this blog post about?

Large Language Models (LLMs) are driving innovation in AI, with LLM-driven Agents at the forefront. These agents combine LLMs with planning, memory, and tool modules to make decisions and take actions autonomously. The AutoGPT project demonstrates their potential by generating tasks, prioritizing them, and executing them using external resources. However, challenges such as getting stuck in loops and prompt length constraints need to be addressed. Ongoing research is focused on improving LLMs' reasoning abilities, enhancing agent frameworks, and developing specialized agent applications for various scenarios.

Experimenting with Different Chunking Strategies via LangChain

Date published
Oct. 24, 2023

Author(s)
Yujian Tang

Language
English

Word count
1499

Hacker News points
None found.

What's this blog post about?

This tutorial explores the impact of different chunking strategies on retrieval augmented generation applications using LangChain. Chunking is the process of dividing text into smaller parts, and the choice of strategy can significantly affect the output quality. The code for this post can be found in a GitHub repo on LLM experimentation. The tutorial covers setting up the environment, importing necessary tools, and creating a function that takes parameters for document ingestion and chunking experimentation. It then tests five different chunking strategies with varying lengths and overlaps. The results show that finding an ideal chunking size is challenging and depends on the desired output format. Future tutorials may cover testing overlaps and using other libraries to refine chunking strategies further.
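
The length/overlap trade-off the tutorial experiments with can be shown with a minimal character-level chunker. This is a simplified sketch, not LangChain's splitter, which also respects separators such as newlines and sentence boundaries:

```python
def chunk_text(text, chunk_size, overlap):
    """Split text into fixed-size character chunks with a given overlap."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "abcdefghij" * 10  # 100 characters of dummy text

# Smaller chunks give more, finer-grained pieces; larger chunks give fewer,
# broader ones. The overlap repeats trailing context at each boundary.
small = chunk_text(doc, chunk_size=20, overlap=5)
large = chunk_text(doc, chunk_size=50, overlap=10)
```

Each chunk is later embedded and stored separately, so this choice directly controls how much context one retrieved vector carries.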

Jiang Chen: Why I Joined Zilliz

Date published
Oct. 16, 2023

Author(s)
Jiang Chen

Language
English

Word count
835

Hacker News points
None found.

What's this blog post about?

Over the past decade, the author has specialized in various aspects of data infrastructure, including access control, data privacy, NoSQL databases, and web-scale data indexing. In recent years, big data emerged as a major wave of innovation, with technologies like MapReduce, distributed computing, and structured data storage leading the way. The AI era, however, requires a different technology stack, especially with the growing popularity of Large Language Models. Embedding models and vector stores now take center stage, and they are precisely Zilliz's focus. The author's experience includes working on search indexing at Google, where they built ultra-flexible infrastructure to understand billions of images and videos on the public web. They believe AI-native infrastructure holds the key to the future of business and are enthusiastic about democratizing this highly complex infrastructure for resource-limited startups. The author joined Zilliz for its ambitious mission, exceptional team, and challenging work environment. At Zilliz, they build a suite of tooling and services that ease information retrieval on unstructured data, including Towhee, Akcio, and a vector database for efficient storage and search of vector embeddings.

Milvus Introduced MMap for Redefined Data Management and Increased Storage Capability

Date published
Oct. 13, 2023

Author(s)
Yang Cen

Language
English

Word count
661

Hacker News points
None found.

What's this blog post about?

Milvus introduces the MMap feature, which redefines how large data volumes are managed and promises cost efficiency without compromising functionality. MMap is memory-mapped file technology that lets Milvus map large files directly into its address space, treating them as contiguous blocks of memory. This eliminates explicit read and write operations and fundamentally changes how Milvus manages data. The feature benefits vector databases by allowing more data to be held per node and by making random access to large files more efficient, though it may cause performance fluctuations as data volume grows. Enabling MMap in Milvus is straightforward, requiring only a modification of the milvus.yaml file. Future updates will refine memory usage and provide more granular control over the feature.
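
The post says enabling MMap only takes an edit to milvus.yaml. The exact key layout varies between Milvus releases, so the fragment below is a hypothetical illustration of the kind of change involved, not a copy-paste configuration; check the configuration reference for your version.

```yaml
# milvus.yaml (illustrative; key names are assumptions, verify against your release)
queryNode:
  mmap:
    mmapEnabled: true                  # map segment files instead of loading them fully into RAM
    mmapDirPath: /var/lib/milvus/mmap  # directory where memory-mapped files are kept
```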

How to Choose a Vector Database: Qdrant Cloud vs. Zilliz Cloud

Date published
Oct. 13, 2023

Author(s)
Steffi Li

Language
English

Word count
1239

Hacker News points
None found.

What's this blog post about?

This blog compares two vector databases, Qdrant and Zilliz/Milvus. While both are purpose-built for vector data, they serve different market needs. Qdrant is designed for developers who prioritize modern technology and minimal infrastructure maintenance, while Zilliz/Milvus is engineered for extreme scale, high performance, and low latency. The benchmark results show that Zilliz Cloud outperforms Qdrant Cloud in terms of queries per second (QPS), queries per dollar (QP$), and latency. Furthermore, the feature comparison highlights differences in scalability, functionality, and purpose-built features between the two vector databases.

Chat with Towards Data Science Using LlamaIndex

Date published
Oct. 12, 2023

Author(s)
Yujian Tang

Language
English

Word count
1338

Hacker News points
None found.

What's this blog post about?

This tutorial demonstrates how to use LlamaIndex, an open-source data retrieval framework, to improve the performance of a chatbot built with Zilliz Cloud. The primary challenge addressed in this project is integrating an existing Milvus collection into LlamaIndex while handling differences in embedding vector dimensions and metadata field usage. By using LlamaIndex as a query engine, the chatbot's retrieval capabilities are significantly enhanced, providing more accurate and relevant responses to user queries.

Optimizing Data Communication: Milvus Embraces NATS Messaging

Date published
Oct. 11, 2023

Author(s)
Zhen Ye

Language
English

Word count
1055

Hacker News points
None found.

What's this blog post about?

Milvus, an open-source vector database, has introduced NATS messaging integration in its latest version 2.3. This feature enhances the handling of substantial data volumes and complex scenarios compared to its predecessor, RocksMQ. NATS is a distributed system connectivity technology implemented in Go that supports various communication modes like Request-Reply and Publish-Subscribe across systems. Milvus 2.3 offers a new control option, mq.type, which allows users to specify the type of MQ they want to use. To enable NATS, set mq.type=natsmq. The migration from RocksMQ to NATS is seamless and involves steps like stopping write operations, flushing data, modifying configurations, and verifying the migration through Milvus logs. Performance testing results show that NATS outperforms RocksMQ for larger data packets (> 64kb), offering much faster response times. In extensive testing with a 100 million vectors dataset, NATS showcased lower vector search and query latency compared to RocksMQ.
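
The post states that the MQ backend is selected with the new `mq.type` option and that NATS is enabled by setting it to `natsmq`. As a YAML fragment (the nesting shown is an assumption; only the `mq.type=natsmq` setting comes from the post), that looks like:

```yaml
# milvus.yaml — switch the standalone message queue from RocksMQ to NATS
mq:
  type: natsmq   # the post names natsmq as the value that enables NATS
```

Remember that switching an existing deployment also requires the migration steps described above: stop writes, flush, change the configuration, then verify through the Milvus logs.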

Use Milvus and Airbyte for Similarity Search on All Your Data

Date published
Oct. 10, 2023

Author(s)
Joe Reuter

Language
English

Word count
1909

Hacker News points
None found.

What's this blog post about?

Milvus is an open-source vector database used to store, index, and efficiently search high-dimensional vector data. It's particularly useful in applications involving similarity searches across unstructured data, such as Generative Chat responses, product recommendations, and more. By using Airbyte, it's straightforward to transfer data from many different sources into Milvus, calculating vector embeddings of texts along the way. The power of embeddings is to be able to search for relevant pieces of information, even if similar concepts are phrased differently. This article demonstrates how to use Zilliz Cloud as a vector store, Airbyte to extract and load the data, OpenAI embedding API to calculate embeddings, and Streamlit to build a smart submission form showing relevant data.

Christy Bergman: Why I Joined Zilliz

Date published
Oct. 6, 2023

Author(s)
Christy Bergman

Language
English

Word count
1432

Hacker News points
None found.

What's this blog post about?

Christy Bergman, a new Developer Advocate at Zilliz, shares her journey of discovering and choosing Milvus, the world's most popular open-source vector database. She explains how she explored various vector databases, including FAISS, Qdrant, Chroma, Weaviate, Pinecone, and finally settled on Milvus due to its user-friendly experience, speed in loading vectors and querying, and additional features. Christy also discusses her role at Zilliz and her plans for organizing events, writing blogs, improving documentation, and helping developers learn how to use Milvus.

Efficient Vector Similarity Search in Recommender Workflows Using Milvus with NVIDIA Merlin

Date published
Oct. 4, 2023

Author(s)
Burcin Bozkaya

Language
English

Word count
3087

Hacker News points
None found.

What's this blog post about?

This blog post discusses the integration of NVIDIA Merlin, an open-source framework developed for training end-to-end models to make recommendations at any scale, with Milvus, an efficient vector database created by Zilliz. The integration is beneficial in the item retrieval stage with a highly efficient top-k vector embedding search. The post also highlights how Milvus complements Merlin in recommender systems workflows and provides benchmark results showing impressive speedups with GPU-accelerated Milvus that uses NVIDIA RAFT with the vector embeddings generated by Merlin Models.

How to Get the Right Vector Embeddings

Date published
Oct. 3, 2023

Author(s)
Yujian Tang

Language
English

Word count
1846

Hacker News points
None found.

What's this blog post about?

Vector embeddings are crucial when working with semantic similarity. They represent input data as a series of numbers, allowing mathematical operations to be performed on the data instead of relying on qualitative comparisons. The appropriate vector embeddings must be obtained before use, as using an image model for text or vice versa may result in poor results. Vector embeddings are influential for many tasks, particularly semantic search. Vector embeddings are created by removing the last layer and taking the output from the second-to-last layer of a deep learning model (embedding models or a deep neural network). The dimensionality of a vector embedding is equivalent to the size of the second-to-last layer in the model. Common vector dimensionalities include 384, 768, 1,536, and 2,048. A single dimension in a vector embedding does not mean anything; however, when all dimensions are taken together, they provide the semantic meaning of the input data. The dimensions represent high-level, abstract attributes that depend on the training data and the model itself. Different models generate different embeddings based on their training data and architecture. To obtain proper vector embeddings, identify the type of data you wish to embed (images, text, audio, videos, or multimodal data) and use appropriate open-source embedding models from Hugging Face or PyTorch. For example, ResNet-50 is a popular image recognition model, while MiniLM-L6-v2 and MPNet-Base-V2 are text embedding models. Vector databases like Milvus and Zilliz Cloud are used to store, index, and search across massive datasets of unstructured data through vector embeddings. They employ the Approximate Nearest Neighbor (ANN) algorithm to calculate spatial distances between query vectors and stored vectors in the database.
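
The "output of the second-to-last layer" idea can be made concrete with a toy network: run the input through every layer except the classification head and use the penultimate activations as the embedding. The two-layer network and weights below are invented purely to show where the embedding comes from and why its dimensionality equals the size of that layer:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy network: 4-dim input -> 3-dim hidden layer -> 2-class output head.
W1 = rng.normal(size=(4, 3))   # weights into the second-to-last layer
W2 = rng.normal(size=(3, 2))   # weights into the final (classification) layer

def forward(x):
    hidden = np.tanh(x @ W1)   # second-to-last layer activations = the embedding
    logits = hidden @ W2       # final layer output, discarded when embedding
    return hidden, logits

x = np.array([1.0, 0.5, -0.2, 0.3])
embedding, logits = forward(x)
```

Here the embedding has 3 dimensions because the penultimate layer has 3 units; real embedding models work the same way, just with hundreds or thousands of units (e.g., the 384, 768, 1,536, and 2,048 dimensionalities mentioned above).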

How to Migrate Your Data to Milvus Seamlessly: A Comprehensive Guide

Date published
Oct. 2, 2023

Author(s)
Wenhui Zhang

Language
English

Word count
1741

Hacker News points
None found.

What's this blog post about?

Milvus is an open-source vector database designed for similarity search, offering robust storage, processing, and retrieval capabilities for billions of vector data with minimal latency. As of September 2023, it has garnered almost 23,000 stars on GitHub and is used by tens of thousands of users across various industries. The latest release introduces new features such as GPU support and MMap storage for increased performance and capacity. To facilitate the migration process from older versions of Milvus (1.x), FAISS, and Elasticsearch 7.0 and beyond to the latest Milvus 2.x versions, a data migration tool called Milvus Migration has been developed. This powerful tool is written in Go and supports multiple interaction modes, including command-line interface (CLI) using the Cobra framework, Restful API with built-in Swagger UI, and integration as a Go module in other tools. Milvus Migration simplifies the migration process through its robust feature set, which includes support for various data sources such as Milvus 1.x to Milvus 2.x, Elasticsearch 7.0 and beyond to Milvus 2.x, and FAISS to Milvus 2.x. It also supports multiple file formats like local files, Amazon S3, Object Storage Service (OSS), Google Cloud Platform (GCP), and flexible Elasticsearch integration for migrating dense_vector type vectors from Elasticsearch as well as other field types such as long, integer, short, boolean, keyword, text, and double. The migration process involves configuring a migration.yaml file with details about the data source, target, and other relevant settings. Users can then execute the migration job using either command-line or Restful API methods. Once completed, users can view the total number of successful rows migrated and perform other collection-related operations using Attu, an all-in-one vector database administration tool. 
Future plans for Milvus Migration include supporting migration from more data sources like Redis and MongoDB, adding resumable migration capabilities, simplifying migration commands by merging the dump and load processes into one, and expanding support to other mainstream data sources.
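
The migration job is driven by a migration.yaml describing the source and target. The field names below are invented for illustration of the shape such a file takes; the real schema should be taken from the Milvus Migration tool's documentation.

```yaml
# migration.yaml (hypothetical field names; consult the Milvus Migration docs)
dumper:
  worker:
    workMode: elasticsearch        # source type, e.g. milvus1x, elasticsearch, or faiss
source:
  es:
    url: http://localhost:9200
    index: my_index
target:
  milvus2x:
    endpoint: localhost:19530
    collection: my_collection
```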

Getting Started with GPU-Powered Milvus: Unlocking 10x Higher Performance

Date published
Sept. 29, 2023

Author(s)
Jaken Ma

Language
English

Word count
803

Hacker News points
None found.

What's this blog post about?

Milvus 2.3 introduces GPU support, unlocking a 10x increase in throughput and significant reductions in latency. This strategic innovation is aimed at enhancing vector searching capabilities, particularly with the rise of Large Language Models (LLMs) like GPT-3. The integration of Milvus and NVIDIA GPUs allows for efficient searching through massive datasets and expands the AI landscape. To get started with the Milvus GPU version, users need to install CUDA drivers, configure Milvus GPU settings, build Milvus locally, and run it in standalone mode or using a provided docker-compose file.

Using LangChain to Self-Query a Vector Database

Date published
Sept. 28, 2023

Author(s)
Yujian Tang

Language
English

Word count
1206

Hacker News points
None found.

What's this blog post about?

LangChain, known for orchestrating interactions with large language models (LLMs), has introduced self-querying capabilities. This tutorial demonstrates how to perform self-querying on Milvus, the world's most popular vector database. The process involves setting up LangChain and Milvus, obtaining necessary data, informing the model about expected data format, and finally, performing self-querying. Self-querying allows an LLM to query itself using the underlying vector store, creating a simple retrieval augmented generation (RAG) app in the CVP framework.

Zilliz x Galileo: The Power of Vector Embeddings

Date published
Sept. 27, 2023

Author(s)
Yujian Tang

Language
English

Word count
1119

Hacker News points
None found.

What's this blog post about?

Unstructured data, which makes up 80% of global data, is becoming increasingly prevalent. Vector embeddings are numerical representations used to work with unstructured data such as text, images, audio, and videos. They can be extracted from trained machine-learning models and have high dimensionality to store complex data. Vector embeddings are the de facto way to work with unstructured data, allowing for comparisons between data points. When generating embedding vectors, factors like vector size, training data quality, and quantity should be considered. Vector embeddings can be used to debug training data by detecting errors through clustering, finding samples not present in the training data, identifying hallucinations, and fixing errors in retrieval augmented generation (RAG). Additionally, they can be indexed, stored, and queried using vector databases like Milvus or Zilliz Cloud. The power of vector embeddings is evident from their wide range of use cases, making them a valuable tool for working with unstructured data in machine learning applications.

Zilliz Makes Real-Time AI a Reality with Confluent

Date published
Sept. 26, 2023

Author(s)
Steffi Li

Language
English

Word count
976

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud has integrated with Confluent Cloud, allowing users of both platforms to access real-time data streams across their entire business for building AI applications. The integration enables the ingestion, parsing, and processing of real-time data into Zilliz Cloud using Confluent's Kafka producer and consumer APIs. This collaboration opens new avenues for leveraging Generative Artificial Intelligence (GenAI) in real-time scenarios, such as personalized responses and content generation platforms. The integration also enhances traditional AI use cases like recommender systems and anomaly detection. With easy access to data streams from across their entire business, Zilliz users can now create a real-time knowledge base, build governed, secured, and trusted AI applications, and experiment, scale, and innovate faster.

Exploring the Marvels of Knowhere 2.0

Date published
Sept. 25, 2023

Author(s)
Patrick Xu

Language
English

Word count
803

Hacker News points
None found.

What's this blog post about?

Milvus 2.3 has been released with significant updates, including the transformative upgrade of Knowhere 2.0. Key features of Knowhere 2.0 include support for GPU indexes, Cosine similarity, ScaNN index, ARM architecture, range search, optimized filter queries, code structure and compilation enhancements, MMap support, and retrieval of original vectors. These improvements aim to elevate Milvus's performance and user experience in vector databases.

How to Choose A Vector Database: Weaviate Cloud vs. Zilliz Cloud

Date published
Sept. 21, 2023

Author(s)
Steffi Li

Language
English

Word count
1234

Hacker News points
None found.

What's this blog post about?

This blog compares two vector databases, Weaviate and Zilliz/Milvus. While both are designed to manage vector data, they serve different needs. Weaviate is a strong choice for developers seeking quick and straightforward implementation, while Zilliz/Milvus excels in handling large-scale, high-performance, low-latency applications. The benchmark results show that Zilliz Cloud outperforms Weaviate Cloud in terms of queries per second (QPS), queries per dollar (QP$), and latency. Furthermore, a feature comparison reveals differences in scalability, functionality, and purpose-built features between the two vector databases.

Chat Towards Data Science: Building a Chatbot with Zilliz Cloud

Date published
Sept. 20, 2023

Author(s)
Yujian Tang

Language
English

Word count
2347

Hacker News points
None found.

What's this blog post about?

In the first part of the Chat Towards Data Science blog series, we guide you through building a chatbot using your dataset as the knowledge backbone. We employ web scraping techniques to collect data for our knowledge base and store it in Zilliz Cloud, a fully managed vector database service built on Milvus. The tutorial covers creating a chatbot for the Towards Data Science publication, demonstrating how to prompt the user for a query, vectorize the query, and query the vector database. However, we discovered that while the results are semantically similar, they are not exactly what we desire. In the next part of this blog series, we will explore using LlamaIndex to route queries and see if we can achieve better results.

Getting Started with Pgvector: A Guide for Developers Exploring Vector Databases

Date published
Sept. 15, 2023

Author(s)
Siddhant Varma

Language
English

Word count
2072

Hacker News points
None found.

What's this blog post about?

This guide explores the use of Pgvector, an extension of PostgreSQL that allows developers to store and query vector data. It covers setting up Pgvector, integrating it with PostgreSQL, using it for similarity searches, understanding its indexes and limitations, and comparing it with dedicated vector databases like Milvus and Zilliz. The article also discusses the advantages of using dedicated vector databases over traditional relational databases and provides benchmarking results to help developers choose the best solution for their projects.

Comparing Llama 2 Chat and ChatGPT: How They Perform in Question Answering

Date published
Sept. 13, 2023

Author(s)
Towhee team

Language
English

Word count
2113

Hacker News points
None found.

What's this blog post about?

Meta AI has released its open-source large language model (LLM), Llama 2, which is available for free use in commercial applications. It comes in three sizes and supports context lengths of up to 4096 tokens. Llama Chat, the fine-tuned model of Llama 2, has been trained on over 1 million human annotations and is specifically tailored for conversational AI scenarios. The performance of Llama 2 in answering questions was compared with that of ChatGPT, showing that both models excel at answering questions based on real-world knowledge. However, Llama 2 faces challenges maintaining answer quality when confronted with complex text formatting. Llama 2 stands out by not requiring high-end GPUs and can operate smoothly on desktop-level GPUs, especially after undergoing low-bit quantization.

An Engineering Perspective: Why Milvus is a Compelling Option for Your Apps?

Date published
Sept. 10, 2023

Author(s)
Owen jiao

Language
English

Word count
688

Hacker News points
None found.

What's this blog post about?

Milvus 2.3, the latest version of the pioneering vector database, offers numerous enhancements and new features that make it an excellent choice for users looking to build applications ranging from recommendation systems and chatbots to artificial general intelligence (AGI) and retrieval augmented generation (RAG). The updated version balances performance, cost, and scalability while providing multiple deployment options. It also empowers developers with simplicity by enhancing its API and supporting data integration with other products. Furthermore, Milvus 2.3 ensures stability and second-level availability through improved system reliability features. Future updates will introduce additional cutting-edge features to enhance the user experience further.

How to Choose A Vector Database: Elastic Cloud vs. Zilliz Cloud

Date published
Sept. 5, 2023

Author(s)
Chris Churilo

Language
English

Word count
1221

Hacker News points
None found.

What's this blog post about?

This blog compares Elastic Cloud and Zilliz Cloud, two vector database cloud services. It delves into benchmarks to offer a performance perspective and performs an in-depth feature analysis of both platforms. The results show that Zilliz outperforms Elastic Cloud in terms of QPS, queries per dollar (QP$), and latency. Additionally, the blog highlights the features of each platform, such as scalability, multi-tenancy, data isolation, API support, and user interface/administrative console. It also provides a migration tutorial for moving from Elasticsearch to Zilliz Cloud.

What’s New in Milvus 2.3

Date published
Aug. 30, 2023

Author(s)
Steffi Li

Language
English

Word count
364

Hacker News points
None found.

What's this blog post about?

Milvus 2.3.0 has been released, featuring numerous enhancements and improvements. Key features include computational upgrades with GPU & ARM64 support, search & indexing enhancements such as range search and ScaNN index integration, data pipeline tools like iterator in Pymilvus and upsert operation, and system optimizations for better operability, load balancing, and query performance. The release also includes bug fixes and updates to existing tools like Birdwatcher and Attu. Developers are encouraged to integrate these updates and provide feedback.

Building LLM Apps with 100x Faster Responses and Drastic Cost Reduction Using GPTCache

Date published
Aug. 28, 2023

Author(s)
Fendy Feng

Language
English

Word count
1461

Hacker News points
None found.

What's this blog post about?

The article discusses the challenges faced by developers while building applications based on large language models (LLMs) such as high costs of API calls and poor performance due to response latency. It introduces GPTCache, an open-source semantic cache designed to improve efficiency and speed of GPT-based applications. GPTCache stores LLM responses in the cache, allowing users to retrieve previously requested answers without calling the LLM again. The article explains how GPTCache works, its benefits including drastic cost reduction, faster response times, improved scalability, and better availability. It also provides an example of OSS Chat, an AI chatbot that utilizes GPTCache and the CVP stack for more accurate results.
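
The core mechanism, embed the incoming query, compare it against previously answered queries, and return the cached answer when similarity clears a threshold, can be sketched in a few dozen lines. The character-frequency "embedding" below is a deliberately crude stand-in for the real embedding model GPTCache would use:

```python
import math

def embed(text):
    # Stand-in embedding: character-frequency vector over lowercase letters.
    # A real semantic cache would use a learned embedding model instead.
    vec = [0.0] * 26
    for ch in text.lower():
        if 'a' <= ch <= 'z':
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def get(self, query):
        qv = embed(query)
        best = max(self.entries, key=lambda e: cosine(qv, e[0]), default=None)
        if best and cosine(qv, best[0]) >= self.threshold:
            return best[1]   # cache hit: skip the expensive LLM call entirely
        return None          # cache miss: caller falls back to the LLM

    def put(self, query, answer):
        self.entries.append((embed(query), answer))

cache = SemanticCache(threshold=0.9)
cache.put("What is a vector database?", "A database for embeddings.")
hit = cache.get("what is a vector database")   # near-duplicate phrasing
miss = cache.get("How do I bake bread?")       # unrelated question
```

The cost and latency savings come from every hit replacing an API round-trip; the threshold trades hit rate against the risk of serving a stale or mismatched answer.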

Comparing Different Vector Embeddings

Date published
Aug. 21, 2023

Author(s)
Yujian Tang

Language
English

Word count
2436

Hacker News points
None found.

What's this blog post about?

This article discusses the differences between vector embeddings generated by different neural networks and how to evaluate them in Jupyter Notebook. Vector embeddings are numerical representations of unstructured data, such as images, videos, audio, text, and molecular images. They are generated by running input data through a pre-trained neural network and taking the output of the second-to-last layer. The article provides an example of comparing vector embeddings from three different multilingual models based on MiniLM from Hugging Face using L2 distance metric and an inverted file index as the vector index. It also demonstrates how to compare vector embeddings directly in a Jupyter Notebook with Milvus Lite, a lightweight version of Milvus.
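
A small numeric sketch makes the article's point about comparing embeddings: distances are only meaningful within one model's embedding space, since different models place the same input at unrelated coordinates. All vectors below are made up; real embeddings would have hundreds of dimensions.

```python
import numpy as np

# Pretend outputs for the same sentence from two different embedding models.
model_a = np.array([0.1, 0.9, 0.2])
model_b = np.array([0.8, 0.1, 0.5])

# Same model, semantically similar (paraphrased) input.
model_a_other = np.array([0.15, 0.85, 0.25])

def l2(u, v):
    # The L2 distance metric used in the article's comparison.
    return float(np.linalg.norm(u - v))

same_model = l2(model_a, model_a_other)    # small: comparable space
cross_model = l2(model_a, model_b)         # large: spaces don't align
```

This is why the article indexes each model's vectors separately (here, with an inverted file index in Milvus) rather than mixing embeddings from different models in one collection.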

How to Build an AI Chatbot with Milvus and Towhee

Date published
Aug. 18, 2023

Author(s)
Eric Goebelbecker

Language
English

Word count
2364

Hacker News points
None found.

What's this blog post about?

In this tutorial, we will create an intelligent chatbot using Milvus and Towhee. We will use the following components to build our chatbot: 1. Milvus: An open-source vector database for efficient similarity search and AI applications. 2. Towhee: A Python library that provides a set of pre-built machine learning models and tools for processing unstructured data. 3. OpenAI API: A service that allows developers to access powerful language generation models like GPT-3.5. 4. Gradio: An open-source Python library for creating interactive demos of machine learning models. First, we need to install the required packages: ```bash pip install milvus pymilvus towhee gradio ``` Next, let's define some variables and answer the prompt for the API key. Run this code to do so: ```python import os import getpass MILVUS_URI = 'http://localhost:19530' [MILVUS_HOST, MILVUS_PORT] = MILVUS_URI.split('://')[1].split(':') DROP_EXIST = True EMBED_MODEL = 'all-mpnet-base-v2' COLLECTION_NAME = 'chatbot_demo' DIM = 768 OPENAI_API_KEY = getpass.getpass('Enter your OpenAI API key: ') if os.path.exists('./sqlite.db'): os.remove('./sqlite.db') ``` Sample pipeline Now, let's download some data and store it in Milvus. But before you do that, let's look at a sample pipeline for downloading and processing unstructured data. You'll use the Towhee documentation pages for this example. You can try different sites to see how the code processes different data sets. 
This code uses Towhee pipelines:

- input - begins a new pipeline with the source passed into it
- map - uses ops.text_loader() to retrieve the URL and map it to 'doc'
- flat_map - uses ops.text_splitter() to process the document into "chunks" for storage
- output - closes and prepares the pipeline for use

Pass this pipeline to DataCollection to see how it works:

```python
from towhee import pipe, ops, DataCollection

pipe_load = (
    pipe.input('source')
        .map('source', 'doc', ops.text_loader())
        .flat_map('doc', 'doc_chunks', ops.text_splitter(chunk_size=300))
        .output('source', 'doc_chunks')
)

DataCollection(pipe_load('https://towhee.io')).show()
```

Here's the output from show():

The pipeline created five chunks from the document.

Sample embedding pipeline

The pipeline retrieved the data and created chunks. You need to create embeddings, too. Let's take a look at another sample pipeline. This one uses map() to run ops.sentence_embedding.sbert() on each chunk. In this example, we're passing in a single block of text.

```python
from towhee import pipe, ops, DataCollection

pipe_embed = (
    pipe.input('doc_chunk')
        .map('doc_chunk', 'vec', ops.sentence_embedding.sbert(model_name=EMBED_MODEL))
        .map('vec', 'vec', ops.np_normalize())
        .output('doc_chunk', 'vec')
)

text = '''SOTA Models

We provide 700+ pre-trained embedding models spanning 5 fields (CV, NLP, Multimodal, Audio, Medical),
15 tasks, and 140+ model architectures. These include BERT, CLIP, ViT, SwinTransformer, data2vec, etc.
'''

DataCollection(pipe_embed(text)).show()
```

Run this code to see how the pipeline processes the single text block:

Prepare Milvus

Now, you need a collection to hold the data.
This code defines create_collection(), which uses MILVUS_HOST and MILVUS_PORT to connect to Milvus, drop any existing collection with the specified name, and create a new one with this schema:

- id - an integer identifier
- embedding - a vector of floats for the embeddings
- text - the corresponding text for the embeddings

```python
from pymilvus import (
    connections, utility, Collection,
    CollectionSchema, FieldSchema, DataType
)

def create_collection(collection_name):
    connections.connect(host=MILVUS_HOST, port=MILVUS_PORT)

    has_collection = utility.has_collection(collection_name)
    if has_collection:
        collection = Collection(collection_name)
        if DROP_EXIST:
            collection.drop()
        else:
            return collection

    # Create collection
    fields = [
        FieldSchema(name='id', dtype=DataType.INT64, is_primary=True, auto_id=True),
        FieldSchema(name='embedding', dtype=DataType.FLOAT_VECTOR, dim=DIM),
        FieldSchema(name='text', dtype=DataType.VARCHAR, max_length=500)
    ]
    schema = CollectionSchema(
        fields=fields,
        description="Towhee demo",
        enable_dynamic_field=True
    )
    collection = Collection(name=collection_name, schema=schema)

    index_params = {
        'metric_type': 'IP',
        'index_type': 'IVF_FLAT',
        'params': {'nlist': 1024}
    }
    collection.create_index(
        field_name='embedding',
        index_params=index_params
    )

    return collection
```

Insert pipeline

It's time to process your input text and insert it into Milvus.
Let's start with a pipeline that collapses what you learned above. This pipeline:

- Creates the new collection
- Retrieves the data
- Splits it into chunks
- Creates embeddings using EMBED_MODEL
- Inserts the text and embeddings into Milvus

```python
from towhee import pipe, ops, DataCollection

load_data = (
    pipe.input('collection_name', 'source')
        .map('collection_name', 'collection', create_collection)
        .map('source', 'doc', ops.text_loader())
        .flat_map('doc', 'doc_chunk', ops.text_splitter(chunk_size=300))
        .map('doc_chunk', 'vec', ops.sentence_embedding.sbert(model_name=EMBED_MODEL))
        .map('vec', 'vec', ops.np_normalize())
        .map(('collection_name', 'vec', 'doc_chunk'), 'mr',
             ops.ann_insert.osschat_milvus(host=MILVUS_HOST, port=MILVUS_PORT))
        .output('mr')
)
```

Here it is in action:

```python
project_name = 'towhee_demo'
data_source = 'https://en.wikipedia.org/wiki/Frodo_Baggins'

mr = load_data(COLLECTION_NAME, data_source)
print('Doc chunks inserted:', len(mr.to_list()))
```

Search knowledge base

Now, with the embeddings and text stored in Milvus, you can search it. This function creates a query pipeline. The most important step is this one:

```python
ops.ann_search.osschat_milvus(host=MILVUS_HOST, port=MILVUS_PORT,
                              **{'metric_type': 'IP', 'limit': 3, 'output_fields': ['text']})
```

The osschat_milvus operator searches the embeddings for matches to the submitted text.
Here is the whole pipeline:

```python
from towhee import pipe, ops, DataCollection

pipe_search = (
    pipe.input('collection_name', 'query')
        .map('query', 'query_vec', ops.sentence_embedding.sbert(model_name=EMBED_MODEL))
        .map('query_vec', 'query_vec', ops.np_normalize())
        .map(('collection_name', 'query_vec'), 'search_res',
             ops.ann_search.osschat_milvus(host=MILVUS_HOST, port=MILVUS_PORT,
                                           **{'metric_type': 'IP', 'limit': 3, 'output_fields': ['text']}))
        .flat_map('search_res', ('id', 'score', 'text'), lambda x: (x[0], x[1], x[2]))
        .output('query', 'text', 'score')
)
```

Try it:

```python
query = 'Who is Frodo Baggins?'
DataCollection(pipe_search(project_name, query)).show()
```

The model does a good job of pulling three closely matched nodes:

Add an LLM

Now, it’s time to add a large language model (LLM) so users can hold a conversation with the chatbot. We’ll use ChatGPT and the OpenAI API for this example.

Chat history

In order to get better results from the LLM, you need to store chat history and present it with queries.
You’ll use SQLite for this step. Here's a function for retrieving the history:

```python
from towhee import pipe, ops, DataCollection

pipe_get_history = (
    pipe.input('collection_name', 'session')
        .map(('collection_name', 'session'), 'history',
             ops.chat_message_histories.sql(method='get'))
        .output('collection_name', 'session', 'history')
)
```

Here's the one to store it:

```python
from towhee import pipe, ops, DataCollection

pipe_add_history = (
    pipe.input('collection_name', 'session', 'question', 'answer')
        .map(('collection_name', 'session', 'question', 'answer'), 'history',
             ops.chat_message_histories.sql(method='add'))
        .output('history')
)
```

LLM query pipeline

Now, we need a pipeline to submit queries to ChatGPT. This pipeline:

- Searches Milvus using the user's query
- Collects the current chat history
- Submits the query, Milvus search results, and chat history to ChatGPT
- Appends the ChatGPT result to the chat history
- Returns the result to the caller

```python
from towhee import pipe, ops, DataCollection

chat = (
    pipe.input('collection_name', 'query', 'session')
        .map('query', 'query_vec', ops.sentence_embedding.sbert(model_name=EMBED_MODEL))
        .map('query_vec', 'query_vec', ops.np_normalize())
        .map(('collection_name', 'query_vec'), 'search_res',
             ops.ann_search.osschat_milvus(host=MILVUS_HOST, port=MILVUS_PORT,
                                           **{'metric_type': 'IP', 'limit': 3, 'output_fields': ['text']}))
        .map('search_res', 'knowledge', lambda y: [x[2] for x in y])
        .map(('collection_name', 'session'), 'history',
             ops.chat_message_histories.sql(method='get'))
        .map(('query', 'knowledge', 'history'), 'messages', ops.prompt.question_answer())
        .map('messages', 'answer',
             ops.LLM.OpenAI(api_key=OPENAI_API_KEY, model_name='gpt-3.5-turbo', temperature=0.8))
        .map(('collection_name', 'session', 'query', 'answer'), 'new_history',
             ops.chat_message_histories.sql(method='add'))
        .output('query', 'history', 'answer')
)
```

Let's test this pipeline before connecting it to a GUI:

```python
new_query = 'Where did Frodo take the ring?'
DataCollection(chat(COLLECTION_NAME, new_query, session_id)).show()
```

The pipeline works. Let's put together a Gradio interface.

Gradio GUI

First, you need functions to create a session identifier and to respond to queries from the interface. These functions create a session ID using a UUID, and accept a session and query for the query pipeline:

```python
import uuid
import io

def create_session_id():
    uid = str(uuid.uuid4())
    suid = ''.join(uid.split('-'))
    return 'sess_' + suid

def respond(session, query):
    res = chat(COLLECTION_NAME, query, session).get_dict()
    answer = res['answer']
    response = res['history']
    response.append((query, answer))
    return response
```

Next, the Gradio interface uses these functions to build a chatbot. It uses the Blocks API to create a Chatbot interface. The Send Message button uses the respond function to send requests to ChatGPT:

```python
import gradio as gr

with gr.Blocks() as demo:
    session_id = gr.State(create_session_id)

    with gr.Row():
        with gr.Column(scale=2):
            gr.Markdown('''## Chat''')
            conversation = gr.Chatbot(label='conversation').style(height=300)
            question = gr.Textbox(label='question', value=None)
            send_btn = gr.Button('Send Message')
            send_btn.click(
                fn=respond,
                inputs=[session_id, question],
                outputs=conversation,
            )

demo.launch(server_name='127.0.0.1', server_port=8902)
```

Here it is:

Now, you have an intelligent chatbot!

Summary

In this post, we created Towhee pipelines to ingest unstructured data, process it for embeddings, and store those embeddings in Milvus. Then, we created a query pipeline for the chat function and connected the chatbot with an LLM. Finally, we got an intelligent chatbot.

This tutorial demonstrates how easy it is to build applications with Milvus. Milvus brings numerous advantages when integrated into applications, especially those relying on machine learning and artificial intelligence.
It offers highly efficient, scalable, and reliable vector similarity search and analytics capabilities critical in applications like chatbots, recommendation systems, and image or text recognition.

Building LLM Augmented Apps with Zilliz Cloud

Date published
Aug. 17, 2023

Author(s)
Steffi Li

Language
English

Word count
1272

Hacker News points
None found.

What's this blog post about?

The release of GPT-3.5 and GPT-4 has revolutionized how users interact with data and applications, providing more natural and intuitive communication interfaces. However, implementing LLMs like ChatGPT in applications presents challenges such as lack of private data access, hallucination, outdated information, high costs, slow performance, and immutable pre-training data. Zilliz Cloud and GPTCache are innovative solutions that address these issues by improving accuracy, timeliness, cost-efficiency, and performance. The CVP Stack (ChatGPT/LLMs + a vector database + prompt-as-code) offers a robust framework for building LLM applications. OSS Chat is an example of a successful AI chatbot built with the CVP stack using Akcio and Zilliz Cloud. To learn more about these technologies, join the upcoming webinar on September 7.
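The caching idea behind GPTCache can be sketched in a few lines: store past (query embedding, answer) pairs, and serve a cached answer when a new query's embedding is close enough to one already seen. This is only an illustration of the concept, with made-up embeddings and a hypothetical similarity threshold, not GPTCache's actual implementation:

```python
import numpy as np

class SemanticCache:
    """Toy semantic cache: reuse an answer when a query embedding is similar enough."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold  # hypothetical cosine-similarity cutoff
        self.entries = []           # list of (embedding, answer) pairs

    def _cosine(self, a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def lookup(self, query_vec):
        for vec, answer in self.entries:
            if self._cosine(query_vec, vec) >= self.threshold:
                return answer  # cache hit: skip the expensive LLM call
        return None           # cache miss: caller queries the LLM and stores the result

    def store(self, query_vec, answer):
        self.entries.append((query_vec, answer))

cache = SemanticCache()
cache.store(np.array([1.0, 0.0, 0.2]), "Milvus is a vector database.")

# A near-identical query hits the cache; an unrelated one misses.
print(cache.lookup(np.array([0.98, 0.02, 0.21])))
print(cache.lookup(np.array([0.0, 1.0, 0.0])))
```

In the real CVP stack, the linear scan over `self.entries` is replaced by a vector-database lookup so cache hits stay fast at scale.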

Using AI to Find Your Celebrity Stylist (Part II)

Date published
Aug. 11, 2023

Author(s)
Yujian Tang

Language
English

Word count
2528

Hacker News points
None found.

What's this blog post about?

In this tutorial, we extended our first celebrity-style project by using Milvus' new dynamic schema, filtering out certain segmentation IDs, and keeping track of the bounding boxes of our matches. We also sorted our search results to return the top three results based on the number of matches. Milvus' new dynamic schema allows us to add extra fields when we upload data using a dictionary format, changing the way we were initially batch-uploading a list of lists. It also facilitated adding crop coordinates without changing the schema. As a new preprocessing step, we filtered out certain IDs that aren't clothing-related based on the model card in Hugging Face. We filter these IDs out in the get_masks function. Fun fact, the obj_ids object in that function is actually a tensor. We also kept track of the bounding boxes. We moved the embedding step to the image cropping function and returned the embeddings with the bounding boxes and segmentation IDs. Then, we saved these embeddings into Milvus using a dynamic schema. At query time, we aggregated all the returned images by the number of bounding boxes they contained, allowing us to find the closest matching celebrity image via different articles of clothing. Now it's up to you. You can take my suggestions and make something else out of it, such as a fashion recommender system, a better style comparison system for you and your friends, or a generative fashion AI app.
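The final aggregation step described above, counting how many bounding-box matches each celebrity image contributes and keeping the top three, can be sketched with collections.Counter. The result rows here are invented purely for illustration:

```python
from collections import Counter

# Hypothetical search hits: each per-clothing-item query returns the source
# image of its nearest celebrity match, plus the crop's bounding box.
hits = [
    {"celebrity_image": "celeb_a.jpg", "box": (10, 20, 50, 80)},
    {"celebrity_image": "celeb_b.jpg", "box": (5, 5, 40, 60)},
    {"celebrity_image": "celeb_a.jpg", "box": (60, 20, 90, 70)},
    {"celebrity_image": "celeb_c.jpg", "box": (0, 0, 30, 30)},
    {"celebrity_image": "celeb_a.jpg", "box": (15, 70, 55, 95)},
    {"celebrity_image": "celeb_b.jpg", "box": (50, 10, 80, 50)},
]

# Count matches per celebrity image and keep the three with the most matches.
counts = Counter(hit["celebrity_image"] for hit in hits)
top_three = counts.most_common(3)
print(top_three)
```

The celebrity whose image accumulates the most per-garment matches is the closest overall style match.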

Using AI to Find Your Celebrity Stylist (Part I)

Date published
Aug. 8, 2023

Author(s)
Yujian Tang

Language
English

Word count
2587

Hacker News points
None found.

What's this blog post about?

The article discusses the use of AI in fashion, specifically focusing on a project called "Fashion AI" that utilizes a fine-tuned model to segment clothing in images. It explains how the project involves cropping out each labeled article and resizing the images to the same size before storing the embeddings generated from those images in Milvus, an open-source vector database. The article also provides detailed steps on how to generate image segmentation for fashion items, add your image data to Milvus, and find out which celebrity your dress is most like using this technology.

Zilliz Cloud Expands to AWS and GCP Singapore

Date published
Aug. 7, 2023

Author(s)
Steffi Li

Language
English

Word count
338

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud has expanded its services to AWS and GCP Singapore regions, following the positive response from users since its launch in April 2023. This expansion aims to meet increasing demand and provide greater flexibility for customers by offering more deployment options. As a result, Zilliz Cloud is now the first fully managed vector database available on AWS in the APAC region. The company invites users to explore new possibilities with this expansion and offers free trials of its Starter Plan and Standard plan with up to $200 worth of credits.

Retrieval Augmented Generation with Citations

Date published
Aug. 4, 2023

Author(s)
Yujian Tang

Language
English

Word count
1209

Hacker News points
None found.

What's this blog post about?

This tutorial explains how to implement retrieval augmented generation (RAG) with citations using LlamaIndex and Milvus. RAG is a technique used in large language model (LLM) applications to supplement their knowledge, addressing the lack of up-to-date or domain-specific information. The process involves using a vector database like Milvus to inject knowledge into an app. Citations and attributions are crucial for determining trustworthy answers as more data is added. LlamaIndex and Milvus can be used together to create a citation query engine, allowing users to retrieve information with citations or attributions. The tutorial demonstrates this process using Python libraries and provides code examples for scraping data from Wikipedia, setting up the vector store in LlamaIndex, and querying the engine with citations.
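Stripped of the LlamaIndex machinery, attaching citations to a retrieval-augmented answer amounts to numbering the retrieved chunks and remembering the source of each number. A minimal sketch of that bookkeeping (the chunks, sources, and retrieval step are invented; the real tutorial uses Milvus and LlamaIndex's citation query engine):

```python
# Hypothetical retrieved chunks, each tagged with its source document.
retrieved = [
    {"text": "Milvus is an open-source vector database.", "source": "milvus_overview.md"},
    {"text": "RAG supplements an LLM with external knowledge.", "source": "rag_intro.md"},
]

def build_cited_prompt(question, chunks):
    """Number each chunk so the LLM can cite it as [1], [2], ..."""
    context = "\n".join(
        f"[{i}] {chunk['text']}" for i, chunk in enumerate(chunks, start=1)
    )
    # Map citation numbers back to source documents for attribution.
    sources = {i: chunk["source"] for i, chunk in enumerate(chunks, start=1)}
    prompt = (
        "Answer using only the numbered sources below, citing them as [n].\n"
        f"{context}\nQuestion: {question}"
    )
    return prompt, sources

prompt, sources = build_cited_prompt("What is Milvus?", retrieved)
print(sources[1])   # the document behind citation [1]
```

When the LLM's answer comes back containing markers like "[1]", the `sources` map turns each marker into a verifiable attribution.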

What Is a Real Vector Database?

Date published
Aug. 3, 2023

Author(s)
Fendy Feng

Language
English

Word count
1135

Hacker News points
None found.

What's this blog post about?

The emergence of ChatGPT has signaled the start of a new era in artificial intelligence (AI), with vector databases becoming an essential infrastructure. Vector databases store and retrieve unstructured data such as images, audio, videos, and text through high-dimensional numerical representations called embeddings. They are frequently used for similarity searches using the Approximate Nearest Neighbor (ANN) algorithm. Specialized vector databases like Milvus and Zilliz Cloud offer many user-friendly features and are a better-suited solution for unstructured data storage and retrieval than bare vector search libraries. Vector databases are becoming vital infrastructure for AI-related tech stacks, supporting use cases such as LLM augmentation, recommender systems, image/audio/video/text similarity searches, anomaly detection, question-answering systems, and molecular similarity searches. To choose the most suitable vector database for your project, VectorDBBench is an open-source benchmarking tool that evaluates various vector database systems on QPS, latency, capacity, and other metrics.
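The ANN approach mentioned above trades a little recall for a lot of speed by searching only part of the index. A bare-bones sketch of the inverted-file (IVF) idea behind many vector indexes, with random data and a tiny cluster count chosen purely for illustration (real systems run k-means for the centroids; here existing vectors are sampled to keep the sketch short):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_vectors, n_clusters = 8, 1000, 10
data = rng.normal(size=(n_vectors, dim)).astype(np.float32)

# "Training": pick cluster centroids and assign every vector to its nearest one.
centroids = data[rng.choice(n_vectors, n_clusters, replace=False)]
assignments = np.argmin(
    np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2), axis=1
)

def ivf_search(query, nprobe=3):
    """Search only the vectors in the nprobe clusters nearest the query."""
    cluster_order = np.argsort(np.linalg.norm(centroids - query, axis=1))
    candidate_ids = np.where(np.isin(assignments, cluster_order[:nprobe]))[0]
    dists = np.linalg.norm(data[candidate_ids] - query, axis=1)
    return candidate_ids[np.argmin(dists)]

query = data[42] + 0.01 * rng.normal(size=dim)
print(ivf_search(query))  # index of the (approximate) nearest neighbor
```

Only roughly `nprobe / n_clusters` of the data is scanned per query, which is why ANN indexes stay fast as collections grow.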

Zilliz Cloud: a Fully-Managed Vector Database That Minimizes Users’ Costs for Building AI Apps

Date published
Aug. 1, 2023

Author(s)
James Luan

Language
English

Word count
1037

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud, a fully-managed vector database, aims to minimize users' costs for building AI applications. The latest release of Zilliz Cloud offers new features such as partition key, dynamic schema, and JSON support, making it more accessible and affordable for developers. By minimizing development, hardware, and maintenance costs, Zilliz Cloud enables traditional companies and startups to create innovative AI applications. Future updates will introduce unstructured data processing pipelines, support for complex aggregation functions, and global expansion of services.

Getting Started With the Milvus JavaScript Client

Date published
July 28, 2023

Author(s)
Eric Goebelbecker

Language
English

Word count
1833

Hacker News points
None found.

What's this blog post about?

Milvus is an open-source database designed for vector search, offering robust scalability for various loads. It's ideal for machine learning deployments and includes best-in-class tooling like the JavaScript client. In this tutorial, we'll guide you through setting up a development environment with Milvus Lite and the Milvus Node.js SDK (client). We'll cover connecting to a server, creating databases and collections, inserting data, performing queries and searches, and more. With these tools, working with vector data in JavaScript using Milvus becomes simple and efficient.

Breaking Barriers: Democratizing Access to Vector Databases for All

Date published
July 27, 2023

Author(s)
Fendy Feng

Language
English

Word count
1340

Hacker News points
None found.

What's this blog post about?

Vector databases, crucial infrastructure for AI applications and large language models (LLMs), have gained widespread attention from a broader user base. Unlike traditional relational or NoSQL databases that store structured data, vector databases are purpose-built to store and manage unstructured data in numeric representations called embeddings. They enable similarity searches using the approximate nearest neighbor (ANN) algorithm, making them valuable for various use cases such as recommender systems, anomaly detection, and question-and-answer systems. The democratization of vector databases is essential to making progress in AI technology. However, not all developers have equal access, due to barriers like proprietary technology, complex architecture and deployment, high costs, and poor user experience. To improve vector database democratization, it's crucial to evangelize knowledge, expertise, and technologies; open the source code to all developers; provide fully managed vector database services; offer free cloud options for individual developers and small teams; and prioritize a great user experience that meets users' needs. Choosing the right vector database for your project can be challenging due to the many available options. VectorDBBench, an open-source benchmarking tool, thoroughly evaluates and compares different vector database systems based on critical metrics such as queries per second (QPS), latency, throughput, and capacity.

Yujian Tang: Why I Joined Zilliz as Developer Advocate

Date published
July 26, 2023

Author(s)
Yujian Tang

Language
English

Word count
706

Hacker News points
None found.

What's this blog post about?

Yujian Tang, a developer advocate at Zilliz, has an extensive background in computer science, statistics, and neuroscience. He previously worked as a software engineer at Amazon and researched machine learning. Tang chose to join Zilliz due to their focus on vector databases and the company's commitment to open-source ethos. As a developer advocate, he works with cutting-edge AI technologies and hosts meetups and conferences. He encourages others interested in DevRel roles to consider joining Zilliz.

Getting Started with the Zilliz REST API

Date published
July 25, 2023

Author(s)
Eric Goebelbecker

Language
English

Word count
1752

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud is a comprehensive vector database service that accelerates AI and analytics applications at scale. It's built on Milvus, an open-source vector database capable of handling billions of vector embeddings. The use cases for Milvus and Zilliz Cloud are broad and varied, including powering recommendation systems and building AI models in healthcare. The Zilliz REST API provides methods for managing clusters, collections, and vector data, allowing users to create, list, describe, drop, insert, delete, query, and search collections.

Zilliz Cloud: Igniting Vector Searching with Rocket-Like Speed

Date published
July 19, 2023

Author(s)
Li Liu

Language
English

Word count
978

Hacker News points
None found.

What's this blog post about?

Zilliz recently launched an updated version of its cloud platform, introducing new features such as a free tier, dynamic schema and partition keys, and more affordable pricing plans. The latest update has significantly improved performance, making it twice as fast as the previous version and three to ten times faster than other vector databases like Milvus. Zilliz Cloud's speed is attributed to its robust vector indexing engine, optimized code structure, and AutoIndex feature for stable recall rates.

Frank Liu: Why I Joined a Vector Database Company

Date published
July 18, 2023

Author(s)
Frank Liu

Language
English

Word count
928

Hacker News points
None found.

What's this blog post about?

The text discusses the importance of machine learning models and their corresponding embeddings, which are high-dimensional vectors that provide an abstract way to represent input data in the model. It explains how embeddings have been used in various applications such as image recognition and semantic search. The author shares their personal journey working with embeddings and vector search, highlighting their experiences at Yahoo and a startup they founded. They also discuss Zilliz's mission to build an affordable and scalable vector search solution for the enterprise AI infrastructure market. The text ends by inviting readers to join Zilliz in its efforts to democratize enterprise AI infrastructure.

What's New in Milvus 2.2.10 and 2.2.11

Date published
July 14, 2023

Author(s)
Steffi Li

Language
English

Word count
291

Hacker News points
None found.

What's this blog post about?

Milvus has released versions 2.2.10 and 2.2.11, which include enhancements to improve functionality and user experience. Updates have been made based on community feedback, with a focus on performance and security improvements. The latest versions introduce the 'FlushAll' function and Database API for RBAC capabilities, optimize disk usage for RocksMq by enabling zstd compression, and replace CGO payload writer with Go payload writer to reduce memory usage. Additionally, several bug fixes and performance enhancements have been made in these releases.

Democratizing Vector Databases: Empowering Access & Equality

Date published
July 12, 2023

Author(s)
Yujian Tang

Language
English

Word count
1040

Hacker News points
None found.

What's this blog post about?

The democratization of technology refers to making it widely available and accessible, particularly in the context of software engineering. This involves using one's knowledge to simplify the creation, adoption, and understanding of technological advances for others. In this article, the author discusses the process of democratizing vector databases, which are complex tools that have traditionally only been available to developers at large enterprises. The author highlights three pillars of technology democratization: education, increasing accessibility, and evangelism. By open-sourcing projects like Milvus, providing educational resources, and offering free tiers for cloud services, companies can help expand the adoption of vector databases and other advanced technologies.

Filip Haltmayer: Why I Joined Zilliz as Software Engineer

Date published
July 10, 2023

Author(s)
Filip Haltmayer

Language
English

Word count
701

Hacker News points
None found.

What's this blog post about?

Filip Haltmayer, a software engineer at Zilliz in Redwood City, California, shares his journey into the company that leads in AI and vector search technology. His passion for software engineering led him to focus on distributed systems and machine learning during university. After graduation, he worked on personal projects in these areas before joining Zilliz. The technical interview with Zilliz aligned with his interests, and he was impressed by the team's intelligence and shared passion for pushing boundaries in vector search technology. Two years later, Haltmayer remains happy at Zilliz as it continues to grow and contribute significantly to the field of AI and vector searching.

Getting Started with PyMilvus

Date published
July 7, 2023

Author(s)
Eric Goebelbecker

Language
English

Word count
1806

Hacker News points
None found.

What's this blog post about?

Milvus, an open-source vector database, paired with PyMilvus, its Python SDK, is a powerful tool for handling large data sets and performing advanced computations and searches. This tutorial guides you through installing and setting up a development environment for using Milvus and PyMilvus. It then walks through example code for analyzing audio files, storing their data in Milvus, and using it to compare audio samples for similarities. The setup includes creating a virtual environment, installing Python dependencies, starting Redis, and installing and starting Milvus Lite. Finally, the tutorial demonstrates how to connect to Redis and Milvus, create a collection, store audio data, and search for similarities.

Setting Up With Facebook AI Similarity Search (FAISS)

Date published
July 4, 2023

Author(s)
Keshav Malik

Language
English

Word count
2231

Hacker News points
None found.

What's this blog post about?

Facebook's AI Similarity Search (FAISS) is a library that provides efficient and reliable solutions to similarity search problems, especially when dealing with large-scale data. It functions on the concept of "vector similarity" and can handle millions or even billions of vectors quickly and accurately. FAISS has various applications, from image recognition and text retrieval to clustering and data analysis. To set up FAISS, you need Conda installed on your system. Once installed, FAISS can be used for tasks such as searching for similar text data in the Stanford Question Answering Dataset (SQuAD). Best practices include understanding your data, choosing the right index, preprocessing your data effectively, batching your queries, and tuning your parameters. Compared to FAISS, purpose-built vector databases like Milvus offer more advanced capabilities for scalable similarity search and AI applications.
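FAISS's simplest index, IndexFlatL2, performs an exact brute-force L2 search; every fancier index approximates it. What that baseline computes can be shown in plain NumPy (this is the reference computation, not FAISS itself, and the data is random):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16                                          # vector dimensionality
xb = rng.random((500, d)).astype("float32")     # database vectors
xq = xb[:3] + 0.001                             # queries: slightly perturbed copies

def flat_l2_search(queries, base, k=4):
    """Exact k-NN by squared L2 distance, the baseline an IndexFlatL2 search gives."""
    # Squared L2 distances via broadcasting: shape (n_queries, n_base).
    dists = ((queries[:, None, :] - base[None, :, :]) ** 2).sum(axis=2)
    idx = np.argsort(dists, axis=1)[:, :k]      # k nearest ids per query
    return np.take_along_axis(dists, idx, axis=1), idx

D, I = flat_l2_search(xq, xb)
print(I[:, 0])   # each query's nearest neighbor should be its own source vector
```

The post's batching advice falls out of this formulation: computing distances for all queries in one matrix operation is far cheaper than looping query by query.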

Webinar Recap: Retrieval Techniques for Accessing the Most Relevant Context for LLM Applications

Date published
July 3, 2023

Author(s)
Fendy Feng

Language
English

Word count
1635

Hacker News points
None found.

What's this blog post about?

In a recent webinar, Harrison Chase and Filip Haltmayer discussed retrieval techniques for accessing the most relevant context for large language model (LLM) applications. Retrieval involves extracting information from connected external sources and incorporating it into queries to provide context. Semantic search is one of the most critical use cases for retrieval, which functions within a typical CVP architecture (ChatGPT+Vector store+Prompt as code). The webinar also covered edge cases of semantic searches, such as repeated information, conflicting information, temporality, metadata querying, and multi-hop questions. Various solutions to these challenges were proposed during the discussion.

How to Select the Most Appropriate CU Type and Size for Your Business?

Date published
June 30, 2023

Author(s)
Robert Guo

Language
English

Word count
1164

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud offers three types of Compute Units (CUs) - Performance-optimized, Capacity-optimized, and Cost-optimized. The Performance-optimized CU is ideal for rapid response time applications with high throughput requirements such as Generative AI, Recommender systems, Search engines, Chatbots, Content moderation, Augmenting LLMs' knowledge base, and Anti-fraud systems. Capacity-optimized CUs are suitable for handling large-scale unstructured data searches like text, images, videos, and molecular structures, copyright violations detection, and identity verification. Cost-optimized CUs are perfect for offline tasks with a tight budget but higher search latency. The performance comparison shows that the Performance-optimized CU outperforms others in terms of latency and throughput. Capacity evaluation results indicate that capacity-optimized and cost-optimized CUs have equal capacities, five times larger than the performance-optimized CU. Examples are provided to help businesses choose the most suitable option for their needs.

Persistent Vector Storage for LlamaIndex

Date published
June 27, 2023

Author(s)
Yujian Tang

Language
English

Word count
1040

Hacker News points
None found.

What's this blog post about?

This article discusses the challenges and solutions in building applications using large language models (LLMs) such as OpenAI's ChatGPT. The three main challenges are high costs, lack of up-to-date information, and need for domain-specific knowledge. Two proposed frameworks to address these issues are fine-tuning and caching + injection. LlamaIndex is a powerful tool that can abstract much of the latter framework. The article introduces LlamaIndex as a "black box around your Data and an LLM" and explains its four main indexing patterns: list, vector store, tree, and keyword indices. It then demonstrates how to create and save a persistent vector index using LlamaIndex with both local and cloud vector databases (Milvus Lite and Zilliz). In summary, the article provides an overview of LlamaIndex, its applications in LLM-based applications, and offers guidance on creating and managing persistent vector store indices for real-world use cases.

Enhancing ChatGPT's Intelligence and Efficiency: The Power of LangChain and Milvus

Date published
June 26, 2023

Author(s)
Silvia Chen

Language
English

Word count
2219

Hacker News points
None found.

What's this blog post about?

The combination of LangChain and Milvus can enhance ChatGPT's intelligence and efficiency by harnessing vector stores' power. LangChain is a framework for developing applications powered by language models, while Milvus is a vector database that enables semantic search functionality. By integrating these tools, developers can create more reliable AI-Generated Content (AIGC) applications and address hallucination problems in ChatGPT. Additionally, using GPTCache and fine-tuning embedding models and prompts can improve the performance and search quality of AIGC applications.

The Philosophy Behind Zilliz Cloud’s Product Experience Optimization

Date published
June 20, 2023

Author(s)
Koko Lv

Language
English

Word count
998

Hacker News points
None found.

What's this blog post about?

The latest version of Zilliz Cloud introduces design optimizations to improve the product experience. Key updates include prioritizing ease of use, streamlining workflows with clear guidance, valuing user feedback, ensuring visually enjoyable experiences, and offering a smooth user journey. These enhancements aim to provide users with an intuitive interface and seamless navigation while using Zilliz Cloud's vector retrieval capabilities. The company encourages users to share their suggestions or ideas for further improvements through the support portal, LinkedIn, Twitter, or by contacting engineers directly.

Query Multiple Documents Using LlamaIndex, LangChain, and Milvus

Date published
June 19, 2023

Author(s)
Yujian Tang

Language
English

Word count
1974

Hacker News points
None found.

What's this blog post about?

This tutorial demonstrates how to use Large Language Models (LLMs) like GPT in production by querying multiple documents using LlamaIndex, LangChain, and Milvus. The process involves setting up a Jupyter Notebook, building a Document Query Engine with LlamaIndex, starting the vector database, gathering documents, creating document indices in LlamaIndex, performing decomposable querying over your documents, comparing non-decomposed queries, and summarizing how to do multi-document querying using LlamaIndex. The use of decomposable queries allows for breaking down complex queries into simpler ones that can be answered by a single data source.
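The decomposable-query idea can be sketched in pure Python, with a naive splitter standing in for the LLM that LlamaIndex uses to break a compound question into single-source sub-questions (the splitting rule and the sources are invented for illustration):

```python
def decompose(query):
    """Naively split a compound question into sub-questions on ' and ' — a stand-in
    for the LLM-driven decomposition LlamaIndex performs."""
    parts = [p.strip() for p in query.replace("?", "").split(" and ")]
    return [p + "?" for p in parts if p]


def answer(query, sources):
    """Route each sub-question to the one source able to answer it, then combine.
    `sources` maps a topic keyword to a callable that answers questions on it."""
    answers = []
    for sub in decompose(query):
        for topic, respond in sources.items():
            if topic in sub.lower():
                answers.append(respond(sub))
                break
    return " ".join(answers)
```

Each sub-question hits a single document index, which is what lets one compound query span multiple documents.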

Improved Team Collaboration with Zilliz Cloud’s New Organizations and Roles Feature

Date published
June 16, 2023

Author(s)
Sarah Tang

Language
English

Word count
886

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud, a cloud service offering fast and scalable vector retrieval capabilities, has introduced the Organizations and Roles feature to simplify team access and permission management. The new feature includes three roles: Organization Owner, Organization Member, and Project Owner, each with unique access and permissions. This update aims to improve collaboration, security, and flexibility in users' workflows. To get started, users can sign up for a free account or log into their existing one, create an organization, invite new members, and manage billings collectively.

Introducing an Open Source Vector Database Benchmark Tool for Choosing the Ideal Vector Database for Your Project

Date published
June 16, 2023

Author(s)
Li Liu

Language
English

Word count
1134

Hacker News points
None found.

What's this blog post about?

The new open-source Vector Database Benchmark Tool is designed to help developers choose the ideal vector database for their projects. This tool enables users to measure performance across critical metrics and compare different options. Key features include flexibility, realistic workload simulation, interactive reports and visualization, and open-source community collaboration. VectorDBBench, written in Python, supports six vector databases: Milvus, Zilliz, Pinecone, Weaviate, Qdrant, and Elasticsearch. Users can download the tool from GitHub and install it using pip. The tool is actively maintained by a community of developers committed to improving its features and performance.

Zilliz Cloud Latest Update: A Game-Changer Bringing Elite Performance within Reach of All Developers

Date published
June 14, 2023

Author(s)
Robert Guo

Language
English

Word count
940

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud has released an update that introduces new features and more affordable pricing options, making it accessible to all developers regardless of budget. The latest release includes a free tier option with up to two collections handling 500,000 vectors each. Various pricing plans are available: Starter, Standard, Enterprise, and Self-hosted. A new Cost-Optimized CU offers the same storage capacity as the existing Capacity-Optimized CU but costs about 30% less. The Organizations and Roles feature enables users to manage team access and permissions easily. Zilliz Cloud now supports JSON data types, enabling users to store and manage JSON data alongside Approximate Nearest Neighbor (ANN) Search capabilities. Dynamic schema support is also available. A new benchmark tool, VectorDBBench, allows users to measure the performance of vector database solutions against other offerings in the market with their data.

Prompting in LangChain

Date published
June 12, 2023

Author(s)
Yujian Tang

Language
English

Word count
1472

Hacker News points
None found.

What's this blog post about?

The recent emergence of large language models (LLMs) has introduced new tools, such as the LLM framework LangChain. This versatile tool offers features like different prompting methods, maintaining conversational context, and connecting to external tools. Prompting is a crucial task in building AI applications with LLMs, and this article explores how to use LangChain for more complex prompts. The text covers:

1. Simple Prompts in LangChain: demonstrates basic LangChain prompting by creating a single prompt with the `PromptTemplate` object, then adding an LLM to create an `LLMChain`.
2. Multi-Question Prompts: shows how to handle multiple questions within a single prompt using the same `PromptTemplate` object.
3. Few-Shot Learning with LangChain Prompts: introduces "few-shot learning," where users teach the model how to behave by providing examples of desired responses, demonstrated with the `FewShotPromptTemplate`.
4. Token-Limiting Your LangChain Prompts: explains how to use the `LengthBasedExampleSelector` object to limit tokens in queries and keep costs down.
5. A Summary of Prompting in LangChain: concludes by summarizing the key points covered in the article.
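The idea behind length-based example selection can be sketched as a greedy word budget — a rough proxy for tokens. This is an illustration of the concept, not LangChain's `LengthBasedExampleSelector` implementation:

```python
def select_examples(examples, max_words):
    """Greedily keep few-shot examples, in order, until a word budget is exhausted.
    Word count stands in for token count here; a real selector would use the
    model's tokenizer."""
    selected, used = [], 0
    for example in examples:
        length = len(example.split())
        if used + length > max_words:
            break  # adding this example would blow the budget
        selected.append(example)
        used += length
    return selected
```

Trimming few-shot examples this way keeps the prompt under the model's context limit and keeps per-query costs predictable.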

Auto GPT Explained: A Comprehensive Auto-GPT Guide For Your Unique Use Case

Date published
June 8, 2023

Author(s)
Yujian Tang

Language
English

Word count
1789

Hacker News points
None found.

What's this blog post about?

Auto-GPT is an open-source, autonomous AI application that utilizes large language models (LLMs) to perform tasks such as browsing the internet, speaking via text-to-speech tools, writing code, and keeping track of its inputs and outputs. It has garnered significant attention due to its potential for automating mundane tasks and enhancing productivity. This article provides a comprehensive guide on setting up Auto-GPT, configuring it, running tasks, and adding memory using Milvus vector database. The integration of Milvus as a backend storage solution allows users to search, retrieve, or edit data more efficiently than the default JSON file format.
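The memory integration can be sketched in pure Python: past command/result pairs are embedded and stored, and the most similar ones are recalled for each new prompt. The hashing "embedding" and the class below are illustrative stand-ins for a real embedding model and for Milvus:

```python
import math


def embed(text, dim=128):
    """Toy hashing 'embedding' standing in for a real embedding model."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class AgentMemory:
    """Store past command/result pairs as vectors and recall the most similar ones,
    sketching how Auto-GPT can use a vector database to recall relevant context
    that no longer fits in the model's token window."""

    def __init__(self):
        self.items = []  # list of (embedding, text)

    def remember(self, command, result):
        text = f"{command} -> {result}"
        self.items.append((embed(text), text))

    def recall(self, prompt, k=2):
        q = embed(prompt)
        ranked = sorted(self.items, key=lambda item: cosine(item[0], q), reverse=True)
        return [text for _, text in ranked[:k]]
```

With Milvus as the backend, the linear scan in `recall` becomes an indexed similarity search, which is what makes retrieval practical as the memory grows.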

What's New in Milvus version 2.2.9

Date published
June 6, 2023

Author(s)
Chris Churilo

Language
English

Word count
357

Hacker News points
None found.

What's this blog post about?

The Milvus community has released Milvus 2.2.9, which includes new features such as JSON support, dynamic schema handling, and partition key usage. Additionally, the update allows for more efficient resource utilization by removing the limit on the number of partitions. Bug fixes and performance enhancements are also included in this release. For a complete list of changes, check out the release notes.

Get Ready for GPT-4 with GPTCache & Milvus, Save Big on Multimodal AI

Date published
May 31, 2023

Author(s)
Jael Gu

Language
English

Word count
2734

Hacker News points
None found.

What's this blog post about?

OpenAI's ChatGPT, powered by GPT-3.5, has revolutionized natural language processing (NLP) and sparked interest in large language models (LLMs). As LLM adoption grows across industries, so does the need for AI models that can process multimodal data. With anticipation building for GPT-4, which promises to accept visual inputs, Zilliz has introduced GPTCache integrated with Milvus — a solution that can help businesses save significantly on multimodal AI.

Multimodal AI integrates multiple modes of perception and communication, such as speech, vision, language, and gesture, to create more intelligent and effective AI systems. This approach allows models to better understand and interpret human interactions and environments and to generate more accurate and nuanced responses, with applications in healthcare, education, entertainment, transportation, and other fields.

GPTCache optimizes response time and reduces the expense of API calls to large models: the system searches a cache for a likely answer before sending a request to the model, speeding up the whole process and cutting running costs. It is a semantic cache — a structured store of knowledge representations designed so that commonly asked questions can be answered from precomputed results, improving the performance and efficiency of AI applications.

A cornerstone of a semantic cache such as GPTCache is the vector database. GPTCache's embedding generator converts data into embeddings for vector storage and semantic search. Storing these vectors in a vector database such as Milvus supports large data scales and speeds up similarity search, allowing more efficient retrieval of candidate answers from the cache. The Milvus ecosystem provides helpful tools for database monitoring, data migration, and data-size estimation, and the cloud-native service Zilliz Cloud simplifies Milvus implementation and maintenance. The combination of Milvus with GPTCache offers a powerful solution for enhancing the functionality and performance of multimodal AI applications.

Temperature in machine learning balances randomness and coherence to match an application's needs, and GPTCache largely retains this notion through three options in its workflow:
1. Select after evaluation
2. Call the model without the cache
3. Edit the result from the cache

The post showcases three multimodal implementations of GPTCache and Milvus:
1. Text-to-Image: image generation
2. Image-to-Text: image captioning
3. Audio-to-Text: speech transcription

With its support for unstructured data, Milvus is an ideal foundation for building and scaling multimodal applications, and upcoming GPTCache features such as session management, context awareness, and server support will further extend the capabilities of multimodal AI.
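The cache-before-model flow can be sketched in pure Python. The hashing "embedding" and the brute-force scan below are illustrative stand-ins for GPTCache's real embedding generator and for Milvus, which makes this lookup fast at scale:

```python
import math


def embed(text, dim=128):
    """Toy hashing 'embedding'; GPTCache would plug in a real embedding model here."""
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Answer from the cache when a past question is similar enough; otherwise
    call the model and cache its answer."""

    def __init__(self, model, threshold=0.8):
        self.model = model        # callable: question -> answer
        self.threshold = threshold
        self.entries = []         # list of (embedding, answer)

    def ask(self, question):
        q = embed(question)
        best_answer, best_sim = None, 0.0
        for emb, answer in self.entries:  # Milvus would replace this linear scan
            sim = cosine(emb, q)
            if sim > best_sim:
                best_answer, best_sim = answer, sim
        if best_answer is not None and best_sim >= self.threshold:
            return best_answer, "hit"
        answer = self.model(question)
        self.entries.append((q, answer))
        return answer, "miss"
```

Every hit avoids one model call, which is where the latency and cost savings come from.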

GPTCache, LangChain, Strong Alliance

Date published
May 25, 2023

Author(s)
Sim Fu

Language
English

Word count
710

Hacker News points
None found.

What's this blog post about?

The GPTCache project aims to build a semantic cache for storing large language model (LLM) responses, addressing the rising costs and slow response times that come with high traffic. LangChain is a library that assists in developing applications combining LLMs with other computational or knowledge sources. Before the GPTCache integration, LangChain's caching was based on string matching — via Memory Cache, SQLite Cache, and Redis Cache — so a cache hit required an identical question, which limited the cache utilization rate. Integrating GPTCache significantly improves cache functionality by embedding each question as a vector and performing approximate vector search over cache storage. This increases the cache hit rate, reduces LLM usage costs, and speeds up response times.
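The limitation can be shown with a toy exact-match cache of the kind LangChain started with: any rewording of a stored question misses (class and data are illustrative, not LangChain code):

```python
class ExactMatchCache:
    """String-keyed cache sketching LangChain's pre-GPTCache caches: only a
    character-for-character identical question hits; any rewording misses."""

    def __init__(self):
        self.store = {}

    def lookup(self, question):
        # Returns the cached answer, or None on a miss.
        return self.store.get(question)

    def update(self, question, answer):
        self.store[question] = answer
```

A semantic cache replaces the dictionary key with an embedding and the equality check with a vector similarity search, so near-duplicate questions can still hit.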

Data Mastery Made Easy: Exploring the Magic of Vector Databases in Jupyter Notebooks

Date published
May 24, 2023

Author(s)
Yujian Tang

Language
English

Word count
908

Hacker News points
None found.

What's this blog post about?

This tutorial explores the use of vector databases in Jupyter Notebooks, particularly Milvus Lite. Vector databases are useful for working with unstructured data like images, text, or video and can help solve problems faced by large language models (LLMs) such as a lack of domain knowledge and up-to-date data. They also power similarity search applications, product recommendations, reverse image search, and semantic text search. The tutorial covers the basics of vector databases, Milvus Lite, and how to use them in Jupyter Notebooks. It provides examples for using a standalone vector database instance like Milvus Standalone and offers resources for understanding vector databases further.
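At its core, the similarity search a vector database provides is nearest-neighbor search over vectors. A brute-force version fits in a few lines — this is the concept only; Milvus Lite adds indexes, persistence, and scale on top of it:

```python
import math


def knn_search(vectors, query, k=2):
    """Brute-force k-nearest-neighbor search over raw vectors: conceptually what a
    vector database accelerates with indexes and smarter storage at scale."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Rank every stored vector by distance to the query; return the ids of the top k.
    return sorted(range(len(vectors)), key=lambda i: dist(vectors[i], query))[:k]
```

Returning ids rather than vectors mirrors how a database hands back primary keys that map to the original images, text, or video.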

Ultimate Guide to Getting Started with LangChain

Date published
May 22, 2023

Author(s)
Yujian Tang

Language
English

Word count
1446

Hacker News points
None found.

What's this blog post about?

LangChain is a framework that enables the creation of applications using large language models (LLMs) like GPT. It provides functionalities such as token management and context management, allowing users to build with the CVP Framework. The two core LangChain functionalities for LLMs are data-awareness and agency. One primary use case is querying text data, which can be done using documents, vector stores, or GPT interactions. In this tutorial, we covered how to interact with GPT using LangChain and queried a document for semantic meaning using LangChain with a vector store.

What is Pymilvus?

Date published
May 20, 2023

Author(s)
Filip Haltmayer

Language
English

Word count
1159

Hacker News points
None found.

What's this blog post about?

Pymilvus is a Python SDK built for Milvus and Zilliz Cloud, offering access to all features provided by Milvus. However, users have faced issues with the complexity of configuration options available in the vector database system. To address this, MilvusClient was introduced as an attempt to simplify the API for most users. It offers functions such as insert_data(), upsert_data(), search_data(), query_data(), get_vectors_by_pk(), delete_by_pk(), add_partition(), and remove_partition(). The main goal of MilvusClient is to provide easy-to-use operations that may not exist or are unoptimized on the Pymilvus side. As Pymilvus improves, these operations can be optimized behind the scenes while maintaining a simple API for users.

Using a Vector Database to Search White House Speeches

Date published
May 19, 2023

Author(s)
Yujian Tang

Language
English

Word count
1967

Hacker News points
None found.

What's this blog post about?

This tutorial demonstrates how to use semantic search with a vector database to analyze speeches given by the Biden administration during their first two years in office. The dataset used is "The White House (Speeches and Remarks) 12/10/2022" found on Kaggle. The process involves cleaning the data, setting up a vector database using Milvus Lite, getting vector embeddings from speeches, populating the vector database, and performing semantic searches based on descriptions. Semantic search allows for finding speeches with similar content rather than just matching exact phrases or sentences.

Getting Started with LlamaIndex

Date published
May 17, 2023

Author(s)
Yujian Tang

Language
English

Word count
1793

Hacker News points
None found.

What's this blog post about?

LlamaIndex is a user-friendly, flexible data framework that connects private, customized data sources to large language models (LLMs). It helps address LLMs' lack of domain-specific knowledge by injecting data. The indexes in LlamaIndex include the list index, vector store index, tree index, and keyword index. Each index is made up of "nodes" that represent a chunk of text from a document, and LlamaIndex can build many types of indexes depending on the task at hand. It offers an efficient way to query large amounts of data for certain keywords or to introduce similarity into LLM applications. The "Basics of How to Use LlamaIndex" section covers loading a text file, querying the vector store index, and saving and loading an index. Projects that can be created with LlamaIndex include chatbots, web apps, and more.
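Node construction — splitting a document into overlapping chunks — can be sketched as follows. The word-based splitting and the sizes are illustrative, not LlamaIndex's defaults:

```python
def make_nodes(document, chunk_size=6, overlap=2):
    """Split a document into overlapping word chunks ('nodes'). LlamaIndex builds
    its indices over nodes like these rather than over whole documents; overlap
    keeps context from being cut mid-thought at chunk boundaries."""
    words = document.split()
    nodes, start = [], 0
    step = chunk_size - overlap
    while start < len(words):
        nodes.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # final chunk reached the end of the document
        start += step
    return nodes
```

Whichever index type is built — list, vector store, tree, or keyword — it is these nodes, not raw documents, that get stored and retrieved.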

Revolutionizing Autonomous AI: Harnessing Vector Databases to Empower Auto-GPT

Date published
May 16, 2023

Author(s)
Sim Fu

Language
English

Word count
1019

Hacker News points
None found.

What's this blog post about?

Auto-GPT is an experimental open-source project that combines a GPT language model with other tools to create an AI system capable of working independently without human intervention. It consists of two core parts: an LLM and a command set, which function as its "brain" and "hands" respectively. However, Auto-GPT has limitations in understanding and retaining extensive contextual information due to the token limit of the GPT model it leverages. Integrating Auto-GPT with a vector database like Milvus can enhance its memory and contextual understanding by converting commands and execution results into embeddings and storing them in the vector database. This integration allows for more precise information retrieval, improving the system's ability to generate aligned commands. Despite some limitations, such as unfiltered top-k results and inability to customize the embedding model, Auto-GPT has immense potential when combined with vector databases like Milvus, pushing the boundaries of AI technology and AIGC systems.

Webinar Recap: Boost Your LLM with Private Data Using LlamaIndex

Date published
May 15, 2023

Author(s)
Fendy Feng

Language
English

Word count
1267

Hacker News points
None found.

What's this blog post about?

The popularity of large language models (LLMs) like ChatGPT has demonstrated their capabilities in generating knowledge and reasoning. However, these LLMs are pre-trained on publicly available data, which may not provide specific answers and results relevant to a business. LlamaIndex is one solution that can augment LLMs with private data by providing a simple, flexible, centralized interface connecting external data and LLMs. In a recent webinar, Jerry Liu, Co-founder and CEO of LlamaIndex, discussed how LlamaIndex can boost LLMs with private data.

Two methods to enhance LLMs with private data were presented: fine-tuning and in-context learning. Fine-tuning requires retraining the network with private data but can be costly and lack transparency. In contrast, in-context learning pairs a pre-trained model with external knowledge and a retrieval model to add context to the input prompt.

LlamaIndex is an open-source tool that provides central data management and a query interface for LLM applications. It contains three main components: data connectors for ingesting data from various sources, data indices for structuring data for different use cases, and a query interface for inputting prompts and receiving knowledge-augmented output. LlamaIndex manages the interactions between the language model and private data to provide accurate and desired results, operating like a black box that takes in detailed query descriptions and returns rich responses that include references and actions.

The vector store index is a popular mode of retrieval and synthesis that pairs a vector store with a language model. LlamaIndex provides numerous integrations, including one with Milvus, an open-source vector database capable of handling vast datasets containing millions, billions, or even trillions of vectors. With this integration, Milvus acts as the backend vector store for embeddings and text.

LlamaIndex has various use cases, including semantic search, summarization, text to SQL (structured data), synthesis over heterogeneous data, compare/contrast queries, multi-step queries, exploiting temporal relationships, and recency filtering of outdated nodes.
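The in-context learning step — injecting retrieved private data ahead of the user's question — amounts to prompt assembly. The template wording below is illustrative, not LlamaIndex's actual prompt:

```python
def build_prompt(question, retrieved_chunks):
    """Assemble an in-context-learning prompt: retrieved private-data chunks are
    injected as context ahead of the user's question, so the pre-trained model
    can answer from knowledge it was never trained on."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

This is the step that makes in-context learning cheaper and more transparent than fine-tuning: the private data stays visible in the prompt rather than being baked into model weights.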

Zilliz Cloud: a New Level of Usability and Performance

Date published
May 4, 2023

Author(s)
Sarah Tang

Language
English

Word count
614

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud has released an update that introduces six new features and enhancements, aiming to provide a more robust and cost-effective platform with an enhanced user experience. The latest release includes the Pricing Calculator for better cost estimates, improved system resiliency with data backup and restore on GCP, removal of storage quota for optimal user experience, automatic suspension of inactive databases for credit saving, custom timezone support for more accurate timestamps, and collection renaming for easier database management. Other improvements include a better billing interface, renamed CU types, and additional features to assist users in getting started with Zilliz Cloud.

Milvus 2.2.6: New Features and Updates

Date published
April 28, 2023

Author(s)
Chris Churilo

Language
English

Word count
93

Hacker News points
None found.

What's this blog post about?

Milvus version 2.2.6 has been released with critical issues addressed from version 2.2.5. The new release includes bug fixes and performance enhancements, as detailed in the release notes. Users are advised to upgrade to this version for improved functionality. Key resources include PyPI package, documentation, Docker image, and GitHub release page.

The Fight for AI Supremacy

Date published
April 25, 2023

Author(s)
Filip Haltmayer

Language
English

Word count
1211

Hacker News points
None found.

What's this blog post about?

LangChain is a framework designed to enhance the capabilities of Large Language Models (LLMs) by enabling users to chain together different computations and knowledge. It allows for the creation of domain-specific chatbots, action agents for specific computation, and more. Milvus, an open-source vector database, plays a crucial role in LangChain's integration as it enables efficient storage and retrieval of large documents or collections of documents. The integration involves extending the VectorStore class to implement functions such as add_texts(), similarity_search(), max_marginal_relevance_search(), and from_text(). However, challenges arise due to Milvus' inability to handle JSON natively, which may require additional work when dealing with existing collections or inserting data. Overall, LangChain offers a promising solution for improving LLMs' usefulness by providing working memory and knowledge base integration.

Yet another cache, but for ChatGPT

Date published
April 11, 2023

Author(s)
James Luan

Language
English

Word count
1949

Hacker News points
None found.

What's this blog post about?

ChatGPT is an impressive technology that enables developers to create game-changing applications. However, the performance and cost of large language models (LLMs) are significant issues that hinder their widespread application in various fields. To address this, GPTCache — a cache layer for LLM-generated responses — was developed. This caching layer is similar to Redis and Memcached and can decrease the expense of generating content while providing faster real-time responses. With GPTCache, developers can make their LLM applications up to 100 times faster. The cache reduces the number of ChatGPT calls by taking advantage of the temporal and spatial locality in user access patterns for AIGC applications.

Caching LLM Queries for performance & cost improvements

Date published
April 10, 2023

Author(s)
Chris Churilo

Language
English

Word count
1079

Hacker News points
None found.

What's this blog post about?

GPTCache is an open-source semantic cache designed to improve the efficiency and speed of GPT-based applications by storing responses generated by language models. It allows users to customize the cache according to their needs, including options for embedding functions, similarity evaluation functions, storage location, and eviction policy management. The tool supports multiple popular databases for cache storage and provides a range of vector store options for finding the most similar requests based on extracted embeddings from input requests. GPTCache aims to provide flexibility and cater to a wider range of use cases by supporting multiple APIs and vector stores.
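One of the customizable pieces the post mentions is eviction policy management. A least-recently-used (LRU) policy — one common choice, shown here as an illustrative sketch rather than GPTCache's implementation — can be built on an ordered dictionary:

```python
from collections import OrderedDict


class LRUCacheStore:
    """Eviction-policy sketch: cap the cache at max_size entries and drop the
    least-recently-used entry when the cap is exceeded."""

    def __init__(self, max_size=2):
        self.max_size = max_size
        self.data = OrderedDict()

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # mark as recently used
            return self.data[key]
        return None

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.max_size:
            self.data.popitem(last=False)  # evict the least-recently-used entry
```

Bounding the cache this way keeps lookup fast and storage predictable while retaining the responses most likely to be asked for again.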

New Support for Backup and Restore of Zilliz Cloud Databases

Date published
April 7, 2023

Author(s)
Sarah Tang

Language
English

Word count
479

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud has introduced a Backup and Restore feature, allowing users to easily back up important data and restore it after an unexpected loss. The feature provides an easy-to-use interface, automated backups, secure storage, and efficiency for big data. Access is limited to users subscribing to the Zilliz Cloud Enterprise plan. Future plans include flexible recovery options, point-in-time recovery, and cross-region backup. Backup and Restore is priced by usage at $0.025 per GB, with free storage retention within 30 days.

Accelerate your migration experience from Milvus to Zilliz Cloud

Date published
April 6, 2023

Author(s)
Sarah Tang

Language
English

Word count
398

Hacker News points
None found.

What's this blog post about?

Zilliz has introduced a new migration feature that allows customers to seamlessly move their local Milvus database to the fully managed cloud service, Zilliz Cloud. This feature ensures data safety and security during the migration process. The migration tool is free for Enterprise users subscribed to the Zilliz Cloud enterprise plan. Users can start a 30-day free trial with $100 worth of credit to explore the new features in Zilliz Cloud.

Zilliz Cloud Expands with Multi-Cloud Support

Date published
April 5, 2023

Author(s)
Emily Kurze

Language
English

Word count
483

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud, a vector database-as-a-service, is designed to help developers focus on creating innovative AI applications by handling the infrastructure and storage of embeddings. The platform offers multi-cloud and multi-region availability, currently supporting AWS and Google Cloud with plans for future expansions. Users can quickly scale their vector search storage capacity without re-provisioning hardware and benefit from streamlined procurement, consolidated billing, and leveraging pre-committed AWS spend through the AWS Marketplace. Zilliz Cloud is also available on Google Cloud as an official partner, with more regions and cloud providers planned for future releases to support developers' diverse needs.

ChatGPT+ Vector database + prompt-as-code - The CVP Stack

Date published
April 4, 2023

Author(s)
James Luan

Language
English

Word count
1242

Hacker News points
None found.

What's this blog post about?

Zilliz has introduced OSS Chat, a chatbot designed to provide technical knowledge about open-source projects. Built using OpenAI's ChatGPT and a vector database, the service currently supports Hugging Face, Pytorch, and Milvus but plans to expand to more projects in the future. The new AI stack, called CVP Stack (ChatGPT+Vector database+prompt-as-code), is aimed at overcoming ChatGPT's limitations by using a vector database for accurate information retrieval. OSS Chat demonstrates this approach by leveraging GitHub repositories and their associated docs pages as the source of truth, converting data into embeddings, and storing them in Zilliz. When users interact with OSS Chat, it triggers a similarity search in Zilliz to find relevant matches and feeds the retrieved data into ChatGPT for precise responses.

Zilliz Cloud, the new billion-scale offering

Date published
April 4, 2023

Author(s)
Robert Guo

Language
English

Word count
970

Hacker News points
None found.

What's this blog post about?

Zilliz has announced the general availability of an update to its cloud vector database service, raising the standard for usability, security, performance, and capability. The latest version supports billion-scale vector collections and offers a 2.5x reduction in search latency compared to the original release. Additionally, Zilliz Cloud is now available on Google Cloud Platform (GCP) and AWS Marketplace. New features include rolling upgrades, backup and restore functionality, recycler bin for data security, and database migration toolkits from open-source Milvus.

What's New in Milvus version 2.2.5

Date published
March 30, 2023

Author(s)
Chris Churilo

Language
English

Word count
233

Hacker News points
None found.

What's this blog post about?

Milvus, an open-source vector database, has released version 2.2.5 with new features and improvements. Key updates include a security fix for MinIO (MinIO CVE-2023-28432) by updating to the latest release, and the addition of a First/Random replica selection policy that selects replicas in a round-robin fashion, improving throughput. The release also includes bug fixes and performance enhancements. For more information, check out the release notes or download Milvus to get started.
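The round-robin replica selection described here can be sketched in a few lines — an illustrative toy, not Milvus's implementation:

```python
import itertools


class RoundRobinReplicaSelector:
    """Sketch of round-robin replica selection: each query goes to the next
    replica in the cycle, spreading load evenly across replicas."""

    def __init__(self, replicas):
        self._cycle = itertools.cycle(replicas)

    def pick(self):
        return next(self._cycle)
```

Compared with always picking the first replica, cycling keeps no single replica saturated while the others sit idle, which is how the policy improves throughput.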

ChatGPT retrieval plugin with Zilliz and Milvus

Date published
March 23, 2023

Author(s)
Filip Haltmayer

Language
English

Word count
811

Hacker News points
None found.

What's this blog post about?

OpenAI has open-sourced the code for a knowledge base retrieval plugin, allowing ChatGPT to augment its information by retrieving knowledge-based data from relevant document snippets. The plugin uses OpenAI's text-embedding-ada-002 embeddings model and stores the embeddings into a vector database like Milvus or Zilliz. Enterprises can benefit from this plugin by making their internal documents available to employees through ChatGPT, ensuring accurate and up-to-date information retrieval. The plugin also supports continuous processing and storage of documents from various data sources using incoming webhooks. Additionally, the memory feature allows ChatGPT to remember information from conversations and store it in a vector database for later use.

Milvus support for multiple Index types

Date published
March 23, 2023

Author(s)
Chris Churilo

Language
English

Word count
690

Hacker News points
None found.

What's this blog post about?

Milvus is an open-source vector database that supports eight index types to optimize data querying and retrieval: FLAT, IVF_FLAT, IVF_SQ8, HNSW, IVF_PQ, ANNOY, BIN_FLAT, and BIN_IVF_FLAT. Each index type is best suited to specific scenarios based on factors such as data dimensionality, dataset size, search-efficiency requirements, and available resources. Choosing the right index type can significantly improve search performance in AI applications.
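The trade-off behind the IVF family (IVF_FLAT, IVF_SQ8, IVF_PQ) can be shown with a toy inverted-file search: vectors are pre-assigned to the bucket of their nearest centroid, and a query scans only a few buckets instead of the whole collection. The centroids and buckets below are toy inputs, not Milvus data structures:

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ivf_search(centroids, buckets, query, nprobe=1):
    """Sketch of the inverted-file (IVF) idea: visit only the nprobe buckets whose
    centroids are closest to the query, then do an exact (FLAT-style) scan inside
    them. Larger nprobe trades speed for recall."""
    # Rank buckets by centroid distance and keep the nprobe closest.
    order = sorted(range(len(centroids)), key=lambda i: euclidean(centroids[i], query))
    candidates = [v for i in order[:nprobe] for v in buckets[i]]
    # Exact scan within the probed buckets only.
    return min(candidates, key=lambda v: euclidean(v, query))
```

FLAT corresponds to scanning every bucket; IVF narrows the scan, which is why it wins on large datasets at a small cost in recall.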

What’s New In Milvus 2.3 Beta - 10X faster with GPUs

Date published
March 21, 2023

Author(s)
Chris Churilo

Language
English

Word count
774

Hacker News points
None found.

What's this blog post about?

The Beta release of Milvus 2.3 introduces new features and improvements aimed at boosting the performance of AI-powered applications. Key features include support for GPU acceleration, RAFT-based integration, range search capabilities, mmap file I/O, incremental backups, and change data capture (CDC). These enhancements enable faster and more efficient vector data searches, improved productivity, and better overall performance of AI systems. The release also includes bug fixes and improvements for a smoother user experience.

Milvus Performance Evaluation 2023

Date published
March 17, 2023

Author(s)
Chris Churilo

Language
English

Word count
554

Hacker News points
None found.

What's this blog post about?

Developers often ask how Milvus compares to previous versions for embedding workloads, with concerns about performance degradation. Benchmarks conducted on Milvus v2.2.3 vs. v2.2.0 and v2.0.0 show that the latest version significantly improves search and indexing speeds. Specifically, Milvus 2.2.3 achieved a 2.5x reduction in search latency compared to the original Milvus 2.0.0 release and a 4.5x increase in QPS. The performance evaluation technical paper provides detailed methodology and results. Periodic re-running of benchmarks will update the findings, with all code available on Github for further verification or suggestions.

What’s New In Milvus 2.2.4

Date published
March 17, 2023

Author(s)
Chris Churilo

Language
English

Word count
319

Hacker News points
None found.

What's this blog post about?

Milvus 2.2.4 has been released, featuring resource grouping for QueryNodes to improve performance and better manage resources in multi-tenant scenarios. Additionally, enhancements include collection renaming, Google Cloud Storage support, and a new option (ignore_growing) for search and query APIs. The release also includes bug fixes and performance improvements. For more information, check the release notes or download Milvus to get started.

How Zilliz Cloud Protects Your Data

Date published
March 9, 2023

Author(s)
Frank Liu

Language
English

Word count
1006

Hacker News points
None found.

What's this blog post about?

The text discusses the importance of data protection, security, and availability when moving vector search workloads to the cloud. It highlights three pillars of information security - confidentiality, integrity, and availability. The text also mentions common data management mistakes and how Zilliz Cloud offers features to protect users' data and services by ensuring confidentiality, integrity, and availability.

What’s New In Milvus 2.2.3

Date published
Feb. 27, 2023

Author(s)
Chris Churilo

Language
English

Word count
253

Hacker News points
None found.

What's this blog post about?

Milvus, an open-source vector database, has released version 2.2.3 with new features and improvements. The release includes Rolling Upgrade support for minimizing service disruptions during upgrades and Coordinator High Availability (HA) to ensure quick failure recovery times. Additionally, enhancements have been made to bulk-insert, memory usage reduction, monitoring metrics optimization, and Meta storage performance. However, a breaking change has reduced the maximum number of fields in a collection from 256 to 64. The release also includes bug fixes and improvements.

How to Integrate OpenAI Embedding API with Zilliz Cloud

Date published
Jan. 11, 2023

Author(s)
Frank Liu

Language
English

Word count
562

Hacker News points
None found.

What's this blog post about?

In 2018, Zilliz developed Milvus, a vector database designed to enhance search and storage capabilities. The initial focus was on improving the user experience, reliability, performance, and scalability of the platform. As a result, the Milvus community has grown significantly in terms of users, contributors, and stars (nearing 15000). Recently, the community emphasized the need to expand the vector database ecosystem by incorporating visualizations, tools, connectors, etc., with embedding model integrations being one of the most requested features. To address this demand, Zilliz will provide integration examples for Milvus and Zilliz Cloud with open-source or paid embedding models. Additionally, they have launched Towhee, a project that integrates hundreds of open-source models, embedding APIs, and in-house models to create end-to-end search pipelines backed by Milvus or Zilliz Cloud. The company plans to continue its support for the Milvus project while also focusing on integration and partnerships with the broader machine learning ecosystem.


2022

The Next Stop for Vector Databases: 8 Predictions for 2023

Date published
Dec. 9, 2022

Author(s)
James Luan

Language
English

Word count
1550

Hacker News points
None found.

What's this blog post about?

In 2022, there was significant growth in the field of vector databases with multiple open-source products and cloud-based services emerging. This trend is expected to continue into 2023 as capital markets invest in these technologies. Key predictions for 2023 include differentiation and specialization among vector databases, a move towards a unified query interface, further integration of vector databases with traditional ones, significant cost reduction in vector databases, the emergence of the first serverless vector database, rise of open-source tools for vector databases, early adoption of AI for Database (AI4DB) in vector databases, and the second commercial company emerging from open-source Milvus. These developments indicate a promising year for vector databases, making them more cost-effective and efficient.

All You Need to Know About ANN Machine Learning

Date published
Dec. 1, 2022

Author(s)
Zilliz

Language
English

Word count
2060

Hacker News points
None found.

What's this blog post about?

An Artificial Neural Network (ANN) is a machine learning model inspired by the structure and functions of the human brain. It consists of an input layer, several hidden layers, and an output layer. The most common types of ANNs include feedforward neural networks, convolutional neural networks, and recurrent neural networks. Applications of ANNs span industries such as speech recognition, image recognition, text classification, forecasting, and social media analysis. While ANNs offer advantages like parallel processing capability and broad applicability, they also face challenges in scalability, testing, verification, and integration into modern environments. Vector databases, such as the one offered by Zilliz, are crucial for managing the massive embedding vectors generated by deep neural networks and other machine learning models.
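As a toy sketch of the feedforward structure mentioned above — data flowing from inputs through a hidden layer to an output — one fully connected layer is just a weighted sum followed by an activation. The weights below are made up for illustration, not a trained model:

```python
import math

def dense_layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, inputs)) + b)))
        for row, b in zip(weights, biases)
    ]

# Toy 2-input -> 2-hidden -> 1-output feedforward pass.
hidden = dense_layer([1.0, 0.5], [[0.4, -0.2], [0.3, 0.8]], [0.1, -0.1])
output = dense_layer(hidden, [[0.7, -0.5]], [0.2])
print(output)  # a single sigmoid activation between 0 and 1
```

Stacking such layers, with weights learned by backpropagation, is what produces the embedding vectors a vector database stores.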

Understanding K-means Clustering in Machine Learning

Date published
Oct. 26, 2022

Author(s)
Zilliz

Language
English

Word count
2219

Hacker News points
None found.

What's this blog post about?

K-means clustering is an unsupervised machine learning algorithm that groups objects based on their attributes. It is widely used across industries for tasks such as customer segmentation, recommendation engines, and similarity search. The algorithm works by computing the distance of each data point from the geometric center (centroid) of its cluster and reassigning the point whenever it lies closer to the centroid of another cluster. K-means clustering is useful in areas such as image processing, information retrieval, recommendation engines, and data compression. The number of clusters can be chosen using methods like the elbow method or the silhouette method. Zilliz offers a one-stop solution to the challenges of handling unstructured data, especially for enterprises building AI/ML applications that leverage vector similarity search.
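The assign-then-recompute loop at the heart of k-means can be sketched in a few lines of plain Python (a simplified illustration of one iteration, not a production implementation):

```python
import math

def kmeans_step(points, centroids):
    """One k-means iteration: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its assigned points."""
    clusters = {i: [] for i in range(len(centroids))}
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    new_centroids = []
    for i, members in clusters.items():
        if members:
            new_centroids.append([sum(dim) / len(members) for dim in zip(*members)])
        else:
            new_centroids.append(centroids[i])  # keep an empty cluster in place
    return new_centroids

points = [[1.0, 1.0], [1.2, 0.8], [8.0, 8.0], [8.2, 7.8]]
print(kmeans_step(points, [[0.0, 0.0], [10.0, 10.0]]))
```

Running this step repeatedly until the centroids stop moving yields the final clustering.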

What is K-Nearest Neighbors (KNN) Algorithm in Machine Learning? An Essential Guide

Date published
Oct. 17, 2022

Author(s)
Zilliz

Language
English

Word count
1634

Hacker News points
None found.

What's this blog post about?

The K-Nearest Neighbors (KNN) algorithm is a supervised machine learning technique used for classification and regression problems. It is categorized as a lazy learner, meaning it stores the training dataset without going through a separate training stage. KNN works by estimating the likelihood that an unobserved data point belongs to a given class based on its nearest neighbors in the dataset. The algorithm uses a voting mechanism: the class with the most votes among the neighbors is assigned to the data point. Different distance metrics can be used to determine which points count as neighbors, such as Euclidean, Manhattan, Hamming, Cosine, Jaccard, and Minkowski distances. KNN can be improved by normalizing data to the same scale, tuning hyperparameters like K and the distance metric, and using techniques like cross-validation to test different values of K. The algorithm requires no training time, is simple to tune, and adapts easily to multi-class problems, but it may not perform well with high-dimensional or unbalanced data.
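The voting mechanism is compact enough to sketch directly (a toy Euclidean-distance version; the other metrics listed above would slot into the same structure):

```python
import math
from collections import Counter

def knn_predict(train, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(range(len(train)),
                     key=lambda i: math.dist(train[i], query))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(train, labels, [0.5, 0.5]))  # → "a"
print(knn_predict(train, labels, [5.5, 5.5]))  # → "b"
```

Note that every prediction scans the full training set — the "lazy learner" cost the summary alludes to.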

From Text to Image: Fundamentals of CLIP

Date published
Oct. 4, 2022

Author(s)
Rentong Guo

Language
English

Word count
1508

Hacker News points
None found.

What's this blog post about?

This blog introduces the fundamentals of CLIP, an advanced text-to-image service developed by OpenAI. It explains how search algorithms and semantic similarity are used to match texts with images. The process involves mapping the semantics of texts and images into a high-dimensional space where vectors representing similar semantics have small distances between them. A typical text-to-image service consists of three parts: request side (texts), search algorithm, and underlying databases (images). CLIP helps in creating a unified semantic space for both texts and images, enabling efficient cross-modal search. The next article will demonstrate how to build a prototype text-to-image service using these concepts.
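Once texts and images live in the same semantic space, matching reduces to a vector-similarity computation such as cosine similarity. A toy sketch with made-up 3-dimensional "embeddings" (real CLIP vectors have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors in a shared space."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# A caption embedding close to one image embedding, far from another.
text = [0.9, 0.1, 0.0]
image_match = [0.8, 0.2, 0.1]
image_other = [0.0, 0.1, 0.9]
print(cosine_similarity(text, image_match) > cosine_similarity(text, image_other))  # → True
```

A text-to-image service ranks candidate images by exactly this kind of score, delegating the search over millions of vectors to a database such as Milvus.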

Anatomy of A Cloud Native Vector Database Management System

Date published
Sept. 15, 2022

Author(s)
Xiaomeng Yi

Language
English

Word count
3051

Hacker News points
None found.

What's this blog post about?

The paper "Manu: A Cloud Native Vector Database Management System" discusses the design philosophy and principles behind Manu, a cloud native database purpose built for vector data management. The authors identify four common business requirements for vector databases that are difficult to address under the initial framework: ever-changing requirements, flexible consistency policy, component-level elasticity, and simpler transaction processing model. To meet these needs, they propose five broad objectives for Manu: long-term evolvability, tunable consistency, good elasticity, high availability, and high performance. The paper then delves into the architecture of Manu, which adopts a four-layer design that enables decoupling of read from write, stateless from stateful, and storage from computing. It also explains the data processing workflow inside Manu, including data insertion, index building, and query execution. The authors conduct an overall system performance evaluation and compare Manu with other vector search systems in terms of query performance. They conclude by discussing future directions for research into cloud-native vector database management systems.

ArXiv Scientific Papers Vector Similarity Search with Milvus 2.1

Date published
Aug. 9, 2022

Author(s)
Marie Stephen Leo

Language
English

Word count
3034

Hacker News points
None found.

What's this blog post about?

In this post, the author demonstrates how to build a semantic similarity search engine for scientific papers using open-source tools: the arXiv dataset, Dask, sentence-transformers, and the Milvus vector database. The process involves setting up an environment, downloading the arXiv dataset from Kaggle, loading the data into Python using Dask, implementing the semantic similarity search application with Milvus, and running queries to find similar papers. This approach can serve as a template for building any NLP semantic similarity search engine, not just one for scientific papers. The author also provides an overview of the SPECTER model, which is used to convert texts into embeddings.

Introducing Zilliz Cloud : Fully-managed Vector Database Cloud Service in Preview

Date published
Aug. 3, 2022

Author(s)
Zilliz

Language
English

Word count
497

Hacker News points
None found.

What's this blog post about?

Zilliz Cloud, a fully-managed vector database cloud service built around Milvus, has been launched in preview mode for early access application. The service is designed to manage and process feature vectors at scale and in real-time, addressing the needs of modern AI algorithms that represent the deep semantics of unstructured data with feature vectors. Zilliz Cloud supports much-desired Milvus features while relieving users from managing their own data infrastructure. The service is designed for enterprise-level AI development and offers a fully-managed experience, high performance, elastic deployment, and enterprise-level security. Currently in private preview, interested parties can apply for early access by filling out a form on the Zilliz website.

Podcast: Using AI to Supercharge Data-Driven Applications with Zilliz

Date published
June 16, 2022

Author(s)
Rosie Zhang

Language
English

Word count
260

Hacker News points
None found.

What's this blog post about?

In the latest episode of That Digital Show, Frank Liu from Zilliz discusses how AI and machine learning are being used to extract value from unstructured data. Traditional databases struggle to handle the large volumes of unstructured data that make up around 80% of the world's data. Milvus, the open-source vector database developed by Zilliz, helps developers understand and analyze this type of data more effectively. The conversation also covers challenges in data operations, how databases have evolved to tackle these issues, and some interesting use cases for Milvus.

Visualize Reverse Image Search with Feder

Date published
May 25, 2022

Author(s)
Min Tian, transcreated by Angela Ni.

Language
English

Word count
1249

Hacker News points
None found.

What's this blog post about?

Reverse image search is an application of vector search, or approximate nearest neighbor search, in which indexes are built to accelerate search over large datasets. This article discusses how to visualize reverse image search with Feder, using the IVF_FLAT index as an example. The IVF_FLAT index divides vectors in the vector space into clusters based on vector distance. During a vector similarity search, the user provides a target vector and the search-parameter configuration, and Feder visualizes the whole search process. In this use case, the VOC 2012 dataset is used with an nlist of 256 to build an IVF_FLAT index. The system first calculates the distance between the target vector and the centroid of each cluster to find the nearest clusters, then compares the target vector against all vectors in the nprobe nearest clusters in a fine search. Feder provides two visualization modes for the fine search: one based on cluster and vector distance, and another based on dimensionality-reduction projection. The values of the index-building parameters influence how the vector space is divided, and the nprobe parameter can be tuned to trade off search efficiency against accuracy.
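The coarse-then-fine procedure can be sketched in plain Python (a toy model of IVF_FLAT, not the Milvus implementation): the coarse step picks the nprobe nearest cluster centroids, and the fine step scans only the vectors inside those clusters:

```python
import math

def ivf_search(clusters, query, nprobe=1, top_k=1):
    """IVF-style search over a {centroid: [vectors]} mapping.
    Coarse step: pick the nprobe nearest centroids.
    Fine step: rank only the vectors in those clusters."""
    centroids = list(clusters)
    probed = sorted(centroids, key=lambda c: math.dist(c, query))[:nprobe]
    candidates = [v for c in probed for v in clusters[c]]
    return sorted(candidates, key=lambda v: math.dist(v, query))[:top_k]

clusters = {
    (0.0, 0.0): [[0.1, 0.2], [0.3, 0.1]],
    (9.0, 9.0): [[8.8, 9.1], [9.2, 8.9]],
}
print(ivf_search(clusters, [0.2, 0.2], nprobe=1))  # → [[0.1, 0.2]]
```

Raising nprobe scans more clusters, improving accuracy at the cost of speed — the tradeoff Feder lets users see visually.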

Feder: A Powerful Visualization Tool for Vector Similarity Search

Date published
May 6, 2022

Author(s)
Min Tian, transcreated by Angela Ni.

Language
English

Word count
1145

Hacker News points
None found.

What's this blog post about?

Feder is a tool that enables users to visualize approximate nearest neighbor search (ANNS) algorithms, specifically indexes like IVF_FLAT and HNSW. It helps users understand the structure of different indexes, how data are organized using each type of index, and how parameter configuration influences the indexing structure. Feder currently supports the HNSW from hnswlib but plans to support more indexes in the future. The tool is built with JavaScript and Python, allowing users to visualize index structures and search processes under IPython Notebook or as an HTML file for web service use.

Manage Your Milvus Vector Database with One-click Simplicity

Date published
March 10, 2022

Author(s)
Zilliz

Language
English

Word count
851

Hacker News points
None found.

What's this blog post about?

Zhen Chen and Licen Wang have written an article about Attu, an open-source graphical user interface (GUI) designed specifically for Milvus 2.0, an AI-oriented vector database system. The article provides a step-by-step guide on performing a vector similarity search using Attu and Milvus 2.0. Attu offers installers for Windows, macOS, and Linux, as well as plugins for extending it with customized functionalities. It also provides complete system topology information for easier understanding and administration of a Milvus instance. The article demonstrates how to install Attu via GitHub or Docker and how to use features such as the Overview page, Collection page, Vector Search page, and System View page. The authors encourage users to develop their own Attu plugins to suit their application scenarios and invite feedback to help optimize Attu for a better user experience.

Zilliz Triumphed in Billion-Scale ANN Search Challenge of NeurIPS 2021

Date published
Jan. 21, 2022

Author(s)
Zilliz

Language
English

Word count
379

Hacker News points
None found.

What's this blog post about?

On December 6th, 2021, Zilliz's research team won first place in the Billion-Scale Approximate Nearest Neighbor (ANN) Search Challenge at NeurIPS 2021 with their disk performance optimization algorithm. The challenge focused on leveraging ANN search on billion-scale datasets and attracted participants from top institutions and companies. Zilliz's solution, BBAnn, performed exceptionally well on the SimSearchNet++ dataset, retrieving 88.573% of all relevant results compared to a baseline of 16.274%. The team plans to bring this work into Milvus, an open-source vector database with applications in new drug discovery, recommender systems, chatbots, and more.


2021

Get started with Milvus_CLI

Date published
Dec. 31, 2021

Author(s)
ChenZhuanghong & Chenzhen

Language
English

Word count
697

Hacker News points
None found.

What's this blog post about?

Milvus_CLI is a command-line tool designed to simplify the use of the Milvus vector database. It supports various operations such as database connection, data import and export, and vector calculation using interactive commands in shells. The latest version of Milvus_CLI includes features like support for all platforms, online and offline installation with pip, portability, built on Milvus SDK for Python, help docs, and auto-complete. Users can install Milvus_CLI either online or offline using the provided commands. The tool also provides various usage examples such as connecting to Milvus, creating a collection, listing collections, calculating vector distances, and deleting a collection.

Accelerating Candidate Generation in Recommender Systems Using Milvus paired with PaddlePaddle

Date published
Nov. 26, 2021

Author(s)
Yunmei

Language
English

Word count
2670

Hacker News points
None found.

What's this blog post about?

This article introduces an open-source vector database, Milvus, paired with PaddlePaddle, a deep learning platform, to address the issues faced in developing recommender systems. The basic workflow of a recommender system involves candidate generation and ranking stages. The product recommender system project uses three components: MIND (Multi-Interest Network with Dynamic Routing for Recommendation at Tmall), PaddleRec, and Milvus. MIND is an algorithm developed by Alibaba Group that processes multiple interests of one user during the candidate generation stage. PaddleRec is a large-scale search model library for recommendation, while Milvus is a vector database featuring a cloud-native architecture used for vector similarity search and vector management in this project. The system implementation involves data processing, model training, model testing, generating product item candidates, and data storage and search.

Frustrated with New Data? Our Vector Database can Help

Date published
Nov. 8, 2021

Author(s)
Zilliz

Language
English

Word count
3015

Hacker News points
None found.

What's this blog post about?

In the era of Big Data, unstructured data represents roughly 80-90% of all stored data. Traditional analytical methods fail to pull useful information out of these growing data lakes. To address this issue, researchers are focusing on building general-purpose vector database systems that can handle high-dimensional vector data and support advanced query semantics. The article discusses the design and challenges faced when building such a system, including optimizing the cost-to-performance ratio relative to load, automated system configuration and tuning, and supporting advanced query semantics. It also introduces Milvus, an AI-oriented general-purpose vector database system developed by Zilliz's Research and Development team.

Zilliz CEO Shared Start-up Experience in 2021 SYNC

Date published
Oct. 30, 2021

Author(s)
Zilliz

Language
English

Word count
362

Hacker News points
None found.

What's this blog post about?

The SYNC 2021 conference, hosted by PingWest under the theme "Reshape the Future", recently concluded successfully. In the session "New Opportunities: How Asian Entrepreneurs Change the World", Charles Xie, CEO and founder of Zilliz, shared his experience founding Zilliz and building the business. Other presenters included Brad Bao, Co-founder and Chairman of Lime; Jun Pei, CEO and Co-founder of Cepton; and Lake Dai, Partner at LDV Partners. Charles Xie is an experienced database expert who worked at Oracle's US headquarters before founding Zilliz, a company specializing in AI-oriented unstructured data processing and analysis systems. With $43 million in financing led by Hillhouse Ventures, Zilliz set a record for the largest single Series B financing in open-source infrastructure software. Charles encouraged young entrepreneurs to challenge themselves and stay true to their ideas. Zilliz is currently expanding its market and hiring talent in Silicon Valley.

Building a Video Analysis System with Milvus Vector Database

Date published
Oct. 9, 2021

Author(s)
Shiyu Chen

Language
English

Word count
1231

Hacker News points
None found.

What's this blog post about?

The text discusses the "tip of the tongue" (TOT) phenomenon experienced while watching movies and introduces an idea to build a video content analysis engine based on Milvus. It explains how object detection, feature extraction, and vector analysis can be used in this process. Key technologies mentioned include OpenCV for frame extraction, YOLOv3 for object detection, ResNet-50 for feature extraction, and Milvus as a vector database for analyzing extracted feature vectors. The text also provides an overview of the deployment process and concludes with the benefits of using Milvus in various fields such as image processing, computer vision, natural language processing, speech recognition, recommender systems, and new drug discovery.

Combine AI Models for Image Search using ONNX and Milvus

Date published
Sept. 26, 2021

Author(s)
Zilliz

Language
English

Word count
1014

Hacker News points
None found.

What's this blog post about?

Open Neural Network Exchange (ONNX) is an open format that represents machine learning models, enabling AI developers to use models with various frameworks, tools, runtimes, and compilers. Milvus is an open-source vector database designed for massive unstructured data analysis. This article introduces how to use multiple models for image search based on ONNX and Milvus, using VGG16 and ResNet50 models as examples. The process involves converting pre-trained AI models into the ONNX format, extracting feature vectors from images using these models, storing vector data in Milvus, and searching for similar images based on Euclidean distance calculations between vectors.

DiskANN: A Disk-based ANNS Solution with High Recall and High QPS on Billion-scale Dataset

Date published
Sept. 24, 2021

Author(s)
Zilliz

Language
English

Word count
3689

Hacker News points
None found.

What's this blog post about?

"DiskANN: A Disk-based ANNS Solution with High Recall and High QPS on Billion-scale Dataset" is a paper published in NeurIPS 2019 that introduces an efficient method for index building and search on billion-scale datasets using a single machine. The proposed scheme, called DiskANN, builds a graph-based index on the dataset SIFT-1B with a single machine having 64GB of RAM and a 16-core CPU, achieving over 95% recall@1 at more than 5000 queries per second (QPS) with an average latency lower than 3ms. The paper also introduces Vamana, a new graph-based algorithm that minimizes the number of disk accesses and enhances search performance. DiskANN effectively supports search on large-scale datasets by overcoming memory restrictions in a single machine.

DNA Sequence Classification based on Milvus

Date published
Sept. 6, 2021

Author(s)

Language
English

Word count
1305

Hacker News points
None found.

What's this blog post about?

Mengjia Gu, a data engineer at Zilliz and a member of the Milvus open-source community, discusses the application of vector databases in DNA sequence classification. Traditional sequence alignment methods are unsuitable for large datasets, making vectorization a more efficient choice. The open-source vector database Milvus is designed to store vectors of nucleic acid sequences and perform high-efficiency retrieval, reducing research costs. By converting long DNA sequences into k-mer lists, the data can be vectorized and used in machine learning models for gene classification. Milvus' approximate nearest neighbor search enables efficient management of unstructured data, recalling similar results among trillions of vectors within milliseconds. The author provides a demo of building a DNA sequence classification system with Milvus, highlighting its potential applications in genetic research and practice.
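The k-mer step described above is a simple sliding window over the sequence; a minimal sketch (k=4 chosen arbitrarily for illustration):

```python
def kmers(sequence, k=4):
    """Slide a window of length k over a DNA sequence,
    producing the overlapping k-mers used for vectorization."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

print(kmers("ATGCGAT", k=4))  # → ['ATGC', 'TGCG', 'GCGA', 'CGAT']
```

Counting or embedding these k-mers turns a variable-length sequence into a fixed-length feature vector that Milvus can index and search.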

Zilliz attended VLDB Workshop 2021

Date published
Aug. 27, 2021

Author(s)
Zilliz

Language
English

Word count
510

Hacker News points
None found.

What's this blog post about?

In 2021, significant advancements were made in the database industry. Zilliz, a leading company in this field, shared its latest research progress and achievements at VLDB Workshop 2021. The company introduced Milvus, an open-source vector database developed with machine learning methods. Milvus is designed for handling massive feature vectors and provides a complete framework for vector data update, indexing, and similarity search. It has been widely used in artificial intelligence applications and its performance surpasses that of other products. The research team behind Milvus also presented their design concept for the 2.0 version, which includes cloud-native, log-as-data, and unified batch-and-stream processing features.

Paper Reading|HM-ANN: When ANNS Meets Heterogeneous Memory

Date published
Aug. 26, 2021

Author(s)
Jigao Luo

Language
English

Word count
1789

Hacker News points
None found.

What's this blog post about?

The research paper "HM-ANN: Efficient Billion-Point Nearest Neighbor Search on Heterogeneous Memory" proposes a novel graph-based similarity search algorithm called HM-ANN. The algorithm accounts for both memory heterogeneity and data heterogeneity in modern hardware, enabling billion-scale similarity search on a single machine without compression technologies. The paper discusses the challenges existing approximate nearest neighbor (ANN) search solutions face due to limited dynamic random-access memory (DRAM) capacity and presents HM-ANN as an efficient alternative that achieves low search latency and high search accuracy, especially when the dataset cannot fit into DRAM.

Building a Personalized Product Recommender System with Vipshop and Milvus

Date published
July 29, 2021

Author(s)
Zilliz

Language
English

Word count
1655

Hacker News points
None found.

What's this blog post about?

Vipshop, an online discount retailer in China, built a personalized search recommendation system to optimize its customers' shopping experience. The core function of an e-commerce search recommendation system is to retrieve suitable products from a large catalog and display them to users according to their search intent and preferences. To achieve this, Vipshop used Milvus, an open-source vector database that supports distributed deployment, multi-language SDKs, and read/write separation, in contrast to the commonly used standalone Faiss. The overall architecture consists of two main parts: a write process and a read process. Data such as product information, user search intent, and user preferences are all unstructured data, which were converted into feature vectors using various deep learning models and imported into Milvus. With the excellent performance of Milvus, Vipshop's search recommendation system can efficiently query the top-K vectors similar to the target vectors, with an average recall latency of about 30 ms.

Audio Retrieval Based on Milvus

Date published
July 27, 2021

Author(s)
Shiyu Chen

Language
English

Word count
1090

Hacker News points
None found.

What's this blog post about?

Sound is an information-dense data type; 83% of Americans ages 12 or older listened to terrestrial radio in a given week in 2020. Sound can be classified into three categories: speech, music, and general waveform audio. Audio retrieval systems are used to search and monitor online media in real time to prevent intellectual property infringement and to classify audio data. Feature extraction is crucial for audio similarity search, with deep learning-based models showing lower error rates than traditional ones. Milvus, an open-source vector database, can efficiently process feature vectors extracted by AI models and provides various common vector similarity calculations. The article demonstrates how to use an audio retrieval system powered by Milvus for processing non-speech audio data.

Quickly Test and Deploy Vector Search Solutions with the Milvus 2.0 Bootcamp

Date published
July 13, 2021

Author(s)
Zilliz

Language
English

Word count
1218

Hacker News points
None found.

What's this blog post about?

The new and improved Milvus 2.0 bootcamp offers updated guides and easier to follow code examples for testing, deploying, and building vector search solutions. Users can stress test their systems against 1 million and 100 million dataset benchmarks, explore popular vector similarity search use cases such as image, video, audio, recommendation system, molecular search, and question answering system. The bootcamp also provides quick deployment solutions for fully built applications on any system and scenario-specific notebooks to easily deploy pre-configured applications. Additionally, users can learn how to deploy Milvus in different environments like Mishards, Kubernetes, and load balancing setups.

Building a Milvus Cluster Based on JuiceFS

Date published
June 15, 2021

Author(s)
Changjian Gao and Jingjing Jia

Language
English

Word count
1094

Hacker News points
None found.

What's this blog post about?

Collaborations between open-source communities have led to the integration of Milvus, the world's most popular vector database, and JuiceFS, a high-performance distributed POSIX file system designed for cloud-native environments. JuiceFS is commonly used for solving big data challenges, building AI applications, and log collection. A Milvus cluster built with JuiceFS works by splitting upstream requests using Mishards to cascade the requests down to its sub-modules. Benchmark testing reveals that JuiceFS offers major advantages over Amazon Elastic File System (EFS), including higher IOPS and I/O throughput in both single- and multi-job scenarios. The Milvus cluster built on JuiceFS offers high performance and flexible storage capacity, making it a valuable tool for AI applications.

Building an Intelligent News Recommendation System Inside Sohu News App

Date published
June 7, 2021

Author(s)
Zilliz

Language
English

Word count
1409

Hacker News points
None found.

What's this blog post about?

Sohu, a NASDAQ-listed Chinese online media company, has built an intelligent news recommendation system inside its news app using semantic vector search. The system uses user profiles built from browsing history to fine-tune personalized content recommendations over time, improving user experience and engagement. It leverages Milvus, an open-source vector database built by Zilliz, to process massive datasets efficiently and accurately, reducing memory usage during search and supporting high-performance deployments. The recommendation system relies on the Deep Structured Semantic Model (DSSM), which uses two neural networks to represent user queries and news articles as vectors. It also utilizes BERT-as-service for encoding news articles into semantic vectors, extracting semantically similar tags from user profiles, and identifying misclassified short text. The use of Milvus has significantly improved the real-time performance of Sohu's news recommendation system and increased efficiency in identifying misclassified short text.

Accelerating Compilation 2.5X with Dependency Decoupling & Testing Containerization

Date published
May 28, 2021

Author(s)
Zhifeng Zhang

Language
English

Word count
1514

Hacker News points
None found.

What's this blog post about?

The text discusses the challenges faced during large-scale AI or MLOps projects due to complex dependencies and evolving compilation environments. It highlights common issues such as prohibitively long compilation times, complex compilation environments, and third-party dependency download failures. To address these issues, the article recommends decoupling project dependencies and implementing testing containerization; applying these measures to Milvus, an open-source embeddings similarity search project, cut average compile time by 60%. The text also provides detailed steps on how to decouple dependencies and optimize compilation between components, as well as within components. It concludes with further optimization measures such as regular cleanup of cache files and selective compile caching. Additionally, it emphasizes the benefits of leveraging containerized testing for reducing errors and improving stability and reliability.

Accelerating AI in Finance with Milvus, an Open-Source Vector Database

Date published
May 19, 2021

Author(s)
Zilliz

Language
English

Word count
674

Hacker News points
None found.

What's this blog post about?

The financial industry has been an early adopter of open-source software for big data processing and analytics, with banks using platforms like Apache Hadoop, MySQL, MongoDB, and PostgreSQL. With the rise of artificial intelligence (AI), vector databases such as Milvus have become essential tools in managing vector data and enabling similarity searches on massive datasets. Applications of AI in finance include algorithmic trading, portfolio optimization, Robo-advising, virtual customer assistants, market impact analysis, regulatory compliance, and stress testing. Key areas where vector data is leveraged by banks and financial companies are enhancing customer experience with banking chatbots, boosting sales with recommender systems, and analyzing earnings reports and other unstructured financial data with semantic text mining.

Building a Search by Image Shopping Experience with VOVA and Milvus

Date published
May 13, 2021

Author(s)
Zilliz

Language
English

Word count
976

Hacker News points
None found.

What's this blog post about?

VOVA, an e-commerce platform focusing on affordability and user experience, has integrated image search functionality into its platform using Milvus. The system works in two stages: data import and query. It uses YOLO for target detection and ResNet for feature vector extraction from images. Milvus is used to conduct vector similarity searches within the extensive product image library. VOVA's shop by image tool allows users to search for products using uploaded photos, enhancing the overall shopping experience on their platform.

Making with Milvus: Detecting Android Viruses in Real Time for Trend Micro

Date published
April 23, 2021

Author(s)
Zilliz

Language
English

Word count
1459

Hacker News points
None found.

What's this blog post about?

Cybersecurity is a growing concern, with 86% of companies expressing data privacy concerns in 2020. Trend Micro, a global leader in hybrid cloud security, has developed an Android virus detection system called Trend Micro Mobile Security to protect users from malware. The system compares APKs (Android application packages) from the Google Play Store with a database of known malware using similarity search. Initially, Trend Micro used MySQL for its virus detection system but quickly outgrew it as the number of APKs with nefarious code in its database increased. Trend Micro then began searching for alternative vector similarity search solutions and eventually chose Milvus, an open-source vector database created by Zilliz. Milvus is highly flexible, reliable, and fast, offering a comprehensive set of intuitive APIs that allow developers to choose the ideal index type for their scenario. It also provides distributed solutions and monitoring services. Trend Micro's mobile security system uses Thash values to differentiate APKs, converting those values into vectors for similarity retrieval. Milvus conducts instantaneous vector similarity searches on massive vector datasets converted from Thash values, with corresponding Sha256 values queried in MySQL. The system architecture also includes a Redis caching layer that maps Thash values to Sha256 values, significantly reducing query time. The monitoring and alert system is compatible with Prometheus and uses Grafana to visualize various performance metrics. With the help of Milvus, the system was able to meet the performance criteria set by Trend Micro.
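The Redis caching layer described above can be sketched in a few lines of Python. This is a minimal stand-in, not Trend Micro's implementation: a plain dict plays the role of Redis, and a stub function plays the role of the slow MySQL lookup; all names here are illustrative.

```python
# Cache mapping Thash -> Sha256 (a dict stands in for Redis).
cache = {}

def mysql_lookup(thash):
    # Stand-in for the slow relational query that maps a Thash
    # value to its Sha256 value in MySQL.
    return "sha256-of-" + thash

def get_sha256(thash):
    if thash not in cache:              # cache miss: query "MySQL" once...
        cache[thash] = mysql_lookup(thash)
    return cache[thash]                 # ...then serve repeats from memory
```

Repeated lookups for the same Thash never touch the backing store again, which is where the query-time savings come from.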

Build Semantic Search at Speed

Date published
April 19, 2021

Author(s)
Elizabeth Edmiston

Language
English

Word count
1023

Hacker News points
None found.

What's this blog post about?

Semantic search is an effective tool to help customers and employees find relevant products or information. However, slow semantic search can hinder its usefulness. To address this issue, Lucidworks has implemented semantic search using a vector search approach: text is encoded into numerical vectors, and a vector search engine like Milvus quickly finds the best matches for customer searches or user queries. Milvus builds on FAISS, the similarity search library Facebook developed for its own machine learning initiatives. The combination of Milvus and other components allows semantic search to be fast and efficient while handling large datasets.

How to Make 4 Popular AI Applications with Milvus

Date published
April 8, 2021

Author(s)
Zilliz

Language
English

Word count
1141

Hacker News points
None found.

What's this blog post about?

Milvus is an open-source vector database that supports efficient search of massive vector datasets created by AI models. It offers comprehensive APIs and support for multiple index libraries, accelerating machine learning application development and MLOps. Zilliz, the company behind Milvus, has developed demos showcasing its use in natural language processing (NLP), reverse image search, audio search, and computer vision. These include an AI-powered chatbot using BERT for NLP, a reverse image search system with VGG for feature extraction, an audio similarity search system with PANNs for pattern recognition, and a video object detection system leveraging OpenCV, YOLOv3, and ResNet50.

Operationalize AI at Scale with Software 2.0, MLOps, and Milvus

Date published
March 31, 2021

Author(s)
Zilliz

Language
English

Word count
1405

Hacker News points
None found.

What's this blog post about?

MLOps is a systemic approach to AI model life cycle management, which involves monitoring a machine learning model throughout its lifecycle and governing everything from underlying data to the effectiveness of a production system that relies on a particular model. It is necessary for building, maintaining, and deploying AI applications at scale. Key components of MLOps include continuous integration/continuous delivery (CI/CD), model development environments (MDE), champion-challenger testing, model versioning, model store and rollback. Milvus is an open-source vector data management platform that supports the transition to Software 2.0 and manages model life cycles with MLOps.

Making With Milvus: AI-Infused Proptech for Personalized Real Estate Search

Date published
March 18, 2021

Author(s)
Zilliz

Language
English

Word count
855

Hacker News points
None found.

What's this blog post about?

The application of artificial intelligence (AI) in real estate is transforming home search processes. With the help of AI, tech-savvy real estate professionals can assist clients in finding suitable homes faster and simplify property purchasing. The coronavirus pandemic has accelerated interest, adoption, and investment in property technology (proptech), indicating its growing role in the industry. This article explores how Beike utilized vector similarity search to develop a house hunting platform that provides personalized results and recommends listings in near real-time. Vector similarity search is useful for various AI, deep learning, and traditional vector calculation scenarios, as it helps make sense of unstructured data such as images, video, audio, behavior data, documents, and more. Beike uses Milvus, an open-source vector database, to manage its AI real estate platform. The platform converts property listing data into feature vectors, which are then fed into Milvus for indexing and storage. This enables efficient similarity searches based on user queries, improving the home search experience for house hunters and helping agents close deals faster.

Extracting Event Highlights Using iYUNDONG Sports App

Date published
March 15, 2021

Author(s)
Zilliz

Language
English

Word count
1164

Hacker News points
None found.

What's this blog post about?

iYUNDONG is an Internet company that aims to engage sport lovers and participants of events such as marathon races. It builds artificial intelligence (AI) tools that can analyze media captured during sporting events to automatically generate highlights. One key feature of the iYUNDONG sports App, called "Find me in motion," allows users who took part in a sport event to retrieve their photos or video clips from a massive media dataset by uploading a selfie. The app uses Milvus, an open-source vector database, to power its image retrieval system and achieve quick and large-scale vector search. iYUNDONG chose Milvus for its ability to support multiple indexes, efficiently reduce RAM usage, and regularly release new versions with powerful out-of-the-box features.

Making with Milvus: AI-Powered News Recommendation Inside Xiaomi's Mobile Browser

Date published
March 9, 2021

Author(s)
Zilliz

Language
English

Word count
1264

Hacker News points
None found.

What's this blog post about?

Xiaomi, the multinational electronics manufacturer, has built an AI-powered news recommendation engine into its mobile web browser using Milvus, an open-source vector database. The application's core data management platform is designed for similarity search and artificial intelligence. This system uses AI to suggest personalized content and cut through the noise of news by recommending relevant articles based on user search history and interests. Xiaomi selected BERT as the language representation model in its recommendation engine, which can be used as a general natural language understanding (NLU) model for various natural language processing tasks. The AI-powered content recommendation system relies on three key components: vectorization, ID mapping, and approximate nearest neighbor (ANN) service.

Building Personalized Recommender Systems with Milvus and PaddlePaddle

Date published
Feb. 24, 2021

Author(s)
Zilliz

Language
English

Word count
1090

Hacker News points
None found.

What's this blog post about?

This article discusses the creation of personalized recommender systems using Milvus and PaddlePaddle. The recommendation system is designed to help users find relevant information or products by analyzing their historical behavior. The MovieLens 1M dataset (ml-1m) is used as an example, which contains 1 million ratings of about 4,000 movies by 6,000 users. A fusion recommendation model is implemented using PaddlePaddle's deep learning platform, and the movie feature vectors generated by the model are stored in Milvus, a vector similarity search engine. The user features are used as target vectors for searching within Milvus to obtain recommended movies. The main process involves training the model, preprocessing data, and implementing the personalized recommender system with Milvus. This combination of technologies allows for efficient and accurate recommendations based on user interests and needs.

How we used semantic search to make our search 10x smarter

Date published
Jan. 29, 2021

Author(s)
Rahul Yadav

Language
English

Word count
1060

Hacker News points
None found.

What's this blog post about?

Tokopedia has introduced similarity search to improve the relevance of its product search results. The platform uses Elasticsearch for keyword-based search, which ranks products based on their frequency and proximity in a document. To enhance meaning comparison, they adopted vector representation, encoding words by their probable context. Milvus was chosen as the feature vector search engine due to its ease of use and support for more indexes. The platform deployed one writable node, two read-only nodes, and one Mishards middleware instance in Google Cloud Platform (GCP) using Milvus Ansible. Indexing plays a crucial role in accelerating similarity searches on large datasets by organizing data efficiently. Tokopedia plans to improve the model's performance for obtaining embeddings and run multiple learning models simultaneously for future experiments like image search and video search.
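The core ranking step, comparing a query embedding against product embeddings by cosine similarity, can be sketched as follows. The four-dimensional vectors are toy values for illustration, not Tokopedia's actual embeddings:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity compares direction rather than magnitude, so two
    # texts about the same topic score high even if their lengths differ.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical product embeddings (in practice these come from a trained model).
products = {
    "red running shoes": [0.90, 0.10, 0.00, 0.20],
    "crimson sneakers":  [0.85, 0.15, 0.05, 0.25],
    "blue ceramic mug":  [0.05, 0.90, 0.80, 0.10],
}
query = [0.88, 0.12, 0.02, 0.22]  # embedding of the user's search phrase

ranked = sorted(products,
                key=lambda name: cosine_similarity(query, products[name]),
                reverse=True)
```

Unlike keyword matching, both shoe products rank far above the mug even though neither shares the exact wording of the query; that is the "meaning comparison" the post describes, with Milvus doing this ranking at scale over millions of vectors.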

Vector Similarity Search Hides in Plain View

Date published
Jan. 5, 2021

Author(s)
Zilliz

Language
English

Word count
1542

Hacker News points
None found.

What's this blog post about?

Artificial intelligence (AI) has the potential to revolutionize various industries and tasks. One example is race timing, where AI can replace traditional chip timers with video cameras and machine learning algorithms. This technology, known as vector similarity search, involves converting unstructured data into feature vectors using neural networks, then calculating similarities between these vectors. Vector similarity search has applications in e-commerce, security, recommendation engines, chatbots, image or video search, and chemical similarity search. Open-source software like Milvus and publicly available datasets make AI more accessible to developers and businesses.


2020

Building a Graph-based Recommendation System with Milvus, PinSage, DGL, and MovieLens Datasets

Date published
Dec. 1, 2020

Author(s)
Zilliz

Language
English

Word count
1415

Hacker News points
None found.

What's this blog post about?

This article explains how to build a graph-based recommendation system using open-source tools such as Milvus, PinSage, and DGL. Recommendation systems are algorithms that make relevant suggestions to users based on their preferences and behaviors. Two common approaches to building recommendation systems are collaborative filtering and content-based filtering. In this example, the author uses the MovieLens datasets to build a user-movie bipartite graph for classification purposes. The PinSage model is then used to generate embedding vectors of pins as feature vectors of the acquired movie information. These embeddings are loaded into Milvus, which returns corresponding IDs and enables vector similarity search. Finally, the system recommends movies most similar to user search queries.

Making Sense of Unstructured Data with Zilliz Founder and CEO Charles Xie

Date published
Nov. 19, 2020

Author(s)
Zilliz

Language
English

Word count
999

Hacker News points
None found.

What's this blog post about?

Charles Xie, founder and CEO of open-source software company Zilliz, discusses the importance of unstructured data processing and analysis platforms in today's world. Roughly 80% of all data is unstructured, including images, videos, audio, molecular structures, and gene sequences, yet only about 1% of it gets analyzed due to processing complexities; Zilliz aims to extract value from unstructured data by building accessible tools for everyone. The company innovates through an open-source software development model, transparency in its culture, and a focus on teamwork. Despite the challenges posed by the COVID-19 pandemic, Zilliz has managed to maintain its operations and continue developing its products. Xie emphasizes the importance of trusting oneself and the people around them in handling stress and uncertainty. The company plans to stay ahead of competitors by focusing on breadth and depth in its offerings and establishing itself as a global leader in AI-powered unstructured data science software.

ArtLens AI: Share Your View

Date published
Sept. 11, 2020

Author(s)
Anna Faxon and Haley Kedziora

Language
English

Word count
911

Hacker News points
None found.

What's this blog post about?

The Cleveland Museum of Art (CMA) has launched ArtLens AI: Share Your View, an interactive tool that matches photos taken by users with art from the museum's collection. This initiative aims to provide a fun and engaging way for people to connect with art during these uncertain times. Users can upload their images on the CMA website or mention @ArtLensAI on Twitter to receive matching artwork. The tool uses machine learning and open-source vector similarity engine Milvus to recognize shapes, patterns, and objects in users' photos and find surprising matches from the museum's collection.

Item-based Collaborative Filtering for Music Recommender System

Date published
Sept. 7, 2020

Author(s)
Zilliz

Language
English

Word count
1286

Hacker News points
None found.

What's this blog post about?

Wanyin App, an AI-based music sharing community, implemented an item-based collaborative filtering (I2I CF) recommender system to sort out music of interest based on users' previous behavior. The system converts songs into mel-frequency cepstrum (MFC), designs a convolutional neural network (CNN) to extract feature embeddings, and uses Milvus as the feature vector similarity search engine for embedding similarity search. This approach helps in generating music recommendations through embedding similarity search and filtering duplicate songs accurately.

4 Steps to Building a Video Search System

Date published
Aug. 29, 2020

Author(s)
Zilliz

Language
English

Word count
856

Hacker News points
None found.

What's this blog post about?

The text describes a video search system that uses image similarity to retrieve videos from a repository. It explains the process of converting videos into embeddings, which involves extracting key frames and converting their features into vectors. The workflow includes importing videos using OpenCV library, cutting each video into frames, and inserting extracted vectors (embeddings) into Milvus. For searching, it uses the same VGG model to convert input images into feature vectors and inserts them into Milvus to find similar vectors. It then retrieves corresponding videos from Minio based on Redis correlations. The article also provides a sample dataset of 100,000 GIF files from Tumblr for building an end-to-end solution for video search. Deployment steps are outlined using Docker images and docker-compose.yml configuration file. Finally, the system's interface is displayed, allowing users to input target images and retrieve similar videos.
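The data flow above can be sketched without the real components. In the post, OpenCV extracts key frames, a VGG model turns them into feature vectors, Milvus indexes the vectors, and Redis maps vector IDs back to source videos; in this minimal sketch, plain dicts and hand-written 2-D "embeddings" stand in for all of them:

```python
import math

frame_index = {}   # vector_id -> frame embedding (Milvus stand-in)
id_to_video = {}   # vector_id -> video name     (Redis stand-in)

def insert_frame(vector_id, embedding, video_name):
    # Data-import stage: store the key frame's vector and remember
    # which video it came from.
    frame_index[vector_id] = embedding
    id_to_video[vector_id] = video_name

def search(query_embedding):
    # Query stage: find the stored frame nearest to the query image's
    # embedding, then map its ID back to the source video.
    best = min(frame_index,
               key=lambda vid: math.dist(frame_index[vid], query_embedding))
    return id_to_video[best]

insert_frame(1, [0.10, 0.90], "cat.gif")
insert_frame(2, [0.80, 0.20], "sunset.gif")
```

A query image whose embedding is close to frame 2 retrieves "sunset.gif"; the production system does the same with high-dimensional VGG vectors and an ANN index instead of a brute-force scan.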

The Journey to Optimizing Billion-scale Image Search (2/2)

Date published
Aug. 10, 2020

Author(s)
Zilliz

Language
English

Word count
1987

Hacker News points
None found.

What's this blog post about?

The second-generation search-by-image system uses a CNN + Milvus solution. Feature extraction is done with a convolutional neural network (CNN): the VGG16 model extracts image features, with Keras and TensorFlow used for the technical implementation. Milvus, an open-source vector search engine, is employed to store and manage feature vectors, calculate similarity, and return vector data in the nearest-neighbor range. The system also includes image processing techniques such as normalization, bytes conversion, and black-border removal.

The Journey to Optimizing Billion-scale Image Search (1/2)

Date published
Aug. 4, 2020

Author(s)
Zilliz

Language
English

Word count
1155

Hacker News points
None found.

What's this blog post about?

Yupoo Picture Manager, which manages tens of billions of images for its users, has an urgent need to quickly locate images within its growing gallery. To address this, the company developed a search-by-image service that underwent two evolutions. The first-generation system used the perceptual hash (pHash) algorithm for feature extraction and Elasticsearch for similarity calculation. However, it struggled with images that had been modified, such as by cropping or editing. The second-generation system introduced a new underlying technology to overcome these limitations.
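The hashing idea behind the first-generation system can be illustrated with the simpler average hash (real pHash additionally applies a discrete cosine transform after resizing). Each image is reduced to a fixed-length bit string, and similarity is the Hamming distance between bit strings; the 8x8 pixel grids below are hand-made toy data:

```python
def average_hash(pixels):
    # pixels: an 8x8 grid of grayscale values. Each pixel above the mean
    # becomes a 1 bit, each below becomes a 0 bit, giving a 64-bit hash.
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return ''.join('1' if p > mean else '0' for p in flat)

def hamming(h1, h2):
    # Fewer differing bits means more similar images.
    return sum(c1 != c2 for c1, c2 in zip(h1, h2))

bright_top = [[200] * 8] * 4 + [[50] * 8] * 4   # light top, dark bottom
dark_top   = [[50] * 8] * 4 + [[200] * 8] * 4   # the inverted image
```

Identical images hash to a Hamming distance of 0 while the inverted image flips every bit, but a crop or edit shifts many pixels relative to the mean, which is exactly why the hash-based first generation broke down on modified images.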

Building an AI-Powered Writing Assistant for WPS Office

Date published
July 28, 2020

Author(s)
Zilliz

Language
English

Word count
1244

Hacker News points
None found.

What's this blog post about?

WPS Office is a productivity tool developed by Kingsoft, used by over 150 million users worldwide. The company's AI department built a smart writing assistant using semantic matching algorithms such as intent recognition and text clustering. This tool exists both as a web application and WeChat mini program that helps users quickly create outlines, individual paragraphs, and entire documents by inputting a title and selecting up to five keywords. The writing assistant's recommendation engine uses Milvus, an open-source similarity search engine, to power its core vector processing module. Building the WPS Office smart writing assistant involves making sense of unstructured textual data, using the TFIDF model for feature extraction, extracting features with a bi-directional LSTM-CNNs-CRF deep learning model, creating sentence embeddings using Infersent, and storing and querying vectors with Milvus. AI isn't replacing writers; it's helping them write more efficiently and effectively.

Building an Intelligent QA System with NLP and Milvus

Date published
May 12, 2020

Author(s)
Zilliz

Language
English

Word count
789

Hacker News points
None found.

What's this blog post about?

Milvus is an open-source vector search engine that can power question answering (QA) systems. This post uses Google's BERT model together with Milvus to create a Q&A bot based on semantic understanding. The system architecture includes data preparation, generating feature vectors using BERT, importing them into Milvus and PostgreSQL, and retrieving answers. The article provides step-by-step instructions for building an online Q&A system for the insurance industry. With high performance and scalability, Milvus can support a corpus of up to hundreds of millions of texts.

How Does Milvus Schedule Query Tasks

Date published
March 2, 2020

Author(s)
Zilliz

Language
English

Word count
1304

Hacker News points
None found.

What's this blog post about?

Milvus is an open-source vector database that supports massive-scale data search. It schedules query tasks by dividing the data into multiple data blocks and creating SearchTasks for each block. The tasks are assigned to computing devices based on their estimated completion times, with priority given to devices with shorter times. The results of each task are then merged to form the final search result. To optimize performance, Milvus uses an LRU cache to store frequently accessed data blocks and overlaps data loading and computation stages for better resource usage. It also considers different transmission speeds between GPUs when scheduling tasks. Future work includes exploring query optimization techniques and handling more complex hardware environments.
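The block-then-merge pattern described above can be sketched in miniature. This is an illustrative sketch, not Milvus internals: 1-D "vectors" and absolute difference stand in for real vectors and distance metrics, and the per-block searches run sequentially here, whereas Milvus dispatches them to different devices:

```python
import heapq

def search_block(block, query, k):
    # One SearchTask: brute-force top-k within a single data block.
    return heapq.nsmallest(k, block, key=lambda v: abs(v - query))

def scheduled_search(blocks, query, k):
    # Each block's task could run on a separate CPU/GPU; the per-block
    # top-k results are then merged into the final top-k, as the
    # scheduler's result-merging stage does.
    candidates = [v for block in blocks for v in search_block(block, query, k)]
    return heapq.nsmallest(k, candidates, key=lambda v: abs(v - query))

blocks = [[1, 9, 17], [4, 12, 20], [5, 14, 22]]
top3 = scheduled_search(blocks, query=10, k=3)
```

Because each block only has to surface its own top-k, the merge step never needs more than k * num_blocks candidates, which keeps the final reduction cheap no matter how large the full dataset is.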

How to Select Index Parameters for IVF Index

Date published
Feb. 26, 2020

Author(s)
Zilliz

Language
English

Word count
661

Hacker News points
None found.

What's this blog post about?

The post Best Practices for Milvus Configuration introduces recommended settings for key parameters in Milvus clients to improve search performance. The index_file_size parameter affects data storage and search efficiency: increasing its value generally improves search performance, but files that grow too large may fail to load into GPU or CPU memory. For the nlist and nprobe parameters, a trade-off between precision and efficiency is necessary when determining their values; the optimal values depend on the dataset's size and distribution.
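The nlist/nprobe trade-off can be made concrete with a bare-bones IVF sketch. This is an illustration of the idea, not Milvus code: vectors are bucketed under their nearest centroid (nlist buckets in total), and a search scans only the nprobe closest buckets. The centroids and 2-D vectors below are hand-picked rather than learned by k-means:

```python
import math

def nearest_centroids(centroids, query, n):
    # Indices of the n centroids closest to the query.
    order = sorted(range(len(centroids)),
                   key=lambda i: math.dist(centroids[i], query))
    return order[:n]

def build_ivf(vectors, centroids):
    # nlist == len(centroids): each vector goes into the inverted
    # list of its nearest centroid.
    lists = [[] for _ in centroids]
    for v in vectors:
        lists[nearest_centroids(centroids, v, 1)[0]].append(v)
    return lists

def ivf_search(lists, centroids, query, nprobe):
    # Only the nprobe closest clusters get scanned: raising nprobe
    # improves recall at the cost of more distance computations.
    probed = nearest_centroids(centroids, query, nprobe)
    candidates = [v for i in probed for v in lists[i]]
    return min(candidates, key=lambda v: math.dist(v, query))

centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = [[0.5, 0.5], [1.0, 0.0], [9.5, 10.0], [10.0, 9.0]]
lists = build_ivf(vectors, centroids)
```

With nprobe=1 only one of the two buckets is examined per query, which is fast but can miss a true nearest neighbor sitting just across a cluster boundary; that is the precision/efficiency trade-off the post's parameter advice is about.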

Accelerating New Drug Discovery

Date published
Feb. 6, 2020

Author(s)
Zilliz

Language
English

Word count
682

Hacker News points
None found.

What's this blog post about?

Milvus is an open-source similarity search engine designed to handle massive-scale feature vectors. It can be used in conjunction with RDKit, a chemoinformatics software suite, for high-performance chemical structure similarity searches. The system generates Morgan fingerprints using RDKit and then imports them into Milvus to build a chemical structure database. With different chemical fingerprints, Milvus can perform substructure search, similarity search, and exact search. This approach is faster and more efficient than traditional methods for discovering potentially available compounds in drug discovery research.


2019

Accelerating Similarity Search on Really Big Data with Vector Indexing

Date published
Dec. 5, 2019

Author(s)
Zilliz

Language
English

Word count
1849

Hacker News points
None found.

What's this blog post about?

This article discusses the role of vector indexing in accelerating similarity search and machine learning applications, particularly those that involve large datasets. It covers different types of vector inverted file (IVF) indexes and their suitability for various scenarios. The IVF_FLAT index is best suited for searching relatively small (million-scale) datasets when 100% recall is required. For scenarios where disk, CPU, or GPU memory resources are limited, the IVF_SQ8 index type is a better option as it can convert each FLOAT to UINT8 by performing scalar quantization, reducing memory consumption by 70-75%. The new hybrid GPU/CPU approach, IVF_SQ8H, offers even faster query performance compared to IVF_SQ8 with no loss in search accuracy. Finally, the article introduces Milvus, an open-source vector data management platform that can power similarity search applications across various fields.
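The scalar quantization behind IVF_SQ8 can be sketched as a min/max mapping of each float to an 8-bit code. This is a simplified illustration of the principle, not Milvus's actual encoder:

```python
def sq8_encode(values):
    # Map each FLOAT onto a UINT8 code in [0, 255]: one byte per
    # dimension instead of four, roughly the 70-75% memory saving
    # the post cites.
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255 or 1.0   # avoid div-by-zero on constant input
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def sq8_decode(codes, lo, scale):
    # Decoding recovers the values only approximately; the small
    # precision loss is the price of the memory reduction.
    return [lo + c * scale for c in codes]

values = [0.0, 0.5, 1.0]
codes, lo, scale = sq8_encode(values)
restored = sq8_decode(codes, lo, scale)
```

Searches then run on the compact codes (or their decoded approximations), which is why IVF_SQ8 fits datasets into GPU memory that IVF_FLAT cannot.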


By Matt Makai. 2021-2024.