Scaling Semantic Search at Vectara: Corpora
Vectara's platform integrates the subsystems needed for neural retrieval, providing scalability and reliability through intuitive APIs and replication across multiple availability zones. The company has achieved over 99.9% customer uptime for its query serving infrastructure while handling workloads as high as 40 QPS. Recently, Vectara successfully stress-tested its infrastructure with a customer account requiring 1 million corpora. The work addressed bottlenecks such as incremental loading of corpora, parallelism in retrieval from object storage subsystems, and efficient encoding and transmission of account metadata. With these improvements, Vectara was able to bring an account replica of 400,000 corpora online in under 5 minutes without causing downtime. This ability to support hundreds of thousands of corpora in a single account distinguishes the platform from other semantic search solutions on the market.
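
To make two of these bottlenecks concrete, the following is a minimal Python sketch of how incremental loading and parallel retrieval from object storage can be combined. It is an illustration only, not Vectara's implementation: the names load_replica, batched, fetch, and fake_store are hypothetical, the object store is simulated with an in-memory dictionary, and the batch size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List


def batched(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` corpus ids."""
    it = iter(ids)
    while batch := list(islice(it, size)):
        yield batch


def load_replica(
    corpus_ids: Iterable[str],
    fetch: Callable[[str], bytes],
    batch_size: int = 1000,
    workers: int = 32,
) -> Iterator[Dict[str, bytes]]:
    """Load corpora batch by batch, fetching each batch in parallel.

    One dict is yielded per batch, so the caller can start serving the
    corpora loaded so far instead of waiting for the whole account.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batched(corpus_ids, batch_size):
            # pool.map preserves input order, so zip pairs each id
            # with the payload fetched for it.
            yield dict(zip(batch, pool.map(fetch, batch)))


# Hypothetical in-memory stand-in for an object storage subsystem.
fake_store = {f"corpus-{i}": f"index-{i}".encode() for i in range(2500)}

loaded: Dict[str, bytes] = {}
batch_sizes: List[int] = []
for part in load_replica(
    fake_store.keys(), fake_store.__getitem__, batch_size=1000, workers=8
):
    loaded.update(part)  # corpora in `loaded` are usable from this point on
    batch_sizes.append(len(part))
```

Threads fit this sketch because object storage reads are I/O bound. In a real system, fetch would wrap a storage client call, and failed fetches would need retry handling, which is omitted here.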