How We Achieved 99.99% Reliability At Vapi

Post Details

Company

Vapi

Date Published

Aug. 12, 2025

Author

Abhishek Sharma

Word Count

1,037

Company Posts That Month

5

Language

English

Hacker News Points

-

Post removed?

No

Source URL

vapi.ai/blog/how-we-achieved-99-99-reliability-at-vapi

Summary

Vapi overhauled its infrastructure to raise uptime from 99.9% to 99.99%, focusing on minimizing downtime and improving resilience. The work included migrating its database from Supabase to Neon for better stability, implementing AWS Aurora for redundancy, and adding a caching layer that serves 80% of database requests to improve speed and reliability. To address telephony issues, Vapi moved its SIP infrastructure to auto-scaling groups, which eliminated bottlenecks and absorbed traffic spikes. Its "fallbacks everywhere" philosophy treats every external provider as a potential failure point and backs each one with automatic failover. Deployments were made safer through a multi-cluster architecture with a Canary Manager that manages traffic and automatically rolls back faulty updates. AWS Lambda burst workers were introduced to handle voice traffic spikes, using a custom proxy for secure communication with the Kubernetes cluster. Vapi also improved business logic reliability by integrating Temporal for durable execution of critical operations, and added process isolation, circuit breakers, and comprehensive monitoring to prevent failures. These improvements resulted in a 97% reduction in dropped calls, rapid failovers, and automated responses to provider outages, giving developers a robust foundation for building trustworthy applications.
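To make the "fallbacks everywhere" and circuit-breaker ideas from the summary concrete, here is a minimal Python sketch. It is not Vapi's implementation: the `CircuitBreaker`, `Provider`, and `call_with_fallback` names, the thresholds, and the provider functions are all hypothetical, chosen only to illustrate trying providers in order and temporarily skipping one that keeps failing.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Sequence, Tuple


class CircuitBreaker:
    """Opens after max_failures consecutive errors; permits a trial call
    once reset_after seconds have passed (illustrative defaults)."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allows_request(self) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        # Open: only allow a trial call after the cool-down has elapsed.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()  # (re)open the breaker


@dataclass
class Provider:
    name: str                   # hypothetical label, e.g. "primary"
    call: Callable[[Any], Any]  # function that performs the request
    breaker: CircuitBreaker = field(default_factory=CircuitBreaker)


def call_with_fallback(providers: Sequence[Provider], request: Any) -> Tuple[str, Any]:
    """Try each provider in priority order, skipping those whose breaker is open.

    Returns (provider_name, result) from the first provider that succeeds.
    """
    errors: List[str] = []
    for provider in providers:
        if not provider.breaker.allows_request():
            errors.append(f"{provider.name}: skipped (breaker open)")
            continue
        try:
            result = provider.call(request)
        except Exception as exc:  # broad on purpose: any provider error triggers failover
            provider.breaker.record_failure()
            errors.append(f"{provider.name}: {exc!r}")
            continue
        provider.breaker.record_success()
        return provider.name, result
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

In this sketch a caller would pass providers in priority order, for example `call_with_fallback([primary, backup], request)`. The breaker keeps a failing provider out of the request path, so later calls go straight to the fallback instead of waiting on repeated errors.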

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
Serverless	5	610	170	73	-31%
Kubernetes	2	986	177	85	-38%
Real-time	1	4,334	965	217	-7%
