Fluid compute: How we built serverless servers
Blog post from Vercel
Vercel's Fluid compute is a serverless execution model that, paired with Active CPU pricing, now serves over 45 billion requests per week and cuts customer costs by up to 95% by minimizing cold starts and improving resource utilization. The work grew out of the React team's development of React Server Components and the Next.js team's App Router, and it required significant infrastructure changes, including a new secure TCP-based protocol for streaming responses from AWS Lambda.

That protocol also allows multiple concurrent requests to be multiplexed onto the same Lambda instance, breaking from the traditional one-invocation-per-instance model. The system's Rust-based core gathers metrics from each instance and adapts to its load profile to keep resource utilization optimal, while a new compute-resolver service routes requests so that existing connections are reused.

Active CPU pricing adds further savings by charging only for active CPU time and provisioned memory. Together these changes make Vercel suitable for a wide range of applications, from frontends to AI apps, and Fluid compute is now the default for new projects.
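
To make the multiplexing idea concrete, here is a minimal sketch in Python of a router that prefers reusing a warm instance over starting a new one. It is a toy model, not Vercel's implementation: the `Instance` and `Router` classes, the `max_concurrency` cap, and the least-loaded selection rule are all assumptions made for illustration, and the real system is described as adapting to per-instance metrics rather than using a fixed cap.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Hypothetical function instance that can serve several requests at once."""
    name: str
    max_concurrency: int  # assumed fixed cap, purely illustrative
    in_flight: int = 0

    def has_capacity(self) -> bool:
        return self.in_flight < self.max_concurrency


@dataclass
class Router:
    """Toy router: reuse warm capacity first, start a new instance only if needed."""
    max_concurrency: int = 3
    instances: list = field(default_factory=list)
    cold_starts: int = 0

    def acquire(self) -> Instance:
        # Candidates are warm instances that can still accept a request.
        candidates = [i for i in self.instances if i.has_capacity()]
        if candidates:
            # Assumed policy: pick the least-loaded warm instance.
            chosen = min(candidates, key=lambda i: i.in_flight)
        else:
            # No warm capacity left, so a new instance (a cold start) is needed.
            chosen = Instance(f"instance-{len(self.instances)}", self.max_concurrency)
            self.instances.append(chosen)
            self.cold_starts += 1
        chosen.in_flight += 1
        return chosen

    def release(self, instance: Instance) -> None:
        # Called when a request finishes and frees a slot on its instance.
        instance.in_flight -= 1


def simulate(concurrent_requests: int, max_concurrency: int) -> int:
    """Return how many cold starts a burst of simultaneous requests causes."""
    router = Router(max_concurrency=max_concurrency)
    for _ in range(concurrent_requests):
        router.acquire()
    return router.cold_starts


if __name__ == "__main__":
    # One invocation per instance versus multiplexing three requests per instance.
    print("one-per-instance cold starts:", simulate(5, 1))
    print("multiplexed cold starts:", simulate(5, 3))
```

Setting `max_concurrency=1` reproduces the traditional one-invocation-per-instance model, so comparing the two `simulate` calls shows why sharing an instance across requests reduces the number of cold starts for the same burst of traffic.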