How AI Gateway runs on Fluid compute
Blog post from Vercel
AI Gateway is a Node.js service that connects applications to many AI models through a single interface, processing billions of tokens daily. It runs on Fluid compute, which lets a single instance handle many invocations concurrently instead of spinning up a separate serverless instance per request, improving both resource efficiency and cost.

Requests reach AI Gateway through Vercel's global delivery network: Anycast routing steers traffic to the nearest Point of Presence (PoP), keeping communication with AI providers low-latency and high-throughput. Because most of a gateway request is spent waiting on upstream model APIs, Active CPU pricing cuts costs further: full CPU rates apply only while the service is actively computing, and idle wall time is billed at a lower memory-only rate.

Vercel also provides native observability, with detailed real-time metrics on performance, provider health, and cost, so the service stays reliable and resilient as model APIs fluctuate. Together, this lets developers ship AI features quickly without managing provider connections or the underlying compute.
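The "single interface to many models" idea can be sketched as a small routing layer. This is an illustrative example, not Vercel's actual implementation: the provider table, endpoints, and `resolveModel` helper are all hypothetical.

```typescript
// Hypothetical sketch of unified model routing: a model ID like
// "openai/gpt-4o" is split into a provider (with its own endpoint and
// auth scheme) and a provider-specific model name. Endpoints shown are
// illustrative only.

type ProviderConfig = { baseUrl: string; authHeader: string };

const providers: Record<string, ProviderConfig> = {
  openai: { baseUrl: "https://api.openai.com/v1", authHeader: "Authorization" },
  anthropic: { baseUrl: "https://api.anthropic.com/v1", authHeader: "x-api-key" },
};

// Resolve a unified "provider/model" ID into its routing target.
function resolveModel(modelId: string): { provider: ProviderConfig; model: string } {
  const slash = modelId.indexOf("/");
  if (slash === -1) throw new Error(`Expected "provider/model", got "${modelId}"`);
  const providerName = modelId.slice(0, slash);
  const provider = providers[providerName];
  if (!provider) throw new Error(`Unknown provider: ${providerName}`);
  return { provider, model: modelId.slice(slash + 1) };
}
```

Callers see one consistent model namespace; only the gateway knows each provider's endpoint and authentication details.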