Building an AI Gateway on Fastly Compute
Blog post from Fastly
AI applications often fail for mundane reasons: a provider goes down, or a model name is hardcoded deep in the application, and these problems compound as AI workloads grow into complex multi-step processes. This post introduces a proof-of-concept Edge AI Gateway built on Fastly Compute that addresses these challenges by placing a policy-driven routing layer between applications and large language model (LLM) providers. The application sends a standard request; the gateway classifies it at the edge and selects the appropriate provider and model based on factors such as complexity and cost, without any changes to the application itself.

The system runs on Fastly's low-latency Compute platform, using WebAssembly for fast cold starts and secure sandboxing. A classification model, Mercury 2, makes the routing decision quickly, which saves cost and latency by reserving expensive models for requests that actually need them. Routing policies live in Fastly's KV Store, so they can be updated without redeploying the service, and provider credentials are managed securely in Fastly's Secret Store.

Although still a proof of concept, the gateway shows how multi-provider AI operations can be run more efficiently, with future capabilities including improved failover and caching.
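To make the classify-then-route flow concrete, here is a minimal sketch of how it might look in a Fastly Compute service written against the JavaScript/TypeScript SDK. The backend names (`classifier`, `openai`, `small-model-provider`), the endpoint URLs, the response shape of the classifier, and the `classifyPrompt` helper are assumptions for illustration, not the implementation from the original post.

```typescript
/// <reference types="@fastly/js-compute" />

// Hypothetical provider target; backend names must be configured on the Fastly service.
interface RouteTarget {
  backend: string; // Fastly backend name, e.g. "openai" (assumed)
  url: string;     // provider endpoint
  model: string;   // model to request
}

// Ask a small classifier model at the edge how "hard" the prompt is.
// The classifier backend and its response shape are assumptions for this sketch.
async function classifyPrompt(prompt: string): Promise<"simple" | "complex"> {
  const resp = await fetch("https://classifier.example.com/classify", {
    backend: "classifier",
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  const { label } = (await resp.json()) as { label: "simple" | "complex" };
  return label;
}

async function handleRequest(event: FetchEvent): Promise<Response> {
  // The client sends a standard OpenAI-style chat completion request.
  const incoming = (await event.request.json()) as {
    messages: { role: string; content: string }[];
  };
  const prompt = incoming.messages.map((m) => m.content).join("\n");

  // Route simple prompts to a smaller, cheaper model; hard ones to a larger one.
  const label = await classifyPrompt(prompt);
  const target: RouteTarget =
    label === "simple"
      ? { backend: "small-model-provider", url: "https://small.example.com/v1/chat/completions", model: "small-model" }
      : { backend: "openai", url: "https://api.openai.com/v1/chat/completions", model: "gpt-4o" };

  // Forward the original request with the model chosen at the edge.
  return fetch(target.url, {
    backend: target.backend,
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ ...incoming, model: target.model }),
  });
}

addEventListener("fetch", (event) => event.respondWith(handleRequest(event)));
```

Here the `backend` option on `fetch` is the Fastly Compute extension that directs the request to a named backend defined on the service; the client keeps sending one standard request format regardless of which provider ultimately serves it.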
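The same sketch can be extended to pull routing policy from the KV Store and credentials from the Secret Store, which is what allows policies to change without a redeploy. Again, the store names (`ai-gateway-policies`, `ai-gateway-secrets`), the keys, and the policy JSON shape are assumptions used only to illustrate the pattern.

```typescript
/// <reference types="@fastly/js-compute" />
import { KVStore } from "fastly:kv-store";
import { SecretStore } from "fastly:secret-store";

// Assumed shape of a routing policy document stored in the KV Store.
interface RoutingPolicy {
  routes: { label: string; backend: string; url: string; model: string }[];
  fallback: { backend: string; url: string; model: string };
}

// Load the current routing policy; editing the KV Store entry changes routing
// behaviour without redeploying the Compute service.
async function loadPolicy(): Promise<RoutingPolicy | null> {
  const store = new KVStore("ai-gateway-policies"); // store name is an assumption
  const entry = await store.get("routing-policy");
  return entry ? ((await entry.json()) as RoutingPolicy) : null;
}

// Fetch the provider API key from the Secret Store rather than baking it
// into code or configuration files.
async function providerApiKey(name: string): Promise<string> {
  const secrets = new SecretStore("ai-gateway-secrets"); // store name is an assumption
  const secret = await secrets.get(name);
  if (!secret) throw new Error(`missing secret: ${name}`);
  return secret.plaintext();
}

// Example use inside the handler from the previous sketch:
//   const policy = await loadPolicy();
//   const route = policy?.routes.find((r) => r.label === label) ?? policy?.fallback;
//   const apiKey = await providerApiKey(route.backend);
//   ...then send the provider request with an "authorization: Bearer <apiKey>" header.
```

Keeping the policy as data in the KV Store rather than as code is what makes the gateway "policy-driven": operators can reroute traffic between providers or models by updating a single KV entry, while the Secret Store keeps provider API keys out of the application and its configuration.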