
How We Built DigitalOcean Inference Router

Blog post from DigitalOcean

Post Details
Company: DigitalOcean
Date Published:
Author: Adil Hafeez
Word Count: 3,365
Language: English
Hacker News Points: -
Summary

DigitalOcean's Inference Router, developed by Adil Hafeez and his team, addresses the inefficiency of sending every task in an AI workflow to a single model. It routes each request to a model chosen for the task's requirements, cost, and latency. The system is powered by the Plano engine, which uses a 30B Mixture-of-Experts model fine-tuned for task detection; the post reports that it outperforms models such as GPT-5.1 in routing accuracy. Because each request is matched automatically to the most suitable model, costs fall and performance improves without complex routing logic embedded in application code. The Inference Router offers preset configurations for common workflows, supports custom routing tasks, and uses a ranking engine that draws on live cost and latency data to select among candidate models. Because routing happens at the infrastructure level, developers have less to integrate, and the approach scales to agentic AI systems running on DigitalOcean's platform.
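
The two-stage idea in the summary (detect the task, then rank candidate models on live cost and latency) can be illustrated with a minimal Python sketch. This is not DigitalOcean's or Plano's implementation or API. The names `ModelStats`, `CANDIDATES`, `rank`, and `route`, the model names, the metric fields, and the weighted-sum scoring are all assumptions made for illustration, and the task label is passed in directly where the real system would use its task-detection model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelStats:
    """Hypothetical live metrics for one candidate model."""
    name: str
    cost_per_1k_tokens: float  # illustrative value, not real pricing
    p50_latency_ms: float      # illustrative value, not a measurement


# Hypothetical mapping from a detected task to the models allowed to serve it.
CANDIDATES = {
    "code_generation": ["large-model", "medium-model"],
    "summarization": ["medium-model", "small-model"],
}


def rank(models, cost_weight=0.5, latency_weight=0.5):
    """Order models by a weighted sum of min-max normalized cost and latency.

    Lower scores are better. The scoring rule is an assumption for this sketch.
    """
    def normalize(values):
        lo, hi = min(values), max(values)
        return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

    costs = normalize([m.cost_per_1k_tokens for m in models])
    latencies = normalize([m.p50_latency_ms for m in models])
    scored = [
        (cost_weight * c + latency_weight * l, m)
        for c, l, m in zip(costs, latencies, models)
    ]
    # Sort on the score only, so ModelStats objects are never compared.
    return [m for _, m in sorted(scored, key=lambda pair: pair[0])]


def route(task, stats, cost_weight=0.5, latency_weight=0.5):
    """Return the name of the best-ranked candidate model for a task label."""
    pool = [stats[name] for name in CANDIDATES.get(task, []) if name in stats]
    if not pool:
        raise LookupError(f"no candidate model available for task {task!r}")
    return rank(pool, cost_weight, latency_weight)[0].name
```

Calling `route("summarization", stats)` with a `stats` dictionary keyed by model name returns the candidate with the lowest combined score. Shifting `cost_weight` and `latency_weight` changes which trade-off wins, and the application code only ever supplies a task label and reads back a model name.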