This article covers deploying Ultravox, a multimodal LLM designed to reduce latency in voice applications. By integrating with Cerebrium's serverless AI infrastructure, developers can build and deploy responsive voice applications with minimal operational overhead. Unlike traditional voice AI architectures, Ultravox feeds audio directly into the LLM without a separate automatic speech recognition (ASR) stage. This design reduces latency and removes ASR transcription errors as a source of failure, which makes it suitable for real-time customer support, interactive voice agents, and other applications where low-latency processing is crucial. The article also covers the prerequisites, setting up Ultravox on Cerebrium with the Pipecat framework, and deploying the application with a simple deployment command.
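
To make the architectural difference concrete, here is a minimal Python sketch that contrasts a cascaded pipeline (audio, then a separate ASR stage, then a text LLM) with a direct audio-to-LLM pipeline. Every function in it is a hypothetical stub written for illustration. None of them are real Ultravox, Pipecat, or Cerebrium APIs, and the sketch only models the shape of each pipeline, not real latency or accuracy.

```python
from typing import Tuple

# NOTE: all stage functions are hypothetical stand-ins (assumptions for
# illustration), not real Ultravox, Pipecat, or Cerebrium APIs.


def fake_asr(audio: bytes) -> str:
    """Stand-in for a speech-to-text stage; a real one can mis-transcribe."""
    return "<transcript of %d bytes>" % len(audio)


def fake_text_llm(prompt: str) -> str:
    """Stand-in for a text-only LLM."""
    return "reply to " + prompt


def fake_audio_llm(audio: bytes) -> str:
    """Stand-in for a multimodal LLM that accepts audio directly."""
    return "reply to <audio of %d bytes>" % len(audio)


def cascaded_pipeline(audio: bytes) -> Tuple[str, int]:
    """Traditional design: audio -> ASR -> text LLM.

    Returns the reply and the number of sequential model stages.
    """
    transcript = fake_asr(audio)       # stage 1: adds latency and possible errors
    reply = fake_text_llm(transcript)  # stage 2: sees only the transcript
    return reply, 2


def direct_pipeline(audio: bytes) -> Tuple[str, int]:
    """Direct design: audio -> multimodal LLM, with no separate ASR stage."""
    reply = fake_audio_llm(audio)      # single stage: the model sees the audio
    return reply, 1


if __name__ == "__main__":
    sample = b"\x00\x01\x02"  # placeholder bytes, not real audio
    print(cascaded_pipeline(sample))
    print(direct_pipeline(sample))
```

The point of the sketch is structural: the cascaded path runs two sequential model stages and the text LLM never sees the original audio, while the direct path runs one stage on the audio itself.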