Canopy Labs has selected Baseten as its preferred inference provider for Orpheus TTS models. The partnership gives developers a production-ready path to running the high-performance Orpheus model, with optimized performance and scalability on a single H100 MIG GPU. Together, Canopy Labs and Baseten built the world's highest-performance Orpheus inference server on NVIDIA's TensorRT-LLM, supporting 16 concurrent live connections under variable traffic, 24 concurrent live connections under stable traffic, and up to a 60x real-time factor for bulk jobs. The client code example provided by Baseten supports session reuse, which cuts per-request overhead and improves time to first byte (TTFB). With this partnership, developers can build fast, configurable, and cost-efficient voice agents using Orpheus TTS models.
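To illustrate why session reuse improves TTFB, here is a minimal, self-contained Python sketch. It does not use Baseten's actual client API; `FakeTransport` and `TTSSession` are hypothetical stand-ins that show the pattern: pay the connection handshake once, then reuse the live connection for every subsequent synthesis request.

```python
class FakeTransport:
    """Stand-in for a TCP/TLS connection; counts expensive handshakes."""
    handshakes = 0

    def __init__(self):
        FakeTransport.handshakes += 1  # connection setup is the costly part

    def request(self, text: str) -> str:
        # Placeholder for sending text and receiving audio bytes.
        return f"audio:{text}"


class TTSSession:
    """Hypothetical client showing session reuse: one connection serves
    every synthesize() call, so only the first request pays the handshake
    cost and later requests see a lower time to first byte."""

    def __init__(self):
        self._conn = FakeTransport()  # opened once, kept alive

    def synthesize(self, text: str) -> str:
        return self._conn.request(text)


session = TTSSession()
results = [session.synthesize(t) for t in ("Hello", "world", "again")]
print(FakeTransport.handshakes)  # → 1: three requests, one handshake
```

A per-request client would construct a fresh transport on every call, repeating the handshake each time; reusing the session amortizes that cost across the whole conversation, which matters most for live voice agents where latency is user-visible.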