Company
Date Published
Author
Philip Kiely
Word count
1350
Language
English
Hacker News points
None

Summary

Canopy Labs has selected Baseten as its preferred inference provider for Orpheus TTS models, enabling developers to run Orpheus in production with optimized performance and scalability on a single H100 MIG GPU. Together, Canopy Labs and Baseten built the world's highest-performance Orpheus inference server on NVIDIA's TensorRT-LLM, supporting 16 concurrent live connections under variable traffic, 24 concurrent live connections under stable traffic, and up to a 60x real-time factor for bulk jobs. The client code example provided by Baseten supports session re-use, which reduces per-request connection overhead and improves time to first byte (TTFB). With this partnership, developers can build fast, configurable, and cost-efficient voice agents on Orpheus TTS.
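
As a rough illustration of the session re-use idea, a client can keep one HTTP session open and send every TTS request over it, paying the TCP/TLS setup cost once instead of per request. This is a minimal sketch, not Baseten's published client code: the endpoint URL, payload fields, voice name, and auth handling below are placeholder assumptions.

```python
import os
import requests

# Hypothetical sketch of session re-use for a TTS endpoint: one requests.Session
# is shared across calls so connection setup is amortized, which helps TTFB.
# The URL, payload fields, and voice name are placeholders, not Baseten's actual API.
BASETEN_URL = "https://model-xxxxxxx.api.baseten.co/environments/production/predict"
API_KEY = os.environ["BASETEN_API_KEY"]

session = requests.Session()
session.headers.update({"Authorization": f"Api-Key {API_KEY}"})


def synthesize(text: str) -> bytes:
    """Send one TTS request over the shared session and return raw audio bytes."""
    response = session.post(BASETEN_URL, json={"prompt": text, "voice": "tara"})
    response.raise_for_status()
    return response.content


if __name__ == "__main__":
    for line in ["Hello!", "Re-using the session avoids a new TLS handshake per request."]:
        audio = synthesize(line)
        print(f"Received {len(audio)} bytes of audio")
```

The same pattern applies to long-lived connections for live voice agents: keeping the transport open between utterances is what makes the concurrency and TTFB figures above achievable in practice.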