The text-to-speech (TTS) landscape is changing rapidly, with new state-of-the-art models launching every month. Developers and businesses want powerful, flexible, and cost-effective TTS options, and several open-source models have emerged to meet that need. Spark-TTS is a 500-million-parameter model that supports zero-shot voice cloning, bilingual speech synthesis, and adjustable voice attributes. Kokoro is a very small model with 82M parameters, which makes it fast to deploy and cheap to run while still producing high-quality audio. Fish Speech v1.5 offers low character and word error rates (CER/WER), low latency, and support for multiple languages, but its license restricts commercial use. XTTS-v2 supports 13 languages and expressive speech synthesis, while StyleTTS produces exceptionally natural-sounding English speech under a permissive license. OpenVoice v2 offers instant voice cloning, but supports fewer languages than MeloTTS. VITS is a lightweight model suited to on-device use cases such as article reading or language practice. These open-source TTS models offer alternatives to commercial solutions and can be used with Modal for real-time applications.
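The CER/WER figures quoted for TTS models are usually obtained by transcribing the synthesized audio with a speech recognizer and comparing the transcript against the input text. The sketch below shows how the two metrics could be computed from a reference string and a transcript; the function names are illustrative, the normalization (lowercasing and whitespace splitting) is an assumption, and published benchmarks may normalize text differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    ref_chars = list(reference.lower())
    hyp_chars = list(hypothesis.lower())
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    # Hypothetical example: the text sent to the TTS model vs. an ASR transcript.
    text = "the cat sat on the mat"
    transcript = "the cat sat on a mat"
    print(f"WER: {wer(text, transcript):.3f}")
    print(f"CER: {cer(text, transcript):.3f}")
```

Lower values are better for both metrics. Note that either one can exceed 1.0 when the transcript contains many more tokens than the reference.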