Vonage call transcription: adding real-time speech-to-text to Vonage
Blog post from Gladia
Integrating speech-to-text infrastructure with the Vonage Voice API consolidates fragmented processes into a single API, offering approximately 270ms latency for real-time transcription during live agent assistance, plus post-call batch processing for QA scoring. Contact centers can use the real-time transcripts and automated QA scoring to improve agent efficiency, reduce manual review workloads, and limit errors in call logs that would otherwise corrupt CRM data and compliance records. The setup routes Vonage WebSocket streams to transcription endpoints, so the same audio can feed live or post-call analysis depending on latency requirements, while maintaining compliance with regulatory standards such as PCI-DSS, HIPAA, and GDPR. The system supports multiple languages and offers speaker diarization, sentiment analysis, and named entity recognition, which improve QA accuracy and coaching effectiveness. The API also handles multilingual and code-switching calls, which is crucial in BPO environments, and it comes in different plan tiers for managing data governance and usage-based costs.
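The routing step described above can be sketched as a small helper that builds a Vonage NCCO `connect` action with a `websocket` endpoint. This is a minimal sketch, not the post's implementation: the function names, the relay URL, and the `call-id` header are hypothetical, and it assumes you run your own WebSocket service that forwards the audio to a transcription provider. The provider's API is not shown.

```python
import json


def build_transcription_ncco(ws_uri, sample_rate=16000, headers=None):
    """Build an NCCO that streams call audio to a WebSocket endpoint.

    ws_uri is assumed to point at your own relay service (hypothetical),
    which would forward the audio to a speech-to-text provider.
    """
    if sample_rate not in (8000, 16000):
        raise ValueError("this sketch only handles 8000 or 16000 Hz linear PCM")
    if not ws_uri.startswith(("ws://", "wss://")):
        raise ValueError("ws_uri must be a ws:// or wss:// URL")

    endpoint = {
        "type": "websocket",
        "uri": ws_uri,
        # 16-bit linear PCM at the chosen sample rate
        "content-type": f"audio/l16;rate={sample_rate}",
    }
    if headers:
        # Optional metadata passed along to the WebSocket server
        endpoint["headers"] = dict(headers)

    return [{"action": "connect", "endpoint": [endpoint]}]


def frame_bytes(sample_rate, frame_ms=20):
    """Size in bytes of one 16-bit mono PCM frame of frame_ms milliseconds."""
    return sample_rate * 2 * frame_ms // 1000


if __name__ == "__main__":
    # Hypothetical relay URL and metadata, for illustration only
    ncco = build_transcription_ncco(
        "wss://relay.example.com/vonage-audio",
        headers={"call-id": "demo-123"},
    )
    print(json.dumps(ncco, indent=2))
    print(frame_bytes(16000))
```

Your answer webhook would return the NCCO as JSON. Knowing the frame size helps the relay buffer audio into whatever chunk size the transcription endpoint expects. Check the current Vonage and provider documentation before relying on any of these values.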
| Trend | Post Mentions | Total Month Mentions | Posts | Companies | MoM |
|---|---|---|---|---|---|
| Real-time | 30 | 5,457 | 1,338 | 238 | -5% |
| Data Pipeline | 1 | 441 | 203 | 86 | -29% |