OpenAI’s New Voice Cloning Tool: Safety Concerns Delay Release
Blog post from SSOJet
OpenAI has introduced new speech-to-text models, gpt-4o-transcribe and gpt-4o-mini-transcribe, alongside a text-to-speech model, gpt-4o-mini-tts. The transcription models improve accuracy, and the text-to-speech model gives developers control over how AI-generated speech sounds, which makes them suitable for applications such as customer support and multilingual conversations. The transcription models outperform the earlier Whisper models, reducing errors through improved training methods and more diverse datasets. Separately, OpenAI's Voice Engine, a voice cloning tool, can create a synthetic voice from a brief audio sample. It is being piloted cautiously because of concerns over misuse, such as in political contexts or fraud. The limited release is intended to support ethical deployment, with safeguards such as watermarking and a red-teaming network in place to prevent abuse. The technology could also disrupt the voice acting industry by making voice generation more accessible and affordable, although OpenAI stresses that it should complement rather than replace human talent. The company says it will continue to address these ethical challenges and engage with stakeholders as it advances voice technology.
| Trend | Post Mentions | Total Month Mentions | Posts | Companies | MoM |
|---|---|---|---|---|---|
| Voice AI | 5 | 566 | 85 | 28 | -37% |
| AI Guardrails | 1 | 220 | 86 | 29 | -28% |
| Reinforcement learning | 1 | 188 | 89 | 21 | -13% |