Meta’s Voicebox AI is a Dall-E for text-to-speech | Engadget


GPT-4: Meta has introduced Voicebox, a generative text-to-speech model that can produce conversational audio clips in multiple languages. Trained on more than 50,000 hours of unfiltered audio, Voicebox outperforms current text-to-speech systems in intelligibility and audio similarity while running up to 20 times faster. Potential applications include voice prosthetics for patients with vocal cord damage, in-game NPCs, and digital assistants. However, Meta has not released the app or its source code to the public, citing the risk of misuse.