Meta’s Voicebox AI is a Dall-E for text-to-speech | Engadget


GPT-4: Meta introduces Voicebox, a generative text-to-speech model capable of producing conversational audio clips in multiple languages. Trained on over 50,000 hours of unfiltered audio, Voicebox outperforms current text-to-speech systems in intelligibility and audio similarity while operating up to 20 times faster. Potential applications include prosthetics for patients with vocal cord damage, in-game NPCs, and digital assistants. However, Meta has not released the app or source code to the public, citing the risk of misuse.
Read more at Engadget…
