Meta's Voicebox AI is a Dall-E for text-to-speech

GPT-4: Meta introduces Voicebox, a generative text-to-speech model capable of producing conversational audio clips in multiple languages. Trained on over 50,000 hours of unfiltered audio, Voicebox outperforms current text-to-speech systems in intelligibility and audio similarity while operating up to 20 times faster. Potential applications include voice prosthetics for patients with vocal cord damage, in-game NPCs, and digital assistants. However, Meta has not released the app or its source code to the public, citing the risk of misuse.
Read more at Engadget…
