Largest text-to-speech AI model yet shows 'emergent abilities'

Amazon researchers have developed BASE TTS, the largest text-to-speech model to date, with 980 million parameters, trained on 100,000 hours of speech. The model demonstrates emergent abilities, handling complex linguistic challenges such as compound nouns, emotional expressions, foreign words, and paralinguistics more effectively than previous models. The medium-sized version of BASE TTS, with 400 million parameters, showed a notable improvement in these areas. Because the model is streamable, it can generate speech in real time at a low bitrate, which could improve accessibility features. Although impressive, the model remains experimental, and its source data has not been published because of the risk of misuse by bad actors.
Read more at TechCrunch…
