🦅 Eagle 7B : Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages (RWKV-v5)


RWKV-v5 Eagle 7B, a new 7.52-billion-parameter model, has been released under the Apache 2.0 license through the Linux Foundation, permitting unrestricted personal and commercial use. The model shows significant gains in multi-lingual performance on common-sense-reasoning benchmarks spanning 23 languages. Although benchmarks do not yet exist for the remaining 75+ of the 100+ languages in its training data, the model shows promise in narrowing the performance gap with leading models such as Mistral-7B.
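
For readers who want to try the weights, the sketch below shows one way to load an RWKV-v5 checkpoint with the Hugging Face `transformers` library. The repository id used here is an assumption for illustration; check the official release page for the exact name.

```python
# Minimal sketch, assuming a transformers-compatible checkpoint is published
# on Hugging Face (the repo id below is hypothetical; verify before use).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "RWKV/v5-Eagle-7B-HF"  # hypothetical repo id; check the official release

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "The capital of Indonesia is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```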

In English-language benchmarks, Eagle 7B competes closely with top models, sometimes outperforming them, and performs roughly at the level expected of a transformer trained on a similar number of tokens. Because it is built on the RWKV architecture rather than a traditional transformer, its compute cost scales linearly with context length, suggesting a potential shift toward more scalable AI architectures.
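
The intuition behind the linear-scaling claim is that RWKV replaces attention's per-token scan over the whole context with a fixed-size recurrent state. The toy sketch below (plain NumPy, not the actual RWKV-v5 kernels; the scalar decay stands in for RWKV's learned per-channel decays) contrasts the resulting operation counts.

```python
# Toy illustration of linear vs. quadratic scaling in sequence length.
import numpy as np

d = 64                      # hidden size (illustrative)
T = 1024                    # sequence length
x = np.random.randn(T, d)   # token representations

# Attention-style readout: token t looks back over all t previous tokens,
# so total work grows quadratically with T.
attn_ops = sum(t * d for t in range(1, T + 1))

# Recurrent (RWKV-style) readout: token t folds into a fixed-size running
# state, so work per token is constant and total work is linear in T.
state = np.zeros(d)
decay = 0.9                 # illustrative scalar; RWKV uses learned per-channel decays
rwkv_ops = 0
for t in range(T):
    state = decay * state + x[t]   # O(d) update, independent of t
    rwkv_ops += d

print(f"attention-style ops: {attn_ops:,}")
print(f"recurrent-style ops: {rwkv_ops:,}")
```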

The RWKV team is committed to building AI that supports a diverse range of languages, aiming to cover languages spoken by roughly 50% of the world's population. This approach is exemplified by the Indonesian-NLP Discord group, which fine-tuned an Indonesian language model from the RWKV base models at low cost.

Looking ahead, the team plans an updated Eagle paper, continued training on additional tokens, and new model variants in 2024. The RWKV project acknowledges the support of its community and developers, emphasizing the importance of accessible and efficient AI for global inclusivity.
Read more…