WizardLM-2: A New Contender in the Language Model Arena Surpasses Many, Nears GPT-4

WizardLM-2, a new family of language models, has been evaluated against a range of baselines using both human and automatic assessments. In a human preference evaluation on complex real-world instructions, the 8x22B variant trails the proprietary GPT-4-1106-preview only slightly and outperforms Command R Plus and GPT4-0314. The 70B variant surpasses GPT4-0613, Mistral-Large, and Qwen1.5-72B-Chat, while the 7B variant is on par with Qwen1.5-32B-Chat and ahead of Qwen1.5-14B-Chat and Starling-LM-7B-beta. On this evaluation, WizardLM-2 comes close to the leading proprietary models and finishes significantly ahead of the open-source models it was compared with.
