DeepSeek Coder V2 Outperforms GPT-4 Turbo in Coding and Math Benchmarks

Chinese AI startup DeepSeek recently unveiled DeepSeek Coder V2, an open-source code language model that has surpassed closed-source models such as GPT-4 Turbo, as well as the open Llama-3 70B, in coding and math benchmarks. It supports more than 300 programming languages and offers a 128K-token context window, building on the original 33-billion-parameter DeepSeek Coder and extending its reach to more complex coding scenarios.

DeepSeek Coder V2 led in multiple evaluations, scoring 76.2 on MBPP+, 90.2 on HumanEval, and 73.7 on Aider. This puts it ahead of not only GPT-4 Turbo but also other notable competitors such as Claude 3 Opus and Gemini 1.5 Pro. In general language and reasoning, the model scored 79.2 on the MMLU benchmark, which is competitive with other high-performing models but still short of the 88.7 scored by GPT-4o.

The success of DeepSeek Coder V2 suggests a shift towards high-performing, accessible AI tools. The model is available under the MIT license for both commercial and research use, through Hugging Face or the company's API. Users who want to explore its capabilities can also interact with it via a chatbot provided by the company.

For more details on DeepSeek Coder V2, visit VentureBeat.
