Mistral has introduced a new open-source model specifically designed for software engineering agents: Devstral-Small-2505. At 24 billion parameters, this agentic LLM is optimized for tasks like navigating codebases, editing files across projects, and automating software development workflows. Despite its relatively compact size, it achieves a leading 46.8% on the SWE-Bench Verified benchmark, outperforming both Claude 3.5 Haiku and GPT-4.1-mini under the same test conditions.
Fine-tuned from Mistral-Small-3.1, Devstral supports a context window of 128,000 tokens—a vital capability for managing large codebases or multi-file operations. The model was trained in collaboration with All Hands AI, and the results place it as the top-performing open-source model on SWE-Bench Verified to date. Importantly, it uses the Tekken tokenizer with a vocabulary size of 131k, offering high tokenization efficiency for code-related input.
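To make the 128k-token window concrete, here is a minimal sketch of budgeting a multi-file input against it. The ~3.5 characters-per-token ratio is a generic heuristic for code, not a measured property of the Tekken tokenizer; for exact counts you would run the model's own tokenizer.

```python
# Rough check of whether a set of source files fits in Devstral's
# 128k-token context window. CHARS_PER_TOKEN is an assumed average
# for code, not a Tekken-specific figure.
CONTEXT_WINDOW = 128_000
CHARS_PER_TOKEN = 3.5  # heuristic; varies by language and tokenizer

def estimated_tokens(text: str) -> int:
    """Estimate token count from character length."""
    return int(len(text) / CHARS_PER_TOKEN)

def fits_in_context(files: dict, reserve: int = 8_000) -> bool:
    """Check that all file contents fit, reserving room for the response."""
    total = sum(estimated_tokens(src) for src in files.values())
    return total <= CONTEXT_WINDOW - reserve

repo = {"app.py": "print('hello')\n" * 200, "util.py": "x = 1\n" * 50}
print(fits_in_context(repo))
```

The `reserve` margin leaves headroom for the model's own output, which shares the same window.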
Devstral’s design centers on “agentic coding,” meaning it’s built to power autonomous systems that carry out software engineering tasks without constant supervision. It is fully open-source under the Apache 2.0 License and efficient enough to run locally on a single RTX 4090 or even a Mac with 32GB of RAM.
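The agentic pattern described above can be sketched as a simple propose-test-iterate loop. This is an illustrative skeleton only, not OpenHands' actual implementation; `propose_edit` and `run_tests` are hypothetical stand-ins for the model call and the sandboxed test runner.

```python
# Minimal sketch of an "agentic coding" loop: the model proposes an edit,
# the harness applies it and runs the tests, and the failure log is fed
# back until the tests pass or a step budget is exhausted.
from typing import Callable, Tuple

def agent_loop(
    propose_edit: Callable[[str], str],            # feedback -> new source
    run_tests: Callable[[str], Tuple[bool, str]],  # source -> (passed, log)
    max_steps: int = 5,
) -> bool:
    feedback = "initial task description"
    for _ in range(max_steps):
        source = propose_edit(feedback)
        passed, log = run_tests(source)
        if passed:
            return True
        feedback = log  # the failure log drives the next iteration
    return False

# Toy stand-ins: the "model" fixes the code on its second attempt.
attempts = iter(["return 1", "return 2"])
ok = agent_loop(
    propose_edit=lambda fb: next(attempts),
    run_tests=lambda src: (src == "return 2", f"test failed for: {src}"),
)
print(ok)
```

Real harnesses add sandboxing, tool schemas, and richer feedback, but the core control flow is this loop.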
For local usage, the model supports deployment via vLLM, Ollama, Transformers, LMStudio, and mistral-inference, with GGUF-format weights available for easy integration into quantized inference pipelines. For agent orchestration, the recommended setup involves the OpenHands runtime, which provides a Dockerized environment to facilitate tool use, sandboxing, and iteration loops.
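Since vLLM exposes an OpenAI-compatible HTTP API, talking to a locally served Devstral amounts to posting a standard chat-completions body. The sketch below builds such a payload using only the standard library; the model identifier and endpoint are assumptions that must match whatever checkpoint you actually serve.

```python
# Build a request body for a locally served Devstral instance. vLLM's
# OpenAI-compatible server accepts standard chat-completion payloads;
# the model name below is an assumption and must match the served checkpoint.
import json

def chat_payload(prompt: str, model: str = "mistralai/Devstral-Small-2505") -> dict:
    """Assemble an OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a software engineering agent."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.2,   # low temperature for more deterministic code edits
        "max_tokens": 1024,
    }

body = chat_payload("Fix the failing test in utils/parser.py")
print(json.dumps(body, indent=2))
# POST this to the server's /v1/chat/completions endpoint
# (by default http://localhost:8000 when serving with vLLM).
```

The same payload shape works against any of the OpenAI-compatible runtimes listed above, which is what makes swapping backends cheap.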
Fine-tuning and experimentation are also streamlined with Unsloth Dynamic 2.0, enabling fast and accurate quantization and adaptation workflows. The Devstral ecosystem includes examples for building complete applications—like a FastAPI + React to-do app—through guided prompts, showcasing how quickly you can prototype and deploy with the agent.
This release reflects a growing trend toward task-specialized LLMs, where performance is measured not just in tokens per second or raw benchmark scores, but in how efficiently and safely a model can operate as part of a broader development agent. Devstral’s focus on real-world developer tasks, long-context support, and extensibility makes it a strong contender for integration into modern AI-driven coding systems.