Tiny Recursive Model: How a 7M-Parameter Net Outsmarts Giants with Latent Scratchpads and Iterative Self-Critique


Merit first: a 7-million-parameter model, the Tiny Recursive Model (TRM) from Samsung, reportedly outperformed giants like DeepSeek-R1, Gemini 2.5 Pro, and o3-mini on hard reasoning benchmarks (ARC-AGI-1 and ARC-AGI-2). That’s huge: a model roughly 10,000x smaller coming out ahead on reasoning shows scale isn’t the only path to competence.

Why it works (clean, practical ideas):

  • Draft-first, then think. TRM produces a quick complete draft answer (one-shot, not token-by-token). That draft is the starting hypothesis.
  • Latent scratchpad. Instead of writing thoughts as text, TRM keeps an internal latent “scratchpad” z made of several slots (code defaults to six). These latent vectors hold the model’s internal reasoning state.
  • Intense inner-loop critique. The model repeatedly refines the scratchpad — the code runs a small net across the scratchpad slots multiple times (six updates per cycle) to self-criticize and improve the logic. That’s where reasoning happens.
  • Revise the draft. After refining z, the model produces an improved draft from the updated latent state.
  • Repeat until confident. The draft→think→revise loop is repeated many times (reports say up to 16 cycles) to converge on a solid answer, and a learned confidence head (q_hat) lets it stop early when ready (a minimal sketch follows this list).

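To make the loop concrete, here is a minimal PyTorch sketch of the draft→think→revise cycle with a latent scratchpad and a confidence head. Every module name, dimension, and threshold below is an illustrative assumption for exposition, not the actual TRM code.

```python
import torch
import torch.nn as nn

class TinyRecursiveSketch(nn.Module):
    """Draft -> think -> revise loop with a latent scratchpad.

    All names, sizes, and thresholds here are illustrative assumptions,
    not the actual TRM implementation.
    """

    def __init__(self, d_model: int = 256, n_slots: int = 6,
                 n_think: int = 6, max_cycles: int = 16):
        super().__init__()
        self.n_slots = n_slots                          # latent scratchpad slots
        self.n_think = n_think                          # scratchpad updates per cycle
        self.max_cycles = max_cycles                    # outer draft->think->revise cycles
        self.embed = nn.Linear(d_model, d_model)        # encode the problem
        self.draft_head = nn.Linear(d_model, d_model)   # one-shot complete draft
        self.think = nn.GRUCell(d_model, d_model)       # small net that updates each slot
        self.revise = nn.Linear(2 * d_model, d_model)   # fold scratchpad back into the draft
        self.q_head = nn.Linear(d_model, 1)             # learned confidence ("q_hat")

    def forward(self, x: torch.Tensor, threshold: float = 0.9):
        h = self.embed(x)                               # problem representation
        y = self.draft_head(h)                          # initial draft, produced in one shot
        z = h.new_zeros(x.size(0), self.n_slots, h.size(-1))   # latent scratchpad

        for _ in range(self.max_cycles):
            ctx = h + y                                 # condition thinking on problem + current draft
            # Inner critique loop: refine every scratchpad slot several times.
            for _ in range(self.n_think):
                z = torch.stack(
                    [self.think(ctx, z[:, i]) for i in range(self.n_slots)], dim=1)
            # Revise the draft from the updated latent state.
            y = self.revise(torch.cat([y, z.mean(dim=1)], dim=-1))
            # Learned confidence head: stop early once the whole batch looks done
            # (a simplification; per-example halting is also possible).
            q_hat = torch.sigmoid(self.q_head(y))
            if bool((q_hat > threshold).all()):
                break
        return y, q_hat
```

Running `y, q = TinyRecursiveSketch()(torch.randn(8, 256))` would execute up to 16 draft→think→revise cycles and return the final draft plus the confidence that decided whether to halt early.
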
Smart engineering choices amplify impact:
– Most inner thinking steps run under no_grad (and their outputs are detached), so the model avoids storing huge backprop graphs, letting a tiny net perform many iterative steps without exploding memory or compute (sketched after this list).
– Weight reuse via recurrence gives effective depth far beyond the parameter count.
– A binary quality head decides when to stop, making thinking adaptive.
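
As a rough illustration of that memory trick, here is what the no_grad/detach split could look like around a generic refinement step; the `refine` function and cycle count are assumptions, and only the gradient structure is the point.

```python
import torch

def deep_recursion_step(refine, y, z, n_cycles: int = 16):
    """Run most refinement cycles without gradients; keep only the last one
    on the autograd graph. `refine` stands in for any (y, z) -> (y, z) update;
    the gradient structure, not the function, is what matters here."""
    with torch.no_grad():
        for _ in range(n_cycles - 1):
            y, z = refine(y, z)          # cheap: no activations stored
    # Ensure nothing upstream is carried into the graph of the final cycle.
    y, z = y.detach(), z.detach()
    y, z = refine(y, z)                  # only this cycle stores activations for backprop
    return y, z
```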

Opinion: this is a brilliantly economical approach — trading raw parameter count for iterative, latent computation and smart training tricks. For reasoning-heavy tasks, doing more internal thought with a small model can beat brute-force scale. Expect more research applying latent scratchpads and iterative critique to get more mileage out of tiny models.