Software engineering has always had a quiet truth: most failures don’t come from writing the wrong lines of code, but from skipping the boring steps around it. Specifications, test runs, verification, review — the scaffolding that keeps systems from drifting into subtle failure modes. That same truth is now reappearing in AI agents, and it’s harder to ignore.
A recent surge of interest around “Agent Skills” makes that visible. The idea is disarmingly simple: instead of trusting an AI coding agent to behave like a disciplined engineer, you explicitly force it through structured phases — Define, Plan, Build, Verify, Review, Ship. Not as advice, but as a checklist embedded into its working context.
The key insight isn’t about model capability. It’s about behavior under weak constraints. Even highly capable agents tend to skip unglamorous steps unless something external enforces them. “Do good engineering” is too vague. A required verification phase with explicit criteria changes outcomes.
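To make the contrast between instruction and enforcement concrete, here is a minimal sketch of what a hard checklist can look like in code. It is illustrative only, not taken from any Agent Skills implementation: the `PhaseGate` class and its artifacts are hypothetical, and only the phase names come from the checklist above.

```python
# Illustrative only: a hypothetical "phase gate" that turns the checklist
# into a hard constraint instead of advice. Each phase must record an
# explicit artifact, in order, before ship() succeeds.

PHASES = ["define", "plan", "build", "verify", "review", "ship"]

class PhaseGate:
    def __init__(self):
        self.artifacts = {}  # phase name -> evidence that phase produced

    def complete(self, phase: str, artifact: str) -> None:
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase}")
        # Enforce ordering: every earlier phase must already have an artifact.
        for earlier in PHASES[: PHASES.index(phase)]:
            if earlier not in self.artifacts:
                raise RuntimeError(f"cannot complete '{phase}' before '{earlier}'")
        if not artifact.strip():
            raise RuntimeError(f"phase '{phase}' needs a non-empty artifact")
        self.artifacts[phase] = artifact

    def ship(self) -> None:
        missing = [p for p in PHASES[:-1] if p not in self.artifacts]
        if missing:
            raise RuntimeError(f"refusing to ship; missing phases: {missing}")
        self.artifacts["ship"] = "released"

gate = PhaseGate()
gate.complete("define", "spec: parse RFC 3339 timestamps")
gate.complete("plan", "plan: wrap datetime.fromisoformat with validation")
# gate.complete("verify", "...")  # would raise: 'build' has no artifact yet
# gate.ship()                     # would raise: build, verify, review missing
```

The code itself is trivial; the point is that skipping verification becomes an error rather than a lapse of discipline.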
That difference — between instruction and enforcement — is where things start to get interesting.
At a completely different layer, a recent position paper argues that the real risks in AI systems don’t come from individual models at all, but from how they are arranged. The structure of interaction — who goes first, who sees what, who trusts whom — determines system behavior more than the alignment of any single agent.
Three failure modes stand out:
Ordering instability means the same agents, in a different sequence, produce different results. Not occasionally — structurally. Change the order, change the outcome.
Information cascades show how early mistakes propagate. If the first agent is confidently wrong, downstream agents often amplify the error instead of correcting it, especially when they can’t access the original evidence.
Functional collapse is more subtle. A system can pass fairness or quality checks at the component level while losing meaningful discriminative power at the system level. Everything looks fine in isolation. The failure only appears when the pieces interact.
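None of these failure modes needs exotic machinery to appear. The toy simulation below is my own illustration rather than anything from the paper: each agent sees only the previous agents' public answers plus a weak private signal, which is enough for a confident early error to lock in downstream.

```python
import random

# Toy illustration (not from the paper): a chain of agents estimating a
# binary fact. Each agent gets a weak private signal and sees only the
# previous agents' public answers, never the underlying evidence.

TRUE_ANSWER = 1
SIGNAL_ACCURACY = 0.7  # each private signal is right 70% of the time

def private_signal() -> int:
    # A weak, noisy read on the underlying evidence.
    return TRUE_ANSWER if random.random() < SIGNAL_ACCURACY else 1 - TRUE_ANSWER

def run_chain(n_agents: int, first_agent_wrong: bool) -> list[int]:
    answers: list[int] = []
    for i in range(n_agents):
        signal = private_signal()
        if i == 0:
            # Force the scenario under discussion: a confidently wrong start.
            answers.append(1 - TRUE_ANSWER if first_agent_wrong else signal)
            continue
        ones = sum(answers)
        zeros = len(answers) - ones
        if abs(ones - zeros) >= 2:
            # Upstream consensus outweighs one weak private signal, so the
            # agent defers to it; from here the chain stops self-correcting.
            answers.append(1 if ones > zeros else 0)
        else:
            answers.append(signal)
    return answers

print("correct start:", run_chain(10, first_agent_wrong=False))
print("wrong start:  ", run_chain(10, first_agent_wrong=True))
# Run it a few times: with a wrong start, downstream agents regularly end up
# echoing the first agent's error, because none of them can see the evidence
# it ignored.
```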
What’s counterintuitive is that stronger models make these problems worse. More capable agents converge faster. They agree more confidently. That sounds good — until you realize it accelerates the spread of incorrect assumptions through the system.
Put these two threads together and a pattern emerges.
Agent Skills acts as a compensating mechanism: if a single agent can’t be trusted to follow disciplined workflows, you impose structure externally. The topology argument scales that idea up: if multiple agents can’t be trusted to behave safely in composition, you must design the system’s interaction structure as the primary control surface.
In both cases, alignment alone isn’t enough.
Behavior emerges from constraints.
This is not a new lesson. Distributed systems learned it decades ago. Reliable systems assume disagreement, failure, and partial information. Consensus protocols are designed to handle nodes that are wrong, slow, or inconsistent.
A system where every node agrees instantly isn’t necessarily robust. It’s often fragile — because a shared flaw propagates everywhere without resistance.
AI agent systems are drifting toward the same realization, but from the opposite direction. Instead of starting with failure assumptions, they often start with trust in model capability — and only later discover that trust doesn’t compose.
There’s also an uncomfortable implication here.
We tend to think better models reduce risk. But in tightly coupled systems, higher confidence can amplify failure. A slightly uncertain agent might question upstream output. A highly confident one is more likely to accept and propagate it.
In other words, a bit of friction — hesitation, redundancy, even disagreement — can make systems more robust.
The practical takeaway is not to distrust models, but to shift focus.
Designing agent systems is less about picking the most capable model and more about shaping the environment it operates in. That includes:
- Forcing explicit workflow stages instead of relying on implicit discipline
- Designing information flow so early decisions can be challenged
- Introducing redundancy or parallelism to reduce cascade risk
- Evaluating systems under multiple configurations, not just a single pipeline (a minimal sketch follows this list)
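As a sketch of that last point, and of the ordering instability described earlier, here is one way to treat ordering as something you evaluate rather than something you inherit. Everything in it is hypothetical stand-in code: the stages are trivial string functions where a real system would have model calls.

```python
from itertools import permutations

# Illustrative harness: run the same set of stages in every order and check
# whether the outcome is stable. Stage names and logic are made up.

def summarize(text: str) -> str:
    return text[:40]                        # crude truncation "summary"

def redact(text: str) -> str:
    return text.replace("Alice", "[NAME]")  # strip a personal name

def headline(text: str) -> str:
    return text.upper() if len(text) < 50 else text  # only short text gets shouted

STAGES = {"summarize": summarize, "redact": redact, "headline": headline}

def run_pipeline(order, text: str) -> str:
    for name in order:
        text = STAGES[name](text)
    return text

doc = "Alice reported that the deployment failed after the config change."
results = {order: run_pipeline(order, doc) for order in permutations(STAGES)}

distinct = set(results.values())
print(f"{len(results)} orderings produced {len(distinct)} distinct outputs")
for order, out in sorted(results.items()):
    print("  " + " -> ".join(order) + ":", out)
```

The particular harness does not matter; what matters is that ordering becomes a tested variable instead of an accident of implementation.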
These are system design problems, not model tuning problems.
If you want to see how these ideas intersect in more detail, the full discussion is worth reading:
Agents Need Systems Thinking, Not Just Aligned Models
The deeper shift is conceptual. Alignment is not a property you can assign to a model and expect to persist. It’s something that emerges — or fails to — from the structure surrounding that model.
And as agent systems grow more complex, that structure stops being an implementation detail. It becomes the system.