The architectures of the previous post make integration safe; they do not tell you when it has quietly gone wrong.
Cooperation has to be made an observable, repairable object — or a system can finish its work and still have coordinated badly in ways no one can see.
You cannot govern what you only see as an outcome
Most evaluation of multi-agent systems collapses cooperation into a single terminal signal: did the task complete. That signal is insufficient, and CalBench shows how. In its decentralized scheduling benchmark, scored against an optimal solver, 29.4% of the seats that completed every assigned meeting still carried positive excess cost against the optimum (Zou et al., 2026, CalBench: Evaluating Coordination–Privacy Trade-offs in Multi-Agent LLMs, arXiv:2605.09823v2, preprint). Completion concealed suboptimal coordination in nearly three of every ten completed seats, and without a reference optimum to measure against that loss is invisible: nothing in a success log reveals it. The first governance requirement is therefore a measurement substrate: an oracle or baseline and an accounting of regret, not just a record of outcomes.
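A minimal sketch of what that accounting could look like, assuming each run is logged with both its realized cost and the cost from a reference solver. The names `SeatResult`, `excess_cost`, and `hidden_loss_rate` are hypothetical, and the sample data is invented for illustration; none of it comes from CalBench.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeatResult:
    seat_id: str
    completed: bool       # did the seat finish every assigned meeting?
    realized_cost: float  # cost the agents actually incurred
    optimal_cost: float   # cost from a reference solver (the oracle)


def excess_cost(r: SeatResult) -> float:
    """Regret: realized cost above the reference optimum, floored at zero."""
    return max(0.0, r.realized_cost - r.optimal_cost)


def hidden_loss_rate(results: list[SeatResult], tol: float = 1e-9) -> float:
    """Share of completed seats that still carry positive excess cost."""
    completed = [r for r in results if r.completed]
    if not completed:
        return 0.0
    lossy = sum(1 for r in completed if excess_cost(r) > tol)
    return lossy / len(completed)


# Invented sample: three completed seats, one of which overpaid.
results = [
    SeatResult("a", True, 10.0, 10.0),
    SeatResult("b", True, 14.0, 10.0),
    SeatResult("c", True, 9.0, 9.0),
    SeatResult("d", False, 20.0, 8.0),
]
completion_rate = sum(r.completed for r in results) / len(results)
loss_rate = hidden_loss_rate(results)
```

A completion log alone would report three of four seats as successes. The second number exists only because `optimal_cost` was recorded alongside the outcome.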
Making cooperation a verifiable object
If outcome signals are insufficient, governance needs cooperation itself rendered as something checkable in flight. COOP² supplies the construct: it models cooperative tasks as constraint-guarded state transitions and treats cooperation as the process of jointly satisfying those constraints over time (Yang et al., 2026, COOP²: Defining, Observing, and Repairing Cooperation in LLM Multi-Agent Systems, arXiv:2603.00349v2, preprint). The constraints are typed — temporal, spatial, capability, dependency — and each carries a satisfaction signal. Every guarded transition emits a verifiable pass or fail, which turns cooperation from an opaque outcome into an auditable event stream: when a task stalls, the system attributes the stall to a specific unmet constraint rather than to a diffuse “the agents didn’t coordinate.”
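One way to picture a constraint-guarded transition is the sketch below. It is an illustration under assumptions, not COOP²'s implementation: the shared state is a plain dict, and `Constraint`, `Event`, and `attempt` are hypothetical names. Only the four constraint types come from the paper as described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class ConstraintType(Enum):
    TEMPORAL = "temporal"
    SPATIAL = "spatial"
    CAPABILITY = "capability"
    DEPENDENCY = "dependency"


State = dict  # assumption of this sketch: shared task state is a plain dict


@dataclass(frozen=True)
class Constraint:
    name: str
    kind: ConstraintType
    check: Callable[[State], bool]


@dataclass(frozen=True)
class Event:
    transition: str
    constraint: str
    kind: ConstraintType
    passed: bool


def attempt(state: State, transition: str, guards: list[Constraint],
            apply: Callable[[State], State], log: list[Event]) -> State:
    """Apply the transition only if every guard passes; log each check."""
    ok = True
    for g in guards:
        passed = bool(g.check(state))
        log.append(Event(transition, g.name, g.kind, passed))
        ok = ok and passed
    return apply(state) if ok else state


# Invented example: lifting a table needs both agents present and the table unbolted.
guards = [
    Constraint("both_agents_at_table", ConstraintType.SPATIAL,
               lambda s: s["a_pos"] == s["table_pos"] == s["b_pos"]),
    Constraint("table_unbolted", ConstraintType.DEPENDENCY,
               lambda s: s["unbolted"]),
]
log: list[Event] = []
state = {"a_pos": 3, "b_pos": 5, "table_pos": 3, "unbolted": True, "lifted": False}
state = attempt(state, "lift_table", guards, lambda s: {**s, "lifted": True}, log)
failed = [e for e in log if not e.passed]
```

Here the transition is refused and `failed` names the single spatial constraint that blocked it, which is the attribution a bare "task stalled" record cannot give.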
The typed lens earns its keep by localizing failure. Across COOP²’s environments, dependency violations fall by roughly an order of magnitude as model capability rises, from 93% for a weaker model to 7–11% for a stronger one, while spatial-coordination violations stay at 36–43% for every model tested, a failure mode the authors flag as largely separate from capability (Yang et al., 2026). The series’ spatial-versus-semantic split surfaces again in a governance instrument: capability buys down some coordination failures and leaves others untouched, and only a typed signal lets an operator see which is which. That is coverage across the language-to-action gap, the point where intent expressed in language either does or does not become a satisfied constraint on the shared state.
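The per-type breakdown is a simple aggregation once each check is tagged with its constraint type. The sketch below assumes an event stream of `(constraint_type, passed)` pairs; the function name and the sample events are invented for illustration and do not reproduce the paper's figures.

```python
from collections import defaultdict


def violation_rate_by_type(events):
    """events: iterable of (constraint_type, passed) pairs."""
    total = defaultdict(int)
    failed = defaultdict(int)
    for kind, passed in events:
        total[kind] += 1
        if not passed:
            failed[kind] += 1
    return {kind: failed[kind] / total[kind] for kind in total}


# Invented event stream: four dependency checks, two spatial checks.
events = [
    ("dependency", True), ("dependency", True),
    ("dependency", False), ("dependency", True),
    ("spatial", False), ("spatial", True),
]
rates = violation_rate_by_type(events)
```

An untyped failure count over the same stream would report three failures in six checks and hide that the two types fail at different rates.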
Repair as a budgeted control, not an open loop
Observability is half of governance; the other half is what you do when a violation is predicted. COOP² pairs its constraint formalism with a repair mechanism that anticipates constraint failures from a proposed group plan and opens a targeted channel for revision before the agents act — and it is explicit that intervention itself adds decision overhead and communication load (Yang et al., 2026). That honesty is the load-bearing part, because coordination activity is not free: COOP² finds that “communication is not always cooperation” — added structure such as a centralized coordinator can raise decision time by 25 to 50% and, in one configuration, cut the task score by roughly two-thirds (Yang et al., 2026). So repair has to be scoped and budgeted — not an unbounded retry loop, but a control with a known cost, triggered by a predicted failure: the difference between a circuit breaker and simply rerunning a failing job. One caveat to keep: the paper specifies the repair loop and names its cost but demonstrates its benefit on a single process trace, not an aggregate benchmark.
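The circuit-breaker framing can be sketched as a loop that charges every revision round against an explicit budget and escalates when the budget runs out. This is a sketch under assumptions, not the paper's mechanism: `RepairBudget`, `repair_loop`, the `predict_violations` and `revise` callables, and the fixed per-round message cost are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RepairBudget:
    max_rounds: int
    max_messages: int
    rounds_used: int = 0
    messages_used: int = 0

    def can_afford(self, messages: int) -> bool:
        return (self.rounds_used < self.max_rounds
                and self.messages_used + messages <= self.max_messages)

    def charge(self, messages: int) -> None:
        self.rounds_used += 1
        self.messages_used += messages


def repair_loop(plan, predict_violations, revise, budget, cost_per_round=2):
    """Revise the plan while violations are predicted and budget remains.

    Returns (plan, status), where status is "clear" or "escalate".
    """
    while True:
        violations = predict_violations(plan)
        if not violations:
            return plan, "clear"
        if not budget.can_afford(cost_per_round):
            return plan, "escalate"  # stop and hand off instead of retrying
        budget.charge(cost_per_round)
        plan = revise(plan, violations)


# Toy stand-ins: a plan is a list of ints and a negative entry is a predicted violation.
def predict(plan):
    return [i for i, x in enumerate(plan) if x < 0]


def fix_first(plan, violations):
    i = violations[0]
    return plan[:i] + [abs(plan[i])] + plan[i + 1:]


tight = RepairBudget(max_rounds=1, max_messages=10)
tight_plan, tight_status = repair_loop([1, -2, -3], predict, fix_first, tight)

roomy = RepairBudget(max_rounds=3, max_messages=10)
roomy_plan, roomy_status = repair_loop([1, -2, -3], predict, fix_first, roomy)
```

With one round allowed, the loop repairs one violation and escalates with the second still open. With three rounds it clears the plan after two and reports exactly what the repair cost in rounds and messages.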
Privacy and fairness are coordination costs, not side concerns
Governance of agent teams introduces dimensions single-agent oversight never had to price. CalBench makes one concrete and uncomfortable: an agent tuned to disclose as little as possible can shift uncompensated burden onto its teammates by withholding the cost information they need to allocate work fairly. In its varied-cost condition, the model that leaked the least did so by omitting cost context — mentioning cost or constraints in 6.3% of its messages against a 29.2% average for the others — and carried the highest burden unfairness as a result (Zou et al., 2026). “Privacy-preserving” and “fair” are not the same setting; they can trade off directly, and a governance regime that audits only for disclosure will pass an agent quietly imposing its costs on the team. CalBench’s typed non-LLM reference protocols show the structural alternative: vocabulary-restricted messaging gives a provable disclosure floor that free-text negotiation cannot — coordination by construction applied to the privacy dimension itself.
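To make "vocabulary-restricted" concrete, here is a hypothetical message type with no free-text field. It is not CalBench's reference protocol: the verbs, the coarse cost bands, the slot range, and the choice to make the cost band a required field are all assumptions of this sketch.

```python
from dataclasses import dataclass
from enum import Enum

N_SLOTS = 8  # hypothetical number of schedulable slots


class Verb(Enum):
    PROPOSE = "propose"
    ACCEPT = "accept"
    REJECT = "reject"


class CostBand(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Message:
    sender: str
    verb: Verb
    slot: int
    cost_band: CostBand  # required: the sender cannot omit cost context

    def __post_init__(self) -> None:
        if not 0 <= self.slot < N_SLOTS:
            raise ValueError(f"slot out of range: {self.slot}")


def parse(sender: str, raw: str) -> Message:
    """Accept exactly '<verb> <slot> <cost_band>'; reject everything else."""
    parts = raw.split()
    if len(parts) != 3:
        raise ValueError("expected: <verb> <slot> <cost_band>")
    verb, slot, band = parts
    # Verb(...), int(...), and CostBand(...) each raise ValueError on unknown input.
    return Message(sender, Verb(verb), int(slot), CostBand(band))


msg = parse("agent_a", "propose 3 high")
```

Because every message is one of a finite set of values, the most an agent can disclose per message is bounded by the type itself. Under this sketch's assumption, the required `cost_band` also sets a minimum: teammates always receive a coarse cost signal they can use when allocating work.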
The trust paradox the governance has to resolve
Underneath these mechanisms sits a tension CooperBench names directly: the trust paradox. Models are trained to be cautious — to require observable evidence and resist unverifiable assertions — the right default for a single agent facing a user who may mislead it. Collaboration under workspace isolation demands the opposite: an agent must act on a partner’s claim about a state it cannot see (Khatua et al., 2026, CooperBench: Why Coding Agents Cannot be Your Teammates Yet, arXiv:2601.13295v2, preprint). Verification-first instincts and trust-requiring collaboration pull against each other, which partly explains why agents fail to update on a partner’s stated plan even when it was communicated clearly. The resolution is not more trusting agents; it is to remove the dilemma by turning conversation into verifiable shared state — pasted signatures, explicit insertion-point contracts, integration checks before a completion claim is honored (Khatua et al., 2026). Governance is what lets agents stop having to trust each other: it supplies the checkable substrate that makes a claim something other than a request for faith.
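A minimal sketch of an integration check that runs before a completion claim is honored, assuming the claim carries the function signatures the partner says it added. The claim format and `verify_claim` are hypothetical. The check uses `inspect.signature` from the Python standard library to read the real parameter names.

```python
import inspect


def verify_claim(claimed: dict[str, list[str]], namespace: dict) -> list[str]:
    """Compare claimed function signatures with what the workspace contains.

    Returns a list of problems; an empty list means the claim checks out.
    """
    problems = []
    for name, params in claimed.items():
        fn = namespace.get(name)
        if not callable(fn):
            problems.append(f"{name}: not found")
            continue
        actual = list(inspect.signature(fn).parameters)
        if actual != params:
            problems.append(f"{name}: claimed {params}, found {actual}")
    return problems


# Invented workspace: what the partner actually wrote.
def parse_config(path, strict):
    return {"path": path, "strict": strict}


workspace = {"parse_config": parse_config}

# Invented claim: what the partner said it wrote.
claim = {"parse_config": ["path"], "load_defaults": ["env"]}
problems = verify_claim(claim, workspace)
```

The receiving agent does not have to decide whether to believe the message. It either finds the stated signatures in the merged workspace or gets a specific list of mismatches to send back.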
The production view governs by structure, not by reading transcripts
The leading practitioner account governs its multi-agent system on exactly these terms. Anthropic monitors agent decision patterns and interaction structures rather than the contents of individual conversations — observability that preserves user privacy while still surfacing why agents fail (Hadfield et al., 2025, How we built our multi-agent research system, Anthropic Engineering). And for agents that change state over many steps it favors end-state evaluation — judging whether the correct final state was reached rather than whether a prescribed path was followed — because valid agent trajectories vary (Hadfield et al., 2025). Both choices map onto the corpus: observe interaction structure, not chatter; verify the state, not the script.
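End-state evaluation can be illustrated with a check that looks only at the final state and ignores the order of steps. The sketch is an assumption-laden toy, not Anthropic's evaluation harness: the state is a dict, the steps are invented, and `end_state_ok` is a hypothetical name.

```python
def apply_all(state, steps):
    """Run a trajectory: apply each step to the state in order."""
    for step in steps:
        state = step(state)
    return state


def end_state_ok(final, expected):
    """True when every expected key holds its expected value in the final state."""
    return all(final.get(k) == v for k, v in expected.items())


# Invented steps over a document record.
def set_title(s):
    return {**s, "title": "Q3 report"}


def set_owner(s):
    return {**s, "owner": "ana"}


def mark_draft(s):
    return {**s, "draft": True}


def publish(s):
    return {**s, "draft": False}


expected = {"title": "Q3 report", "owner": "ana", "draft": False}

path_a = [set_title, set_owner, publish]
path_b = [set_owner, mark_draft, set_title, publish]  # different route, same end
path_c = [set_title, mark_draft]                      # stops short

verdicts = [end_state_ok(apply_all({}, p), expected) for p in (path_a, path_b, path_c)]
```

The first two trajectories differ in order and length but pass the same check. A script-matching evaluator would have to reject at least one of them.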
What this means for an architecture
The governance layer for an agent team has three requirements, and the corpus specifies each. Provision a measurement substrate, because completion-rate service levels are insufficient acceptance criteria: a fleet can finish everything while nearly three in ten of its completed seats still carry avoidable cost (Zou et al., 2026). Instrument cooperation as a stream of verifiable constraint satisfactions, so failures are attributable to a specific broken requirement rather than a vague coordination shortfall (Yang et al., 2026). And make repair a scoped, budgeted control triggered by predicted violations rather than discovered ones (Yang et al., 2026). Underwriting all three is the move that has run through this series: cooperation governed by structure, made observable and correctable by construction, rather than trusted to emerge and inspected only when it fails.
Cooperation must be made observable and repairable — instrumented as verifiable constraint satisfactions rather than inferred from outcomes, and corrected through scoped, budgeted intervention rather than open-ended retries.
Completion is not evidence of good coordination, disclosure-minimization is not the same as fairness, and the resolution to the trust paradox is not more trusting agents but a checkable shared substrate. Where a system cannot show how its agents cooperated, it cannot claim they did.
