The discipline guide to running multi-agent systems after the v0. Twelve chapters when complete; four readable today, eight as drafts land. PDF, EPUB, $39, free updates through v1.0 and beyond.
The moment a single-agent system becomes a two-agent system is where the engineering changes shape, not just the deployment topology. The temptation is to think of it as a small step. We have one agent. We're adding another. They'll talk via HTTP. Done. The temptation is wrong, and the rest of this book is the unpacking of why.
The wire is not where the problems live. The problems live in everything that isn't the wire: auth that propagates correctly across hops, memory scoped so agent B can't leak agent A's writes into a different user's context, traces that span the full request, and partial-failure handling so that one timing-out specialist doesn't cascade into a fleet-wide meltdown.
A two-agent system is not a slightly-bigger one-agent system. It's the smallest possible distributed system. The moment your system is distributed, you inherit a body of literature that's older than LLMs by decades. L. Peter Deutsch's distributed-computing fallacies still apply: the agents are doing different work than microservices, but the failure modes of the network connecting them are the same shape.
This book is about the discipline that takes you from "two agents that successfully exchange JSON" to "a fleet of agents that hand off work correctly under load and failure." Released today in early access.
Twelve chapters is the destination; four chapters is what's readable on day one. The four drafted at launch are the load-bearing ones for the architectural decisions you're making today: should I split this into multiple agents at all, and if I do, how do I share state without leaking it? They are Chapter 1 (why A2A is its own discipline) plus the three Part 4 chapters on memory taxonomies, cross-agent and cross-tenant scoping, and federated memory architectures.
The remaining eight - the protocol landscape, router and supervisor patterns, swarm and blackboard, auth across agent boundaries, write-permission and provenance, observability across hops, cost and latency under partial failure, and what's next - are on the roadmap and land as drafts complete. The early-access price reflects what's readable now; you get every chapter at no extra cost. If the order you need is different from the launch order, file an issue against the companion repo - early-access readers shape the priority more than the original outline does.
Part 1 - Foundations. Why A2A is its own discipline (Chapter 1, drafted). The protocol landscape - A2A as a protocol vs. A2A as a pattern; the Google-authored A2A spec alongside LangGraph hand-offs, OpenAI Swarm, raw HTTP; why MCP isn't A2A and why conflating them produces architectures that don't work.
Part 2 - Orchestration. Router and supervisor patterns - the two foundational orchestration shapes, where each earns its keep, when to compose them. Swarm and blackboard patterns - parallel agents with an aggregator, agents reading and writing a shared workspace, and the most under-discussed A2A topic: the cases where one bigger agent with more tools beats any multi-agent shape.
Part 3 - Trust between agents. Auth across agent boundaries - how user identity and service identity propagate through a fleet, OIDC end to end, "specialist runs as the user" vs "specialist runs as itself with a user-context claim," token scoping across hops, the leak modes you don't see until production. Write-permission, provenance, and the telephone game - who's allowed to write what on whose behalf, and how the trail survives the agent hop.
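To make "token scoping across hops" concrete, here is a minimal sketch of the "specialist runs as itself with a user-context claim" shape. Everything in it is an assumption made for illustration: `HopToken`, `delegate`, and the scope strings are hypothetical names, and the token is a plain in-memory object rather than a signed OIDC credential. The one idea it shows is that scopes may narrow at each hop but never widen.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class HopToken:
    """Hypothetical stand-in for a signed token; not a real OIDC type."""
    actor: str                # service identity of the agent making the call
    on_behalf_of: str         # user-context claim carried across hops
    scopes: frozenset[str]    # what this hop is allowed to do


def delegate(parent: HopToken, next_actor: str, requested: set[str]) -> HopToken:
    """Mint a token for the next hop. Scopes can only narrow, never widen."""
    granted = parent.scopes & frozenset(requested)
    return HopToken(actor=next_actor, on_behalf_of=parent.on_behalf_of, scopes=granted)


# The router holds three scopes on behalf of the user (illustrative values).
router = HopToken(
    actor="router",
    on_behalf_of="user-123",
    scopes=frozenset({"calendar:read", "calendar:write", "mail:read"}),
)

# The specialist asks for one scope the router has and one it does not.
specialist = delegate(router, "scheduler", {"calendar:read", "billing:write"})
```

In this sketch the specialist ends up with `calendar:read` only: the `billing:write` request is dropped because the router never held it, while the user-context claim survives the hop unchanged.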
Part 4 - State across agents. (All three drafted, adapted from the seeded Agent Memory in Production material.) Memory taxonomies in multi-agent systems - episodic, semantic, working, procedural, with the multi-agent dimensions added. Cross-agent and cross-tenant scoping - per-agent / per-user / per-tenant / cross-agent, the leakage modes that get worse in A2A, deletion-on-request across multiple agents. Federated memory architectures - comparative deep dive on Anthropic's memory tool, Claude Code's MEMORY.md, MemGPT/Letta, Mem0, DIY-on-Postgres, each viewed through the federation question.
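As a rough illustration of per-tenant / per-user / per-agent scoping and deletion-on-request, the sketch below keys every memory row by an explicit scope. It is not taken from the book's chapters or from any of the memory products named above; `Scope`, `ScopedMemory`, and `forget_user` are hypothetical names, and a dictionary stands in for a real store.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Scope:
    """Hypothetical scope key: every read and write must name all three parts."""
    tenant: str
    user: str
    agent: str


@dataclass
class ScopedMemory:
    """In-memory stand-in for a real store, keyed by (scope, key)."""
    _rows: dict = field(default_factory=dict)

    def write(self, scope: Scope, key: str, value: str) -> None:
        self._rows[(scope, key)] = value

    def read(self, scope: Scope, key: str):
        # A read under a different scope misses, even for the same key.
        return self._rows.get((scope, key))

    def forget_user(self, tenant: str, user: str) -> int:
        """Deletion-on-request: drop the user's rows across every agent."""
        doomed = [k for k in self._rows if k[0].tenant == tenant and k[0].user == user]
        for k in doomed:
            del self._rows[k]
        return len(doomed)


mem = ScopedMemory()
alice_planner = Scope("acme", "alice", "planner")
bob_planner = Scope("acme", "bob", "planner")

mem.write(alice_planner, "pref", "morning meetings")
mem.write(Scope("acme", "alice", "researcher"), "topic", "pricing")

leaked = mem.read(bob_planner, "pref")      # None: Bob's scope cannot see Alice's write
removed = mem.forget_user("acme", "alice")  # removes rows held by both agents
```

The design choice worth noticing is that the scope is part of the key rather than a filter applied afterwards, so a caller that forgets to scope a read gets a miss instead of another user's data.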
Part 5 - Operations. Observability across agent boundaries - one user request crossing three agents should produce one trace, not three. OpenTelemetry context propagation across A2A calls, the per-hop event log, replay tooling, debugging a fleet at 11pm. Cost, latency, and partial failure - per-request budget caps that propagate across hops, circuit breakers between specialists, timeout-and-fallback, disagreement resolution.
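The "one trace, not three" goal can be sketched with the W3C Trace Context `traceparent` header, which is the header OpenTelemetry propagates by default. The code below is a hand-rolled illustration using only the standard library, not the OpenTelemetry API; a real system would normally rely on the OpenTelemetry propagators instead. `call_agent` and the agent names are hypothetical, and the "call" just records the header it would send.

```python
import secrets


def new_traceparent() -> str:
    """Start a trace: version 00, 32-hex trace id, 16-hex span id, sampled flag."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def child_traceparent(incoming: str) -> str:
    """Keep the trace id from the caller, mint a fresh span id for this hop."""
    version, trace_id, _parent_span, flags = incoming.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"


def call_agent(name: str, headers: dict, log: list) -> dict:
    """Hypothetical hop: derive outgoing headers and record what was sent."""
    outgoing = {"traceparent": child_traceparent(headers["traceparent"])}
    log.append((name, outgoing["traceparent"]))
    return outgoing


log = []
headers = {"traceparent": new_traceparent()}
for agent in ("router", "researcher", "writer"):
    headers = call_agent(agent, headers, log)

trace_ids = {tp.split("-")[1] for _, tp in log}
span_ids = {tp.split("-")[2] for _, tp in log}
```

After the three hops, `trace_ids` holds a single value and `span_ids` holds three, which is the property that lets a tracing backend stitch the hops into one request.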
Part 6 - The future. The long-context-vs-delegation question (when does talking to another agent beat just doing it yourself with a million-token window?). The autonomy spectrum. Whether A2A becomes a platform primitive or stays a per-app concern. The bets I'd make starting a new multi-agent system today.
You have shipped at least one agent to production. You know what an LLM call looks like, you've integrated tools, you've handled at least one production incident with an agent in it. You're now standing at the moment where one agent is becoming two - and you can feel that the second agent isn't a small step. You're somewhere between mid and senior on the IC ladder, or a tech lead who needs to make architectural calls about whether to add an agent or add a tool.
You don't need to know the A2A spec by heart. You do need to know what a distributed-systems fallacy looks like - "the network is reliable" should ring a bell. If it doesn't, get a copy of Designing Data-Intensive Applications first; this book treats the distributed-systems half of multi-agent systems as load-bearing prior knowledge.
A2A in Production is Volume IV of the Yaw Labs Production Series. Volume I, MCP in Production, is the protocol-and-server perspective on the tools agents call. Volume II, Claude Code in Production, is the operator's view of running a single agent. Volume III, Semantic Search in Production, is the retrieval substrate the agent reaches into. Volume IV is what happens when one agent becomes a fleet - the discipline of multi-agent systems as their own engineering problem.
module-N-final tags, filling in by reader pull as chapters land. Add your GitHub username at checkout; the invite arrives within minutes.
Want to read before you buy? Chapter 1 is free. The shape of the book matches the shape of the chapter.
Twelve chapters on multi-agent systems as their own discipline.
Four readable today, eight as drafts complete. PDF + EPUB. Free updates through v1.0 and beyond. $39 one-time, secure checkout.
Published by Yaw Labs.