MCP in Production: The MCP Server Book

Thirteen MCP servers in production at Yaw Labs. tailscale-mcp taught us that the LLM-vs-human identity question has no good default. aws-mcp taught us that a tool list can grow to 40,000 tokens before anyone notices. npmjs-mcp taught us that "auth" is four different problems wearing the same coat. lemonsqueezy-mcp taught us that errors a model can act on are a different art form from errors a human reads.

This book is what we wrote down between server #2 and server #14, when the same surprises kept landing in different forms and we got tired of solving them from scratch.

Twelve chapters on what the spec doesn't tell you. The protocol is the easy part; the hard part is the one you only learn by running these things in production for six months and watching what breaks.

PDF + EPUB. Free updates as the spec evolves. Read a sample chapter.

What this MCP server book teaches you to fix

Each of these is a thing one of the @yawlabs/* servers has actually hit -- not a hypothetical. The book gives you the schema change, throw discipline, or hosting decision that catches the next one before it reaches a customer.

The 40,000-token tool list. Your aws-mcp ships 200 tools, the model's tool selection degrades, and the trace shows it grabbing the wrong one. This is the longest chapter in the book for a reason. Chapter 5.
Auth that doesn't fit the LLM-vs-human pattern. Your upstream API expects a human at a browser; the agent isn't a human. The four common patterns and what to do when none of them fits. Chapter 4.
Tools that fight each other instead of composing. Output-shape-as-input-shape. Pagination that survives a non-deterministic caller. List-then-detail done right. Cross-server composition without the joins falling apart. Chapter 6.
Errors the model can't act on. A 500 with a stack trace is a useless signal to a model. Throw discipline, trigger phrases, and transient-vs-terminal retry, with a six-axis grading rubric. Chapter 7.
E2E tests that pass on Tuesday and fail on Thursday. The harness pattern that makes testing a probabilistic consumer tolerable, instead of accepting that "non-deterministic" means "untested." Chapter 8.
The idle server that's burning more money than the active one. Six hosting options, an honest comparison of managed MCP platforms, container packaging, reproducible builds off your laptop. Chapter 9.
The security review you didn't know was coming. The threat model, the mitigations, the questions a competent reviewer will ask, and a checklist you can run yourself before they do. Chapter 10.
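
To make the "errors the model can't act on" item concrete, here is a minimal TypeScript sketch. It is not taken from the book or from any @yawlabs/* server: the names `classify` and `toToolError`, the status-code mapping, and the message wording are all assumptions made for illustration. The only thing borrowed from MCP is the general shape of a tool result that sets `isError` and carries a text content item.

```typescript
// Hypothetical sketch: turn an upstream HTTP failure into a tool result
// that tells the model what to do next. Names and mapping are assumptions.

type ToolResult = {
  isError: boolean;
  content: { type: "text"; text: string }[];
};

type RetryKind = "transient" | "terminal";

// Assumed rule of thumb: rate limits and 5xx are worth retrying, the rest are not.
function classify(status: number): RetryKind {
  return status === 429 || status >= 500 ? "transient" : "terminal";
}

function toToolError(status: number, action: string): ToolResult {
  const kind = classify(status);
  const advice =
    kind === "transient"
      ? "This is temporary. Retry the same call after a short wait."
      : "Retrying will not help. Check the arguments or ask the user how to proceed.";
  return {
    isError: true,
    content: [
      {
        type: "text",
        text: `${action} failed (upstream status ${status}, ${kind}). ${advice}`,
      },
    ],
  };
}

// Usage: print the text a model would see for a 503 and a 404.
for (const result of [toToolError(503, "create_order"), toToolError(404, "get_order")]) {
  console.log(result.content.map((item) => item.text).join("\n"));
}
```

The point of the sketch is that the message names the failed action, says whether a retry is worthwhile, and leaves out the stack trace.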

Chapter-by-chapter: how to build production MCP servers

Part 1 - Foundations

Chapter 1. Why MCP exists - the three primitives, the two transports, what MCP solved that Plugins and function-calling didn't.
Chapter 2. Anatomy of a server - the four architectural shapes a real server takes, including stdio adapter, HTTP service, and in-process embedded.
Chapter 3. Building your first production-grade server - worked example, npm init through npm publish. Every chapter from here refers back.

Part 2 - Surface

Chapter 4. Auth, secrets, and identity flow - the four common patterns, the LLM-vs-human identity problem, what to do when your upstream API doesn't fit.
Chapter 5. Schema design, or how to not poison the model - the longest chapter. Naming, parameter modeling, the tool-list size problem (40,000 tokens is not OK).
Chapter 6. Tools that compose - output-shape-as-input-shape, pagination, list-then-detail, idempotency, cross-server composition.
Chapter 7. Error handling that the model can act on - throw discipline, trigger phrases, transient vs terminal retry, a six-axis grading rubric.
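
The tool-list size problem named in the Chapter 5 summary can at least be measured before it bites. The TypeScript sketch below is an illustration, not the book's method: `estimateToolListTokens` is a hypothetical helper, the four-characters-per-token ratio is a rough heuristic rather than a real tokenizer count, and the 200 generated tools are made-up stand-ins. The `Tool` type mirrors the name, description, and inputSchema fields of an MCP tool definition.

```typescript
// Hypothetical sketch: estimate how many tokens a serialized tool list costs.
// The chars-per-token ratio is an assumed heuristic, not an exact count.

type Tool = {
  name: string;
  description?: string;
  inputSchema: object;
};

function estimateToolListTokens(tools: Tool[], charsPerToken = 4): number {
  // Serialize the list roughly as a tools/list result would carry it.
  const json = JSON.stringify({ tools });
  return Math.ceil(json.length / charsPerToken);
}

// Made-up stand-in for a large server: 200 tools with wordy descriptions.
const exampleTools: Tool[] = Array.from({ length: 200 }, (_, i) => ({
  name: `service_operation_${i}`,
  description: "Performs one operation against the upstream API. ".repeat(4),
  inputSchema: {
    type: "object",
    properties: { id: { type: "string", description: "Resource identifier." } },
    required: ["id"],
  },
}));

console.log(`estimated tokens: ${estimateToolListTokens(exampleTools)}`);
```

Running an estimate like this in CI, with a budget that fails the build, is one way to notice growth early; the right budget depends on the clients you target.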

Part 3 - Lifecycle

Chapter 8. Testing a probabilistic consumer - unit, integration, and end-to-end testing patterns when the consumer is non-deterministic; the harness pattern that makes E2E tolerable.
Chapter 9. Hosting, scaling, and not going broke on idle - six realistic options, an honest comparison of managed MCP platforms, container packaging, reproducible builds off your laptop.
Chapter 10. Security review survival - the threat model, the mitigations, the questions a competent reviewer will ask, a checklist you can run yourself.

Part 4 - In practice

Chapter 11. Case studies from the @yawlabs portfolio - tailscale-mcp (the first one), npmjs-mcp (auth-shaped), aws-mcp (schema-shaped), lemonsqueezy-mcp (errors and money). Architecture, surprises, what a v2 would look like.
Chapter 12. What's next, and what to bet on - in-flight spec changes, ecosystem gaps, and the bets I'd make today.

Who this MCP book is for

You ship code for a living. You have read the MCP spec and shipped at least one server. You know what tools/list and notifications/initialized are without looking them up. You want to know why your aws-mcp tool list is 40,000 tokens and what to do about it.

You're somewhere between mid and senior on the IC ladder, or a tech lead deciding how to invest your team's MCP work.

Not for: spec walkthroughs (modelcontextprotocol.io does that better), "what is MCP" introductions, or vendor-neutral tool surveys.

What's in the box

The book in PDF and EPUB.
Free updates as the MCP spec evolves and the @yawlabs servers ship new lessons.
Sample chapters free online -- Build an MCP Server, MCP Server Auth, MCP Server Testing, and more.
Pointers into the live @yawlabs/* server repositories the book references.

MCP in Production: FAQ

Does this MCP book cover Anthropic's MCP spec, or only Claude-specific behavior?

The spec. MCP is a multi-client protocol and the book treats it that way -- the server you ship runs against Claude, Cursor, Cline, and anything else that speaks the protocol. Client-specific quirks (Claude's tool-list size sensitivity, Cursor's transport preferences) are noted at the point where they constrain a server-side decision.

What if the MCP spec changes?

Each chapter pins the spec version it was written against, and updates ship as the spec moves. The protocol has been moving steadily; the disciplines (schema design, throw discipline, the four auth patterns, the testing harness) survive minor-version churn. When a load-bearing change lands, the affected chapter gets a revision and you get the update.

Do I need to know Claude Code or Cursor first?

No. This book is the server-side view -- you're shipping the tools, not operating the agent that calls them. If you have used any MCP client at all, you have enough context; every chapter builds from the protocol up rather than from a particular client's UI.

Do I need to ship a public MCP server to benefit?

No. Local and internal MCP servers are the larger use case -- the LLM-vs-human auth question, schema design, throw discipline, and testing patterns apply identically whether the server runs on your laptop, in your VPC, or as a published @yawlabs/* package. The hosting chapter covers all three deployment shapes.

How do the companion-repo invites work?

There are no invites -- the companion repo is public, so just clone it. Starter code, exercises, and worked solutions live at chapter-N-final tags for each hands-on chapter. No GitHub account or access request needed.

Get MCP in Production

Twelve chapters. PDF + EPUB. Free updates. Free with a Token Limit News signup.

Built on the same patterns as the Yaw MCP CLI (@yawlabs/mcp).