MCP Server Auth: A Practical Guide (2026)

Auth is where MCP server projects stall. The protocol does not mandate auth: the spec defines an optional OAuth-based authorization flow for HTTP transports, and beyond that it hands you a transport and a primitive set and leaves the rest to you. Which means the question "how does the model prove it is allowed to call this tool" is yours to answer -- and a wrong answer leaks something different in production depending on which of four common shapes you picked.

This guide covers the four patterns, the LLM-vs-human identity problem each one solves or punts on, and the escape valves for upstream APIs that fit none of them. The full chapter is Chapter 4 of MCP in Production.

Pattern 1: no auth (local-only)

The server runs as a stdio subprocess of the client. The client trusts the user that launched it; the server trusts whoever invokes it via JSON-RPC. There is no auth surface because there is no network. The 80% case for personal tools, dev workflows, and anything that ships as npx your-server.

When it fits: stdio servers where "the user that launched the client" is "the user the server is acting for."

When it breaks: the moment the server is shared (HTTP), or the moment the model needs to act with credentials the user does not have.
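To make "no auth surface" concrete, here is a minimal sketch of a stdio-style dispatcher, assuming newline-delimited JSON-RPC 2.0 messages. The `echo` method and the handler table are hypothetical stand-ins, not the MCP method set. The point is what is missing: nothing in the request path asks who is calling.

```typescript
// Sketch only: a JSON-RPC dispatcher with no auth layer.
// The "echo" method is a hypothetical placeholder.
type Id = number | string | null;
type JsonRpcResponse =
  | { jsonrpc: "2.0"; id: Id; result: unknown }
  | { jsonrpc: "2.0"; id: Id; error: { code: number; message: string } };

type Handler = (params: unknown) => unknown;

const handlers = new Map<string, Handler>([["echo", (params) => params]]);

function rpcError(id: Id, code: number, message: string): JsonRpcResponse {
  return { jsonrpc: "2.0", id, error: { code, message } };
}

// Handles one line read from stdin. There is no token check and no caller
// identity: whoever can write to this process's stdin is trusted.
function handleLine(line: string): JsonRpcResponse {
  let msg: unknown;
  try {
    msg = JSON.parse(line);
  } catch {
    return rpcError(null, -32700, "Parse error");
  }
  if (typeof msg !== "object" || msg === null) {
    return rpcError(null, -32600, "Invalid Request");
  }
  const { id, method, params } = msg as { id?: unknown; method?: unknown; params?: unknown };
  const safeId: Id = typeof id === "number" || typeof id === "string" ? id : null;
  if (typeof method !== "string") {
    return rpcError(safeId, -32600, "Invalid Request");
  }
  const handler = handlers.get(method);
  if (handler === undefined) {
    return rpcError(safeId, -32601, "Method not found");
  }
  return { jsonrpc: "2.0", id: safeId, result: handler(params) };
}
```

A real server would loop over stdin lines and write each response to stdout. Put the same dispatcher behind an HTTP listener and the trust assumption in the comment stops being true, which is exactly the "when it breaks" case.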

Pattern 2: env credentials

The server reads a credential from its environment (STRIPE_API_KEY, GITHUB_TOKEN) and uses it on every upstream call. The user puts the credential in their shell or the client's config; the server uses it; there is no per-call auth.

When it fits: upstream APIs with long-lived API keys (Stripe, OpenAI, most service tokens). The user has the key; the server uses it.

When it breaks: multi-tenant scenarios. The credential authenticates the server, not the user. If two users share the server, the upstream API sees one identity.
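A minimal sketch of this pattern, assuming the upstream accepts a bearer token (not every API does; some want a custom header or basic auth). The function names are illustrative. The environment is passed in as a parameter so the logic can be exercised without touching the real process environment.

```typescript
// Sketch only: one credential, read once, attached to every upstream call.
type Env = Record<string, string | undefined>;

// Fail at startup rather than on the first tool call if the key is missing.
function loadCredential(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required credential: ${name}`);
  }
  return value;
}

// Every upstream request carries the same identity, whoever asked for it.
// Assumption: the upstream expects "Authorization: Bearer <key>".
function upstreamHeaders(credential: string): Record<string, string> {
  return { Authorization: `Bearer ${credential}` };
}
```

In a Node server the call would look like `loadCredential(process.env, "GITHUB_TOKEN")`. Note that `upstreamHeaders` takes no user argument at all; that missing parameter is the multi-tenant failure mode in one line.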

Pattern 3: OAuth (per-user)

The client handles an OAuth handshake; the server gets a per-user token; calls are made on behalf of the authenticated user. This is the right answer for HTTP MCP servers serving multiple users -- each user's actions go upstream with that user's identity.

When it fits: HTTP servers, multi-tenant tools, upstream APIs that issue per-user tokens (Linear, Notion, GitHub Apps).

When it breaks: the upstream API doesn't speak OAuth, or the client doesn't implement the handshake (the spec covers it; not all clients support it).
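The server-side half of this can be sketched as a per-user token lookup, assuming the handshake has already completed and produced an upstream token for each user. `UserTokenStore` and `headersForUser` are hypothetical names, and the in-memory map stands in for whatever storage a real deployment would use.

```typescript
// Sketch only: per-user upstream tokens, selected on every call.
type UserToken = { accessToken: string; expiresAtMs: number };

class UserTokenStore {
  private tokens = new Map<string, UserToken>();

  put(userId: string, token: UserToken): void {
    this.tokens.set(userId, token);
  }

  // Returns the user's own token, or null when it is missing or expired,
  // in which case the client has to run the handshake again.
  get(userId: string, nowMs: number): string | null {
    const token = this.tokens.get(userId);
    if (token === undefined || token.expiresAtMs <= nowMs) return null;
    return token.accessToken;
  }
}

// Unlike the env-credential pattern, the headers depend on who is calling.
function headersForUser(store: UserTokenStore, userId: string, nowMs: number): Record<string, string> {
  const accessToken = store.get(userId, nowMs);
  if (accessToken === null) {
    throw new Error(`No valid token for ${userId}; re-authorization required`);
  }
  return { Authorization: `Bearer ${accessToken}` };
}
```

The design choice worth noticing is that an expired or missing token is an error, not a fallback to some shared credential. Falling back would quietly turn Pattern 3 into Pattern 2.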

Pattern 4: on-behalf-of with service identity

The server has a long-lived service credential (Pattern 2) but adds a per-call user-context claim -- a JWT, a header, a request field -- that the upstream API uses for authorization and audit. The server acts as itself with a user-context label; the upstream API treats the server as a trusted caller asserting who it is acting for.

When it fits: internal services where you control both the server and the upstream API. The right shape for enterprise MCP servers.

When it breaks: the upstream API does not understand the user-context claim. Everything then looks like the service identity, and the audit trail collapses to that single principal.
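Both halves of the trust relationship fit in a short sketch. This one uses the plain-header variant of the user-context claim; `X-Acting-For` is a hypothetical header name, and the table of trusted service tokens is an assumption about how the upstream recognizes its callers. A signed JWT would carry the same information with tamper protection.

```typescript
// Sketch only: service credential plus a per-call user-context label.

// MCP server side: authenticate as the service, label the call with the user.
function onBehalfOfHeaders(serviceToken: string, actingFor: string): Record<string, string> {
  return {
    Authorization: `Bearer ${serviceToken}`,
    "X-Acting-For": actingFor, // hypothetical header name
  };
}

type Caller = { service: string; actingFor: string | null };

// Upstream side: the user-context claim is honored only because the service
// credential checks out. An unknown caller's claim is ignored entirely.
// (A production check would compare tokens in constant time.)
function resolveCaller(
  headers: Record<string, string | undefined>,
  trustedServices: Map<string, string>, // service token -> service name
): Caller | null {
  const auth = headers["Authorization"] ?? "";
  const token = auth.startsWith("Bearer ") ? auth.slice("Bearer ".length) : "";
  const service = trustedServices.get(token);
  if (service === undefined) return null;
  return { service, actingFor: headers["X-Acting-For"] ?? null };
}
```

When `actingFor` comes back `null`, the upstream has only the service name to log, which is the failure mode described above.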

The LLM-vs-human identity question

None of these patterns answers the underlying question: when the model takes an action, is it the user, or is it the model?

"It's the user." Every action is attributed to the human; the audit trail shows the human's username; rate limits hit the human. Easiest mental model; wrong when the model takes an action the user did not specifically authorize.
"It's the model, acting on behalf of the user." Audit trail shows "claude-code-cli acting for jeff@yaw.sh"; rate limits hit the agent; the human can disclaim actions they didn't intend. Correct but requires upstream API support that mostly does not exist yet.
"It's an LLM agent identity, distinct from any human." The upstream API treats the agent as a first-class principal with its own permissions. Cleanest model; requires building agent-identity infrastructure most APIs lack.

There is no good default. The decision leaks something whichever way you pick. The book covers the leak modes per pattern and the escape valves -- including a "we don't know yet" pattern for upstream APIs that are still figuring it out.
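One way to keep the choice explicit in code is to model the three attributions as a tagged union, so that audit and rate-limit logic has to say which answer it is using. This is a sketch with hypothetical type and function names. The rate-limit rule for the third option is an assumption, since the text above only states it for the first two.

```typescript
// Sketch only: the three attribution answers as one discriminated union.
type Principal =
  | { kind: "user"; user: string }                       // "It's the user."
  | { kind: "delegated"; agent: string; user: string }   // model acting for the user
  | { kind: "agent"; agent: string };                    // agent as its own principal

// What the audit trail shows for each answer.
function auditActor(p: Principal): string {
  switch (p.kind) {
    case "user":
      return p.user;
    case "delegated":
      return `${p.agent} acting for ${p.user}`;
    case "agent":
      return p.agent;
  }
}

// Whose budget a call counts against. Charging the agent in the "agent"
// case is an assumption, not something the text specifies.
function rateLimitKey(p: Principal): string {
  return p.kind === "user" ? `user:${p.user}` : `agent:${p.agent}`;
}
```

For example, `auditActor({ kind: "delegated", agent: "claude-code-cli", user: "jeff@yaw.sh" })` produces the audit line quoted in the second option.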

What to do when nothing fits

The tailscale-mcp server in the @yawlabs portfolio hit this in v0.1. Tailscale's API expected a human at a browser; the agent was not a human. The workaround was a server-side service account with a separate per-tailnet authorization layer the agent could prove against -- not a pattern that exists in the spec, but the one that fit the upstream.

The rule: name the pattern as a deviation, document why, and don't pretend it generalizes. The book has the worked example with the failure modes that drove the design.

Want the full chapter?

MCP in Production Chapter 4 is the auth chapter. It covers the four patterns with worked examples from the @yawlabs portfolio, the LLM-vs-human identity question with the leak modes per pattern, and the escape valves for upstream APIs that fit nothing.
