The product that shipped as mcp.hosting is now Yaw MCP. Same architecture, new name, much bigger remit. Here's what changed, what didn't, and why one CLI that fronts every MCP server you use stays out of your context window in a way the standard setup can't.
The product is now part of the Yaw Labs family alongside Yaw Terminal, Yaw Mode, and typed. One brand, one dashboard at yaw.sh/mcp, one account that ties everything together.
Concretely:
- npm package: @yawlabs/mcp
- CLI: yaw-mcp
- Config directory: ~/.yaw-mcp/
- Environment variables: YAW_MCP_TOKEN, YAW_MCP_URL, etc.
- Token prefix: mcp_pat_...

Before getting to what Yaw MCP does differently, it's worth being precise about what the baseline is. The Model Context Protocol is client-neutral, but every client implements its own config surface. The standard shape across the four major clients is "hand-edit a JSON file per client, per machine."
Claude Code: add servers via claude mcp add (writes to ~/.claude.json for user scope) or edit a project-local .mcp.json. The top-level key is mcpServers; each entry needs command, args, and an optional env. Secrets live in env blocks on disk.
Claude Desktop: edit claude_desktop_config.json. Same mcpServers shape as Claude Code. User-scoped only (no project-local override). On macOS it lives under ~/Library/Application Support/Claude/; on Windows under %APPDATA%\Claude\.
Cursor: edit ~/.cursor/mcp.json for user scope or .cursor/mcp.json per project. Same mcpServers shape.
VS Code: edit .vscode/mcp.json in the workspace. The top-level key is servers, not mcpServers -- VS Code's the odd one out. Workspace-scoped only (you wire it in per project).
If the server launches via npx on Windows, you need "command": "cmd", "args": ["/c", "npx", "-y", "@scope/server"] -- not bare npx. Skipping the cmd /c wrapper produces ENOENT on the npx.cmd shim. Every server you add has to remember this.
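The per-client differences above are small enough to capture in a few lines. The following is a minimal sketch, not any client's own tooling: the function names are made up for illustration, and the client identifiers are borrowed from the installer's argument list later in this post.

```python
# Top-level key per client, as described above: VS Code uses "servers",
# the other three use "mcpServers".
TOP_LEVEL_KEY = {
    "claude-code": "mcpServers",
    "claude-desktop": "mcpServers",
    "cursor": "mcpServers",
    "vscode": "servers",
}


def npx_entry(package, env=None, windows=False):
    """Build one server entry that launches an npm package via npx."""
    if windows:
        # Bare npx produces ENOENT on Windows, so go through cmd /c.
        command, args = "cmd", ["/c", "npx", "-y", package]
    else:
        command, args = "npx", ["-y", package]
    entry = {"command": command, "args": args}
    if env:
        # In the standard setup these values sit on disk in plaintext.
        entry["env"] = dict(env)
    return entry


def client_config(client, name, entry):
    """Wrap a server entry in the JSON shape the given client expects."""
    return {TOP_LEVEL_KEY[client]: {name: entry}}
```

Calling client_config("vscode", "example", npx_entry("@scope/server", windows=True)) yields a servers block with the cmd /c wrapper; the same entry for Cursor lands under mcpServers instead.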
Any one of these is mild. The combination is where the friction lives.
If you use Claude Code on your laptop and Cursor at your desk and Claude Desktop for casual chats and VS Code at work, the same GitHub MCP server has to be added to four config files. Add a new server and that's four new edits. Rotate a Slack token and that's four files to touch. Swap a project from a desktop to a laptop and you re-do the entire install.
Standard MCP setup puts API tokens directly in the env block of the config JSON. That means your GitHub token, your Stripe key, your Slack bot token, your Linear API key all live in plaintext in files that get backed up, occasionally end up in dotfiles repos, and accumulate in shell history when you grep for them. Revoking a key when an MCP server gets compromised means hunting it down across every config file you have and rotating each one independently.
Context-window overhead is the biggest cost and the one most users don't see directly. The MCP spec is "declare all your tools up front, then call them." A standard client setup with 30 MCP servers installed loads every tool from every server into context at startup. That is hundreds of tool descriptions, all costing tokens on every turn, regardless of whether the model uses any of them.
The spec does have a tools/list_changed notification that lets a server tell the client "my tool list changed, refetch it." But standard clients don't have a coordinated way to actually use that for lazy loading -- it's a hook, not a workflow. Nothing in the vanilla setup picks which servers to load based on the task at hand.
You install an MCP server by copying a one-liner from someone's README. Once it's in your config, it runs as your user account with your environment variables every time your client starts. There's no compliance check, no spec-conformance grade, no "this server's behavior has been tested" signal -- just a name in a JSON file.
Yaw MCP installs as one MCP server into each client. From the client's view it has about ten meta-tools: mcp_connect_dispatch, mcp_connect_discover, mcp_connect_activate, mcp_connect_deactivate, mcp_connect_install, mcp_connect_import, mcp_connect_health, mcp_connect_suggest, mcp_connect_read_tool, mcp_connect_exec, mcp_connect_bundles. That's the up-front context cost: roughly ten tools, no matter how many real servers you have installed.
The underlying servers' tools come into context only when you activate them. Lazy loading uses the same tools/list_changed notification the protocol defines -- Yaw MCP pushes new tools into the client's tool list when a server activates and removes them when it deactivates. Same primitive every spec-compliant client already supports; Yaw MCP just gives it a workflow.
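To make the mechanism concrete, here is a toy model of a tool surface that grows and shrinks as servers activate. It is a sketch under assumptions, not Yaw MCP's implementation: the class, its fields, and the sample data are hypothetical. Only the meta-tool names and the notifications/tools/list_changed method name come from this post and the MCP spec.

```python
META_TOOLS = [
    "mcp_connect_dispatch", "mcp_connect_discover", "mcp_connect_activate",
    "mcp_connect_deactivate", "mcp_connect_install", "mcp_connect_import",
    "mcp_connect_health", "mcp_connect_suggest", "mcp_connect_read_tool",
    "mcp_connect_exec", "mcp_connect_bundles",
]


class LazyToolSurface:
    """Toy model: meta-tools are always visible, server tools only when active."""

    def __init__(self, installed):
        # installed: namespace -> list of bare tool names (hypothetical data)
        self.installed = installed
        self.active = set()
        self.notifications = []  # stand-in for messages sent to the client

    def visible_tools(self):
        tools = list(META_TOOLS)
        for ns in sorted(self.active):
            # Prefix every tool with its namespace, e.g. gh_create_issue.
            tools += [f"{ns}_{name}" for name in self.installed[ns]]
        return tools

    def _notify(self):
        # Tell the client to refetch the tool list.
        self.notifications.append(
            {"jsonrpc": "2.0", "method": "notifications/tools/list_changed"}
        )

    def activate(self, ns):
        if ns in self.installed and ns not in self.active:
            self.active.add(ns)
            self._notify()

    def deactivate(self, ns):
        if ns in self.active:
            self.active.remove(ns)
            self._notify()
```

With two servers installed and none active, visible_tools() returns only the meta-tools; activating one adds just that server's prefixed tools and queues a single notification.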
You don't have to know which server you want. mcp_connect_dispatch takes a plain-English intent, ranks every installed server against it, loads the top match, and exposes that server's tools so the LLM can call them in the same turn.
> Create a GitHub issue for the login bug
[mcp_connect_dispatch is called with intent="create a GitHub issue for the login bug"]
Dispatched "create a GitHub issue for the login bug" -- loaded top 1 of 1 matching server.
gh (score 4.32): Loaded "gh" -- 24 tools: gh_create_issue, gh_list_prs, ...
[gh_create_issue is then called, returns the new issue]
Ranking is two-stage when the backend has a Voyage embeddings key configured: a local BM25 pass narrows to a shortlist, then a semantic rerank reorders. With no key it falls back to BM25 only -- dispatch keeps working with slightly weaker ranking on ambiguous queries.
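The two-stage shape can be sketched in a few lines. This is an illustrative Okapi BM25 with an optional rerank hook, assuming each server is described by a short text; the tokenizer, the k1 and b defaults, the shortlist size, and the rerank callable are all assumptions rather than Yaw MCP's actual parameters.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document (name -> description text) with Okapi BM25."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n if n else 0.0
    doc_freq = Counter()
    for toks in tokens.values():
        doc_freq.update(set(toks))
    query_terms = set(tokenize(query))
    scores = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[name] = score
    return scores


def rank(query, docs, shortlist=5, rerank=None):
    """Stage 1: BM25 shortlist. Stage 2: optional rerank of that shortlist."""
    scores = bm25_scores(query, docs)
    matched = [name for name in scores if scores[name] > 0]
    ranked = sorted(matched, key=lambda name: -scores[name])[:shortlist]
    if rerank is not None:
        # rerank is a caller-supplied callable (e.g. an embeddings-based scorer).
        ranked = rerank(query, ranked)
    return ranked
```

Passing rerank=None models the no-key fallback: the BM25 order is returned unchanged.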
Three client-side signals layer on top of the ranker. Health-aware: servers that recently failed to load or have high error rates get down-ranked. Learning: servers that succeeded before get a small (+10% max) nudge, persisted across restarts. Sampling tiebreak: when the top two candidates are within 10% of each other and your client supports MCP sampling, Yaw MCP asks your client's own LLM to pick -- no extra provider key, no extra cost.
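A rough sketch of how those signals might combine, assuming multiplicative adjustments. The 10% bonus cap and the 10% tiebreak margin come from the description above; the health penalty formula and the number of successes needed to earn the full bonus are invented for illustration.

```python
def adjust(base_scores, error_rate, successes,
           max_bonus=0.10, successes_for_full_bonus=10):
    """Apply health down-ranking and a capped learning bonus to ranker scores."""
    adjusted = {}
    for ns, score in base_scores.items():
        # Hypothetical penalty: scale the score by (1 - error rate).
        rate = min(max(error_rate.get(ns, 0.0), 0.0), 1.0)
        health = 1.0 - rate
        # Learning nudge grows with past successes but never exceeds max_bonus.
        progress = min(successes.get(ns, 0) / successes_for_full_bonus, 1.0)
        adjusted[ns] = score * health * (1.0 + max_bonus * progress)
    return adjusted


def needs_tiebreak(adjusted, margin=0.10):
    """True when the top two candidates are within `margin` of each other."""
    top = sorted(adjusted.values(), reverse=True)[:2]
    return len(top) == 2 and top[0] > 0 and (top[0] - top[1]) / top[0] <= margin
```

When needs_tiebreak returns True and the client supports MCP sampling, the decision would be handed to the client's own LLM; otherwise the highest adjusted score wins.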
mcp_connect_discover lists every installed server, optionally ranked by relevance to a context string. mcp_connect_activate loads specific servers by namespace. mcp_connect_deactivate unloads them. Servers also auto-unload after about ten tool calls to other servers, with an adaptive cap per namespace -- bursty servers get more patience, idle ones unload at the baseline.
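The auto-unload rule can be modeled as a per-namespace idle counter. The sketch below assumes a fixed baseline of ten and leaves the adaptive cap as a stub, since the post doesn't specify how the adaptation works; the class and method names are hypothetical.

```python
class IdleTracker:
    """Unload a server after `baseline` tool calls that went to other servers."""

    def __init__(self, baseline=10):
        self.baseline = baseline
        self.idle = {}  # namespace -> calls to other servers since last use

    def activate(self, ns):
        self.idle[ns] = 0

    def cap(self, ns):
        # Placeholder: the adaptive per-namespace cap is not modeled here.
        return self.baseline

    def record_call(self, ns):
        """Record a tool call to `ns`; return the namespaces that should unload."""
        unloaded = []
        for other in list(self.idle):
            if other == ns:
                self.idle[other] = 0  # using a server resets its counter
            else:
                self.idle[other] += 1
                if self.idle[other] >= self.cap(other):
                    del self.idle[other]
                    unloaded.append(other)
        return unloaded
```

A server that keeps getting called never unloads, while one that sits idle through ten calls elsewhere is dropped on the tenth.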
mcp_connect_read_tool returns a single tool's schema and docs without activating its whole server. When the model only needs a couple of tools from a big server, this reads one or two schemas instead of dumping a whole catalog into context.
mcp_connect_exec runs a short pipeline of tool calls in one round-trip. Steps reference earlier outputs by id; capped at sixteen steps; no eval -- only dot/bracket path resolution. Useful for "get the latest PR, post a summary to Slack, file a Linear ticket" sequences the LLM would otherwise drive turn by turn.
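Here is a minimal sketch of that idea: a pipeline runner that resolves references with a small dot/bracket path walker instead of eval. The "$stepId.path" reference syntax, the step dictionary shape, and the call_tool callable are all assumptions made for the example; only the sixteen-step cap and the no-eval rule come from the description above.

```python
import re

MAX_STEPS = 16
_SEGMENT = re.compile(r"\.?([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")
_REF = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)(.*)")


def resolve_path(value, path):
    """Walk segments like 'items[0].title' through dicts and lists, no eval."""
    pos = 0
    while pos < len(path):
        match = _SEGMENT.match(path, pos)
        if match is None:
            raise ValueError(f"bad path segment at {path[pos:]!r}")
        key, index = match.group(1), match.group(2)
        value = value[key] if key is not None else value[int(index)]
        pos = match.end()
    return value


def resolve_args(args, outputs):
    """Replace '$id.path' strings with values from earlier step outputs."""
    resolved = {}
    for name, arg in args.items():
        ref = _REF.fullmatch(arg) if isinstance(arg, str) else None
        if ref is not None:
            step_id, rest = ref.groups()
            resolved[name] = resolve_path(outputs[step_id], rest)
        else:
            resolved[name] = arg
    return resolved


def run_pipeline(steps, call_tool):
    """Run steps in order; call_tool(tool_name, args) is supplied by the caller."""
    if len(steps) > MAX_STEPS:
        raise ValueError(f"pipeline exceeds {MAX_STEPS} steps")
    outputs = {}
    for step in steps:
        args = resolve_args(step.get("args", {}), outputs)
        outputs[step["id"]] = call_tool(step["tool"], args)
    return outputs
```

Because the walker only indexes into dicts and lists, a reference can read earlier outputs but cannot execute anything.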
mcp_connect_bundles lists and matches curated presets like "DevOps incident," "PR review," "growth stack," "data ops." Pair with activate to load a whole bundle in one call.
mcp_connect_suggest surfaces recurring multi-server workflows from persisted history. If you repeatedly use gh then linear then slack for the same kind of task, suggest lists the pattern with a ready-to-run activate call. Set YAW_MCP_AUTO_LOAD=1 and the top pack pre-activates at startup with no LLM round-trip.
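One simple way to surface such patterns is to count repeated server sequences in the persisted history. The sketch below assumes history is stored as one ordered list of namespaces per session and that a pattern must recur at least twice; both are illustrative guesses, not the actual storage format or threshold.

```python
from collections import Counter


def recurring_workflows(sessions, min_count=2):
    """Return multi-server sequences seen at least `min_count` times."""
    counts = Counter(
        tuple(session) for session in sessions if len(set(session)) > 1
    )
    return [
        (list(sequence), seen)
        for sequence, seen in counts.most_common()
        if seen >= min_count
    ]
```

Given three sessions that each used gh, then linear, then slack, the function returns that sequence with a count of three, which is enough to build an activate call for all three namespaces.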
You add servers on the yaw.sh/mcp dashboard. Name, namespace, command, args, env vars. Yaw MCP pulls the config at startup and re-polls every 60 seconds. Add a server on the dashboard from your phone and it shows up in the next poll on every machine you're signed in on. No restart, no JSON edit, no per-client copy.
Every install of Yaw MCP reads the same account's server list. The same token gives you the same servers across every machine. Install Yaw MCP on a second laptop with the same mcp_pat_ and within 60 seconds it sees the same GitHub / Slack / Stripe servers you configured from the first.
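The sync loop amounts to "fetch, diff, apply, wait." A minimal sketch follows, assuming the server list arrives as a mapping from namespace to config; fetch and on_change are hypothetical caller-supplied callables, and the real backend API is not shown.

```python
import time


def diff_servers(previous, current):
    """Compare two snapshots (namespace -> config dict)."""
    shared = set(previous) & set(current)
    return {
        "added": sorted(set(current) - set(previous)),
        "removed": sorted(set(previous) - set(current)),
        "changed": sorted(ns for ns in shared if previous[ns] != current[ns]),
    }


def poll(fetch, on_change, interval=60, rounds=None, sleep=time.sleep):
    """Call fetch() every `interval` seconds and report differences.

    rounds=None polls forever; a finite value is handy for testing.
    """
    previous = {}
    done = 0
    while rounds is None or done < rounds:
        current = fetch()
        delta = diff_servers(previous, current)
        if any(delta.values()):
            on_change(delta)
        previous = current
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)
    return previous
```

Because each install diffs against its own last snapshot, a server added on the dashboard shows up as an "added" entry on every machine at its next poll.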
API tokens you paste into a server's env block on the dashboard are encrypted on the backend and injected at spawn time. They don't sit in a committed .env file or a client config JSON, and they're never logged. Rotate a credential in one place; every machine picks up the new value on the next poll. Revoke the yaw.sh/mcp token and every install loses access immediately.
The @yawlabs/mcp-compliance suite runs 88 behavioral tests against an MCP server and reports a letter grade. (We wrote up the methodology in Grading MCP Servers A to F: 88 Tests Against the Spec.) Grades surface inline in mcp_connect_discover output -- you see github -- GitHub [ready] [A] before you activate.
Set YAW_MCP_MIN_COMPLIANCE=B (or any grade) and mcp_connect_activate refuses to load anything below the floor, with a refusal message that spells out the grade and the env var to unset. Ungraded servers always pass (don't punish unknown) -- audit them yourself with yaw-mcp compliance <target> before relying on them.
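The floor check itself is a small comparison. This sketch assumes the grade scale is A, B, C, D, F and that ungraded servers are represented as None; the function names and the refusal wording are illustrative, not the tool's actual output.

```python
import os

GRADES = ["A", "B", "C", "D", "F"]  # best to worst (assumed scale)


def may_activate(grade, floor):
    """Ungraded servers (grade=None) always pass; so does everything with no floor."""
    if floor is None or grade is None:
        return True
    return GRADES.index(grade) <= GRADES.index(floor)


def check_activation(namespace, grade, env=None):
    """Return (allowed, message), reading the floor from the environment."""
    env = os.environ if env is None else env
    floor = env.get("YAW_MCP_MIN_COMPLIANCE") or None
    if may_activate(grade, floor):
        return True, f"{namespace}: ok"
    return False, (
        f"{namespace}: grade {grade} is below the floor {floor}; "
        "unset YAW_MCP_MIN_COMPLIANCE to allow it"
    )
```

With the floor set to B, a C-graded server is refused while A, B, and ungraded servers load as usual.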
The easiest form of cross-server prompt injection is "stuff a giant payload into a tool reply to swamp the model's context." YAW_MCP_PRUNE_RESPONSES (on by default) redacts large file-blob-shaped content before it reaches your LLM. Side effect: less accidental token burn. Set to 0 to disable.
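As an illustration of the general idea, the sketch below walks a JSON-like tool response and replaces long base64-looking strings with a short placeholder. The size threshold, the base64 heuristic, and the placeholder text are assumptions; the post doesn't document how Yaw MCP decides what counts as blob-shaped.

```python
import re

_BLOB = re.compile(r"[A-Za-z0-9+/=]+")


def prune(value, max_chars=2048):
    """Redact long blob-like strings anywhere inside a nested response."""
    if isinstance(value, dict):
        return {key: prune(item, max_chars) for key, item in value.items()}
    if isinstance(value, list):
        return [prune(item, max_chars) for item in value]
    if (
        isinstance(value, str)
        and len(value) > max_chars
        and _BLOB.fullmatch(value)
    ):
        return f"[pruned {len(value)} chars]"
    return value
```

Ordinary prose passes through untouched because it contains spaces and punctuation, while an oversized encoded payload collapses to a few tokens.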
Tools are namespace-prefixed (gh_create_issue, never bare create_issue), so a server can't impersonate tools from another server it has no business touching.
yaw-mcp doctor prints the loaded config files, your token's source + fingerprint (last 4 chars), the API base URL, installed clients, env overrides, persisted learning state, flaky-namespace reliability rollup, shell-history "shadow" hits (CLIs you run that an MCP server could replace), and an upgrade check against the npm registry. Paste the output into a support ticket; --json for the same data in a structured snapshot.
| Concern | Standard setup | Yaw MCP |
|---|---|---|
| Add a server to 4 clients | Edit 4 JSON files | 1 dashboard click |
| Add a server to 3 machines | Edit 12 JSON files | 0 edits (auto-syncs) |
| Tool surface at session start (30 servers installed) | Hundreds of tools | ~10 meta-tools |
| Credential storage | Plaintext in env blocks | Encrypted on backend |
| Rotate a credential | Edit N files, restart N clients | Edit on dashboard, <=60s |
| Revoke access immediately | Find every file, edit each | Revoke token, every install stops |
| Trust signal before activation | None | Compliance grade (A-F) |
| Pick the right server for a task | You | dispatch (ranked + learned) |
| Multi-server workflows | Manual sequence | exec pipeline, bundles, suggest |
| Windows cmd /c npx wrapper | Every server entry, every config | One entry, applied by the installer |
One client on one machine with a handful of servers? Hand-editing mcp.json is fine. There's no reason to add a layer.
Yaw MCP's value shows up when that setup stops scaling: multiple clients, multiple machines, more than a few servers, credentials you'd rather not have sitting on disk, a context window you don't want crowded with tool descriptions the model won't use this turn.
One command per client:
npx -y @yawlabs/mcp@latest install <claude-code|claude-desktop|cursor|vscode> --token mcp_pat_your_token_here
This edits the chosen client's config file (correct path for your OS, correct JSON shape), writes your token to ~/.yaw-mcp/config.json so every other client you install picks it up automatically, and on Windows wraps npx in cmd /c for you.
Or install into every detected client at once:
yaw-mcp install --all --token mcp_pat_...
Or run install --list first (read-only, no token needed) to see what's on the machine before you touch anything.
Get a token at yaw.sh/mcp -- Settings, Tokens. Free for personal use; multi-device sync, encrypted credential storage, and bundle access are the Pro features.
The architecture you already knew is intact. The dispatch / discover / activate primitive that keeps tool surface out of your context, the cloud-managed server list, the encrypted credentials, the compliance grading -- all the same. What's new is the name and the home: Yaw MCP, under the Yaw Labs umbrella, alongside the rest of the developer-tools line we ship.
Published by Yaw Labs.