The practitioner's guide to shipping semantic search after the v0. Twelve chapters on hybrid retrieval, eval discipline, drift, and re-embedding - the work nobody warned you about. PDF + EPUB + free updates. $39.
A 4:17pm Slack message on a Friday. A product manager at a retail customer had spent her afternoon doing what the eng team kept promising to automate: she pulled 200 queries from last week's logs, opened the live site in one tab and a staging build in another, and scored relevance side-by-side. Recall@10 on brand-and-size queries -- "Patagonia R1 men's medium" -- had dropped from 94% in September to 71% on the live system. The eval rig that should have caught it was a 40-query golden set from launch, and none of its queries were brand-and-size. The system had been silently hurting paying customers for two weeks before a non-engineer noticed.
That's the silent-drift quarter, and it's the work this book is about. The eval rig that catches a regression on a Tuesday morning instead of waiting for a Friday-afternoon spreadsheet. The hybrid retrieval that you tune until "tuned" turns out to be a moving target. The re-embedding plan that exists before the model gets upgraded.
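The gap in that story, a golden set with no brand-and-size queries, is easier to spot when recall is reported per query segment rather than as one number. What follows is a minimal sketch, not the book's eval rig: the function names, the segment labels, and the `(segment, retrieved_ids, relevant_ids)` row shape are all assumptions made for illustration.

```python
from collections import defaultdict


def recall_at_k(retrieved, relevant, k=10):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


def recall_by_segment(rows, k=10):
    """Mean recall@k per segment.

    rows: iterable of (segment, retrieved_ids, relevant_ids) tuples.
    The segment labels are hypothetical; use whatever taxonomy your logs support.
    """
    scores = defaultdict(list)
    for segment, retrieved, relevant in rows:
        scores[segment].append(recall_at_k(retrieved, relevant, k))
    return {segment: sum(vals) / len(vals) for segment, vals in scores.items()}


def uncovered_segments(golden_segments, live_segments):
    """Segments seen in live traffic that the golden set never exercises."""
    return sorted(set(live_segments) - set(golden_segments))
```

Run against a sample of recent query logs, `uncovered_segments` would flag a segment like brand-and-size as missing from the golden set before anyone has to build the Friday-afternoon spreadsheet.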
Twelve chapters on the discipline that separates a search system that gets better over time from one that quietly degrades. Discipline-first, opinionated, and built on war stories.
PDF + EPUB. Free updates as the field moves. Secure checkout. Read Chapter 1 free.
Every one of these has happened on a real production system -- mine, a friend's, or a team I've audited. The book tells you what to design for and what alarm to wire before the next one.
ef_search tuned for a snapshot, never revisited. p95 doubles, you can fix it, but you have to know to look. Chapter 5.
You've built backend services. You know what a 99th-percentile latency violation looks like in your monitoring. You've stood up at least one search system - maybe a v0 over pgvector, maybe a Pinecone-backed feature, maybe a product-search rewrite at an e-commerce shop. You've seen "users are saying search is bad" land in your inbox. You're somewhere between mid and senior on the IC ladder, or a tech lead who needs to make build-vs-buy calls about retrieval infrastructure.
You don't need a deep theoretical grounding in embeddings. The book assumes you understand vectors-as-points-in-space at the level of "I've put some in a database and queried by cosine similarity." If you're at or past that, you're at the right starting line.
Not for: introductions to embeddings (good intros exist on the open web - start there), vector-DB feature comparison spreadsheets (they rot too fast to be useful), or research-frontier surveys (the leading edge isn't yet boring enough to bet on in production).
Semantic Search in Production is Volume III of the Yaw Labs Production Series. Volume I, MCP in Production, is about shipping Model Context Protocol servers - the protocol-and-server perspective. Volume II, Claude Code in Production, is about running the agentic harness that calls those servers - the operator's perspective. Volume III is about the substrate the agent reaches into when it needs to find something: retrieval as a discipline. Volume IV, A2A in Production (early access), is what happens when one agent becomes a fleet - federation, auth across hops, and partial failure.
YawLabs/semantic-search-in-production-companion repo - starter code, exercises, and worked solutions at module-N-final tags for each hands-on chapter.
You don't need to read the other volumes first. Each volume in the Production Series stands on its own. Volume III is about retrieval; if you're not building MCP servers or running Claude Code, you don't need the other two to make sense of this one.
Updates ship as the field moves. Chapter 2's specific model recommendations are pinned to the writing date and noted as such; the principles for picking a model carry across generations. When voyage-4 or text-embedding-4 ships and changes the cost-quality picture meaningfully, the chapter gets a revision and you get the update.
There is no print edition yet. If one happens, it will be later. The digital version is the canonical living one and keeps getting updates either way.
You enter your GitHub username at checkout. The order webhook fires an invite to that user, adding you as a collaborator to the private companion repo. You should see an email from GitHub with the accept-invitation link within a few minutes. If you don't get the invite within an hour, email contact@yaw.sh with the order ID and the GitHub username you want invited.
Buy Semantic Search in Production
Twelve chapters. PDF + EPUB. Free updates. $39 one-time, secure checkout.
Companion volumes: MCP in Production, Claude Code in Production, and A2A in Production. Together they cover the agentic-tooling stack: protocol, operator, retrieval substrate, and the multi-agent fleet.