MCP and the Context Engine: Giving AI Agents Native Access to Codebase Knowledge

Previous posts in this series covered how our context engine works: hybrid search combining keyword matching with semantic similarity, a persistent server that keeps the ML model hot, and a finalize phase that captures what agents learn back into the knowledge base. All of that was already working. But agents still accessed it the hard way — by constructing shell commands, spawning processes, and parsing text output from stdout.

This post is about removing that friction using the Model Context Protocol. MCP turned our context engine from a CLI tool into a native capability — agents now search codebase knowledge the same way they read files or run grep, as a built-in tool call with typed parameters and structured results.

The Problem: Shell Commands as a Knowledge Interface

When an AI agent needs to understand how a system works before modifying it, its first step is to query the knowledge base. Before MCP, that meant constructing a bash command with the right binary name and flags, handling shell escaping for the query string, paying process-spawning overhead on every single query, and parsing unstructured text output back into something the agent could reason about.

The search itself was fast — the persistent context server keeps everything in memory. But the shell layer added friction at every step. Worse, the agent had to know the exact command syntax. If it forgot a flag or misquoted a query, the search failed silently or returned garbage. This is the kind of mechanical overhead that makes agents less likely to use a tool even when they should.

What MCP Is

The Model Context Protocol is an open standard that lets AI coding agents discover and call external tools as native function calls. Instead of shelling out, the agent's runtime starts an MCP server as a child process and communicates with it over stdin/stdout using JSON-RPC. The server declares what tools it offers — with names, descriptions, and input schemas — and the agent can invoke them exactly like its built-in tools.

The protocol is deliberately minimal. A handshake, a tool discovery call, and tool invocations. No HTTP servers to configure, no authentication, no service discovery. Just structured JSON on pipes. This simplicity is the point — it means you can wrap almost any existing capability as an MCP server with very little code.
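To make the minimalism concrete, here is a sketch of the three message types a session needs, expressed as JSON-RPC 2.0 payloads. The method names follow the MCP specification; the protocol version string is one published revision of the spec, and the `context_browse` tool name and its argument shape are taken from this post's context engine rather than anything the protocol mandates.

```python
import json

# 1. Handshake: client introduces itself and negotiates a protocol version.
initialize = {
    "jsonrpc": "2.0", "id": 1, "method": "initialize",
    "params": {"protocolVersion": "2024-11-05",
               "clientInfo": {"name": "agent", "version": "1.0"},
               "capabilities": {}},
}

# 2. Tool discovery: server replies with names, descriptions, input schemas.
list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}

# 3. Tool invocation: exactly like calling a built-in tool.
call_tool = {
    "jsonrpc": "2.0", "id": 3, "method": "tools/call",
    "params": {"name": "context_browse",
               "arguments": {"query": "how does the physics sync work"}},
}

# Over the stdio transport, each message travels as one line of JSON on the
# pipe — no HTTP, no framing headers beyond the newline delimiter.
wire_lines = [json.dumps(msg) for msg in (initialize, list_tools, call_tool)]
```

That is the entire vocabulary an adapter has to speak, which is why wrapping an existing capability takes so little code.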

The Architecture: A Thin Adapter

Our MCP integration follows a simple layered design. The MCP adapter is a lightweight process that the agent's runtime spawns. It speaks the MCP protocol on one side and forwards requests to the existing context server on the other. It has no search logic, no ML dependencies, and no state of its own. It is purely a protocol translator.

This means all the intelligence stays where it belongs — in the context server, which owns the embedding model, the search index, and the hybrid retrieval pipeline. The adapter just bridges the gap between the agent's tool-calling interface and the server's search API. Adding MCP did not require changing anything about how search works. It only changed how agents reach it.
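The whole adapter can be sketched in a few dozen lines. This is an illustrative skeleton, not the actual implementation: the `query_context_server` transport is left abstract because the post does not specify how the adapter reaches the server, and the tool declaration shown is a trimmed placeholder.

```python
import json
import sys

# Static tool declarations the adapter advertises (trimmed to one entry here).
TOOLS = [
    {"name": "context_browse",
     "description": "Search codebase knowledge before grepping.",
     "inputSchema": {"type": "object",
                     "properties": {"query": {"type": "string"}},
                     "required": ["query"]}},
]

def query_context_server(tool: str, args: dict) -> dict:
    # Forward to the long-running context server, which owns the model and
    # index. The transport (local socket, HTTP, etc.) is an implementation
    # detail the adapter hides.
    raise NotImplementedError

def handle(msg: dict, forward=query_context_server) -> dict:
    # Pure protocol translation: no search logic, no ML, no state.
    if msg["method"] == "tools/list":
        result = {"tools": TOOLS}
    elif msg["method"] == "tools/call":
        result = forward(msg["params"]["name"], msg["params"]["arguments"])
    else:
        result = {}
    return {"jsonrpc": "2.0", "id": msg["id"], "result": result}

def main() -> None:
    # One newline-delimited JSON-RPC message per line on stdin/stdout.
    for line in sys.stdin:
        sys.stdout.write(json.dumps(handle(json.loads(line))) + "\n")
        sys.stdout.flush()
```

Because `handle` is a pure function from request to response, the adapter can be tested without the agent runtime or the context server in the loop.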

Three Tools for Knowledge Discovery

The MCP server exposes three tools, each designed for a specific step in the knowledge discovery workflow.

  • Browse — The discovery step. Takes a natural language query and returns ranked section headers — document names, section titles, relevance scores, and trigger phrases. No full content. This keeps the response compact so the agent can scan many results without blowing its context window.
  • Fetch — The deep-read step. Takes trigger phrases from browse results and returns full section content plus one hop of related context from the document graph. This two-step pattern — browse to find, fetch to read — keeps the knowledge payload tight and targeted.
  • Rebuild — Re-indexes the knowledge base after new documents are added or existing ones are updated. The search index refreshes in seconds, so new knowledge is immediately available to the next query.

The two-step browse-then-fetch pattern is intentional. An agent searching for "how does the physics sync work" might get fifteen matching sections from browse. Instead of dumping all that content into its context window, it reads the titles and scores, picks the two or three most relevant hits, and fetches only those. This is how a human developer uses search — scan the results, then click into the ones that matter.
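The agent-side workflow can be sketched as a short function. The result fields (`score`, `trigger_phrases`) mirror what the post says browse returns, but the exact schema and the `browse`/`fetch` callables here are assumptions for illustration.

```python
def browse_then_fetch(browse, fetch, query, top_k=3):
    # Step 1: browse returns compact headers — document names, titles,
    # scores, trigger phrases — never full content, so even fifteen hits
    # fit comfortably in the context window.
    hits = sorted(browse(query), key=lambda h: h["score"], reverse=True)

    # Step 2: deep-read only the few most relevant sections, using their
    # trigger phrases to sharpen the fetch.
    phrases = [p for h in hits[:top_k] for p in h["trigger_phrases"]]
    return fetch(phrases)
```

The `top_k` cutoff is where the scan-then-click behavior lives: the agent pays full-content cost only for the sections it has already judged relevant.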

Writing Tool Descriptions for LLMs

One of the more interesting aspects of building MCP tools is writing the descriptions. These are not documentation for humans — they are instructions for an AI agent deciding which tool to use and when.

Our browse tool description says "use this BEFORE grepping to discover where things live." The fetch description says "use trigger phrases from browse results to sharpen precision." These phrases encode workflow guidance directly into the tool discovery layer. The agent sees them as part of its available toolkit and incorporates the usage patterns into its decision-making.

This matters because tool selection is one of the hardest problems in agent systems. An agent with access to both grep and a knowledge engine will default to grep unless something steers it toward the knowledge engine first. The MCP tool descriptions are that steering mechanism — they explain not just what the tool does, but where it fits in the workflow.
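The steering lives in the `tools/list` response itself. The sketch below shows how the quoted guidance ships inside the tool declarations; the description strings are the post's own, while the `context_fetch` name and the exact schema shapes are assumptions.

```python
# Workflow guidance encoded at the tool discovery layer: the agent reads
# these descriptions when deciding which tool fits the task at hand.
TOOL_DECLARATIONS = [
    {
        "name": "context_browse",
        "description": ("Search the knowledge base for ranked section "
                        "headers. Use this BEFORE grepping to discover "
                        "where things live."),
        "inputSchema": {"type": "object",
                        "properties": {"query": {"type": "string"}},
                        "required": ["query"]},
    },
    {
        "name": "context_fetch",
        "description": ("Read full section content. Use trigger phrases "
                        "from browse results to sharpen precision."),
        "inputSchema": {"type": "object",
                        "properties": {"trigger_phrases": {
                            "type": "array",
                            "items": {"type": "string"}}},
                        "required": ["trigger_phrases"]},
    },
]
```

Note that each description answers two questions: what the tool does, and when to reach for it relative to the agent's other tools.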

Permissions and Zero-Friction Access

MCP tools integrate with the agent runtime's permission system. Each tool gets a permission entry that can be pre-approved, so the agent can browse and fetch knowledge without being prompted for approval on every query. This is critical — if the agent had to ask permission every time it wanted to search, it would fall back to grep out of convenience.

The goal is to make knowledge search as frictionless as reading a file. No prompts, no approval dialogs, no special invocation syntax. Just call the tool with a query and get results back. The lower the barrier, the more likely agents are to actually use it.

What Changed

  • Before — Agent constructs a shell command string, spawns a process, waits for it to exit, captures stdout, and parses unstructured text. Every query pays the process-spawning cost, and the agent must remember exact command syntax and handle shell escaping.
  • After — Agent calls context_browse with a query string. The MCP adapter, already running as a long-lived child process, forwards the request to the context server and returns structured results. No process spawning per query, no string parsing, no shell intermediary. The tool call is indistinguishable from calling any built-in tool.

Search quality is identical — both paths reach the same hybrid search engine. The difference is entirely in the interface. The CLI path had friction that discouraged use. The MCP path has none.

MCP as a Pattern for Custom Agent Tools

The broader takeaway is about MCP as a pattern for giving AI agents access to custom capabilities. If you have a service, a database, a search engine, or any tool that agents should use — MCP lets you expose it as a native tool call with almost no integration overhead.

The key design principle is to keep the MCP layer thin. All business logic belongs in the underlying service. The MCP server is just protocol translation — receive a tool call, forward it, return the result. This separation means you can iterate on your service independently, add MCP as a new access layer without refactoring anything, and test the underlying service without MCP in the loop.

For knowledge engines specifically, MCP solves a real problem. AI agents are much more likely to query a knowledge base when it appears as a native tool than when it requires constructing shell commands. The protocol overhead is minimal, but the behavioral impact on agent tool selection is significant.

Closing

This is the fifth post in a series on our context engine. The earlier posts covered hybrid search internals, the persistent inference server, and reasoning chain capture. This post covers the last piece: how search became a native tool instead of a shell command.

The MCP adapter is the thinnest layer in the stack. It does almost nothing. That is the point. The hard problems — search quality, model inference, knowledge capture — were already solved. MCP just removed the last bit of friction between the agent and the knowledge it needs. Sometimes the most impactful change is not making something faster or smarter, but making it easier to reach.