A language model is a text-in, text-out system. By itself, it cannot run a SQL query, restart a service, or take a snapshot. It produces text describing what it would do, and a downstream system decides whether to actually do it. Most of the safety work in a production-facing AI agent lives in that decide-whether-to-do-it layer.

MCP, in one paragraph

Anthropic's Model Context Protocol, released in late 2024, is the standard answer to the "how does the LLM call tools" problem. The server declares the tools it exposes. The model picks one and emits a structured call. The MCP server executes the call, and the result returns as structured data the model can reason about. Any model that speaks MCP can use any MCP server's tools without custom integration code.
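
For concreteness, here is roughly what the server side of that looks like. This is a minimal sketch using the official Python MCP SDK's FastMCP helper; the server name, the tool, and the stubbed result are illustrative, not taken from any real deployment.

    # Sketch: an MCP server exposing one diagnostic tool.
    # Assumes the official Python MCP SDK (pip install mcp);
    # the tool name and stubbed result are illustrative.
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("sql-diagnostics")

    @mcp.tool()
    def top_waits(limit: int = 10) -> list[dict]:
        """Return the top wait types from sys.dm_os_wait_stats."""
        # A real implementation would query the DMV; stubbed here.
        return [{"wait_type": "PAGEIOLATCH_SH", "wait_ms": 1234}][:limit]

    if __name__ == "__main__":
        mcp.run()  # serves tool discovery and invocation over stdio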

This sounds like plumbing. It matters for one specific reason: before MCP, every agent had its own bespoke tool-call parser, and every parser broke in a different way every time you upgraded the model. Some models wrapped tool calls in markdown code fences. Some put them in prose paragraphs. Some added a trailing comma to the JSON. Maintaining a regex-based parser per model was sixty to a hundred lines of fragile code that needed re-testing on every upgrade.

Replacing that with MCP is a one-weekend job. The bespoke parsing goes away. Model upgrades stop touching the integration layer.

Three layers, explicit handoffs

The architecture pattern that holds up in production is sensor / reasoning / actuator, with explicit boundaries between the three.

  • Sensors observe. They run probes that query DMVs, scrape network telemetry, read log streams. They return structured data. They do not call the model, and they do not take action. Their job is to produce a snapshot.
  • Reasoning interprets. The reasoning layer takes the snapshot, builds a prompt, sends it to the LLM (local, in regulated environments), and parses the response. The output is a structured finding — severity, hypothesis, recommended action, confidence score. The reasoning layer never executes anything by itself either. It produces a recommendation.
  • Actuators act. The actuator layer is the only place that mutates state. It is also where the gates live: confidence checks, blast-radius rules, snapshot-before-apply enforcement. (A sketch of the three layers follows this list.)
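
The boundaries can be made literal in code. A minimal sketch, with illustrative names; the DiagnosticSnapshot dataclass is named later in this piece, everything else here is assumption:

    from dataclasses import dataclass
    from enum import Enum

    class Severity(Enum):
        LOW = 1
        MEDIUM = 2
        HIGH = 3
        CRITICAL = 4

    @dataclass(frozen=True)
    class DiagnosticSnapshot:
        """Produced by sensors. Read-only observations, no actions."""
        probe_results: dict
        captured_at: str

    @dataclass(frozen=True)
    class Finding:
        """Produced by reasoning. Still just data, not an action."""
        severity: Severity
        hypothesis: str
        recommended_action: str
        confidence: float  # 0.0 to 1.0, an assumed representation

    def sense() -> DiagnosticSnapshot: ...             # queries probes, never mutates
    def reason(s: DiagnosticSnapshot) -> Finding: ...  # calls the LLM, never mutates
    def act(f: Finding) -> None: ...                   # the only layer that mutates state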

The handoffs are the value. A finding moves from sensor → reasoning → actuator only after passing the gates appropriate to that severity. A LOW-severity finding gets logged and stops. MEDIUM fires an alert. HIGH and CRITICAL proceed toward the actuator, where the next gate decides whether to ask a human or apply automatically.
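
In code, that routing is a small dispatch. This continues the sketch above, reusing Finding, Severity, and act; log and alert are stubbed for illustration:

    def log(f: Finding) -> None:
        print("logged:", f.hypothesis)        # stub: write to the finding log

    def alert(f: Finding) -> None:
        print("ALERT:", f.hypothesis)         # stub: notify a human

    def route(finding: Finding) -> None:
        """Severity gate between reasoning and actuator."""
        if finding.severity is Severity.LOW:
            log(finding)                      # record and stop
        elif finding.severity is Severity.MEDIUM:
            alert(finding)                    # fire an alert, take no action
        else:                                 # HIGH or CRITICAL
            act(finding)                      # the actuator's own gates come next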

The snapshot-before-apply rule is load-bearing

If there is one rule worth being inflexible about, it is this one: the actuator must verify that a snapshot was taken in the current remediation session before it executes. No snapshot, no apply.

This is enforced in code, not policy. The remediation tool refuses to run if no snapshot record appears in the current session log. There is no way to bypass it short of editing the source.
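
A sketch of what that enforcement can look like. The session-log shape and helper names are assumptions, not the actual implementation:

    class SnapshotRequired(RuntimeError):
        """Raised when a mutating call is attempted without a snapshot."""

    def execute(statement: str) -> None:
        print("applying:", statement)         # stub for the real remediation call

    def apply_remediation(session_log: list[dict], statement: str) -> None:
        """No snapshot record in the current session log, no apply."""
        if not any(entry.get("kind") == "snapshot" for entry in session_log):
            raise SnapshotRequired("no snapshot in current session; refusing to apply")
        execute(statement)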

This rule absorbs an entire class of failure modes. If the snapshot tool is down, no remediation runs. If the network is partitioned and the snapshot API is unreachable, no remediation runs. If the agent crashed mid-cycle and lost session state, the next cycle starts fresh and has to re-snapshot. The system fails toward "do nothing," which is the correct failure mode for autonomous remediation.

The confidence gate is the second load-bearing piece

Even past the snapshot check, the actuator does not auto-apply everything. The model returns a confidence score with its recommendation. Only HIGH-confidence findings proceed without human review. And even within HIGH-confidence, the auto-apply scope is narrow on purpose.

What is safe to automate today is a small list. ALTER INDEX ... REBUILD and UPDATE STATISTICS are reversible and well-understood, and they auto-apply overnight. Query rewrites, configuration changes, service restarts: those require a human to read the finding and approve it through the management interface.
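
One way to encode that gate, reusing the Finding dataclass from the earlier sketch. The confidence threshold and the exact patterns are illustrative assumptions:

    import re

    # Reversible, well-understood operations that may auto-apply overnight.
    AUTO_APPLY = (
        re.compile(r"ALTER INDEX .+ REBUILD", re.IGNORECASE),
        re.compile(r"UPDATE STATISTICS ", re.IGNORECASE),
    )

    CONFIDENCE_FLOOR = 0.9  # illustrative threshold

    def may_auto_apply(finding: Finding) -> bool:
        if finding.confidence < CONFIDENCE_FLOOR:
            return False                      # below the floor, a human reviews
        stmt = finding.recommended_action.strip()
        return any(p.match(stmt) for p in AUTO_APPLY)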

This narrows the scope of "self-healing" to a defensible claim. The system is not running an autonomous DBA. It is running an autonomous index-maintenance agent with a smart alarm bolted to the front for everything else. The phrase "self-healing infrastructure" is mostly marketing once you look at it closely; what actually ships is a tightly bounded slice of automated remediation surrounded by careful human approval gates.

Least privilege as a hard limit

The agent's database connection runs as a service account with VIEW SERVER STATE, VIEW DATABASE STATE, and write access to its own bookkeeping schema only. Not sysadmin. Not db_owner on production data.

This is not a compliance checkbox. It is the practical limit on what the agent can do in a worst-case scenario. If the model hallucinates a DROP TABLE, the database rejects the statement, because the service account lacks the permission to run it. If a tool call is malformed in a way that pattern-matches a privileged operation, the same privilege boundary catches it.
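
A startup self-check makes the boundary verifiable rather than assumed. This sketch uses pyodbc against SQL Server; the check itself is illustrative:

    import pyodbc

    def assert_least_privilege(conn_str: str) -> None:
        """Refuse to start if the service account is over-privileged."""
        conn = pyodbc.connect(conn_str)
        row = conn.cursor().execute(
            "SELECT IS_SRVROLEMEMBER('sysadmin')").fetchone()
        conn.close()
        if row[0] == 1:
            raise SystemExit("agent account is sysadmin; refusing to run")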

The blast radius of an autonomous system is bounded by the privileges of its credentials. Tighten the credentials, and the cost of a worst-case mistake stays survivable.

What this looks like in real code

The architecture is not exotic. The pieces are:

  • A probe runner on a 60-second cycle that returns a DiagnosticSnapshot dataclass.
  • A reasoning layer that serializes the snapshot to a prompt, calls the local model via MCP, and parses a structured response.
  • An actuator layer with a hard requirement that the current session log contain a snapshot record before any mutating call executes.
  • An authority matrix mapping operations to one of three tiers: auto, ask, deny. The default for unrecognized operations is ask. (A sketch follows this list.)
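
The authority matrix can be as plain as a lookup with a conservative default. The operation names here are illustrative assumptions:

    from enum import Enum

    class Tier(Enum):
        AUTO = "auto"
        ASK = "ask"
        DENY = "deny"

    AUTHORITY_MATRIX = {
        "index_rebuild":     Tier.AUTO,
        "update_statistics": Tier.AUTO,
        "query_rewrite":     Tier.ASK,
        "config_change":     Tier.ASK,
        "service_restart":   Tier.ASK,
        "drop_object":       Tier.DENY,
    }

    def tier_for(operation: str) -> Tier:
        # Unrecognized operations default to ask, never to auto.
        return AUTHORITY_MATRIX.get(operation, Tier.ASK)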

If you want the full reference implementation — the DMV probe SQL, the prompt structure, the actuator code, the authority matrix — Part 2 of The Birth of Bob walks through it chapter by chapter. The architecture in the book is not the first design. It is the third one. The first did not have a rollback mechanism. The second had a rollback mechanism but no confidence gate on the actuator. Each iteration was motivated by something going wrong in a way the previous design could not handle gracefully.