You inherit a database with no documentation, 200 tables, 400 stored procedures, and a senior DBA who left in March. The traditional onboarding takes two weeks. With a language model in the loop, it is two days. Here is the playbook I have used four times now.

Day 1, morning — the catalog

The first hour is mechanical. Pull the catalog.

-- Tables and their row counts
SELECT s.name AS schema_name, t.name, SUM(p.rows) AS rows
FROM sys.tables t
JOIN sys.partitions p ON p.object_id = t.object_id
JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE p.index_id IN (0, 1)
GROUP BY s.name, t.name
ORDER BY SUM(p.rows) DESC;

-- Stored procedures with their definitions
SELECT s.name AS schema_name, p.name, m.definition
FROM sys.sql_modules m
JOIN sys.procedures p ON m.object_id = p.object_id
JOIN sys.schemas s ON p.schema_id = s.schema_id;

-- Triggers
SELECT t.name AS table_name, tr.name AS trigger_name, m.definition
FROM sys.triggers tr
JOIN sys.sql_modules m ON tr.object_id = m.object_id
JOIN sys.tables t ON tr.parent_id = t.object_id;

-- Views
SELECT s.name AS schema_name, v.name, m.definition
FROM sys.views v
JOIN sys.sql_modules m ON v.object_id = m.object_id
JOIN sys.schemas s ON v.schema_id = s.schema_id;

Save these to JSON. You now have the database's surface area as data you can iterate over.
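A minimal dump script, assuming pyodbc and a read-only connection string (the function and file names here are my own, not a standard). The pure helper keeps the row-to-dict conversion testable without a live database:

```python
import json

# One entry per catalog query; abbreviated here, fill in the queries above.
CATALOG_QUERIES = {
    "tables": (
        "SELECT t.name, SUM(p.rows) AS rows FROM sys.tables t "
        "JOIN sys.partitions p ON p.object_id = t.object_id "
        "WHERE p.index_id IN (0, 1) GROUP BY t.name"
    ),
    # ... procedures, triggers, views
}

def rows_to_records(columns, rows):
    """Turn a cursor's column names and row tuples into JSON-friendly dicts."""
    return [dict(zip(columns, row)) for row in rows]

def dump_catalog(conn_str, out_path="catalog.json"):
    """Run each catalog query and save all results as one JSON file."""
    import pyodbc  # imported here so the pure helper works without the driver
    catalog = {}
    with pyodbc.connect(conn_str) as conn:
        cur = conn.cursor()
        for name, sql in CATALOG_QUERIES.items():
            cur.execute(sql)
            cols = [d[0] for d in cur.description]
            catalog[name] = rows_to_records(cols, cur.fetchall())
    with open(out_path, "w") as f:
        json.dump(catalog, f, indent=2, default=str)
```

One file, one shape, iterable from any script — that is all the rest of the playbook needs.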

Day 1, afternoon — the summaries

For each procedure, view, and trigger, send the definition to a local LLM with a tight prompt:

"Summarize this SQL object in three sentences. Then list: (1) tables read, (2) tables written, (3) any non-obvious side effects (cursors, dynamic SQL, calls to other procedures, calls to linked servers, anything that materially affects behavior beyond the obvious read/write)."

Save each summary keyed by object name. After lunch you have a directory of every callable thing in the database with a one-paragraph description and a structural fingerprint.
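The loop itself is a few lines. The LLM call is deliberately passed in as a callable, because local-model APIs vary (Ollama, llama.cpp server, LM Studio all differ); everything else in this sketch is mine, not a standard:

```python
import json
from pathlib import Path

PROMPT = (
    "Summarize this SQL object in three sentences. Then list: "
    "(1) tables read, (2) tables written, (3) any non-obvious side effects."
)

def summarize_catalog(objects, ask_llm, out_dir="summaries"):
    """objects: {object_name: definition}. ask_llm: callable(prompt) -> str.
    Writes one JSON file per object and returns the summaries keyed by name."""
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    summaries = {}
    for name, definition in objects.items():
        summaries[name] = ask_llm(f"{PROMPT}\n\n{definition}")
        (out / f"{name}.json").write_text(
            json.dumps({"object": name, "summary": summaries[name]}, indent=2)
        )
    return summaries
```

Keeping one file per object, keyed by name, is what makes the later days cheap: every subsequent step is a read-and-regroup over this directory.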

This part is unsexy and tedious, which is exactly why nobody does it manually. It is also the foundation of every other day in the playbook.

Day 2, morning — the dependency map

Build a dependency graph. For each procedure, you now have the list of tables it reads and writes. Invert the index: for each table, list the procedures that touch it. Save as a wiki page per table.
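The inversion is a dictionary flip. A sketch, assuming the day-1 summaries store reads and writes under those key names (my convention, adjust to yours):

```python
from collections import defaultdict

def invert_dependencies(proc_tables):
    """proc_tables: {proc: {"reads": [...], "writes": [...]}} — the structural
    fingerprint from day 1. Returns {table: {"read_by": [...], "written_by": [...]}}."""
    by_table = defaultdict(lambda: {"read_by": [], "written_by": []})
    for proc, deps in proc_tables.items():
        for table in deps.get("reads", []):
            by_table[table]["read_by"].append(proc)
        for table in deps.get("writes", []):
            by_table[table]["written_by"].append(proc)
    return dict(by_table)
```

Each entry in the result is the raw material for one wiki page.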

For complex tables — high row counts, write hot spots, or many touching procedures — drop the dependency list back into the LLM and ask:

"Looking at all the procedures that touch dbo.Orders, what business processes does this table participate in? Group them: (1) order creation, (2) fulfillment, (3) reporting, (4) anything else. For each group, name the procedures and describe what they do collectively."

The output is a one-page "what does this table do" summary that you would have produced manually after a month of working with the database. You have it in the second hour of day 2.

Day 2, afternoon — the danger map

Now you find what is structurally suspicious. This is the move that pays for the whole sprint.

Run the procedure summaries back through the LLM with a critical prompt:

"Among these procedures, identify any that do one or more of the following:
  • Use cursors for operations that should be set-based.
  • Build dynamic SQL from input without parameterization.
  • Use SELECT * in production code.
  • Apply NOLOCK hints on reads that run adjacent to writes.
  • Modify multiple tables without explicit transactions.
  • Have implicit conversions that would cause index misses.
  • Are over 1,000 lines (too big to change safely).
Group the findings by severity."
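Several of these checks are lexical enough to pre-screen with a script before the model sees anything. A rough sketch — regexes are heuristics over T-SQL text, not a parser, so treat hits as reading prompts, not verdicts:

```python
import re

# Crude lexical checks; each pattern over-matches on purpose.
CHECKS = {
    "cursor": re.compile(r"\bDECLARE\s+\w+\s+CURSOR\b", re.I),
    "select_star": re.compile(r"\bSELECT\s+\*", re.I),
    "nolock": re.compile(r"\bNOLOCK\b", re.I),
    "dynamic_sql": re.compile(r"\bEXEC(UTE)?\s*\(", re.I),
}

def screen_procedure(definition, max_lines=1000):
    """Return the list of check names a procedure definition trips."""
    hits = [name for name, pat in CHECKS.items() if pat.search(definition)]
    if definition.count("\n") + 1 > max_lines:
        hits.append("oversized")
    return hits
```

Running this over the whole catalog first means the LLM pass spends its judgment on the genuinely ambiguous cases.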

You will get back a list of the procedures most likely to cause an incident in your first quarter on the job. This list is a triage queue: read these procedures yourself, in priority order, before anything else.

The model will be wrong about a few of them. You verify each by reading the actual code. The model is also right about a non-trivial number of them, including ones you would not have looked at in your first month otherwise.

Day 3 — operational state and access

By day 3 you have a working understanding of the surface. Now look at operational state:

  • Backup history. msdb.dbo.backupset tells you when backups last ran and whether the chain is intact.
  • Job history. msdb.dbo.sysjobs and msdb.dbo.sysjobhistory for what runs on a schedule.
  • Permissions audit. Who has sysadmin? Who has db_owner? Who has db_datareader on tables that contain regulated data?
  • Recent errors. The error log, deadlock history, blocking history.

For the permissions audit specifically, the LLM is useful as an interpreter:

"Given this list of role memberships and explicit permissions, identify (1) anyone with rights inappropriate for their apparent role, (2) any service accounts with sysadmin (which is almost always wrong), (3) any orphaned users from former employees still in the database."

The output is your permissions cleanup backlog.
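The mechanical half of that audit does not need a model at all. A sketch — the `svc_` prefix is an assumption about your naming convention, and the employee list has to come from HR or the directory, not the database:

```python
def flag_permissions(members, known_employees, service_prefix="svc_"):
    """members: iterable of (login, role) pairs from the permissions dump.
    Flags service accounts holding sysadmin and logins not in the
    current-employee list (likely orphaned). Heuristic, not an audit."""
    findings = []
    for login, role in members:
        if role == "sysadmin" and login.startswith(service_prefix):
            findings.append((login, "service account with sysadmin"))
        if login not in known_employees:
            findings.append((login, "not a current employee; possibly orphaned"))
    return findings
```

The LLM pass then handles the judgment calls this cannot: whether a human's rights fit their apparent role.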

What you have at the end

After three days:

  • A directory of every procedure, view, and trigger with a one-paragraph description.
  • A dependency graph with per-table summaries.
  • A triage queue of structurally suspect procedures.
  • A backup-and-job inventory.
  • A permissions cleanup list.

You also have a question for every entry in those documents that the model could not answer with confidence. Those questions are your conversations for week two — the things to ask the surviving developers, the application team, or whoever inherited the institutional memory.

This is the work that the senior DBA who left in March did manually over their first six months. You are not as good as they were yet. You will catch up faster than that, because the documentation by-product is permanent — the next person who inherits this database starts where you ended, not where you began.

The longer-form playbook with prompts and example scripts is in the appendices of The Birth of Bob. The short version is what you have already read: it is mostly mechanical, the AI is mostly a faster reader, and the value compounds because the output is documentation, not just understanding.