What AIOS Run Looks Like in Month 2 vs. Month 12

Two Run reviews, same Thursday, two different clients. Back to back.

The first was a firm eight weeks past Build close. Their ops lead came in with a list. An approval gate was firing too often on a low-risk workflow and the team was routing around it. A new hire had quietly turned off an automation because nobody had walked him through the acceptance bar. Two edge cases had surfaced a gap in the data layer. We spent the hour triaging and tightening.

The second call was a firm past the twelve-month mark. Same cadence, different conversation. Their CEO wanted to talk about a new service line they'd finally staffed. The only operational question was which of two automations to retire because the workflow was being absorbed into the new service. The word "triage" didn't come up.

Both calls are healthy. They just look nothing alike. And that's the thing clients ask about most at Build close: what does Run actually look like? Am I paying a retainer so someone checks a server once a month? The difference between month 2 and month 12 is the whole case for why Run exists.

What Run looks like at month 2

Month 2 of Run is mostly triage. That's not a bug. That's what the phase is for.

The team is still adjusting to the automations and the acceptance bars that came out of Build. Edge cases are surfacing weekly because the system is meeting real traffic for the first time, and real traffic always has shapes the Blueprint couldn't fully predict. Trust in the approval gates is being earned, not assumed. Some people on the team are leaning in. Others are quietly working around the new tooling. All of that is normal, and all of it gets worked through in the monthly leadership session.

A typical month 2 session looks like this. We check which automations are firing and which are not. We look at where the team is pushing back, with names on it, not in the abstract. We retune the acceptance bars that got set in the Build rush. If a gate is firing on ninety-five percent of cases and the reviewer is rubber-stamping every one, the bar is too low and we raise it. If it's firing on ten percent and the other ninety slip past review, the bar is in the wrong place. This is the kind of acceptance-bar work that can only happen with the system in production, not in a design doc.
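The retuning rule above can be sketched in a few lines. This is an illustration only, not the tooling we run in sessions: the `GateStats` fields, the `review_gate` function, and the cutoff for what counts as rubber-stamping are assumptions made for the example. Only the ninety-five and ten percent firing rates come from the text.

```python
from dataclasses import dataclass


@dataclass
class GateStats:
    """Hypothetical monthly counts for one approval gate."""
    total_cases: int         # cases that ran through the workflow
    fired: int               # cases the gate sent to a human reviewer
    approved_unchanged: int  # fired cases approved with no edits


def review_gate(stats, high_fire=0.95, low_fire=0.10, rubber_stamp=0.95):
    """Return a retuning recommendation for one gate.

    high_fire and low_fire mirror the figures in the text;
    rubber_stamp is an assumed cutoff for "approving every one".
    """
    if stats.total_cases == 0:
        return "no traffic: nothing to tune yet"

    fire_rate = stats.fired / stats.total_cases
    stamp_rate = stats.approved_unchanged / stats.fired if stats.fired else 0.0

    if fire_rate >= high_fire and stamp_rate >= rubber_stamp:
        return "raise the bar: the gate fires on nearly everything and gets rubber-stamped"
    if fire_rate <= low_fire:
        return "check placement: most cases skip review entirely"
    return "leave as is: revisit next session"


if __name__ == "__main__":
    print(review_gate(GateStats(total_cases=200, fired=196, approved_unchanged=196)))
    print(review_gate(GateStats(total_cases=200, fired=10, approved_unchanged=4)))
```

The numbers only start the conversation. Whether a low firing rate is a problem depends on what is slipping past review, and that is a judgment call for the session.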

We also spend time on things that aren't broken yet but will be. A new hire next month needs training nobody's written. A seasonal spike is six weeks out. An integration is drifting because the vendor changed an API. Run catches these while they're cheap to fix.

The KPIs at month 2 tell an honest story. Revenue Per Employee hasn't moved yet. That's the longest-lagging of the three, and anyone who tells you it moves in sixty days is selling. Task Automation percentage is climbing fast, usually in the 20 to 45 percent band, because the automations from Build are coming online. Away-From-Desk Autonomy is the first to move, sometimes noticeably in month one, because even partial automation plus a trustworthy brief takes hours back immediately.

What Run looks like at month 12

Month 12 Run is not the same thing with more miles on it. The center of gravity has moved.

By month 12, the system is part of the furniture. New hires learn AIOS as onboarding, not as a retrofit. The team has stopped thinking of the automations as "the new thing." Acceptance bars that took weeks to tune at month 2 are now invisible.

The monthly leadership session changes character. It's less about what broke and more about what's next. The questions on the table are usually these: What's the next capability we should Build on top of the platform we've installed? Which initiative is now possible because the team has bandwidth that didn't exist a year ago? The session is strategic, not operational, and the CEO is usually driving the agenda, not me.

Pruning matters as much as tuning by month 12, sometimes more. Automations that were load-bearing at month 2 are redundant at month 12 because the workflow has changed, a better tool has absorbed the job, or the team stopped doing that work. Part of every late-Run session is deciding what to retire. The principle is borrow before you build, and it applies inside Run too. Prune before you add. Otherwise you re-accumulate the 43-tool sprawl the AIOS install was supposed to collapse.

The KPIs at month 12 look different, and they got there in order. Away-From-Desk is the new normal. People take real time off, handoffs don't take three Slack threads, and nobody flinches when the CEO is out for a week. Task Automation is in the 55 to 70 percent band, which is at or near the ceiling we actually want. Revenue Per Employee is finally moving, usually 15 to 35 percent above the Fit Check baseline. That's the number that convinces a CEO the install was worth it, and it's the last one to arrive.

How the three KPIs evolve on different time scales

The three KPIs we track in Run do not move together. They move in order, and the order is what tells you whether the install is actually working.

Away-From-Desk Autonomy moves first, often in month one or two. The reason is mechanical. Even partial automation plus a daily brief that synthesizes signals across the firm's tools takes hours back from senior people immediately. The CEO who used to spend Sunday evening catching up on the week has their Sunday back within thirty days.

Task Automation percentage moves second, climbing through the middle months of year one. This is the KPI that responds most directly to month-to-month retuning. As acceptance bars settle and trust in the gates grows, the share of repeatable low-judgment work flowing through the system with approval climbs. The curve is steep in months two through six and flattens as it hits the natural ceiling somewhere in the 60 to 70 percent band.

Revenue Per Employee moves third, and it takes the longest. This KPI trails every decision the other two make possible. It needs the team to have bandwidth, it needs senior judgment pointed at higher-value work, and it needs a quarter or two of commercial traction on whatever the firm chose to do with that bandwidth. When clients ask why RPE isn't moving at month 3, the answer is that it shouldn't be yet. When it hasn't moved by month 9, that's a real signal and we dig in.

Judge the Run engagement on the wrong KPI at the wrong time and you'll call it broken when it's on track, or healthy when it's stalled. Watching them in sequence is how you tell the difference.
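Reading the KPIs in sequence can be written down as a small check. Treat this as a sketch under assumptions: the `read_kpis` function and its inputs are hypothetical, and using month 3 as the point where a flat Away-From-Desk number becomes a flag is my simplification of "often in month one or two." The month 9 rule for Revenue Per Employee comes straight from the text.

```python
def read_kpis(month, afd_moved, rpe_change_pct):
    """Return the KPIs that deserve a closer look this month.

    month          -- months since Build close
    afd_moved      -- True once Away-From-Desk Autonomy has visibly improved
    rpe_change_pct -- Revenue Per Employee change vs. the Fit Check baseline
    """
    flags = []

    # Away-From-Desk is the leading indicator. The month 3 cutoff is an
    # assumption made for this sketch.
    if month >= 3 and not afd_moved:
        flags.append("Away-From-Desk Autonomy: expected to move early, dig in")

    # Revenue Per Employee lags. Flat before month 9 is expected; flat at
    # month 9 or later is a real signal.
    if month >= 9 and rpe_change_pct <= 0:
        flags.append("Revenue Per Employee: no movement by month 9, dig in")

    return flags


if __name__ == "__main__":
    print(read_kpis(month=3, afd_moved=True, rpe_change_pct=0.0))
    print(read_kpis(month=9, afd_moved=True, rpe_change_pct=0.0))
```

The same flat Revenue Per Employee number produces no flag at month 3 and a flag at month 9. The reading depends on where the firm is on the curve.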

The failure modes

Run has specific failure modes, and I want to name them because they're predictable and most of them come from how the client treats the phase, not from the system itself.

The first is treating Run as maintenance. This client thinks the monthly session is a check-in, not an operating meeting. They show up unprepared, or send a deputy with no authority, or treat the hour as a status update. Run only compounds if leadership is actually using the session to decide what to tune, prune, and Build next. Without that, the system calcifies. Automations that should have been retired stay on. Acceptance bars that should have been raised stay loose. Revenue Per Employee does not move, and six months in, the client concludes the retainer isn't pulling its weight. They're right. It isn't, because they haven't been using it.

The second is chasing new automations without pruning. This client is enthusiastic. Every month they want to add something. After twelve months of pure addition, they've rebuilt the same 43-tool sprawl the install was supposed to solve. The team feels more overwhelmed, not less. If the monthly session isn't retiring things at roughly the rate it's adding them by month 6 or 7, the system is drifting.

The third is skipping the session. Life gets busy, a deal takes over a quarter, the CEO reschedules and then skips, and suddenly three months have gone by with no leadership time on the operating system. Clients who skip three sessions in a row almost always exit Run, and that's fine. What's not fine is skipping while nominally staying in, because then the retainer pays for nothing and the resentment compounds until the off-ramp is ugly.

When I see the pattern starting, I'll call it out in the session before it hardens. That's part of the job.
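The second failure mode is the easiest one to put a number on. Here is a minimal sketch, assuming each session logs how many automations were added and retired. The `is_drifting` function, the log format, and the 0.5 tolerance standing in for "roughly the rate" are all illustrative choices, not a fixed rule.

```python
def is_drifting(sessions, start_month=6, tolerance=0.5):
    """Flag pure-addition drift from month 6 onward.

    sessions  -- list of (month, added, retired) tuples, one per session
    tolerance -- assumed minimum ratio of retirements to additions
    """
    added = sum(a for month, a, _ in sessions if month >= start_month)
    retired = sum(r for month, _, r in sessions if month >= start_month)

    if added == 0:
        return False  # nothing added, so there is no sprawl to re-accumulate
    return retired / added < tolerance


if __name__ == "__main__":
    log = [(6, 2, 0), (7, 3, 1)]  # five added, one retired
    print(is_drifting(log))
```

Early months are left out on purpose. Month 2 is supposed to be mostly addition and tuning, and the balance only matters once the install has settled.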

What graduation looks like

Every Run engagement has an ending, and the good ending is graduation.

Graduation looks like this. The system is running. The team uses it without thinking about it. The KPIs are where they should be for the firm's size and stage. The monthly session has, for two or three months in a row, been a five-minute "everything's fine" plus a forty-five-minute strategy conversation that has nothing to do with the AIOS. At that point, the firm doesn't need me in the room every month. They need me on call.

The transition usually looks like moving from a monthly retainer to a quarterly review, or to an as-needed engagement for specific new Builds. It's not churn. It's the engagement doing what it was designed to do.

The fear at Build close is often the opposite. Clients worry they're signing up for a retainer that will never end. The honest answer is that the retainer ends when it should end, and the incentive for both sides is to get there. Engagements that end on graduation terms produce the best case studies and the most referrals.

Why Run is the retainer, not the maintenance fee

The whole reason Run exists is that an AIOS is a living system, not a finished project. The firm it was installed into is changing every month. New hires, new clients, new service lines, new tools, new things the team has learned. An install that was perfect at Build close will be wrong at month 6 if nobody tunes it.

Run is the monthly leadership session plus the cross-layer upgrade cadence that keeps the system aligned with the business as the business changes. It's not a support ticket or a server check. The session is where we decide together what to tune, prune, and Build next. The cross-layer work is where the five layers stay coherent as any one of them shifts.

Try to run an AIOS without a Run phase and you get what we see at firms that bought a tool and went home. Six months of drift, a team that loses trust in the automations, a CEO who becomes the bottleneck again because the gates are firing on their inbox, and a Revenue Per Employee number that stalls below what the install should produce. Run is the phase that prevents that, and that's why it's priced as a retainer, not a maintenance fee.

The shape of Run is month 2 triage becoming month 12 strategy, and the arc between them is the work. You can read more on how we work. The MIT Sloan writing on AI in operations and the HBR operations strategy hub both make versions of this point. Operating systems, human or digital, need an owner and a cadence.

If you're at Build close wondering whether Run is worth it, the honest answer is that it's worth it if you'll use the monthly session. If you won't, don't sign. If you will, Run is where the install pays back the curve.

The first step is the Fit Check. Five minutes, free, no pitch. Everything downstream, including what Run might look like at month 12 for your firm, starts with a real answer to whether the firm is ready to begin.

-Ed