The Outcomes Method

How we work — four phases, three decision gates, no obligation to keep going.

Most consulting engagements fail at the same place: the gap between a strategy that sounded right in a room and a system the team actually runs in production. The Outcomes Method exists to close that gap on purpose, not by accident.

The pattern we built it for

Run any post-mortem of a stalled AI initiative and the story is remarkably consistent. A pilot worked in a notebook. The executive sponsor loved it. A six-month roadmap was costed and approved. Then nothing changed on a single business metric.

The strategy wasn’t wrong. The team wasn’t lazy. The budget wasn’t too small. What happened, almost every time, is that no one stayed accountable for the transition from a roadmap on a slide to a system in production. The strategists who scoped the work didn’t build it. The engineers who would have built it never sat in the workshop. The operator who would have run it at 2am on a Saturday was never in the room at all.

We started the firm to close that gap on purpose. The Outcomes Method is the operating system for how we do it — and the published contract for what you should expect from us at every phase.

Figure 1

The arc — Discover, Design, Deliver, Operate

Four phases. Three decision gates between them. Each gate has explicit pass criteria written and agreed before the previous phase starts.

[Figure: The Outcomes Method, full four-phase arc. A horizontal flow through four sequential phases connected by three decision gates; at each gate the engagement can advance, refine, or stop.]

  • Discover (Week 1): Map · Listen · Frame
  • Gate 1: Worth designing?
  • Design (Week 2): Architect · Cost · Plan
  • Gate 2: Worth building?
  • Deliver (Weeks 3–5): Build · Ship · Measure
  • Gate 3: Worth running?
  • Operate (Ongoing)

Each gate has explicit pass criteria. "Stop" is a valid outcome at every gate.

Phase 1 — Week 1

Discover

Map current state, listen to the operators, and frame the outcome the engagement is responsible for.

Discover starts with people, not systems. We run two workshops in week one: the first with leadership and the second with the operators who will run whatever we ship. The operator workshop is non-negotiable. The most common reason engagements fail downstream is that the people who would actually use the system never sat in the room when its constraints were defined.

In parallel, we map the relevant systems of record: what data exists, who owns it, how clean it is, and what would have to be untangled to use it. The aim is not a comprehensive data catalogue (consultancies sell those; they age badly). The aim is a one-page answer to a specific question: if we wanted to ship the highest-leverage AI use case for this team in the next eight weeks, what would actually be in the way?

What you get

  • A prioritised use-case backlog — typically 5–8 candidates, each with effort, value and risk scored on a shared scale
  • The data + access map — every system we’d need to touch, the owner’s name, and the realistic friction to use it
  • A short list of constraints worth naming — the things that don’t show up in slide decks but will shape every decision (regulator stance, internal politics, sunk-cost commitments)
  • A go / no-go recommendation for Design — written before the gate meeting, not invented during it

What we need from you

  • Two two-hour workshop slots — one leadership, one operator. Not the same people.
  • Read-only access to the systems we discuss, or a sandbox export if access takes weeks.
  • One stakeholder who can answer questions inside 24 hours.

Phase 2 — Week 2

Design

Pick the use case, architect a buildable system, cost it, and stage the build into the smallest first slice that proves the outcome.

By the start of Design we’ve already chosen what we’re building. Design is where that choice becomes a system someone can actually build inside the engagement timeline. Not a vision deck. A buildable architecture with named components, named risks, and a costed plan.

The piece of Design most consultancies skip is the second cost model: the one that estimates not just what the build costs but what running the system in production will cost over twelve months. We always do both. AI systems are quietly abandoned often enough because the inference bill grew faster than the value that we treat it as a known failure pattern; the Design gate is there to catch it before the build starts.
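As a rough illustration of why the run-cost model matters, the sketch below projects twelve months of inference spend under compounding usage growth and compares it with a flat monthly value estimate. Every input (the first-month cost, the growth rate, the monthly value) is a hypothetical placeholder, and constant month-on-month growth is an assumption of the sketch, not a description of how the Design cost model is built.

```python
def twelve_month_run_cost(first_month_cost: float, monthly_growth: float) -> float:
    """Sum twelve months of cost, compounding by `monthly_growth` each month."""
    return sum(first_month_cost * (1 + monthly_growth) ** m for m in range(12))


def run_cost_exceeds_value(first_month_cost: float,
                           monthly_growth: float,
                           monthly_value: float) -> bool:
    """True when projected twelve-month run cost is more than the value created."""
    total_cost = twelve_month_run_cost(first_month_cost, monthly_growth)
    return total_cost > 12 * monthly_value


# Hypothetical figures: same starting bill, with and without usage growth.
flat = twelve_month_run_cost(1_000.0, 0.0)
growing = twelve_month_run_cost(1_000.0, 0.10)

stops_when_flat = run_cost_exceeds_value(1_000.0, 0.0, monthly_value=1_500.0)
stops_when_growing = run_cost_exceeds_value(1_000.0, 0.10, monthly_value=1_500.0)
```

With these made-up numbers the system pays for itself at flat usage but not at 10% monthly growth, which is the kind of result a build-cost model alone would never surface.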

What you get

  • An architecture diagram — components, integrations, the data flow, the part that uses an LLM and the part that doesn’t
  • A measurement plan — the three numbers we’ll watch, the baseline, the target, the dashboard that’ll display them
  • Two cost models — build cost (firm price) and twelve-month run cost (estimate with confidence band)
  • A risk register — every risk we identified, the mitigation, who owns it, the residual
  • The first-slice scope — exactly what week three of Deliver will produce

What we need from you

  • A single decision-maker available for one 90-minute review session at end of week
  • Confirmation of the budget envelope before we cost (we cost to the envelope, not to a number we invent)
  • The CISO / compliance contact named, if not already

Figure 2

The decision-gate model — Design → Deliver

The most consequential gate in the arc. Three outcomes. The criteria are written in week one, not invented at the gate.

[Figure: Decision-gate model for the Design-to-Deliver transition. Four artefacts from the Design phase feed into a central gate, which asks three questions and branches into three outcomes. All branches are explicit and contractual.]

Design artefacts

  • Architecture diagram
  • Measurement plan
  • Two cost models
  • Risk register

Gate 2: Worth building? Three questions

  1. Success criteria measurable and agreed?
  2. Architecture buildable by the run team?
  3. Cost-value model justifies the build?

Three outcomes

  • Proceed: all three yes → Deliver
  • Refine: one or two no → back to Design
  • Stop: fundamental no → engagement ends

The model above is read aloud at the gate meeting. The three questions are answered by the client — not by us — and the outcome follows from the answers. "Stop" is a frequent outcome here. Most engagements that stop at Gate 2 stop because the cost-value model didn’t pencil out, not because the architecture didn’t work. That’s a useful filter to apply early rather than discover six weeks into a build.
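The Gate 2 logic is simple enough to write down as a function, which is part of why it can be read aloud. The sketch below is a minimal rendering of the three questions and three outcomes. The function name, the parameter names, and the `fundamental_no` flag are hypothetical, and the figure does not say what happens when all three answers are no; treating that case as a stop is an assumption of this sketch.

```python
from enum import Enum


class Outcome(Enum):
    PROCEED = "proceed to Deliver"
    REFINE = "back to Design"
    STOP = "engagement ends"


def gate_2(criteria_agreed: bool,
           buildable_by_run_team: bool,
           cost_value_justified: bool,
           fundamental_no: bool = False) -> Outcome:
    """Map the client's three answers to one of the three gate outcomes."""
    answers = (criteria_agreed, buildable_by_run_team, cost_value_justified)
    if all(answers):
        # All three yes: the flag is irrelevant, there is no "no" to judge.
        return Outcome.PROCEED
    # Assumption: a "no" judged fundamental, or three "no" answers, ends the engagement.
    if fundamental_no or not any(answers):
        return Outcome.STOP
    # One or two "no" answers that can be fixed: return to Design.
    return Outcome.REFINE
```

Whether a given "no" is fundamental is a judgment made in the gate meeting; the code only records that judgment, it does not make it.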

Phase 3 — Weeks 3–5

Deliver

Three-week build sprint with a working system in real users’ hands at the end. Weekly demos. Open repo. No hidden state.

Deliver is short on purpose. The first-slice scope agreed at Gate 2 is what gets built — no more, no less. We resist scope creep ruthlessly because the engagement’s value is measured by what ships in production, not by what’s in the backlog at the end of week five.

The build runs as three one-week sub-cycles. Each ends with a working artefact in your stack — not a demo on our laptop. Week three: the riskiest piece (usually the AI integration itself, with evaluations). Week four: the system around it (the workflow, the operator UI, the observability). Week five: production readiness (auth, monitoring, runbooks, handover preparation).

What you get

  • A working system in production — the first-slice scope, used by at least one real operator
  • An evaluations harness — automated quality checks against a labelled dataset, run on every change
  • Observability — every model call traced; cost, latency, accuracy visible on a dashboard you own
  • Runbooks for the operator — common failure modes, escalation paths, who’s on call
  • A weekly demo with the team that will run it — not the team that bought it
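For readers who have not seen an evaluations harness, here is a minimal sketch of the idea: run the system over a labelled dataset and fail the check when accuracy drops below an agreed threshold. The function names, the exact-match scoring, the toy classifier standing in for a model call, and the four example rows are all assumptions made for illustration; a real harness would be shaped by the measurement plan from Design.

```python
from typing import Callable, Sequence


def run_evals(predict: Callable[[str], str],
              labelled: Sequence[tuple[str, str]],
              min_accuracy: float) -> tuple[float, bool]:
    """Score `predict` against labelled (input, expected) pairs.

    Returns (accuracy, passed). Exact-match scoring is an assumption.
    """
    if not labelled:
        raise ValueError("labelled dataset is empty")
    correct = sum(1 for text, expected in labelled if predict(text) == expected)
    accuracy = correct / len(labelled)
    return accuracy, accuracy >= min_accuracy


def toy_classifier(text: str) -> str:
    # Stand-in for the real model call: a trivial keyword rule.
    return "refund" if "refund" in text.lower() else "other"


# Made-up labelled examples; the third one is missed by the keyword rule.
dataset = [
    ("Please refund my order", "refund"),
    ("Where is my parcel?", "other"),
    ("I want my money back", "refund"),
    ("Refund not received", "refund"),
]
accuracy, passed = run_evals(toy_classifier, dataset, min_accuracy=0.9)
```

Wired into a change pipeline, a `passed` value of `False` would block the change, which is what "run on every change" implies in practice.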

What we need from you

  • A named technical owner from your side who joins the daily standups (15 minutes, async-friendly)
  • Production environment access by end of week four — not week six
  • The operator who will run the system available for the week-four demo and the week-five handover

Phase 4 — Ongoing

Operate

The system is yours. We’re on a defined retainer or on call, depending on what you need. Either way, the relationship is explicit, not assumed.

Operate is where most consulting relationships quietly decay. The engagement ends; the consultancy moves on; six months later the system has drifted off whatever metric it was supposed to move and no one notices. We design against that drift on purpose, with two operating modes you can choose from.

Retainer mode is a fixed monthly engagement for ongoing improvement and accountability — typically one senior practitioner one day a week, attached to your team. We attend your standups, watch the dashboards, propose the next improvement, and make the changes. The retainer is structured as a 12-month commitment because that’s the timeframe over which the kind of compounding wins we aim for actually show up.

On-call mode is a defined response SLA for when something specific needs us — model upgrade, regulator query, integration change, a new use case to scope. We don’t bill unless we’re engaged, and the relationship is paused (not ended) between engagements.

What we deliver in either mode

  • A quarterly review: the dashboard against the original success criteria, plus what’s changed in the data, the model market and the regulator stance
  • An updated risk register — risks change as the system ages; we keep the register honest
  • A named escalation path — a single phone number that reaches a senior practitioner inside the response SLA, not a ticket queue

The escape hatches

When the methodology says don’t continue

The Outcomes Method is engineered so that "stop" is a real, contractual outcome — not a last-resort uncomfortable conversation. The patterns that trigger it are consistent enough to be worth naming up front, so the gate meeting is never the moment a client first hears the option.

We’ll stop at Gate 1 if

  • The use cases on the backlog all rely on data that isn’t actually reachable inside the engagement timeline
  • The operator workshop revealed that the people who would run the system don’t want it (and their reasons are valid, not just resistance to change)
  • The highest-leverage opportunity is a process change rather than an AI build — we’ll name it and recommend you do the process work first

We’ll stop at Gate 2 if

  • The twelve-month run-cost model is more than the value the system would create
  • The architecture only works with a vendor change you’re not prepared to make
  • The compliance posture required is genuinely beyond what current foundation models support — better to wait six months

We’ll stop at Gate 3 if

  • The system runs but the measured outcome isn’t tracking, and we can’t identify a fix inside another two weeks at our cost
  • The operator can’t complete the runbooks after training — that usually means the design needs to change, not that the operator needs to try harder

A stopped engagement is not a failed engagement. The findings from Discover and Design are yours regardless. Recommending against a build, when that’s the honest reading of the gate criteria, is one of the most valuable things this methodology asks us to be willing to do.

4 phases: Discover · Design · Deliver · Operate
3 decision gates with written pass criteria: "Stop" is a valid outcome at every gate
1 senior partner from kickoff to handover: no bait-and-switch to juniors

Common questions

About working with us this way

How is The Outcomes Method different from agile or design thinking?

Agile is about how to organise work; design thinking is about how to frame problems. The Outcomes Method is about when to stop. Every phase ends at a decision gate with explicit criteria the engagement must pass before the next phase starts. Most consultancies bill through the whole roadmap whether the gate clears or not — we don't.

What if the gate criteria fail mid-engagement?

Three outcomes are possible at every gate: proceed, refine, or stop. "Stop" is a real option and we use it when warranted — sometimes the right move is to pause an engagement before it accumulates more sunk cost. We charge for the work delivered, not for an obligation to continue.

Can we run only one phase (e.g. Discover) without committing to the full arc?

Yes. The AI Readiness Sprint is exactly that — a standalone Discover engagement. You can stop after any phase. Discover-only is the most common single-phase engagement; some clients re-engage 6–12 months later when they're ready to Design and Deliver.

Who runs the engagement on your side?

A named senior partner from kickoff through handover. No bait-and-switch to juniors after the contract is signed. The partner who runs the Discover phase is the partner who runs Operate.

How does The Outcomes Method handle scope changes?

Scope changes route through the same decision gates. A change request that affects Deliver is evaluated against the Design phase's success criteria — if it doesn't move the agreed metrics, it doesn't make the build. If it does, we re-cost and reset the gate.

What artefacts do we own at the end of an engagement?

Everything. Code, documentation, model evaluations, runbooks, decision logs, the methodology canvas we used to land here. We deliberately design for the team that runs the system after we leave, including ourselves six months from now if you re-engage us.

Next step

See whether this is the right way to work together.

Most engagements start with a 30-minute call. We’ll come ready with questions, and tell you honestly whether we’re the right team for what you’re trying to do.