A 90‑Day Plan to Pilot Enterprise Agents (Without Burning Bridges)

The future is automated—and still human. Keep people at the center and you’ll actually get somewhere.

Phase 1 (Weeks 1–3): Frame & baseline

Pick one process; write outcome in one line (metric).
Define the win in plain numbers (e.g., “Auto-triage 60% of tickets in ≤6h with <5% rework”). Done when the sentence fits on one line and a CFO would nod.
Map reality (AS‑IS, exceptions, inputs).
One-page flow of what actually happens today, with decision points, exception list, and sample inputs. Done when SMEs say “that’s it,” not “it depends.”
Baseline current cost/time/error with 10 real cases.
Time a small, honest sample and capture cycle time, error rate, touches, and effort. Done when you have a table you can compare against later.

IXP/DU to extract; Agent to decide; RPA/API to execute.
Wire a vertical slice: unstructured → structured → decision → action. Done when one golden-path case runs end-to-end without human help.
Confidence threshold + escalation path defined.
Set a numeric floor (e.g., 0.80) and the exact route for low-confidence or policy hits (queue, owner, SLA). Done when the agent can tell you why it escalated and to whom.
Telemetry wired: logs, costs, reasons.
Emit event logs with case IDs, latency, confidence, result, and estimated cost-per-case. Done when you can open one dashboard and answer “what happened, how fast, how much.”

Exception drills (timeouts, bad inputs, policy blocks).
Break it on purpose and verify retries, fallbacks, alerts, and audit trails. Done when each failure mode produces the expected behavior and a readable log.
Shadow run → limited prod window → daily report.
Run in parallel with humans, then a controlled prod slot; publish a daily snapshot of volume, success %, escalations, and incidents. Done when humans trust the numbers.
Publish deltas and the ‘stop/scale’ recommendation.
Compare to baseline (time/cost/quality), list risks left, and state the call. Done when you can show a 1-pager that says “stop, fix, or scale,” with math—not vibes.

If you can’t explain the result to a CFO and a security lead in five minutes, it’s not ready. Simple beats clever.

Week	Deliverable
1	One-line outcome, AS-IS map
2	Data sample pack (10 real cases)
3	Risk register + SLO draft
4	Thin slice model in Studio Web
5	IXP extraction baseline
6	Agent decision loop + guardrails
7	API actions wired via Integration Service
8	Exception drills + logging review
9	Shadow run, daily report
10	Limited prod window
11	Publish deltas + stop/scale memo
12–13	Hardening, handover, and doc set

SME time collapses → book a 30-min weekly clinic w/ a named deputy; send a 1-page pre-read 24h before; escalate to sponsor after 2 missed reviews.
No test data → build a small synthetic set plus a redaction pipeline; include 10 real edge cases; get security sign-off before UAT.
Cost spikes → cap cost-per-case with alerts at +20% WoW; switch models/tools or downshift to rules where possible; review spend in the weekly ops check.

👉 For full templates, scorecards, and slide-ready checklists, see Digital Automation Guide and the companion course. Join the waitlist.