The Machine
Identity and Policy
The hardest infrastructure problem in AI-native operation is knowing which agent is acting, on whose behalf, with what scope, for how long, against which policy, and with what audit trail. Legacy IAM answered who. Agents require answering why (intent) and how long (ephemeral access) as well. The substrate that carries that answer determines whether the 2026 incident library — overnight token-spike bills, exposed MCP servers, poisoned skill marketplaces, over-privileged agents paying attacker invoices — reaches the firm or stops at the gateway.
By early 2026 the substrate that carries every skill, harness, and agent had a name and a vendor category. SACR and Insight Partners published Emerging Agentic Identity Access Platforms (AIAP) in February 2026 with a direct thesis: "Legacy IAM/SSO solved 'who,' but agents force you to secure 'why' (intent) and 'how long' (ephemeral access)". The report lays out a four-phase model — Discover and Register agents, Translate intent into authorization, Broker and Inject short-lived credentials, Watch and Terminate at runtime — and frames the shift as "SSO for Agents" with a centralized broker replacing the pattern of agents calling SaaS APIs directly with embedded secrets.
Gravitee's State of AI Agent Security 2026 report, published the same month, surveyed more than 900 executives and technical practitioners. 88 percent of organizations report confirmed or suspected AI-agent security incidents within the past year, rising to 92.7 percent in healthcare. Only 14.4 percent report that every agent going live has had full security or IT approval. Only 22 percent treat agents as independent identities rather than sharing keys through a service account. Fortune covered the same data in April 2026 under the headline "AI agents are acting like employees, but company structures still treat them like software". The gap between adoption pace and identity-layer readiness has become the most consequential operational gap in enterprise AI deployment.
Six layers are load-bearing, and skipping any one leaves a specific 2026 incident pattern
The substrate has six components in priority order: identity and policy, tool integration, connectors and the data bus, model access, observability, and budgets. Skipping any one produces a documented 2025-2026 incident pattern. The tour below maps each layer to its concrete consequence (identity and policy get separate entries; model access, the one second-order layer, is developed in its own section), and the rest of the chapter takes each layer in turn, starting with the most load-bearing.
- Identity. No unique agent identity — one bot user wraps many agents. LoginRadius names this the Service Account Trap: no log attributes an action to a specific agent or human owner, SOC 2 and EU AI Act compliance break, and forensic reconstruction of an incident becomes impossible. The Gravitee data quantifies the prevalence: 78 percent of surveyed organizations are still in this posture.
- Policy. No runtime authorization layer. Identity and OAuth scopes answer whether an agent can call an API; they do not answer whether this specific action, on behalf of this specific user, under the current context, should execute at all. Microsoft's Authorization Fabric pattern — a Policy Enforcement Point plus Policy Decision Point returning ALLOW, DENY, REQUIRE_APPROVAL, or MASK per invocation — is the 2026 reference implementation for closing that gap.
- Tool integration. No authentication or sandbox on MCP servers. OX Security's April 2026 research, published through the Cloud Security Alliance, documents a systemic MCP vulnerability where an attacker who influences a configuration runs arbitrary shell commands on the host. Roughly 7,000 publicly exposed MCP servers, 150 million-plus Anthropic SDK downloads, up to 200,000 estimated vulnerable instances, confirmed RCE on six live production platforms including LiteLLM, LangChain, and IBM LangFlow. Equixly's earlier audit found command-injection vulnerabilities in 43 percent of deployed MCP servers, server-side request forgery in 30 percent, and arbitrary file access in 22 percent.
- Connectors and the data bus. No N+M connector layer. Each agent calls each SaaS API directly with embedded secrets, producing N×M credential sprawl and audit gaps. When a SaaS vendor changes the interface, every agent that called the old version breaks simultaneously.
- Observability. No agent-native tracing. Traditional APM stacks alert on exceptions and latency; agents fail quietly with confident HTTP 200 responses containing hallucinated customer IDs or invented invoice numbers. Galileo's observability research documents the retry-cascade pattern where one bad response triggers loops of regeneration, silently tripling hourly spend before anyone notices.
- Budgets. No per-agent spend cap. NotebookCheck documented a popular agent tool sending 120,000 tokens of context in each 30-minute heartbeat at about $0.75 per request, accumulating $18.75 in idle overnight spend and projecting $250 per week before any real work began. A skill-marketplace supply-chain incident adds the second vector: Snyk's ToxicSkills study identified 1,467 malicious payloads across the agent-skills ecosystem, 76 intentionally designed for credential theft, backdoor installation, or data exfiltration, with eight of those malicious skills publicly available as of publication.
The rest of this chapter works through the six components starting with identity and policy, which carries more load than any other layer.
A working substrate answers five identity and policy questions
The AIAP four-phase model resolves into five practical questions. Each question is a layer, ordered so a failure at the top propagates through everything below.
Which agent is acting? Each agent and each agent version gets a unique managed identity with its own lifecycle, human owner, scope boundaries, and audit trail. The Service Account Trap is the canonical failure — one bot user wrapping many agent identities, no attributable log, no way for SOC 2 or EU AI Act audits to answer which specific action came from which agent on whose behalf. Microsoft's Entra Agent Identity preview, released alongside the Authorization Fabric pattern, automatically creates an Entra identity when a Copilot Studio agent is built and centralizes agent-identity management in the Entra admin center. The direction of travel is that agent identity becomes a first-class directory object, equivalent in weight to human identity.
On whose behalf? Delegation has to be explicit. The agent is acting for an employee, a department, a customer, or on its own behalf under a standing delegation — each case carries different authorization shape. SACR's Phase 2, Translate and Authorize, converts the delegation intent into a deterministic policy decision the substrate enforces rather than a natural-language instruction embedded in the agent's prompt. The operational pattern that survived 2025's incidents is expressing policy as code with tools like Open Policy Agent, AWS Cedar, or a vendor PDP, and treating the prompt as an intent declaration the policy engine evaluates deterministically. The alternative — relying on the prompt to encode both intent and authorization — fails under prompt injection and cannot be audited.
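A minimal sketch of that posture, assuming a hypothetical in-process rule table; a real deployment would express the rules in Rego (Open Policy Agent) or Cedar and call out to the policy engine rather than hand-rolling them in Python:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Delegation:
    """Who the agent acts for, made explicit rather than prompt-embedded."""
    agent_id: str
    on_behalf_of: str   # employee, department, customer, or "standing"
    intent: str         # declared intent class, e.g. "crm.read"

# Hypothetical rule table; in production this lives in Rego or Cedar,
# versioned like any other code.
RULES = {
    ("support-agent", "crm.read"): "ALLOW",
    ("support-agent", "crm.write"): "REQUIRE_APPROVAL",
    ("support-agent", "email.send_external"): "DENY",
}

def decide(d: Delegation, action: str) -> str:
    """Deterministic: the prompt only declares intent; the engine decides."""
    if action != d.intent:
        return "DENY"   # the action must match the declared intent class
    return RULES.get((d.agent_id, action), "DENY")   # default-deny posture

assert decide(Delegation("support-agent", "alice@firm.com", "crm.read"), "crm.read") == "ALLOW"
```

The point of the shape is that a prompt-injected agent can declare any intent it likes; the rule table, not the prompt, decides what executes.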
What scope, for how long? Zero Standing Privilege is the principle. No long-lived secret should touch the agent. Tokens get scoped narrowly and issued for minutes, brokered at request time through a credential fulfillment layer — SACR's Phase 3. The canonical public patterns for ephemeral credentialing come from Browserbase bb (credential brokerage via a serverless integration proxy plus network-layer injection, so the sandbox never sees real credentials) and Ramp Glass (Okta SSO plus MCP proxy for sub-2-second cold start). Both are developed in the anchor cases below; the pattern to note here is that long-lived API keys and permanent tokens make replay attacks trivial and revocation a global-logout problem, and the 2026 fix — OIDC plus SPIFFE plus JIT tokens expiring in minutes, scoped per sub-task — requires that the substrate own credential issuance rather than each agent carrying its own secrets.
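A toy broker illustrating the Zero Standing Privilege shape, with hypothetical names throughout; a production broker would issue OIDC tokens or SPIFFE SVIDs rather than in-memory strings:

```python
import secrets
import time

class CredentialBroker:
    """Toy Phase-3 broker: mints narrow, short-lived tokens at request time
    so no long-lived secret ever reaches the agent process."""

    def __init__(self, ttl_seconds: int = 300):
        self.ttl = ttl_seconds
        self._live: dict[str, tuple[str, float]] = {}   # token -> (scope, expiry)

    def issue(self, agent_id: str, scope: str) -> str:
        token = secrets.token_urlsafe(32)   # opaque, one sub-task, one scope
        self._live[token] = (scope, time.time() + self.ttl)
        return token

    def validate(self, token: str, scope: str) -> bool:
        granted = self._live.get(token)
        if granted is None:
            return False
        granted_scope, expiry = granted
        return granted_scope == scope and time.time() < expiry   # expired == revoked

broker = CredentialBroker(ttl_seconds=300)   # five-minute tokens
t = broker.issue("support-agent", "crm:read:account/1234")
assert broker.validate(t, "crm:read:account/1234")
```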
Against which policy? A Policy Enforcement Point plus Policy Decision Point evaluates each tool invocation at runtime. Microsoft's Authorization Fabric returns one of four decisions per call — ALLOW, DENY, REQUIRE_APPROVAL, or MASK — and centralizes authorization across every agent rather than reimplementing role checks inside each agent's prompt. The failure mode the pattern prevents is what Simon Willison named the lethal trifecta: an agent with access to sensitive data, exposure to untrusted content, and external communication capability. A concrete shape: a customer-support agent reads an inbound email (untrusted content), queries the CRM for customer records (sensitive data), and can respond by email (external communication). An attacker embeds a prompt-injection instruction in the inbound email that tells the agent to summarize the last ten customer records and forward the summary to an attacker address. Any two of the three produce ordinary risk; all three combined produce structural exfiltration that no prompt-level guardrail can be trusted to contain. Breaking the trifecta at the gateway — denying outbound email to agents with CRM read access, or routing untrusted content through a quarantined pre-processing agent with no data access — is the only mitigation the substrate community has found to hold against adversarial inputs.
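A sketch of the enforcement point under the same assumptions, showing the four decisions and a structural trifecta break; the capability flags and tool names are illustrative, not Microsoft's actual fabric:

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    REQUIRE_APPROVAL = "REQUIRE_APPROVAL"
    MASK = "MASK"

# Capability flags the gateway tracks per agent class (illustrative).
AGENTS = {
    "support-agent": {"sensitive_data": True, "untrusted_content": True,
                      "external_comms": True},
}

def enforce(agent: str, tool_call: str) -> Decision:
    caps = AGENTS[agent]
    # Break the lethal trifecta structurally: an agent holding all three
    # capabilities loses its external channel at the gateway, regardless of
    # anything the prompt claims.
    if all(caps.values()) and tool_call.startswith("email.send"):
        return Decision.DENY
    if tool_call.startswith("crm.export"):
        return Decision.REQUIRE_APPROVAL
    if tool_call.startswith("crm.read"):
        return Decision.MASK   # serve the records with PII fields masked
    return Decision.ALLOW if tool_call in {"calendar.read"} else Decision.DENY

assert enforce("support-agent", "email.send") is Decision.DENY
```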
With what audit trail? SACR's Phase 4, Watch and Terminate, is runtime enforcement. Every interaction attributes to the specific agent identity, the initiating user, the department, the model called, the prompt class, the policy decision, and the agent's reasoning trace where available. The audit log is what makes incident response possible. Without attribution the organization can detect that something went wrong but cannot determine which agent did what, which makes both regulatory reporting (SOX, EU AI Act, GLBA) and internal remediation structurally impossible. Phase 4 also owns the termination path: when an agent or agent class misbehaves, the substrate has to revoke credentials, halt running sessions, and flush any pending tool invocations — without waiting for the agent to voluntarily stop.
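A sketch of the attribution record, assuming a JSON-lines audit sink; the field names are illustrative rather than any vendor's schema:

```python
from dataclasses import dataclass, asdict
import json
import time

@dataclass
class AuditRecord:
    """One attributable row per interaction; the unit that makes regulatory
    reporting and forensic reconstruction possible at all."""
    ts: float
    agent_id: str         # which agent
    agent_version: str    # which version of it
    user: str             # on whose behalf
    department: str
    model: str            # which model served the call
    prompt_class: str
    policy_decision: str  # ALLOW / DENY / REQUIRE_APPROVAL / MASK
    reasoning_ref: str    # pointer to the stored reasoning trace, if available

def emit(rec: AuditRecord) -> None:
    # Stand-in for an append-only audit sink the agent cannot write around.
    print(json.dumps(asdict(rec)))

emit(AuditRecord(time.time(), "support-agent", "1.4.2", "alice@firm.com",
                 "support", "mid-tier", "crm-lookup", "ALLOW", "trace://abc123"))
```

The termination path belongs in the same layer: the component that writes this record is also the one positioned to revoke credentials and halt sessions when the record stream turns anomalous.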
The substrate also has to handle three practical tiers of data access by agent purpose. Personal agents, like a digital colleague for a specific employee, get the same access as the employee — reading Slack, email, calendar, and docs on the employee's behalf. Analytical agents get read-only access, usually broader than any single employee (a whole CRM or data warehouse), but cannot write. External chatbots are isolated — they see only what the firm would be comfortable publishing publicly, and never touch internal data. The most common 2026 identity incident is granting personal-tier access to an analytical or external agent by mistake, usually because the agent's identity was created from a template that defaulted to too much scope.
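A small guard that encodes the tier boundaries at identity-creation time, with hypothetical scope strings, so a template default cannot silently grant personal-tier scope:

```python
from enum import Enum

class Tier(Enum):
    PERSONAL = "personal"      # mirrors one employee's access, read and write
    ANALYTICAL = "analytical"  # broad read-only (whole CRM or warehouse)
    EXTERNAL = "external"      # public-safe data only, never internal

def check_scope(tier: Tier, scope: str) -> None:
    """Refuse scope grants that cross tier boundaries at creation time,
    before a template's defaults ever reach production."""
    if tier is Tier.EXTERNAL and scope.startswith("internal:"):
        raise PermissionError("external agents never touch internal data")
    if tier is Tier.ANALYTICAL and scope.endswith(":write"):
        raise PermissionError("analytical agents are read-only")
```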
The corporate default for agent permissions also inverts the personal default. In individual use, "everything is allowed, ask me on edge cases" fits. In any organization above roughly fifty people, the defensible default treats each tool as either always-allowed, allowed-with-approval, or always-blocked — and agent permissions run below the corresponding human permissions, because any agent with external communication capability can be prompt-injected and any data it can reach can in principle be exfiltrated. The discipline here is design posture rather than a cited requirement, but the alternative — permissive defaults above fifty people — has produced most of the 2026 identity incidents in the Gravitee survey data.
CLI wins the inner loop; MCP wins the outer loop
Anthropic launched the Model Context Protocol in November 2024. By mid-2026 it was supported by OpenAI, Microsoft, Google, and Amazon, and the Anthropic SDK alone had more than 150 million package-registry downloads. The success of the protocol produced a specific, documented cost: an MCP server loads its full tool schema into the agent's context window at connection time, and stacking multiple servers pushes the schema overhead past the useful threshold.
Jannik Reinhard benchmarked a Microsoft Graph Intune task in February 2026. Via the Microsoft Graph MCP server, the task consumed approximately 145,000 tokens — 28,000 of them for schema alone. The same task via the Microsoft Graph CLI consumed 4,150 tokens, zero of them for schema — a 35x reduction in total token use. The GitHub MCP server ships with 93 tools; loading all of them costs around 55,000 tokens before the agent has touched a repository. Stack a GitHub server with a database connector, Microsoft Graph, and Jira, and an enterprise agent routinely pays 150,000-plus tokens of tool-schema overhead before any real work begins.
The mechanism behind the cost gap is training familiarity rather than model capability. Modern LLMs have been trained on billions of lines of terminal interactions — Stack Overflow answers, GitHub repos, shell documentation, tutorials. The model already knows git log --oneline -10, docker ps, kubectl get pods, and the gh CLI without a schema. MCP tools are custom schemas the model sees for the first time at invocation, so it reasons about unfamiliar tool interfaces on the fly while the schemas themselves burn context window. CircleCI's independent benchmark captures the practical consequence: CLI delivered 33 percent better token efficiency and a 77-versus-60 task-completion score in a browser-automation evaluation.
Anthropic's engineering team published Programmatic Tool Calling in November 2025. The pattern presents MCP servers as files on a filesystem the agent explores on demand, loading only the tool definitions it needs per task. For a hypothetical Salesforce MCP server the approach reduces token usage from 150,000 to 2,000 — a 98.7 percent saving. Cloudflare published the parallel pattern three months later as Code Mode. The Cloudflare MCP server covers the entire Cloudflare API through two tools (search and execute) at roughly 1,000 tokens of total context, compared with 1.17 million tokens for an equivalent naive MCP server — a 99.9 percent reduction that comes out of converting the agent's interaction with MCP into code against a typed SDK. Code Mode carries its own failure mode: the pattern depends on an SDK that is complete and correct for the underlying API, and when the SDK lags the protocol the agent has no fallback channel.
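A sketch of the two-tool shape, with a hypothetical SDK index standing in for Cloudflare's actual implementation; the point is that per-endpoint schemas live behind search instead of in the prompt:

```python
# Hypothetical two-tool surface in the Code Mode shape. The agent writes code
# against a typed SDK and runs it in a sandbox; schemas are discovered on demand.
SDK_INDEX = {
    "dns.records.create": "create_record(zone_id: str, type: str, name: str, content: str)",
    "dns.records.list": "list_records(zone_id: str) -> list[dict]",
    # ...thousands more entries, discoverable on demand, never preloaded
}

def search(query: str) -> list[str]:
    """Tool 1: return only the few signatures relevant to this task."""
    return [f"{name}: {sig}" for name, sig in SDK_INDEX.items() if query in name]

def execute(code: str) -> str:
    """Tool 2: run agent-written code against the SDK. exec() stands in for a
    real isolated runtime; never run agent code unsandboxed."""
    scope: dict = {}
    exec(code, scope)
    return str(scope.get("result"))

print(search("dns.records"))                   # two signatures, not 93 schemas
print(execute("result = 'A record created'"))  # results stay out of the prompt
```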
The deployment pattern across the most mature 2026 stacks converges. CLI for the inner loop where the model already knows the binary from training data — gh, docker, kubectl, gws, jq, editor-integrated tooling like Claude Code. MCP for the outer loop where centralized authentication, audit, and structured access across teams actually pay for themselves. CircleCI's summary frames the choice precisely: "The CLI vs. MCP question is really a question about where you are in the development loop. CLIs fit the inner loop: fast, local, zero overhead. MCP servers fit the outer loop: external systems, shared infrastructure, structured access. Most teams need both".
The substrate lesson is that both patterns need to be in the stack, and naive MCP is the trap — connecting three servers blindly produces 100,000-plus tokens of schema overhead that pushes the agent's reasoning into the tail of its context window, where output quality measurably degrades. Programmatic Tool Calling and Code Mode are the engineering answer; they let an organization keep MCP's audit and policy advantages while keeping the per-task context within a usable budget.
Connectors and the data bus collapse N×M into N+M
Three categories of data feed into the substrate. Internal databases — Postgres, ClickHouse, Snowflake, BigQuery, DuckDB — hold the firm's transactional and analytical history. SaaS services — Slack, Notion, Salesforce, Gmail, GitHub, Linear, Jira — hold most of the working-state metadata the firm generates. Streaming data — product metrics, website events, calls, tickets — flows continuously and requires real-time ingestion rather than batch queries.
The architectural pattern that survived 2025-2026 is a single internal data bus. Every source feeds into the bus through a connector; every agent reads through the policy and authentication gateway above it. The collapse from N agents times M sources to N plus M connections is the architectural win — the combinatorial surface that produces credential sprawl, permission sprawl, and audit gaps becomes a star topology with policy enforcement in the center. The bus design also isolates the firm from SaaS-vendor churn: when a connector changes, one thing breaks instead of every agent that called the old interface. Airbyte and Fivetran are the common starting points for batch ingest; Kafka is the standard for streaming; custom connectors fill the gaps for vendor-specific APIs the connector ecosystem has not yet covered.
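A minimal sketch of the star topology, assuming hypothetical connector callables; the point is that adding an agent or a source adds one edge, not a row or column of credentialed pairs:

```python
from typing import Callable

class DataBus:
    """Star topology: M connectors register once, N agents read through one
    gateway, giving N+M edges instead of N*M credentialed pairs."""

    def __init__(self):
        self.connectors: dict[str, Callable[[str], str]] = {}

    def register(self, source: str, fetch: Callable[[str], str]) -> None:
        self.connectors[source] = fetch   # one connector per source, owned centrally

    def read(self, agent_id: str, source: str, query: str) -> str:
        # Every read passes the policy gateway; no agent holds a SaaS secret.
        # (In a real bus this is where the PEP sketched earlier gets called.)
        return self.connectors[source](query)

bus = DataBus()
bus.register("salesforce", lambda q: f"[salesforce rows for {q!r}]")
bus.register("snowflake", lambda q: f"[snowflake rows for {q!r}]")
print(bus.read("support-agent", "salesforce", "account 1234"))
```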
The brownfield variant handles systems that predate APIs. A large pharma with SAP cannot replace the ERP without a multi-year project carrying its own failure modes. The pragmatic 2026 path demotes the legacy system to a system of record — essentially a database — and puts computer-use agents above it as operators who click the same buttons a human did. The agent becomes the new user of a 1996 banking system or a 2003 ERP, and the firm avoids the replacement project entirely. The pattern works when the legacy system has a stable UI the agent can learn, and it fails on rapidly iterating web apps where every weekly release shifts the DOM the agent was trained against.
Model access is a smaller problem than it first appears
The model-routing layer is second-order relative to identity. Multi-provider routing through Vercel AI Gateway, OpenRouter, Portkey, or a custom proxy prevents vendor lock-in and enables cost-aware routing. The gap between Haiku-class and Opus-class pricing is roughly an order of magnitude and the "best model" rotates every three-to-six months as each provider ships a new generation. Routing decisions are best expressed at the skill definition level, not dynamically per request, because runtime task-complexity inference is itself expensive and unreliable.
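A sketch of skill-level routing as a static table; tier names are placeholders, not vendor SKUs:

```python
# Routing declared at the skill definition, not inferred per request: a static
# table the gateway reads.
SKILL_ROUTES = {
    "classify-intent": {"model": "small-fast", "fallback": None},
    "summarize-ticket": {"model": "small-fast", "fallback": "mid-tier"},
    "draft-contract": {"model": "frontier", "fallback": "mid-tier"},
}

def route(skill: str) -> str:
    """Deterministic, auditable, and free at runtime; re-pointing a tier when
    the 'best model' rotates is a one-line config change."""
    return SKILL_ROUTES[skill]["model"]

assert route("draft-contract") == "frontier"
```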
The argument for treating this as second-order is structural. A routing layer over an agent with unscoped credentials still produces data exfiltration on the first prompt-injection attempt; a single-provider stack over a strong identity and policy layer still operates within a bounded failure mode. Build the gateway before optimizing model choice; multi-provider routing is a second-quarter problem for most firms.
Observability is non-optional because agents fail quietly
Traditional application monitoring fails on agents because they fail quietly. A confident HTTP 200 response with well-formatted JSON can contain completely wrong content — a hallucinated customer ID, an invented invoice number, a reasoning chain that looks coherent and points at the wrong conclusion. Traditional observability stacks (Datadog, New Relic, CloudWatch) alert on exceptions and latency, and neither signal fires when the agent returns garbage at normal speed.
The required capabilities for agent observability are specific; a minimal detection sketch follows the list:
- End-to-end tracing that correlates each agent run across the reasoning trace, the full tool-call sequence, and the final output.
- Output validation against schema or semantic checks even when the HTTP status is 200.
- Loop detection that catches recursive tool calls before they drain a budget.
- Cost-anomaly alerts that fire when an agent's spend in a window exceeds three times its rolling average.
- Confidence-degradation tracking that catches the slow drift from correct to plausibly-wrong without a sudden failure.
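A minimal sketch of the cost-anomaly rule from the list above, assuming per-window spend totals already attributed per agent identity:

```python
from collections import deque

class SpendMonitor:
    """Fires when an agent's spend in the current window exceeds three times
    its rolling average, per the alert rule in the list above."""

    def __init__(self, windows: int = 24, factor: float = 3.0):
        self.history: deque[float] = deque(maxlen=windows)
        self.factor = factor

    def observe(self, window_spend: float) -> bool:
        baseline = sum(self.history) / len(self.history) if self.history else None
        self.history.append(window_spend)
        return baseline is not None and window_spend > self.factor * baseline

monitor = SpendMonitor()
for spend in [1.0, 1.1, 0.9, 1.0, 9.5]:   # a retry cascade hits the last window
    if monitor.observe(spend):
        print("cost-anomaly alert: page on-call, consider halting the agent")
```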
The frequently-cited agent-observability platforms in 2026 practitioner writing include Langfuse, LangSmith, and Arize (Phoenix / AX), with AI gateways like Helicone and Portkey covering the cost-tracking slice and Datadog and New Relic extending traditional APM stacks upward. The pattern most teams adopt is a single tracing platform paired with a gateway; a team standing up observability for one agent typically installs the tracing SDK first (Langfuse, Arize, or equivalent) and layers cost tracking and gateway routing afterward.
Budgets are a first-class infrastructure primitive
Token spend tracking operates per agent identity, not per parent account. The metrics that matter are cost per task execution, cost per skill, and cost per employee. Without per-user budgets the runaway-agent incident — the retry-cascade pattern Galileo documents as tripling hourly spend before anyone notices — is a matter of when, not if. The default posture that survives is a hard cap per agent per day, a soft alert at three times the agent's rolling-average daily spend, and a fleet-wide kill switch that halts every agent in the firm within seconds. The kill switch matters because runaway incidents typically run for hours before the finance team sees the invoice, and the halt action has to execute faster than the reporting delay once triggered.
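A sketch of the cap-and-kill-switch posture, with hypothetical wiring; a production halt would also revoke credentials and stop running sessions, per the termination path described earlier:

```python
class BudgetGuard:
    """Hard per-agent daily cap plus a fleet-wide kill switch. Both live in
    the harness and the policy engine, never in a prompt the model can forget."""

    def __init__(self, daily_cap_usd: float):
        self.cap = daily_cap_usd
        self.spent: dict[str, float] = {}
        self.fleet_halted = False

    def charge(self, agent_id: str, cost_usd: float) -> None:
        if self.fleet_halted:
            raise RuntimeError("fleet kill switch engaged")
        total = self.spent.get(agent_id, 0.0) + cost_usd
        if total > self.cap:
            raise RuntimeError(f"{agent_id} hit its ${self.cap:.2f} daily cap")
        self.spent[agent_id] = total

    def kill_switch(self) -> None:
        self.fleet_halted = True   # every subsequent charge refuses immediately
```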
Once every invocation is attributed to (agent-identity, skill, invoking-user, department), the firm picks one of three accounting shapes for how those attributions roll up to the P&L:
- Central pool. One firm-wide token budget allocated top-down. Works cleanly at small scale; breaks once more than roughly five departments compete for the pool because prioritization conversations become political rather than operational.
- Department allocation. Each department holds its own budget and pays directly. Scales to enterprise headcounts but creates cross-department friction on shared skills — the firm needs an explicit cost-sharing rule for skills invoked by multiple departments (attribute to caller's department, split pro-rata by usage, or socialize as overhead).
- Direct-user. Every employee holds a per-month token budget with rollover rules. Scales further and produces the cost-consciousness incentive directly, but requires per-identity attribution fully in place before the chargeback can be trusted.
Most 2026 production deployments use department allocation with a shared-skill cost-sharing layer on top. The three-options mapping ports from traditional cloud FinOps almost unchanged; the new primitive is only the attribution tag.
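A sketch of the attribution rollup, assuming invocation rows already carry the department tag; with caller attribution in place, pro-rata splitting of a shared skill reduces to summation:

```python
from collections import defaultdict

# Invocation rows already tagged (agent, skill, user, department): the one
# new primitive the chargeback depends on.
invocations = [
    {"skill": "summarize-ticket", "department": "support", "cost": 4.00},
    {"skill": "summarize-ticket", "department": "sales", "cost": 2.00},
    {"skill": "draft-contract", "department": "legal", "cost": 9.00},
]

def rollup(rows) -> dict[str, float]:
    out: dict[str, float] = defaultdict(float)
    for r in rows:
        out[r["department"]] += r["cost"]
    return dict(out)

print(rollup(invocations))   # {'support': 4.0, 'sales': 2.0, 'legal': 9.0}
```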
The wasted-token leak is a policy failure, not an engineering one. Production reports through 2025-2026 converge on roughly 35 to 45 percent of tokens consumed by failure modes that produce no usable output — re-summarization loops, tool-call amnesia, retry spirals beyond two attempts, hallucinated tool calls. The hard caps enumerated above (per-invocation ceiling, three-times-rolling-average alert, kill switch) are what close the leak; the authoring of those caps belongs in the governance policy document, not in an agent's prompt or on an observability dashboard someone checks once a week. Teams that put budget ceilings in prompts discover that prompts are channels the model can forget; teams that put ceilings in the harness and the policy engine discover the leak closes.
AI spend at firms serious about the transformation competes with payroll, not with software. At GTC 2026 Jensen Huang framed the scale publicly: he said he would be "deeply alarmed" if a $500,000 Nvidia engineer did not consume at least $250,000 worth of tokens annually, confirmed Nvidia is "trying to" spend roughly $2 billion per year on tokens for its engineering team, and separately floated giving every engineer an annual token budget worth roughly half their base pay as a recruiting incentive. At that dollar scale the FinOps discipline has to precede the deployment rather than follow it.
Four anchor cases carry the substrate end-to-end
Four public implementations carry different slices of the substrate pattern, and reading all four shows the category converging rather than any single vendor defining it.
Ramp Glass. The strongest public example of identity done cleanly at scale. Okta SSO connects approximately thirty pre-bundled tools at first launch — Salesforce, Snowflake, Gong, Slack, Notion, Google Workspace, Figma, Linear, Datadog — with a single click on install. The agent never holds a long-lived secret because the SSO flow is the substrate, not an afterthought bolted onto an agent that was built first. An MCP proxy keeps connections alive at app startup, compressing cold-start from 45 seconds to roughly 2 seconds. A three-person core team reached 700 daily active users within three months of Glass's launch, and non-engineers contributed skills through a Git-backed Dojo marketplace without ever needing direct credential access.
Block and Goose. Block's open-source coding harness Goose launched January 28 2025. The substrate-specific claim is that Goose is MCP-native and model-agnostic from day one, so Block's identity and policy layer had to abstract across providers and audit MCP connections uniformly rather than special-casing a single model vendor. The public release of the substrate preceded the February 2026 workforce restructure (developed in 1.3) by a full year. Substrate readiness is what made the later organizational move possible without a quality collapse.
Anthropic Cowork. Cowork gives Claude an isolated VM as the consent surface — the filesystem, the network stack, and the process tree all scope to the sandbox. The user grants access per folder and per domain rather than blanket, and the agent's local files and network are isolated from the human's. The infrastructure choice is the security model: the VM is where permission gets granted, rather than a claim the agent makes in a prompt. The underlying micro-VM isolation standard is Firecracker, AWS's KVM-based project with approximately 125ms cold start and density around 150 micro-VMs per second per host. Variants like Kata Containers run a container runtime inside a Firecracker micro-VM for developer-friendly ergonomics; gVisor intercepts syscalls in user-space for moderate isolation without full virtualization. Teams pick the isolation layer by threat model.
Browserbase bb. The substrate-specific technique bb's engineering blog contributes is the pre-warmed sandbox snapshot. A cron job rebuilds the base snapshot every 30 minutes with key repositories cloned into /knowledge/, dependencies pre-installed, the agent runtime pre-started on a local port, and system tools (bun, git, gh, ripgrep, pdftotext, TypeScript LSP, Tailscale) baked in. Cold start becomes near-instant; the sandbox is at most 30 minutes behind main; new sessions pull only the delta. The same agent runs in three modes — deployed Slack-interactive, background webhook-triggered, and web UI — with sandbox reuse keyed by Slack thread ID in a KV store so multi-turn continuity works without re-uploading context. The failure mode of the pre-warmed pattern is staleness: if the codebase moves fast and a session lands a few minutes after a breaking change on main, the sandbox's cached state can diverge from production, and the delta-pull at session start is what keeps the divergence bounded. The six-tool taxonomy — read, write, edit, exec, safebash, skill — constrains the surface the agent can touch; credential brokerage via the integration proxy plus network-layer injection keeps real secrets out of the sandbox entirely.
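A sketch of the thread-keyed reuse pattern, with hypothetical names rather than Browserbase's actual API:

```python
class SandboxPool:
    """Thread-keyed sandbox reuse: one KV lookup decides between resuming a
    live sandbox and cloning the pre-warmed snapshot."""

    def __init__(self):
        self.kv: dict[str, str] = {}   # slack_thread_id -> sandbox_id

    def for_thread(self, thread_id: str) -> str:
        if thread_id in self.kv:
            return self.kv[thread_id]   # multi-turn continuity, no re-upload
        sandbox_id = self._clone_snapshot()
        self.kv[thread_id] = sandbox_id
        return sandbox_id

    def _clone_snapshot(self) -> str:
        # Clone the cron-rebuilt base (at most 30 minutes stale), then pull
        # only the git delta since the snapshot to bound divergence from main.
        return "sandbox-0001"
```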
Six failure modes map to specific substrate gaps
Each named failure ties back to a specific layer. The pattern is the failure one encounters when that layer is skipped or misbuilt, not a generic cautionary tag.
- Service Account Trap. Multiple agents share one bot user. No audit log can attribute an action to a specific agent or human owner; compliance and forensic clarity collapse. Closing the gap requires unique managed identity per agent and per agent version.
- Standing privilege. Long-lived API keys or permanent tokens. Replay attacks become trivial and revocation requires a global logout. The 2026 answer is OIDC plus SPIFFE plus JIT tokens that expire in minutes and scope per sub-task.
- Lethal Trifecta. An agent holds sensitive data, consumes untrusted content, and has external communication capability. Any two of the three produce ordinary risk; all three combined produce structural prompt-injection exfiltration. Structural separation at the gateway — denying one of the three capabilities to the affected agent class, or quarantining untrusted content through a pre-processing agent with no data access — is the only mitigation that holds against adversarial inputs, because no prompt-level guardrail can be trusted.
- Naive MCP. Three MCP servers stacked equals over 100,000 tokens of tool descriptions before any work begins. Reasoning collapses once accumulated MCP responses push the agent into the tail of its context window where output quality measurably degrades. Programmatic Tool Calling and Code Mode close the gap.
- Skipped layer. Higher-order capability lands before the layer that should contain it. A skill marketplace shipped on top of no observability produces unreviewable skill execution. Agents deployed on top of no per-user budget produce runaway bills. Connectors installed on top of no data-loss prevention leak data silently. A gateway stood up without a policy engine cannot evaluate per-invocation authorization. Each skipped layer is invisible until the bill arrives, the data leaks, or the audit fails — the Gravitee data puts the share of organizations deploying agents without full security approval above 85 percent.
- Legacy thrash. Trying to replace SAP, Oracle, or 1996 banking software before agents exist. The pragmatic 2026 path is the inverse — demote the legacy system to a system of record and put computer-use agents above it as operators who click the same buttons the human did.
Horizon. The vendor landscape for agentic IAM, PEP/PDP fabrics, and MCP governance is moving fast enough that specific product choices made in mid-2026 will look stale by mid-2027. The category shape — identity plus policy plus ephemeral credentials plus runtime enforcement plus observability — is stable, while the vendor names inside each slot continue to churn on a quarterly cadence. The safe posture is to architect around the category and treat vendor selection as a 12-18 month refresh rather than a multi-year lock-in.
Run this week
Six concrete tasks, each with a deliverable and a time box.
- Agent inventory (2-4 hours). Build a spreadsheet with one row per production agent. Columns — owner, human-owner approval status, credential type (SSO / API key / service account), identity object (unique managed ID vs. shared bot user), data-access tier (personal / analytical / external), observability coverage (trace + cost + quality vs. none), daily spend cap. The spreadsheet is the baseline every subsequent audit references.
- Attribution test on one agent (1 hour). Pick the highest-volume production agent. For its most recent action yesterday, trace end-to-end: which identity, which user, which scope, which policy decision, which reasoning. If any of the five is missing, that is the first gap to close.
- Lethal Trifecta scan across the inventory (1 hour). For each agent, mark three boolean columns — sensitive data access, untrusted content consumption, external communication capability. Any agent scoring 3/3 is flagged. Structural separation at the gateway is the fix, not prompt-level guardrails.
- MCP schema-overhead measurement (2 hours). For each connected MCP server, count the tokens loaded at connection time. Total above 30,000 tokens per task triggers a migration plan — Programmatic Tool Calling for Anthropic stacks, Code Mode for Cloudflare-style typed SDKs. A counting sketch follows this list.
- Kill-switch drill (4 hours). In a sandbox environment, trigger a scripted retry loop in a test agent with real credentials. Time the fleet halt from incident start to agent stop. Target — under 30 seconds. If the fleet cannot halt in under a minute, the finance team will see the token bill before the security team hears about the incident.
- Observability SDK on one agent (1-2 days). Install Langfuse, Arize, or an equivalent tracing SDK on a single agent and wire up end-to-end tracing plus cost-anomaly alerts. Confirm trace visibility in the dashboard before expanding coverage; the rollout order is tracing first, gateway-based cost tracking second, quality evals third.
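For the schema-overhead measurement above, a counting sketch assuming each server's tool list has been dumped to a JSON file (most MCP clients can print it); tiktoken's cl100k_base is a proxy tokenizer, so treat counts as approximate:

```python
import json
import sys
import tiktoken   # pip install tiktoken

def schema_tokens(tool_dump_path: str) -> int:
    """Token count of one server's tool-schema dump. Exact counts vary by
    model tokenizer, so treat the total as a proxy, not a billing figure."""
    enc = tiktoken.get_encoding("cl100k_base")
    with open(tool_dump_path) as f:
        tools = json.load(f)
    return sum(len(enc.encode(json.dumps(tool))) for tool in tools)

if __name__ == "__main__":
    total = sum(schema_tokens(path) for path in sys.argv[1:])
    verdict = "over the 30,000-token line: plan a migration" if total > 30_000 else "ok"
    print(f"total schema overhead: {total} tokens ({verdict})")
```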
The next chapter picks up context engineering — how the firm structures data so agents can find, trace, and reason across it — on top of the substrate this chapter has described.