Where the Margin Goes

The previous chapter developed the unit-cost compression and the three downstream shifts. The gains the compression produces do not land evenly. The founder building a new company and the executive transforming an existing firm read the same shift from opposite sides — the founder asks where markets open at new price points, and the executive asks where workforce functions contract or grow. Both readers get the same underlying answer from the bounded-versus-expandable framing, but the operational move is different on each side.

The shift plays out at global scale over a decade. The World Economic Forum's Future of Jobs Report 2025 projects roughly 1.1 billion jobs transformed by technology over the coming decade; within the 2030 horizon specifically, it projects 92 million roles displaced and 170 million created, a net gain of 78 million. Most of that movement is reshape-in-place rather than elimination, but the split between the two varies sharply by market and by function. The mechanism that decides the split for a given market or function is older than AI: the Jevons Paradox applied to labor, which Mollick names directly. When the per-unit cost of producing something falls, total demand for it rises, and in the expandable-demand cases the productivity gain pays for more work rather than for fewer workers. The rest of this chapter walks through where that logic lands on each side.

Five service markets become attackable at agent-price points

The mechanism runs through production cost. When the per-unit cost of producing a service falls by an order of magnitude, the universe of customers who can afford the service expands, and the firms that can deliver at the new price can attack markets the old cost structure left under-served. Five markets show the shape most clearly today:

Legal services (contract review, compliance, first-pass redlines). Billing rates at Am Law 200 firms run roughly $500 to $1,200 per attorney-hour; first-pass review at an agent-assisted firm clears at $30 to $80 per document, an order-of-magnitude compression that opens the service to SMB clients priced out of traditional counsel. The category leader is Harvey, serving roughly 100,000 lawyers across firms including O'Melveny, A&O Shearman, and Latham & Watkins at a reported $195M ARR by end of 2025 and an $11B valuation from its March 2026 Series D. Senior attorneys retain the judgment and accountability layer while first-pass work migrates to the agent. Failure mode: unauthorized-practice-of-law exposure varies by state, and a founder without a licensed-attorney partner runs into that wall within the first few months.
Medical diagnostics and clinical documentation (ambient scribes, second opinions, initial imaging reads). Specialist reads at traditional rates run $200 to $700 per case, rationing second opinions to a fraction of the cases that would benefit. Agent-assisted initial reads compress the cost below $25 per case, routing clinician attention to the harder cases where human judgment changes outcomes. Abridge anchors the category: it raised a $300M Series E in June 2025 at a $5B valuation, and Kaiser Permanente's deployment across 40 hospitals and 600+ medical offices represents the largest generative-AI rollout in healthcare history (Kaiser's fastest implementation of any technology in more than 20 years). Failure mode: FDA clearance for clinical-decision-support software and malpractice-liability assignment are the distribution moats; a founder without a cleared device or a liability-partner hospital network does not reach the market.
Financial analysis and accounting for small and mid-market businesses. A fractional CFO runs $5,000 to $15,000 per month; an agent-produced monthly financial package with human oversight clears at $300 to $1,500, bringing full-coverage finance into segments that previously operated without it. Campfire is the category's current AI-native ERP anchor — a customer that migrated from NetSuite reported its monthly close cycle moving from 15 days to three days, and Campfire has doubled revenue for six consecutive quarters while raising $103.5M across seed, Series A, and Series B. Failure mode: audit sign-off and regulatory filings still require a licensed professional — the founder either partners with one or loses the higher-value work that produces customer stickiness.
Custom engineering design and small-batch manufacturing. Engineering-hour rates at specialty shops run $200 to $400; agent-led CAD and mechanical-layout work with human sign-off compresses the per-job cost to a fraction of the original, which makes prototype-scale and short-run work economical that large-contract engineering firms ignore. Hadrian operates the model at defense-industrial scale — $260M Series C in July 2025 led by Founders Fund and Lux Capital (plus a factory-expansion loan arranged by Morgan Stanley), software-defined precision factories producing mission-critical metal parts for aerospace and defense, and a "Factories-as-a-Service" structure serving Department of Defense munitions and shipbuilding programs. Failure mode: tolerance-stack-up and manufacturability tradeoffs still require physical-manufacturing expertise; an AI-native engineering firm that hands off broken parts to a human-staffed manufacturer loses the unit economics the agent produced.
Audit and assurance for SMB and mid-market firms. Big-Four audit engagements price out of reach for the long tail of mid-market companies, and the regional firms that serve that tail run on labor rates that have themselves compressed against AI-native challengers. Modus raised $85M in April 2026 led by Lightspeed Venture Partners to build an AI-native audit platform that partners with regional firms rather than replacing them, with a founding team drawn from Palantir, Citadel, Ramp, Thoma Bravo, Bridgewater, and AWS. Fieldguide reports that its platform automates up to 70 percent of audit testing — evidence validation, results summary, exception flagging — through end-to-end agent procedures. The market shape is the same as the other four: bounded by per-engagement labor cost in the legacy form, opened up at agent-price points to clients that previously self-audited or worked with an under-resourced regional partner. Failure mode: the human signatory on the audit opinion remains a regulatory requirement, and an AI-native firm without a licensed-auditor partner runs into the same regulatory wall as the legal-services category above.

Each of these markets shares the same structural shape: demand was bounded by price, the service was under-consumed at the old per-unit cost, and the regulatory or physical-world moat still requires human expertise at the judgment layer. The greenfield founder's test against the five: which of these applies in a market where the founder has three or more years of operating experience and five or more prospective customers willing to take a call this month? Distribution and regulatory fit, not product, are where most early attempts fail.
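The price compression can be made concrete with a little arithmetic. The following is a minimal sketch, not a pricing model: the function name `compression_range` and the dictionary labels are illustrative assumptions, and the only figures used are the ones quoted above. It covers only the two markets where the old and new prices are quoted in the same unit; the legal figures compare attorney-hours with documents, so they cannot be turned into a ratio without an assumption about hours per document.

```python
def compression_range(old_price, new_price):
    """Return (conservative, optimistic) old-to-new price ratios.

    Both arguments are (low, high) tuples quoted in the same unit.
    Conservative pairs the cheapest old price with the dearest new price;
    optimistic pairs the dearest old price with the cheapest new price.
    """
    old_low, old_high = old_price
    new_low, new_high = new_price
    return old_low / new_high, old_high / new_low


# Figures taken from the market descriptions above (USD).
markets = {
    # Fractional CFO versus agent-produced monthly package.
    "SMB finance, per month": ((5_000, 15_000), (300, 1_500)),
    # Specialist read versus agent-assisted initial read. The text gives
    # "below $25", so 25 is a ceiling and both ratios are lower bounds.
    "initial diagnostic read, per case": ((200, 700), (25, 25)),
}

for name, (old, new) in markets.items():
    low, high = compression_range(old, new)
    print(f"{name}: {low:.1f}x to {high:.1f}x")
```

On these figures the finance comparison spans roughly 3x to 50x and the diagnostic read at least 8x to 28x, which is a reminder that "an order of magnitude" describes the middle of a wide range rather than a single number.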

Role categories are contracting first at AI-native firms

Executives inside existing firms work the same cost compression from the other side. The question driving the restructure is which of the firm's current functions grow, stabilize, or shrink, and the clearest signal is what firms already operating this way have done.

Block's February 2026 restructure (developed at full length in 1.3) is the public benchmark for what gets cut at firm scale. The cuts concentrated in the parts of the org where bounded demand met substitutable work: layers of middle management and functions where the old workflow split a single task across multiple people coordinating through meetings. The Sequoia piece Dorsey and Botha co-authored makes the replacement org chart explicit: conventional management compresses toward three roles (individual contributor, directly responsible individual, and player-coach) coordinated through an intelligence layer that pipes every company artifact into one queryable model. The layers that disappeared were not the executors or the accountable owners; they were the information-routing middle that stopped being necessary once the routing could run on the substrate itself.

Privately held AI-native firms at smaller scale have gone further on role categories specifically. Improvado — a data-pipeline SaaS product with measurable revenue — has eliminated four role categories wholesale: business development representatives, content managers, pure product managers (every PM now codes), and dedicated frontend developers. Fifty to a hundred percent of the code in a given repo is written by agents; conversion rate has roughly quadrupled; headcount is roughly half its pre-restructure size; growth rate accelerated through the cut rather than dipping. The four eliminated categories shared a common shape — bounded demand for the output, task work that an agent now does end-to-end, and a workflow where the human role was doing the part the agent does better. A similar pattern runs at earlier-stage AI-native firms across verticals: the roles that go first are the ones sized to produce a fixed volume of bounded-demand output.

The mechanism is standard labor economics, not a 2026 invention. Two questions decide how productivity gains land on any given function. The first is whether AI augments the worker or substitutes for the task. The second is whether demand for the function's output expands with productivity, or is bounded by a fixed population, budget, or contract. The four combinations sort cleanly enough for a CEO to walk the firm's top eight functions through them inside an hour.

Where AI augments the worker and demand expands with productivity, the function grows — senior software engineering, specialist medical diagnostics above the first-read tier, partner-level legal work. Where AI substitutes for the task and demand is bounded, the function contracts — the role categories the AI-native firm above eliminated, alongside financial analysts on fixed coverage lists, contact-center agents on fixed inquiry volumes, traditional outbound BDRs on fixed lead pools. Where AI augments and demand is bounded, the function rebalances roles around the tool without changing team size much — content marketing, finance close cycles, most internal operations. Where AI substitutes and demand is expandable, the function splits by sub-segment — routine customer support compresses toward agents while personalized high-value support expands with more attention per customer.
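The two-axis sort above is small enough to write down as a lookup table. The sketch below is one hypothetical encoding of it: the enum names, the `classify` helper, and the placement of each example function are illustrative assumptions drawn from the paragraph above, not a tool the chapter's sources provide.

```python
from enum import Enum


class AIEffect(Enum):
    AUGMENTS = "augments the worker"
    SUBSTITUTES = "substitutes for the task"


class Demand(Enum):
    EXPANDABLE = "expands with productivity"
    BOUNDED = "bounded by population, budget, or contract"


# The four combinations and the outcome the text assigns to each.
OUTCOMES = {
    (AIEffect.AUGMENTS, Demand.EXPANDABLE): "grows",
    (AIEffect.SUBSTITUTES, Demand.BOUNDED): "contracts",
    (AIEffect.AUGMENTS, Demand.BOUNDED): "rebalances around the tool",
    (AIEffect.SUBSTITUTES, Demand.EXPANDABLE): "splits by sub-segment",
}


def classify(effect: AIEffect, demand: Demand) -> str:
    """Look up the expected outcome for one function."""
    return OUTCOMES[(effect, demand)]


# Example functions named in the text, one per quadrant.
functions = {
    "senior software engineering": (AIEffect.AUGMENTS, Demand.EXPANDABLE),
    "outbound BDRs on fixed lead pools": (AIEffect.SUBSTITUTES, Demand.BOUNDED),
    "content marketing": (AIEffect.AUGMENTS, Demand.BOUNDED),
    "customer support": (AIEffect.SUBSTITUTES, Demand.EXPANDABLE),
}

for name, (effect, demand) in functions.items():
    print(f"{name}: {classify(effect, demand)}")
```

The judgment sits in assigning each function to a quadrant, not in the lookup; the table only keeps the answers consistent once those two calls are made.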

Broader enterprise surveys match the direction. McKinsey's 2025 State of AI survey reports that a plurality of respondents saw little function-level headcount change from AI over the past year, but a median 30 percent of respondents expect a decrease in function-level workforce over the next year. The expected-decrease concentration sits in exactly the combination the AI-native firms cut first. McKinsey Global Institute's Agents, Robots, and Us adds the other side: new role categories (agent product managers, AI-evaluation writers, human-in-the-loop validators) are appearing alongside the contractions rather than after them, and hiring into the new categories is often funded by the hiring that stops in the old ones.

The CEO faces two failure modes and neither one is safe

The decision has no safe side. Cutting the workforce faster than AI can actually replace the tasks produces a familiar pathology — knowledge loss degrades handoff quality on the work the old staff understood implicitly, and critical talent leaves for competitors that have not yet restructured. Failing to restructure leaves the firm running the old cost base while competitors who have restructured produce the same output at lower cost, the gap compounds quarterly, and the eventual restructure lands under market-share pressure rather than on the firm's own schedule.

A CEO deciding which failure mode the firm is closer to has concrete signals to check. The cutting-too-fast risk surfaces in sub-twelve-month average tenure after a restructure, quality regressions on handoff-heavy workflows, and regretted-attrition rates above fifteen percent in the functions the firm cut. The cutting-too-slow risk surfaces in revenue-per-employee trailing the two nearest competitors, headcount growing faster than output for three or more consecutive quarters, and AI-assist tool penetration below thirty percent in functions where augmentation raises individual productivity without changing team size. Most firms in mid-2026 sit on the cutting-too-slow side of this diagnostic, and the error mode that gets called "caution" is usually a forecast that the transition will run slower than it actually does, a bet paid out in revenue-per-employee over the next six to eight quarters relative to the firms that bet the other way.
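The six signals can be checked mechanically once the numbers are pulled. The sketch below is a hypothetical checklist under stated assumptions: the `FirmSignals` field names and the example firm are invented for illustration, "trailing the two nearest competitors" is read as trailing both of them, and the thresholds are the ones given in the paragraph above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FirmSignals:
    avg_tenure_months_after_restructure: float
    handoff_quality_regressed: bool
    regretted_attrition_in_cut_functions: float  # fraction, 0.0 to 1.0
    revenue_per_employee: float
    nearest_competitor_rpe: Tuple[float, float]  # two nearest competitors
    quarters_headcount_outgrew_output: int       # consecutive quarters
    ai_assist_penetration: float                 # fraction, 0.0 to 1.0


def too_fast_flags(s: FirmSignals) -> List[str]:
    """Signals that the firm cut faster than AI could absorb the work."""
    flags = []
    if s.avg_tenure_months_after_restructure < 12:
        flags.append("average tenure under twelve months")
    if s.handoff_quality_regressed:
        flags.append("quality regressions on handoff-heavy workflows")
    if s.regretted_attrition_in_cut_functions > 0.15:
        flags.append("regretted attrition above 15 percent")
    return flags


def too_slow_flags(s: FirmSignals) -> List[str]:
    """Signals that the firm is still running the old cost base."""
    flags = []
    # Assumption: "trailing" means below both competitors, not just one.
    if s.revenue_per_employee < min(s.nearest_competitor_rpe):
        flags.append("revenue per employee trails both nearest competitors")
    if s.quarters_headcount_outgrew_output >= 3:
        flags.append("headcount outgrew output for three or more quarters")
    if s.ai_assist_penetration < 0.30:
        flags.append("AI-assist penetration below 30 percent")
    return flags


# Invented example firm, for illustration only.
example = FirmSignals(
    avg_tenure_months_after_restructure=30,
    handoff_quality_regressed=False,
    regretted_attrition_in_cut_functions=0.08,
    revenue_per_employee=400_000,
    nearest_competitor_rpe=(520_000, 610_000),
    quarters_headcount_outgrew_output=4,
    ai_assist_penetration=0.22,
)

print("too fast:", too_fast_flags(example))
print("too slow:", too_slow_flags(example))
```

The invented firm trips none of the too-fast flags and all three too-slow flags, which is the profile the text describes as typical in mid-2026. A count of flags is a prompt for a conversation, not a verdict.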

3.3 develops the sequencing framework — which functions to restructure first, how to budget for the transition-quarter productivity dip, and how to avoid the forced-restructure spiral that bounded-demand functions produce when handled as if they were expandable.

Goal selection, accountability, and novel judgment remain structurally human for now

The boundary holds today and will not hold forever. The domains that have resisted the current generation of agent substitution share a structural feature — the task cannot be specified completely enough in advance for a reward model to evaluate it without a human in the loop. Three categories sit at the boundary today:

Goal-setting under ambiguity. Choosing which problem is worth solving, with no ground truth available for the choice until the work is underway. Resists substitution because the evaluator and the chooser cannot be separated — the same judgment that picks the problem also assesses whether the chosen problem was worth picking.
Accountability for uncontractable outcomes. Signing off on outcomes where the failure mode cannot be pre-specified contractually — a regulatory approval, a loan commitment, a surgical decision, a hiring call. Resists substitution because legal and economic frames still require a natural person to hold the liability, independent of where the work was actually done.
Novel judgment at the frontier of what has been done before. Evaluating which direction to take when the training distribution does not contain a clear precedent — a genuinely new product category, a deal structure with no prior precedent. Resists substitution because the model's strength is pattern-matching against prior examples, and the domain by definition has few or none.

Each of these resists substitution today, and each has visibly moved between model generations over the last two years. Teams that design their operating model around "these will always stay human" are making a forecast about model capability that the last three years have been falsifying. The working posture is configurable autonomy — every role carries an explicit statement of which parts are currently human-held, which parts are AI-assisted today, and which parts are expected to migrate within the next one or two model generations.

Horizon: The boundary between human-only and agent-assisted tasks has visibly moved in each model generation since GPT-4. Planning for it to keep moving at roughly the same cadence is a bet on pattern continuation rather than a guaranteed outcome, but the pattern is where every operating-model design decision sits right now.

Next two weeks for a reader of this chapter. A founder applies the five-market filter: pick one market where the founder has three-plus years of operating experience and five customer conversations ready this month, then sketch one page of who pays what for which unit of output today versus the agent-price equivalent. An executive runs the two-axis test (augment versus substitute, crossed with expandable versus bounded) across the firm's top eight functions, pulls revenue-per-employee against the two nearest competitors, and books one sequencing conversation for the substitute-plus-bounded function with the highest headcount share. The Interstitial that follows turns both versions of the exercise into the formal autonomy-map that the rest of the playbook assumes the reader has done.