The Shift
Why every company is being rebuilt
Why production-ready agentic AI forces every firm to move from building products to building the systems that produce them, and what converges in 2026 to make the shift non-optional.
The arrival of production-ready agentic AI is a restructuring of work on the scale of the steam engine. Every organization now shifts from building products directly to building AI-powered systems that produce products, with the firm's competitive advantage relocated from operational throughput to the design and maintenance of the underlying agent architecture.
The shift is legible on two different time horizons depending on the reader. Founders and executives who restructure now capture order-of-magnitude gains on revenue per employee, time to ship, and cost per unit of work, and reset the cost baseline competitors have to match. Professionals who learn to design, deploy, and govern agent fleets define what competence looks like in their field for the next decade. Roles whose value rested on executing tasks an agent can now perform face compression. The scale is broad: the World Economic Forum's Future of Jobs Report 2025 projects that, across the roughly 1.18 billion formal jobs its dataset covers, 92 million roles will be displaced and 170 million new ones created by 2030.
Three twentieth-century results frame the shift
The analytical frame for the shift was built piecemeal across the twentieth century:
- Coase (1937) established that firms exist because internal coordination costs less than coordinating the same work through the open market. The argument lived in business-school textbooks for decades as an interesting abstraction.
- Shannon (1948) showed that any signal, of any kind, can be represented as bits and processed through a common computational substrate.
- Beer (1972) framed any viable organization as nested sensor-actuator loops that together monitor operations, coordinate resources, manage the present, plan the future, and hold identity.
Two terms that recur throughout do different jobs. AI is the substrate that produces tokens at machine speed. Cybernetics is the architecture that wires those tokens into a self-regulating system. AI-native names the regime where the two finally meet, because agents are now cheap enough to staff Beer's nested loops at the bandwidth his design assumed but pre-AI hierarchies could never supply.
All three results were largely theoretical for the rest of the century. In 2026 they become operational together, because the cost of moving bits through an autonomous computational substrate rather than through salaried humans drops by an order of magnitude:
- Coase's transaction-cost argument converts into measurable unit-economics questions (developed in 1.3).
- Shannon's common substrate becomes an actual substrate rather than a metaphor, with every business function decomposing into bit-processing pipelines.
- Beer's sensor-actuator loops can be staffed by agents inside skill files, evaluated against defined evals, and composed into the operating model rather than delegated to management layers.
Every business function is a token pipeline
A business function, treated abstractly, is a transformation of one class of tokens into another: sales ingests inquiry tokens and produces contracts; engineering takes requirements as input and yields code, tests, and deployment artifacts; finance consumes transactions and produces reconciliations; most other functions decompose the same way. The framing follows directly from Shannon: if any signal can be represented as bits, then every business activity built on information exchange is, in the rigorous sense, a bit-processing activity.
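A minimal sketch of the framing in code. The `Pipeline` type, the `then` combinator, and the example functions are illustrative assumptions, not the API of any agent framework:

```python
# Illustrative sketch: a business function as a typed token transformation.
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

In = TypeVar("In")
Mid = TypeVar("Mid")
Out = TypeVar("Out")

@dataclass(frozen=True)
class Pipeline(Generic[In, Out]):
    """One business function: a transformation of input tokens to output tokens."""
    transform: Callable[[In], Out]

    def __call__(self, tokens: In) -> Out:
        return self.transform(tokens)

    def then(self, nxt: "Pipeline[Out, Mid]") -> "Pipeline[In, Mid]":
        # Composition: the output token class of one function is the
        # input token class of the next.
        return Pipeline(lambda tokens: nxt(self(tokens)))

# Example: sales as inquiry tokens -> qualified-lead tokens -> contract draft.
qualify = Pipeline(lambda inquiry: {"lead": inquiry, "score": 0.9})
draft = Pipeline(lambda lead: f"contract draft for: {lead['lead']}")
sales = qualify.then(draft)
print(sales("enterprise licensing inquiry"))
```

The composition operator is the point: under this framing the firm itself is a pipeline of pipelines, and designing the composition is the work.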
The pipeline has three measurable properties that, until recently, were bounded by human cognitive constraints:
- Flow rate — how fast tokens move from input to output. Historically capped by human reading and comprehension speed.
- Loss rate — the fraction of context that drops at each handoff. Constrained by working memory and organizational communication bandwidth.
- Cost per token — the monetary price of processing one unit of work. Historically a function of salaried labor rates divided by per-FTE output.
Each of these bounds produced a specific organizational artifact: management layers aggregated flow, documentation systems mitigated loss, and FTE pricing managed cost.
Inference-based agents move all three bounds at once:
- Flow rate now scales with pipeline design and model throughput, orders of magnitude above what humans reading and typing can produce.
- Loss rate is governed by context-window size and context-graph quality, and shrinks as both improve across model generations.
- Cost per token tracks model choice and prompt optimization, with a cost base an order of magnitude below salaried labor for most comparable task categories.
Running a firm, in this framing, is the work of designing the pipeline and setting its parameters rather than the work of hiring people to absorb the bounds.
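A back-of-the-envelope comparison of the three bounds for a hypothetical document-review function. Every number below is an illustrative assumption, not a benchmark:

```python
# Hypothetical comparison of the three pipeline bounds. All figures are
# illustrative assumptions (salary, reading speed, token prices), not data.

def cost_per_token(annual_cost: float, tokens_per_year: float) -> float:
    return annual_cost / tokens_per_year

# Human baseline: ~250 words/min at ~1.3 tokens/word, 6 focused hours/day,
# 220 workdays/year, $120k fully loaded annual cost.
human_flow = 250 * 1.3                                  # tokens/minute
human_tokens_per_year = human_flow * 60 * 6 * 220
human = {
    "flow (tokens/min)": human_flow,
    "loss per handoff": 0.20,                           # assumed context loss
    "cost per token ($)": cost_per_token(120_000, human_tokens_per_year),
}

# Agent pipeline: throughput set by model serving, not reading speed.
agent = {
    "flow (tokens/min)": 50_000,                        # assumed serving throughput
    "loss per handoff": 0.02,                           # structured state, not memory
    "cost per token ($)": 3.0 / 1_000_000,              # assumed $3 per 1M tokens
}

for bound in human:
    print(f"{bound:>20}: human={human[bound]:.3g}  agent={agent[bound]:.3g}")
```

Under these assumptions the cost-per-token gap alone runs to roughly three orders of magnitude; far more conservative inputs still clear the order-of-magnitude threshold the argument needs.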
Block and Shopify describe the shift from the CEO seat
Two public framings, from Block and Shopify, describe the shift in operating-model terms.
Jack Dorsey frames Block's company architecture as a four-layer stack: capabilities at the base, interfaces above, proactive intelligence above that, and a unified world model at the top. In public statements and through Block's internal restructure, Dorsey has described the shift plainly: a firm that ships products becomes a firm that builds and runs the system that produces products, with products falling out of the system rather than being the primary object of the firm's work. Under this stack, the firm's capabilities are components the system uses rather than features the company ships directly. Leadership gains an order-of-magnitude improvement in organizational legibility from the unified world model, because every artifact the firm produces (Slack messages, emails, PRs, recorded meetings) is piped into the intelligence layer and becomes queryable state.
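A toy sketch of what "every artifact becomes queryable state" can mean mechanically. The schema, data, and query are hypothetical illustrations, not a description of Block's actual system:

```python
# Toy illustration of a unified world model as an append-only, queryable
# event log. Schema and rows are hypothetical, not Block's implementation.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE artifacts (
    ts TEXT, source TEXT, author TEXT, body TEXT)""")

# Everything the firm produces gets piped into one store...
rows = [
    ("2026-01-12T09:14", "slack", "ana", "shipping the billing fix today"),
    ("2026-01-12T10:02", "github_pr", "ana", "PR: retry logic for billing webhooks"),
    ("2026-01-12T11:30", "meeting", "sam", "decided to defer the EU rollout"),
]
db.executemany("INSERT INTO artifacts VALUES (?, ?, ?, ?)", rows)

# ...so a leadership question becomes a query instead of a status meeting.
for ts, source, body in db.execute(
    "SELECT ts, source, body FROM artifacts WHERE body LIKE '%billing%' ORDER BY ts"
):
    print(ts, source, body)
```

The design choice doing the work is the single store: once every artifact lands in one queryable substrate, organizational legibility is a query away rather than a reporting chain away.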
Tobi Lütke's adjacent claim from Shopify is more radical: companies are social technologies that the industry does not yet know how to engineer. Under this framing, AI is the first tool that lets the company itself be the engineered object rather than the product the company ships, and Shopify's "prove AI can't do it" memo of April 2025 captures the operating-model consequence.
Both framings point at the same structural shift: the firm is moving from product-building to system-building, and the system builds the products. Block's four-layer stack and Shopify's claim that AI lets the company itself become the engineered object are two angles on the same restructuring of the senior-leadership job.
Four compounding trends concentrate the transition in 2026
- Inference-price collapse. Between April 2023 and March 2025, the price to reach GPT-4-level performance fell by more than 10x, with task-specific rates declining between 9x and 900x annually depending on the task. When the cost of the underlying computation falls by an order of magnitude, the set of tasks worth automating expands into core business functions.
- Autonomous task duration on a compressed doubling cadence. The 99.9th-percentile length of autonomous agent turns grew from under 25 minutes to over 45 minutes between October 2025 and January 2026, and METR's time-horizon benchmark shows the median task an agent can complete autonomously doubling every 89 days since early 2024 (the arithmetic is sketched after this list). Work that previously required a full human day of focused attention moves into autonomous reach inside a model generation.
- Adoption velocity narrowing the transition window. Generative AI reached 54.6% adoption at its three-year mark, faster than personal computers (19.7%) or the internet (30.1%) at the same point in their diffusion. The window available to incumbents measures in quarters rather than years.
- Four scaling laws compounding simultaneously. Richard Sutton's 2019 essay The Bitter Lesson named the structural regularity that underwrites the trend: across seven decades of AI research, general-purpose methods that scale with available computation have outperformed approaches that embed human-coded domain knowledge, because methods that ride the cost-of-computation curve compound while the others stall. The four scaling laws on display in 2025-2026 (pre-training scale, post-training and reinforcement-learning scale, test-time compute, and reinforcement self-improvement, the last developed in 2.6) are the current instantiation rather than a one-off. Each carries its own improvement rate, and the effective capability trajectory is the product of all four rather than the sum, as the sketch below illustrates.
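A short numeric sketch of the two quantitative claims above: the 89-day doubling cadence and the multiplicative compounding of the four rates. The per-generation multipliers in the second part are hypothetical placeholders, not measured values:

```python
import math

# 1) Doubling cadence: if the autonomously completable task length doubles
#    every 89 days, the time to go from a 1-hour to an 8-hour horizon is
#    doubling_time * log2(target / start).
doubling_days = 89
start_hours, target_hours = 1.0, 8.0
days_needed = doubling_days * math.log2(target_hours / start_hours)
print(f"1h -> 8h autonomous horizon in ~{days_needed:.0f} days")  # ~267 days

# 2) Compounding: four simultaneous improvement rates multiply rather than add.
#    These per-generation multipliers are illustrative assumptions.
gains = {
    "pre-training scale": 1.5,
    "post-training / RL scale": 1.4,
    "test-time compute": 1.6,
    "self-improvement": 1.3,
}
additive = 1 + sum(g - 1 for g in gains.values())
multiplicative = math.prod(gains.values())
print(f"additive reading:       {additive:.2f}x baseline")        # 2.80x
print(f"multiplicative reality: {multiplicative:.2f}x baseline")  # ~4.37x
```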
The structural consequence is a binary choice at the firm level: redesign workflows around agent processing, or layer AI tools onto legacy processes while keeping the process topology intact. 1.2 develops why the second option (retrofit) is the default and the specific ways it breaks; 1.3 quantifies the cost-curve difference between the two paths; 1.4 carries the analysis to the margin math of the redesign.