The Playbook
Build Knowledge, Not Systems
A team of N practitioners running individual Personal OSes produces N individual outputs. Each agent works alone, cross-references nothing, and re-asks the questions the person at the next desk already answered yesterday. The team's average does not move. The shared workspace is the single piece of infrastructure that turns those N individual practices into team capability. With it, one person's breakthrough becomes everyone's baseline within a week. Without it, Stage 1 is a shared Git repo with N folders that never reference each other and a workspace that exists on the org chart rather than in practice. The practitioner outcome that makes this load-bearing rather than aesthetic: onboarding collapses from the several weeks it used to take to roughly one to two days on standardized processes, because the workspace is the onboarding.
The principle that governs this chapter inverts the standard IT playbook: do not design the structure. Describe what the team needs. The structure is a byproduct of use, not a prerequisite. Teams that try to design the perfect ontology before anyone starts producing artifacts spend six months in committee and ship nothing usable. Teams that dump company context into a single folder and start the agent on Monday have something working by Friday and a real structure emerging within a month. The three-stage path practitioners converge on is just start, then structure, then ontology — and the cost of skipping straight to ontology is months of empty taxonomy that produces no working artifacts.
The shared workspace is what makes Stage 1 load-bearing
Every team member running a Personal OS produces two things of lasting value: the four context files, and the working artifacts (morning brief, style guide, deal rubric, decision logs) that accumulated while they practiced. Those artifacts only compound for the organization if a colleague can read them, run them, extend them, and contribute back. Without a shared substrate, the best practitioner in the team produces the best individual output and the team's average sits unchanged. With a shared substrate, one person's breakthrough becomes everyone's baseline within a week, because the artifact lives where every agent reads on every session and every other practitioner can grep, modify, or extend it.
Stage 1 belongs between Stage 0 (personal practice) and Stage 2 (the first production pipeline) for a structural reason: the pipeline requires shared context that the workspace holds. A first pipeline written against N individual workspaces produces N pipelines, each of which has to be maintained separately. The same pipeline written against a shared workspace becomes a single artifact that every team member's agent can read, every new hire's agent reads on day one, and every modification accrues to the team rather than to the individual who shipped it. The workspace is the substrate that lets the pipeline scale past one user.
Start with one shared root, not the perfect ontology
The fastest working shared workspace is the four-file Personal OS scaled to a team root. One repository, opened daily by every team member, containing four files at the root and nothing else for the first week:
- A top-level CLAUDE.md (or equivalent) that names who the team is, what it does, how it works, and the load-bearing rules of engagement. Roughly 40 to 60 hand-written lines, not an auto-generated 400-line marketing dump. The practitioner pattern that consistently outperforms generated alternatives is hand-crafted, concise files with explicit "what to avoid" directives.
- A context.md with the company brain dump. Products, pricing, ICP, logistics, partners, current OKRs. The first draft is a messy export from Notion or wherever the canonical content currently sits. The version that pays off is the one maintained weekly as the company changes. Before building this in earnest, inventory the sources the team will eventually ingest: system name, access method (API, manual export, or locked behind a UI), and whether an MCP connector already exists.
- Per-role role.md files (sales, marketing, engineering, customer success, finance, operations) capturing the verbs of each role. Who approves what, who owns which outcomes, what "good" looks like on each recurring deliverable. The temptation to write these as job descriptions misses the point. The agent needs verbs and concrete decision rules, not ladder rubrics.
- A shared lessons.md that grows one line at a time and gets tiered into HOT and WARM as it crosses roughly 50 entries. Same mechanism as the personal lessons file, now consolidated across the team so that a correction once becomes a correction for everyone.
This is the whole workspace at Day 7. It does not solve every context problem the team has. It does solve enough of them that agents stop hallucinating basic company facts and start producing work colleagues recognize as theirs. The structure emerges from the questions the team asks in the first weeks, and trying to design a comprehensive folder tree before use inverts the principle the chapter is built on. The most common failure in 2026 rollouts is a perfect-structure plan that never ships because the committee debating the taxonomy is too far ahead of the people producing artifacts.
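To make the first of the four files concrete, here is a minimal sketch of a hand-written root CLAUDE.md. The team name, headcount, and rules are illustrative placeholders, not a template from any particular rollout:

```
# Acme Analytics — Team Root

## Who we are
B2B SaaS, marketing-attribution analytics. 14 people: sales (3),
marketing (2), engineering (6), customer success (2), operations (1).

## How we work
- Canonical company context lives in context.md; role verbs live in the role.md files.
- Every correction worth keeping goes into lessons.md, one line each.
- Standardized client work runs through the master skills in skills/;
  no manual launches once a skill is stable.

## What to avoid
- Never quote pricing from memory; read the pricing section of context.md first.
- Never send email to a client directly; draft it for human sign-off.
- Never invent a client entity; unresolved names escalate to operations.
```

The "what to avoid" block is the part the list above calls out as the differentiator between hand-crafted and generated files.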
Outgrow the single repo into Vault, Connectors, Skills
Once the team passes roughly 20 people, or the first compliance or privacy boundary appears, the natural shape is three layers: Vault, Connectors, Skills. Each layer carries a clear purpose and a clear failure mode, and the layers separate cleanly enough that a team can adopt them one at a time as the underlying problem becomes acute.
Layer 1 — the Vault. The content layer, where knowledge actually lives. Personal vaults per member hold material an individual needs but no agent except their own should see (private notes, 1:1 call recordings, drafts in progress). Shared vaults per project hold what the team collaborates on (CLAUDE.md, context, role files, shared skills, meeting notes, project docs). Git is the default sync transport because it is free, version-controlled, and second nature for every engineer on the team. The practical pattern at small scale: everything is personal by default, and projects are the sharing unit, with files explicitly copied from personal into project folders when collaboration starts. The shared context is the substrate every player consults, replacing the conductor that used to be the manager. The failure mode at this layer is N individuals' four-file systems sitting in a Git repo with no common root: agents cannot cross-reference, people re-answer the same questions, and the team's average output never moves.
Layer 2 — Connectors. The integration layer, where agents reach into the systems of record. Centralized, security-audited connectors for chat, mail, CRM, calendar, ticketing, code, and the rest of the SaaS stack, each with agent identity, scoped credentials, an audit trail, and a periodic parsing cron for systems that do not push updates. This layer is where the service-account trap must be avoided at team scale: each agent acts under its own named identity rather than a shared bot user, so audit logs can attribute every action and compliance reviews hold up. The agentic-first stack principle belongs here too. When picking vendors, choose by API, MCP, and CLI capability, not by how the dashboard looks. A tool with a beautiful UI and no agent access becomes a dead end at Stage 1 because humans will write to the UI and agents will never see the writes. The failure mode at this layer is credential sprawl: team members holding their own keys, every departure breaking half the workspace, and the audit trail fragmenting across personal accounts.
Layer 3 — Skills. The capability layer, where encoded organizational procedures live. A shared skills library (Git-backed), an admin who reviews new skills for security and quality before they land, and a discovery mechanism so colleagues can find the skill that matches the task they are trying to do. Entity resolution as a human responsibility — master profiles linking the same person across email, chat, CRM, and call transcripts — sits at this layer because every skill benefits from it once built. The mature endpoint of this layer: at fifty skills the library is additive; at five hundred it has to start consolidating into a single shared knowledge graph or it will rot. Champions discover new skills, translate tacit knowledge into skill files, and define the eval criteria that tell the team whether a skill is ready for production. The failure mode at this layer is orphan skills: skills with no owner, no retirement policy, and no eval, accumulating until the library contains capabilities that no longer match the business and nobody can safely delete.
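One plausible directory shape once the split happens — folder and file names here are illustrative, and real teams will vary the nesting:

```
vault/
  personal/<member>/        # private notes, 1:1 recordings, drafts in progress
  shared/<project>/         # CLAUDE.md, context.md, role files, meeting notes
connectors/
  crm/                      # connector description, scoped-credential reference,
  mail/                     #   audit-trail location, parsing-cron schedule
  calendar/
skills/
  client-onboarding.md      # the master skill, reviewed before landing
  client-intake.md          # a sub-skill the master links to
  entity-resolution.md      # master person profiles across systems
```

The point of the sketch is the separation, not the names: content, integrations, and procedures live in distinct subtrees so each can grow, be audited, and be owned on its own schedule.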
The public anchor for this shape running in production: Block. Angie Jones, who led AI Enablement on the Block engineering team, published the architecture in detail in January 2026. Every repository carries two markdown files at the root: an AGENTS.md (machine-facing — build and test commands, code-style conventions, the architecture patterns the agent loads on entry) and a HOWTOAI.md (human-facing — how the team uses AI in this repo, with setup instructions, tips, and example workflows). Subdirectory-scoped rule files give agents domain-specific context without bloating the token budget. The connector layer is a library of more than 60 internal MCP servers, each authored by Block engineers, with OAuth flows and keyring-stored credentials replacing the long-lived secrets that would otherwise sprawl across machines. Within three months of launching the AI Champions program against this substrate, automated pull requests rose 21×, reported time savings rose 37%, and AI-authored code jumped 69%. The substrate today serves roughly 7,500 weekly active employees across Square, Cash App, Afterpay, Tidal, and Platform teams, with about 90% of code submissions authored partially or fully with AI. The architecture is the same shape this chapter develops, and the load-bearing observation is that the lift came from the markdown substrate and the connector layer rather than from any choice of model.
A note on connector efficiency. The naive pattern of loading every available tool definition upfront breaks down past a few dozen connectors because the tool catalog itself eats the model's context window before the user has typed a question. Anthropic's November 2025 engineering post on MCP code execution showed the alternative: instead of loading every server's tools, the agent discovers tools by exploring a filesystem of server folders and reading only the tool files needed for the current task, dropping context use from 150,000 tokens to 2,000 tokens for a 98.7% saving on the same workflow. The Vault / Connectors / Skills pattern aligns naturally because the workspace is already a filesystem the agent reads on demand. Connector descriptions live in their own folders, and skills reference the specific connector files they need rather than enumerating the whole catalog upfront.
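The shape of that filesystem, sketched in workspace terms — Anthropic's post presents the tool files as code, and the server and tool names below are illustrative stand-ins:

```
servers/
  crm/
    search-contacts.md      # read only when a task touches the CRM
    update-deal-stage.md
  mail/
    draft-message.md
    search-threads.md
  calendar/
    find-free-slot.md
```

An agent asked to update a deal stage lists servers/, reads only the two files under crm/, and never pays the context cost of the mail and calendar catalogs.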
The master skill compresses onboarding from weeks to days
The load-bearing artifact of Stage 1 is the master skill: a single top-level markdown file, typically around 200 lines, that encodes an entire standardized process by linking to the sub-skills that execute each step. For a B2B SaaS client onboarding, the master skill contains the steps (collect client info, configure data sources, connect ad accounts and web analytics, set up the tagging structure, validate reports, confirm attribution), the data dependencies (what each step needs from the client), the approval gates (what a human signs off at which stage), the escalation rules (what to do when the client is unresponsive or data quality fails), and inline links to the sub-skills that actually do the work.
A simplified concrete example of the structure:
# Master: Client Onboarding
## When to run
- Trigger: new client contract signed in CRM.
- Entry point: /onboard-client <client_id>
- Duration: 1-2 business days for standard clients; escalate if longer than 5 days.
## Steps
1. Collect client info -> see skills/client-intake.md
- Approval gate: client contact confirms entities to include.
2. Data source inventory -> see skills/data-inventory.md
3. Connect ad accounts -> see skills/ad-platform-auth.md
4. Configure tracking -> see skills/tracking-config.md
5. Validate first reports -> see skills/report-validation.md
- Escalation: if any dimension fails validation twice, escalate to analyst.
6. Confirm attribution -> see skills/attribution-setup.md
- Final approval: account manager signs off before client handoff.
## Hard rules
- Never launch without the client's explicit confirmation of legal entities.
- Never auto-resolve a tracking discrepancy above 5% — always escalate.
- Log every deviation to runs/<client_id>/exceptions.md.
The master skill works as a narrative spine that composes the sub-skills, rather than replacing them. The onboarding-compression claim (several weeks down to one to two business days on standardized processes) is the consequence of three things the master skill enforces. First, the process is legible to both agent and human in the same language, so there is no translation step between what the senior analyst knows and what the agent can execute. Second, the hard rules and escalation thresholds mean the agent does not block on judgment calls it is not trusted to make; it escalates with context. Third, the act of writing the master skill is itself the forcing function that reveals which of the team's "documented processes" were actually consistent and which were whatever the senior analyst felt like that week. The inconsistencies show up because the agent executes the letter of the file and produces the wrong outcome, and the fix is to sharpen the file.
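For contrast with the master spine, a sub-skill is narrow and operational. A hedged sketch of what skills/client-intake.md might contain — the fields, paths, and thresholds are illustrative, not prescribed:

```
# Skill: Client Intake

## Inputs
- client_id from the CRM; signed contract; primary contact email.

## Steps
1. Pull the CRM record; extract legal entities, billing country, plan tier.
2. Draft the intake request to the client contact.
3. Wait for the client's confirmation of entities to include
   (the approval gate named in the master).
4. Write the confirmed entities to runs/<client_id>/entities.md.

## Escalation
- No client reply within 3 business days -> notify the account manager.
```

The master stays a readable spine precisely because detail like this lives one level down.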
The cultural move that makes this stick is uncomfortable for the first week and routine within a month: manual launches of the master-skill process come off the table once the skill is stable. The team must go through the master-skill flow; where the practitioner stumbles, they debug and improve the skill. Teams that keep a manual fallback path keep the manual path, and the skill rots because nobody is forced to fix the failures the agent surfaces. The directive feels heavy-handed at first because it removes the practitioner's familiar "just do it manually this once" escape hatch, but the discipline is what produces the compression. A skill that runs ten times has been pressure-tested ten times; a skill that runs three times because the team kept routing around it is still a draft.
Nightly sync keeps the workspace from rotting
A shared workspace that is not actively maintained decays in ways the team does not notice until it is too stale to use. A nightly agent cron does four things on a schedule the team never thinks about. It distributes new artifacts — yesterday's call transcripts into the right client folders, yesterday's commits into the right project, yesterday's chat threads summarized and filed. It refreshes memory — the lessons file is compressed, HOT and WARM tiers re-sorted against the week's actual queries, outdated entries flagged for human retirement. It syncs tasks — tickets and their comments land in the project folder so an agent asked about project status reads the current state rather than last week's. It pulls external documents — drive changes, doc updates, shared-calendar edits become local markdown the agent can read.
None of this is architecturally novel. What matters is that it runs without human attention and that a named owner watches its error rate. The same team member who owns the master skill owns the sync. If overnight syncs silently start failing and nobody notices for a week, the workspace the agents are reading is already a week stale, and the output quality drops in ways that look like "the agent is getting worse" but are really "the context is getting older." The practitioner rule: treat overnight sync failures as P1 incidents, not as items to look at next sprint. The workspace is the substrate every other piece of the team's AI capability sits on; a stale substrate makes every downstream pattern unreliable in correlated ways.
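A sketch of the instruction file such a nightly cron might hand the agent — the schedule, paths, and job names are illustrative:

```
# Task: Nightly Workspace Sync (02:00, owner: the master-skill owner)

## Jobs
1. Distribute: file yesterday's call transcripts, commits, and chat
   summaries into the matching client and project folders.
2. Refresh memory: compress lessons.md; re-sort HOT and WARM tiers
   against the week's queries; flag stale entries for human retirement.
3. Sync tasks: pull tickets and their comments into each project folder.
4. Pull externals: convert drive, doc, and calendar changes to local markdown.

## Hard rules
- Any job failure -> P1 alert to the owner's channel, never just a log line.
- Never delete a flagged lessons entry; retirement is a human decision.
```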
Three failure modes separate workspaces that compound from workspaces that rot
Designing the structure before anyone uses it. Teams that spend months designing the ideal ontology before anyone starts producing artifacts produce an elegant folder tree with nothing in it. The committee debates the taxonomy. The sales team keeps writing notes in the old tool. The marketing team keeps dumping docs in shared drives. The workspace that was going to be the substrate becomes a parallel system nobody trusts because it never filled up. The fix is the inverse instinct: commit to one root folder today, put the four files in it today, have every team member add their active context today, and let the structure emerge from the questions the team asks over the next month. The ontology will converge on a shape; it will not converge before use, and trying to force convergence ahead of use produces months of empty taxonomy.
Personal vaults without a shared surface. The workspace exists in name — a shared Git repo, a branded root folder, an onboarding doc that names it — but each team member operates out of their own four files and never consults the shared layer. Agents cannot cross-reference. The same question gets answered ten times in ten private logs. When someone leaves, their vault goes with them and the team learns nothing. The diagnostic is simple. Measure how often an agent retrieves context from a teammate's file versus from the practitioner's own. Rare cross-team retrieval means the shared workspace exists on the org chart and not in practice. The fix is to force cross-referencing into the master skill's workflow so that the onboarding process literally cannot complete without pulling from the shared context. The shared surface becomes load-bearing rather than optional.
No shutdown date on the legacy tool. The shared workspace is built in parallel with the existing knowledge tool, and the team writes to both. The workspace is never canonical because the legacy tool is where "real" documentation still lives. The legacy tool is never complete because people are writing to the workspace. Agents read both, get contradictions, and either hallucinate a synthesis or confidently surface the stale version. The structural fix is a hard shutdown date on the legacy tool, announced publicly and enforced. Persuasion loses to political inertia above about fifty people, and "we'll migrate content gradually" is the phrase that means "we'll never migrate." Realistic timing: the stated two-week deadline almost always slips to four or six weeks because of image migration, cross-link rewiring, and formatting edge cases. Budget accordingly, but do not let the slip become indefinite. The shutdown is the forcing function; without it, the team has two systems forever.
What carries forward at team scale
Personal-layer practice produces three durable investments. Team scale adds three more, and the practitioners who keep all six compound the advantage; teams that keep four of six lose the workspace to entropy within a year.
The team-scale additions are:
- The workspace itself: one root shared folder, markdown-first, Git-backed. The content is portable; the infrastructure is not worth defending separately.
- A growing, actively curated skills library: the master skill today, five domain-specific skills this quarter, a hundred by year-end if the team is serious. Curation is the load-bearing discipline, because a library of 500 skills nobody owns is worse than 50 skills an owner maintains.
- The practice of announcing hard shutdown dates on legacy tools and enforcing them, applied through every transition rather than only the first one.
The forward question this chapter sets up is how the workspace stays trustworthy past the first quarter. The next chapter takes that up directly: the enforcement layer that keeps the file as the state machine, hooks blocking what shouldn't happen, and a nightly integrity lint catching the drift the team would otherwise notice only after a senior leader receives a confidently wrong answer. Chapter 4.3 picks up there.
Run this week — six tasks to lay the substrate
A six-item time-boxed checklist for the team that wants the shared workspace running by the end of the week.
- Seed the workspace (2 hours). Create the workspace Git repo. Drop the four files at the root: CLAUDE.md (40-60 lines, hand-written, names the team and the rules of engagement), context.md (today's messy export of products, pricing, ICP, OKRs), per-role role.md files for the functions on the team, and an empty lessons.md. Commit, push, share the clone command in the team channel. Output: every team member can clone the repo and read what the team is.
- Inventory the connector surface (half day). List the top eight systems the team writes to or reads from (chat, ticketing, CRM, mail, code, call recording, calendar, knowledge base). Per row: system name, access method (API / manual export / locked-UI), MCP connector availability, owner of the credentials, audit-log depth. Output: an eight-row table that drives the Connectors layer when the team passes the roughly 20-person threshold. A sketch of this table follows the checklist.
- Run the context-sufficiency diagnostic (1 hour). Pick one representative task. Give a fresh agent the team root and ask for three next actions on that task. Two or more strong responses mean the workspace is carrying weight. Three generic responses mean the four files are still thin. Output: a one-page note marking which of the four files were empty or stale, with one fix per gap.
- Draft the spine of one Master Skill (half day). Pick the highest-volume standardized process the team owns (client onboarding, monthly close, audit prep, support triage). Write a roughly 200-line markdown file that links to the sub-skills for each step and names the approval gates and escalation rules. Do not write the sub-skills yet; the spine is what reveals which sub-skills are actually needed. Output: one master-skill draft and a named owner.
- Schedule the nightly sync and assign the error-rate owner (1 hour). Cron job that distributes new artifacts, refreshes the lessons memory, syncs tasks, and pulls external documents. The same team member who owns the master skill owns the sync. Output: cron entry committed, named owner, P1 alert wired to the right channel.
- Announce a shutdown date for the legacy tool (30 minutes plus the political work). If a parallel system exists (the existing knowledge base, the shared drives accumulating canonical docs), pick a date four to six weeks out, announce it publicly, and put migration on the calendar between announcement and shutdown. The political work is roughly ten times the writing work, so the half-hour estimate is for the announcement only. Output: one all-hands announcement, one calendar block per migration step.
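A sketch of the eight-row inventory table from the second checklist item, with two illustrative rows filled in — the systems and answers are placeholders:

```
| System    | Access method | MCP connector | Credential owner | Audit-log depth  |
|-----------|---------------|---------------|------------------|------------------|
| CRM       | API           | yes, official | ops lead         | per-action, 1 yr |
| Call tool | manual export | none yet      | team lead        | none             |
| ...       |               |               |                  |                  |
```

The audit-log column matters later: it determines whether the named-identity requirement from the Connectors layer can actually be verified.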
For solo founders and small teams (2-20 people)
Skip the three-layer architecture on day one. Start with one shared root folder extending the four-file Personal OS. The load-bearing decision is committing to one canonical location for company context and doing every piece of knowledge work through it, including the messy ones. Do not run two systems in parallel; the second system always becomes the "real" one in someone's mind and the workspace becomes a parallel artifact nobody trusts. When the team hits around 20 people or the first compliance or privacy boundary appears, split into the Vault / Connectors / Skills shape. The signals usually arrive in the same week: the first compliance review and the first credential-sharing conversation land together.
For team leads introducing AI to a 20-200 person team
This is the chapter's center of gravity. The load-bearing decision is the forcing function: announce the date the current knowledge tool is shut down at the same moment the shared workspace is announced, and refuse to give the organization the option to run both. Spend Week 1 setting up the workspace and seeding the four files. Spend Weeks 2-5 on per-employee onboarding (one to two hours per person, on their own machine, ending with each person producing something useful). Stand up the champion network in Weeks 4-8 and run weekly peer demos. Build the first master skill in Weeks 6-12, starting from the team's highest-volume standardized process. Compensation and promotion criteria need to start shifting in the same window; changing the workspace without changing recognition produces a workspace nobody actually uses, because the career-advancement system is still rewarding the pre-AI behaviors.
For enterprise IT at scale (500+ people)
The mandate is never "replace everything company-wide" on Day 1. It is "start with one department, prove the master-skill onboarding compression on one process, and use that proof to fund the next department." Strangler-fig the legacy systems: the shared workspace absorbs responsibilities from the old knowledge base gradually until the old system has nothing the agents need and can be retired without drama. Budget roughly one month per management level for the first department to reach competence; the second department is faster because the patterns are reusable. The biggest non-technical risk is data-access approvals from IT and security. File those at the start of the department pilot rather than when the workspace is ready to launch, because a workspace that cannot read the team's actual systems is a demo that ages out before it gets used. Block's January 2026 publication of the AGENTS.md / HOWTOAI.md repo-readiness pattern is the citable public reference for how an enterprise of more than 7,500 weekly active AI users runs the same architecture this chapter describes.