
How I Set Up a Stripe-Style Coding Agent Swarm (And You Can Too)

Stripe ships 1,300 pull requests every week using AI coding agents. Not 1,300 AI-assisted PRs — 1,300 PRs where the agent does the heavy lifting: reads the issue, explores the codebase, writes the code, runs tests, fixes failures, and opens the PR. Engineers review and merge.

When I first heard that number I assumed it required a dedicated platform team, custom infrastructure, and a budget most companies can't justify. Turns out the architecture is simpler than it looks — and I replicated a version of it on my own server in a single afternoon.

This is the honest account of how I did it, what the stack looks like, and what a solo developer or small team can actually extract from this pattern today.

Why the Number 1,300 Matters

The point isn't the raw count. It's what that number reveals about the underlying architecture: parallelism at scale.

A single developer writing code sequentially — even a fast one — produces maybe 5-15 meaningful PRs per week. To get to 1,300, you need agents running simultaneously, each on its own isolated branch, each working a separate issue without stepping on the others.

The critical insight behind Stripe's setup (broken down publicly by Cole Medin) is that AI coding agents are most effective when they're given:

  • A single, well-scoped issue to work on
  • A clean isolated branch so there's no merge conflict risk
  • The ability to run tests and self-correct before creating the PR
  • A human review gate at the end — not in the middle

That last point is underrated. Most teams insert human approval at every step. Stripe's approach moves the human to the end of the pipeline. The agent handles the entire implementation loop; engineers only appear when there's a PR to review.

The Stack I Built

Here's what I assembled, all running on a single Linux server:

  • OpenClaw — the orchestration layer. This is the AI that manages everything else: spawning agents, monitoring progress, routing results back to me via WhatsApp.
  • Claude Code CLI (claude) — the actual coding agent. Reads files, writes code, runs shell commands, self-corrects. No GUI, pure CLI, built for automation.
  • GitHub CLI (gh) — handles the Git operations: branch creation, PR opening, repo management.
  • Git worktrees — the isolation mechanism. Each agent gets its own worktree: a separate working directory on the same repo, no checkout conflicts.

The installation was straightforward:

# Claude Code CLI
npm install -g @anthropic-ai/claude-code

# GitHub CLI (Ubuntu/Debian; older releases may need GitHub's apt repository added first)
sudo apt-get install gh

# Authenticate once so gh can open PRs on your behalf
gh auth login

# Verify
claude --version  # 2.1.74
gh --version      # 2.88.0

How Git Worktrees Enable the Swarm

This is the piece most tutorials skip over. Without worktrees, running multiple agents on the same repo creates chaos — agents overwrite each other's changes, branches conflict, tests break for reasons unrelated to the task.

Git worktrees solve this cleanly. Each worktree is a separate filesystem path pointing to a different branch of the same repository. Agent A works in /tmp/issue-42, Agent B works in /tmp/issue-87. They share the same .git directory underneath but never touch each other's files.

# Create isolated worktrees for parallel work
git worktree add -b fix/issue-42 /tmp/issue-42 main
git worktree add -b fix/issue-87 /tmp/issue-87 main
git worktree add -b feat/new-dashboard /tmp/new-dashboard main

# Now run an agent inside each worktree, simultaneously.
# Without the cd, both agents would run in the same directory.
(cd /tmp/issue-42 && claude --permission-mode bypassPermissions --print \
  'Fix issue #42: null pointer in user auth. Run tests after.') &

(cd /tmp/issue-87 && claude --permission-mode bypassPermissions --print \
  'Fix issue #87: slow query on orders endpoint. Add index.') &

wait  # Both run in parallel. No conflicts.

When each agent finishes, it commits its changes to its own branch. Then gh pr create opens the pull request. Human reviews the diff. Done.
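The wrap-up step is a handful of git and gh commands. Here's a sketch of that final stage, run inside the finished agent's worktree; the issue number, branch name, commit message, and PR text are illustrative examples, not output from my actual runs:

```shell
# Wrap-up after an agent finishes, run inside its worktree.
# Issue number, branch name, and PR text below are illustrative examples.
issue=42
branch="fix/issue-${issue}"

cd "/tmp/issue-${issue}"
git add -A
git commit -m "Fix #${issue}: guard against null user in auth middleware"
git push -u origin "$branch"

# Open the PR for human review (gh must be authenticated)
gh pr create \
  --base main \
  --head "$branch" \
  --title "Fix #${issue}: null pointer in user auth" \
  --body "Automated fix by a coding agent. Closes #${issue}."
```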

The Orchestration Layer: Where OpenClaw Comes In

Running two agents in a terminal is a demo. Running ten agents across multiple repos, monitoring their progress, getting notified when one finishes or needs input — that requires orchestration.

OpenClaw handles this. It's the layer that:

  • Spawns Claude Code sessions as background processes
  • Monitors each session's output
  • Sends me a WhatsApp message when an agent completes, errors, or needs a decision
  • Manages the queue — new issues come in, agents pick them up
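I can't show OpenClaw's internals, but the loop above is simple enough to sketch. This is a hedged approximation, not OpenClaw's actual code: it polls for open issues carrying an `agent-ready` label (a naming convention I'm assuming), gives each one its own worktree, and launches an agent per issue in the background:

```shell
#!/usr/bin/env bash
# Minimal orchestrator sketch -- NOT OpenClaw's implementation.
# Assumes: gh is authenticated, 'agent-ready' is a label you apply to
# well-scoped issues, and this runs from the repo's main checkout.
for issue in $(gh issue list --label agent-ready --json number --jq '.[].number'); do
  branch="fix/issue-${issue}"
  worktree="/tmp/issue-${issue}"
  [ -d "$worktree" ] && continue            # already being worked on
  git worktree add -b "$branch" "$worktree" main
  (
    cd "$worktree"
    # Hand the issue title and body to the agent as its task
    gh issue view "$issue" --json title,body --jq '.title + "\n\n" + .body' > TASK.md
    claude --permission-mode bypassPermissions --print \
      "$(cat TASK.md). Run the tests and fix failures before finishing." \
      > "/tmp/agent-${issue}.log" 2>&1
  ) &
done
wait  # block until every agent in this batch has finished
```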

The practical result

I can assign five coding tasks before I leave for work at 10 AM. By the time I check my phone at lunch, I have PRs waiting for review. The agents worked while I was in meetings. This is the actual value proposition — not that the code is perfect, but that the cycle time collapses.

What Agents Are Actually Good At (And What They're Not)

After running this setup on real projects, here's the honest assessment:

Agents excel at:

  • Bug fixes with clear reproduction steps
  • Adding fields, endpoints, or CRUD operations to existing patterns
  • Writing or updating tests for existing code
  • Refactoring with a clear before/after specification
  • Documentation and code comments

Agents struggle with:

  • Architectural decisions that require business context
  • Issues with ambiguous acceptance criteria
  • Tasks that span multiple unrelated systems
  • Anything requiring access to external systems not in the repo

The Stripe pattern works because their engineers have gotten good at writing issues that fit the first category. Well-scoped, clear reproduction steps, defined acceptance criteria. The AI doesn't make up for vague requirements — it amplifies whatever clarity (or lack of it) exists in the issue.
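What a well-scoped issue looks like is easier to show than describe. Here's a sketch of filing one via gh; the `agent-ready` label is my own convention, and the bug, endpoint, and acceptance criteria are invented for illustration:

```shell
# Filing an agent-friendly issue via gh.
# 'agent-ready' is my own label convention (the label must exist in the repo);
# the bug described below is an invented example.
title="Fix: 500 on GET /orders when customer has no address"
body="Repro: create a customer with no address, then call GET /orders for them.
Expected: 200 with an empty list. Actual: 500 (NoneType error in the orders serializer).
Acceptance: endpoint returns 200 and a regression test is added."

gh issue create --title "$title" --label agent-ready --body "$body"
```

Note the shape: reproduction steps, expected vs. actual, explicit acceptance criteria. That's the clarity the agent amplifies.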

Applying This Pattern to Enterprise ERP Development

My context is Odoo ERP — a framework where most development follows predictable patterns: add a field, extend a model, create a wizard, write a report. These are exactly the kinds of tasks coding agents handle well.

The workflow I'm building toward:

  1. Business requirement comes in as a GitHub issue with clear spec
  2. OpenClaw detects the new issue, spins up a Claude Code agent with the Odoo codebase context
  3. Agent writes the module, runs the linter, creates the PR
  4. I review the diff — focus on business logic, not boilerplate
  5. Merge, deploy
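One housekeeping detail the five steps above skip: every merged PR leaves a worktree and a local branch behind. I clean them up after merging (paths and branch names here match the earlier examples):

```shell
# After the PR is merged: remove the worktree, delete the local branch,
# and prune any stale worktree metadata.
git worktree remove /tmp/issue-42
git branch -d fix/issue-42
git worktree prune
```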

Estimated time savings: the boilerplate and scaffolding that used to take 2-3 hours per feature gets handled automatically. Engineer time goes toward the 20% of work that actually requires judgment.

The Setup Is Not the Moat

Anyone can install these tools in an afternoon. The advantage doesn't come from having the swarm — it comes from how you wire it into your workflow.

The teams that will get the most out of this pattern are the ones that invest in:

  • Writing better issues (clarity at the top of the funnel)
  • Building good test coverage (agents self-correct against tests)
  • Defining clear code standards (agents follow what's already in the codebase)
  • Reviewing fast (the bottleneck shifts to human review time)

Stripe's 1,300 PRs/week isn't just a product of their AI tooling. It's a product of years of engineering culture investment that made their codebase and process legible to machines. The AI is the multiplier — the foundation matters.

Start Smaller Than You Think You Should

Don't start by trying to automate 50 issues in parallel. Start with one. Pick a well-scoped bug, write a clear issue, run one Claude Code agent on it, review the output critically.

You'll learn more from that single iteration than from any tutorial. The agent will do something unexpected. That unexpected thing will tell you exactly what context it was missing. You fix the context, run again, get better output.

That feedback loop — issue quality → agent output → PR quality — is the thing worth optimising. The infrastructure is just plumbing.

The goal isn't to remove engineers from the loop. It's to move them to the parts of the loop where their judgment actually matters.

I'll be sharing more as this setup matures — including the Odoo-specific agent prompts and the issue template structure that gets the best results. If you're working on something similar, I'm curious what your iteration is teaching you.


Kunal Chaudhary Rajora

IT Manager & Enterprise Architect · AI Transformation Lead

7+ years building enterprise ERP systems and AI infrastructure. Currently leading the AI transformation at Y Group — 101+ agents deployed, 329% documented ROI. Building in public.
