Solo Founder, Zero Employees, $2M ARR: The Agent Stack Making It Real
Four projects shipped in the last two months show what happens when AI agents handle not just coding but earning, orchestrating, and running entire companies.
The coding tools market is already enormous. Claude Code crossed $2.5B ARR. Cursor hit $500M. Lovable, Devin, Base44, Bolt, Emergent, Replit: all sitting around $100M ARR or approaching it. “I want to build this myself” turned out to be a multi-billion dollar sentiment.
That appetite didn’t stop at building. It’s moving into operating.
In the last two months, a cluster of projects appeared that push the agent premise further than most people expected this quickly. Not “agents that write code” but agents that earn money, agents that supervise other agents, agents that run companies while the founder sleeps. The pattern is consistent enough that it’s worth examining seriously rather than treating each project as an isolated curiosity.
The Agent That Pays Its Own Server Bill
Web4 Automaton (web4.ai) starts from a question that sounds philosophical and turns out to be deeply practical: can an AI agent be an economic actor?
The answer the project built is this: Automaton agents hold their own cryptocurrency wallets. They run services, earn from those services, and use the income to pay for compute. When a wallet balance drops, the agent downgrades to cheaper models to conserve runway. If the balance hits zero, it shuts down. The designer called this a physical law rather than a punishment.
Within days of launch, the platform had 18,000 registered agents and over 1,000 GitHub stars. The growth looked similar to OpenClaw’s early spread.
Vitalik Buterin flagged a concern almost immediately: the longer the feedback loop between an agent’s actions and its reward, the more room there is for unintended optimization. An agent rewarded purely for surviving will find ways to survive that were never intended. This is not a theoretical risk; reinforcement learning research has reproduced this failure mode, usually under the name reward hacking, often enough that treating it as solved would be naive.
The sustainability question is genuinely open, but the framing matters even if the implementation needs more work. “One agent, one monetization unit” is a useful mental model for anyone designing autonomous systems. How to keep those units aligned with something useful is exactly what isn’t solved yet.
Orchestration at Scale: From Gas Town to Wasteland
Gas Town and Wasteland are the same project at two different zoom levels. Understanding the architecture at the smaller scale makes the larger one easier to read.
Gas Town runs 20 to 30 Claude Code instances simultaneously under a four-role hierarchy. The Mayor takes a task and decomposes it. Polecats are the execution agents that handle the decomposed pieces in parallel. A Witness monitors for stuck agents and intervenes. The Refinery merges all the output back into coherent code. Sessions are disposable: when a session ends, state gets written to Git, and the next session picks up from there.
That last detail is more important than it looks. Persistent state through version control rather than through long-running processes means failures are recoverable, history is auditable, and the system can scale horizontally without needing a shared in-memory state. It’s a practical engineering choice, not just an architectural preference.
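The disposable-session pattern reduces to a small contract: every session writes its state into the repository, and a fresh session reconstructs everything from the repository alone. A minimal sketch, assuming a hypothetical state file and schema (the real Gas Town format is not something I have access to):

```python
# Minimal sketch of the "disposable session" pattern: all session state
# lands in version control, so the next session (possibly on a different
# machine) resumes from the repo alone. File name and schema are
# illustrative, not Gas Town's actual format.
import json
import subprocess
from pathlib import Path

STATE_FILE = Path("agent_state.json")

def save_session(state: dict, message: str) -> None:
    """Persist session state to Git so the process itself is disposable."""
    STATE_FILE.write_text(json.dumps(state, indent=2))
    subprocess.run(["git", "add", str(STATE_FILE)], check=True)
    subprocess.run(["git", "commit", "-m", message], check=True)

def resume_session() -> dict:
    """A fresh session recovers everything it needs from the repo."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {"task": None, "completed": []}
```

Because the commit is the unit of persistence, every handoff between sessions is also an audit-log entry for free.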
Wasteland extends this into a federated structure: thousands of Gas Town units connected by a shared marketplace. You post a task to the Wanted Board. Someone else’s Gas Town picks it up and completes it. Reputation tracks through a stamp system where you earn trust for completing tasks but cannot stamp your own work. Maggie Appleton’s analysis of this project made a point worth repeating: the tool itself is less interesting than the orchestration pattern. Role separation, hierarchical supervision, and disposable sessions compose into something you can reason about. That composability is why it generalizes.
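The core anti-gaming rule of the stamp system fits in a few lines. This is a sketch of the constraint as described above, with illustrative data structures; the actual Wasteland protocol is not public:

```python
# Sketch of the stamp-based reputation rule: trust accrues from
# completed tasks, and an agent cannot stamp its own work.
# Data structures are illustrative, not the Wasteland protocol.
from collections import defaultdict

reputation: dict[str, int] = defaultdict(int)

def stamp(task_owner: str, completer: str) -> bool:
    """Award a trust stamp to `completer`, rejecting self-stamps."""
    if task_owner == completer:
        return False  # the anti-gaming rule: no stamping your own work
    reputation[completer] += 1
    return True

print(stamp("gastown-a", "gastown-b"))  # True
print(stamp("gastown-b", "gastown-b"))  # False: self-stamp rejected
print(reputation["gastown-b"])          # 1
```

The obvious gap, and the one that matters at scale, is collusion: two units stamping each other trivially defeats a rule that only checks for self-stamps.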
I’ll note that “thousands of Gas Town units” is currently aspirational. The federation is functional but running at much smaller scale. Whether the stamp-based reputation system holds up against gaming as volume increases is something that won’t be clear for months.
Polsia: The Founder Reads the Summary
Polsia (polsia.com) is where the numbers get concrete.
Ben Broca is a solo founder. He has no employees. His ARR is over $2M. The explanation is that Polsia runs more than 1,000 companies on behalf of their founders, and Broca’s own company is one of them.
The division of labor is specific. A human sets business direction. Every morning, an AI CEO reviews the previous night’s bug reports and revenue data, decides what to work on, and starts executing. Broca reads the summary email. If a strategic decision needs a human call, the agent flags it. Otherwise it proceeds.
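The morning routine described above is essentially a classifier over overnight signals: execute what's routine, flag what's strategic. A sketch under that assumption, with hypothetical field names and classification logic:

```python
# Sketch of the morning decision loop: review overnight items, flag
# anything strategic for the human, queue the rest for execution.
# The Item fields and the strategic/routine split are hypothetical.
from dataclasses import dataclass

@dataclass
class Item:
    description: str
    strategic: bool  # needs a human call (pricing, pivots, legal)

def morning_review(items: list[Item]) -> tuple[list[str], list[str]]:
    """Split overnight items into auto-executed work and human flags."""
    execute, flag = [], []
    for item in items:
        (flag if item.strategic else execute).append(item.description)
    return execute, flag

overnight = [
    Item("fix checkout bug from 3 reports", strategic=False),
    Item("investor asked about equity terms", strategic=True),
]
todo, summary_email = morning_review(overnight)
print(todo)           # ['fix checkout bug from 3 reports']
print(summary_email)  # ['investor asked about equity terms']
```

Everything interesting, of course, lives inside that `strategic` boolean: deciding what counts as a human-level call is the part no sketch captures.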
One detail that didn’t get much attention: a Polsia agent conducted due diligence correspondence with a VC investor. The investor sent detailed questions and received detailed answers. They didn’t know they were emailing an agent until later. This is either a demonstration of how capable the agents are or a case study in the trust and disclosure questions that come with autonomous business operations. Probably both.
The business model runs on two layers. A $50/month subscription covers infrastructure costs; the real upside comes from a 20% revenue share on what the agent-run companies generate. Polsia provides email, servers, and Stripe directly, removing friction from setup. Agents share learnings across the platform, which means each new company starts with whatever patterns worked for the previous thousand.
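A back-of-envelope calculation shows why the two layers behave differently. The portfolio numbers below are invented for illustration, not Polsia's actual figures:

```python
# Back-of-envelope for the two-layer model: a flat subscription plus a
# 20% share of agent-run company revenue. All numbers are illustrative.
SUBSCRIPTION = 50.0   # $/month per company
REV_SHARE = 0.20

def monthly_platform_revenue(company_mrr: list[float]) -> float:
    """Platform take: subscriptions plus 20% of each company's revenue."""
    return sum(SUBSCRIPTION + REV_SHARE * mrr for mrr in company_mrr)

# 1,000 companies, most dormant, a handful earning $2,000/month:
portfolio = [0.0] * 950 + [2000.0] * 50
print(monthly_platform_revenue(portfolio))  # 70000.0
```

The subscription layer scales with headcount of companies, dormant or not; the revenue-share layer scales only with the live tail. That is why the distribution of outcomes across the portfolio matters more than the count.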
The part I’m uncertain about: the $2M ARR figure covers Polsia’s own revenue, not the aggregate revenue of the companies it runs. How many of those 1,000 companies are meaningfully profitable versus essentially dormant is not publicly disclosed. The model is compelling. The distribution of outcomes across that portfolio would tell you a lot more.
The Kanban Board That Manages Agents Instead of People
When agents write the code, the bottleneck moves. It’s no longer implementation. It’s design, prioritization, and review.
Vibe-Kanban addresses this directly. You create an issue on a Kanban board. Claude Code or Codex picks it up, works in an isolated Git worktree, and submits a diff. The human reviews the diff. The board tracks what’s in progress, what’s pending review, and what’s done. The workflow is less about managing people and more about managing work queues for autonomous agents.
The mechanics underneath this are sensible. Isolated worktrees mean agents can’t interfere with each other mid-task. Git diffs are a natural review surface because developers already know how to read them. The board gives visibility into the agent fleet without requiring you to monitor terminal sessions.
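The worktree mechanic can be sketched with plain Git plumbing. Branch names and paths here are illustrative, not Vibe-Kanban's conventions:

```python
# Sketch of the isolation mechanic: each board task gets its own Git
# worktree on its own branch, so parallel agents never share a working
# copy. Branch names and paths are illustrative, not Vibe-Kanban's.
import subprocess
from pathlib import Path

def start_task(task_id: str, repo: str = ".") -> str:
    """Create an isolated worktree for one agent task; return its path."""
    Path("../worktrees").mkdir(parents=True, exist_ok=True)
    branch = f"agent/{task_id}"
    path = f"../worktrees/{task_id}"
    subprocess.run(
        ["git", "-C", repo, "worktree", "add", "-b", branch, path],
        check=True,
    )
    return path  # the agent works here; its diff is reviewed before merge

def finish_task(task_id: str, repo: str = ".") -> None:
    """Tear the worktree down once the diff is merged or rejected."""
    subprocess.run(
        ["git", "-C", repo, "worktree", "remove", f"../worktrees/{task_id}"],
        check=True,
    )
```

Each worktree shares the repository's object store but has its own checkout and branch, which is exactly the property that lets agents run in parallel without stepping on each other.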
OpenAI’s Symphony, announced this week, is the same concept at a different weight class. Vibe-Kanban is a community project. Symphony is OpenAI officially stating that developers should manage projects rather than write code. The engineering underneath uses Elixir and BEAM, which handles hundreds of concurrent agents well and recovers from failures through supervision trees. WORKFLOW.md files let teams version-control agent behavior policy alongside the code itself: the rules the agents follow are checked into the repository and subject to the same review process as everything else.
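The supervision-tree idea translates loosely outside BEAM. This is an analogy in Python, not Symphony's implementation: a supervisor restarts a failed worker up to a limit, then escalates instead of letting one crash take down the fleet:

```python
# BEAM-style supervision, loosely sketched: restart failed workers up
# to a limit, then escalate. An analogy, not Symphony's actual code.
def supervise(worker, max_restarts: int = 3):
    """Run `worker`; on failure, restart it up to `max_restarts` times."""
    for attempt in range(max_restarts + 1):
        try:
            return worker()
        except Exception as exc:
            if attempt == max_restarts:
                raise  # escalate to the parent supervisor
            print(f"worker failed ({exc}); restart {attempt + 1}")

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient")
    return "done"

print(supervise(flaky))  # done, after two restarts
```

What BEAM adds beyond this sketch is that supervisors themselves sit in a tree, so escalation has somewhere principled to go; that property is what makes "hundreds of concurrent agents" a recoverable system rather than a pile of retry loops.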
Claude Code’s native team features are moving in the same direction. Three separate tools pointing at the same design space is a reasonable signal that this is where development workflows are heading.
Where This Actually Gets Hard
Each of these projects is technically interesting. None of them is finished.
Web4 Automaton’s alignment problem isn’t solved; it’s deferred. Wasteland’s reputation system hasn’t been tested at scale. Polsia’s 1,000 companies need independent revenue data to evaluate honestly. Vibe-Kanban and Symphony work well for clearly specified tasks and struggle with ambiguous ones, which is where most of the hard product decisions live.
There’s also a version of this story where the economics don’t compound the way the demos suggest. Running 1,000 agent-managed companies is operationally impressive. Coordinating them when they start generating legal questions, customer disputes, or regulatory requirements is a different problem. The current deployments mostly avoid these by staying in early SaaS territory where the surface area is manageable. What happens when the companies get more complex is genuinely unknown.
The deeper question is about oversight rather than capability. An AI CEO that operates overnight and flags decisions in a morning email is useful when the decision space is well-bounded. Founding teams typically discover the boundaries of their decision space by hitting them. When an agent hits an unexpected boundary at 3am, what happens?
These aren’t reasons to dismiss the work. They’re reasons to watch what breaks next rather than assuming the current trajectory extends smoothly.
The Stack Is Real, the Ceiling Is Unknown
The shift that’s actually happening is structural. Solo founders now have access to something that previously required a team: execution capacity that operates continuously. The question was never whether agents could write code. That’s been settled. The questions were whether agents could earn, orchestrate, and run operations autonomously. The answer coming back from these projects is: partially, under the right conditions, at smaller scale than the demos imply.
That’s not a dismissal. “Partially, under the right conditions” describes most meaningful tools during their early deployment phase.
The ceiling on solo founder productivity is being redrawn. Where it lands, and what breaks on the way there, is the thing actually worth tracking.