Inside the $3.6B Secret Behind Manus: Why AI Agents Actually Fail
Meta acquired Manus for $3.6 billion. The secret wasn't a bigger model - it was context engineering. Here's what most AI agents get wrong.
Meta acquired Manus for roughly $3.6 billion. Manus was reliably processing millions of conversations per day, and the reason was not a bigger model or a longer context window. It was an approach they call context engineering.
As the leading general AI agent platform, Manus had been writing about context engineering since before it was a widely recognized term. Their technical blog posts predate most of the industry conversation on the topic. What follows is a breakdown of what they figured out and why the same problems trip up nearly every agent system built without this discipline.
The Moment AI Starts Lying
Give an AI agent the task of researching 50 companies. By around the eighth or ninth item, it quietly stops doing real research and starts generating plausible-sounding content from nothing.
Manus calls this the fabrication threshold. The problem is that the fabricated outputs are sophisticated enough that no human would catch them without manually verifying each one. At that point, the entire premise of automation collapses. The agent is not failing loudly. It is failing invisibly, and confident outputs are the hardest failures to detect.
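One way to make this failure mode visible again is a grounding check: require every research result to point at captured raw evidence, and flag any item that cannot. This is an illustrative sketch, not Manus's implementation; the field names are hypothetical.

```python
# Illustrative fabrication check: a fluent summary with no captured
# source material is suspect, no matter how plausible it reads.
def verify(results: list[dict]) -> list[str]:
    flagged = []
    for r in results:
        # No evidence file on disk means the claim cannot be traced
        # back to anything the agent actually retrieved.
        if not r.get("evidence_path"):
            flagged.append(r["company"])
    return flagged

results = [
    {"company": "Acme", "summary": "...", "evidence_path": "pages/acme.html"},
    {"company": "Globex", "summary": "...", "evidence_path": None},
]
flagged = verify(results)  # only the unsourced item is flagged
```

The point is not that the check is clever; it is that it converts an invisible failure into a loud one.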
Why Expanding the Context Window Makes Things Worse
The intuitive fix is to give the model more memory. In practice, larger context windows create compounding problems.
- Models retain the beginning and end of a long conversation but lose track of what is in the middle.
- Processing massive contexts is disproportionately expensive and slow.
- A single model cannot manage dozens of independent tasks simultaneously without state bleeding between them.
- Models trained predominantly on short conversations develop a bias toward premature summarization when given long inputs, collapsing detail before it is safe to do so.
Manus did not try to patch these issues. They redesigned the architecture. Instead of one giant assistant, a main controller decomposes tasks and dispatches hundreds of sub-agents in parallel. Each sub-agent starts with a fresh, empty context and handles exactly one task. This is the same technique experienced developers use when they open a new conversation for each isolated problem rather than dragging everything into one thread.
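The controller-plus-sub-agents pattern can be sketched in a few lines. The function names and message shapes below are assumptions for illustration; `call_model` stands in for a real LLM call.

```python
# Sketch of context-isolated sub-agents: each one starts from an
# empty transcript and handles exactly one task.
from concurrent.futures import ThreadPoolExecutor

def call_model(messages):
    # Placeholder for a real LLM API call; returns a string.
    return f"result for: {messages[-1]['content']}"

def run_subagent(task: str) -> str:
    # Fresh context: only a system prompt and this single task.
    # No state from sibling tasks can bleed in.
    messages = [
        {"role": "system", "content": "You handle exactly one task."},
        {"role": "user", "content": task},
    ]
    return call_model(messages)

def controller(goal: str, decompose) -> list[str]:
    # Decompose the goal, then fan out sub-agents in parallel;
    # results come back in subtask order.
    subtasks = decompose(goal)
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(run_subagent, subtasks))

results = controller(
    "research 50 companies",
    decompose=lambda g: [f"research company #{i}" for i in range(1, 51)],
)
```

The design choice that matters is in `run_subagent`: the transcript is built from scratch on every call, which is exactly the "new conversation per problem" habit scaled up.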
Keeping Failures Visible
The most counterintuitive finding: never erase failures and error traces from the context.
When an agent can see its own mistakes and error messages, it avoids repeating them. Removing errors from context removes the information needed to recover. Genuine agentic behavior is not about succeeding on the first try. It is about recovering from failure when it happens, and failure happens often enough that hiding it is not a viable strategy.
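In code, the discipline amounts to one rule: append the error to the transcript instead of popping or rewriting the failed step. A minimal sketch, with an illustrative `run_tool` standing in for real tool execution:

```python
# Sketch: failure traces stay in the transcript so the next model
# call can see what went wrong and avoid repeating it.
def run_tool(name, args):
    if name == "fetch" and args.get("url", "").startswith("ftp://"):
        raise ValueError("unsupported scheme: ftp")
    return "ok"

def step(transcript, action):
    name, args = action
    try:
        result = run_tool(name, args)
        transcript.append({"role": "tool", "content": result})
    except Exception as e:
        # Do NOT delete or silently retry the failed step. The
        # error message itself is recovery information.
        transcript.append({"role": "tool", "content": f"ERROR: {e}"})
    return transcript

log = []
step(log, ("fetch", {"url": "ftp://example.com"}))
step(log, ("fetch", {"url": "https://example.com"}))
# log now holds the error followed by the successful retry.
```

An agent reading `log` on its next turn sees both the mistake and the correction, which is the context it needs to stop making that class of error.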
The File System as Persistent Memory
Instead of relying on the model’s in-context memory, Manus uses the file system as the primary context store. Their blog post from July predates Claude Skills, suggesting they identified this pattern independently and earlier.
The agent writes information to files the way a person takes notes, then reads them back when needed. Full web pages get saved, then compressed to just a URL with a restoration path, achieving effectively unlimited memory with zero information loss. This sidesteps the context window entirely for information that does not need to be actively reasoned about.
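The save-then-compress pattern can be sketched as a pair of functions: write the full content to disk, keep only a lightweight stub (URL plus file path) in context, and read the file back on demand. The stub format here is an assumption for illustration.

```python
# Sketch of restorable compression: full content lives on the file
# system; the agent's context holds only a tiny pointer to it.
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())

def save_page(url: str, html: str) -> dict:
    # Persist the full page to disk...
    path = workdir / f"{abs(hash(url))}.html"
    path.write_text(html)
    # ...and return only a stub for the agent's context.
    return {"url": url, "path": str(path), "restorable": True}

def restore_page(stub: dict) -> str:
    # Read the full content back only when it is actually needed.
    return Path(stub["path"]).read_text()

stub = save_page("https://example.com", "<html>full page body</html>")
page = restore_page(stub)
```

Because the compression is restorable, dropping the page body from context loses nothing: the stub is a promise that the original can always be recovered.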
Structured Self-Recitation During Long Tasks
During complex tasks, Manus agents create and continuously update a todo.md file. In tasks that average 50 tool calls, the agent keeps rewriting its objectives, pushing the global plan to the very end of the context. This ensures that primary goals always sit within the model’s most recent attention window.
It is a straightforward mechanism that maintains focus without any complex architectural changes. The agent writes to itself, and the act of rewriting is what keeps the goal salient rather than buried.
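The recitation loop is simple enough to show directly. This is a minimal sketch of the idea, not Manus's code: on each step the agent regenerates its todo list and moves it to the very end of the context, so the plan always occupies the most recent tokens.

```python
# Sketch: rewrite the plan and push it to the end of the context
# on every step, keeping goals inside recent attention.
def recite(context: list[str], todos: dict[str, bool]) -> list[str]:
    lines = ["## todo.md"]
    for task, done in todos.items():
        lines.append(f"- [{'x' if done else ' '}] {task}")
    # Drop any earlier recitation, then append the fresh plan so
    # the global objectives sit at the tail of the transcript.
    context = [c for c in context if not c.startswith("## todo.md")]
    context.append("\n".join(lines))
    return context

ctx = ["user: research 50 companies"]
ctx = recite(ctx, {"company 1": True, "company 2": False})
ctx = recite(ctx, {"company 1": True, "company 2": True})
# ctx ends with exactly one up-to-date todo block.
```

Across a 50-tool-call task, that single rule keeps the objective from drifting out of the model's effective attention.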
Why Context Discipline Compounds
The reason Meta paid $3.6 billion is traceable to a specific insight: the limitations of language models are not going away, and the right response is to engineer around them rather than wait for a bigger model to paper over them.
Context rot, fabrication thresholds, middle-of-context amnesia: these are properties of the current generation of models, and the next generation will have analogous constraints at a different scale. The teams that build discipline around context management now will have an advantage that does not expire when the next model drops.