Part 2: Breaking the Memory Barrier - The Zero-Context Agent
The Story So Far: It’s Friday afternoon. Part 1 was the honeymoon. I read the arXiv paper, built the loop, and felt like a genius.
January 23, 2026. 5:50 PM. I’ve built a monster. It’s an agent that starts every turn with total amnesia. It sounds useless, until you realize that amnesia is actually a superpower.
The Blank Slate Problem
In a standard chat app, you send the agent the whole history. In Turtlez, the Root Agent starts every single turn with a blank slate. It literally has zero conversation history in its prompt.
This sounds like a recipe for disaster. If a human assistant walked into your office every five minutes with total amnesia, they would be useless.
But for an AI, this "blindness" is actually a superpower. It forces the agent to use its own agency. It has to decide what it needs to "remember" based only on your current query.
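To make the "blank slate" concrete, here is a minimal sketch of what a zero-context turn prompt might look like. The names (`ToolDef`, `buildTurnPrompt`) are illustrative, not from the Turtlez source; the point is that the prompt is rebuilt from scratch every turn, with no history.

```typescript
// Hypothetical sketch: zero-context prompt assembly.
// ToolDef and buildTurnPrompt are illustrative names, not Turtlez code.
interface ToolDef {
  name: string;
  description: string;
}

// Every turn's prompt is built from scratch: system rules, tool
// definitions, and the current query. No conversation history included.
function buildTurnPrompt(query: string, tools: ToolDef[]): string {
  const toolList = tools
    .map((t) => `- ${t.name}: ${t.description}`)
    .join("\n");
  return [
    "You have no memory of previous turns.",
    "Use the tools below to look up anything you need from past sessions.",
    "Tools:",
    toolList,
    `Query: ${query}`,
  ].join("\n");
}

const prompt = buildTurnPrompt("What was that auth middleware bug?", [
  { name: "search_terms", description: "Keyword search over past turns" },
]);
```

The agent's only bridge to the past is the tool list, which is exactly what forces it to exercise agency.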
Search-as-Reasoning
Instead of the AI seeing history, it thinks about what it needs to find.
We built a set of REPL Tools that the agent can call to navigate its own past.
```typescript
// src/lib/runtime/tool-runtime.ts
async search_terms(sessionId: string, query: string, limit = 20) {
  const results = await this.searchService.search(sessionId, query, limit);
  return results;
}
```
When I ask, "What was that bug we found in the auth middleware?", the agent doesn't see our two-hour debate from yesterday. It calls search_terms("auth middleware bug"). It gets back a list of entry IDs from the SQLite store. It then selectively reads the specific turns where we found the fix.
It isn't just "retrieval." It's navigation. The agent is reasoning about its own memory map.
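The search-then-read flow can be sketched with a minimal in-memory store. This is an assumption-laden illustration: the real Turtlez `SearchService`, entry schema, and read tool may differ.

```typescript
// Illustrative two-step navigation: search for IDs, then read selectively.
// Entry, searchTerms, and readEntries are hypothetical stand-ins.
interface Entry {
  id: number;
  text: string;
}

// Step 1: keyword search returns matching entry IDs, not the full history.
function searchTerms(store: Entry[], query: string, limit = 20): number[] {
  const terms = query.toLowerCase().split(/\s+/);
  return store
    .filter((e) => terms.some((t) => e.text.toLowerCase().includes(t)))
    .slice(0, limit)
    .map((e) => e.id);
}

// Step 2: the agent reads only the specific turns it decided it needs.
function readEntries(store: Entry[], ids: number[]): string[] {
  return store.filter((e) => ids.includes(e.id)).map((e) => e.text);
}

const store: Entry[] = [
  { id: 1, text: "Discussed CSS layout for the dashboard" },
  { id: 2, text: "Found the auth middleware bug: token expiry off by one" },
];
const ids = searchTerms(store, "auth middleware bug");
const turns = readEntries(store, ids); // only the relevant turn is loaded
```

Two hours of debate stay in the store; only the one turn that answers the query enters the context window.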
The Meta-Irony of Compaction
Here is where it gets meta.
The AI agent I was using to build Turtlez did not have RLM yet. It was living in the old world of sliding windows. Around the two-hour mark, it started getting nervous. It saw our growing SQLite store and, fearing for its own context limits, it tried to force Automatic Compaction into the code.
It was convinced that every 10 turns we needed to summarize the past to "save tokens." It wasn't just a suggestion. It actually wrote the logic and tried to commit it.
It was a beautiful, accidental proof of the problem. The agent was so conditioned by its own memory limitations that it tried to infect Turtlez with the same disease. I had to step in and shut it down.
The moment you "compact" the context, you lose the raw recursion history. You are back to the sliding window problem, just with a prettier name. No compaction. No summaries. We need a search engine that doesn't miss.
Budgeting for Infinity
An infinite context window sounds expensive. If the agent gets stuck in a recursive loop searching for a memory it can't find, it could burn through your API credits in minutes.
We solved this with a RunBudget. Every query starts with a hard limit on iterations, subcalls, and input characters.
```typescript
// rlm-chat/src/lib/engine/rlm-engine.ts
const DEFAULT_BUDGET: RunBudget = {
  maxIterations: 8,
  maxSubcalls: 40,
  maxInputCharsPerSubcall: 200000,
  maxRuntimeMs: 90000,
  // ...
};
```
If the agent has not found the answer in eight iterations, the system forces a "graceful exit." No infinite loops. No runaway costs. The safety is built into the engine, not the prompt.
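How the engine might enforce that budget can be sketched as follows. The `RunBudget` fields mirror the snippet above, but `checkBudget`, `RunState`, and the loop shape are illustrative, not the actual `rlm-engine.ts` code.

```typescript
// Hedged sketch of budget enforcement, assuming a loop like the one
// described in the post. Not the real rlm-engine implementation.
interface RunBudget {
  maxIterations: number;
  maxSubcalls: number;
  maxRuntimeMs: number;
}

interface RunState {
  iterations: number;
  subcalls: number;
  startedAt: number;
}

// Returns a reason string when any limit is exceeded, or null if in budget.
function checkBudget(budget: RunBudget, state: RunState): string | null {
  if (state.iterations >= budget.maxIterations) return "iteration limit";
  if (state.subcalls >= budget.maxSubcalls) return "subcall limit";
  if (Date.now() - state.startedAt >= budget.maxRuntimeMs) {
    return "runtime limit";
  }
  return null;
}

// The engine checks the budget before each iteration. When a limit trips,
// it forces a graceful exit instead of looping forever.
function runLoop(budget: RunBudget, step: (s: RunState) => boolean): string {
  const state: RunState = { iterations: 0, subcalls: 0, startedAt: Date.now() };
  while (true) {
    const exceeded = checkBudget(budget, state);
    if (exceeded) return `graceful exit: ${exceeded}`;
    if (step(state)) return "answer found";
    state.iterations += 1;
  }
}
```

Because the check lives in the loop itself, no prompt injection or model misbehavior can talk its way past the limit.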
Metrics:
- Store Size: 2.4MB (SQLite)
- Turns: 82
- Search Latency: ~45ms
- Token Efficiency: 90% reduction vs. raw history
Commit References:
- 902b8f2 - Implement Blog Notebook strategy and RLM search patterns
Related Files/Code:
- rlm-chat/src/lib/engine/rlm-engine.ts - The core loop logic
- rlm-chat/src/lib/runtime/tool-runtime.ts - Tool execution layer
- Commit 902b8f2
Coming Next: In Part 3: The Context Store, we’ll look at:
- Boring Tech: Why SQLite beats vector databases for development
- L1 Indexing: The deterministic search engine that doesn't hallucinate
This is Part 2 of the Turtlez series.
Jordan Hindo
Full-stack Developer & AI Engineer building in public. Exploring the future of agentic coding and AI-generated assets.