deep-recall
Pure-Python recursive memory recall for persistent AI agents. Manager→workers→synthesis RLM loop — no Deno, no fast-rlm, just HTTP calls to any OpenAI-compatible LLM.
git clone https://github.com/Stefan27-4/DeepRecall
T=$(mktemp -d) && git clone --depth=1 https://github.com/Stefan27-4/DeepRecall "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skill" ~/.claude/skills/stefan27-4-deeprecall-deep-recall && rm -rf "$T"
skill/SKILL.md

DeepRecall v2 — OpenClaw Skill
Pure-Python recursive memory for persistent AI agents. Implements the Anamnesis Architecture: "The soul stays small, the mind scales forever."
Description
DeepRecall gives AI agents infinite memory by recursively querying their own memory files through a manager→workers→synthesis RLM loop — entirely in Python. No Deno runtime, no fast-rlm subprocess, no vector database. Just markdown files and HTTP calls to any OpenAI-compatible LLM endpoint.
When the agent needs to recall something, DeepRecall:
- Scans the workspace for memory files (scoped by category)
- Indexes file metadata — headers, topics, dates, people
- Manager selects the most relevant files from the index
- Workers (parallel) extract exact verbatim quotes from each file
- Synthesis combines quotes into a cited, grounded answer
Workers are constrained by anti-hallucination prompts to return only verbatim quotes. The synthesis step cites every claim with (filename:line).
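The loop above can be sketched in a few lines of stdlib Python. This is a simplified illustration, not DeepRecall's actual implementation: the function name, the prompts, and the `llm(prompt) -> str` callable are all hypothetical, and the real worker prompts carry the full anti-hallucination constraints.

```python
from concurrent.futures import ThreadPoolExecutor

def rlm_loop(query, files, llm, max_files=3):
    """Sketch of the manager -> workers -> synthesis loop.

    `files` maps filename -> contents; `llm(prompt) -> str` is any chat call.
    """
    # Manager: pick the most relevant files from a lightweight index.
    index = "\n".join(f"- {name}" for name in files)
    picks = llm(f"Pick up to {max_files} files for: {query}\n{index}")
    selected = [p.strip() for p in picks.splitlines() if p.strip() in files][:max_files]
    # Workers: extract verbatim quotes from each selected file, in parallel.
    with ThreadPoolExecutor() as pool:
        quotes = list(pool.map(
            lambda name: llm(f"Quote verbatim from {name}:\n{files[name]}\nTopic: {query}"),
            selected,
        ))
    # Synthesis: combine the quotes into one cited answer.
    return llm(f"Answer '{query}' using only these quotes:\n" + "\n".join(quotes))
```

Because the LLM is injected as a plain callable, the whole loop can be exercised with a stub before pointing it at a real provider.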
Installation
pip install deep-recall
Or install from source:
git clone https://github.com/Stefan27-4/DeepRecall
cd DeepRecall && pip install .
Dependencies
- Python ≥ 3.10
- An LLM provider configured in OpenClaw
Zero required dependencies — uses only the Python standard library (urllib.request, json, re, concurrent.futures).
v2 breaking change: Deno and fast-rlm are no longer required. The entire RLM loop runs in-process as pure Python.
Quick Start
from deep_recall import recall

result = recall("What did we decide about the project architecture?")
print(result)
API
recall(query, scope, workspace, verbose, config_overrides) → str
The primary entry point. Runs the full manager→workers→synthesis loop.
from deep_recall import recall

result = recall(
    "Find all mentions of budget discussions",
    scope="memory",        # "memory" | "identity" | "project" | "all"
    verbose=True,          # print progress to stdout
    config_overrides={
        "max_files": 5,    # max files the manager can select
    },
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| query | str | (required) | What to recall / search for |
| scope | str | "memory" | File scope — see Scopes |
| workspace | str | auto-detect | Override workspace path |
| verbose | bool | False | Print provider, model, file selection info |
| config_overrides | dict | None | Override max_files and other settings |
Returns: A string containing the recalled information with source citations, or a [DeepRecall] status message if no files/results were found.
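Since both answers and status messages come back as plain strings, callers may want to branch on the [DeepRecall] prefix. A small hypothetical guard (the helper name is not part of the package):

```python
def is_status(result: str) -> bool:
    """True when recall() returned a [DeepRecall] status line instead of an answer."""
    return result.startswith("[DeepRecall]")

# Hypothetical usage once deep-recall is installed:
# from deep_recall import recall
# answer = recall("What did we decide about caching?")
# print("(no matching memories)" if is_status(answer) else answer)
```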
recall_quick(query, verbose) → str
Fast, cheap recall scoped to identity files. Best for simple lookups.
from deep_recall import recall_quick

name = recall_quick("What is my human's name?")
Equivalent to recall(query, scope="identity", config_overrides={"max_files": 2}).
recall_deep(query, verbose) → str
Thorough recall across all workspace files. Best for cross-referencing.
from deep_recall import recall_deep

summary = recall_deep("Summarize all decisions from March")
Equivalent to recall(query, scope="all", config_overrides={"max_files": 5}).
CLI
python deep_recall.py <query> [scope]

# Examples
python deep_recall.py "What was the first project we worked on?"
python deep_recall.py "Find budget discussions" all
Scopes
Scopes control which files DeepRecall searches. Narrower scopes are faster and cheaper.
| Scope | Files Included | Speed | Cost | Use Case |
|---|---|---|---|---|
| identity | SOUL.md, IDENTITY.md, MEMORY.md, USER.md, TOOLS.md, HEARTBEAT.md, AGENTS.md | ⚡ Fastest | Cheapest | "What's my name?" |
| memory | Identity files + memory/LONG_TERM.md + memory/*.md daily logs | 🔄 Fast | Low | "What did we do last week?" |
| project | All readable workspace files (skips binaries, node_modules, .git) | 🐢 Slower | Medium | "Find that config change" |
| all | Identity + memory + project (everything) | 🐌 Slowest | Highest | "Search everything" |
File Categories
DeepRecall classifies discovered files into categories:
- soul — SOUL.md, IDENTITY.md — who the agent IS (always in context)
- mind — MEMORY.md, USER.md, TOOLS.md, HEARTBEAT.md, AGENTS.md — compact orientation
- long-term — memory/LONG_TERM.md — full detailed memories, grows forever
- daily-log — memory/YYYY-MM-DD.md — raw daily logs
- workspace — everything else (project files, configs, docs)
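A filename-based classifier in this spirit might look like the following sketch; the function name and exact rules are illustrative, not DeepRecall's implementation.

```python
import re
from pathlib import PurePath

SOUL = {"SOUL.md", "IDENTITY.md"}
MIND = {"MEMORY.md", "USER.md", "TOOLS.md", "HEARTBEAT.md", "AGENTS.md"}

def categorise(path: str) -> str:
    """Guess a DeepRecall-style category from a workspace-relative path."""
    p = PurePath(path)
    if p.name in SOUL:
        return "soul"
    if p.name in MIND:
        return "mind"
    if p.as_posix() == "memory/LONG_TERM.md":
        return "long-term"
    if p.parts[:1] == ("memory",) and re.fullmatch(r"\d{4}-\d{2}-\d{2}\.md", p.name):
        return "daily-log"
    return "workspace"
```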
Configuration
DeepRecall reads your existing OpenClaw setup — no additional config files needed.
Provider Resolution
Provider, API key, and model are resolved automatically from:
- ~/.openclaw/openclaw.json — primary model setting
- ~/.openclaw/agents/main/agent/models.json — provider base URLs
- ~/.openclaw/credentials/ — cached tokens (e.g. GitHub Copilot)
- Environment variables — fallback (ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY, etc.; 18+ providers supported, all optional)
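The fallback chain can be pictured as a small resolver. This is a sketch under stated assumptions: the apiKey field name is invented for illustration, and the real resolver also consults models.json and the cached credential files.

```python
import json
import os
from pathlib import Path

ENV_KEYS = ("ANTHROPIC_API_KEY", "OPENAI_API_KEY", "GOOGLE_API_KEY")  # plus ~15 more

def resolve_api_key(config_path: Path = Path.home() / ".openclaw" / "openclaw.json",
                    env=os.environ):
    """Prefer the OpenClaw config; fall back to well-known environment variables."""
    if config_path.exists():
        key = json.loads(config_path.read_text()).get("apiKey")  # hypothetical field
        if key:
            return key
    for name in ENV_KEYS:
        if env.get(name):
            return env[name]
    return None
```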
Supported Providers (20+)
Anthropic · OpenAI · Google (Gemini) · GitHub Copilot · OpenRouter · Ollama · DeepSeek · Mistral · Together · Groq · Fireworks · Cohere · Perplexity · SambaNova · Cerebras · xAI · Minimax · Zhipu (GLM) · Moonshot (Kimi) · Qwen
Auto Model Pairing
The manager and synthesis steps use your primary model. Workers use a cheaper sub-agent model automatically:
| Primary Model | Worker Model |
|---|---|
| Claude Opus 4 / 4.6 | Claude Sonnet 4 |
| Claude Sonnet 4 / 4.5 | Claude Haiku 3.5 |
| GPT-4o / GPT-4 | GPT-4o-mini |
| Gemini 2.5 Pro | Gemini 2.0 Flash |
| DeepSeek Reasoner | DeepSeek Chat |
| Llama 3.1 70B | Llama 3.1 8B |
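In effect the pairing is a substring lookup over the primary model name. The sketch below mirrors the table with made-up model-name strings; size-based pairings such as Llama 70B to 8B would need extra parsing, and the real mapping lives in the skill's pairing module.

```python
# Illustrative pairing table; keys are matched as substrings of the primary name.
WORKER_FOR = {
    "opus": "claude-sonnet-4",
    "sonnet": "claude-haiku-3.5",
    "gpt-4": "gpt-4o-mini",
    "gemini-2.5-pro": "gemini-2.0-flash",
    "deepseek-reasoner": "deepseek-chat",
}

def worker_model(primary: str) -> str:
    """Pick a cheaper worker model for a given primary model name."""
    name = primary.lower()
    for needle, worker in WORKER_FOR.items():
        if needle in name:
            return worker
    return primary  # no cheaper pairing known; reuse the primary model
```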
config_overrides
Pass overrides via the config_overrides parameter:

recall("query", config_overrides={
    "max_files": 5,  # max files manager can select (default: 3)
})
Skill Files
| File | Purpose |
|---|---|
| deep_recall.py | Public API — recall, recall_quick, recall_deep — and the RLM loop |
| | Resolves LLM provider, API key, base URL from OpenClaw config |
| | Maps primary models to cheaper worker models |
| | Discovers and categorises workspace files by scope |
| | Builds a structured Memory Index (topics, people, timeline) |
| __init__.py | Package exports |
Memory Layout
Recommended workspace structure for the Anamnesis Architecture:
~/.openclaw/workspace/
├── SOUL.md           # Identity — always in context, never grows
├── IDENTITY.md       # Core agent facts
├── MEMORY.md         # Compact index (~100 lines), auto-loaded each session
├── USER.md           # About the human
├── AGENTS.md         # Agent behavior rules
├── TOOLS.md          # Tool-specific notes
└── memory/
    ├── LONG_TERM.md  # Full memories — grows forever, searched via DeepRecall
    ├── 2026-03-05.md # Daily raw log
    ├── 2026-03-04.md
    └── ...
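A throwaway script can scaffold this layout without touching existing files. The placeholder contents are illustrative; this is not a DeepRecall utility.

```python
from pathlib import Path

IDENTITY_FILES = ["SOUL.md", "IDENTITY.md", "MEMORY.md", "USER.md", "AGENTS.md", "TOOLS.md"]

def scaffold(workspace: Path) -> None:
    """Create the recommended layout; files that already exist are left untouched."""
    (workspace / "memory").mkdir(parents=True, exist_ok=True)
    for name in IDENTITY_FILES:
        f = workspace / name
        if not f.exists():
            f.write_text(f"# {name[:-3]}\n")  # e.g. "# SOUL"
    long_term = workspace / "memory" / "LONG_TERM.md"
    if not long_term.exists():
        long_term.write_text("# Long-Term Memory\n")
```

The existence checks make it safe to re-run; it never overwrites memory you already have.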
⚠️ Privacy Notice
DeepRecall reads your workspace memory files and sends their contents to your configured LLM provider (Anthropic, OpenAI, Gemini, etc.) to perform recall. This is by design — there is no local-only mode.
What gets sent:
- File metadata (names, headings, topics) → to the manager LLM
- Full file contents of selected files → to worker LLMs
- This may include personal notes, daily logs, project files
What is NOT sent:
- API keys and credentials (read locally for auth, never in prompts)
- Files outside your workspace
Credentials used locally:
- ~/.openclaw/openclaw.json and ~/.openclaw/credentials/* — to resolve your LLM provider
- Env vars (ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY, etc.; 18+ providers supported, all optional) — as fallback if no OpenClaw config is found
Recommended Memory Architecture
DeepRecall works best with a tiered memory system:
Tier 1: MEMORY.md (The Index)
- Auto-loaded every session — keep it SMALL (~100 lines)
- Contains: quick reference facts, active projects, key metrics, a table of contents pointing to LONG_TERM.md sections
- Think of it as your orientation file — "what do I need to know right now?"
- If it grows past ~120 lines, move details to LONG_TERM.md
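The ~120-line budget is easy to enforce mechanically, e.g. from a heartbeat script. A hypothetical helper:

```python
from pathlib import Path

def memory_index_overflow(path: Path, limit: int = 120) -> int:
    """Lines over the recommended MEMORY.md budget (0 when within budget)."""
    return max(0, len(path.read_text().splitlines()) - limit)

# if memory_index_overflow(Path("MEMORY.md")):
#     print("MEMORY.md is over budget; move details to memory/LONG_TERM.md")
```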
Tier 2: memory/LONG_TERM.md (The Encyclopedia)
- Never auto-loaded — searched via DeepRecall when needed
- Contains: full context, decisions with reasoning, timestamps, bugs with fixes, architecture details
- Grows forever — never delete, only append
- The devil is in the details — "the diary entry, not the Wikipedia summary"
Tier 3: memory/YYYY-MM-DD.md (Daily Logs)
- Raw notes of what happened each day
- Distilled into LONG_TERM.md at end of day
Nightly Sync Routine
At the end of each day (or via cron/heartbeat):
- Read today's daily log
- Append key events, decisions, lessons, bugs, metrics to LONG_TERM.md
- Update MEMORY.md index table if new topics appeared
⚠️ Setting this up? Ask your human before restructuring existing memory files. Show them this recommendation and let them decide how to organize their agent's memory.
License
MIT — see LICENSE.