Samara-main archive-search

Search raw session transcripts for technical details and reasoning traces

install
source · Clone the upstream repo
git clone https://github.com/claudeaceae/samara-main
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/claudeaceae/samara-main "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/archive-search" ~/.claude/skills/claudeaceae-samara-main-archive-search && rm -rf "$T"
manifest: .claude/skills/archive-search/SKILL.md

Archive Search

Search through raw Claude Code session transcripts to recover technical details, reasoning traces, and thinking blocks that were compressed during episode distillation.

What You'll Get

The archive contains:

  • ✅ Complete thinking blocks (raw reasoning process)
  • ✅ User messages (original intent and questions)
  • ✅ Assistant explanations (first 1000 chars)
  • ❌ Tool results (filtered out - too noisy)
  • ❌ System messages (no semantic value)
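The selection rules above can be sketched as a small filter. This is an illustrative sketch, not the project's actual indexer: the record shape (a dict with `role` and `text`) and the name `select_chunk` are hypothetical, and keeping user messages untruncated is an assumption.

```python
from typing import Optional

# "first 1000 chars" for assistant explanations, per the list above
ASSISTANT_TEXT_LIMIT = 1000


def select_chunk(record: dict) -> Optional[dict]:
    """Return an archive chunk for one transcript record, or None to drop it.

    `record` is a hypothetical, simplified shape:
    {"role": "thinking" | "user" | "assistant" | "tool_result" | "system",
     "text": str}
    """
    role = record.get("role")
    text = record.get("text", "")

    if role == "thinking":
        # Thinking blocks are kept complete.
        return {"role": role, "text": text}
    if role == "user":
        # Assumption: user messages are kept whole.
        return {"role": role, "text": text}
    if role == "assistant":
        # Only the first 1000 characters of assistant text are kept.
        return {"role": role, "text": text[:ASSISTANT_TEXT_LIMIT]}

    # Tool results, system messages, and anything unrecognized are dropped.
    return None
```

Returning `None` for unknown roles keeps the filter conservative: anything not explicitly listed as high-value stays out of the archive.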

Best for:

  • Technical archaeology: "How did we implement X?"
  • Decision rationale: "Why did we choose Y over Z?"
  • Reasoning evolution: "How has our approach to X changed?"
  • Debugging context: "What was the exact error we saw?"

How Archive Differs from Regular Memory

Memory Type | Content | Use Case
Episodes (/episode) | Curated summaries (2-8 bullets per section) | What happened today
Learnings (/search-memory) | Distilled insights | General knowledge
Archive (/archive-search) | Raw transcripts with thinking blocks | Technical depth, exact discussions

The archive captures ~5-10% of raw transcript bytes by selecting only high-value content (thinking blocks, user questions, assistant reasoning). It's indexed separately to prevent noise pollution of curated memory.
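As a rough worked example of the 5-10% figure, the sketch below estimates the archived size for a given volume of raw transcripts. The function name and the 2 GB input are illustrative assumptions, not measurements from the project.

```python
def estimated_archive_bytes(raw_bytes: int) -> tuple:
    """Return (low, high) archive size estimates using the ~5-10% figure."""
    low = raw_bytes * 5 // 100    # 5% of raw transcript bytes
    high = raw_bytes * 10 // 100  # 10% of raw transcript bytes
    return low, high


# Hypothetical input: 2 GB of raw session transcripts.
low, high = estimated_archive_bytes(2_000_000_000)
print(f"{low / 1e6:.0f}-{high / 1e6:.0f} MB")  # prints "100-200 MB"
```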

When to Use This Skill

Good use cases:

/archive-search "model fallback implementation"
→ Find thinking blocks about ModelFallbackChain design

/archive-search "Swift vs Python for memory database"
→ Recover exact trade-off analysis

/archive-search "context tracking approaches"
→ See evolution of ContextTracker across sessions

Not helpful:

/archive-search "who is Lucy"
→ Use /person instead (person profiles)

/archive-search "what did we decide about X"
→ Use /search-memory or check decisions.md

Search Tips

Thinking blocks are highest value:

  • Archive preserves full reasoning traces
  • Includes technical discussion and trade-offs
  • Most detailed context for implementation decisions

Be specific:

  • Good: "memory database indexing strategy"
  • Too broad: "memory" (will match thousands of chunks)

Look for technical terms:

  • Good: "ModelFallbackChain", "ContextTracker", "sanitizeResponse"
  • Generic: "error handling" (use more specific context)

Coverage

Time range: Last 90 days of sessions
Project: samara-main only (for now)
Update frequency: Incremental sync during dream cycles (3 AM)

Current state: Run /archive-search stats to see:

  • Total indexed chunks
  • Session count
  • Thinking block count

Example Session

User:

/archive-search "sanitization logic"

Claude:

Found 3 relevant sessions:

================================================================================
Result 1 | Distance: 0.18 | Role: thinking
Session: 5399ef7a-... | Time: 2026-01-05 15:42:33
--------------------------------------------------------------------------------
The user is asking about the sanitization issue. Let me think through what we
know:

1. Thinking blocks are leaking into messages
2. Session IDs appearing in user-visible text
3. This happens in multi-stream scenarios (webcam + web fetch + conversation)

The root cause is likely in ClaudeInvoker.swift where we parse JSON output.
If parseJsonOutput() falls back to raw text when JSON parsing fails, it would
include everything - thinking blocks, session IDs, etc.

We need a three-layer defense:
- parseJsonOutput() should never fall back to raw
- sanitizeResponse() should strip known patterns
- MessageBus should coordinate all sends through single channel

[...1247 more characters]

Session file: ~/.claude/projects/.../5399ef7a-...jsonl

This gives you the exact reasoning process, not just the summary that ended up in episode logs.
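The result header in the example above follows a regular layout, so it can be pulled apart mechanically when post-processing results. The sketch below assumes real output matches the example exactly; the function name `parse_result_header` and the returned dict keys are hypothetical.

```python
import re
from typing import Optional

# Matches header lines such as:
#   Result 1 | Distance: 0.18 | Role: thinking
# Assumption: the real tool prints headers in exactly this format.
HEADER_RE = re.compile(
    r"^Result (?P<rank>\d+) \| "
    r"Distance: (?P<distance>\d+(?:\.\d+)?) \| "
    r"Role: (?P<role>\w+)$"
)


def parse_result_header(line: str) -> Optional[dict]:
    """Parse one result header line, or return None if it is not a header."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    return {
        "rank": int(match["rank"]),
        "distance": float(match["distance"]),
        "role": match["role"],
    }
```

Lines that are not headers (separators, the `Session:` line, chunk text) return `None`, so the function can be applied to every line of the output in turn.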

Commands

Once you invoke this skill with /archive-search <query>, you can:

  1. Search by query:

    • archive-search "your query here"
    • Returns the top 5 most relevant chunks
  2. Filter by role:

    • --role thinking: only thinking blocks (highest signal)
    • --role user: only user messages
    • --role assistant: only assistant text
  3. Adjust result count:

    • --n 10: get more results
  4. Check index stats:

    • stats: see what's indexed
  5. Quality check:

    • sample: random sample of 3 chunks
    • sample --session-id abc123: sample from a specific session
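The options above can be combined in a single search. The sketch below builds an argument list for the `archive-index` binary mentioned under Implementation Notes. Whether that binary accepts `--role` and `--n` after the query in this order is an assumption, and `build_search_argv` is a hypothetical helper name.

```python
import os
from typing import List, Optional

# Path taken from the Implementation Notes section of this document.
ARCHIVE_INDEX = os.path.expanduser("~/.claude-mind/system/bin/archive-index")

# Roles listed under "Filter by role".
VALID_ROLES = {"thinking", "user", "assistant"}


def build_search_argv(
    query: str, role: Optional[str] = None, n: Optional[int] = None
) -> List[str]:
    """Build the argv for a search (assumed flag placement, see lead-in)."""
    if not query.strip():
        raise ValueError("query must not be empty")
    argv = [ARCHIVE_INDEX, "search", query]
    if role is not None:
        if role not in VALID_ROLES:
            raise ValueError(f"unknown role: {role!r}")
        argv += ["--role", role]
    if n is not None:
        if n < 1:
            raise ValueError("n must be at least 1")
        argv += ["--n", str(n)]
    return argv
```

For example, `build_search_argv("model fallback", role="thinking", n=10)` yields the binary path followed by `search`, the query, `--role thinking`, and `--n 10`. Keeping the query as one list element means it needs no shell quoting when passed to `subprocess.run`.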

Implementation Notes

For Claude instances:

When a user invokes /archive-search <query>, you should:

  1. Run:
    ~/.claude-mind/system/bin/archive-index search "<query>"
  2. Parse and present results clearly
  3. Include session IDs for reference
  4. Explain what the user is seeing (thinking blocks vs user messages)
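Step 1 and the role explanation in step 4 can be sketched as follows. This is a minimal sketch, not the skill's actual implementation: `run_search`, `ROLE_NOTES`, and the injectable `runner` parameter are hypothetical, and the role descriptions paraphrase this document.

```python
import os
import subprocess
from typing import Callable

# Path taken from step 1 above.
ARCHIVE_INDEX = os.path.expanduser("~/.claude-mind/system/bin/archive-index")

# Short explanations to show next to each result's role (step 4).
ROLE_NOTES = {
    "thinking": "raw reasoning trace (highest signal)",
    "user": "original user message",
    "assistant": "assistant explanation (first 1000 chars)",
}


def run_search(query: str, runner: Callable = subprocess.run) -> str:
    """Run `archive-index search "<query>"` and return its stdout.

    `runner` defaults to subprocess.run; it is injectable so the function
    can be exercised without the binary installed.
    """
    completed = runner(
        [ARCHIVE_INDEX, "search", query],  # query passed as a single argument
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

With `check=True`, a failed search surfaces as an exception instead of an empty result, which makes a missing or unbuilt index easier to tell apart from a query with no matches.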

First-time setup:

If the archive hasn't been built yet:

# Build initial index (takes 5-10 minutes for 90 days)
~/.claude-mind/system/bin/archive-index rebuild

# Check stats
~/.claude-mind/system/bin/archive-index stats

# Test search
~/.claude-mind/system/bin/archive-index search "model fallback"

After initial build, dream cycles handle incremental sync automatically.