Marketplace self-improving-agent
A universal self-improving agent that learns from ALL skill experiences. Uses multi-memory architecture (semantic + episodic + working) to continuously evolve the codebase. Auto-triggers on skill completion/error with hooks-based self-correction.
git clone https://github.com/aiskillstore/marketplace
T=$(mktemp -d) && git clone --depth=1 https://github.com/aiskillstore/marketplace "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/charon-fan/self-improving-agent" ~/.claude/skills/aiskillstore-marketplace-self-improving-agent && rm -rf "$T"
skills/charon-fan/self-improving-agent/SKILL.mdSelf-Improving Agent
"An AI agent that learns from every interaction, accumulating patterns and insights to continuously improve its own capabilities." — Based on 2025 lifelong learning research
Overview
This is a universal self-improvement system that learns from ALL skill experiences, not just PRDs. It implements a complete feedback loop with:
- Multi-Memory Architecture: Semantic + Episodic + Working memory
- Self-Correction: Detects and fixes skill guidance errors
- Self-Validation: Periodically verifies skill accuracy
- Hooks Integration: Auto-triggers on skill events (before_start, after_complete, on_error)
- Evolution Markers: Traceable changes with source attribution
Research-Based Design
Based on 2025 research:
| Research | Key Insight | Application |
|---|---|---|
| SimpleMem | Efficient lifelong memory | Pattern accumulation system |
| Multi-Memory Survey | Semantic + Episodic memory | World knowledge + experiences |
| Lifelong Learning | Continuous task stream learning | Learn from every skill use |
| Evo-Memory | Test-time lifelong learning | Real-time adaptation |
The Self-Improvement Loop
┌─────────────────────────────────────────────────────────────────┐ │ UNIVERSAL SELF-IMPROVEMENT │ ├─────────────────────────────────────────────────────────────────┤ │ │ │ Skill Event → Extract Experience → Abstract Pattern → Update │ │ │ │ │ │ │ │ ▼ ▼ ▼ ▼ │ │ ┌─────────────────────────────────────────────────────┐ │ │ │ MULTI-MEMORY SYSTEM │ │ │ ├─────────────────────────────────────────────────────┤ │ │ │ Semantic Memory │ Episodic Memory │ Working Memory │ │ │ │ (Patterns/Rules) │ (Experiences) │ (Current) │ │ │ │ memory/semantic/ │ memory/episodic/ │ memory/working/│ │ │ └─────────────────────────────────────────────────────┘ │ │ │ │ ┌─────────────────────────────────────────────────────┐ │ │ │ FEEDBACK LOOP │ │ │ │ User Feedback → Confidence Update → Pattern Adapt │ │ │ └─────────────────────────────────────────────────────┘ │ │ │ └─────────────────────────────────────────────────────────────────┘
When This Activates
Automatic Triggers (via hooks)
| Event | Trigger | Action |
|---|---|---|
| before_start | Any skill starts | Log session start |
| after_complete | Any skill completes | Extract patterns, update skills |
| on_error | Bash returns non-zero exit | Capture error context, trigger self-correction |
Manual Triggers
- User says "自我进化", "self-improve", "从经验中学习"
- User says "分析今天的经验", "总结教训"
- User asks to improve a specific skill
Evolution Priority Matrix
Trigger evolution when new reusable knowledge appears:
| Trigger | Target Skill | Priority | Action |
|---|---|---|---|
| New PRD pattern discovered | prd-planner | High | Add to quality checklist |
| Architecture tradeoff clarified | architecting-solutions | High | Add to decision patterns |
| API design rule learned | api-designer | High | Update template |
| Debugging fix discovered | debugger | High | Add to anti-patterns |
| Review checklist gap | code-reviewer | High | Add checklist item |
| Perf/security insight | performance-engineer, security-auditor | High | Add to patterns |
| UI/UX spec issue | prd-planner, architecting-solutions | High | Add visual spec requirements |
| React/state pattern | debugger, refactoring-specialist | Medium | Add to patterns |
| Test strategy improvement | test-automator, qa-expert | Medium | Update approach |
| CI/deploy fix | deployment-engineer | Medium | Add to troubleshooting |
Multi-Memory Architecture
1. Semantic Memory (memory/semantic-patterns.json
)
memory/semantic-patterns.jsonStores abstract patterns and rules reusable across contexts:
{ "patterns": { "pattern_id": { "id": "pat-2025-01-11-001", "name": "Pattern Name", "source": "user_feedback|implementation_review|retrospective", "confidence": 0.95, "applications": 5, "created": "2025-01-11", "category": "prd_structure|react_patterns|async_patterns|...", "pattern": "One-line summary", "problem": "What problem does this solve?", "solution": { ... }, "quality_rules": [ ... ], "target_skills": [ ... ] } } }
2. Episodic Memory (memory/episodic/
)
memory/episodic/Stores specific experiences and what happened:
memory/episodic/ ├── 2025/ │ ├── 2025-01-11-prd-creation.json │ ├── 2025-01-11-debug-session.json │ └── 2025-01-12-refactoring.json
{ "id": "ep-2025-01-11-001", "timestamp": "2025-01-11T10:30:00Z", "skill": "debugger", "situation": "User reported data not refreshing after form submission", "root_cause": "Empty callback in onRefresh prop", "solution": "Implement actual refresh logic in callback", "lesson": "Always verify callbacks are not empty functions", "related_pattern": "callback_verification", "user_feedback": { "rating": 8, "comments": "This was exactly the issue" } }
3. Working Memory (memory/working/
)
memory/working/Stores current session context:
memory/working/ ├── current_session.json # Active session data ├── last_error.json # Error context for self-correction └── session_end.json # Session end marker
Self-Improvement Process
Phase 1: Experience Extraction
After any skill completes, extract:
What happened: skill_used: {which skill} task: {what was being done} outcome: {success|partial|failure} Key Insights: what_went_well: [what worked] what_went_wrong: [what didn't work] root_cause: {underlying issue if applicable} User Feedback: rating: {1-10 if provided} comments: {specific feedback}
Phase 2: Pattern Abstraction
Convert experiences to reusable patterns:
| Concrete Experience | Abstract Pattern | Target Skill |
|---|---|---|
| "User forgot to save PRD notes" | "Always persist thinking to files" | prd-planner |
| "Code review missed SQL injection" | "Add security checklist item" | code-reviewer |
| "Callback was empty, didn't work" | "Verify callback implementations" | debugger |
| "Net APY position ambiguous" | "UI specs need exact relative positions" | prd-planner |
Abstraction Rules:
If experience_repeats 3+ times: pattern_level: critical action: Add to skill's "Critical Mistakes" section If solution_was_effective: pattern_level: best_practice action: Add to skill's "Best Practices" section If user_rating >= 7: pattern_level: strength action: Reinforce this approach If user_rating <= 4: pattern_level: weakness action: Add to "What to Avoid" section
Phase 3: Skill Updates
Update the appropriate skill files with evolution markers:
<!-- Evolution: 2025-01-12 | source: ep-2025-01-12-001 | skill: debugger --> ## Pattern Added (2025-01-12) **Pattern**: Always verify callbacks are not empty functions **Source**: Episode ep-2025-01-12-001 **Confidence**: 0.95 ### Updated Checklist - [ ] Verify all callbacks have implementations - [ ] Test callback execution paths
Correction Markers (when fixing wrong guidance):
<!-- Correction: 2025-01-12 | was: "Use callback chain" | reason: caused stale refresh --> ## Corrected Guidance Use direct state monitoring instead of callback chains: ```typescript // ✅ Do: Direct state monitoring const prevPendingCount = usePrevious(pendingCount);
### Phase 4: Memory Consolidation 1. **Update semantic memory** (`memory/semantic-patterns.json`) 2. **Store episodic memory** (`memory/episodic/YYYY-MM-DD-{skill}.json`) 3. **Update pattern confidence** based on applications/feedback 4. **Prune outdated patterns** (low confidence, no recent applications) ## Self-Correction (on_error hook) Triggered when: - Bash command returns non-zero exit code - Tests fail after following skill guidance - User reports the guidance produced incorrect results **Process:** ```markdown ## Self-Correction Workflow 1. Detect Error - Capture error context from working/last_error.json - Identify which skill guidance was followed 2. Verify Root Cause - Was the skill guidance incorrect? - Was the guidance misinterpreted? - Was the guidance incomplete? 3. Apply Correction - Update skill file with corrected guidance - Add correction marker with reason - Update related patterns in semantic memory 4. Validate Fix - Test the corrected guidance - Ask user to verify
Example:
<!-- Correction: 2025-01-12 | was: "useMemo for claimable ids" | reason: stale data at click time --> ## Self-Correction: Click-Time Computation **Issue**: Using useMemo for claimable IDs caused stale data **Fix**: Compute at click time for always-fresh data **Pattern**: click_time_vs_open_time_computation
Self-Validation
Use the validation template in
references/appendix.md when reviewing updates.
Hooks Integration
Wiring Hooks in Claude Code Settings
Add to Claude Code settings (
~/.claude/settings.json):
{ "hooks": { "PreToolUse": [ { "matcher": "Bash|Write|Edit", "hooks": [ { "type": "command", "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/pre-tool.sh \"$TOOL_NAME\" \"$TOOL_INPUT\"" } ] } ], "PostToolUse": [ { "matcher": "Bash", "hooks": [ { "type": "command", "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/post-bash.sh \"$TOOL_OUTPUT\" \"$EXIT_CODE\"" } ] } ], "Stop": [ { "matcher": "", "hooks": [ { "type": "command", "command": "bash ${SKILLS_DIR}/self-improving-agent/hooks/session-end.sh" } ] } ] } }
Replace
${SKILLS_DIR} with your actual skills path.
Additional References
See
references/appendix.md for memory structure, workflow diagrams, metrics, feedback templates, and research links.
Best Practices
DO
- ✅ Learn from EVERY skill interaction
- ✅ Extract patterns at the right abstraction level
- ✅ Update multiple related skills
- ✅ Track confidence and apply counts
- ✅ Ask for user feedback on improvements
- ✅ Use evolution/correction markers for traceability
- ✅ Validate guidance before applying broadly
DON'T
- ❌ Over-generalize from single experiences
- ❌ Update skills without confidence tracking
- ❌ Ignore negative feedback
- ❌ Make changes that break existing functionality
- ❌ Create contradictory patterns
- ❌ Update skills without understanding context
Quick Start
After any skill completes, this agent automatically:
- Analyzes what happened
- Extracts patterns and insights
- Updates relevant skill files
- Logs to memory for future reference
- Reports summary to user