trending-skills / metaclaw-evolving-agent
Deploy and configure MetaClaw — an agent that meta-learns and evolves from live conversations using skills injection, RL training, and smart scheduling.
```bash
# Clone the full collection
git clone https://github.com/Aradotso/trending-skills

# Or install just this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/Aradotso/trending-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/metaclaw-evolving-agent" ~/.claude/skills/aradotso-trending-skills-metaclaw-evolving-agent && rm -rf "$T"
```
skills/metaclaw-evolving-agent/SKILL.md

MetaClaw Evolving Agent
Skill by ara.so — Daily 2026 Skills collection
MetaClaw is an OpenAI-compatible proxy agent that intercepts conversations, injects learned skills, and continuously improves itself through real-world interactions. It supports three modes: lightweight skills injection, immediate RL training, and a smart "madmax" scheduler that defers weight updates to idle/sleep windows.
Installation
```bash
# Minimal — skills injection only, no GPU required
pip install -e .

# Full RL training support (torch, transformers, tinker)
pip install -e ".[rl]"

# Skill evolution via LLM summarization
pip install -e ".[evolve]"

# Google Calendar scheduler for madmax mode
pip install -e ".[scheduler]"

# Recommended: everything
pip install -e ".[rl,evolve,scheduler]"
```
Quick Start
```bash
# One-time interactive config wizard
metaclaw setup

# Start in default madmax mode (skills + RL + smart scheduler)
metaclaw start

# Skills only — no GPU, no Tinker needed
metaclaw start --mode skills_only

# RL mode — trains immediately when batch is full
metaclaw start --mode rl

# RL without scheduler (same as above, explicit)
metaclaw start --mode rl
```
After `metaclaw start`, a local OpenAI-compatible proxy is running. Point your client (OpenClaw or any OpenAI SDK consumer) at `http://localhost:<port>` instead of the upstream LLM endpoint.
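To check that the proxy is up, a minimal smoke test; this assumes the default port 8080 from the config below and the standard OpenAI chat-completions route:

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "moonshot-v1-8k", "messages": [{"role": "user", "content": "ping"}]}'
```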
Configuration
`metaclaw setup` writes a config file (default: `~/.metaclaw/config.yaml`). You can also edit it directly:
```yaml
# ~/.metaclaw/config.yaml
proxy:
  host: 0.0.0.0
  port: 8080

llm:
  provider: kimi              # kimi | qwen | claude | minimax | openai | gemini
  base_url: https://api.moonshot.cn/v1
  model: moonshot-v1-8k
  # api_key loaded from env: METACLAW_LLM_API_KEY

skills:
  enabled: true
  max_injected: 5             # max skills injected per turn
  summarize_after_session: true

rl:
  enabled: true
  backend: auto               # auto | tinker | mint
  batch_size: 32
  algorithm: grpo
  opd_teacher: false          # optional teacher distillation

scheduler:                    # madmax mode only
  enabled: true
  sleep_hours: [22, 7]        # local 22:00–07:00
  idle_timeout_minutes: 15
  google_calendar: false      # set true + configure OAuth for meeting detection

logging:
  level: info
  log_dir: ~/.metaclaw/logs
```
Environment Variables
```bash
export METACLAW_LLM_API_KEY="your-llm-api-key"
export METACLAW_TINKER_API_KEY="your-tinker-api-key"           # rl mode
export METACLAW_MINT_API_KEY="your-mint-api-key"               # if backend=mint
export GOOGLE_CALENDAR_CREDENTIALS_PATH="path/to/creds.json"   # scheduler
```
Operating Modes
| Mode | Command | GPU Required | Description |
|---|---|---|---|
| `skills_only` | `metaclaw start --mode skills_only` | No | Proxy + skills injection + auto-summarization |
| `rl` | `metaclaw start --mode rl` | Via API | Skills + GRPO training when batch fills |
| `madmax` (default) | `metaclaw start` | Via API | Skills + RL + scheduler (trains only during idle/sleep/meetings) |
Python API
Programmatic startup
```python
import asyncio

from metaclaw import MetaClawAgent, AgentConfig, Mode

async def main():
    config = AgentConfig.from_yaml("~/.metaclaw/config.yaml")
    agent = MetaClawAgent(config, mode=Mode.MADMAX)
    await agent.start()

asyncio.run(main())
```
Manual skill injection
```python
from metaclaw.skills import SkillStore, SkillInjector

store = SkillStore(path="~/.metaclaw/skills")

# Add a skill manually
store.add(
    name="code-review-checklist",
    content="Always check for: 1) error handling, 2) type hints, 3) docstrings.",
    tags=["code", "review"]
)

# Retrieve top-k relevant skills for a query
injector = SkillInjector(store)
relevant = injector.retrieve(query="review my Python function", top_k=3)
for skill in relevant:
    print(skill.name, skill.score)
```
Intercepting and recording conversations
```python
from metaclaw.proxy import ConversationInterceptor
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(max_size=1000)

interceptor = ConversationInterceptor(
    upstream_url="https://api.moonshot.cn/v1",
    on_complete=buffer.record  # called after each turn with (messages, response)
)

# buffer.record signature:
# async def on_complete(messages: list[dict], response: dict) -> None: ...
```
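A turn can also be recorded by hand, using the same `(messages, response)` shape the interceptor passes to `on_complete`; the response dict's fields beyond `role`/`content` are an assumption here:

```python
import asyncio

messages = [{"role": "user", "content": "Summarize this diff."}]
response = {"role": "assistant", "content": "Renames foo() to bar() and updates the call sites."}

# buffer.record is the async on_complete hook wired up above
asyncio.run(buffer.record(messages, response))
```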
Triggering RL training manually
```python
from metaclaw.training import RLTrainer, TrainingConfig

trainer = RLTrainer(
    config=TrainingConfig(
        backend="tinker",   # or "mint"
        algorithm="grpo",
        batch_size=32,
        lora_rank=16,
    )
)

# Collect a batch from the experience buffer and train
async def run_training(buffer):
    batch = buffer.sample(n=32, split="support")  # support/query separation
    result = await trainer.train(batch)
    print(f"Training complete. Loss: {result.loss:.4f}, Steps: {result.steps}")
```
Reward modeling
```python
from metaclaw.rewards import RewardModel

reward_model = RewardModel(provider="llm")  # uses configured LLM for scoring

async def score_turn(prompt: str, response: str) -> float:
    score = await reward_model.score(prompt=prompt, response=response)
    return score  # float in [-1.0, 1.0]
```
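As a sketch of how this might feed training, the helper below scores a sampled batch before a GRPO update; the `turn.messages`/`turn.response` attribute names are assumptions, not confirmed MetaClaw API:

```python
async def score_batch(batch, reward_model) -> list[float]:
    """Attach a scalar reward to each recorded turn (hypothetical entry shape)."""
    rewards = []
    for turn in batch:
        prompt = turn.messages[-1]["content"]  # last user message (assumed attribute)
        reply = turn.response["content"]       # assistant reply (assumed attribute)
        rewards.append(await reward_model.score(prompt=prompt, response=reply))
    return rewards
```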
Skills Lifecycle
```
Conversation turn
        │
        ▼
SkillInjector.retrieve()    ← vector search over SkillStore
        │ injects top-k skills into system prompt
        ▼
LLM responds
        │
        ▼
ExperienceBuffer.record()   ← stores (context, response, metadata)
        │
        ▼ (end of session)
SkillSummarizer.run()       ← LLM extracts reusable patterns
        │
        ▼
SkillStore.upsert()         ← new/updated skills persisted to disk
```
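For intuition, one plausible way the injector could render retrieved skills into the system prompt is sketched below; the `Skill` dataclass and `format_skills` are illustrative, not part of the MetaClaw API:

```python
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    content: str

def format_skills(skills: list[Skill]) -> str:
    """Render retrieved skills as a system-prompt preamble."""
    lines = ["You have the following learned skills:"]
    lines += [f"- {s.name}: {s.content}" for s in skills]
    return "\n".join(lines)

skills = [Skill("code-review-checklist",
                "Always check for: 1) error handling, 2) type hints, 3) docstrings.")]
messages = [{"role": "system", "content": format_skills(skills)},
            {"role": "user", "content": "Review my Python function"}]
```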
Integration: OpenAI SDK as Client
Point any OpenAI SDK client at the MetaClaw proxy:
```python
from openai import OpenAI

# MetaClaw proxy is running on localhost:8080
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="not-used-but-required-by-sdk"
)

response = client.chat.completions.create(
    model="moonshot-v1-8k",  # passed through to upstream
    messages=[
        {"role": "user", "content": "Review my pull request strategy."}
    ]
)
print(response.choices[0].message.content)
```
Skills are injected transparently — the client code does not change.
Scheduler (MadMax Mode)
The scheduler ensures RL weight updates never interrupt active use:
```python
from metaclaw.scheduler import MadMaxScheduler, SchedulerConfig

scheduler = MadMaxScheduler(
    config=SchedulerConfig(
        sleep_hours=(22, 7),       # train between 22:00–07:00 local time
        idle_timeout_minutes=15,   # train after 15 min of no conversations
        google_calendar=True,      # also train during calendar meetings
        credentials_path="creds.json"
    )
)

# Check if it's safe to train right now
if await scheduler.is_training_window():
    await trainer.train(batch)
```
Google Calendar Setup
```bash
# 1. Enable Google Calendar API in Google Cloud Console
# 2. Download OAuth2 credentials as creds.json
# 3. Set path in config or env
export GOOGLE_CALENDAR_CREDENTIALS_PATH="/path/to/creds.json"

# 4. First run will open browser for OAuth consent
metaclaw start
```
Support/Query Set Separation
MetaClaw separates experience into support and query sets to prevent stale rewards from polluting updates:
```python
from metaclaw.memory import ExperienceBuffer

buffer = ExperienceBuffer(
    max_size=2000,
    support_ratio=0.5  # 50% support, 50% query
)

# During training:
support_batch = buffer.sample(n=16, split="support")  # used to compute reward signal
query_batch = buffer.sample(n=16, split="query")      # used for gradient update

await trainer.train_meta(support=support_batch, query=query_batch)
```
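Putting the pieces together, a background loop might gate meta-updates on the scheduler. This is a sketch assuming the objects built above, that `ExperienceBuffer` supports `len()`, and an illustrative 60-second polling interval:

```python
import asyncio

async def training_loop(buffer, trainer, scheduler, batch_size=16):
    while True:
        # Train only inside an idle/sleep/meeting window, and only once
        # enough experience has accumulated for both splits.
        if await scheduler.is_training_window() and len(buffer) >= 2 * batch_size:
            support = buffer.sample(n=batch_size, split="support")
            query = buffer.sample(n=batch_size, split="query")
            await trainer.train_meta(support=support, query=query)
        await asyncio.sleep(60)  # illustrative polling interval
```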
RL Backends
Tinker (default)
```yaml
rl:
  backend: tinker
  tinker_project: my-metaclaw-project
  lora_rank: 16
  learning_rate: 1e-4
```
MinT
```bash
# Install MinT compatibility layer separately
pip install metaclaw-mint
```
```yaml
rl:
  backend: mint
  mint_endpoint: https://your-mint-endpoint
```
Auto-detection
```yaml
rl:
  backend: auto  # tries tinker first, falls back to mint, errors if neither available
```
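As a mental model, `auto` resolution plausibly looks like the sketch below; this is illustrative, not the actual MetaClaw source, and the module/env-var checks are assumptions beyond the documented fallback order:

```python
import importlib.util
import os

def resolve_backend() -> str:
    """Prefer Tinker, fall back to MinT, else raise (mirrors the comment above)."""
    if importlib.util.find_spec("tinker") and os.environ.get("METACLAW_TINKER_API_KEY"):
        return "tinker"
    if importlib.util.find_spec("metaclaw_mint") and os.environ.get("METACLAW_MINT_API_KEY"):
        return "mint"
    raise RuntimeError("No training backend available")
```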
Troubleshooting
Proxy not reachable after `metaclaw start`
- Check port conflicts: `lsof -i :8080`
- Change `proxy.port` in config and restart
`rl` mode: "No training backend available"
- Ensure `pip install -e ".[rl]"` completed successfully
- Verify `METACLAW_TINKER_API_KEY` or `METACLAW_MINT_API_KEY` is set
- Try `rl.backend: tinker` explicitly instead of `auto`
Skills not persisting between sessions
- Confirm `skills.summarize_after_session: true` in config
- Check write permissions on `~/.metaclaw/skills/`
- Run `metaclaw skills list` to inspect stored skills
Madmax mode never trains
- Verify `scheduler.sleep_hours` covers your timezone's night
- Lower `scheduler.idle_timeout_minutes` for testing (e.g., `1`)
- Check scheduler logs: `~/.metaclaw/logs/scheduler.log`
Google Calendar integration fails
- Re-run OAuth flow: delete `~/.metaclaw/token.json` and restart
- Ensure the Calendar API is enabled in your Google Cloud project
OPD teacher distillation errors
- Only supported with `rl.backend: tinker`
- Requires a separate teacher model endpoint in config:
```yaml
rl:
  opd_teacher: true
  teacher_base_url: https://api.openai.com/v1
  teacher_model: gpt-4o
```
CLI Reference
```bash
metaclaw setup                        # interactive config wizard
metaclaw start                        # start in madmax mode
metaclaw start --mode skills_only
metaclaw start --mode rl
metaclaw start --config path/to/config.yaml

metaclaw skills list                  # show all stored skills
metaclaw skills delete <name>         # remove a skill
metaclaw skills export skills.json

metaclaw status                       # show proxy, scheduler, training status
metaclaw logs                         # tail all logs
metaclaw logs --component scheduler
```