Asi AgentDB Learning Plugins

Create and train RL learning plugins with AgentDB's plugin system.

Install

Source · Clone the upstream repo:
git clone https://github.com/plurigrid/asi

Claude Code · Install into ~/.claude/skills/:
T=$(mktemp -d) && git clone --depth=1 https://github.com/plurigrid/asi "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/agentdb-learning-plugins" ~/.claude/skills/plurigrid-asi-agentdb-learning-plugins && rm -rf "$T"

Manifest: skills/agentdb-learning-plugins/SKILL.md

Source content

AgentDB Learning Plugins

CLI Quick Start

# Interactive wizard
npx agentdb@latest create-plugin

# Use specific template
npx agentdb@latest create-plugin -t decision-transformer -n my-agent

# Preview without creating
npx agentdb@latest create-plugin -t q-learning --dry-run

# Custom output directory
npx agentdb@latest create-plugin -t actor-critic -o ./plugins

# List available templates
npx agentdb@latest list-templates

# List installed plugins
npx agentdb@latest list-plugins

# Get plugin info
npx agentdb@latest plugin-info my-agent

API Quick Start

import { createAgentDBAdapter } from 'agentic-flow/reasoningbank';

const adapter = await createAgentDBAdapter({
  dbPath: '.agentdb/learning.db',
  enableLearning: true,
  enableReasoning: true,
  cacheSize: 1000,
});

Algorithm Templates and Configs

1. Decision Transformer (Recommended)

Offline RL -- learns from logged experiences without online interaction.

npx agentdb@latest create-plugin -t decision-transformer -n dt-agent
{
  "algorithm": "decision-transformer",
  "model_size": "base",
  "context_length": 20,
  "embed_dim": 128,
  "n_heads": 8,
  "n_layers": 6
}
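Here, context_length controls how many past steps the transformer attends to when predicting the next action. A minimal sketch of preparing logged data that way, assuming a (return-to-go, state, action) step shape — this is illustrative, not AgentDB's internal format:

```typescript
// Hypothetical logged step: return-to-go, state features, action id.
interface Step { rtg: number; state: number[]; action: number }

// Return-to-go: sum of rewards from each step to the end of the episode.
function returnsToGo(rewards: number[]): number[] {
  const rtg = new Array<number>(rewards.length).fill(0);
  let acc = 0;
  for (let t = rewards.length - 1; t >= 0; t--) {
    acc += rewards[t];
    rtg[t] = acc;
  }
  return rtg;
}

// Slice an episode into windows of at most `contextLength` steps,
// one window per prediction target, as a decision transformer consumes them.
function toContextWindows(episode: Step[], contextLength: number): Step[][] {
  const windows: Step[][] = [];
  for (let end = 1; end <= episode.length; end++) {
    windows.push(episode.slice(Math.max(0, end - contextLength), end));
  }
  return windows;
}
```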

2. Q-Learning

Off-policy, value-based. Best for discrete action spaces.

npx agentdb@latest create-plugin -t q-learning -n q-agent
{
  "algorithm": "q-learning",
  "learning_rate": 0.001,
  "gamma": 0.99,
  "epsilon": 0.1,
  "epsilon_decay": 0.995
}
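These hyperparameters map onto the classic tabular update rule. A self-contained sketch — the Q-table layout is illustrative; the generated plugin manages its own state:

```typescript
type QTable = number[][]; // Q[state][action]

// One Q-learning step: off-policy, bootstraps from the greedy next action.
function qLearningUpdate(
  q: QTable, s: number, a: number, reward: number,
  sNext: number, done: boolean,
  learningRate = 0.001, gamma = 0.99,
): void {
  const target = done ? reward : reward + gamma * Math.max(...q[sNext]);
  q[s][a] += learningRate * (target - q[s][a]);
}

// Epsilon-greedy action selection: explore with probability epsilon.
function selectAction(q: QTable, s: number, epsilon: number): number {
  if (Math.random() < epsilon) return Math.floor(Math.random() * q[s].length);
  return q[s].indexOf(Math.max(...q[s]));
}

// Multiplicative decay per episode, matching epsilon_decay above.
function decayEpsilon(epsilon: number, decay = 0.995, floor = 0.01): number {
  return Math.max(floor, epsilon * decay);
}
```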

3. SARSA

On-policy, value-based. More conservative than Q-Learning -- better for safety-critical tasks.

npx agentdb@latest create-plugin -t sarsa -n sarsa-agent
{
  "algorithm": "sarsa",
  "learning_rate": 0.001,
  "gamma": 0.99,
  "epsilon": 0.1
}
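What makes SARSA on-policy is that it bootstraps from the action the behavior policy actually takes next, rather than Q-learning's greedy max — which is why it stays more conservative under exploration. A sketch of the update in the same illustrative tabular form:

```typescript
// SARSA update: the target uses aNext, the action actually selected
// by the behavior policy, not the greedy argmax over q[sNext].
function sarsaUpdate(
  q: number[][], s: number, a: number, reward: number,
  sNext: number, aNext: number, done: boolean,
  learningRate = 0.001, gamma = 0.99,
): void {
  const target = done ? reward : reward + gamma * q[sNext][aNext];
  q[s][a] += learningRate * (target - q[s][a]);
}
```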

4. Actor-Critic

Policy gradient with value baseline. Works for continuous and discrete action spaces.

npx agentdb@latest create-plugin -t actor-critic -n ac-agent
{
  "algorithm": "actor-critic",
  "actor_lr": 0.001,
  "critic_lr": 0.002,
  "gamma": 0.99,
  "entropy_coef": 0.01
}
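The pieces these hyperparameters control can be sketched in isolation: the critic supplies a value baseline for the advantage, and entropy_coef scales an entropy bonus that keeps the policy from collapsing onto one action too early. An illustrative sketch, not the plugin's implementation:

```typescript
// Softmax policy over action logits.
function softmax(logits: number[]): number[] {
  const m = Math.max(...logits);
  const exps = logits.map((l) => Math.exp(l - m));
  const z = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / z);
}

// Advantage: how much better the transition was than the critic's baseline.
function advantage(
  reward: number, vNext: number, v: number, done: boolean, gamma = 0.99,
): number {
  return (done ? reward : reward + gamma * vNext) - v;
}

// Policy entropy; multiplied by entropy_coef and added to the actor
// objective to encourage exploration.
function entropy(probs: number[]): number {
  return -probs.reduce((s, p) => s + (p > 0 ? p * Math.log(p) : 0), 0);
}
```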

5. Curiosity-Driven

npx agentdb@latest create-plugin -t curiosity-driven -n curious-agent

Templates 6-10

Also available via list-templates: active-learning, adversarial-training, curriculum-learning, federated-learning, multi-task-learning. These have no dedicated CLI template flag -- use the interactive wizard (create-plugin with no -t).


Training Workflow

Store Experiences

await adapter.insertPattern({
  id: '',
  type: 'experience',
  domain: 'task-domain',
  pattern_data: JSON.stringify({
    embedding: await computeEmbedding(JSON.stringify(step)),
    pattern: {
      state: step.state,
      action: step.action,
      reward: step.reward,
      next_state: step.next_state,
      done: step.done,
    },
  }),
  confidence: step.reward > 0 ? 0.9 : 0.5,
  usage_count: 1,
  success_count: step.reward > 0 ? 1 : 0,
  created_at: Date.now(),
  last_used: Date.now(),
});
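computeEmbedding above stands in for whatever embedding function your stack provides. For exercising the storage path locally, a deterministic hashing stand-in can substitute — this is an assumption for testing only, not a real embedding model:

```typescript
// Toy hashing embedder: deterministic and normalized, but carries no
// semantics. Swap in a real embedding model before training.
function computeEmbedding(text: string, dim = 32): number[] {
  const v = new Array<number>(dim).fill(0);
  for (let i = 0; i < text.length; i++) {
    v[(text.charCodeAt(i) * 31 + i) % dim] += 1;
  }
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0)) || 1;
  return v.map((x) => x / norm);
}
```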

Train

const metrics = await adapter.train({
  epochs: 100,
  batchSize: 64,
  learningRate: 0.001,
  validationSplit: 0.2,
});
// Returns: { loss, valLoss, duration, epochs }

Evaluate

const result = await adapter.retrieveWithReasoning(testQuery, {
  domain: 'task-domain',
  k: 10,
  synthesizeContext: true,
});
const suggestedAction = result.memories[0].pattern.action;
const confidence = result.memories[0].similarity;

Prioritized Experience Replay

// Store with TD error as priority
await adapter.insertPattern({
  // ... standard fields
  confidence: tdError,  // TD error = priority
});

// Retrieve only high-priority experiences
const highPriority = await adapter.retrieveWithReasoning(queryEmbedding, {
  domain: 'task-domain',
  k: 32,
  minConfidence: 0.7,
});
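Since confidence elsewhere in this document is a [0, 1] score, storing a raw TD error there needs a mapping. One reasonable choice (an assumption, not a prescribed convention) is the clamped absolute TD error:

```typescript
// TD error for a transition under current value estimates:
// how surprising the observed reward was.
function tdError(
  reward: number, maxQNext: number, qSA: number, done: boolean, gamma = 0.99,
): number {
  const target = done ? reward : reward + gamma * maxQNext;
  return target - qSA;
}

// Map |TD error| into [0, 1] so it can serve as a confidence/priority score;
// `scale` sets the magnitude that saturates at priority 1.
function tdErrorToPriority(td: number, scale = 1): number {
  return Math.min(1, Math.abs(td) / scale);
}
```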

Multi-Agent Training

for (const agent of agents) {
  const experience = await agent.step();
  await adapter.insertPattern({
    domain: `multi-agent/${agent.id}`,
    // ... experience data
  });
}

await adapter.train({ epochs: 50, batchSize: 64 });

Combined Learning + Reasoning

await adapter.train({ epochs: 50, batchSize: 32 });

const result = await adapter.retrieveWithReasoning(queryEmbedding, {
  domain: 'decision-making',
  k: 10,
  useMMR: true,
  synthesizeContext: true,
  optimizeMemory: true,
});
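The useMMR flag refers to maximal marginal relevance: re-ranking retrieved memories to trade relevance against redundancy, so the k results are not near-duplicates. A self-contained sketch of the standard greedy MMR loop, assuming cosine similarity over embeddings (illustrative, not AgentDB's internals):

```typescript
interface Scored { id: string; relevance: number; embedding: number[] }

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Greedy MMR: each pick maximizes relevance minus similarity to what is
// already selected. lambda = 1 reduces to pure relevance ranking.
function mmr(candidates: Scored[], k: number, lambda = 0.7): Scored[] {
  const selected: Scored[] = [];
  const pool = [...candidates];
  while (selected.length < k && pool.length > 0) {
    let bestIdx = 0, bestScore = -Infinity;
    for (let i = 0; i < pool.length; i++) {
      const redundancy = selected.length
        ? Math.max(...selected.map((s) => cosine(pool[i].embedding, s.embedding)))
        : 0;
      const score = lambda * pool[i].relevance - (1 - lambda) * redundancy;
      if (score > bestScore) { bestScore = score; bestIdx = i; }
    }
    selected.push(pool.splice(bestIdx, 1)[0]);
  }
  return selected;
}
```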

Troubleshooting

Not converging: Lower learningRate (try 0.0001).

Overfitting: Add validationSplit: 0.2, enable optimizeMemory: true to consolidate patterns.

Slow training: Enable quantization (quantizationType: 'binary').