Asi active-inference-robotics
Second-order skill synthesizing Patrick Kenny's discrete active inference framework with K-Scale's JAX/MuJoCo robotics stack for predictive coding in robot locomotion
git clone https://github.com/plurigrid/asi
T=$(mktemp -d) && git clone --depth=1 https://github.com/plurigrid/asi "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/active-inference-robotics" ~/.claude/skills/plurigrid-asi-active-inference-robotics-ed1c0e && rm -rf "$T"
skills/active-inference-robotics/SKILL.md
Active Inference Robotics Skill (Second-Order)
"The agent's job is to predict its actions by predicting its sensations." — Patrick Kenny
Trigger Conditions
- User asks about bridging active inference with robot control
- Questions about predictive coding in locomotion policies
- Connecting KL divergence minimization to RL training
- Mean field approximation in robotics state estimation
- Sim2Real as inference about future observations
Overview
Second-order skill synthesizing Patrick Kenny's discrete active inference framework with K-Scale's JAX/MuJoCo robotics stack. This skill emerges from the constructive collision between:
- Active Inference Institute (ActInf ModelStream 019.1, Jan 2026)
- K-Scale Labs (ksim, kos, kinfer ecosystem)
- MuJoCo Playground (DeepMind's sim2real framework)
The Constructive Collision
```
┌─────────────────────────────────────────────────────────────────────────────┐
│  CONSTRUCTIVE COLLISION: Two Threads Converging                             │
│                                                                             │
│  Thread A: Patrick Kenny (Nov 2025)                                         │
│  ══════════════════════════════════                                         │
│  "Active inference can be formulated as constrained KL divergence           │
│   minimization solved by standard mean field methods"                       │
│                                                                             │
│  Key insight: Expected Free Energy ≈ KL Divergence + Entropy Regularizer    │
│                                                                             │
│  Thread B: K-Scale Labs (2024-2025)                                         │
│  ══════════════════════════════════                                         │
│  "RL-based closed-loop control using policies trained in simulation         │
│   has firmly won as the best way of achieving real-time control"            │
│                                                                             │
│  Key insight: Stateless vs Stateful behaviors as pure/coalgebraic semantics │
│                                                                             │
│  COLLISION POINT: Both minimize surprise about future observations          │
│  ═════════════════════════════════════════════════════════════════          │
│                                                                             │
│  Active Inference               Robotics RL                                 │
│  ────────────────               ───────────                                 │
│  Predictive Distribution   ←→   Policy π(a|s)                               │
│  Hidden Markov Model       ←→   MDP/POMDP                                   │
│  Mean Field Updates        ←→   PPO Gradient Steps                          │
│  Variational Free Energy   ←→   Policy Loss                                 │
│  Expected Free Energy      ←→   Value Function + Entropy                    │
│  Perception/Action Loop    ←→   Observation/Action Loop                     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
Kenny's Key Contribution
From arXiv:2511.20321:
```
Perception/Action Divergence = VFE(past) + KL(future states)

Where:
- VFE(past)  = Standard variational free energy on observed history
- KL(future) = Divergence of predictive distribution from HMM

This differs from Expected Free Energy by an ENTROPY REGULARIZER:
  EFE ≈ Pragmatic Value + Mutual Information
  PAD ≈ Pragmatic Value + Entropy(Q)
```
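To make the role of the entropy term concrete, the sketch below checks a standard identity for categorical distributions: KL(Q‖P) equals the cross-entropy of Q against P minus the entropy of Q. Any KL-based objective therefore carries an entropy term on Q. The distributions `q` and `p` are made-up numbers for illustration and are not taken from Kenny's paper.

```python
import math

def entropy(q):
    """Shannon entropy H(Q) in nats."""
    return -sum(qi * math.log(qi) for qi in q if qi > 0)

def cross_entropy(q, p):
    """Cross-entropy -sum_i q_i log p_i in nats."""
    return -sum(qi * math.log(pi) for qi, pi in zip(q, p) if qi > 0)

def kl_divergence(q, p):
    """KL(Q || P) in nats; assumes p_i > 0 wherever q_i > 0."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

# Hypothetical predictive distribution Q and model distribution P over 3 states.
q = [0.7, 0.2, 0.1]
p = [0.5, 0.3, 0.2]

kl = kl_divergence(q, p)
decomposed = cross_entropy(q, p) - entropy(q)
print(f"KL(Q||P) = {kl:.4f}, cross-entropy - entropy = {decomposed:.4f}")
```

The two printed values agree up to floating-point error, which is the sense in which minimizing a KL divergence implicitly rewards higher entropy in Q.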
Why Entropy Regularization Matters for Robotics
```python
# In ksim PPO training, entropy bonus prevents policy collapse:
loss = policy_loss + value_loss - entropy_coef * entropy

# Kenny's formulation shows this is NOT ad-hoc but principled:
# Entropy regularizer = not being overconfident about predictions
# Biological rationale: know limitations of future predictions
```
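A minimal, framework-free sketch of how the entropy bonus enters the loss. The helper names (`categorical_entropy`, `total_loss`) and all numeric values are illustrative assumptions; none of this is ksim's API.

```python
import math

def categorical_entropy(probs):
    """Entropy (nats) of a categorical action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def total_loss(policy_loss, value_loss, probs, entropy_coef=0.01):
    """Combine the terms as in: policy_loss + value_loss - entropy_coef * entropy."""
    return policy_loss + value_loss - entropy_coef * categorical_entropy(probs)

# Two hypothetical policies with identical policy and value losses.
confident = [0.98, 0.01, 0.01]      # nearly deterministic
uncertain = [1 / 3, 1 / 3, 1 / 3]   # maximum entropy over 3 actions

loss_confident = total_loss(0.5, 0.2, confident)
loss_uncertain = total_loss(0.5, 0.2, uncertain)
print(loss_confident, loss_uncertain)
```

With the other terms held equal, the higher-entropy policy gets the lower loss, so the gradient resists premature collapse onto a single action.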
Mapping to ksim Architecture
| Active Inference Concept | ksim Implementation |
|---|---|
| Hidden Markov Model | (MJX/MuJoCo) |
| Observation distribution | |
| State inference Q(s) | |
| Action inference Q(a) | |
| Mean field factorization | Independent Q(s_t) per timestep |
| Predictive distribution | Policy rollout trajectory |
| VFE minimization | PPO policy gradient |
| EFE/PAD minimization | Value function + entropy bonus |
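
The "State inference Q(s)" row can be illustrated with an exact Bayesian filter on a tiny discrete HMM. This is a generic textbook sketch, not ksim code; the two-state model, the matrices `A` and `B`, and the function name `filter_step` are assumptions chosen for illustration.

```python
def filter_step(belief, transition, likelihood, obs):
    """One predict/update step of a discrete HMM filter.

    belief[i]        : Q(s_{t-1} = i)
    transition[i][j] : P(s_t = j | s_{t-1} = i)
    likelihood[j][o] : P(o_t = o | s_t = j)
    """
    n = len(belief)
    # Predict: push the belief through the dynamics.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Update: weight by the observation likelihood and renormalize.
    unnorm = [predicted[j] * likelihood[j][obs] for j in range(n)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Hypothetical 2-state model: state 0 = foot in contact, state 1 = foot in swing.
A = [[0.9, 0.1],
     [0.2, 0.8]]
B = [[0.8, 0.2],   # in contact: sensor usually reads 0
     [0.3, 0.7]]   # in swing: sensor usually reads 1

belief = [0.5, 0.5]
for obs in [1, 1, 1]:
    belief = filter_step(belief, A, B, obs)
print(belief)
```

After three consecutive readings of 1, the belief concentrates on the swing state, which is the discrete analogue of a state estimator settling on the most plausible latent state.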
Second-Order Behavior Types
1. Reflexive Control (Kenny's "Sufficient" Model)
```python
# Agent predicts proprioceptive sensations → fulfills reflexively
class ReflexiveController:
    """
    Kenny: "If the agent can successfully predict its future sensations,
    it can fulfill them unconsciously via motor reflexes."
    """

    def step(self, predicted_proprio: Array) -> Action:
        # Low-level PD control fulfills proprioceptive predictions
        return self.pd_controller(predicted_proprio, self.current_state)
```
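The reflex that "fulfills" a proprioceptive prediction can be sketched as a PD law tracking a predicted joint position. The gains, the unit inertia, the step size, and the function names below are assumptions for illustration; this is not ksim's actuator model.

```python
def pd_torque(q_target, q, qd, kp=100.0, kd=20.0):
    """PD law: drive joint position q toward the predicted position q_target."""
    return kp * (q_target - q) - kd * qd

def simulate(q_target, steps=1000, dt=0.01, inertia=1.0):
    """Track a constant target with a 1-DoF joint (semi-implicit Euler)."""
    q, qd = 0.0, 0.0
    for _ in range(steps):
        tau = pd_torque(q_target, q, qd)
        qd += (tau / inertia) * dt   # integrate acceleration first
        q += qd * dt                 # then position, using the updated velocity
    return q, qd

q_final, qd_final = simulate(q_target=1.0)
print(q_final, qd_final)
```

The gains are chosen so the continuous-time system is critically damped (`kd = 2 * sqrt(kp)` for unit inertia), so the joint settles on the predicted position without overshoot.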
2. Deliberative Planning (EFE Extension)
```python
# When reflexive prediction fails, engage deliberative inference
class DeliberativeController:
    """
    Extends reflexive control with policy search over trajectories.
    This is where EFE differs from Kenny's PAD formulation.
    """

    def plan(self, beliefs: Distribution, horizon: int) -> Policy:
        # Tree search over policies weighted by expected free energy
        best_policy, best_efe = None, float("inf")
        for policy in self.policy_space:
            efe = self.expected_free_energy(beliefs, policy, horizon)
            # EFE includes mutual information (curiosity/exploration)
            # PAD would use entropy instead (uncertainty awareness)
            if efe < best_efe:
                best_policy, best_efe = policy, efe
        # Simplest selection rule (an assumption): return the lowest-EFE policy
        return best_policy
```
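A one-step sketch of scoring policies by expected free energy, using the commonly cited risk-plus-ambiguity decomposition: the KL divergence from preferred observations plus the expected entropy of the observation model. The two-state model, the preference vector, and the policy names are hypothetical, and each policy is represented only by the state distribution it is assumed to produce.

```python
import math

def predicted_obs(q_states, likelihood):
    """Q(o | pi) = sum_s Q(s | pi) P(o | s)."""
    n_obs = len(likelihood[0])
    return [sum(q_states[s] * likelihood[s][o] for s in range(len(q_states)))
            for o in range(n_obs)]

def expected_free_energy(q_states, likelihood, preferred_obs):
    """One-step EFE as risk + ambiguity (both in nats)."""
    q_obs = predicted_obs(q_states, likelihood)
    # Risk: KL between predicted and preferred observations.
    risk = sum(qo * math.log(qo / po)
               for qo, po in zip(q_obs, preferred_obs) if qo > 0)
    # Ambiguity: expected entropy of the observation model.
    ambiguity = sum(
        q_states[s] * -sum(p * math.log(p) for p in likelihood[s] if p > 0)
        for s in range(len(q_states))
    )
    return risk + ambiguity

# Hypothetical setup: states (upright, fallen), observations (level IMU, tilted IMU).
likelihood = [[0.9, 0.1],
              [0.2, 0.8]]
preferred_obs = [0.95, 0.05]  # the agent "prefers" a level IMU reading

# Predicted state distribution Q(s | pi) under two hypothetical policies.
policies = {
    "stabilize": [0.9, 0.1],
    "lunge": [0.4, 0.6],
}
scores = {name: expected_free_energy(q, likelihood, preferred_obs)
          for name, q in policies.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

The policy that keeps the robot upright scores lower on both terms here, so it is selected. A multi-step planner would sum such terms over the horizon.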
3. Hierarchical Composition
```
Level 3: Goal Selection (minimize long-horizon EFE)
    ↓ sets reference for
Level 2: Trajectory Planning (predictive distribution)
    ↓ sets reference for
Level 1: Reflexive Execution (fulfill proprio predictions)
    ↓ actuates
Level 0: Motor Primitives (PD control, actuator dynamics)
```
GF(3) Balanced Triad
```
active-inference (0) ⊗ kscale-ksim (0) ⊗ mujoco-playground (0) = 0 ✓

All three are ERGODIC: coordination/infrastructure skills.
This is a "resonant triad" where all components coordinate.

For generation (+1), add: skill-creator, algorithmic-art
For verification (-1), add: sheaf-cohomology, code-review
```
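The balance condition can be checked mechanically. The sketch below assumes that `⊗` combines trits by addition modulo 3, which is an interpretation of the notation above rather than something the source spells out; `is_balanced` is a hypothetical helper name.

```python
def is_balanced(trits):
    """True when the trits (each -1, 0, or +1) sum to 0 modulo 3."""
    trits = list(trits)
    if any(t not in (-1, 0, 1) for t in trits):
        raise ValueError("trits must be -1, 0, or +1")
    return sum(trits) % 3 == 0

triad = {"active-inference": 0, "kscale-ksim": 0, "mujoco-playground": 0}
print(is_balanced(triad.values()))
```

Under this reading, a set stays balanced when one generation (+1) skill and one verification (-1) skill are added together, but not when only one of them is added.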
Skill Colors (drand seed 12005093902789493003)
| Skill | Trit | Color | Role |
|---|---|---|---|
| active-inference | 0 | | Coordination (theory) |
| kscale-ksim | 0 | | Coordination (simulation) |
| mujoco-playground | 0 | | Coordination (framework) |
2-3-5-7 Prime Sieve Experts
Applying prime-indexed refinement to identify domain experts:
| Prime | Expert | Domain | Key Contribution |
|---|---|---|---|
| 2 | Patrick Kenny | Active Inference | Mean field formulation, PAD criterion |
| 3 | Thomas Parr | Active Inference | 2022 textbook, EFE derivation |
| 5 | Ben Bolte | K-Scale | ksim architecture, open-source humanoids |
| 7 | Karl Friston | Free Energy Principle | FEP foundations, continuous formulation |
| 11 | (DeepMind team) | MuJoCo Playground | MJX, sim2real zero-shot |
| 13 | Wesley Maa | K-Scale | Tooling, visualization |
Mutual Awareness
This skill references and is referenced by:
```yaml
depends_on:
  - kscale-ksim        # Simulation implementation
  - kscale-ecosystem   # Hardware context
  - mujoco-playground  # Framework foundation

referenced_by:
  - cognitive-superposition          # Team mental models
  - parametrised-optics-cybernetics  # Category theory bridge
  - reafference-corollary-discharge  # Sensorimotor prediction
```
Implementation Pattern
```python
# Unified Active Inference + RL Training Loop
class ActiveInferenceTrainer:
    """
    Combines Kenny's PAD criterion with ksim's PPO.
    """

    def __init__(self, hmm: PhysicsEngine, config: Config):
        self.hmm = hmm
        self.actor = Actor(config)
        self.critic = Critic(config)

    def perception_action_divergence(
        self,
        observations: Array,     # O_{1:t} (past)
        q_future: Distribution,  # Q(S_{t+1:T}, O_{t+1:T})
    ) -> Scalar:
        """
        Kenny's PAD = VFE(past) + KL(future states from HMM)
        """
        # Past: standard VFE on observation history
        vfe_past = self.variational_free_energy(observations)

        # Future: KL divergence of predicted states from HMM
        # Note: Observable emissions cancel out in future KL
        kl_future = self.kl_future_states(q_future, self.hmm)

        return vfe_past + kl_future

    def train_step(self, trajectory: Trajectory) -> Metrics:
        # PPO updates approximate mean field coordinate ascent
        # Entropy bonus provides Kenny's regularization
        return ppo_update(
            self.actor,
            self.critic,
            trajectory,
            entropy_coef=0.01,  # ← The regularizer!
        )
```
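One way the `kl_future_states` term might look for a small discrete model is sketched below. It assumes the mean field factorization from the mapping table (an independent Q(s_t) per timestep) and a Markov chain with a strictly positive transition matrix. The standalone signature, the matrix `A`, and the example beliefs are hypothetical; this is neither ksim code nor a transcription of Kenny's derivation.

```python
import math

def kl_future_states(q_future, q_now, transition):
    """KL( prod_k q_k(s_k) || Markov chain started from q_now ), in nats.

    q_future[k][j]   : mean-field marginal over the state k+1 steps ahead
    q_now[i]         : current belief over states (shared by both distributions)
    transition[i][j] : P(s' = j | s = i), assumed strictly positive
    """
    kl, prev = 0.0, q_now
    for q_t in q_future:
        # Negative entropy of the factor q_t.
        kl += sum(q * math.log(q) for q in q_t if q > 0)
        # Expected log transition probability under prev(s) * q_t(s').
        kl -= sum(
            prev[i] * q_t[j] * math.log(transition[i][j])
            for i in range(len(prev))
            for j in range(len(q_t))
        )
        prev = q_t
    return kl

A = [[0.9, 0.1],
     [0.2, 0.8]]
q_now = [1.0, 0.0]
persistent = [[0.9, 0.1], [0.9, 0.1]]   # stays mostly in state 0
switching = [[0.1, 0.9], [0.9, 0.1]]    # flips back and forth

kl_persistent = kl_future_states(persistent, q_now, A)
kl_switching = kl_future_states(switching, q_now, A)
print(kl_persistent, kl_switching)
```

Predictions that respect the model's dynamics incur a small divergence, while predictions that fight the dynamics are penalized heavily. The negative-entropy term in each step is where the entropy regularizer shows up.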
References
- Kenny (2025) Active Inference from First Principles
- Parr, Pezzulo, Friston (2022) Active Inference Textbook
- ActInf ModelStream 019.1 - Jan 15, 2026
- K-Scale Labs GitHub
- MuJoCo Playground
- Ben Bolte's Blog
ACSet Schema
```julia
@present SchActiveInferenceRobotics(FreeSchema) begin
    # Objects
    HMM::Ob          # Hidden Markov Model (generative model)
    State::Ob        # Latent state
    Observation::Ob  # Sensory observation
    Action::Ob       # Motor command
    Policy::Ob       # Action sequence

    # Morphisms (inference)
    perceive::Hom(Observation, State)        # Perception: O → S
    predict::Hom(State, Observation)         # Prediction: S → O
    act::Hom(State, Action)                  # Action selection: S → A
    transition::Hom(State × Action, State)   # Dynamics: S × A → S'

    # Attributes
    FreeEnergy::AttrType
    vfe::Attr(State, FreeEnergy)   # Variational free energy
    efe::Attr(Policy, FreeEnergy)  # Expected free energy
    pad::Attr(Policy, FreeEnergy)  # Perception/action divergence

    # The key relationship (Kenny's contribution):
    # pad ≈ efe + entropy_regularizer
end
```
SDF Interleaving
This skill connects to Software Design for Flexibility (Hanson & Sussman, 2021):
Primary Chapter: 10. Adventure Game Example
Concepts: autonomous agent, game, synthesis
GF(3) Balanced Triad
active-inference-robotics (+) + SDF.Ch10 (+) + [balancer] (+) = 0
Skill Trit: 1 (PLUS - generation)
Secondary Chapters
- Ch3: Variations on an Arithmetic Theme
- Ch4: Pattern Matching
Connection Pattern
Adventure games synthesize techniques. This skill integrates multiple patterns.