Claude-skill-registry ksim-rl
RL training library for humanoid locomotion and manipulation built on MuJoCo and JAX. Provides PPO, AMP, and custom task abstractions for sim-to-real robotics policy training.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/ksim-rl" ~/.claude/skills/majiayu000-claude-skill-registry-ksim-rl && rm -rf "$T"
manifest: skills/data/ksim-rl/SKILL.md
source content
KSIM-RL Skill
Trit: -1 (MINUS - analysis/verification)
Color: #3A2F9E (Deep Purple)
URI: skill://ksim-rl#3A2F9E
Overview
KSIM is K-Scale Labs' reinforcement learning library for humanoid robot locomotion and manipulation. It is built on MuJoCo for physics simulation and on JAX for hardware-accelerated training.
Core Architecture
    ┌─────────────────────────────────────────────────────────────────┐
    │                        KSIM ARCHITECTURE                        │
    ├─────────────────────────────────────────────────────────────────┤
    │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
    │  │   RLTask    │  │   PPOTask   │  │         AMPTask         │  │
    │  │ (abstract)  │──│ (PPO impl)  │──│  (Adversarial Motion)   │  │
    │  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
    │                                │                                │
    │                                ▼                                │
    │ ┌─────────────────────────────────────────────────────────────┐ │
    │ │                        PhysicsEngine                        │ │
    │ │   ┌───────────────┐    ┌───────────────────────────────┐    │ │
    │ │   │ MujocoEngine  │    │  MjxEngine (JAX-accelerated)  │    │ │
    │ │   └───────────────┘    └───────────────────────────────┘    │ │
    │ └─────────────────────────────────────────────────────────────┘ │
    │                                │                                │
    │                                ▼                                │
    │ ┌─────────────────────────────────────────────────────────────┐ │
    │ │                   Environment Components                    │ │
    │ │  • Actuators: Position, Velocity, Torque control            │ │
    │ │  • Observations: Joint states, IMU, local view              │ │
    │ │  • Rewards: Velocity tracking, gait, energy, stability      │ │
    │ │  • Terminations: Fall detection, boundary violations        │ │
    │ └─────────────────────────────────────────────────────────────┘ │
    └─────────────────────────────────────────────────────────────────┘
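The component layer in the diagram can be read as a per-step pipeline: reward terms are scaled and summed, and termination conditions are combined with a logical OR. The sketch below illustrates that composition in plain Python. `RewardTerm`, `step_reward`, `should_terminate`, the state keys, and the fall threshold are all hypothetical names and values chosen for illustration; they are not KSIM's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]  # hypothetical flat state: name -> scalar


@dataclass
class RewardTerm:
    """A named reward function with a scale (a negative scale acts as a penalty)."""
    name: str
    fn: Callable[[State], float]
    scale: float


def step_reward(terms: List[RewardTerm], state: State) -> float:
    """Total reward is the scaled sum of all terms."""
    return sum(t.scale * t.fn(state) for t in terms)


def should_terminate(checks: List[Callable[[State], bool]], state: State) -> bool:
    """The episode ends as soon as any termination condition fires."""
    return any(check(state) for check in checks)


terms = [
    RewardTerm("velocity_tracking", lambda s: -abs(s["vx"] - s["vx_cmd"]), 1.0),
    RewardTerm("energy", lambda s: s["torque_sq"], -0.01),
]
checks = [lambda s: s["base_height"] < 0.4]  # fall detection; threshold is assumed

state = {"vx": 0.8, "vx_cmd": 1.0, "torque_sq": 50.0, "base_height": 0.9}
total = step_reward(terms, state)
done = should_terminate(checks, state)
```

Keeping each term as a separate object with its own scale makes it easy to log per-term contributions or to retune one scale without touching the others.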
Key Features
- JAX-Accelerated: Uses MJX for parallel environment simulation on GPU/TPU
- PPO Training: Proximal Policy Optimization with configurable hyperparameters
- AMP Support: Adversarial Motion Priors for realistic humanoid locomotion
- Modular Rewards: Composable reward functions for gait, velocity, energy
- Domain Randomization: Built-in randomizers for sim-to-real transfer
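For readers unfamiliar with the PPO objective mentioned above, here is a minimal sketch of the standard clipped surrogate loss in plain Python. It is written for readability rather than speed and is not KSIM's implementation; the function name `ppo_clip_loss` and the default `clip_eps=0.2` are assumptions made for this example.

```python
import math
from typing import Sequence


def ppo_clip_loss(
    log_probs_new: Sequence[float],
    log_probs_old: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,
) -> float:
    """Mean clipped surrogate loss (negated objective, so lower is better)."""
    total = 0.0
    for lp_new, lp_old, adv in zip(log_probs_new, log_probs_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped estimates.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


# The ratio doubles on a positive-advantage sample: the gain is capped at 1 + clip_eps.
capped = ppo_clip_loss([math.log(2.0)], [0.0], [1.0])
# The same ratio on a negative-advantage sample is not capped.
uncapped = ppo_clip_loss([math.log(2.0)], [0.0], [-1.0])
```

In a JAX training loop the same arithmetic would typically be expressed on whole arrays instead of a Python loop.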
API Usage
    import ksim
    from ksim import PPOTask, MjxEngine
    from ksim.tasks.humanoid import HumanoidWalkingTask

    # Define custom task
    class KBotWalkingTask(PPOTask):
        model_path = "kbot.mjcf"

        # Observations
        observations = [
            ksim.JointPosition(),
            ksim.JointVelocity(),
            ksim.IMUAngularVelocity(),
            ksim.BaseOrientation(),
        ]

        # Rewards
        rewards = [
            ksim.LinearVelocityReward(scale=1.0),
            ksim.GaitPhaseReward(scale=0.5),
            ksim.EnergyPenalty(scale=-0.01),
        ]

        # Actuators
        actuators = [
            ksim.PositionActuator(
                joint_name=".*",
                kp=100.0,
                kd=10.0,
                action_scale=0.5,
            )
        ]

    # Train
    task = KBotWalkingTask()
    task.run_training(
        num_envs=4096,
        num_steps=1000000,
        learning_rate=3e-4,
    )
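The `kp`, `kd`, and `action_scale` arguments in the actuator above suggest a PD position controller. A minimal sketch of that control law follows, assuming the policy action is scaled and added to a default joint position to form the target. The function `pd_torque` and the `default_pos` parameter are hypothetical and only illustrate the arithmetic; how KSIM actually maps actions to targets is not stated here.

```python
def pd_torque(
    action: float,
    joint_pos: float,
    joint_vel: float,
    kp: float = 100.0,
    kd: float = 10.0,
    action_scale: float = 0.5,
    default_pos: float = 0.0,  # assumed neutral joint position
) -> float:
    """PD law: torque = kp * (target - q) - kd * q_dot."""
    target = default_pos + action_scale * action
    return kp * (target - joint_pos) - kd * joint_vel


# A unit action on a joint at rest yields a target of 0.5 rad.
push = pd_torque(action=1.0, joint_pos=0.0, joint_vel=0.0)
# A zero action on a moving, displaced joint is pulled back toward the default.
restore = pd_torque(action=0.0, joint_pos=0.1, joint_vel=1.0)
```

Under these assumptions, a smaller `action_scale` limits how far the policy can move the target per step, which tends to give smoother motion at the cost of reach.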
GF(3) Triads
This skill participates in balanced triads:
    ksim-rl (-1) ⊗ kos-firmware (+1) ⊗ mujoco-scenes (0) = 0 ✓
    ksim-rl (-1) ⊗ kos-firmware (+1) ⊗ urdf2mjcf (-1) = needs balancing
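The balance check can be sketched in a few lines, assuming that "balanced" means the three trits sum to 0 modulo 3 (the additive reading used in the `+` form later in this document). The function name `is_balanced` is hypothetical.

```python
from typing import Iterable


def is_balanced(trits: Iterable[int]) -> bool:
    """A triad is balanced when its trits sum to 0 modulo 3."""
    return sum(trits) % 3 == 0


balanced = is_balanced([-1, +1, 0])     # MINUS, PLUS, and a neutral skill
unbalanced = is_balanced([-1, +1, -1])  # two MINUS skills leave a remainder
```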
Key Contributors
- codekansas (Ben Bolte): Core architecture, PPO, rewards
- b-vm: Randomizers, disturbances, policy training
- carlosdp: Adaptive KL, action scaling
- WT-MM: Visualization, markers
Related Skills
- kos-firmware (+1): Robot firmware and gRPC services
- mujoco-scenes (0): Scene composition for MuJoCo
- evla-vla (-1): Vision-language-action models
- urdf2mjcf (-1): URDF to MJCF conversion
- ktune-sim2real (-1): Servo tuning for sim2real
References
    @misc{ksim2024,
      title={K-Sim: RL Training for Humanoid Locomotion},
      author={K-Scale Labs},
      year={2024},
      url={https://github.com/kscalelabs/ksim}
    }
SDF Interleaving
This skill connects to Software Design for Flexibility (Hanson & Sussman, 2021):
Primary Chapter: 5. Evaluation
Concepts: eval, apply, interpreter, environment
GF(3) Balanced Triad
ksim-rl (○) + SDF.Ch5 (−) + [balancer] (+) = 0
Skill Trit: 0 (ERGODIC - coordination)
Secondary Chapters
- Ch2: Domain-Specific Languages
Connection Pattern
Evaluation interprets expressions. This skill processes or generates evaluable forms.