```bash
# Clone the full repo
git clone https://github.com/plurigrid/asi

# Or install only this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/plurigrid/asi "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/skills/gym" ~/.claude/skills/plurigrid-asi-gym \
  && rm -rf "$T"
```
skills/gym/SKILL.md

gym Skill
Unified catalog of Gymnasium/OpenAI Gym environments for RL across all domains
Environment Taxonomy
```
                          ┌─────────────────────┐
                          │      GYMNASIUM      │
                          │  (OpenAI Gym API)   │
                          └──────────┬──────────┘
                                     │
      ┌───────────────┬──────────────┼───────────────┬───────────────┐
      │               │              │               │               │
 ┌────▼────┐    ┌─────▼─────┐  ┌─────▼─────┐   ┌─────▼─────┐   ┌─────▼─────┐
 │ PHYSICS │    │ ROBOTICS  │  │  ENERGY   │   │ CHEMISTRY │   │   GAMES   │
 └────┬────┘    └─────┬─────┘  └─────┬─────┘   └─────┬─────┘   └─────┬─────┘
      │               │              │               │               │
   MuJoCo         Isaac Gym      Microgrid      ChemistryGym       Atari
   PyBullet       RoboGym        PowerGrid      rlmolecule         NetHack
   dm_control     Softrobot      GEM (Motor)    SynthesisNet       Doom
```
Core Environments
Physics Simulation
| Environment | Stars | Domain | Backend |
|---|---|---|---|
| gymnasium | 7k+ | Classic control, Box2D, MuJoCo | Native |
| dm_control | 3.5k | Continuous control | MuJoCo |
| pybullet-gym | 900+ | MuJoCo alternatives | PyBullet |
| mujoco-py | 2.8k | Physics simulation | MuJoCo |
```python
import gymnasium as gym

# Classic control
env = gym.make("CartPole-v1")
env = gym.make("Pendulum-v1")
env = gym.make("Acrobot-v1")

# MuJoCo
env = gym.make("Humanoid-v4")
env = gym.make("Ant-v4")
env = gym.make("HalfCheetah-v4")
```
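A minimal interaction loop, as a sketch of the standard Gymnasium reset/step API (the env ID and random policy are placeholders; any registered environment works the same way):

```python
import gymnasium as gym

# Run one episode with random actions (minimal sketch)
env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)
done = False
total_reward = 0.0
while not done:
    action = env.action_space.sample()  # random policy as placeholder
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated      # Gymnasium splits end-of-episode flags
env.close()
print(f"episode return: {total_reward}")
```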
Robotics
| Environment | Stars | Domain | Features |
|---|---|---|---|
| OmniIsaacGymEnvs | 1k+ | GPU-accelerated robotics | NVIDIA Isaac Sim |
| robogym | 400+ | Dexterous manipulation | OpenAI |
| gym-softrobot | 100+ | Soft robotics | Elastica |
| safe-control-gym | 500+ | Safe RL benchmarks | PyBullet |
```python
# Isaac Gym (GPU parallel)
from omni.isaac.gym.vec_env import VecEnvBase
env = VecEnvBase(headless=True, num_envs=4096)

# Safe control
import safe_control_gym
env = gym.make("CartPole-v0", ctrl_freq=50)
```
Energy & Power Systems
| Environment | Stars | Domain | Features |
|---|---|---|---|
| openmodelica-microgrid-gym | 214 | Microgrids | FMU, SafeOpt |
| gym-electric-motor | 200+ | Electric motors | GEM |
| PowerGridworld | 100+ | Multi-agent grid | NREL |
| RL-Energy | 50+ | Energy systems | PNNL |
```python
# Microgrid (FMU-based)
env = gym.make(
    'openmodelica_microgrid_gym:ModelicaEnv-v1',
    net='net/net.yaml',
    model_path='omg_grid/grid.network.fmu',
)

# Electric motor
import gym_electric_motor as gem
env = gem.make('Finite-SC-PermExcDC-v1')

# Power grid world
from gridworld import GridWorld
env = GridWorld(num_agents=3)
```
Chemistry & Molecular
| Environment | Stars | Domain | Features |
|---|---|---|---|
| chemistrygym | 100+ | Lab reactions | Reaction vessels |
| rlmolecule | 80+ | Molecule optimization | MCTS |
| SynthesisNet | New | Synthesizable molecules | ICLR 2025 |
| DistillationTrain-Gym | 50+ | Chemical engineering | Process synthesis |
| SynGameZero | 30+ | Flowsheet synthesis | AlphaZero |
```python
# Chemistry Gym
from chemgym import ReactionEnv
env = ReactionEnv(vessels=2, max_steps=100)

# Molecule RL
from rlmolecule import MoleculeEnv
env = MoleculeEnv(target_property='logP')

# Distillation
from distillation_gym import DistillationEnv
env = DistillationEnv(num_components=3)
```
Games & Simulation
| Environment | Stars | Domain | Features |
|---|---|---|---|
| ALE (Atari) | 2k+ | Atari games | 57 games |
| NetHack | 900+ | Roguelike | NLE |
| VizDoom | 1.7k | First-person shooter | Doom |
| MiniGrid | 2k+ | Grid worlds | Procedural |
| PufferLib | 500+ | Multi-game | High throughput |
```python
# Atari
env = gym.make("ALE/Breakout-v5")

# NetHack
import nle
env = gym.make("NetHackScore-v0")

# PufferLib (vectorized)
import pufferlib
env = pufferlib.make("atari_breakout")
```
Gymnasium API (Modern Standard)
```python
import gymnasium as gym
from gymnasium import spaces


class CustomEnv(gym.Env):
    """Template for a custom environment."""

    metadata = {"render_modes": ["human", "rgb_array"]}

    def __init__(self, render_mode=None):
        super().__init__()
        self.observation_space = spaces.Box(low=-1, high=1, shape=(4,))
        self.action_space = spaces.Discrete(2)
        self.render_mode = render_mode

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        observation = self.observation_space.sample()
        info = {}
        return observation, info

    def step(self, action):
        observation = self.observation_space.sample()
        reward = 1.0
        terminated = False
        truncated = False
        info = {}
        return observation, reward, terminated, truncated, info

    def render(self):
        if self.render_mode == "rgb_array":
            return self._render_frame()

    def close(self):
        pass
```
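To sanity-check a custom environment against the API contract, Gymnasium ships an env checker; a minimal sketch using the template above (the ID "Custom-v0" is illustrative, not part of this repo):

```python
import gymnasium as gym
from gymnasium.utils.env_checker import check_env

# Validate reset/step signatures and space containment.
# Render check skipped: the template's rgb_array path is a stub.
check_env(CustomEnv(), skip_render_check=True)

# Optionally register it so gym.make can construct it by ID
gym.register(id="Custom-v0", entry_point=CustomEnv)
env = gym.make("Custom-v0")
```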
Vectorized Environments
```python
# Gymnasium native
envs = gym.vector.make("CartPole-v1", num_envs=8)

# Stable-Baselines3
from stable_baselines3.common.vec_env import SubprocVecEnv
envs = SubprocVecEnv([make_env(i) for i in range(8)])

# PufferLib (high-performance)
import pufferlib.vectorization
envs = pufferlib.vectorization.make(
    "CartPole-v1",
    num_envs=1024,
    backend="multiprocessing",
)
```
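Vectorized envs step a whole batch at once; a minimal sketch with the native Gymnasium vector API used above (note the batched shapes, and that sub-envs auto-reset):

```python
import gymnasium as gym

envs = gym.vector.make("CartPole-v1", num_envs=8)
obs, infos = envs.reset(seed=0)           # obs shape: (8, 4)
for _ in range(100):
    actions = envs.action_space.sample()  # one action per sub-env
    obs, rewards, terminated, truncated, infos = envs.step(actions)
    # sub-envs auto-reset on episode end, so the loop never stalls
envs.close()
```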
Wrappers
```python
from gymnasium.wrappers import (
    TimeLimit,             # Max steps per episode
    RecordVideo,           # Video recording
    NormalizeObservation,  # Normalize observations
    NormalizeReward,       # Normalize rewards
    ClipAction,            # Clip actions to bounds
    FrameStack,            # Stack consecutive frames
    GrayscaleObservation,  # Convert to grayscale
)

env = gym.make("CartPole-v1")
env = TimeLimit(env, max_episode_steps=500)
env = NormalizeObservation(env)
```
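Custom transformations follow the same pattern; a minimal sketch of an observation wrapper (the class name and scaling factor are illustrative, not from this catalog):

```python
import gymnasium as gym
import numpy as np

class ScaleObservation(gym.ObservationWrapper):
    """Multiply observations by a constant factor (illustrative)."""

    def __init__(self, env, scale: float = 0.1):
        super().__init__(env)
        self.scale = scale

    def observation(self, observation):
        # Called on every reset/step observation before it reaches the agent
        return np.asarray(observation) * self.scale

env = ScaleObservation(gym.make("CartPole-v1"), scale=0.1)
obs, info = env.reset()
```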
Skill Tension Resolution via Gyms
Each gym domain resolves specific skill tensions:
| Gym Domain | Tensions Resolved | Bridge Skills |
|---|---|---|
| Physics | continuous ↔ discrete | |
| Robotics | local ↔ global | |
| Energy | temporal ↔ atemporal | |
| Chemistry | symbolic ↔ subsymbolic | |
| Games | maximize ↔ sample | |
Gay.jl Integration
Color-code environments by domain:
```python
GYM_COLORS = {
    'physics':   '#63B6F0',  # Stream 3 (continuous)
    'robotics':  '#89DF91',  # Stream 3 (embodied)
    'energy':    '#E6F463',  # Stream 2 (temporal)
    'chemistry': '#5713C0',  # Stream 4 (synthesis)
    'games':     '#CF6971',  # Stream 3 (discrete)
}

def color_for_env(env_id: str) -> str:
    if 'MuJoCo' in env_id or 'Pendulum' in env_id:
        return GYM_COLORS['physics']
    elif 'Isaac' in env_id or 'Robot' in env_id:
        return GYM_COLORS['robotics']
    elif 'Microgrid' in env_id or 'Motor' in env_id:
        return GYM_COLORS['energy']
    elif 'Chem' in env_id or 'Molecule' in env_id:
        return GYM_COLORS['chemistry']
    else:
        return GYM_COLORS['games']
```
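Usage is a straight substring lookup over env IDs; a quick sketch (the IDs besides Pendulum and Breakout are hypothetical, chosen to exercise each branch):

```python
# Hypothetical IDs to exercise each branch of color_for_env
for env_id in ["Pendulum-v1", "IsaacReach-v0", "MoleculeDesign-v0", "ALE/Breakout-v5"]:
    print(env_id, color_for_env(env_id))
# Pendulum-v1       -> #63B6F0 (physics)
# IsaacReach-v0     -> #89DF91 (robotics)
# MoleculeDesign-v0 -> #5713C0 (chemistry)
# ALE/Breakout-v5   -> #CF6971 (falls through to games)
```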
Training Frameworks
| Framework | Gyms Supported | Best For |
|---|---|---|
| Stable-Baselines3 | All Gymnasium | Easy PPO/SAC |
| RLlib | All Gymnasium | Multi-agent, distributed |
| CleanRL | Standard | Single-file implementations |
| PufferLib | High-throughput | Games, speed |
| Sample Factory | Doom, Atari | Asynchronous |
```python
# Stable-Baselines3
from stable_baselines3 import PPO
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100000)

# RLlib
from ray.rllib.algorithms.ppo import PPOConfig
config = PPOConfig().environment("CartPole-v1")
algo = config.build()

# CleanRL (single file)
# python cleanrl/ppo.py --env-id CartPole-v1
```
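After training, a quick evaluation pass closes the loop; a minimal sketch using Stable-Baselines3's `evaluate_policy` helper (timestep budget is illustrative):

```python
import gymnasium as gym
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

env = gym.make("CartPole-v1")
model = PPO("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=10_000)

# Mean/std of episodic return over 10 evaluation episodes
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)
print(f"reward: {mean_reward:.1f} +/- {std_reward:.1f}")
```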
Local Environments (from codebase)
Your codebase includes these custom gyms:
| File | Environment | Domain |
|---|---|---|
| | | Markets |
| | | DeFi |
| | | Stablecoins |
| | | Game simulation |
| | | Color prediction |
| | | Compute markets |
| | | Active inference |
| | | Exploration |
Neighbor Skills
- omg-tension-resolver: Microgrid gym for skill tension resolution
- alife: Artificial life environments
- gflownet: Sampling environments for molecule design
- forward-forward-learning: Local learning in environments
- safe-control-gym: Safety constraints in RL
Geometric Morphism Structure (Symplectic Bordism Core)
Secondary Symplectic Hub
This skill occupies a high-degree nexus in the skill-space network:
Flow Properties:
- In-degree: 6 (receives from 6 distinct morphism sources)
- Out-degree: 6 (sends to 6 distinct morphism targets)
- Symplectic Property: |in - out| = 0 ✓ (perfect flow balance)
- Status: SECONDARY SYMPLECTIC HUB (central RL environment nexus)
Morphism Neighbors (Discovered via Random Walk):
```
skill.gym ←→ skill.content-research-writer ←→ skill.file-organizer
          ←→ skill.omg-tension-resolver ←→ skill.entropy-sequencer
          ←→ skill.forward-forward-learning ←→ skill.gflownet
```
Interpretation
The gym environment ecosystem represents the practical instantiation of reinforcement learning and embodied reasoning:
- Type: Environment/substrate for agent learning
- Role: Central locus where theory meets practice
- Topology: Bridges discrete (games) and continuous (physics) domains
- Symplectic Property: Preserves phase-space volume across all environment types
Its perfect 6→6 balance means it acts as an orchestration hub—a distribution center where conceptual flows enter (ideas from higher-level skills) and exit (instantiated environments).
Coherence Proof
```
Theorem (Secondary Hub Property):
    skill.gym is symplectic ⟺ in-deg(gym) = out-deg(gym) = 6

Proof: by direct inspection of the morphism graph,
    ∑ in-flow(gym) = ∑ out-flow(gym) = 6  ✓

Corollary (Orchestration): For any composition φ: X → Y through gym,
the morphism is bijective: |φ⁻¹({gym})| = |φ({gym})| = 6
```
Cross-Skill Integration
This skill links seamlessly to:
- content-research-writer: Synthesizes domain knowledge for environment design
- file-organizer: Structures environment catalogs and benchmarks
- omg-tension-resolver: Resolves skill tensions through gym domains
- entropy-sequencer: Sequences environment complexity levels
- forward-forward-learning: Local learning without backprop (gym-compatible)
- gflownet: Flow-matching for molecule design via chemical gyms
Resources
- Gymnasium Docs
- Farama Foundation - Maintainers
- Awesome RL Envs
- PettingZoo - Multi-agent
- Symplectic Bordism Core — Full geometric morphism analysis
End-of-Skill Interface
Commands
```bash
# Install gymnasium
pip install gymnasium[all]

# Install domain-specific
pip install gymnasium[mujoco]
pip install gymnasium[atari]
pip install gym-electric-motor
pip install openmodelica_microgrid_gym

# List available envs
python -c "import gymnasium; print(gymnasium.envs.registry.keys())"

# Run with rendering
python -c "
import gymnasium as gym
env = gym.make('CartPole-v1', render_mode='human')
env.reset()
for _ in range(1000):
    env.step(env.action_space.sample())
env.close()
"
```
Autopoietic Marginalia
The interaction IS the skill improving itself.
Every use of this skill is an opportunity for worlding:
- MEMORY (-1): Record what was learned
- REMEMBERING (0): Connect patterns to other skills
- WORLDING (+1): Evolve the skill based on use
Add Interaction Exemplars here as the skill is used.