# Awesome-Agent-Skills-for-Empirical-Research: vmas-simulator-guide

Vectorized multi-agent reinforcement learning simulator.

## Install

Clone the upstream repo:

```bash
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
```

Claude Code · Install into `~/.claude/skills/`:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/domains/ai-ml/vmas-simulator-guide" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-vmas-simulator-gu && rm -rf "$T"
```

Manifest: `skills/43-wentorai-research-plugins/skills/domains/ai-ml/vmas-simulator-guide/SKILL.md`
## VMAS: Vectorized Multi-Agent Simulator Guide

### Overview

VMAS is a vectorized simulator for multi-agent reinforcement learning (MARL) that runs thousands of parallel environments on GPU via PyTorch. It provides a diverse set of 2D cooperative, competitive, and mixed scenarios for benchmarking multi-agent algorithms. Because physics and rewards are computed as batched tensor operations, it is orders of magnitude faster than CPU-based simulators, enabling rapid research iteration on multi-agent coordination problems.
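The speedup comes from batching every environment's step into single tensor ops, so throughput grows with `num_envs` until the GPU saturates. Below is a minimal, hypothetical timing harness (not from the VMAS docs) to observe this on your own hardware:

```python
import time

import vmas

# Hypothetical micro-benchmark: total env-steps/s at several batch sizes.
# For strict GPU timing, add torch.cuda.synchronize() before reading the clock.
for num_envs in (32, 512, 4096):
    env = vmas.make_env("transport", num_envs=num_envs, device="cuda")
    env.reset()
    start = time.perf_counter()
    for _ in range(100):
        actions = [env.get_random_action(agent) for agent in env.agents]
        env.step(actions)
    elapsed = time.perf_counter() - start
    print(f"num_envs={num_envs}: {100 * num_envs / elapsed:,.0f} env-steps/s")
```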
### Installation

```bash
pip install vmas
```
### Quick Start

```python
import vmas

# Create vectorized environment
env = vmas.make_env(
    scenario="simple_spread",
    num_envs=1024,            # Parallel environments
    device="cuda",            # GPU acceleration
    continuous_actions=True,
    n_agents=3,               # Scenario kwarg, forwarded to make_world
)

# Environment loop
obs = env.reset()
for step in range(100):
    # Random actions for demonstration
    actions = [env.get_random_action(agent) for agent in env.agents]
    obs, rewards, dones, infos = env.step(actions)
    # obs: list of [num_envs, obs_dim] tensors
    # rewards: list of [num_envs] tensors
```
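Because observations come back as a per-agent list of batched tensors, a parameter-shared policy can consume them after a single stack. The stacking convention below is our own, not something VMAS mandates, and it assumes homogeneous agents:

```python
import torch

# obs: list of n_agents tensors from the loop above, each [num_envs, obs_dim]
stacked = torch.stack(obs)                      # [n_agents, num_envs, obs_dim]
batch = stacked.reshape(-1, stacked.shape[-1])  # [n_agents * num_envs, obs_dim]
```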
### Scenarios
| Scenario | Type | Agents | Description |
|---|---|---|---|
| simple_spread | Cooperative | 3 | Cover N landmarks |
| simple_tag | Competitive | 4 | Predator-prey |
| transport | Cooperative | 4 | Move package to goal |
| wheel | Cooperative | 4 | Coordination on wheel |
| flocking | Cooperative | 5+ | Reynolds flocking |
| discovery | Cooperative | 3 | Explore and discover |
| navigation | Mixed | N | Multi-agent navigation |
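Scenario-specific options (number of agents, object properties, and so on) pass straight through `vmas.make_env` as extra keyword arguments to the scenario's `make_world`. The exact kwarg names vary per scenario, so treat `n_agents` in this sketch as the common but not universal example:

```python
import vmas

# Extra kwargs to make_env are forwarded to the scenario's make_world
env = vmas.make_env(
    scenario="flocking",
    num_envs=1024,
    device="cuda",
    n_agents=8,  # scenario kwarg; check the scenario source for supported options
)
```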
### Integration with MARL Libraries

```python
# With TorchRL
from torchrl.envs import VmasEnv

env = VmasEnv(
    scenario="simple_spread",
    num_envs=512,
    device="cuda",
)

# With RLlib: VMAS provides an RLlib-compatible wrapper,
# selected via make_env's wrapper argument
import vmas

rllib_env = vmas.make_env("simple_spread", num_envs=512, wrapper="rllib")

# With CleanRL / custom training
env = vmas.make_env("transport", num_envs=2048, device="cuda")
obs = env.reset()
# All tensors stay on GPU: train directly without CPU transfer
policy_output = policy_network(obs[0])  # Agent 0 observations; policy_network is your model
```
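As a usage sketch for the TorchRL path: `VmasEnv` speaks TorchRL's TensorDict interface, so collecting a random-policy rollout is a single call:

```python
from torchrl.envs import VmasEnv

env = VmasEnv(scenario="simple_spread", num_envs=512, device="cuda")
env.reset()
rollout = env.rollout(max_steps=100)  # random policy when none is given
print(rollout.batch_size)             # torch.Size([512, 100]): (num_envs, steps)
```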
### Custom Scenarios

```python
import torch
import vmas
from vmas.simulator.core import Agent, Landmark, World
from vmas.simulator.scenario import BaseScenario


class MyScenario(BaseScenario):
    def make_world(self, batch_dim, device, **kwargs):
        world = World(batch_dim, device)
        world.add_agent(Agent(name="agent_0"))
        world.add_agent(Agent(name="agent_1"))
        world.add_landmark(Landmark(name="goal"))
        return world

    def reset_world_at(self, env_index=None):
        # Randomize positions in [-1, 1]^2, for one env or all envs if env_index is None
        for agent in self.world.agents:
            agent.set_pos(
                torch.zeros(
                    (1, self.world.dim_p)
                    if env_index is not None
                    else (self.world.batch_dim, self.world.dim_p),
                    device=self.world.device,
                ).uniform_(-1.0, 1.0),
                batch_index=env_index,
            )

    def reward(self, agent):
        # Negative distance to the goal landmark
        goal = self.world.landmarks[0]
        return -torch.linalg.norm(agent.state.pos - goal.state.pos, dim=-1)

    def observation(self, agent):
        # Own position plus relative goal position
        goal = self.world.landmarks[0]
        return torch.cat([agent.state.pos, goal.state.pos - agent.state.pos], dim=-1)


# Pass the scenario instance directly to make_env
env = vmas.make_env(MyScenario(), num_envs=512)
```
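The custom scenario then behaves like any built-in one; a short usage sketch (rendering assumes the optional rendering dependencies are installed):

```python
obs = env.reset()
for _ in range(200):
    actions = [env.get_random_action(agent) for agent in env.agents]
    obs, rewards, dones, infos = env.step(actions)
env.render()  # interactive window, useful when debugging a new scenario
```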
### Use Cases
- MARL research: Benchmark multi-agent algorithms
- Cooperative learning: Study emergent coordination
- Scalability testing: GPU-accelerated parallel training
- Custom scenarios: Design domain-specific multi-agent tasks
- Education: Teach multi-agent RL concepts