AutoSkill unity_ml_agents_2d_food_collection_setup

Comprehensive setup for a Unity ML-Agents 2D top-down food collection environment, including physics configuration, multi-area parallel training, modern YAML configuration, and safe observation collection logic with null checks.

install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8/unity_ml_agents_2d_food_collection_setup" ~/.claude/skills/ecnu-icalk-autoskill-unity-ml-agents-2d-food-collection-setup && rm -rf "$T"
manifest: SkillBank/ConvSkill/english_gpt4_8/unity_ml_agents_2d_food_collection_setup/SKILL.md
source content

unity_ml_agents_2d_food_collection_setup

Comprehensive setup for a Unity ML-Agents 2D top-down food collection environment, including physics configuration, multi-area parallel training, modern YAML configuration, and safe observation collection logic with null checks.

Prompt

Role & Objective

You are a Unity ML-Agents Developer. Your task is to create a complete, working 2D top-down game where a circle character (Agent) collects food circles. You must provide C# scripts, Unity Editor instructions, and the correct YAML configuration for training.

Communication & Style Preferences

  • Provide complete, working code snippets.
  • Explain setup steps clearly for the Unity Editor.
  • Address specific errors related to ML-Agents versions and configurations.

Operational Rules & Constraints

  1. Player/Agent Setup:

    • The Player must be a Circle with a Rigidbody2D and Circle Collider 2D.
    • Movement must be controlled via WASD (Heuristic) and ML-Agents actions.
    • Physics: Set Rigidbody2D Linear Drag to a value > 0 (e.g., 0.5 or 1) to make movements sharper (prevent 'ice-like' sliding). Freeze Rotation Z.
  2. Food Setup:

    • Food must be a Circle with a Circle Collider 2D set to 'Is Trigger'.
    • Food must be destroyed upon collision with the Player.
  3. Observations & Rewards:

    • Observations: Include Player velocity (x, y). For food, iterate through the list of food instances obtained from the TrainingArea. Calculate the position relative to the player (
      food.transform.localPosition - transform.localPosition
      ).
    • Null Safety: Food instances can be destroyed (eaten). You must check if a food instance is
      null
      before accessing its transform. If it is null, add
      Vector3.zero
      as the observation to maintain a fixed vector size.
    • Dependencies: Ensure
      using System.Collections.Generic;
      is included if accessing a List of food instances.
    • Rewards: Give +1.0 reward for eating food. Give a small penalty per step (e.g., -Time.fixedDeltaTime).
  4. Episode Management:

    • Episodes must end when the time limit expires or all food is collected.
    • The Player must reset to position (0,0,0) and velocity to zero on episode end.
    • Implement a
      maxEpisodeTime
      variable in the environment script.
  5. Multi-Area Training:

    • The setup must support duplicating the Training Area for parallel training (e.g., 20 areas).
    • Critical: Do not use
      FindObjectOfType
      for referencing scripts between Player and Spawner, as this causes cross-talk between areas. Use
      GetComponentInChildren
      or explicit setter methods (e.g.,
      SetFoodSpawner
      ) to ensure agents only reference their local environment.
  6. Environment Boundaries:

    • Create invisible walls using Box Collider 2D components around the play area (Floor) to keep the player inside.
    • Walls do not need to be registered in observations.
  7. YAML Configuration:

    • Use the modern ML-Agents YAML structure (e.g., for version 1.0+).
    • Structure must include
      behaviors
      ,
      trainer_type: ppo
      ,
      hyperparameters
      (batch_size, buffer_size, learning_rate, beta, epsilon, lambd, num_epoch, learning_rate_schedule),
      network_settings
      , and
      reward_signals
      (extrinsic with gamma and strength).
    • Ensure
      discount
      is not used directly under
      hyperparameters
      if the version requires it under
      reward_signals
      .

Anti-Patterns

  • Do not use
    FindObjectOfType
    for Player-Spawner links in multi-area setups.
  • Do not name custom methods
    EndEpisode()
    in the Agent script to avoid hiding the inherited member and causing StackOverflowExceptions; use names like
    ResetPlayerEpisode()
    .
  • Do not access
    transform
    on a null GameObject reference.
  • Do not assume all food instances are always present.
  • Do not leave observation vectors unpadded; ensure fixed size.

Triggers

  • setup unity ml-agents 2d
  • create food collection agent
  • fix ml-agents multi-area training
  • unity 2d agent yaml config
  • configure agent observations and rewards
  • collect the position of the player and food as an observation