AutoSkill · Integrate Fusedbun Optimizer into Algorithmic Efficiency Submission

Modifies the standard algorithmic-efficiency submission file to use the custom Fusedbun optimizer instead of AdamW, correctly mapping hyperparameters and fixing the learning rate scheduler to handle the missing warmup_factor hyperparameter.

install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8/integrate-fusedbun-optimizer-into-algorithmic-efficiency-submiss" ~/.claude/skills/ecnu-icalk-autoskill-integrate-fusedbun-optimizer-into-algorithmic-efficiency-su && rm -rf "$T"
manifest: SkillBank/ConvSkill/english_gpt4_8/integrate-fusedbun-optimizer-into-algorithmic-efficiency-submiss/SKILL.md
source content

Integrate Fusedbun Optimizer into Algorithmic Efficiency Submission

Modifies the standard algorithmic-efficiency submission file to use the custom Fusedbun optimizer instead of AdamW, correctly mapping hyperparameters and fixing the learning rate scheduler to handle the missing warmup_factor hyperparameter.

Prompt

Role & Objective

You are an MLPerf/Algorithmic Efficiency submission developer. Your task is to modify the standard `submission.py` file to integrate the custom `Fusedbun` optimizer, replacing the default AdamW optimizer.

Communication & Style Preferences

  • Write clean, error-free Python code with proper indentation.
  • Ensure all necessary imports are included.

Operational Rules & Constraints

  1. Optimizer Integration:

    • Import `Fusedbun` from `optim`.
    • In `init_optimizer_state`, instantiate `Fusedbun` instead of `torch.optim.AdamW`.
    • Map the following hyperparameters from the input `hyperparameters` object to the `Fusedbun` constructor:
      • `lr`: `hyperparameters.learning_rate`
      • `beta_decay`: `hyperparameters.beta_decay`
      • `Lambda`: `hyperparameters.Lambda`
      • `momentum_beta`: `hyperparameters.momentum_beta`
    • Set `centralize=True` and `use_rms=True` as defaults (see the sketch after this list).
  2. Scheduler Configuration:

    • The `hyperparameters` object does not have a `warmup_factor` attribute.
    • In the `pytorch_cosine_warmup` function, do not use `hyperparameters.warmup_factor`.
    • Calculate `warmup_steps` from a fixed fraction of `step_hint` (e.g., `warmup_steps = int(0.1 * step_hint)`), or remove the warmup logic if specified.
    • Ensure `warmup_steps` is an integer to prevent `TypeError: unsupported operand type(s) for -: 'int' and 'tuple'`.
  3. Code Structure:

    • Maintain the existing structure of `update_params`, `get_batch_size`, and `data_selection`.
    • Ensure `USE_PYTORCH_DDP` is imported from `algorithmic_efficiency.pytorch_utils`.
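
A minimal sketch of the two affected functions is shown below. It assumes `Fusedbun` is defined in a local `optim` module with exactly the constructor keywords listed above, and that the surrounding signatures mirror the reference AdamW submission; the warmup-plus-cosine wiring uses standard `torch.optim.lr_scheduler` classes and is illustrative rather than the canonical implementation.

```python
# Sketch only: Fusedbun's constructor keywords follow the mapping above; the
# function signatures mirror the reference algorithmic-efficiency submission.
from torch.optim.lr_scheduler import CosineAnnealingLR, LinearLR, SequentialLR

from algorithmic_efficiency.pytorch_utils import USE_PYTORCH_DDP  # noqa: F401
from optim import Fusedbun  # custom optimizer shipped alongside submission.py


def pytorch_cosine_warmup(step_hint, hyperparameters, optimizer):
  # hyperparameters has no warmup_factor, so derive the warmup length from a
  # fixed fraction of step_hint and force it to an int.
  del hyperparameters
  warmup_steps = int(0.1 * step_hint)
  warmup = LinearLR(
      optimizer, start_factor=1e-10, end_factor=1.0, total_iters=warmup_steps)
  cosine_steps = max(step_hint - warmup_steps, 1)
  cosine_decay = CosineAnnealingLR(optimizer, T_max=cosine_steps)
  return SequentialLR(
      optimizer, schedulers=[warmup, cosine_decay], milestones=[warmup_steps])


def init_optimizer_state(workload, model_params, model_state, hyperparameters,
                         rng):
  """Creates a Fusedbun optimizer and a warmup + cosine LR schedule."""
  del model_state
  del rng
  optimizer_state = {
      'optimizer':
          Fusedbun(
              model_params.parameters(),
              lr=hyperparameters.learning_rate,
              beta_decay=hyperparameters.beta_decay,
              Lambda=hyperparameters.Lambda,
              momentum_beta=hyperparameters.momentum_beta,
              centralize=True,
              use_rms=True),
  }
  optimizer_state['scheduler'] = pytorch_cosine_warmup(
      workload.step_hint, hyperparameters, optimizer_state['optimizer'])
  return optimizer_state
```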

Anti-Patterns

  • Do not attempt to access `hyperparameters.warmup_factor`.
  • Do not multiply the `hyperparameters` object directly (e.g., `hyperparameters * step_hint` is invalid); see the reproduction below.
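
For context, here is a hypothetical reproduction of why that multiplication produces the `TypeError` quoted above, assuming (as the error message suggests) that `hyperparameters` behaves like a namedtuple:

```python
from collections import namedtuple

# Hypothetical stand-ins; the real objects come from the tuning harness.
Hyperparameters = namedtuple('Hyperparameters', ['learning_rate'])
hyperparameters = Hyperparameters(learning_rate=1e-3)
step_hint = 10

# Wrong: multiplying a (named)tuple by an int repeats it, so warmup_steps
# silently becomes a tuple instead of a number.
warmup_steps = hyperparameters * step_hint

# The scheduler arithmetic then fails:
# step_hint - warmup_steps  ->  TypeError: unsupported operand type(s) for -: 'int' and 'tuple'

# Right: use a fixed fraction of step_hint and coerce it to an int.
warmup_steps = int(0.1 * step_hint)
```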

Triggers

  • integrate Fusedbun optimizer
  • replace AdamW with Fusedbun
  • fix warmup_factor error in submission
  • algorithmic efficiency submission Fusedbun