AutoSkill PyTorch Training Configuration and Evaluation

Configure PyTorch training scripts with specific evaluation metrics (Precision, Recall, F1), tunable hyperparameters (batch size, warmup, optimizer type, weight decay, attention dropout), and a custom GELU activation function.

install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8/pytorch-training-configuration-and-evaluation" ~/.claude/skills/ecnu-icalk-autoskill-pytorch-training-configuration-and-evaluation && rm -rf "$T"
manifest: SkillBank/ConvSkill/english_gpt4_8/pytorch-training-configuration-and-evaluation/SKILL.md
source content

PyTorch Training Configuration and Evaluation

Configure PyTorch training scripts with specific evaluation metrics (Precision, Recall, F1), tunable hyperparameters (batch size, warmup, optimizer type, weight decay, attention dropout), and a custom GELU activation function.

Prompt

Role & Objective

Configure PyTorch training scripts to include specific evaluation metrics, tunable hyperparameters, and a custom GELU activation function.

Operational Rules & Constraints

  1. Evaluation Metrics: Modify the evaluation function to compute Precision, Recall, and F1 score using sklearn.metrics with average='macro' (see the metrics sketch after this list).
  2. Hyperparameters: Define and utilize the following variables for tuning (see the configuration sketch after this list):
    • batch_size
    • warmup_steps
    • optimizer_type (e.g., "AdamW", "SGD")
    • weight_decay
    • attention_dropout_rate
  3. Activation Function: Implement the gelu_new activation function using the formula 0.5 * x * (1 + torch.tanh(torch.sqrt(2 / torch.pi) * (x + 0.044715 * torch.pow(x, 3)))) (see the activation sketch after this list).
  4. Model Configuration: Apply attention_dropout_rate to the nn.TransformerEncoderLayer and use optimizer_type to configure the optimizer (AdamW or SGD), as shown in the configuration sketch after this list.
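A minimal metrics sketch for rule 1, assuming labels and predictions are flat sequences of integer class indices; the function name compute_metrics and the zero_division setting are illustrative choices, not part of the skill:

from sklearn.metrics import precision_recall_fscore_support

def compute_metrics(labels, preds):
    # Macro-averaged Precision, Recall, and F1 (rule 1).
    # zero_division=0 silences warnings for classes absent from preds.
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average='macro', zero_division=0
    )
    return {"precision": precision, "recall": recall, "f1": f1}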
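An activation sketch for rule 3. As written in the formula, torch.sqrt(2 / torch.pi) would fail because torch.sqrt expects a tensor, so this sketch substitutes math.sqrt for the scalar constant; the result is numerically identical:

import math
import torch

def gelu_new(x):
    # Tanh approximation of GELU (rule 3).
    return 0.5 * x * (1.0 + torch.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * torch.pow(x, 3.0))
    ))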
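A configuration sketch for rules 2 and 4. The hyperparameter defaults, d_model/nhead values, learning rates, and the linear-warmup schedule are illustrative assumptions; note that nn.TransformerEncoderLayer exposes a single dropout argument (applied to both the attention and feed-forward sublayers), which is where attention_dropout_rate is passed here:

import torch
import torch.nn as nn

# Tunable hyperparameters (rule 2); values shown are placeholders.
batch_size = 32
warmup_steps = 500
optimizer_type = "AdamW"  # or "SGD"
weight_decay = 0.01
attention_dropout_rate = 0.1

# Rule 4: attention_dropout_rate feeds the encoder layer's dropout argument.
encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8,
                                           dropout=attention_dropout_rate)
model = nn.TransformerEncoder(encoder_layer, num_layers=6)

# Rule 4: optimizer_type selects between AdamW and SGD (no hardcoding).
if optimizer_type == "AdamW":
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4,
                                  weight_decay=weight_decay)
elif optimizer_type == "SGD":
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9,
                                weight_decay=weight_decay)
else:
    raise ValueError(f"Unsupported optimizer_type: {optimizer_type}")

# One common reading of warmup_steps: a linear warmup to the base LR.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lambda step: min(1.0, (step + 1) / warmup_steps)
)

batch_size would then be passed to the DataLoader that feeds the training loop.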

Anti-Patterns

  • Do not use the default accuracy metric alone; always include Precision, Recall, and F1.
  • Do not hardcode hyperparameters; use the specified variables.

Triggers

  • modify evaluation function
  • add hyperparameters
  • compute F1 score
  • add gelu_new
  • tune batch size