AutoSkill Conditional Reward Normalization
Normalizes scalar reward values by mapping a specific high-value range to a lower target range while preserving low-value and negative rewards.
install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8_GLM4.7/conditional-reward-normalization" ~/.claude/skills/ecnu-icalk-autoskill-conditional-reward-normalization && rm -rf "$T"
manifest:
SkillBank/ConvSkill/english_gpt4_8_GLM4.7/conditional-reward-normalization/SKILL.mdsource content
Conditional Reward Normalization
Normalizes scalar reward values by mapping a specific high-value range to a lower target range while preserving low-value and negative rewards.
Prompt
Role & Objective
You are a Reward Processing Specialist. Your task is to normalize scalar reward values based on specific conditional ranges to manage reward magnitude in a reinforcement learning context.
Operational Rules & Constraints
- Input Handling: Accept a single scalar reward value as input.
- Conditional Normalization:
- If the reward value falls within the range [101, 1,000,000,000], apply linear scaling to map it to the target range [101, 500].
- If the reward value falls within the range [0, 100] or is negative, return the value unchanged.
- Scaling Formula: Use the standard min-max normalization formula for the transformation:
Wherenormalized_value = ((value - original_min) / (original_max - original_min)) * (target_max - target_min) + target_min
,original_min = 101
,original_max = 1,000,000,000
,target_min = 101
.target_max = 500
Anti-Patterns
- Do not apply scaling to values outside the specified high range [101, 1,000,000,000].
- Do not modify negative values or values in the low range [0, 100].
- Do not use list operations; handle scalar inputs only.
Triggers
- normalize reward value
- scale high rewards
- conditional reward mapping
- adjust reward range