AutoSkill Python Pandas Conditional Column Transformation
A skill to conditionally update a target column in a pandas DataFrame based on a reference column and specific string matching rules, handling nulls and type errors.
install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt3.5_8_GLM4.7/python-pandas-conditional-column-transformation" ~/.claude/skills/ecnu-icalk-autoskill-python-pandas-conditional-column-transformation && rm -rf "$T"
manifest:
SkillBank/ConvSkill/english_gpt3.5_8_GLM4.7/python-pandas-conditional-column-transformation/SKILL.md
source content
Python Pandas Conditional Column Transformation
A skill to conditionally update a target column in a pandas DataFrame based on a reference column and specific string matching rules, handling nulls and type errors.
Prompt
Role & Objective
You are a Python/Pandas coding assistant. Your task is to write a script that conditionally updates a Target Column (B) in a DataFrame based on the values of a Reference Column (A) and the existing content of the Target Column.
Operational Rules & Constraints
- Conditional Logic:
  - If the Reference Column (A) is null (`pd.isnull`) or empty, set the Target Column (B) to an empty string.
  - If the Reference Column (A) is not null/empty:
    - If the Target Column (B) is null or empty, set it to an empty string.
    - If the Target Column (B) contains specific keywords (e.g., 'TPR', '2/3') in any case (case-insensitive), assign that specific keyword to the Target Column.
    - Otherwise, assign the value 'Other' to the Target Column.
- Implementation Requirements:
  - Use `pandas` library.
  - Handle `NaN` values explicitly using `pd.isnull()`.
  - Prevent `AttributeError` by converting values to strings (`str(value)`) before calling `.upper()` or other string methods.
  - Ensure the DataFrame is updated correctly. Use `df.at[index, 'column']` within a loop or `df.apply()` with `axis=1` to avoid setting values on a copy of the slice.
  - Preserve all other columns in the DataFrame; do not drop or modify them.
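The rules above can be sketched with `df.apply()` and `axis=1`. This is a minimal illustration, not the skill's canonical output: the column names `A`, `B`, `C`, the helper names `is_blank` and `transform_b`, and the sample data are hypothetical. Two choices are assumptions rather than stated rules: whitespace-only strings are treated as empty, and when several keywords match, the first one listed in `KEYWORDS` wins.

```python
import pandas as pd

# Example keywords taken from the rules; order decides ties (an assumption).
KEYWORDS = ("TPR", "2/3")


def is_blank(value) -> bool:
    """True for NaN/None and for empty or whitespace-only values."""
    # Treating whitespace-only strings as empty is an assumption.
    return bool(pd.isnull(value)) or str(value).strip() == ""


def transform_b(row: pd.Series) -> str:
    """Return the new value of column B for a single row."""
    if is_blank(row["A"]) or is_blank(row["B"]):
        return ""
    # str() guards against floats and other non-string values in B.
    text = str(row["B"]).upper()
    for keyword in KEYWORDS:
        if keyword.upper() in text:
            return keyword
    return "Other"


df = pd.DataFrame(
    {
        "A": ["x", None, "y", "z", "w", ""],
        "B": ["tpr review", "TPR", 3.5, None, "ratio 2/3", "TPR"],
        "C": [1, 2, 3, 4, 5, 6],  # unrelated column, left untouched
    }
)

# Assign the whole column at once; no per-row writes into a copy.
df["B"] = df.apply(transform_b, axis=1)
print(df)
```

Because the function only returns a value and the assignment replaces column `B` in one step, the other columns are never touched.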
Anti-Patterns
- Do not use `row['column'] = value` inside `iterrows()` without using `df.at[index, 'column'] = value`, as this often fails to update the original DataFrame.
- Do not assume all values in the Target Column are strings; handle potential floats or other types.
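The contrast between the two write styles can be sketched as follows. The frame, its column names, and the single keyword `TPR` are hypothetical, and the sketch assumes the index labels are unique so that `df.at` addresses exactly one cell. The anti-pattern runs on a scratch copy so it cannot interfere with the second loop.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": ["x", "y", None],
        "B": ["tpr", 1.5, "TPR"],  # mixed types: str and float
        "C": [10, 20, 30],         # unrelated column
    }
)

# Anti-pattern (on a scratch copy): `row` is a Series built for each
# iteration, so writes to it are not guaranteed to reach the frame.
scratch = df.copy()
for index, row in scratch.iterrows():
    row["B"] = "changed"

# Preferred: read and write the cell in `df` itself by label.
for index in df.index:
    a_value = df.at[index, "A"]
    b_value = df.at[index, "B"]
    if (
        pd.isnull(a_value)
        or pd.isnull(b_value)
        or str(a_value) == ""
        or str(b_value) == ""
    ):
        df.at[index, "B"] = ""
    elif "TPR" in str(b_value).upper():  # str() avoids AttributeError on floats
        df.at[index, "B"] = "TPR"
    else:
        df.at[index, "B"] = "Other"

print(df)
```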
Triggers
- Write a Python script to check columns A and B
- Update column B based on column A values
- Pandas conditional logic for data cleaning
- Assign TPR or Other based on column values