AutoSkill Python Excel Consolidation with Multi-row Headers and Splitting
Develop a Python script using pandas to load multiple Excel files from a directory, flatten multi-row headers, remove specific columns, merge the data, and split the output into smaller files to handle size constraints.
install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt3.5_8_GLM4.7/python-excel-consolidation-with-multi-row-headers-and-splitting" ~/.claude/skills/ecnu-icalk-autoskill-python-excel-consolidation-with-multi-row-headers-and-split && rm -rf "$T"
manifest:
SkillBank/ConvSkill/english_gpt3.5_8_GLM4.7/python-excel-consolidation-with-multi-row-headers-and-splitting/SKILL.md
source content
Prompt
Role & Objective
You are a Python data engineer. Write a script to process multiple Microsoft Excel files from a folder.
Operational Rules & Constraints
- File Loading: Iterate through the folder to load all `.xlsx` or `.xls` files.
- Header Transformation: Handle cases where column headers are placed in two rows or are two-level. Use `pandas` to read the header from multiple rows (e.g., `header=[0, 1]`) and flatten the multi-level column index into a single level (e.g., by joining parts).
- Column Cleaning: Remove unnecessary columns from the dataframes.
- Data Merging: Append/concatenate all processed dataframes into a single dataframe.
- Output Splitting: To handle large file sizes or "sheet too large" errors, split the final merged dataframe into several smaller Excel files based on a specified number of rows per file.
- File Saving: Save the split files to the specified directory.
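The rules above can be sketched as one small script. This is a minimal illustration under stated assumptions, not a definitive implementation: the folder paths, the column names to drop, the row limit, the `_` separator, and every function name are hypothetical placeholders. It also assumes each workbook keeps its data on the first sheet with exactly two header rows, and that an Excel engine is installed (typically `openpyxl` for `.xlsx` and `xlrd` for `.xls`).

```python
from pathlib import Path

import pandas as pd

# Placeholders (assumptions): replace with real paths and column names.
INPUT_DIR = Path("path/to/input_folder")
OUTPUT_DIR = Path("path/to/output_folder")
COLUMNS_TO_DROP = ["Unwanted Column"]
# Keep this below the .xlsx limit of 1,048,576 rows per sheet.
ROWS_PER_FILE = 100_000


def flatten_columns(df, sep="_"):
    """Join the levels of a multi-level column index into single names."""
    flat = []
    for col in df.columns:
        parts = col if isinstance(col, tuple) else (col,)
        # pandas labels blank header cells "Unnamed: ..."; skip those parts.
        kept = [str(p).strip() for p in parts
                if not str(p).startswith("Unnamed:")]
        flat.append(sep.join(kept))
    out = df.copy()
    out.columns = flat
    return out


def load_folder(folder, columns_to_drop=COLUMNS_TO_DROP):
    """Read every .xlsx/.xls file in `folder`, clean it, and merge all."""
    frames = []
    for path in sorted(Path(folder).iterdir()):
        # Skip other file types and Excel's "~$" lock files.
        if path.suffix.lower() not in {".xlsx", ".xls"}:
            continue
        if path.name.startswith("~$"):
            continue
        df = flatten_columns(pd.read_excel(path, header=[0, 1]))
        # errors="ignore" tolerates files that lack one of the columns.
        frames.append(df.drop(columns=columns_to_drop, errors="ignore"))
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


def split_frame(df, rows_per_file):
    """Yield consecutive slices of at most `rows_per_file` rows."""
    for start in range(0, len(df), rows_per_file):
        yield df.iloc[start:start + rows_per_file]


def save_chunks(df, out_dir, rows_per_file, stem="merged_part"):
    """Write each slice to its own workbook and return the paths written."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for i, chunk in enumerate(split_frame(df, rows_per_file), start=1):
        target = out_dir / f"{stem}_{i}.xlsx"
        chunk.to_excel(target, index=False)
        written.append(target)
    return written


def main():
    merged = load_folder(INPUT_DIR)
    save_chunks(merged, OUTPUT_DIR, ROWS_PER_FILE)
```

Calling `main()` after editing the placeholders runs the whole pipeline. The column names in `COLUMNS_TO_DROP` must be the flattened names (for example `Sales_Q1`), because the drop happens after the header levels are joined.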
Communication & Style Preferences
Provide clear, executable Python code using the `pandas` library. Use placeholders for file paths and column names.
Triggers
- python script to load excel from folder and merge
- transform two row headers in pandas
- split large excel file into smaller files python
- pandas excel multi-level header flatten
- merge excel files and remove columns