Skillsbench csv-processing
Use this skill when reading sensor data from CSV files, writing simulation results to CSV, processing time-series data with pandas, or handling missing values in datasets.
install
source · Clone the upstream repo
git clone https://github.com/benchflow-ai/skillsbench
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/benchflow-ai/skillsbench "$T" && mkdir -p ~/.claude/skills && cp -r "$T/tasks/adaptive-cruise-control/environment/skills/csv-processing" ~/.claude/skills/benchflow-ai-skillsbench-csv-processing && rm -rf "$T"
manifest:
tasks/adaptive-cruise-control/environment/skills/csv-processing/SKILL.md
source content
CSV Processing with Pandas
Reading CSV
    import pandas as pd

    df = pd.read_csv('data.csv')

    # View structure
    print(df.head())            # first five rows
    print(df.columns.tolist())  # column names
    print(len(df))              # number of rows
Handling Missing Values
    import pandas as pd

    # Read with explicit NA handling
    df = pd.read_csv('data.csv', na_values=['', 'NA', 'null'])

    # Check for missing values (count per column)
    print(df.isnull().sum())

    # Check if a specific value is NaN
    for index, row in df.iterrows():
        if pd.isna(row['column']):
            continue  # Handle missing value (here: skip the row)
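Once missing values are located, the usual next step is to drop, fill, or interpolate them. The following is a minimal sketch of those three options, assuming a small in-memory sensor log with hypothetical columns `time` and `speed`; which strategy is appropriate depends on the data and is not prescribed by this skill.

```python
import io

import pandas as pd

# Hypothetical sensor log; column names and values are for illustration only.
csv_text = """time,speed
0.0,10.0
0.1,
0.2,NA
0.3,16.0
"""

df = pd.read_csv(io.StringIO(csv_text), na_values=['', 'NA', 'null'])

# Option 1: drop rows where 'speed' is missing
dropped = df.dropna(subset=['speed'])

# Option 2: replace missing values with a constant
filled = df.fillna({'speed': 0.0})

# Option 3: linear interpolation between neighbouring valid samples
interpolated = df.assign(speed=df['speed'].interpolate(method='linear'))

print(len(dropped))
print(filled['speed'].tolist())
print(interpolated['speed'].tolist())
```

Note that `method='linear'` treats samples as equally spaced and ignores the values in `time`; for unevenly sampled data a different method may be a better fit.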
Accessing Data
    # Single column
    values = df['column_name']

    # Multiple columns
    subset = df[['col1', 'col2']]

    # Filter rows
    filtered = df[df['column'] > 10]
    filtered = df[(df['time'] >= 30) & (df['time'] < 60)]

    # Rows where column is not null
    valid = df[df['column'].notna()]
Writing CSV
    import pandas as pd

    # From dictionary
    data = {
        'time': [0.0, 0.1, 0.2],
        'value': [1.0, 2.0, 3.0],
        'label': ['a', 'b', 'c']
    }
    df = pd.DataFrame(data)
    df.to_csv('output.csv', index=False)
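The `index=False` argument matters when the file is read back later. The sketch below compares the two settings by round-tripping a small frame through in-memory buffers (used here instead of files only so that nothing is written to disk).

```python
import io

import pandas as pd

df = pd.DataFrame({'time': [0.0, 0.1, 0.2], 'value': [1.0, 2.0, 3.0]})

with_index = io.StringIO()
without_index = io.StringIO()
df.to_csv(with_index)                  # default: index=True
df.to_csv(without_index, index=False)

# Rewind the buffers before reading them back
with_index.seek(0)
without_index.seek(0)

cols_with = pd.read_csv(with_index).columns.tolist()
cols_without = pd.read_csv(without_index).columns.tolist()

print(cols_with)     # includes an extra leading column holding the old index
print(cols_without)
```

With the default setting, the row index is written as an unnamed first column and comes back as an extra column on the next read, which is rarely wanted for simulation results.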
Building Results Incrementally
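    import pandas as pd

    # `items` is any iterable of objects with time, value, status and valid attributes
    results = []
    for item in items:
        row = {
            'time': item.time,
            'value': item.value,
            'status': item.status if item.valid else None
        }
        results.append(row)

    # Build the DataFrame once, after the loop
    df = pd.DataFrame(results)
    df.to_csv('results.csv', index=False)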
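To make the pattern above concrete, here is a self-contained sketch. The `Sample` dataclass is a hypothetical stand-in for whatever objects the loop iterates over, and the output goes to an in-memory buffer rather than `results.csv` so that the result can be inspected directly.

```python
import io
from dataclasses import dataclass

import pandas as pd


@dataclass
class Sample:
    """Hypothetical item type; not part of the original skill."""
    time: float
    value: float
    status: str
    valid: bool


items = [
    Sample(0.0, 1.0, 'ok', True),
    Sample(0.1, 2.0, 'glitch', False),
    Sample(0.2, 3.0, 'ok', True),
]

# Collect plain dicts in a list, then build the DataFrame once at the end
results = []
for item in items:
    results.append({
        'time': item.time,
        'value': item.value,
        'status': item.status if item.valid else None,
    })

df = pd.DataFrame(results)

buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

A `None` status is written as an empty field, so it reads back as a missing value. Appending dicts to a list and constructing the frame once is generally cheaper than growing a DataFrame row by row.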
Common Operations
    # Statistics
    mean_val = df['column'].mean()
    max_val = df['column'].max()
    min_val = df['column'].min()
    std_val = df['column'].std()

    # Add computed column
    df['diff'] = df['col1'] - df['col2']

    # Iterate rows (`process` is a placeholder for your own function)
    for index, row in df.iterrows():
        process(row['col1'], row['col2'])
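Row iteration is convenient but tends to be slow on large frames, and many per-row computations on time-series data can be written as column arithmetic instead. As an illustration, the sketch below computes a finite-difference rate of change both ways; the `time` and `speed` columns and their values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical time series; values chosen for illustration only.
df = pd.DataFrame({
    'time': [0.0, 0.5, 1.0, 1.5],
    'speed': [10.0, 11.0, 13.0, 13.0],
})

# Row-by-row version: pair each sample with its predecessor
rows = list(df.itertuples(index=False))
accel_loop = [float('nan')]  # the first sample has no predecessor
for prev, curr in zip(rows, rows[1:]):
    accel_loop.append((curr.speed - prev.speed) / (curr.time - prev.time))

# Vectorized version: the same computation with Series.diff()
df['accel'] = df['speed'].diff() / df['time'].diff()

print(df)
```

Both versions leave the first entry as NaN, since there is no earlier sample to difference against.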