LoongFlow file-processing
Data file processing utilities for CSV, JSON, and text files. Provides helpers for reading, transforming, and validating structured data.
install
source · Clone the upstream repo
git clone https://github.com/baidu-baige/LoongFlow
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/baidu-baige/LoongFlow "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/file-processing" ~/.claude/skills/baidu-baige-loongflow-file-processing && rm -rf "$T"
manifest:
`.claude/skills/file-processing/SKILL.md`
File Processing Skill
This skill provides utilities and guidance for building robust file processing applications.
Purpose
Use this skill when your task involves:
- Reading and parsing CSV, JSON, or text files
- Data validation and cleaning
- File format conversions
- Batch processing of multiple files
- Generating reports from data files
Key Capabilities
1. Data Reading
- CSV parsing with header detection
- JSON file handling (single object or array)
- Text file processing line-by-line
- Error handling for malformed files
2. Data Validation
- Check for required fields
- Validate data types
- Handle missing values
- Report data quality issues
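The validation checks above can be sketched as a single pass that separates clean rows from problem reports; `validate_rows` and its return shape are illustrative, not part of the skill:

```python
def validate_rows(rows, required_fields, numeric_fields=()):
    """Validate dict rows; return (valid_rows, issues)."""
    valid, issues = [], []
    for i, row in enumerate(rows, 1):
        # Required fields must be present and non-empty
        missing = [f for f in required_fields if not row.get(f)]
        if missing:
            issues.append(f"row {i}: missing {missing}")
            continue
        # Declared numeric fields must parse as numbers
        bad = []
        for f in numeric_fields:
            try:
                float(row[f])
            except (TypeError, ValueError):
                bad.append(f)
        if bad:
            issues.append(f"row {i}: non-numeric {bad}")
            continue
        valid.append(row)
    return valid, issues
```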
3. Data Transformation
- Filter rows based on conditions
- Calculate statistics (sum, avg, count)
- Format conversions
- Data aggregation
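A minimal sketch of the statistics and aggregation helpers implied above (`column_stats` and `group_by` are hypothetical names, not part of the skill):

```python
from collections import defaultdict

def column_stats(rows, column):
    """Count, sum, and average for one numeric column, skipping bad cells."""
    values = []
    for row in rows:
        try:
            values.append(float(row.get(column, "")))
        except (TypeError, ValueError):
            continue  # skip blanks and non-numeric cells
    count = len(values)
    total = sum(values)
    return {"count": count, "sum": total, "avg": total / count if count else 0.0}

def group_by(rows, key_column):
    """Aggregate rows into groups keyed by one column's value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_column]].append(row)
    return dict(groups)
```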
4. Output Generation
- Write processed data to new files
- Generate summary reports
- Create multiple output formats
Best Practices
Project Structure for File Processing
```
project/
├── main.py            # Entry point with CLI
├── file_reader.py     # File I/O operations
├── data_processor.py  # Core processing logic
├── validator.py       # Data validation
├── config.py          # Configuration constants
└── utils.py           # Helper functions
```
Error Handling Pattern
```python
import os

def read_file_safely(filepath):
    """Read a file with proper error handling."""
    try:
        if not os.path.exists(filepath):
            raise FileNotFoundError(f"File not found: {filepath}")
        with open(filepath, 'r', encoding='utf-8') as f:
            return f.read()
    except OSError as e:
        print(f"Error reading file: {e}")
        return None
```
CSV Processing Template
```python
import csv

def process_csv(input_file, output_file):
    """Process a CSV file with header detection."""
    with open(input_file, 'r', encoding='utf-8', newline='') as f:
        reader = csv.DictReader(f)
        processed = []
        for row in reader:
            # Transform each row
            processed.append(transform_row(row))
    # Write results (newline='' avoids blank lines on Windows)
    with open(output_file, 'w', encoding='utf-8', newline='') as f:
        if processed:
            writer = csv.DictWriter(f, fieldnames=processed[0].keys())
            writer.writeheader()
            writer.writerows(processed)
```
JSON Processing Template
```python
import json

def process_json(input_file, output_file):
    """Process JSON data."""
    with open(input_file, 'r', encoding='utf-8') as f:
        data = json.load(f)
    # Process data (handle both list and dict)
    processed = process_data(data)
    with open(output_file, 'w', encoding='utf-8') as f:
        json.dump(processed, f, indent=2, ensure_ascii=False)
```
Common Patterns
1. CLI with Argument Parsing
```python
import argparse

def main():
    parser = argparse.ArgumentParser(description='File Processor')
    parser.add_argument('input', help='Input file path')
    parser.add_argument('output', help='Output file path')
    parser.add_argument('--format', choices=['csv', 'json'], default='csv')
    args = parser.parse_args()
    process_file(args.input, args.output, args.format)
```
2. Batch Processing
```python
import glob
import os

def process_directory(input_dir, output_dir, pattern='*.csv'):
    """Process all matching files in a directory."""
    for filepath in glob.glob(os.path.join(input_dir, pattern)):
        filename = os.path.basename(filepath)
        output_path = os.path.join(output_dir, f"processed_{filename}")
        process_file(filepath, output_path)
```
3. Progress Reporting
```python
def process_with_progress(items):
    """Process items with progress feedback."""
    total = len(items)
    for i, item in enumerate(items, 1):
        process_item(item)
        print(f"Progress: {i}/{total} ({i * 100 // total}%)", end='\r')
    print()  # new line when complete
```
Tools Available
When implementing file processing tasks, you have access to:
- `Read` — read file contents
- `Write` — create new files
- `Edit` — modify existing files
- `Glob` — find files by pattern
- `Bash` — run shell commands (e.g., `wc -l`, `head`)
Testing Tips
Always test your file processor with:
- Empty files - Should handle gracefully
- Malformed data - CSV with wrong column count, invalid JSON
- Missing files - Should provide clear error messages
- Large files - Consider memory usage
- Special characters - Unicode, newlines in CSV fields
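Several of these cases can be exercised with the stdlib alone; the sketch below builds throwaway fixture files (names are arbitrary) and checks that the standard `csv` and `json` readers behave as the tips predict:

```python
import csv
import json
import os
import tempfile

def make_fixture(name, content):
    """Write a throwaway test file and return its path."""
    path = os.path.join(tempfile.gettempdir(), name)
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(content)
    return path

# Empty file: a reader should yield no rows, not crash.
empty = make_fixture("empty.csv", "")
with open(empty, encoding="utf-8", newline="") as f:
    assert list(csv.DictReader(f)) == []

# Malformed JSON: expect a JSONDecodeError, not a silent failure.
bad = make_fixture("bad.json", "{not json")
try:
    with open(bad, encoding="utf-8") as f:
        json.load(f)
except json.JSONDecodeError:
    pass  # this is the clear error message we want surfaced

# Newlines inside quoted CSV fields must round-trip intact.
tricky = make_fixture("tricky.csv", 'name,note\r\nA,"line1\nline2"\r\n')
with open(tricky, encoding="utf-8", newline="") as f:
    rows = list(csv.DictReader(f))
assert rows[0]["note"] == "line1\nline2"
```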
Example Task Breakdown
Task: "Create a CSV analyzer that calculates statistics"
Suggested Steps:
- Read CSV file and detect headers
- Parse data into structured format
- Calculate statistics (count, sum, average) per column
- Generate summary report
- Write results to output file
Recommended Structure:
- `csv_analyzer.py` — main program
- `stats.py` — statistics calculations
- `report_generator.py` — output formatting
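Tying the suggested steps together, the analyzer core might look like this minimal sketch (`analyze_csv` and the plain-text report format are one possible choice, not prescribed):

```python
import csv

def analyze_csv(input_file, output_file):
    """Read a CSV, compute per-column numeric stats, write a report."""
    with open(input_file, encoding="utf-8", newline="") as f:
        reader = csv.DictReader(f)     # steps 1-2: detect headers, parse rows
        rows = list(reader)
        headers = reader.fieldnames or []
    lines = [f"Rows: {len(rows)}"]
    for col in headers:                # step 3: count, sum, average per column
        values = []
        for row in rows:
            try:
                values.append(float(row[col]))
            except (TypeError, ValueError):
                pass                   # ignore non-numeric cells
        if values:
            avg = sum(values) / len(values)
            lines.append(f"{col}: count={len(values)} sum={sum(values)} avg={avg:.2f}")
    with open(output_file, "w", encoding="utf-8") as f:   # steps 4-5: report
        f.write("\n".join(lines) + "\n")
```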
References
For more complex tasks, consider:
- Python's `csv` module for CSV handling
- Python's `json` module for JSON operations
- `pathlib` for cross-platform file paths
- `pandas` for advanced data processing (if allowed)