# learn-skills.dev: xargs-parallel

Parallel execution with xargs, GNU parallel, and batch processing patterns. Use when the user mentions "xargs", "parallel", "batch processing", "run in parallel", "parallel execution", "process list of files", "bulk operations", "concurrent commands", "map over files", or running commands on multiple inputs.

## Install

Source: clone the upstream repo:

```bash
git clone https://github.com/NeverSight/learn-skills.dev
```

Claude Code: install into `~/.claude/skills/`:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/1mangesh1/dev-skills-collection/xargs-parallel" ~/.claude/skills/neversight-learn-skills-dev-xargs-parallel && rm -rf "$T"
```

Manifest: `data/skills-md/1mangesh1/dev-skills-collection/xargs-parallel/SKILL.md`
# xargs and Parallel Execution

## xargs Basics

Read from stdin and pass as arguments to a command:

```bash
# Basic usage: pass stdin lines as arguments
echo "file1.txt file2.txt" | xargs rm

# -I {} sets a placeholder for each input line
cat urls.txt | xargs -I {} curl -O {}

# -n controls how many arguments per command invocation
echo "a b c d e f" | xargs -n 2 echo
# Output:
# a b
# c d
# e f

# -t prints each command before executing (trace mode)
ls *.log | xargs -t rm

# Read arguments from a file
xargs -a filelist.txt rm
```
## xargs with find

Always use `-print0` / `-0` to handle filenames with spaces and special characters:

```bash
# Safe deletion of files matching a pattern
find . -name "*.tmp" -print0 | xargs -0 rm -f

# Count lines in all Python files
find . -name "*.py" -print0 | xargs -0 wc -l

# Change permissions on specific files
find /var/log -name "*.log" -print0 | xargs -0 chmod 644

# Grep across files found by find
find src/ -name "*.js" -print0 | xargs -0 grep -l "TODO"
```
## Parallel Execution with xargs -P

`-P N` runs up to N processes in parallel:

```bash
# Compress files using 4 parallel jobs
find . -name "*.log" -print0 | xargs -0 -P 4 -I {} gzip {}

# Download URLs in parallel (8 at a time)
cat urls.txt | xargs -P 8 -I {} curl -sO {}

# Convert images in parallel; a subshell builds the output name
# (a plain "resized_{}" would break on find's "./" path prefix)
find . -name "*.png" -print0 | xargs -0 -P 4 -I {} \
  bash -c 'convert "$1" -resize 50% "${1%.png}_small.png"' _ {}

# Use all available cores
find . -name "*.gz" -print0 | xargs -0 -P "$(nproc)" gunzip
```
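As a quick sanity check that `-P` really overlaps jobs, here is a minimal timing sketch (the job count and sleep length are arbitrary): four one-second jobs run four wide should finish in roughly one second, not four.

```shell
# Four jobs of ~1s each, run 4-wide: wall time should be ~1s, not ~4s.
start=$(date +%s)
seq 1 4 | xargs -P 4 -I {} sh -c 'sleep 1'
end=$(date +%s)
elapsed=$((end - start))
echo "elapsed: ${elapsed}s"
```

Dropping `-P 4` makes the same pipeline take about four seconds, which is an easy way to confirm parallelism is actually in effect.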
GNU parallel Basics
GNU parallel offers more features if installed (
brew install parallel / apt install parallel):
# Basic usage (similar to xargs -P) cat urls.txt | parallel curl -sO {} # Control job count find . -name "*.csv" | parallel -j 8 gzip {} # Progress bar find . -name "*.mp4" | parallel --bar ffmpeg -i {} -vf scale=640:-1 small_{} # Retry failed jobs cat urls.txt | parallel --retries 3 curl -sO {} # Distribute jobs across multiple machines (SSH) parallel -S server1,server2 --transferfile {} gzip ::: *.log # Keep output order matching input order seq 10 | parallel -k 'sleep $((RANDOM % 3)); echo {}'
## Common Patterns

### Bulk Rename Files

```bash
# Add a prefix
ls *.jpg | xargs -I {} mv {} archive_{}

# Change extension (using parameter expansion in a subshell)
find . -name "*.txt" -print0 | xargs -0 -I {} bash -c 'mv "$1" "${1%.txt}.md"' _ {}

# Lowercase all filenames in current directory
ls | xargs -I {} bash -c 'mv "$1" "$(echo "$1" | tr "A-Z" "a-z")"' _ {}
```
### Process All Files Matching a Pattern

```bash
# Format all Go files
find . -name "*.go" -print0 | xargs -0 gofmt -w

# Lint all JS files
find src/ -name "*.js" -print0 | xargs -0 eslint --fix

# Run a script against each config file
find /etc -name "*.conf" -print0 | xargs -0 -I {} ./validate-config.sh {}
```
### Download a List of URLs

```bash
# Download all URLs from a file, 10 in parallel
cat urls.txt | xargs -P 10 -I {} curl -sfLO {}

# Download with wget, retrying failures
cat urls.txt | xargs -P 5 -I {} wget -q --retry-connrefused --tries=3 {}

# With GNU parallel and a progress bar
parallel --bar -j 10 curl -sfLO {} < urls.txt
```
### Run Tests in Parallel

```bash
# Run test files in parallel
find tests/ -name "test_*.py" | xargs -P 4 -I {} python -m pytest {} -v

# Run multiple test suites concurrently
# (-I takes one whole line per item, so feed one suite name per line)
printf '%s\n' unit integration e2e | xargs -P 3 -I {} make test-{}
```
### Batch API Calls

```bash
# POST each JSON file to an API endpoint
find data/ -name "*.json" -print0 | xargs -0 -P 5 -I {} \
  curl -s -X POST -H "Content-Type: application/json" -d @{} https://api.example.com/ingest

# Process user IDs from a file
cat user_ids.txt | xargs -P 10 -I {} \
  curl -s "https://api.example.com/users/{}" -o "responses/{}.json"
```
### Parallel Image Compression

```bash
# Compress PNGs in parallel with pngquant
find . -name "*.png" -print0 | xargs -0 -P "$(nproc)" -I {} pngquant --force --quality=65-80 {} --output {}

# Resize JPEGs with ImageMagick into an existing optimized/ directory
# (a subshell keeps only the basename, since find emits "photos/..." paths)
find photos/ -name "*.jpg" -print0 | xargs -0 -P 4 -I {} \
  bash -c 'convert "$1" -resize "1920x1080>" -quality 85 "optimized/$(basename "$1")"' _ {}
```
### Bulk Git Operations Across Repos

```bash
# Pull latest in all repos under a directory
find ~/projects -maxdepth 2 -name ".git" -type d -print0 | \
  xargs -0 -P 8 -I {} git -C "{}/.." pull --ff-only

# Check status of all repos (positional param keeps paths with spaces intact)
find ~/projects -maxdepth 2 -name ".git" -type d -print0 | \
  xargs -0 -I {} bash -c 'echo "=== $(dirname "$1") ===" && git -C "$1/.." status -s' _ {}

# Garbage collect all repos in parallel
find ~/projects -maxdepth 2 -name ".git" -type d -print0 | \
  xargs -0 -P 4 -I {} git -C "{}/.." gc --quiet
```
### Handling Filenames with Spaces

```bash
# -0 expects null-delimited input (pair with find -print0)
find . -name "*.txt" -print0 | xargs -0 wc -l

# -d '\n' treats newlines as delimiters (not spaces)
ls | xargs -d '\n' -I {} echo "File: {}"

# On macOS (BSD xargs lacks -d), use -0 with tr
ls | tr '\n' '\0' | xargs -0 -I {} echo "File: {}"
```
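To see the difference concretely, here is a self-contained sketch using a throwaway temp directory and a made-up filename containing a space:

```shell
# Create a scratch file whose name contains a space, then count its lines.
dir=$(mktemp -d)
printf 'hello\nworld\n' > "$dir/two words.txt"

# Without -print0/-0, "two words.txt" would be split into two bogus
# arguments; with them, cat receives the single correct path.
count=$(find "$dir" -name "*.txt" -print0 | xargs -0 cat | wc -l)
echo "line count: $count"   # expect 2

rm -rf "$dir"
```

Swapping the pipeline for a plain `find "$dir" -name "*.txt" | xargs cat` makes both halves of the name fail as nonexistent files.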
### Dry Run Before Executing

```bash
# Preview what would be deleted
find . -name "*.bak" -print0 | xargs -0 echo rm

# Use -p to prompt before each execution
find . -name "*.tmp" -print0 | xargs -0 -p rm

# With -t to trace commands as they run
find . -name "*.log" -print0 | xargs -0 -t gzip
```
## Error Handling

```bash
# xargs exits with status 123 if any command fails
# (-n 1 runs one script per bash invocation; without it, bash would treat
# the extra paths as arguments to the first script)
find . -name "*.sh" -print0 | xargs -0 -P 4 -n 1 bash  # check $? afterwards

# GNU parallel: halt on first failure
cat jobs.txt | parallel --halt now,fail=1 process_job {}

# GNU parallel: halt when 20% of jobs fail
cat jobs.txt | parallel --halt soon,fail=20% process_job {}

# Capture per-job exit codes with GNU parallel
cat jobs.txt | parallel --joblog joblog.txt process_job {}
# joblog.txt contains the exit status for every job
```
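The 123 convention can be checked in a script. This toy sketch (the `ok`/`bad` inputs are invented for the demo) branches on the exit status xargs reports:

```shell
# Each invocation fails when its argument is "bad"; xargs then exits 123.
printf '%s\n' ok ok bad | xargs -n 1 sh -c '[ "$1" != "bad" ]' _
status=$?

if [ "$status" -eq 0 ]; then
  echo "all jobs succeeded"
elif [ "$status" -eq 123 ]; then
  echo "at least one job failed"
else
  echo "xargs itself failed (status $status)"
fi
```

Note that xargs keeps running the remaining inputs after a failure; 123 only summarizes that at least one invocation exited with status 1-125.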
## xargs vs for Loops vs while read

Use xargs when:

- Processing output from `find` or another command
- You want built-in parallelism (`-P`)
- Batching multiple arguments per invocation (`-n`)

Use `while read` when:

- You need complex logic per iteration (if/else, multiple commands)
- The loop body uses shell variables that must persist across iterations

Use `for` loops when:

- Iterating over a known, small list of items
- Glob expansion is sufficient (`for f in *.txt`)
- Readability matters more than performance

```bash
# for loop -- simple, readable, no parallelism
for f in *.txt; do wc -l "$f"; done

# while read -- complex logic per item
find . -name "*.csv" | while read -r f; do
  count=$(wc -l < "$f")
  [ "$count" -gt 1000 ] && echo "Large: $f ($count lines)"
done

# xargs -- fast, parallel, concise
find . -name "*.csv" -print0 | xargs -0 -P 4 wc -l
```
## Resource-Aware Parallelism

```bash
# Use nproc to match available CPU cores
find . -name "*.gz" -print0 | xargs -0 -P "$(nproc)" gunzip

# Use half the cores to leave room for other work
find . -name "*.log" -print0 | xargs -0 -P "$(( $(nproc) / 2 ))" gzip

# GNU parallel: percentage-based, relative, or load-based limits
parallel -j 50% gzip ::: *.log        # 50% of cores
parallel -j -2 gzip ::: *.log         # cores minus 2
parallel --load 80% process_job ::: * # limit by load average

# Limit concurrency for I/O-bound tasks (network, disk)
cat urls.txt | xargs -P 5 -I {} curl -sO {}

# Monitor parallel job resource usage
parallel --joblog jobs.log -j 4 heavy_task ::: input_* && column -t jobs.log
```
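One edge case with the half-the-cores formula: on a single-core machine `$(nproc) / 2` rounds down to 0, and GNU xargs treats `-P 0` as "run as many processes as possible". A small clamp avoids that surprise (a sketch; adapt the find pattern to your files):

```shell
# Compute half the cores, but never go below one job.
jobs=$(( $(nproc) / 2 ))
if [ "$jobs" -lt 1 ]; then
  jobs=1
fi
echo "using $jobs parallel jobs"

# -r (GNU xargs) skips running the command entirely when input is empty.
find . -maxdepth 1 -name "*.log" -print0 | xargs -0 -r -P "$jobs" gzip
```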