Claude-skill-registry extracting-session-data
Locates, lists, filters, and extracts structured data from Claude Code native session logs. Supports both single and multiple session analysis.
```shell
git clone https://github.com/majiayu000/claude-skill-registry

T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/skills/data/extracting-session-data" \
       ~/.claude/skills/majiayu000-claude-skill-registry-extracting-session-data \
  && rm -rf "$T"
```
skills/data/extracting-session-data/SKILL.md

Extracting Session Data Skill
Core Responsibility
Provide raw access to Claude Code session logs stored in `~/.claude/projects/{project-dir}/{session-id}.jsonl`.
Key Principle: This skill extracts data only - return raw data to calling skills for analysis. Do not analyze or interpret within this skill.
Available Scripts
All scripts are located in the `scripts/` subdirectory relative to this skill.
1. locate-logs.sh
Find log directory or specific session file path.
```shell
# Get logs directory for current working directory
scripts/locate-logs.sh

# Get logs directory for specific project
scripts/locate-logs.sh /path/to/project

# Get specific session log file path
scripts/locate-logs.sh /path/to/project abc123-session-id
```
Use when: Building dynamic paths, verifying logs exist before processing.
2. list-sessions.sh
Enumerate all sessions with metadata (ID, size, lines, date, branch).
```shell
# List all sessions (table format)
scripts/list-sessions.sh

# JSON output
scripts/list-sessions.sh --format json

# Sort by size or lines
scripts/list-sessions.sh --sort size
scripts/list-sessions.sh --sort lines

# Specific project
scripts/list-sessions.sh /path/to/project
```
Output formats: `table`, `json`, `csv`
Sort options: `date`, `size`, `lines`
Use when: Starting retrospective, showing available sessions to user, checking for recent sessions.
3. extract-data.sh
Parse JSONL logs and extract specific data types.
Available extraction types:
- `metadata`: Session info (ID, timestamps, branch, working dir)
- `user-prompts`: All user messages
- `tool-usage`: Tool call statistics
- `errors`: Failed tool calls with timestamps
- `thinking`: Thinking blocks (if extended thinking enabled)
- `text-responses`: Assistant text responses only
- `statistics`: Session metrics (message counts, tool calls, errors)
- `all`: Combined extraction
```shell
# Extract from specific session
scripts/extract-data.sh --type statistics --session SESSION_ID
scripts/extract-data.sh --type errors --session SESSION_ID
scripts/extract-data.sh --type tool-usage --session SESSION_ID

# Extract from all sessions (omit --session)
scripts/extract-data.sh --type statistics

# Limit output
scripts/extract-data.sh --type user-prompts --limit 10

# Different project
scripts/extract-data.sh --type metadata --project /path/to/project
```
Use when: Need specific data without loading entire log, generating metrics, identifying errors.
4. filter-sessions.sh
Find sessions matching criteria.
Filter options:
- `--since DATE`: Sessions modified since date ("2 days ago", "2025-10-20")
- `--until DATE`: Sessions modified until date
- `--branch NAME`: Sessions on specific git branch
- `--min-size SIZE`: Minimum file size ("1M", "500K")
- `--max-size SIZE`: Maximum file size
- `--min-lines N`: Minimum line count
- `--max-lines N`: Maximum line count
- `--has-errors`: Only sessions with failed tool calls
- `--keyword WORD`: Sessions containing keyword
Output formats: `list`, `paths`, `json`
```shell
# Recent sessions
scripts/filter-sessions.sh --since "2 days ago"

# Large sessions with errors
scripts/filter-sessions.sh --min-lines 500 --has-errors

# Sessions on main branch in last week
scripts/filter-sessions.sh --branch main --since "7 days ago"

# Sessions containing keyword
scripts/filter-sessions.sh --keyword "authentication"

# Get paths only (for piping)
scripts/filter-sessions.sh --since "1 day ago" --format paths
```
Use when: User requests analysis of recent sessions, finding sessions for specific feature/branch, identifying problematic sessions.
Working Process
Single Session Analysis
```shell
# 1. Verify session exists and get metadata
scripts/extract-data.sh --type metadata --session SESSION_ID

# 2. Get session statistics (to determine size)
scripts/extract-data.sh --type statistics --session SESSION_ID

# 3. Extract specific data as needed
scripts/extract-data.sh --type errors --session SESSION_ID
scripts/extract-data.sh --type tool-usage --session SESSION_ID
```
Multiple Session Analysis
```shell
# 1. Filter to find relevant sessions
scripts/filter-sessions.sh --since "7 days ago" --branch main

# 2. Extract data from all filtered sessions
scripts/extract-data.sh --type statistics

# 3. Or iterate through filtered subset
SESSIONS=$(scripts/filter-sessions.sh --has-errors --format paths)
for session in $SESSIONS; do
  SESSION_ID=$(basename "$session" .jsonl)
  scripts/extract-data.sh --type errors --session "$SESSION_ID"
done
```
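If session paths may contain spaces, word splitting on `$SESSIONS` is fragile. A space-safe sketch of the same iteration (`list_session_ids` is an illustrative helper name, not part of this skill's scripts):

```shell
# Read one path per line instead of relying on word splitting.
list_session_ids() {
  while IFS= read -r session_path; do
    # Strip directory and .jsonl suffix to recover the session ID;
    # in practice each ID would be passed to extract-data.sh.
    basename "$session_path" .jsonl
  done
}

printf '%s\n' /tmp/logs/abc123.jsonl "/tmp/my logs/def456.jsonl" | list_session_ids
# → abc123
# → def456
```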
Integration Pattern for Calling Skills
When another skill (like retrospecting) needs session data:
- Discovery: Use `list-sessions.sh` or `filter-sessions.sh` to find relevant sessions
- Size Check: Use `extract-data.sh --type statistics` to determine session complexity
- Targeted Extraction: Use `extract-data.sh` with specific types for the needed data
- Return Raw Data: Return extracted data to the caller for analysis
Example:
```shell
# Get latest session ID
LATEST=$(scripts/list-sessions.sh --format json --sort date | jq -r '.[0].sessionId')

# Check size before processing
STATS=$(scripts/extract-data.sh --type statistics --session "$LATEST")
LINE_COUNT=$(echo "$STATS" | grep "Total Lines:" | awk '{print $3}')

# Extract based on size
if [ "$LINE_COUNT" -lt 500 ]; then
  # Small session: extract detail
  scripts/extract-data.sh --type errors --session "$LATEST"
  scripts/extract-data.sh --type tool-usage --session "$LATEST"
else
  # Large session: summary only
  scripts/extract-data.sh --type statistics --session "$LATEST"
fi
```
Context Budget Management
CRITICAL: This skill is designed for context efficiency
Use Bash Processing, Not Read Tool
```shell
# GOOD: Extract via bash, stays in bash context
STATS=$(scripts/extract-data.sh --type statistics)
# Process $STATS in bash

# BAD: Reading full log files
Read ~/.claude/projects/-path/session.jsonl
# Loads entire file into context unnecessarily
```
Check Session Size Before Loading
Never load full session logs into context without checking size first.
```shell
# Always check statistics first
scripts/extract-data.sh --type statistics --session SESSION_ID
# Shows total lines, message counts, etc.

# Decision rules:
# - Small (<500 lines): Can extract detail safely
# - Medium (500-2000 lines): Use selective extraction
# - Large (>2000 lines): Statistics only, offer targeted deep-dives
```
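The decision rules above can be sketched as a small helper. Thresholds are the ones stated here; `extraction_mode` is an illustrative name, not one of this skill's scripts:

```shell
# Map a session's line count to an extraction strategy.
extraction_mode() {
  local lines="$1"
  if [ "$lines" -lt 500 ]; then
    echo "detail"       # small: full extraction is safe
  elif [ "$lines" -le 2000 ]; then
    echo "selective"    # medium: targeted extraction types only
  else
    echo "summary"      # large: statistics only, offer deep-dives
  fi
}

extraction_mode 450   # → detail
```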
Return Raw Data to Caller
This skill should:
- Execute bash scripts to extract data
- Return raw text output to calling skill
- Let calling skill manage context for analysis
- Avoid interpretation or analysis within this skill
Output Format
Return raw extracted data with minimal formatting:
```
# Statistics output
Session: abc123-def456-ghi789
Total Lines: 450
User Messages: 12
Assistant Messages: 23
Tool Calls: 45
Errors: 2

# Tool usage output
=== Tool Usage: abc123-def456-ghi789 ===
Read   15
Bash   12
Edit    8
Grep    5
Write   3
```
No analysis, no interpretation - just data extraction.
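Callers can pull individual fields out of this raw output with standard text tools. A minimal sketch, assuming the statistics format shown above (field names may differ if your script's output does):

```shell
# Sample raw statistics output, as in this document's example.
stats='Session: abc123-def456-ghi789
Total Lines: 450
User Messages: 12
Assistant Messages: 23
Tool Calls: 45
Errors: 2'

# Split on ": " and print the value for the matching field.
total_lines=$(printf '%s\n' "$stats" | awk -F': ' '/^Total Lines:/ {print $2}')
errors=$(printf '%s\n' "$stats" | awk -F': ' '/^Errors:/ {print $2}')
echo "lines=$total_lines errors=$errors"   # → lines=450 errors=2
```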
Error Handling
All scripts exit with non-zero status on errors and output to stderr.
Check exit status before processing:
```shell
if ! scripts/locate-logs.sh /path/to/project &>/dev/null; then
  # Handle: logs directory doesn't exist
  echo "Project has no session logs yet"
fi

if ! scripts/extract-data.sh --type metadata --session abc123 &>/dev/null; then
  # Handle: session doesn't exist
  echo "Session not found"
fi
```
Common error messages:
- `Error: Logs directory not found: ~/.claude/projects/-path`
- `Error: Session file not found: ~/.claude/projects/-path/session-id.jsonl`
- `Error: --type is required`
- `Error: jq is required but not installed. Install with: brew install jq`
Path Calculation
Claude Code stores sessions using this pattern:
`~/.claude/projects/{project-identifier}/{session-id}.jsonl`
Where `{project-identifier}` is calculated by replacing every `/` in the absolute working directory path with `-`:
```shell
# Example: /Users/user/project → -Users-user-project
PROJECT_ID=$(echo "${PWD}" | sed 's/\//-/g')
LOGS_DIR="${HOME}/.claude/projects/${PROJECT_ID}"
```
All scripts use `locate-logs.sh` internally for consistent path calculation.
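For illustration, the same substitution can be written with `tr`. `path_to_project_id` is a hypothetical helper name, and real consumers should call `locate-logs.sh` rather than recompute the path:

```shell
# Replace every / with - to derive the project identifier.
path_to_project_id() {
  printf '%s' "$1" | tr '/' '-'
}

path_to_project_id /Users/user/project; echo   # → -Users-user-project
```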
Anti-Patterns to Avoid
Don't:
- Load full session logs into context without checking size
- Parse JSONL manually (use `extract-data.sh` instead)
- Hardcode log paths (use `locate-logs.sh` instead)
- Analyze or interpret data (return raw data to the caller)
- Process large logs synchronously without user awareness
Do:
- Check session size with `--type statistics` before processing
- Use the appropriate extraction type for each specific need
- Filter sessions before extraction for efficiency
- Stream/pipe data when processing multiple sessions
- Return raw data for caller to analyze
Success Criteria
Effective use of this skill means:
- Efficient Discovery: Quickly find relevant sessions without manual searching
- Targeted Extraction: Get exactly the data needed, nothing more
- Context Preservation: Avoid loading unnecessary data into context
- Raw Data Focus: Return unprocessed data for caller to analyze
- Multi-Session Support: Handle analysis across timeframes or branches efficiently
Dependencies
Required:
- `bash` (v4.0+)
- `jq` (JSON parser)
Scripts check for `jq` and provide installation instructions if missing:
```
Error: jq is required but not installed. Install with: brew install jq
```
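A preflight check along these lines can be sketched as follows. This is an assumption about the pattern, not the scripts' actual code, and `require_cmd` is an illustrative name:

```shell
# Fail with an install hint if a required tool is missing from PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1 || {
    echo "Error: $1 is required but not installed." >&2
    return 1
  }
}

# Usage before running any extraction:
require_cmd jq || echo "install jq first (e.g. brew install jq)"
```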