Claude-skill-registry extracting-session-data

Locates, lists, filters, and extracts structured data from Claude Code native session logs. Supports both single and multiple session analysis.

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/extracting-session-data" ~/.claude/skills/majiayu000-claude-skill-registry-extracting-session-data && rm -rf "$T"
manifest: skills/data/extracting-session-data/SKILL.md
source content

Extracting Session Data Skill

Core Responsibility

Provide raw access to Claude Code session logs stored in

~/.claude/projects/{project-dir}/{session-id}.jsonl
.

Key Principle: This skill extracts data only - return raw data to calling skills for analysis. Do not analyze or interpret within this skill.

Available Scripts

All scripts located in

scripts/
subdirectory relative to this skill.

1. locate-logs.sh

Find log directory or specific session file path.

# Get logs directory for current working directory
scripts/locate-logs.sh

# Get logs directory for specific project
scripts/locate-logs.sh /path/to/project

# Get specific session log file path
scripts/locate-logs.sh /path/to/project abc123-session-id

Use when: Building dynamic paths, verifying logs exist before processing.

2. list-sessions.sh

Enumerate all sessions with metadata (ID, size, lines, date, branch).

# List all sessions (table format)
scripts/list-sessions.sh

# JSON output
scripts/list-sessions.sh --format json

# Sort by size or lines
scripts/list-sessions.sh --sort size
scripts/list-sessions.sh --sort lines

# Specific project
scripts/list-sessions.sh /path/to/project

Output formats:

table
,
json
,
csv
Sort options:
date
,
size
,
lines

Use when: Starting retrospective, showing available sessions to user, checking for recent sessions.

3. extract-data.sh

Parse JSONL logs and extract specific data types.

Available extraction types:

  • metadata
    - Session info (ID, timestamps, branch, working dir)
  • user-prompts
    - All user messages
  • tool-usage
    - Tool call statistics
  • errors
    - Failed tool calls with timestamps
  • thinking
    - Thinking blocks (if extended thinking enabled)
  • text-responses
    - Assistant text responses only
  • statistics
    - Session metrics (message counts, tool calls, errors)
  • all
    - Combined extraction
# Extract from specific session
scripts/extract-data.sh --type statistics --session SESSION_ID
scripts/extract-data.sh --type errors --session SESSION_ID
scripts/extract-data.sh --type tool-usage --session SESSION_ID

# Extract from all sessions (omit --session)
scripts/extract-data.sh --type statistics

# Limit output
scripts/extract-data.sh --type user-prompts --limit 10

# Different project
scripts/extract-data.sh --type metadata --project /path/to/project

Use when: Need specific data without loading entire log, generating metrics, identifying errors.

4. filter-sessions.sh

Find sessions matching criteria.

Filter options:

  • --since DATE
    - Sessions modified since date ("2 days ago", "2025-10-20")
  • --until DATE
    - Sessions modified until date
  • --branch NAME
    - Sessions on specific git branch
  • --min-size SIZE
    - Minimum file size ("1M", "500K")
  • --max-size SIZE
    - Maximum file size
  • --min-lines N
    - Minimum line count
  • --max-lines N
    - Maximum line count
  • --has-errors
    - Only sessions with failed tool calls
  • --keyword WORD
    - Sessions containing keyword

Output formats:

list
,
paths
,
json

# Recent sessions
scripts/filter-sessions.sh --since "2 days ago"

# Large sessions with errors
scripts/filter-sessions.sh --min-lines 500 --has-errors

# Sessions on main branch in last week
scripts/filter-sessions.sh --branch main --since "7 days ago"

# Sessions containing keyword
scripts/filter-sessions.sh --keyword "authentication"

# Get paths only (for piping)
scripts/filter-sessions.sh --since "1 day ago" --format paths

Use when: User requests analysis of recent sessions, finding sessions for specific feature/branch, identifying problematic sessions.

Working Process

Single Session Analysis

# 1. Verify session exists and get metadata
scripts/extract-data.sh --type metadata --session SESSION_ID

# 2. Get session statistics (to determine size)
scripts/extract-data.sh --type statistics --session SESSION_ID

# 3. Extract specific data as needed
scripts/extract-data.sh --type errors --session SESSION_ID
scripts/extract-data.sh --type tool-usage --session SESSION_ID

Multiple Session Analysis

# 1. Filter to find relevant sessions
scripts/filter-sessions.sh --since "7 days ago" --branch main

# 2. Extract data from all filtered sessions
scripts/extract-data.sh --type statistics

# 3. Or iterate through filtered subset
SESSIONS=$(scripts/filter-sessions.sh --has-errors --format paths)
for session in $SESSIONS; do
    SESSION_ID=$(basename "$session" .jsonl)
    scripts/extract-data.sh --type errors --session $SESSION_ID
done

Integration Pattern for Calling Skills

When another skill (like retrospecting) needs session data:

  1. Discovery: Use
    list-sessions.sh
    or
    filter-sessions.sh
    to find relevant sessions
  2. Size Check: Use
    extract-data.sh --type statistics
    to determine session complexity
  3. Targeted Extraction: Use
    extract-data.sh
    with specific types for needed data
  4. Return Raw Data: Return extracted data to caller for analysis

Example:

# Get latest session ID
LATEST=$(scripts/list-sessions.sh --format json --sort date | jq -r '.[0].sessionId')

# Check size before processing
STATS=$(scripts/extract-data.sh --type statistics --session $LATEST)
LINE_COUNT=$(echo "$STATS" | grep "Total Lines:" | awk '{print $3}')

# Extract based on size
if [ "$LINE_COUNT" -lt 500 ]; then
    # Small session: extract detail
    scripts/extract-data.sh --type errors --session $LATEST
    scripts/extract-data.sh --type tool-usage --session $LATEST
else
    # Large session: summary only
    scripts/extract-data.sh --type statistics --session $LATEST
fi

Context Budget Management

CRITICAL: This skill is designed for context efficiency

Use Bash Processing, Not Read Tool

# GOOD: Extract via bash, stays in bash context
STATS=$(scripts/extract-data.sh --type statistics)
# Process $STATS in bash

# BAD: Reading full log files
Read ~/.claude/projects/-path/session.jsonl
# Loads entire file into context unnecessarily

Check Session Size Before Loading

Never load full session logs into context without checking size first.

# Always check statistics first
scripts/extract-data.sh --type statistics --session SESSION_ID
# Shows total lines, message counts, etc.

# Decision rules:
# - Small (<500 lines): Can extract detail safely
# - Medium (500-2000 lines): Use selective extraction
# - Large (>2000 lines): Statistics only, offer targeted deep-dives

Return Raw Data to Caller

This skill should:

  • Execute bash scripts to extract data
  • Return raw text output to calling skill
  • Let calling skill manage context for analysis
  • Avoid interpretation or analysis within this skill

Output Format

Return raw extracted data with minimal formatting:

# Statistics output
Session: abc123-def456-ghi789
  Total Lines: 450
  User Messages: 12
  Assistant Messages: 23
  Tool Calls: 45
  Errors: 2

# Tool usage output
=== Tool Usage: abc123-def456-ghi789 ===
Read                          15
Bash                          12
Edit                          8
Grep                          5
Write                         3

No analysis, no interpretation - just data extraction.

Error Handling

All scripts exit with non-zero status on errors and output to stderr.

Check exit status before processing:

if ! scripts/locate-logs.sh /path/to/project &>/dev/null; then
    # Handle: logs directory doesn't exist
    echo "Project has no session logs yet"
fi

if ! scripts/extract-data.sh --type metadata --session abc123 &>/dev/null; then
    # Handle: session doesn't exist
    echo "Session not found"
fi

Common error messages:

  • Error: Logs directory not found: ~/.claude/projects/-path
  • Error: Session file not found: ~/.claude/projects/-path/session-id.jsonl
  • Error: --type is required
  • Error: jq is required but not installed. Install with: brew install jq

Path Calculation

Claude Code stores sessions using this pattern:

~/.claude/projects/{project-identifier}/{session-id}.jsonl

Where

{project-identifier}
is calculated by replacing all
/
with
-
in the absolute working directory path:

# Example: /Users/user/project → -Users-user-project
PROJECT_ID=$(echo "${PWD}" | sed 's/\//\-/g')
LOGS_DIR="${HOME}/.claude/projects/${PROJECT_ID}"

All scripts use

locate-logs.sh
internally for consistent path calculation.

Anti-Patterns to Avoid

Don't:

  • Load full session logs into context without checking size
  • Parse JSONL manually - use
    extract-data.sh
  • Hardcode log paths - use
    locate-logs.sh
  • Analyze or interpret data - return raw data to caller
  • Process large logs synchronously without user awareness

Do:

  • Check session size with
    --type statistics
    before processing
  • Use appropriate extraction type for specific needs
  • Filter sessions before extraction for efficiency
  • Stream/pipe data when processing multiple sessions
  • Return raw data for caller to analyze

Success Criteria

Effective use of this skill means:

  1. Efficient Discovery: Quickly find relevant sessions without manual searching
  2. Targeted Extraction: Get exactly the data needed, nothing more
  3. Context Preservation: Avoid loading unnecessary data into context
  4. Raw Data Focus: Return unprocessed data for caller to analyze
  5. Multi-Session Support: Handle analysis across timeframes or branches efficiently

Dependencies

Required:

  • bash
    (v4.0+)
  • jq
    (JSON parser)

Scripts check for

jq
and provide installation instructions if missing:

Error: jq is required but not installed. Install with: brew install jq