Claude-skill-registry debug-fetcher
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/debug-fetcher" ~/.claude/skills/majiayu000-claude-skill-registry-debug-fetcher && rm -rf "$T"
manifest: skills/data/debug-fetcher/SKILL.md
Debug-Fetcher Skill
Automated fetch failure handling that:
- Queries /memory first - applies learned strategies before trying defaults
- Exhausts all strategies - direct, playwright, wayback, brave, jina, proxy, UA rotation
- Stores successes - saves working strategies to /memory for future runs
- Collaborates with humans - uses /interview when all automated strategies fail
Quick Start

    # Fetch single URL with failure handling
    ./run.sh fetch https://example.com

    # Fetch batch with failure handling
    ./run.sh fetch-batch urls.txt

    # Check what was learned about a domain
    ./run.sh recall example.com

    # Export all learned strategies
    ./run.sh export-learnings

How It Works

    URL Request
         │
         ▼
    ┌──────────────────────────┐
    │ 1. Query /memory         │
    │    "What works for this  │
    │     domain?"             │
    └──────────────────────────┘
         │
         ▼
    ┌──────────────────────────┐
    │ 2. Try learned strategy  │
    │    (if exists)           │
    └──────────────────────────┘
         │
         ▼ (fail or no learned strategy)
    ┌──────────────────────────┐
    │ 3. Exhaust strategies:   │
    │    - direct fetch        │
    │    - playwright          │
    │    - wayback machine     │
    │    - brave alternates    │
    │    - jina reader         │
    │    - proxy rotation      │
    │    - user-agent rotation │
    └──────────────────────────┘
         │
         ▼ (all fail)
    ┌──────────────────────────┐
    │ 4. Launch /interview     │
    │    Ask human for help:   │
    │    - Credentials?        │
    │    - Mirror URL?         │
    │    - Manual download?    │
    │    - Skip this URL?      │
    └──────────────────────────┘
         │
         ▼
    ┌──────────────────────────┐
    │ 5. Store to /memory      │
    │    - Successful strategy │
    │    - Domain patterns     │
    │    - Human-provided info │
    └──────────────────────────┘

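
The flow above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual strategy engine: the function name `fetch_with_fallback`, the `Fetcher` type, the dict standing in for /memory, and the strategy keys are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import urlparse

# Hypothetical type: a fetcher takes a URL and returns page text, or None on failure.
Fetcher = Callable[[str], Optional[str]]

# Order follows step 3 of the flow; the key names themselves are assumed.
DEFAULT_ORDER = ["direct", "playwright", "wayback", "brave", "jina", "proxy", "ua_rotation"]


def fetch_with_fallback(
    url: str,
    fetchers: Dict[str, Fetcher],
    memory: Dict[str, str],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (content, strategy_name), or (None, None) if every strategy fails."""
    domain = urlparse(url).netloc
    learned = memory.get(domain)  # step 1: ask memory what worked before

    # Steps 2-3: learned strategy first, then the remaining defaults.
    order = ([learned] if learned else []) + [s for s in DEFAULT_ORDER if s != learned]
    for name in order:
        fetcher = fetchers.get(name)
        if fetcher is None:
            continue  # strategy not available in this sketch
        content = fetcher(url)
        if content is not None:
            memory[domain] = name  # step 5: remember the winner
            return content, name
    return None, None  # step 4: the caller would escalate to /interview
```

Keeping the learned strategy at the front of the list means a domain that once needed a heavier strategy skips the cheaper attempts that are already known to fail.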
Memory Schema
Each learned strategy stores:
| Field | Description |
|---|---|
| | Target domain (e.g., "nytimes.com") |
| | URL path pattern (e.g., "/article/*") |
| | What worked (e.g., "playwright") |
| | Custom headers that helped |
| | How long the fetch took |
| | Historical success rate |
| | How many times this domain failed |
| | Timestamp of last use |
| | When strategy was first learned |
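
As a rough illustration of how such a record might look in code, here is a dataclass sketch. The class name `FetchStrategy` appears in the file listing, but every field name below, the `matches` helper, and the glob-style path matching are assumptions made for this example and may not match the real schema.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch
from typing import Dict
from urllib.parse import urlparse


@dataclass
class FetchStrategy:
    # All field names are hypothetical; only the descriptions come from the table.
    domain: str                     # target domain, e.g. "nytimes.com"
    path_pattern: str               # URL path pattern, e.g. "/article/*"
    strategy: str                   # what worked, e.g. "playwright"
    headers: Dict[str, str] = field(default_factory=dict)  # custom headers that helped
    timing_ms: float = 0.0          # how long the fetch took (unit assumed)
    success_rate: float = 0.0       # historical success rate, 0.0 to 1.0 (scale assumed)
    failure_count: int = 0          # how many times this domain failed
    last_used: str = ""             # timestamp of last use (string format assumed)
    created_at: str = ""            # when the strategy was first learned

    def matches(self, url: str) -> bool:
        """True if the URL's host equals the domain and its path fits the pattern."""
        parts = urlparse(url)
        return parts.netloc == self.domain and fnmatch(parts.path, self.path_pattern)
```

With this sketch, a strategy for `nytimes.com` and `/article/*` matches `https://nytimes.com/article/2025/story` but not URLs on other hosts or under other paths.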
Commands
| Command | Description |
|---|---|
| `fetch <url>` | Fetch single URL with failure handling |
| `fetch-batch <file>` | Fetch list of URLs with failure handling |
| `recall <domain>` | Show learned strategies for domain |
| `export-learnings` | Export all strategies to JSON |
Environment Variables
| Variable | Description |
|---|---|
| | Memory scope for storing strategies (default: "fetcher_strategies") |
| | Max retries per strategy (default: 2) |
| | Min failures before triggering interview (default: 3) |
Integration with Fetcher
Debug-fetcher wraps the standard fetcher skill and adds failure handling capabilities. All fetcher environment variables (BRAVE_API_KEY, FETCHER_EMIT_MARKDOWN, etc.) are respected.
Examples
Learning from Failures
After fetching a batch of URLs, debug-fetcher stores successful strategies:

    # Fetch a batch
    ./run.sh fetch-batch urls.txt --output results.jsonl

    # View what was learned
    ./run.sh recall attack.mitre.org
    # Output:
    # Domain: attack.mitre.org
    # Strategy: playwright
    # Success rate: 95%
    # Last used: 2025-01-30

    # Next time, playwright will be tried first for attack.mitre.org
    ./run.sh fetch https://attack.mitre.org/techniques/T1059

Human-in-the-Loop Interview
When all strategies fail, an interview is generated:

    # Fetch batch with failures
    ./run.sh fetch-batch difficult_urls.txt

    # Interview generated at: /tmp/interview_abc123.json
    # Run: ./agents/skills/interview/run.sh /tmp/interview_abc123.json

    # Example interview questions:
    # - "Failed 5 URLs from nytimes.com. Do you have credentials?"
    # - "archive.org not working. Try a mirror URL?"

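
One way such questions could be derived is by grouping failed URLs by domain and asking only about domains that cross a failure threshold. The sketch below assumes that the threshold applies per domain and defaults to 3; the function name `interview_questions` and its signature are hypothetical, and the real interview generator may work differently.

```python
from collections import Counter
from typing import Iterable, List
from urllib.parse import urlparse


def interview_questions(failed_urls: Iterable[str], min_failures: int = 3) -> List[str]:
    """Build one question per domain with at least `min_failures` failed URLs."""
    counts = Counter(urlparse(u).netloc for u in failed_urls)
    return [
        f"Failed {n} URLs from {domain}. Do you have credentials?"
        for domain, n in counts.most_common()  # most-failing domains first
        if n >= min_failures
    ]
```

Domains with only one or two failures are left out, which keeps the interview short when failures are scattered rather than concentrated.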
YouTube URL Handling
YouTube URLs are automatically detected and handled via the /ingest-youtube skill:

    # YouTube URLs use transcript extraction (URL quoted so the shell does not glob on "?")
    ./run.sh fetch "https://www.youtube.com/watch?v=abc123"
    # Uses: /ingest-youtube skill for transcript extraction
    # Falls back to other strategies if transcript unavailable

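
Detection of this kind can be done with the standard library alone. The following is a sketch under assumptions: the host list, the function name `youtube_video_id`, and the decision to handle only `/watch?v=` and `youtu.be/` links are illustrative choices, not the skill's documented behavior.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed set of hosts to treat as YouTube; the real skill may recognize more.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def youtube_video_id(url: str) -> Optional[str]:
    """Return the video ID for a recognized YouTube URL, else None."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host not in YOUTUBE_HOSTS:
        return None
    if host == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parts.path.lstrip("/") or None
    if parts.path == "/watch":
        # Standard links carry the ID in the "v" query parameter.
        ids = parse_qs(parts.query).get("v")
        return ids[0] if ids else None
    return None
```

A caller could route the URL to transcript extraction when this returns an ID and fall through to the regular strategies when it returns `None`.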
Batch Analysis
After a batch run, analyze patterns:

    from debug_fetcher.batch_analyzer import analyze_batch, get_failure_summary

    # Get summary
    summary = get_failure_summary(results)
    # {
    #   "total": 1000,
    #   "success": 850,
    #   "failed": 150,
    #   "success_rate": "85.0%",
    #   "top_failing_domains": [
    #     {"domain": "nytimes.com", "count": 45},
    #     {"domain": "wsj.com", "count": 30}
    #   ],
    #   "patterns": [
    #     "All 45 URLs from nytimes.com returned HTTP 403",
    #     "High failure rate: 50% of failures are paywalled sites"
    #   ]
    # }

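
To make the shape of that summary concrete, here is a standalone sketch that computes the count-based fields. It is not the package's `get_failure_summary`: the function name `summarize_failures`, the assumed record shape (`{"url": ..., "ok": ...}`), and the `top_n` parameter are all hypothetical, and the `patterns` field is omitted.

```python
from collections import Counter
from typing import Any, Dict, List
from urllib.parse import urlparse


def summarize_failures(results: List[Dict[str, Any]], top_n: int = 2) -> Dict[str, Any]:
    """Summarize a batch; each record is assumed to look like {"url": str, "ok": bool}."""
    total = len(results)
    failed = [r for r in results if not r["ok"]]
    # Count failures per domain so the worst offenders can be listed first.
    by_domain = Counter(urlparse(r["url"]).netloc for r in failed)
    rate = 100.0 * (total - len(failed)) / total if total else 0.0
    return {
        "total": total,
        "success": total - len(failed),
        "failed": len(failed),
        "success_rate": f"{rate:.1f}%",
        "top_failing_domains": [
            {"domain": d, "count": c} for d, c in by_domain.most_common(top_n)
        ],
    }
```

Grouping failures by domain is what makes domain-level patterns visible, such as a single site accounting for most of the failed URLs.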
Recovery Actions
When a human provides help via /interview:
| Action Type | Description | Example |
|---|---|---|
| | Login credentials provided | username/password for site |
| | Alternative URL to try | archive.org mirror |
| | Human downloaded file manually | Path to local PDF |
| | URL not needed | "Not critical" |
| | Try again later | Server was down |
| | Specific approach suggested | "Use proxy" |
Files

    .agents/skills/debug-fetcher/
    ├── SKILL.md                    # This file
    ├── run.sh                      # Entry point
    ├── pyproject.toml              # Dependencies
    └── debug_fetcher/              # Python package
        ├── __init__.py
        ├── cli.py                  # CLI commands
        ├── memory_schema.py        # FetchStrategy dataclass
        ├── memory_bridge.py        # Recall/learn from /memory
        ├── strategy_engine.py      # Strategy exhaustion loop
        ├── batch_analyzer.py       # Analyze batch failures
        ├── interview_generator.py  # Generate /interview JSON
        ├── interview_processor.py  # Process interview responses
        ├── recovery_executor.py    # Execute recovery actions
        └── pdf_bridge.py           # Cross-skill integration with debug-pdf

Companion Skill: debug-pdf
debug-fetcher and debug-pdf work together in the pipeline:

    URL → debug-fetcher → /fetcher → /extractor → debug-pdf
               ↓                                      ↓
           fetch fail                          extraction fail
               ↓                                      ↓
         retry/recover                       analyze PDF issues
               ↓                                      ↓
            /memory                                /memory

Shared failure patterns:
| Pattern | debug-fetcher | debug-pdf |
|---|---|---|
| | HTTP 401/403 | N/A |
| | HTTP 403 | N/A |
| | Soft paywall | N/A |
| | N/A | Encrypted PDF |
| | N/A | No text layer |
| | Wayback wrapper | Wayback wrapper |
Cross-skill notifications:
- When debug-fetcher successfully fetches a PDF but detects issues (password protected, scanned), it notifies debug-pdf via agent-inbox
- When debug-fetcher fails to fetch a PDF URL, it notifies debug-pdf for tracking
Related Skills
- /memory - Stores learned fetch strategies
- /interview - Human collaboration for unrecoverable URLs
- /ingest-youtube - YouTube transcript extraction
- /fetcher - Core URL fetching functionality
- /extractor - Content extraction from fetched documents
- /debug-pdf - Companion skill for PDF extraction failures