install
source · Clone the upstream repo
git clone https://github.com/MacPhobos/research-mind
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/MacPhobos/research-mind "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/universal-data-sec-edgar-pipeline" ~/.claude/skills/macphobos-research-mind-universal-data-sec-edgar-pipeline && rm -rf "$T"
manifest: .claude/skills/universal-data-sec-edgar-pipeline/skill.md
SEC EDGAR Pipeline
Overview
This pipeline is centered on edgar-analyzer and the EDGAR data sources. The core loop is: configure credentials, create a project with examples, analyze patterns, generate code, run extraction, and export reports.
Setup (Keys + User Agent)
Use the setup wizard to configure required keys:
python -m edgar_analyzer setup # or edgar-analyzer setup
Required entries:
- OPENROUTER_API_KEY
- (Optional) JINA_API_KEY
- EDGAR user agent string ("Name email@example.com")
End-to-End CLI Workflow
# 1. Create project
edgar-analyzer project create my_project --template minimal

# 2. Add examples + project.yaml
# projects/my_project/examples/*.json

# 3. Analyze examples
edgar-analyzer analyze-project projects/my_project

# 4. Generate extraction code
edgar-analyzer generate-code projects/my_project

# 5. Run extraction
edgar-analyzer run-extraction projects/my_project --output-format csv

Outputs land in projects/<name>/output/.
EDGAR-Specific Conventions
- CIK values are 10-digit, zero-padded (e.g., 0000320193).
- Rate limit: SEC API allows 10 requests/sec. Scripts use ~0.11s delays.
- User agent is mandatory; include name + email.
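The three conventions above can be combined in a small helper. This is a minimal sketch using only the Python standard library, not code from the repository: pad_cik, Throttle, build_request, and fetch are hypothetical names, and USER_AGENT is a placeholder to replace with your own name and email.

```python
import time
import urllib.request

# Assumption: placeholder identity; substitute your own name and email.
USER_AGENT = "Name email@example.com"
# ~0.11s between requests stays just under the 10 requests/sec limit.
MIN_INTERVAL = 0.11


def pad_cik(cik):
    """Return the CIK as a 10-digit, zero-padded string."""
    return str(int(cik)).zfill(10)


class Throttle:
    """Sleep as needed so consecutive calls are at least `interval` seconds apart."""

    def __init__(self, interval=MIN_INTERVAL):
        self.interval = interval
        self._last = None

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


def build_request(url):
    """Attach the mandatory User-Agent header to a request."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch(url, throttle):
    """Throttle first, then fetch the URL with the user agent set."""
    throttle.wait()
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        return resp.read()
```

Sharing one Throttle instance across all calls keeps the whole script under the limit, for example fetch(f"https://data.sec.gov/submissions/CIK{pad_cik(320193)}.json", throttle).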
Scripted Example (Apple DEF 14A)
edgar/scripts/fetch_apple_def14a.py shows the direct flow:
- Fetch latest DEF 14A metadata
- Download HTML
- Parse Summary Compensation Table (SCT)
- Save raw HTML + extracted JSON + ground truth
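The first two steps can be sketched as follows. This is not the content of fetch_apple_def14a.py; it is an illustration that assumes the input is the parsed JSON from the SEC submissions endpoint (data.sec.gov/submissions/CIK<10-digit CIK>.json), whose filings.recent section holds parallel arrays. The function names latest_filing and document_url are hypothetical.

```python
def latest_filing(submissions, form_type="DEF 14A"):
    """Return metadata for the most recently filed document of `form_type`, or None."""
    recent = submissions["filings"]["recent"]
    matches = [
        {"accession": acc, "document": doc, "filed": filed}
        for form, acc, doc, filed in zip(
            recent["form"],
            recent["accessionNumber"],
            recent["primaryDocument"],
            recent["filingDate"],
        )
        if form == form_type
    ]
    # Filing dates are ISO strings (YYYY-MM-DD), so string order is date order.
    return max(matches, key=lambda f: f["filed"], default=None)


def document_url(cik, filing):
    """Build the EDGAR archive URL for the filing's primary document."""
    # Archive paths use the CIK without leading zeros and the accession
    # number without dashes.
    accession = filing["accession"].replace("-", "")
    return (
        "https://www.sec.gov/Archives/edgar/data/"
        f"{int(cik)}/{accession}/{filing['document']}"
    )
```

The URL returned by document_url is what the download step would fetch; parsing the SCT out of the HTML is specific to each filing's layout and is not covered here.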
Recipe-Driven Extraction
edgar/recipes/sct_extraction/config.yaml defines a multi-step pipeline:
- Fetch DEF 14A filings by company list
- Extract SCT tables with SCTAdapter
- Validate with sct_validator
- Write results to output/sct
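As an illustration of what a validation step might check, the sketch below tests that a row's Total matches the sum of its component columns, which is how the Summary Compensation Table is defined. It does not reproduce sct_validator: the field names, the tolerance, and the validate_row function are all assumptions.

```python
# Hypothetical field names for the SCT component columns.
COMPONENTS = (
    "salary",
    "bonus",
    "stock_awards",
    "option_awards",
    "non_equity_incentive",
    "pension_change",
    "other_compensation",
)


def validate_row(row, tolerance=1):
    """Return a list of problems found in one extracted SCT row (empty if none)."""
    problems = []
    if not row.get("name"):
        problems.append("missing name")
    # Missing components count as zero; amounts are whole dollars.
    parts = sum(row.get(key, 0) for key in COMPONENTS)
    total = row.get("total", 0)
    if abs(parts - total) > tolerance:
        problems.append(f"total {total} does not match component sum {parts}")
    return problems
```

A small tolerance allows for rounding in the filing itself; rows that return a non-empty list are candidates for manual review rather than automatic rejection.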
Report Generation
edgar/scripts/create_csv_reports.py converts JSON results into:
- executive_compensation_<timestamp>.csv
- top_25_executives_<timestamp>.csv
- company_summary_<timestamp>.csv
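A JSON-to-CSV conversion of this kind can be sketched with the standard library. This is not the code in create_csv_reports.py: it assumes the results file is a JSON list of flat objects, that each record has a numeric "total" field, and that the timestamp format is YYYYMMDD_HHMMSS. All function names are hypothetical.

```python
import csv
import json
from datetime import datetime
from pathlib import Path


def load_records(path):
    """Load extraction results, assumed to be a JSON list of flat objects."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def top_executives(records, n=25):
    """Return the n records with the highest 'total' (assumed field name)."""
    return sorted(records, key=lambda r: r.get("total", 0), reverse=True)[:n]


def write_report(records, out_dir, prefix="executive_compensation", now=None):
    """Write records to <out_dir>/<prefix>_<timestamp>.csv and return the path."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{prefix}_{stamp}.csv"
    # Column order follows the first appearance of each key across records.
    fieldnames = list(dict.fromkeys(key for rec in records for key in rec))
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return path
```

Calling write_report(top_executives(records), out_dir, prefix="top_25_executives") would produce the second report in the list above under the same assumptions.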
Troubleshooting
- No filings found: confirm CIK formatting and filing type (DEF 14A vs DEF 14A/A).
- API errors: slow down requests and confirm user-agent is set.
- Extraction errors: regenerate code or use manual ground truth in POC scripts.
Related Skills
- universal/data/reporting-pipelines
- toolchains/python/testing/pytest