Claude-skill-registry alma-scraper
Intelligent scraper for Australian youth justice sources. Discovers, extracts, and learns from government, Indigenous, research, and media sources.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/alma-scraper" ~/.claude/skills/majiayu000-claude-skill-registry-alma-scraper && rm -rf "$T"
manifest:
skills/data/alma-scraper/SKILL.mdsource content
ALMA Intelligent Scraper
When to Use
- Finding new youth justice information
- Updating ALMA intelligence
- Discovering new sources
- Analyzing coverage gaps
- Checking what's new in youth justice
Commands
| Command | Purpose | Duration |
|---|---|---|
| Top 10 high-value sources | 5 min |
| All 50+ sources with discovery | 30-60 min |
| Follow discovered links | Variable |
| Deep dive specific jurisdiction | 15 min |
| Show coverage gaps | 2 min |
| Current knowledge state | Instant |
Learning Cycle
SCRAPE → EXTRACT → EVALUATE → LEARN → STORE (Claude) (Quality) (Patterns)
Quality Signals
| Signal | Weight |
|---|---|
| Relevance (AU youth justice?) | 30% |
| Novelty (new info?) | 25% |
| Specificity (concrete details?) | 20% |
| Evidence (research backed?) | 15% |
| Actionability (useful?) | 10% |
Priority Formula
priority = (quality × 0.4) + (freshness_need × 0.3) + (coverage_gap × 0.3)
Sacred Boundaries
Never scrape: Private info, court records, social media, paywalled Always mark: Community Controlled, Indigenous orgs, cultural knowledge Always check: Consent level, cultural authority, data sovereignty
File References
| Need | Reference |
|---|---|
| Database schema | |
| Extraction patterns | |
| Coverage tracking | |
| Implementation code | |