Claude-skill-registry alma-scraper

Intelligent scraper for Australian youth justice sources. Discovers, extracts, and learns from government, Indigenous, research, and media sources.

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/alma-scraper" ~/.claude/skills/majiayu000-claude-skill-registry-alma-scraper && rm -rf "$T"
manifest: skills/data/alma-scraper/SKILL.md
source content

ALMA Intelligent Scraper

When to Use

  • Finding new youth justice information
  • Updating ALMA intelligence
  • Discovering new sources
  • Analyzing coverage gaps
  • Checking what's new in youth justice

Commands

Command        Purpose                           Duration
quick          Top 10 high-value sources         5 min
deep           All 50+ sources with discovery    30-60 min
discover       Follow discovered links           Variable
source "QLD"   Deep dive specific jurisdiction   15 min
gaps           Show coverage gaps                2 min
status         Current knowledge state           Instant

Learning Cycle

SCRAPE → EXTRACT → EVALUATE → LEARN → STORE
         (Claude)   (Quality)  (Patterns)

Quality Signals

Signal                             Weight
Relevance (AU youth justice?)      30%
Novelty (new info?)                25%
Specificity (concrete details?)    20%
Evidence (research backed?)        15%
Actionability (useful?)            10%
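
The five weights sum to 100%, so the overall quality of an extracted item can be read as a weighted average of its signals. Here is a minimal sketch, assuming each signal is scored on a 0 to 1 scale (the source does not state the scale). The function name `quality_score` and the dictionary keys are illustrative, not taken from the skill's implementation.

```python
# Weights copied from the Quality Signals table; they sum to 1.0.
# The key names are hypothetical labels chosen for this sketch.
QUALITY_WEIGHTS = {
    "relevance": 0.30,
    "novelty": 0.25,
    "specificity": 0.20,
    "evidence": 0.15,
    "actionability": 0.10,
}


def quality_score(signals: dict) -> float:
    """Return the weighted sum of the five signals.

    Assumption: every signal is a float in [0, 1].
    """
    missing = QUALITY_WEIGHTS.keys() - signals.keys()
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    for name in QUALITY_WEIGHTS:
        if not 0.0 <= signals[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {signals[name]}")
    return sum(QUALITY_WEIGHTS[name] * signals[name] for name in QUALITY_WEIGHTS)


# Example: a highly relevant but only moderately novel item.
example = {
    "relevance": 1.0,
    "novelty": 0.5,
    "specificity": 0.5,
    "evidence": 0.0,
    "actionability": 1.0,
}
print(round(quality_score(example), 3))
```

Because relevance carries the largest weight, an item that is not about Australian youth justice cannot score above 0.70 under this reading, however strong its other signals are.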

Priority Formula

priority = (quality × 0.4) + (freshness_need × 0.3) + (coverage_gap × 0.3)
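
The formula can be written directly as a function. This is a sketch under the assumption that all three inputs are normalised to the range 0 to 1, so the result also falls in that range. The source does not define how `freshness_need` or `coverage_gap` are measured, and the source names and values in the usage example are hypothetical.

```python
def priority(quality: float, freshness_need: float, coverage_gap: float) -> float:
    """Combine the three factors using the weights from the Priority Formula.

    Assumption: each argument is already scaled to [0, 1].
    """
    for name, value in (
        ("quality", quality),
        ("freshness_need", freshness_need),
        ("coverage_gap", coverage_gap),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return quality * 0.4 + freshness_need * 0.3 + coverage_gap * 0.3


# Hypothetical sources: (name, quality, freshness_need, coverage_gap).
candidates = [
    ("source_a", 0.9, 0.2, 0.1),
    ("source_b", 0.6, 0.8, 0.9),
]

# Sort so the source with the highest priority is scraped first.
ranked = sorted(candidates, key=lambda c: priority(c[1], c[2], c[3]), reverse=True)
for name, q, f, g in ranked:
    print(name, round(priority(q, f, g), 2))
```

In this example the lower-quality `source_b` ranks first, because stale data and a large coverage gap together outweigh the quality advantage of `source_a`.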

Sacred Boundaries

  • Never scrape: Private info, court records, social media, paywalled
  • Always mark: Community Controlled, Indigenous orgs, cultural knowledge
  • Always check: Consent level, cultural authority, data sovereignty
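
One way to enforce these boundaries is a gate that runs before any source is fetched. The sketch below is illustrative only: the `Source` class, the category labels, and the `review` function are hypothetical names, and it assumes the three "always check" items apply to every source (the skill text does not say whether they apply to all sources or only to marked ones).

```python
from dataclasses import dataclass, field

# Category labels are hypothetical encodings of the three rules above.
NEVER_SCRAPE = {"private_info", "court_records", "social_media", "paywalled"}
ALWAYS_MARK = {"community_controlled", "indigenous_org", "cultural_knowledge"}
REQUIRED_CHECKS = ("consent_level", "cultural_authority", "data_sovereignty")


@dataclass
class Source:
    url: str
    categories: set = field(default_factory=set)
    checks_done: set = field(default_factory=set)


def review(source: Source) -> dict:
    """Decide whether a source may be scraped and which marks it needs."""
    blocked_by = sorted(source.categories & NEVER_SCRAPE)
    marks = sorted(source.categories & ALWAYS_MARK)
    pending = [c for c in REQUIRED_CHECKS if c not in source.checks_done]
    return {
        # Allowed only if no forbidden category applies and all checks are done.
        "allowed": not blocked_by and not pending,
        "blocked_by": blocked_by,
        "marks": marks,
        "pending_checks": pending,
    }


# Example: an Indigenous organisation's site with all checks completed.
ok = Source(
    url="https://example.org/report",
    categories={"indigenous_org"},
    checks_done=set(REQUIRED_CHECKS),
)
print(review(ok))
```

Returning the reasons alongside the decision keeps a refusal auditable: a caller can log which boundary blocked a source instead of silently skipping it.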

File References

Need                  Reference
Database schema       references/database-schema.md
Extraction patterns   references/extraction-patterns.md
Coverage tracking     references/coverage-tracking.md
Implementation code   references/implementation.md