Gbrain media-ingest
install
source · Clone the upstream repo
git clone https://github.com/garrytan/gbrain
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/garrytan/gbrain "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/media-ingest" ~/.claude/skills/garrytan-gbrain-media-ingest && rm -rf "$T"
manifest: skills/media-ingest/SKILL.md
Media Ingest Skill
Ingest video, audio, PDF, book, screenshot, and GitHub repo content into the brain.
Filing rule: Read skills/_brain-filing-rules.md before creating any new page.
Contract
This skill guarantees:
- Every ingested media item has a brain page with analysis (not just a transcript dump)
- Transcripts (video/audio) saved in raw and human-readable formats
- Entity extraction: every person and company mentioned gets back-linked
- Raw source files preserved via gbrain files upload-raw
- Filing by primary subject, not by media format
Iron Law: Back-Linking (MANDATORY)
Every mention of a person or company with a brain page MUST create a back-link.
Phases
Phase 1: Identify format and fetch
| Format | Action |
|---|---|
| YouTube/video URL | Fetch transcript (Whisper, transcription service, or captions) |
| Audio file | Transcribe with available STT service |
| PDF | Extract text (OCR if needed) |
| Book PDF | Extract text, identify chapters/sections |
| Screenshot/image | OCR via vision model, extract text and entities |
| GitHub repo | Clone, read README + key files, summarize architecture |
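The routing in the table above can be sketched as a small classifier. This is a minimal illustration, not part of gbrain: the function name `detect_format`, the extension sets, and the host list are all assumptions chosen for the example. An extension alone cannot tell a book from any other PDF, so both come back as `pdf` here and the distinction is left to a later step.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical lookup sets; extend them to match what you actually ingest.
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}
VIDEO_EXTS = {".mp4", ".mov", ".mkv", ".webm"}
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".gif"}
VIDEO_HOSTS = {"youtube.com", "www.youtube.com", "youtu.be", "vimeo.com"}
REPO_HOSTS = {"github.com", "www.github.com"}


def detect_format(source: str) -> str:
    """Return one of: video, audio, pdf, screenshot, repo, unknown."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = parsed.netloc.lower()
        if host in VIDEO_HOSTS:
            return "video"
        if host in REPO_HOSTS:
            return "repo"
        path = parsed.path  # fall through to extension check for direct file URLs
    else:
        path = source  # treat anything else as a local file path

    ext = PurePosixPath(path).suffix.lower()
    if ext in VIDEO_EXTS:
        return "video"
    if ext in AUDIO_EXTS:
        return "audio"
    if ext == ".pdf":
        return "pdf"  # book vs. ordinary PDF must be decided from content
    if ext in IMAGE_EXTS:
        return "screenshot"
    return "unknown"
```

Returning `unknown` instead of guessing keeps the caller responsible for asking the user what the item is.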
Phase 2: Upload raw source
Save the original file for provenance:
gbrain files upload-raw <file> --page <slug>
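When the upload is driven from a script, it may help to build the command as an argument list so that file names with spaces survive intact. The sketch below only assembles the command shown above; the helper name `upload_raw_command`, the example file path, and the example slug are hypothetical.

```python
import shlex
from typing import List


def upload_raw_command(file_path: str, slug: str) -> List[str]:
    """Build the argv for: gbrain files upload-raw <file> --page <slug>."""
    return ["gbrain", "files", "upload-raw", file_path, "--page", slug]


# Hypothetical file and slug, used only to show the quoting.
cmd = upload_raw_command("talks/demo day.mp4", "people/jane-doe")
print(shlex.join(cmd))  # shell-safe rendering for logs or dry runs
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would execute it without a shell, assuming `gbrain` is on the PATH.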
Phase 3: Create brain page
File by primary subject (not format). Use this template:
# {Title}

**Source:** {URL or file path}
**Format:** {video/audio/PDF/book/screenshot/repo}
**Created:** {date}

## Summary
{Key points, not a transcript dump}

## Key Segments / Highlights
{For video/audio: timestamped highlights. For books: chapter summaries.}

## People Mentioned
{List with links to brain pages}

## Companies Mentioned
{List with links to brain pages}
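One way to fill in this template programmatically is sketched below. The function `render_media_page` and its parameters are assumptions made for the example. The skill does not specify a link syntax for brain pages, so the people and companies lists are taken as already-formatted strings and written out unchanged.

```python
from datetime import date
from typing import List


def _bullets(items: List[str]) -> str:
    """Render a markdown bullet list, with a visible marker when empty."""
    return "\n".join(f"- {item}" for item in items) if items else "- (none)"


def render_media_page(
    title: str,
    source: str,
    fmt: str,
    created: date,
    summary: str,
    highlights: List[str],
    people: List[str],      # caller supplies entries already formatted as links
    companies: List[str],   # same assumption as for people
) -> str:
    """Return the page body following the template's section order."""
    return "\n".join([
        f"# {title}",
        "",
        f"**Source:** {source}",
        f"**Format:** {fmt}",
        f"**Created:** {created.isoformat()}",
        "",
        "## Summary",
        summary,
        "",
        "## Key Segments / Highlights",
        _bullets(highlights),
        "",
        "## People Mentioned",
        _bullets(people),
        "",
        "## Companies Mentioned",
        _bullets(companies),
        "",
    ])
```

Keeping the summary as a required argument makes it harder to produce a page that is only a transcript dump.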
Phase 4: Entity extraction and propagation
For every person and company mentioned:
- Check brain for existing page
- Create/enrich if needed (delegate to enrich skill)
- Add back-link from entity page to this media page
- Add timeline entry on entity page
A media item is NOT fully ingested until entity propagation is complete.
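The propagation steps above can be sketched against an in-memory stand-in for the brain. Nothing here reflects gbrain's actual storage or the enrich skill's interface: `EntityPage`, `propagate_entities`, and the dict keyed by entity name are hypothetical, and creating a bare page stands in for the delegation to enrich.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EntityPage:
    """Hypothetical in-memory model of a person or company page."""
    name: str
    backlinks: List[str] = field(default_factory=list)
    timeline: List[str] = field(default_factory=list)


def propagate_entities(
    brain: Dict[str, EntityPage],
    entities: List[str],
    media_slug: str,
    media_title: str,
    when: str,
) -> int:
    """Apply the four steps to each entity; return the number of pages updated."""
    updated = 0
    for name in entities:
        page = brain.get(name)                 # 1. check for an existing page
        if page is None:
            page = EntityPage(name=name)       # 2. placeholder for the enrich skill
            brain[name] = page
        if media_slug not in page.backlinks:   # skip work already done
            page.backlinks.append(media_slug)  # 3. back-link to the media page
            page.timeline.append(f"{when}: mentioned in {media_title}")  # 4. timeline
            updated += 1
    return updated
```

The membership check makes the function safe to re-run, which matters if an ingest is interrupted before propagation completes.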
Phase 5: Sync
Run gbrain sync to update the index.
Output Format
Brain page created with summary, highlights, and entity cross-links. Report to user: "Ingested {title}: {N} entities detected, {N} pages updated."
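As a small illustration, the report line can be produced from the two counts. The function name `ingest_report` and its parameter names are assumptions; the wording follows the message quoted above.

```python
def ingest_report(title: str, entities_detected: int, pages_updated: int) -> str:
    """Format the user-facing summary line for a completed ingest."""
    return (
        f"Ingested {title}: {entities_detected} entities detected, "
        f"{pages_updated} pages updated."
    )
```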
Anti-Patterns
- Dumping raw transcripts without analysis
- Skipping entity extraction ("I'll do that separately")
- Filing by format (all videos in media/videos/) instead of by subject
- Not preserving raw source files
- Creating stub pages without meaningful content