Gbrain media-ingest

install

source · Clone the upstream repo

git clone https://github.com/garrytan/gbrain

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/garrytan/gbrain "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/media-ingest" ~/.claude/skills/garrytan-gbrain-media-ingest && rm -rf "$T"

manifest: skills/media-ingest/SKILL.md

source content

Media Ingest Skill

Ingest video, audio, PDF, book, screenshot, and GitHub repo content into the brain.

Filing rule: Read `skills/_brain-filing-rules.md` before creating any new page.

Contract

This skill guarantees:

Every ingested media item has a brain page with analysis (not just a transcript dump)
Transcripts (video/audio) saved in raw and human-readable formats
Entity extraction: every person and company mentioned gets back-linked
Raw source files preserved via `gbrain files upload-raw`
Filing by primary subject, not by media format

Iron Law: Back-Linking (MANDATORY)

Every mention of a person or company with a brain page MUST create a back-link.

Phases

Phase 1: Identify format and fetch

| Format | Action |
| --- | --- |
| YouTube/video URL | Fetch transcript (Whisper, transcription service, or captions) |
| Audio file | Transcribe with available STT service |
| PDF | Extract text (OCR if needed) |
| Book PDF | Extract text, identify chapters/sections |
| Screenshot/image | OCR via vision model, extract text and entities |
| GitHub repo | Clone, read README + key files, summarize architecture |
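The routing in the table above can be sketched as a small classifier. This is only an illustration: the function name `detect_format`, the extension sets, and the host checks are assumptions rather than part of gbrain, and telling a book PDF from an ordinary PDF needs a look at the content, so both map to `pdf` here.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical extension sets; adjust to whatever your tooling supports.
VIDEO_EXTS = {".mp4", ".mov", ".mkv", ".webm"}
AUDIO_EXTS = {".mp3", ".wav", ".m4a", ".flac"}
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}


def detect_format(source: str) -> str:
    """Classify a URL or file path as video, audio, pdf, image, repo, or unknown."""
    parsed = urlparse(source)
    host = (parsed.hostname or "").lower()
    suffix = PurePosixPath(parsed.path).suffix.lower()

    # Host-based checks come first: these URLs carry no useful file extension.
    if host in {"youtube.com", "www.youtube.com", "youtu.be"}:
        return "video"
    if host in {"github.com", "www.github.com"}:
        return "repo"

    # Otherwise fall back to the file extension.
    if suffix in VIDEO_EXTS:
        return "video"
    if suffix in AUDIO_EXTS:
        return "audio"
    if suffix in IMAGE_EXTS:
        return "image"
    if suffix == ".pdf":
        return "pdf"
    return "unknown"
```

An `unknown` result is a signal to ask the user rather than guess at an action.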

Phase 2: Upload raw source

Save the original file for provenance:

gbrain files upload-raw <file> --page <slug>
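If the upload is driven from a script, one possible wrapper looks like the sketch below. It uses only the command shape shown above; the helper names `upload_raw_command` and `upload_raw` are hypothetical, and the slug in the usage note is a placeholder, not a real page.

```python
import subprocess
from pathlib import Path


def upload_raw_command(file_path: str, slug: str) -> list[str]:
    """Build the argv for `gbrain files upload-raw <file> --page <slug>`."""
    if not slug:
        raise ValueError("slug must not be empty")
    return ["gbrain", "files", "upload-raw", file_path, "--page", slug]


def upload_raw(file_path: str, slug: str) -> None:
    """Run the upload; raises if the file is missing or the CLI exits non-zero."""
    if not Path(file_path).is_file():
        raise FileNotFoundError(file_path)
    # Passing a list (not a shell string) avoids quoting problems in file names.
    subprocess.run(upload_raw_command(file_path, slug), check=True)
```

For example, `upload_raw_command("talk.mp4", "example-slug")` yields the argument list without running anything, which makes the command easy to log or test before the real upload.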

Phase 3: Create brain page

File by primary subject (not format). Use this template:

# {Title}

**Source:** {URL or file path}
**Format:** {video/audio/PDF/book/screenshot/repo}
**Created:** {date}

## Summary
{Key points, not a transcript dump}

## Key Segments / Highlights
{For video/audio: timestamped highlights. For books: chapter summaries.}

## People Mentioned
{List with links to brain pages}

## Companies Mentioned
{List with links to brain pages}
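As a rough illustration of filling in this template programmatically, the sketch below renders the same sections from a small record. The `MediaPage` class and `render_page` function are hypothetical names, and because the brain's link syntax is not specified here, people and companies are passed in as already-formatted strings.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MediaPage:
    title: str
    source: str
    media_format: str  # video, audio, PDF, book, screenshot, or repo
    created: date
    summary: str
    highlights: list[str] = field(default_factory=list)
    people: list[str] = field(default_factory=list)     # pre-formatted links
    companies: list[str] = field(default_factory=list)  # pre-formatted links


def _bullets(items: list[str]) -> str:
    """Render a markdown bullet list, with a visible marker when empty."""
    return "\n".join(f"- {item}" for item in items) if items else "- (none)"


def render_page(page: MediaPage) -> str:
    """Render the page in the section order used by the template."""
    lines = [
        f"# {page.title}",
        "",
        f"**Source:** {page.source}",
        f"**Format:** {page.media_format}",
        f"**Created:** {page.created.isoformat()}",
        "",
        "## Summary",
        page.summary,
        "",
        "## Key Segments / Highlights",
        _bullets(page.highlights),
        "",
        "## People Mentioned",
        _bullets(page.people),
        "",
        "## Companies Mentioned",
        _bullets(page.companies),
    ]
    return "\n".join(lines) + "\n"
```

Keeping the summary as a required field, with no default, is a small guard against the transcript-dump and stub-page failure modes.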

Phase 4: Entity extraction and propagation

For every person and company mentioned:

Check brain for existing page
Create/enrich if needed (delegate to enrich skill)
Add back-link from entity page to this media page
Add timeline entry on entity page

A media item is NOT fully ingested until entity propagation is complete.
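The four steps above can be sketched against an in-memory stand-in for the brain. Everything here is an assumption made for illustration: the `EntityPage` structure, the `propagate` function, and the dict keyed by entity name are not gbrain APIs, and creating a bare page stands in for delegating to the enrich skill.

```python
from dataclasses import dataclass, field


@dataclass
class EntityPage:
    name: str
    backlinks: set[str] = field(default_factory=set)
    timeline: list[str] = field(default_factory=list)


def propagate(
    brain: dict[str, EntityPage],
    entities: list[str],
    media_slug: str,
    timeline_entry: str,
) -> tuple[int, int]:
    """Back-link every mentioned entity; return (entities_detected, pages_updated)."""
    unique = list(dict.fromkeys(entities))  # de-duplicate, keep first-seen order
    updated = 0
    for name in unique:
        page = brain.get(name)               # 1. check for an existing page
        if page is None:
            page = EntityPage(name)          # 2. placeholder for the enrich skill
            brain[name] = page
        if media_slug not in page.backlinks:
            page.backlinks.add(media_slug)   # 3. back-link to the media page
            page.timeline.append(timeline_entry)  # 4. timeline entry
            updated += 1
    return len(unique), updated
```

Checking for an existing back-link before writing makes a second run over the same media item a no-op, so re-ingesting does not duplicate timeline entries.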

Phase 5: Sync

Run `gbrain sync` to update the index.

Output Format

Brain page created with summary, highlights, and entity cross-links. Report to user: "Ingested {title}: {N} entities detected, {N} pages updated."
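The two `{N}` placeholders in the report are separate counts: entities detected and pages updated. A minimal sketch of the message, using a hypothetical helper name `ingest_report`:

```python
def ingest_report(title: str, entities_detected: int, pages_updated: int) -> str:
    """Format the user-facing report; the two counts are tracked independently."""
    return (
        f"Ingested {title}: {entities_detected} entities detected, "
        f"{pages_updated} pages updated."
    )
```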

Anti-Patterns

Dumping raw transcripts without analysis
Skipping entity extraction ("I'll do that separately")
Filing by format (all videos in `media/videos/`) instead of by subject
Not preserving raw source files
Creating stub pages without meaningful content