Asi transcript-search
Intelligent semantic search over voice memo and video transcript DuckDB databases. Use when searching transcripts for topics, colors, tabs, concepts, or any content. NEVER dump full transcript text — use sentence-level extraction with context windows.
git clone https://github.com/plurigrid/asi
T=$(mktemp -d) && git clone --depth=1 https://github.com/plurigrid/asi "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/asi/skills/transcript-search" ~/.claude/skills/plurigrid-asi-transcript-search && rm -rf "$T"
Search transcripts intelligently without loading entire texts into context.
Trit: 0 (ERGODIC - coordination/retrieval)
CRITICAL RULE
NEVER run SELECT text FROM transcripts, and never load full transcript bodies into context.
Always use sentence-level extraction with regexp_extract_all or string_split + filtering.
Known Databases
| Path | Schema | Content |
|---|---|---|
| | | 174 voice memos + whisper transcripts |
| | , , , | Speaker-diarized audio with GF(3) |
| | varies | Aqua Voice transcriptions |
| | same as above | Atlas copy |
Search Patterns
1. Sentence-Level Context Extraction (PRIMARY)
Extract sentences matching keywords with surrounding context:
-- Find sentences about a topic with ±250 char context window
SELECT id, source, timestamp, trim(chunk) as context
FROM (
  SELECT id, source, timestamp,
         unnest(regexp_extract_all(text, '[^.]{0,250}KEYWORD[^.]{0,250}', 0)) as chunk
  FROM transcripts
)
WHERE length(trim(chunk)) > 15
ORDER BY id;
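To see what the context-window pattern actually captures, here is a minimal Python sketch that applies the same regex shape with the standard `re` module. It is a stand-in for the DuckDB query, not a replacement for it; the helper name `context_windows` and the sample text are invented for illustration.

```python
import re

def context_windows(text: str, keyword: str, width: int = 250) -> list[str]:
    """Return keyword matches with up to `width` non-period characters on each side."""
    # Same shape as the SQL pattern: [^.]{0,250}KEYWORD[^.]{0,250}
    pattern = re.compile(
        rf"[^.]{{0,{width}}}{re.escape(keyword)}[^.]{{0,{width}}}"
    )
    # Mirror the outer filter: WHERE length(trim(chunk)) > 15
    return [m.strip() for m in pattern.findall(text) if len(m.strip()) > 15]

# Invented sample text, not taken from any transcript database.
sample = (
    "We opened four tiles today. Green is for the Emacs tab in the layout. "
    "Lunch was fine."
)
chunks = context_windows(sample, "Green")
print(chunks)
```

Because `[^.]` excludes periods, each window stops at the nearest sentence boundary, so the result is effectively the sentence containing the keyword, capped at 250 characters per side. As written the pattern is case-sensitive: searching for `green` would not match `Green` in the sample.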
2. Multi-Keyword Intersection
Find sentences where multiple concepts co-occur:
-- Sentences mentioning BOTH term1 AND term2
WITH sentences AS (
  SELECT id, source, unnest(string_split(text, '.')) as sentence
  FROM transcripts
)
SELECT id, source, trim(sentence) as sentence
FROM sentences
WHERE lower(sentence) LIKE '%term1%'
  AND lower(sentence) LIKE '%term2%'
  AND length(trim(sentence)) > 20;
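The same split-then-filter logic can be sketched in plain Python to check how it behaves on a small input. This is an illustrative stand-in for the SQL above; `sentences_with_all` and the sample text are hypothetical names and data.

```python
def sentences_with_all(text: str, terms: list[str], min_len: int = 20) -> list[str]:
    """Return sentences that contain every term, compared case-insensitively."""
    hits = []
    for raw in text.split("."):  # analogous to string_split(text, '.')
        sentence = raw.strip()
        lowered = sentence.lower()
        # Analogous to the chained LIKE '%term%' filters plus the length check.
        if len(sentence) > min_len and all(t.lower() in lowered for t in terms):
            hits.append(sentence)
    return hits

# Invented sample text, not taken from any transcript database.
sample = (
    "The green tab holds Emacs and the red tab holds logs. "
    "Green is the baseline. Nothing else changed today."
)
both = sentences_with_all(sample, ["green", "red"])
print(both)
```

Both the SQL `LIKE '%term%'` filter and the Python `in` check are substring matches, so a term such as `tab` also matches `table`. Choosing longer or more specific terms reduces that noise.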
3. Quick Count Before Deep Dive
Always count first to avoid surprise data dumps:
-- How many transcripts mention X?
SELECT COUNT(*) as hits, array_agg(id ORDER BY id) as transcript_ids
FROM transcripts
WHERE lower(text) LIKE '%keyword%';
4. Temporal Search
-- Recent transcripts mentioning X
SELECT id, source, timestamp, left(text, 200) as preview
FROM transcripts
WHERE lower(text) LIKE '%keyword%'
  AND timestamp > NOW() - INTERVAL '7 days'
ORDER BY timestamp DESC;
5. Co-occurrence Matrix
-- Which transcripts mention both colors AND tabs?
SELECT id, source, timestamp
FROM transcripts
WHERE lower(text) LIKE '%color%'
  AND (lower(text) LIKE '%tab%' OR lower(text) LIKE '%tile%')
ORDER BY id;
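The query above checks one fixed combination of terms. To compare several terms pairwise, the idea extends to a small co-occurrence table. The Python sketch below is one possible way to do that over an in-memory mapping of ID to text; the function name, the data, and the choice to return only IDs are assumptions for illustration, not part of the skill.

```python
from itertools import combinations

def cooccurrence(texts: dict[int, str], terms: list[str]) -> dict[tuple[str, str], list[int]]:
    """Map each pair of (lowercase) terms to the sorted IDs of texts containing both."""
    # For each term, collect the IDs whose text contains it (substring match).
    present = {
        term: {tid for tid, body in texts.items() if term in body.lower()}
        for term in terms
    }
    # Intersect the ID sets pairwise; only IDs are returned, never the text itself.
    return {
        (a, b): sorted(present[a] & present[b])
        for a, b in combinations(terms, 2)
    }

# Invented sample data standing in for (id, text) rows.
texts = {
    1: "The color of each tab is fixed.",
    2: "Every tile has a color.",
    3: "No layout talk here.",
}
pairs = cooccurrence(texts, ["color", "tab", "tile"])
for pair, ids in pairs.items():
    print(pair, ids)
```

Returning IDs rather than text keeps the result small, and the IDs can then feed a sentence-level extraction query restricted to those transcripts.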
Workflow
- Count first: How many transcripts match? Get IDs.
- Extract sentences: Use regex context windows, NOT full text.
- Narrow: Add more keywords to intersect.
- Report: Show relevant sentences with transcript ID + timestamp.
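The four steps can be sketched end to end as follows. This is a minimal Python illustration over a hypothetical in-memory list of rows, assuming each row has `id`, `timestamp`, and `text` fields; against a real database each step would be one of the SQL patterns above.

```python
import re

# Hypothetical rows standing in for the transcripts table; all values are invented.
ROWS = [
    {"id": 1, "timestamp": "2024-01-01", "text": "Green is for the Emacs tab. Lunch was late."},
    {"id": 2, "timestamp": "2024-01-02", "text": "We talked about the budget. Nothing about layout."},
    {"id": 3, "timestamp": "2024-01-03", "text": "The red tile shows alerts and the green tile shows Emacs."},
]

def count_hits(rows, keyword):
    """Step 1: count matches and collect IDs without returning any text."""
    ids = sorted(r["id"] for r in rows if keyword.lower() in r["text"].lower())
    return len(ids), ids

def extract(rows, ids, keyword, width=250):
    """Step 2: yield bounded windows around the keyword, never the full body."""
    pattern = re.compile(
        rf"[^.]{{0,{width}}}{re.escape(keyword)}[^.]{{0,{width}}}", re.IGNORECASE
    )
    for r in rows:
        if r["id"] in ids:
            for chunk in pattern.findall(r["text"]):
                yield r["id"], r["timestamp"], chunk.strip()

def narrow(chunks, extra_term):
    """Step 3: keep only windows that also mention a second term."""
    return [c for c in chunks if extra_term.lower() in c[2].lower()]

hits, ids = count_hits(ROWS, "green")
chunks = list(extract(ROWS, ids, "green"))
report = narrow(chunks, "tile")

# Step 4: report each sentence with its transcript ID and timestamp.
for tid, ts, sentence in report:
    print(f"#{tid} [{ts}] {sentence}")
```

Counting first means the size of the result is known before any text is pulled, and every later step works only on short windows tied to an ID and timestamp.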
Known Color-Tab Mappings (from transcript #149)
From voice memo session #149, the color system for tabs/tiles:
- Green = Emacs / conventional flow / "zero" baseline / bridging
- Blue = secondary workspace
- Red = active/alert state
- Orange = Barton's aesthetic (shirt, rollers — transcript #168, #171)
- Colors map to styles/environments in tiled terminal sessions
- "Any color, any style, any tab, associated rows" — colors ARE the tab identifiers
Key quote: "And so what colors? Can you talk about color a little bit? Green is for what?" → Green was Emacs. "Currently green and red, there's 4 tiles" → tiled terminal layout.
Anti-Patterns
| ❌ Bad | ✅ Good |
|---|---|
| Loading 174 full transcripts | Count → filter IDs → extract sentences |
| Grepping raw text blobs | DuckDB regex with context windows |
Related Skills
| Skill | Relationship |
|---|---|
| | Creates transcript DuckDBs from YouTube playlists |
| | Captures voice memos via whisper-cpp |
| | Interactome analytics over transcripts |
| | DuckDB file discovery |
| | Transcripts were shared to Barton via Beeper |