Gbrain enrich
git clone https://github.com/garrytan/gbrain
T=$(mktemp -d) && git clone --depth=1 https://github.com/garrytan/gbrain "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/enrich" ~/.claude/skills/garrytan-gbrain-enrich && rm -rf "$T"
skills/enrich/SKILL.md
Enrich Skill
Enrich person and company pages from external sources. Scale effort to importance.
Contract
This skill guarantees:
- Every enriched page has compiled truth (State section) with inline citations
- Every enriched page has a timeline with dated entries
- Back-links are created bidirectionally
- Tiered enrichment: Tier 1 (full), Tier 2 (medium), Tier 3 (minimal) based on notability
- No stubs: every new page has meaningful content from web search or existing brain context
Filing rule: Read skills/_brain-filing-rules.md before creating any new page.
Iron Law: Back-Linking (MANDATORY)
Every mention of a person or company with a brain page MUST create a back-link FROM that entity's page TO the page mentioning them. An unlinked mention is a broken brain. See skills/_brain-filing-rules.md for format.
Philosophy
A brain page should read like an intelligence dossier, not a LinkedIn scrape. Facts are table stakes. Texture is the value -- what do they believe, what are they building, what makes them tick, where are they headed.
Citation Requirements (MANDATORY)
Every fact must carry an inline [Source: ...] citation.
Three formats:
- Direct attribution: [Source: User, {context}, YYYY-MM-DD]
- API/external: [Source: {provider} enrichment, YYYY-MM-DD]
- Synthesis: [Source: compiled from {list of sources}]
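The three citation formats can be produced by small helpers so that every fact is tagged consistently. This is a minimal sketch: the function names `cite_user`, `cite_provider`, and `cite_synthesis` are hypothetical and not part of gbrain.

```python
from datetime import date


def cite_user(context: str, on: date) -> str:
    """Direct attribution: a statement made by the user."""
    return f"[Source: User, {context}, {on.isoformat()}]"


def cite_provider(provider: str, on: date) -> str:
    """API/external: data returned by an enrichment provider."""
    return f"[Source: {provider} enrichment, {on.isoformat()}]"


def cite_synthesis(sources: list[str]) -> str:
    """Synthesis: a claim compiled from several sources."""
    return f"[Source: compiled from {', '.join(sources)}]"


# Illustrative values only.
print(cite_user("coffee chat", date(2026, 4, 11)))
```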
Source precedence (highest to lowest):
- User's direct statements
- Compiled truth (pre-existing brain synthesis)
- Timeline entries (raw evidence)
- External sources (API enrichment, web search)
When sources conflict, note the contradiction with both citations.
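One way to apply the precedence order is to rank competing claims and keep the loser's citation whenever the values disagree. The sketch below assumes a hypothetical `Claim` record and the labels `user`, `compiled_truth`, `timeline`, and `external` for the four source kinds; none of these names come from gbrain itself.

```python
from dataclasses import dataclass
from typing import Optional

# Highest to lowest, mirroring the precedence list above.
PRECEDENCE = ["user", "compiled_truth", "timeline", "external"]


@dataclass(frozen=True)
class Claim:
    value: str
    source_kind: str  # one of PRECEDENCE
    citation: str     # an inline [Source: ...] string


def resolve(claims: list[Claim]) -> tuple[Claim, Optional[str]]:
    """Return the highest-precedence claim plus a contradiction note, if any."""
    if not claims:
        raise ValueError("need at least one claim")
    ranked = sorted(claims, key=lambda c: PRECEDENCE.index(c.source_kind))
    winner = ranked[0]
    dissent = [c for c in ranked[1:] if c.value != winner.value]
    if not dissent:
        return winner, None
    other = dissent[0]
    # Keep both citations so the conflict stays visible on the page.
    note = (f"Contradiction: {winner.value} {winner.citation} "
            f"vs {other.value} {other.citation}")
    return winner, note
```

The winner is what goes into compiled truth; the note is written next to it rather than silently discarded.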
When To Enrich
Primary triggers
- User mentions an entity in conversation
- Entity appears in a meeting transcript or email
- New contact appears with significant context
- Entity makes news or has a major event
- Any ingest pipeline encounters a notable entity
Do NOT enrich
- Random mentions with no relationship signal
- Bot/spam accounts
- Entities with no substantive connection to the user's work
- Same page enriched within the past week (unless new signal warrants it)
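The trigger and exclusion rules can be read as a simple gate. The following is a sketch under stated assumptions: the boolean inputs are hypothetical signals the caller has already computed, and "past week" is taken to mean seven days.

```python
from datetime import datetime, timedelta
from typing import Optional

COOLDOWN = timedelta(days=7)  # "enriched within the past week"


def should_enrich(
    has_relationship_signal: bool,
    is_bot_or_spam: bool,
    last_enriched: Optional[datetime],
    now: datetime,
    new_signal: bool = False,
) -> bool:
    """Return True when an entity passes the do-not-enrich checks."""
    if is_bot_or_spam or not has_relationship_signal:
        return False
    recently_done = last_enriched is not None and now - last_enriched < COOLDOWN
    # A fresh, material signal overrides the one-week cooldown.
    if recently_done and not new_signal:
        return False
    return True
```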
Enrichment Tiers
Scale enrichment to importance. Don't waste API calls on low-value entities.
| Tier | Who | Effort | Sources |
|---|---|---|---|
| 1 (key) | Inner circle, close collaborators, key contacts | Full pipeline | All available APIs + deep web research |
| 2 (notable) | Occasional interactions, industry figures | Moderate | Web research + social + brain cross-ref |
| 3 (minor) | Worth tracking, not critical | Light | Brain cross-ref + social lookup if handle known |
The Enrichment Protocol (7 Steps)
Step 1: Identify entities
Extract people, companies, concepts from the incoming signal.
Step 2: Check brain state
For each entity:
- gbrain search "name" -- does a page already exist?
- If yes: UPDATE path (add new signal, update compiled truth if material)
- If no: CREATE path (check notability gate first, then create)
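The branch in Step 2 can be sketched as a small decision function. The `search` argument is a hypothetical stand-in for whatever wraps `gbrain search`; its return shape (a sequence of page identifiers) is an assumption, as is the `skip` outcome for entities that fail the notability gate.

```python
from typing import Callable, Sequence


def choose_path(
    name: str,
    search: Callable[[str], Sequence[str]],
    is_notable: bool,
) -> str:
    """Return 'update', 'create', or 'skip' for one entity."""
    hits = search(name)  # assumed to return matching page identifiers
    if hits:
        return "update"
    # No page yet: the notability gate decides whether to create one.
    return "create" if is_notable else "skip"
```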
Step 3: Extract signal from source
Don't just capture facts. Capture texture:
| Signal Type | What to Extract |
|---|---|
| Opinions, beliefs | What They Believe section |
| Current projects, features shipped | What They're Building section |
| Ambition, career arc, motivation | What Motivates Them section |
| Topics they return to obsessively | Hobby Horses section |
| Who they amplify, argue with, respect | Network / Relationships |
| Ascending, plateauing, pivoting? | Trajectory section |
| Role, company, funding, location | State section (hard facts) |
Step 4: External data source lookups
Priority order -- stop when you have enough signal for the entity's tier.
4a. Brain cross-reference (always, all tiers)
- gbrain search "name" and gbrain query "what do we know about name"
- Check related pages: company pages for person enrichment and vice versa
- This is free and often the richest source
4b. Web research (Tier 1 and 2)
- Use Perplexity, Brave Search, Exa, or equivalent web research tool
- Key pattern: Send existing brain knowledge as context so the search returns DELTA (what's new vs what you already know), not a rehash
- Opus-class models for Tier 1 deep research, lighter models for Tier 2
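The delta pattern amounts to putting existing brain knowledge into the research prompt. A minimal sketch follows; the prompt wording and the function name `build_delta_query` are illustrative assumptions, not a format any particular search tool requires.

```python
def build_delta_query(name: str, known_facts: list[str]) -> str:
    """Build a research prompt that asks only for what the brain lacks."""
    known = "\n".join(f"- {fact}" for fact in known_facts) or "- (nothing yet)"
    return (
        f"Research {name}.\n"
        f"Already known:\n{known}\n"
        "Report only what is new or contradicts the list above."
    )


# Illustrative values only.
print(build_delta_query("Jane Doe", ["CEO of Acme"]))
```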
4c. Social media lookup (all tiers when handle known)
- Pull recent posts/tweets for tone, interests, current focus
- Social media is the highest-texture signal for what someone actually thinks
4d. People enrichment APIs (Tier 1)
- LinkedIn data, career history, connections, education
4e. Company enrichment APIs (Tier 1)
- Company data, financials, headcount, key hires, recent news
| Data Need | Example Sources | Tier |
|---|---|---|
| Web research | Perplexity, Brave, Exa | 1-2 |
| LinkedIn / career | Crustdata, Proxycurl, People Data Labs | 1 |
| Career history | Happenstance, LinkedIn | 1 |
| Funding / company data | Crunchbase, PitchBook, Clearbit | 1 |
| Social media | Platform APIs, web scraping | 1-3 |
| Meeting history | Calendar/meeting transcript tools | 1-2 |
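The tier rules in Step 4 can be expressed as an ordered lookup plan. This sketch uses hypothetical step labels, and it assumes the entity type decides between the people and company APIs at Tier 1; the protocol itself only says both are Tier 1 sources.

```python
def lookup_plan(tier: int, entity_type: str, handle_known: bool) -> list[str]:
    """Return lookup steps in priority order for a given tier."""
    if tier not in (1, 2, 3):
        raise ValueError("tier must be 1, 2, or 3")
    plan = ["brain_cross_reference"]  # 4a: always, all tiers
    if tier <= 2:
        plan.append("web_research")   # 4b: Tier 1 and 2
    if handle_known:
        plan.append("social_lookup")  # 4c: any tier when a handle is known
    if tier == 1:
        # 4d / 4e: paid enrichment APIs are reserved for Tier 1.
        plan.append("people_api" if entity_type == "person" else "company_api")
    return plan
```

A caller would walk the plan in order and stop early once it has enough signal for the tier.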
Step 5: Save raw data (preserves provenance)
Store raw API responses via put_raw_data in gbrain:
{ "source": "crustdata", "fetched_at": "2026-04-11T...", "query": "jane doe", "data": { ... } }
Raw data preserves provenance. If the compiled truth is ever questioned, the raw data shows exactly what the API returned.
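A record with the four fields shown above can be assembled before it is stored. This is a sketch only: `raw_record` is a hypothetical helper, and the call to put_raw_data is left out because its exact signature is not given here.

```python
import json
from datetime import datetime, timezone
from typing import Any, Optional


def raw_record(
    source: str,
    query: str,
    data: dict[str, Any],
    fetched_at: Optional[datetime] = None,
) -> dict[str, Any]:
    """Wrap an untouched API response with provenance fields."""
    when = fetched_at or datetime.now(timezone.utc)
    return {
        "source": source,
        "fetched_at": when.isoformat(),
        "query": query,
        "data": data,  # stored as returned, never summarized
    }


# Illustrative values only.
record = raw_record("crustdata", "jane doe", {"example_field": "value"})
print(json.dumps(record, indent=2))
```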
Step 6: Write to brain
CREATE path
- Check notability gate (see skills/_brain-filing-rules.md)
- Check filing rules -- where does this entity go?
- Create page with the appropriate template (below)
- Fill compiled truth with citations
- Add first timeline entry
- Leave empty sections as [No data yet] (don't fill with boilerplate)
UPDATE path
- Add new timeline entries (reverse-chronological, append-only)
- Update compiled truth ONLY if the new signal materially changes the picture
- Update State section with new facts
- Flag contradictions between new signal and existing compiled truth
- Don't overwrite user-written assessments with API boilerplate
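The timeline rule on the UPDATE path (add entries, never drop existing ones, keep newest first) can be sketched as follows. The `Entry` tuple and both function names are hypothetical; the rendered line follows the `- **YYYY-MM-DD** | Event description [Source: ...]` shape used in the page templates.

```python
from datetime import date

# (date, description, citation); a hypothetical in-memory shape.
Entry = tuple[date, str, str]


def merge_timeline(existing: list[Entry], new: list[Entry]) -> list[Entry]:
    """Add new entries without dropping old ones; newest first."""
    merged = list(existing)
    for entry in new:
        if entry not in merged:  # skip exact duplicates
            merged.append(entry)
    return sorted(merged, key=lambda e: e[0], reverse=True)


def render(entry: Entry) -> str:
    """Format one entry as a timeline line."""
    day, text, citation = entry
    return f"- **{day.isoformat()}** | {text} {citation}"
```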
Person page template
    ---
    title: Full Name
    type: person
    created: YYYY-MM-DD
    updated: YYYY-MM-DD
    tags: []
    company: Current Company
    relationship: How the user knows them
    email:
    linkedin:
    twitter:
    location:
    ---

    # Full Name

    > 1-paragraph executive summary: HOW do you know them, WHY do they matter,
    > what's the current state of the relationship.

    ## State
    Role, company, key context. Hard facts only.

    ## What They Believe
    Ideology, first principles, worldview. What hills do they die on?

    ## What They're Building
    Current projects, recent launches, what they're focused on.

    ## What Motivates Them
    Ambition, career arc, what drives them.

    ## Hobby Horses
    Topics they return to obsessively. Recurring themes in their work/posts.

    ## Assessment
    Your read on this person. Strengths, gaps, trajectory.

    ## Trajectory
    Ascending, plateauing, pivoting, declining? Where are they headed?

    ## Relationship
    History of interactions, shared context, relationship quality.

    ## Contact
    Email, social handles, preferred communication channel.

    ## Network
    Key connections, mutual contacts, organizational relationships.

    ## Open Threads
    Active conversations, pending items, things to follow up on.

    ---

    ## Timeline
    Reverse chronological. Every entry has a date and [Source: ...] citation.

    - **YYYY-MM-DD** | Event description [Source: ...]
Company page template
    ---
    title: Company Name
    type: company
    created: YYYY-MM-DD
    updated: YYYY-MM-DD
    tags: []
    ---

    # Company Name

    > 1-paragraph executive summary.

    ## State
    What they do, stage, key people, key metrics, your connection.

    ## Open Threads
    Active items, pending decisions, things to track.

    ---

    ## Timeline

    - **YYYY-MM-DD** | Event description [Source: ...]
Step 7: Cross-reference
- Update company pages from person enrichment (and vice versa)
- Update related project/deal pages if relevant context surfaced
- Check index files if the brain uses them
Note (v0.10.1): Links between brain pages are auto-created on every
put_page call (auto-link post-hook). Step 7 focuses on content
cross-references (updating related pages' compiled truth with new signal
from this enrichment), not on creating links. Verify via the auto_links
field in the put_page response ({ created, removed, errors }).
Timeline entries still need explicit gbrain timeline-add calls.
Bulk Enrichment Rules
- Test on 3-5 entities first. Read actual output. Check quality.
- Only proceed to bulk after test shots pass your quality bar.
- 3+ entities from one source -> batch process or spawn sub-agent
- Throttle API calls. Respect rate limits.
- Commit every 5-10 entities during bulk runs.
- Save a report after bulk enrichment (see Report Storage below).
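The bulk rules describe a pilot run followed by checkpointed processing. The sketch below assumes hypothetical callbacks: `enrich` handles one entity, `commit` saves progress, and `approve` is the human or automated quality check on the pilot output. Throttling is left to the `enrich` callback.

```python
from typing import Callable


def bulk_enrich(
    entities: list[str],
    enrich: Callable[[str], None],
    commit: Callable[[], None],
    approve: Callable[[list[str]], bool],
    test_size: int = 3,
    commit_every: int = 5,
) -> int:
    """Run a pilot batch, then the rest with periodic commits.

    Returns the number of entities processed.
    """
    pilot, rest = entities[:test_size], entities[test_size:]
    for entity in pilot:
        enrich(entity)
    commit()
    if not approve(pilot):  # stop before bulk if the pilot misses the bar
        return len(pilot)
    for i, entity in enumerate(rest, start=1):
        enrich(entity)
        if i % commit_every == 0:
            commit()
    if len(rest) % commit_every != 0:
        commit()  # flush the final partial batch
    return len(entities)
```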
Validation Rules
- Connection count < 20 on LinkedIn = likely wrong person, skip
- Name mismatch between brain and API = skip, flag for review
- Joke profiles or obviously wrong data = save to raw, don't update page
- Don't overwrite user-written assessments with API boilerplate
- When in doubt: save raw data but don't update brain page
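Two of the validation rules are mechanical enough to sketch in code. The helper below is hypothetical, and it assumes a name match means equality after case and whitespace normalization, which is stricter than a real matcher would likely be. In every rejected case the raw data is still saved; only the page update is withheld.

```python
from typing import Optional


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace for a strict name comparison."""
    return " ".join(name.lower().split())


def validate(
    brain_name: str,
    api_name: str,
    linkedin_connections: Optional[int] = None,
) -> tuple[bool, str]:
    """Return (update_page, reason) for one API result."""
    if normalize(brain_name) != normalize(api_name):
        return False, "name mismatch: flag for review"
    if linkedin_connections is not None and linkedin_connections < 20:
        return False, "fewer than 20 connections: likely wrong person"
    return True, "ok"
```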
Report Storage
After enrichment sweeps, save a report:
- Number of entities processed
- New pages created vs existing updated
- Data sources called and results quality
- Notable discoveries or contradictions
- Validation flags or API failures
This creates an audit trail for brain enrichment over time.
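The report fields listed above map naturally onto a small record type. This is a sketch with hypothetical names; where and in what format the report is stored is not specified here, so the example only renders plain text.

```python
from dataclasses import dataclass, field


@dataclass
class EnrichmentReport:
    processed: int = 0
    created: int = 0
    updated: int = 0
    source_calls: dict[str, int] = field(default_factory=dict)
    discoveries: list[str] = field(default_factory=list)
    flags: list[str] = field(default_factory=list)

    def record_call(self, source: str) -> None:
        """Count one call to a data source."""
        self.source_calls[source] = self.source_calls.get(source, 0) + 1

    def to_text(self) -> str:
        """Render the report as plain text, one item per line."""
        lines = [
            f"Entities processed: {self.processed}",
            f"Pages created: {self.created}, updated: {self.updated}",
        ]
        lines += [f"Source {name}: {count} call(s)"
                  for name, count in sorted(self.source_calls.items())]
        lines += [f"Discovery: {d}" for d in self.discoveries]
        lines += [f"Flag: {f}" for f in self.flags]
        return "\n".join(lines)
```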
Anti-Patterns
- Creating stub pages with no content
- Enriching without checking brain first
- Overwriting user's direct statements with API data
- Creating pages for non-notable entities
Output Format
An enriched person page contains:
- Frontmatter with type, tags, company, relationship, and contact fields
- Executive summary (1 paragraph: how you know them, why they matter, relationship state)
- State section with hard facts and inline [Source: ...] citations
- Texture sections (What They Believe, What They're Building, What Motivates Them, Hobby Horses)
- Assessment with trajectory read
- Relationship history and contact info
- Network connections and mutual contacts
- Timeline in reverse chronological order, every entry dated with source citation
An enriched company page contains:
- Frontmatter with type and tags
- Executive summary (1 paragraph)
- State section (what they do, stage, key people, metrics, your connection)
- Open Threads (active items, pending decisions)
- Timeline in reverse chronological order with dated, cited entries
Both page types have bidirectional back-links to every entity they mention.
Tools Used
- Read a page from gbrain (get_page)
- Store/update a page in gbrain (put_page)
- Add a timeline entry in gbrain (add_timeline_entry)
- List pages in gbrain by type (list_pages)
- Store raw API data in gbrain (put_raw_data)
- Retrieve raw data from gbrain (get_raw_data)
- Link entities in gbrain (add_link)
- Check backlinks in gbrain (get_backlinks)