Claude-code-plugins-plus-skills glean-cost-tuning

install
source · Clone the upstream repo
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/saas-packs/glean-pack/skills/glean-cost-tuning" ~/.claude/skills/jeremylongshore-claude-code-plugins-plus-skills-glean-cost-tuning && rm -rf "$T"
manifest: plugins/saas-packs/glean-pack/skills/glean-cost-tuning/SKILL.md

Glean Cost Tuning

Overview

Glean pricing scales with indexed content volume and per-seat user count, so document indexing volume is the main cost driver that content pipelines control, followed by search query frequency. Enterprise deployments typically connect dozens of datasources, each pushing thousands of documents into the index. Without active content governance, stale drafts, archived pages, and near-empty documents can inflate the index by an estimated 30-50%, raising costs while adding no search value. Pruning irrelevant content and switching to incremental indexing are the highest-leverage optimizations.

Cost Breakdown

| Component | Cost Driver | Optimization |
| --- | --- | --- |
| Document indexing | Volume of indexed content across all sources | Filter drafts, templates, and archived content pre-index |
| User seats | Per-seat licensing | Audit active users quarterly; deprovision inactive accounts |
| Search queries | Query volume across the organization | Cache frequent queries; use search analytics to identify redundant patterns |
| Datasource connectors | Number of active connectors to maintain | Consolidate overlapping sources; remove unused connectors |
| Content storage | Size of indexed documents | Truncate body to 50KB; skip attachments over 10MB |
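
The "cache frequent queries" optimization can be sketched as a small in-memory cache with a time-to-live. This is a minimal illustration, not part of any Glean SDK: `QueryCache`, `cachedSearch`, the `searchFn` callback, and the five-minute default TTL are all hypothetical names and values chosen for the example.

```typescript
// Hypothetical in-memory TTL cache for search results; not a Glean API.
class QueryCache<R> {
  private entries = new Map<string, { value: R; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // ttlMs is an assumed default; `now` is injectable so expiry can be tested.
  constructor(ttlMs = 5 * 60 * 1000, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Normalize so trivially different spellings share one cache entry.
  private keyFor(query: string): string {
    return query.trim().toLowerCase().replace(/\s+/g, " ");
  }

  get(query: string): R | undefined {
    const key = this.keyFor(query);
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return hit.value;
  }

  set(query: string, value: R): void {
    this.entries.set(this.keyFor(query), {
      value,
      expiresAt: this.now() + this.ttlMs,
    });
  }
}

// Wraps any async search function (hypothetical `searchFn`) with the cache.
async function cachedSearch<R>(
  cache: QueryCache<R>,
  query: string,
  searchFn: (q: string) => Promise<R>,
): Promise<R> {
  const hit = cache.get(query);
  if (hit !== undefined) return hit;
  const value = await searchFn(query);
  cache.set(query, value);
  return value;
}
```

A short TTL keeps results reasonably fresh while still absorbing bursts of identical queries; whether cached results are acceptable at all depends on how permission-sensitive the search results are.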

API Call Reduction

interface GleanDoc {
  status: string;
  updatedAt: number; // epoch milliseconds
  title: string;
  content: string;
}

class GleanIndexFilter {
  private staleThreshold = 365 * 24 * 60 * 60 * 1000; // 12 months, in milliseconds

  // Rejects content that adds index volume without search value.
  shouldIndex(doc: GleanDoc): boolean {
    if (doc.status === 'draft' || doc.status === 'archived') return false;
    if (Date.now() - doc.updatedAt > this.staleThreshold) return false; // stale
    if (doc.title.startsWith('[Template]')) return false;
    if (doc.content.length < 50) return false; // near-empty
    return true;
  }

  async incrementalIndex<T extends GleanDoc>(docs: T[], lastSyncTimestamp: number): Promise<T[]> {
    // Only process documents modified since the last sync (can reduce indexing calls by 80-90%)
    const modified = docs.filter(d => d.updatedAt > lastSyncTimestamp);
    const eligible = modified.filter(d => this.shouldIndex(d));
    return eligible.map(d => ({
      ...d,
      content: d.content.slice(0, 50_000) // Cap at 50,000 characters (roughly 50KB for ASCII text)
    }));
  }
}
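
One caveat on the truncation step: `String.prototype.slice` counts UTF-16 code units, not bytes, so a 50,000-character slice of non-ASCII text can exceed 50KB once encoded as UTF-8. If the limit needs to hold in bytes, a sketch like the following could replace the slice. `truncateToBytes` is a hypothetical helper, and it assumes UTF-8 is the encoding that matters for the size limit.

```typescript
// Hypothetical helper: truncates text so its UTF-8 encoding fits in maxBytes
// without cutting a multi-byte character in half.
function truncateToBytes(text: string, maxBytes: number): string {
  const bytes = new TextEncoder().encode(text);
  if (bytes.length <= maxBytes) return text;

  // UTF-8 continuation bytes have the bit pattern 10xxxxxx. If the byte at the
  // cut point is a continuation byte, back up to the start of that character
  // so the whole character is dropped rather than split.
  let end = maxBytes;
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end--;

  return new TextDecoder().decode(bytes.subarray(0, end));
}
```

For example, `truncateToBytes(doc.content, 50_000)` keeps at most 50,000 bytes of body text, which may be noticeably fewer than 50,000 characters for CJK or emoji-heavy documents.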

Usage Monitoring

class GleanCostMonitor {
  private indexedDocs = 0;
  private budgetDocs: number;

  // budgetDocs is the document count the plan allows; 100,000 is a placeholder default.
  constructor(budgetDocs = 100_000) {
    this.budgetDocs = budgetDocs;
  }

  // Adds newly indexed documents to the running total and warns above 80% of budget.
  recordIndexed(count: number): void {
    this.indexedDocs += count;
    const utilization = (this.indexedDocs / this.budgetDocs) * 100;
    if (utilization > 80) {
      console.warn(`Glean index at ${utilization.toFixed(0)}% capacity: ${this.indexedDocs}/${this.budgetDocs} docs`);
    }
  }

  getUtilization(): string {
    return `${((this.indexedDocs / this.budgetDocs) * 100).toFixed(1)}% index capacity used`;
  }
}
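
The monitor above tracks index volume only. Query volume, the other driver named in the overview, could be tracked the same way with a fixed one-hour window. The sketch below is an assumption-laden illustration: `QueryRateMonitor`, the 5,000-per-hour budget, and the injectable clock are hypothetical and not tied to any Glean quota.

```typescript
// Hypothetical fixed-window counter for search queries.
class QueryRateMonitor {
  private windowStart: number;
  private count = 0;
  private budgetPerHour: number;
  private now: () => number;

  // budgetPerHour is an assumed value; `now` is injectable for testing.
  constructor(budgetPerHour = 5_000, now: () => number = Date.now) {
    this.budgetPerHour = budgetPerHour;
    this.now = now;
    this.windowStart = now();
  }

  // Records one query and returns true while the current hour is within budget.
  record(): boolean {
    const t = this.now();
    if (t - this.windowStart >= 3_600_000) {
      // A full hour has elapsed: start a fresh window.
      this.windowStart = t;
      this.count = 0;
    }
    this.count += 1;
    return this.count <= this.budgetPerHour;
  }

  // Count for the window that was active at the most recent record() call.
  getCount(): number {
    return this.count;
  }
}
```

A fixed window is the simplest option; it can let through up to twice the budget across a window boundary, so a sliding window would be the next step if that matters.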

Cost Optimization Checklist

  • Filter drafts, templates, and archived documents before indexing
  • Prune documents not updated in 12+ months
  • Use incremental indexing — only process changed documents
  • Truncate document bodies to 50KB maximum
  • Consolidate overlapping datasource connectors
  • Audit user seats quarterly and deprovision inactive accounts
  • Skip attachments larger than 10MB
  • Monitor index utilization with 80% threshold alerts
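
The quarterly seat audit in the checklist can be sketched as a filter over last-activity timestamps. The `Seat` shape and `findInactiveSeats` are hypothetical; in practice the data would come from an identity provider or an admin export, and the 90-day window is an assumption that matches a quarterly cadence.

```typescript
// Hypothetical seat record; not a Glean data structure.
interface Seat {
  email: string;
  lastActiveAt: number | null; // epoch milliseconds, null if the user never signed in
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns seats with no activity inside the lookback window (default 90 days).
function findInactiveSeats(
  seats: Seat[],
  now: number = Date.now(),
  inactiveDays = 90,
): Seat[] {
  const cutoff = now - inactiveDays * DAY_MS;
  return seats.filter((s) => s.lastActiveAt === null || s.lastActiveAt < cutoff);
}
```

The result is a candidate list for review rather than an automatic deprovisioning list, since some accounts (for example, users on leave) may be inactive for legitimate reasons.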

Error Handling

| Issue | Cause | Fix |
| --- | --- | --- |
| Index bloat exceeding budget | No content filtering on connectors | Apply shouldIndex filter to all datasource pipelines |
| Stale search results | Deleted docs still in index | Run nightly reconciliation to remove orphaned entries |
| Connector timeouts | Source system rate limiting | Implement backoff and schedule syncs during off-peak |
| Duplicate documents indexed | Same content in multiple datasources | Deduplicate by content hash before indexing |
| Query costs spiking | Bot or automated search traffic | Rate-limit API search consumers; whitelist known clients |
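
Two of the fixes above, deduplicating by content hash and reconciling orphaned entries, reduce to small pure functions. The sketch below assumes a Node.js runtime (it uses the built-in `node:crypto` module); `dedupeByContentHash` and `findOrphanedIds` are hypothetical helpers, and the whitespace normalization is one possible definition of "same content".

```typescript
import { createHash } from "node:crypto";

// Keeps the first document seen for each distinct normalized body.
function dedupeByContentHash<T extends { content: string }>(docs: T[]): T[] {
  const seen = new Set<string>();
  const unique: T[] = [];
  for (const doc of docs) {
    // Collapse whitespace so formatting-only differences hash identically.
    const normalized = doc.content.replace(/\s+/g, " ").trim();
    const digest = createHash("sha256").update(normalized, "utf8").digest("hex");
    if (seen.has(digest)) continue;
    seen.add(digest);
    unique.push(doc);
  }
  return unique;
}

// Returns IDs that are in the index but no longer exist in the source system.
function findOrphanedIds(indexedIds: string[], sourceIds: string[]): string[] {
  const live = new Set(sourceIds);
  return indexedIds.filter((id) => !live.has(id));
}
```

Hashing catches exact duplicates only; near-duplicates with small edits would need a similarity measure instead. The orphan list from `findOrphanedIds` is what a nightly reconciliation job would pass to whatever delete operation the indexing pipeline exposes.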

Next Steps

See glean-performance-tuning.