Claude-skill-registry libingest

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/libingest" ~/.claude/skills/majiayu000-claude-skill-registry-libingest && rm -rf "$T"
manifest: skills/data/libingest/SKILL.md
source content

libingest Skill

When to Use

  • Converting PDF documents to structured HTML
  • Processing PowerPoint presentations for indexing
  • Extracting semantic content from images via OCR
  • Building document ingestion pipelines

Key Concepts

IngestPipeline: Orchestrates a sequence of transformation steps defined in config/ingest.yml.

IngestStep: A single processing step within the pipeline (pdf-to-images, images-to-html, extract-context, annotate-html, normalize-html).
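
The relationship between the two concepts can be sketched as a loop that feeds each step's output into the next. This is a minimal illustration, not libingest's implementation: the `runSteps` helper and the `{ name, run }` step shape are hypothetical, the step bodies are stand-ins, and the order is assumed to match the list above. Real steps would most likely be asynchronous; the sketch is synchronous for brevity.

```javascript
// Step names as listed above; running them in this order is an assumption.
const STEP_ORDER = [
  "pdf-to-images",
  "images-to-html",
  "extract-context",
  "annotate-html",
  "normalize-html",
];

// Hypothetical orchestrator: run each step in order, passing each
// step's output to the next, and record which steps ran.
function runSteps(steps, input) {
  const trace = [];
  let current = input;
  for (const step of steps) {
    current = step.run(current);
    trace.push(step.name);
  }
  return { output: current, trace };
}

// Stand-in steps that only append their own name to the value.
const demoSteps = STEP_ORDER.map((name) => ({
  name,
  run: (value) => `${value} -> ${name}`,
}));

const demoResult = runSteps(demoSteps, "document.pdf");
console.log(demoResult.trace.join(", "));
```

Keeping a trace alongside the output makes it easy to see which step a failure or unexpected result came from.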

Usage Patterns

Pattern 1: Run ingestion via CLI

# Drop files in data/ingest/in/
cp document.pdf data/ingest/in/

# Run pipeline
make ingest

Pattern 2: Programmatic ingestion

import { IngestPipeline } from "@copilot-ld/libingest";

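// config, storage, and llmClient are assumed to be created elsewhere
// (config is loaded from config/ingest.yml).
const pipeline = new IngestPipeline(config, storage, llmClient);
const result = await pipeline.process("document.pdf");
// result.output points to the final HTML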
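
When several documents need ingesting, one option is a small wrapper that calls `process` on each file and collects failures instead of stopping at the first error. The sketch below assumes only what Pattern 2 shows: an object with an async `process(path)` method whose result has an `output` property. The `processAll` helper and `stubPipeline` are hypothetical; the stub stands in for a real `IngestPipeline` so the example runs without the library.

```javascript
// Hypothetical helper: process files one at a time, collecting
// successes and failures rather than aborting on the first error.
async function processAll(pipeline, files) {
  const succeeded = [];
  const failed = [];
  for (const file of files) {
    try {
      const result = await pipeline.process(file);
      succeeded.push({ file, output: result.output });
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      failed.push({ file, message });
    }
  }
  return { succeeded, failed };
}

// Stub standing in for IngestPipeline; its behavior is invented for the demo.
const stubPipeline = {
  async process(file) {
    if (!file.endsWith(".pdf")) throw new Error(`unsupported file: ${file}`);
    return { output: file.replace(/\.pdf$/, ".html") };
  },
};

const summaryPromise = processAll(stubPipeline, ["a.pdf", "notes.txt"]);
summaryPromise.then((summary) => console.log(summary));
```

Files are processed sequentially here, on the assumption that each run makes vision-model calls that are better not issued all at once; a bounded-concurrency variant would be a reasonable alternative.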

Integration

The pipeline is configured via config/ingest.yml and uses libllm for vision processing. Output is stored in data/ingest/pipeline/.