Kreuzberg api-server-mcp

api server mcp

install
source · Clone the upstream repo
git clone https://github.com/kreuzberg-dev/kreuzberg
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/kreuzberg-dev/kreuzberg "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.ai-rulez/skills/api-server-mcp" ~/.claude/skills/kreuzberg-dev-kreuzberg-api-server-mcp && rm -rf "$T"
manifest: .ai-rulez/skills/api-server-mcp/SKILL.md
source content

API Server & MCP Protocol

Axum server design for document extraction endpoints, middleware, async processing, and Model Context Protocol integration for AI agents

Kreuzberg API Architecture

Location:

crates/kreuzberg/src/api/
,
crates/kreuzberg-cli/

Kreuzberg provides a dual REST API + MCP server built with Axum + Tokio.

Request Flow:
HTTP Client / AI Agent (Claude)
    |
[Transport Layer]
├── REST API (Axum HTTP)
└── MCP Protocol (HTTP or Stdio)
    |
[Middleware Layer]
├── CORS, Request Logging (TraceLayer)
├── Request/Response size limits
└── Rate limiting (optional)
    |
[Router]
├── REST Endpoints
│   ├── POST /extract - File upload extraction
│   ├── POST /extract-url - URL-based extraction
│   ├── GET /formats - List supported formats
│   ├── GET /health - Server health check
│   ├── POST /batch - Batch document processing
│   ├── GET /cache/stats - Cache statistics
│   └── DELETE /cache - Clear extraction cache
├── MCP Endpoints
│   ├── POST /mcp/tools - List available tools
│   ├── POST /mcp/tools/call - Call a tool
│   ├── GET /mcp/resources - List resources
│   ├── GET /mcp/resources/:uri - Read resource
│   ├── GET /mcp/prompts - List prompts
│   └── GET /mcp/prompts/:name - Get prompt
    |
[Handler / Tool Layer]
├── extract_handler / extract_file tool
├── batch_handler / batch_extract tool
├── health_handler / get_capabilities tool
└── format_handler
    |
[Extraction Core]
├── Format detection
├── Extraction pipeline
├── Post-processing (chunking, embeddings)
└── Result formatting
    |
JSON Response / MCP ToolResult

Server Setup & Configuration

Location:

crates/kreuzberg/src/api/server.rs

Server initialization pattern: Create

ApiState
(holds
ExtractionConfig
+
ExtractionCache
), build Axum
Router
with all REST + MCP routes, apply middleware layers (body limits, CORS, tracing), serve via
tokio::net::TcpListener
.

Key middleware layers applied in order:

  • DefaultBodyLimit::max(100MB)
    +
    RequestBodyLimitLayer
    -- configurable via env vars
  • CorsLayer::permissive()
    -- restrict in production via
    CORS_ALLOWED_ORIGINS
  • TraceLayer::new_for_http()
    -- request/response logging

Core REST Handlers

Location:

crates/kreuzberg/src/api/handlers.rs

HandlerMethodDescription
extract_handler
POST /extractMultipart upload: parse file + optional config JSON, check cache, call
extract_bytes()
, cache result
extract_url_handler
POST /extract-urlFetch URL via reqwest, extract bytes
batch_handler
POST /batchParallel extraction with
Semaphore
-limited concurrency (default: CPU count)
health_handler
GET /healthReport status, version, uptime, feature availability (OCR, embeddings), cache stats
formats_handler
GET /formatsReturn supported format categories (office, pdf, images, web, email, archives, academic)
cache_stats_handler
GET /cache/statsHit/miss counts and hit rate
cache_clear_handler
DELETE /cacheClear LRU cache

Caching Strategy

Location:

crates/kreuzberg/src/cache/mod.rs

LRU cache keyed by

SHA256(file_content)
, stores
Arc<ExtractionResult>
. Default 1000 entries. Thread-safe via
RwLock
. Tracks hit/miss counters with
AtomicU64
for stats endpoint.

Error Handling

Location:

crates/kreuzberg/src/api/error.rs

ApiError
enum maps to HTTP status codes:

  • MissingFile
    -> 400,
    FileNotFound
    -> 404
  • OnnxRuntimeMissing
    /
    TesseractMissing
    -> 503 (with remediation message)
  • PayloadTooLarge
    -> 413
  • ExtractionFailed
    /
    InvalidConfig
    /
    UnsupportedFormat
    -> 500

MCP Server Implementation

Location:

crates/kreuzberg/src/mcp/server.rs

The MCP server allows Claude and other AI agents to call Kreuzberg extraction functions through the Model Context Protocol.

MCP Tools (Callable Functions)

Three tools are registered:

ToolPurposeRequired Params
extract_file
Extract text/tables/metadata from documents (75+ formats)
file_path
batch_extract
Extract from multiple documents in parallel
file_paths[]
get_capabilities
List supported formats, features, backends(none)

Tool registration pattern (example:

extract_file
):

// Define Tool with name, description, JSON Schema inputSchema
// Register with server.register_tool(tool, handler_fn)
// Handler: parse params -> build ExtractionConfig -> call extract_file() -> return ToolResult as JSON

extract_file
optional params:
format
,
extract_tables
,
extract_images
,
ocr_enabled
,
extract_metadata
,
chunking_preset
,
generate_embeddings
.

MCP Resources (Static Knowledge)

Three resources provide static information to agents:

  • kreuzberg://formats
    -- Supported format list as JSON
  • kreuzberg://features
    -- Cross-binding feature matrix (from
    FEATURE_MATRIX.md
    )
  • kreuzberg://api-reference
    -- Generated API documentation

MCP Prompts (Agent Templates)

Two prompts guide agent extraction workflows:

  • extract_for_rag
    -- Document type-specific RAG extraction guidance (research paper, contract, report). Recommends chunking preset and embedding config.
  • batch_document_processing
    -- Optimal concurrency, grouping, and error handling for batch workflows.

MCP Transport Protocols

  • HTTP/REST: MCP routes mounted alongside REST API on separate
    /mcp/
    prefix
  • Stdio: JSON-RPC 2.0 over stdin/stdout for local CLI integration (e.g., Claude Desktop)

Integration with Claude Desktop

{
  "mcpServers": {
    "kreuzberg": {
      "command": "kreuzberg-mcp",
      "env": {
        "KREUZBERG_API_BASE": "http://localhost:8000",
        "KREUZBERG_MCP_TRANSPORT": "stdio"
      }
    }
  }
}

MCP Error Handling

ToolError
variants:
FileNotFound
,
UnsupportedFormat
,
ExtractionFailed
,
OnnxRuntimeMissing
,
TesseractMissing
,
Timeout
. Each maps to an MCP
ToolResultError
with descriptive code and message.

Environment Configuration

See

.env.example
for all configurable variables. Key categories:

  • Server:
    KREUZBERG_HOST
    ,
    KREUZBERG_PORT
  • Size limits:
    KREUZBERG_MAX_REQUEST_BODY_BYTES
    (default 100MB),
    KREUZBERG_MAX_MULTIPART_FIELD_BYTES
  • Features:
    KREUZBERG_ENABLE_OCR
    ,
    KREUZBERG_ENABLE_EMBEDDINGS
    ,
    KREUZBERG_ENABLE_KEYWORDS
  • Cache:
    KREUZBERG_CACHE_ENABLED
    ,
    KREUZBERG_CACHE_SIZE
  • CORS:
    CORS_ALLOWED_ORIGINS
    (comma-separated)
  • MCP:
    KREUZBERG_MCP_HOST
    ,
    KREUZBERG_MCP_PORT
    ,
    KREUZBERG_MCP_TRANSPORT
    (stdio/http)
  • Logging:
    RUST_LOG=kreuzberg=info,tower_http=debug

Critical Rules

REST API Rules

  1. Always validate multipart file uploads - Check MIME type, size, magic bytes
  2. Timeout long-running extractions - Set per-handler timeout (5 min default)
  3. Stream large files - Never buffer entire multi-GB file in memory
  4. Cache aggressively - Identical files should return from cache in <1ms
  5. Parallel extraction is CPU-bound - Limit workers to CPU count + 1
  6. Error responses must be actionable - Include error code and remediation suggestion
  7. Health checks must verify features - Report missing dependencies (ONNX, Tesseract)
  8. Size limits are configurable - Allow override via env var for large deployments
  9. CORS is permissive by default - Restrict in production via env var
  10. Logging all requests - Track extraction metrics for observability

MCP Rules

  1. All tools must have timeout - Prevent hanging on large files (default 5 min)
  2. Error responses must be detailed - Include suggestions for missing dependencies
  3. Feature gates must be checked - Return helpful message if feature unavailable (embeddings, OCR)
  4. Resources should be static - Don't query external services in resource handlers
  5. Prompts guide agents - Provide clear examples and best practices
  6. Batch tools must support cancellation - Allow agent to stop long-running batch operations
  7. Logging all tool calls - Track usage for analytics and debugging

Related Skills

  • extraction-pipeline-patterns - Core extraction called by handlers and MCP tools
  • chunking-embeddings - Optional chunking/embedding parameters in extraction
  • ocr-backend-management - OCR engine selection and image preprocessing