install
source · Clone the upstream repo
git clone https://github.com/kreuzberg-dev/kreuzberg
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/kreuzberg-dev/kreuzberg "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.ai-rulez/skills/wasm-constraints" ~/.claude/skills/kreuzberg-dev-kreuzberg-wasm-constraints && rm -rf "$T"
manifest: .ai-rulez/skills/wasm-constraints/SKILL.md
priority: high
WASM Build Constraints
Overview
The WASM target lives in crates/kreuzberg-wasm/. It uses wasm-bindgen with sync-only internal APIs.
Feature Flags
[features]
wasm-target = ["pdf", "html", "xml", "email", "language-detection", "chunking", "quality", "office"]
wasm-threads = ["dep:wasm-bindgen-rayon"] # Optional
Critical Constraints
1. No Tokio Runtime
All operations must be synchronous internally. Use #[cfg(not(feature = "tokio-runtime"))] paths.
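One way to read this constraint is that the synchronous function is the real implementation and any async entry point is a thin wrapper that only exists when the runtime feature is enabled. The sketch below assumes that layout; `word_count_sync` and `word_count` are hypothetical names chosen for illustration, not kreuzberg APIs, and the tokio branch is only compiled when a `tokio-runtime` feature is active.

```rust
/// Hypothetical sync core: always compiled, safe to call on WASM.
pub fn word_count_sync(content: &[u8]) -> usize {
    String::from_utf8_lossy(content).split_whitespace().count()
}

/// Async wrapper, compiled only when the `tokio-runtime` feature is enabled.
/// It offloads the sync core to tokio's blocking thread pool.
#[cfg(feature = "tokio-runtime")]
pub async fn word_count(content: Vec<u8>) -> usize {
    tokio::task::spawn_blocking(move || word_count_sync(&content))
        .await
        .expect("blocking task panicked")
}

/// Sync path used when no runtime is available (the WASM case).
#[cfg(not(feature = "tokio-runtime"))]
pub fn word_count(content: &[u8]) -> usize {
    word_count_sync(content)
}
```

Keeping the logic in the sync function means the WASM build never needs the runtime crate at all; only the wrapper differs between targets.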
2. SyncExtractor Required
Every WASM-compatible extractor MUST implement SyncExtractor:
impl SyncExtractor for MyExtractor {
    fn extract_sync(&self, content: &[u8], mime_type: &str, config: &ExtractionConfig) -> Result<ExtractionResult> {
        /* sync implementation */
    }
}

impl DocumentExtractor for MyExtractor {
    fn as_sync_extractor(&self) -> Option<&dyn SyncExtractor> {
        Some(self) // MUST return Some for WASM
    }
}
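To show why returning Some matters, here is a self-contained sketch of how a caller might use as_sync_extractor. The traits below are simplified stand-ins (no MIME type, no config, String results), the `None` default and the `extract_for_wasm` dispatcher are assumptions of this sketch, and `PlainText` and `AsyncOnly` are hypothetical extractors.

```rust
/// Simplified stand-in for the real trait: bytes in, text out.
pub trait SyncExtractor {
    fn extract_sync(&self, content: &[u8]) -> Result<String, String>;
}

/// Simplified stand-in; defaulting to `None` is an assumption of this sketch.
pub trait DocumentExtractor {
    fn as_sync_extractor(&self) -> Option<&dyn SyncExtractor> {
        None
    }
}

/// Hypothetical extractor with a sync path.
pub struct PlainText;

impl SyncExtractor for PlainText {
    fn extract_sync(&self, content: &[u8]) -> Result<String, String> {
        String::from_utf8(content.to_vec()).map_err(|e| e.to_string())
    }
}

impl DocumentExtractor for PlainText {
    fn as_sync_extractor(&self) -> Option<&dyn SyncExtractor> {
        Some(self)
    }
}

/// Hypothetical extractor that never exposes a sync path.
pub struct AsyncOnly;

impl DocumentExtractor for AsyncOnly {}

/// Hypothetical dispatcher: on WASM only the sync path can be used,
/// so an extractor that returns `None` is rejected with an error.
pub fn extract_for_wasm(
    extractor: &dyn DocumentExtractor,
    content: &[u8],
) -> Result<String, String> {
    let sync = extractor
        .as_sync_extractor()
        .ok_or_else(|| "extractor has no sync path".to_string())?;
    sync.extract_sync(content)
}
```

In this sketch an extractor that forgets to override as_sync_extractor still compiles, but fails at call time, which is why the rule is stated as a MUST.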
3. HTML Size Limit
const MAX_HTML_SIZE: usize = 2 * 1024 * 1024; // 2MB - stack constraint
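A guard built on this constant could look like the following sketch. `check_html_size` is a hypothetical helper, and treating an input of exactly MAX_HTML_SIZE bytes as acceptable is an assumption; the source only states the limit itself.

```rust
/// Limit from the text above: 2 MB.
const MAX_HTML_SIZE: usize = 2 * 1024 * 1024;

/// Hypothetical pre-check: reject oversized HTML before any parsing starts.
/// Assumes the limit is inclusive (exactly MAX_HTML_SIZE bytes is allowed).
pub fn check_html_size(content: &[u8]) -> Result<(), String> {
    if content.len() > MAX_HTML_SIZE {
        return Err(format!(
            "HTML input is {} bytes; limit is {} bytes",
            content.len(),
            MAX_HTML_SIZE
        ));
    }
    Ok(())
}
```

Checking the length up front keeps the failure cheap and predictable, rather than letting a deep parse run into the stack limit.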
4. PDFium Initialization (from JS)
import init, { initialize_pdfium_render } from './kreuzberg_wasm.js';

const wasm = await init();
const pdfium = await pdfiumModule();
initialize_pdfium_render(pdfium, wasm, false); // REQUIRED for PDF
Build Config
[lib]
crate-type = ["cdylib", "rlib"]

[profile.release.package.kreuzberg-wasm]
opt-level = "z" # Size optimization
codegen-units = 1
API Pattern
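#[wasm_bindgen]
pub async fn extract_from_bytes(
    content: Vec<u8>,
    mime_type: String,
    config: JsValue,
) -> Result<JsValue, JsValue> {
    // Deserialize the JS config object into the Rust config type.
    let config: ExtractionConfig = serde_wasm_bindgen::from_value(config)?;
    // Extraction itself runs synchronously; only the wrapper is async.
    let result = extract_bytes_sync(&content, &mime_type, &config)?;
    Ok(serde_wasm_bindgen::to_value(&result)?)
}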
Functions can be async for JS compatibility, but internal extraction is sync.
Critical Rules
- No tokio — all operations synchronous
- Implement SyncExtractor for all WASM-compatible extractors
- HTML limited to 2MB due to stack constraints
- PDFium requires manual JS initialization
- Size optimization via opt-level = "z"
- Feature gate with #[cfg(target_arch = "wasm32")]
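The last rule, gating on the target architecture, can be sketched as a pair of functions with the same name where only one is compiled per target. `default_worker_count` is a hypothetical helper; it assumes a single-threaded WASM build and ignores the optional wasm-threads feature.

```rust
/// WASM build: assume a single thread (ignores the optional wasm-threads feature).
#[cfg(target_arch = "wasm32")]
pub fn default_worker_count() -> usize {
    1
}

/// Native build: ask the OS, falling back to 1 if the query fails.
#[cfg(not(target_arch = "wasm32"))]
pub fn default_worker_count() -> usize {
    std::thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
}
```

Callers use `default_worker_count()` without any cfg of their own, so the platform difference stays in one place.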