Claude-skill-registry cloudflare-vectorize
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/cloudflare-vectorize-dennislee928-smart-zone" ~/.claude/skills/majiayu000-claude-skill-registry-cloudflare-vectorize && rm -rf "$T"
skills/data/cloudflare-vectorize-dennislee928-smart-zone/SKILL.mdCloudflare Vectorize
Complete implementation guide for Cloudflare Vectorize - a globally distributed vector database for building semantic search, RAG (Retrieval Augmented Generation), and AI-powered applications with Cloudflare Workers.
Status: Production Ready ✅ Last Updated: 2026-01-21 Dependencies: cloudflare-worker-base (for Worker setup), cloudflare-workers-ai (for embeddings) Latest Versions: wrangler@4.59.3, @cloudflare/workers-types@4.20260109.0 Token Savings: ~70% Errors Prevented: 14 Dev Time Saved: ~4 hours
What This Skill Provides
Core Capabilities
- ✅ Index Management: Create, configure, and manage vector indexes
- ✅ Vector Operations: Insert, upsert, query, delete, and list vectors (list-vectors added August 2025)
- ✅ Metadata Filtering: Advanced filtering with 10 metadata indexes per index
- ✅ Semantic Search: Find similar vectors using cosine, euclidean, or dot-product metrics
- ✅ RAG Patterns: Complete retrieval-augmented generation workflows
- ✅ Workers AI Integration: Native embedding generation with @cf/baai/bge-base-en-v1.5
- ✅ OpenAI Integration: Support for text-embedding-3-small/large models
- ✅ Document Processing: Text chunking and batch ingestion pipelines
- ✅ Testing Setup: Vitest configuration with Vectorize bindings
Templates Included
- basic-search.ts - Simple vector search with Workers AI
- rag-chat.ts - Full RAG chatbot with context retrieval
- document-ingestion.ts - Document chunking and embedding pipeline
- metadata-filtering.ts - Advanced filtering patterns
⚠️ Vectorize V2 Breaking Changes (September 2024)
IMPORTANT: Vectorize V2 became GA in September 2024 with significant breaking changes.
What Changed in V2
Performance Improvements:
- Index capacity: 200,000 → 5 million vectors per index
- Query latency: 549ms → 31ms median (18× faster)
- TopK limit: 20 → 100 results per query
- Scale limits: 100 → 50,000 indexes per account
- Namespace limits: 100 → 50,000 namespaces per index
Breaking API Changes:
-
Async Mutations - All mutations now asynchronous:
// V2: Returns mutationId const result = await env.VECTORIZE_INDEX.insert(vectors); console.log(result.mutationId); // "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" // Vector inserts/deletes may take a few seconds to be reflected -
returnMetadata Parameter - Boolean → String enum:
// ❌ V1 (deprecated) { returnMetadata: true } // ✅ V2 (required) { returnMetadata: 'all' | 'indexed' | 'none' } -
Metadata Indexes Required Before Insert:
- V2 requires metadata indexes created BEFORE vectors inserted
- Vectors added before metadata index won't be indexed
- Must re-upsert vectors after creating metadata index
V1 Deprecation Timeline:
- December 2024: Can no longer create V1 indexes
- Existing V1 indexes: Continue to work (other operations unaffected)
- Migration: Use
flag for V1 operationswrangler vectorize --deprecated-v1
Wrangler Version Required:
- Minimum: wrangler@3.71.0 for V2 commands
- Recommended: wrangler@4.54.0+ (latest)
Check Mutation Status
// Get index info to check last mutation processed const info = await env.VECTORIZE_INDEX.describe(); console.log(info.mutationId); // Last mutation ID console.log(info.processedUpToMutation); // Last processed timestamp
Critical Setup Rules
⚠️ MUST DO BEFORE INSERTING VECTORS
# 1. Create the index with FIXED dimensions and metric npx wrangler vectorize create my-index \ --dimensions=768 \ --metric=cosine # 2. Create metadata indexes IMMEDIATELY (before inserting vectors!) npx wrangler vectorize create-metadata-index my-index \ --property-name=category \ --type=string npx wrangler vectorize create-metadata-index my-index \ --property-name=timestamp \ --type=number
Why: Metadata indexes MUST exist before vectors are inserted. Vectors added before a metadata index was created won't be filterable on that property.
Index Configuration (Cannot Be Changed Later)
# Dimensions MUST match your embedding model output: # - Workers AI @cf/baai/bge-base-en-v1.5: 768 dimensions # - OpenAI text-embedding-3-small: 1536 dimensions # - OpenAI text-embedding-3-large: 3072 dimensions # Metrics determine similarity calculation: # - cosine: Best for normalized embeddings (most common) # - euclidean: Absolute distance between vectors # - dot-product: For non-normalized vectors
Wrangler Configuration
wrangler.jsonc:
{ "name": "my-vectorize-worker", "main": "src/index.ts", "compatibility_date": "2025-10-21", "vectorize": [ { "binding": "VECTORIZE_INDEX", "index_name": "my-index" } ], "ai": { "binding": "AI" } }
TypeScript Types
export interface Env { VECTORIZE_INDEX: VectorizeIndex; AI: Ai; } interface VectorizeVector { id: string; values: number[] | Float32Array | Float64Array; namespace?: string; metadata?: Record<string, string | number | boolean | string[]>; } interface VectorizeMatches { matches: Array<{ id: string; score: number; values?: number[]; metadata?: Record<string, any>; namespace?: string; }>; count: number; }
Metadata Filter Operators (V2)
Vectorize V2 supports advanced metadata filtering with range queries:
// Equality (implicit $eq) { category: "docs" } // Not equals { status: { $ne: "archived" } } // In/Not in arrays { category: { $in: ["docs", "tutorials"] } } { category: { $nin: ["deprecated", "draft"] } } // Range queries (numbers) - NEW in V2 { timestamp: { $gte: 1704067200, $lt: 1735689600 } } // Range queries (strings) - prefix searching { url: { $gte: "/docs/workers", $lt: "/docs/workersz" } } // Nested metadata with dot notation { "author.id": "user123" } // Multiple conditions (implicit AND) { category: "docs", language: "en", "metadata.published": true }
Metadata Best Practices
1. Cardinality Considerations
Low Cardinality (Good for $eq filters):
// Few unique values - efficient filtering metadata: { category: "docs", // ~10 categories language: "en", // ~5 languages published: true // 2 values (boolean) }
High Cardinality (Avoid in range queries):
// Many unique values - avoid large range scans metadata: { user_id: "uuid-v4...", // Millions of unique values timestamp_ms: 1704067200123 // Use seconds instead }
2. Metadata Limits
- Max 10 metadata indexes per Vectorize index
- Max 10 KiB metadata per vector
- String indexes: First 64 bytes (UTF-8)
- Number indexes: Float64 precision
- Filter size: Max 2048 bytes (compact JSON)
3. Vector Dimension Limit
Current Limit: 1536 dimensions per vector Source: GitHub Issue #8729
Supported Embedding Models:
- Workers AI
: 768 dimensions ✅@cf/baai/bge-base-en-v1.5 - OpenAI
: 1536 dimensions ✅text-embedding-3-small - OpenAI
: 3072 dimensions ❌ (requires dimension reduction)text-embedding-3-large
Unsupported Models (>1536 dimensions):
: 3584 dimensionsnomic-embed-code
: >1536 dimensionsQodo-Embed-1-7B
Workaround: Use dimensionality reduction (e.g., PCA) to compress embeddings to 1536 or fewer dimensions, though this may reduce semantic quality.
Feature Request: Higher dimension support is under consideration. Use Limit Increase Request Form if this blocks your use case.
4. Key Restrictions
// ❌ INVALID metadata keys metadata: { "": "value", // Empty key "user.name": "John", // Contains dot (reserved for nesting) "$admin": true, // Starts with $ "key\"with\"quotes": 1 // Contains quotes } // ✅ VALID metadata keys metadata: { "user_name": "John", "isAdmin": true, "nested": { "allowed": true } // Access as "nested.allowed" in filters }
Best Practices
Batch Insert Performance
Critical: Use batch size of 5000 vectors for optimal performance.
Performance Data:
- Individual inserts: 2.5M vectors in 36+ hours (incomplete)
- Batch inserts (5000): 4M vectors in ~12 hours
- 18× faster with proper batching
Why 5000?
- Vectorize's internal Write-Ahead Log (WAL) optimized for this size
- Avoids Cloudflare API rate limits
- Balances throughput and memory usage
Optimal Pattern:
const BATCH_SIZE = 5000; async function insertVectors(vectors: VectorizeVector[]) { for (let i = 0; i < vectors.length; i += BATCH_SIZE) { const batch = vectors.slice(i, i + BATCH_SIZE); const result = await env.VECTORIZE.insert(batch); console.log(`Inserted batch ${i / BATCH_SIZE + 1}, mutationId: ${result.mutationId}`); // Optional: Rate limiting delay if (i + BATCH_SIZE < vectors.length) { await new Promise(resolve => setTimeout(resolve, 100)); } } }
Sources:
Query Accuracy Modes
Vectorize uses approximate nearest neighbor (ANN) search by default with ~80% accuracy compared to exact search.
Default Mode: Approximate scoring (~80% accuracy)
- Faster latency
- Good for RAG, search, recommendations
- topK up to 100
High-Precision Mode: Near 100% accuracy
- Enabled via
returnValues: true - Higher latency
- Limited to topK=20
Trade-off Example:
// Fast, ~80% accuracy, topK up to 100 const results = await env.VECTORIZE.query(embedding, { topK: 50, returnValues: false // Default }); // Slower, ~100% accuracy, topK max 20 const preciseResults = await env.VECTORIZE.query(embedding, { topK: 10, returnValues: true // High-precision scoring });
When to Use High-Precision:
- Critical applications (fraud detection, legal compliance)
- Small result sets (topK < 20)
- Accuracy is higher priority than latency
Source: Cloudflare Blog - Building Vectorize
Common Errors & Solutions
Error 1: Metadata Index Created After Vectors Inserted
Problem: Filtering doesn't work on existing vectors Solution: Delete and re-insert vectors OR create metadata indexes BEFORE inserting
Error 2: Dimension Mismatch
Problem: "Vector dimensions do not match index configuration" Solution: Ensure embedding model output matches index dimensions: - Workers AI bge-base: 768 - OpenAI small: 1536 - OpenAI large: 3072
Error 3: Invalid Metadata Keys
Problem: "Invalid metadata key" Solution: Keys cannot: - Be empty - Contain . (dot) - Contain " (quote) - Start with $ (dollar sign)
Error 4: Filter Too Large
Problem: "Filter exceeds 2048 bytes" Solution: Simplify filter or split into multiple queries
Error 5: Range Query on High Cardinality
Problem: Slow queries or reduced accuracy Solution: Use lower cardinality fields for range queries, or use seconds instead of milliseconds for timestamps
Error 6: Insert vs Upsert Confusion
Problem: Updates not reflecting in index Solution: Use upsert() to overwrite existing vectors, not insert()
Error 7: Missing Bindings
Problem: "VECTORIZE_INDEX is not defined" Solution: Add [[vectorize]] binding to wrangler.jsonc
Error 8: Namespace vs Metadata Confusion
Problem: Unclear when to use namespace vs metadata filtering Solution: - Namespace: Partition key, applied BEFORE metadata filters - Metadata: Flexible key-value filtering within namespace
Error 9: V2 Async Mutation Timing (NEW in V2)
Problem: Inserted vectors not immediately queryable Solution: V2 mutations are asynchronous - vectors may take a few seconds to be reflected - Use mutationId to track mutation status - Check env.VECTORIZE_INDEX.describe() for processedUpToMutation timestamp
Error 10: V1 returnMetadata Boolean (BREAKING in V2)
Problem: "returnMetadata must be 'all', 'indexed', or 'none'" Solution: V2 changed returnMetadata from boolean to string enum: - ❌ V1: { returnMetadata: true } - ✅ V2: { returnMetadata: 'all' }
Error 11: Wrangler --json Output Contains Log Prefix
Error:
wrangler vectorize list --json output starts with log message, breaking JSON parsing
Source: GitHub Issue #11011
Affected Commands:
wrangler vectorize list --jsonwrangler vectorize list-metadata-index --json
Problem:
$ wrangler vectorize list --json 📋 Listing Vectorize indexes... [ { "created_on": "2025-10-18T13:28:30.259277Z", ... } ]
The log message makes output invalid JSON, breaking piping to
jq or other tools.
Solution: Strip first line before parsing:
# Using tail wrangler vectorize list --json | tail -n +2 | jq '.' # Using sed wrangler vectorize list --json | sed '1d' | jq '.'
Error 12: TypeScript Types Missing Filter Operators
Error:
wrangler types generates incomplete VectorizeVectorMetadataFilterOp type
Source: GitHub Issue #10092
Status: OPEN (tracked internally as VS-461)
Problem: Generated type only includes
$eq and $ne, missing V2 operators: $in, $nin, $lt, $lte, $gt, $gte
Impact: TypeScript shows false errors when using valid V2 metadata filter operators:
const vectorizeRes = env.VECTORIZE.queryById(imgId, { filter: { gender: { $in: genderFilters } }, // ❌ TS error but works! topK, returnMetadata: 'indexed', });
Workaround: Manual type override until wrangler types is fixed:
// Add to your types file type VectorizeMetadataFilter = Record<string, | string | number | boolean | { $eq?: string | number | boolean; $ne?: string | number | boolean; $in?: (string | number | boolean)[]; $nin?: (string | number | boolean)[]; $lt?: number | string; $lte?: number | string; $gt?: number | string; $gte?: number | string; } >;
Error 13: Windows Dev Registry Failure (FIXED)
Error:
ENOENT: no such file or directory when running wrangler dev on Windows
Source: GitHub Issue #10383
Status: FIXED in wrangler@4.32.0
Problem: Wrangler attempted to create external worker files with colons in the name (invalid on Windows):
Error: ENOENT: ... '__WRANGLER_EXTERNAL_VECTORIZE_WORKER:<project>:<binding>'
Solution: Update to wrangler@4.32.0 or later:
npm install -g wrangler@latest
Error 14: topK Limit Depends on returnValues/returnMetadata
Error:
topK exceeds maximum allowed value
Source: Vectorize Limits
Problem: Maximum topK value changes based on query options:
| Configuration | Max topK |
|---|---|
, | 100 |
OR | 20 |
| 100 |
Common Error:
// ❌ ERROR - topK too high with returnValues query(embedding, { topK: 100, // Exceeds limit! returnValues: true // Max topK=20 when true });
Solution:
// ✅ OK - respects conditional limit query(embedding, { topK: 20, returnValues: true }); // ✅ OK - higher topK without values query(embedding, { topK: 100, returnValues: false, returnMetadata: 'indexed' });
V2 Migration Checklist
If migrating from V1 to V2:
- ✅ Update wrangler to 3.71.0+ (
)npm install -g wrangler@latest - ✅ Create new V2 index (can't upgrade V1 → V2)
- ✅ Create metadata indexes BEFORE inserting vectors
- ✅ Update
boolean → string enum ('all', 'indexed', 'none')returnMetadata - ✅ Handle async mutations (expect
in responses)mutationId - ✅ Test with V2 limits (topK up to 100, 5M vectors per index)
- ✅ Update error handling for async behavior
V1 Deprecation:
- After December 2024: Cannot create new V1 indexes
- Existing V1 indexes: Continue to work
- Use
for V1 operationswrangler vectorize --deprecated-v1
Testing Considerations
Vitest with Vectorize Bindings
Issue: Using
@cloudflare/vitest-pool-workers with Vectorize or Workers AI bindings causes runtime failure.
Source: GitHub Issue #7434
Error:
wrapped binding module can't be resolved
Workaround:
- Create
without Vectorize/AI bindingswrangler-test.jsonc - Point vitest config to test-specific wrangler file
- Mock bindings in your tests
Example:
// wrangler-test.jsonc (no Vectorize binding) { "name": "my-worker-test", "main": "src/index.ts", "compatibility_date": "2025-10-21" // No vectorize binding } // vitest.config.ts import { defineWorkersProject } from '@cloudflare/vitest-pool-workers/config'; export default defineWorkersProject({ test: { poolOptions: { workers: { wrangler: { configPath: "./wrangler-test.jsonc" } } } } }); // Mock in tests import { vi } from 'vitest'; const mockVectorize = { query: vi.fn().mockResolvedValue({ matches: [ { id: 'test-1', score: 0.95, metadata: { category: 'docs' } } ], count: 1 }), insert: vi.fn().mockResolvedValue({ mutationId: "test-mutation-id" }), upsert: vi.fn().mockResolvedValue({ mutationId: "test-mutation-id" }) }; // Use mock in tests test('vector search', async () => { const env = { VECTORIZE_INDEX: mockVectorize }; // ... test logic });
Community Tips
Note: These tips come from community discussions and official blog posts. Verify against your Vectorize version.
Tip 1: Range Queries at Scale May Have Reduced Accuracy (Community-sourced)
Source: Query Best Practices Confidence: MEDIUM Applies to: Datasets with ~10M+ vectors
Range queries (
$lt, $lte, $gt, $gte) on large datasets may experience reduced accuracy.
Optimization Strategy:
// ❌ High-cardinality range at scale metadata: { timestamp_ms: 1704067200123 } filter: { timestamp_ms: { $gte: 1704067200000 } } // ✅ Bucketed into discrete values metadata: { timestamp_bucket: "2025-01-01-00:00", // 1-hour buckets timestamp_ms: 1704067200123 // Original (non-indexed) } filter: { timestamp_bucket: { $in: ["2025-01-01-00:00", "2025-01-01-01:00"] } }
When This Matters:
- Time-based filtering over months/years
- User IDs, transaction IDs (UUID ranges)
- Any high-cardinality continuous data
Alternative: Use equality filters (
$eq, $in) with bucketed values.
Tip 2: List Vectors Operation (Added August 2025)
Source: Vectorize Changelog
Vectorize V2 added support for the
list-vectors operation for paginated iteration through vector IDs.
Use Cases:
- Auditing vector collections
- Bulk vector operations
- Debugging index contents
API:
const result = await env.VECTORIZE_INDEX.list({ limit: 1000, // Max 1000 per page cursor?: string }); // result.vectors: Array<{ id: string }> // result.cursor: string | undefined // result.count: number // Pagination example let cursor: string | undefined; const allVectorIds: string[] = []; do { const result = await env.VECTORIZE_INDEX.list({ limit: 1000, cursor }); allVectorIds.push(...result.vectors.map(v => v.id)); cursor = result.cursor; } while (cursor);
Limitations:
- Returns IDs only (not values or metadata)
- Max 1000 vectors per page
- Use cursor for pagination
Official Documentation
- Vectorize V2 Docs: https://developers.cloudflare.com/vectorize/
- V2 Changelog: https://developers.cloudflare.com/vectorize/platform/changelog/
- V1 to V2 Migration: https://developers.cloudflare.com/vectorize/reference/transition-vectorize-legacy/
- Metadata Filtering: https://developers.cloudflare.com/vectorize/reference/metadata-filtering/
- Workers AI Models: https://developers.cloudflare.com/workers-ai/models/
Status: Production Ready ✅ (Vectorize V2 GA - September 2024) Last Updated: 2026-01-21 Token Savings: ~70% Errors Prevented: 14 (includes V2 breaking changes, testing setup, TypeScript types) Changes: Added 4 new errors (wrangler --json, TypeScript types, Windows dev, topK limits), batch performance best practices, query accuracy modes, testing setup, community tips on range queries and list-vectors operation.