Skills cloudflare-vectorize
git clone https://github.com/TerminalSkills/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/TerminalSkills/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/cloudflare-vectorize" ~/.claude/skills/terminalskills-skills-cloudflare-vectorize && rm -rf "$T"
skills/cloudflare-vectorize/SKILL.mdCloudflare Vectorize
Overview
Cloudflare Vectorize is a globally distributed vector database built into the Cloudflare Workers platform. It stores high-dimensional vectors (embeddings) and supports fast approximate nearest-neighbor search — all at the edge, with no separate infrastructure to manage.
Key features:
- Create and query indexes directly from Workers
- Metadata filtering alongside vector similarity
- Namespace support for multi-tenant isolation
- Native integration with Workers AI for end-to-end RAG
- Scales automatically with zero configuration
Setup
1. Create a Vectorize index
Use Wrangler CLI to create an index. Specify the embedding dimensions and distance metric:
# For BAAI/bge-base-en-v1.5 (768 dims, cosine similarity) npx wrangler vectorize create my-index \ --dimensions=768 \ --metric=cosine # For OpenAI text-embedding-3-small (1536 dims) npx wrangler vectorize create my-index \ --dimensions=1536 \ --metric=cosine # Euclidean and dot-product are also supported npx wrangler vectorize create my-index \ --dimensions=384 \ --metric=euclidean
2. Bind the index in wrangler.toml
wrangler.tomlname = "my-worker" main = "src/index.ts" compatibility_date = "2024-09-23" [[vectorize]] binding = "VECTORIZE_INDEX" index_name = "my-index"
3. TypeScript types
export interface Env { VECTORIZE_INDEX: VectorizeIndex }
Instructions
Step 1: Insert vectors
Each vector needs a unique string
id and a values array matching the index dimensions:
export default { async fetch(request: Request, env: Env): Promise<Response> { const vectors: VectorizeVector[] = [ { id: "doc-001", values: [0.1, 0.2, 0.3, /* ... 768 total */], metadata: { title: "Introduction to Cloudflare", url: "/docs/intro" }, }, { id: "doc-002", values: [0.4, 0.5, 0.6, /* ... */], metadata: { title: "Workers AI Overview", url: "/docs/workers-ai" }, }, ] const result = await env.VECTORIZE_INDEX.insert(vectors) // result.count = number of vectors inserted return Response.json({ inserted: result.count }) }, }
Step 2: Query for similar vectors
export default { async fetch(request: Request, env: Env): Promise<Response> { const { queryVector, topK = 5 } = await request.json() as { queryVector: number[] topK?: number } const results = await env.VECTORIZE_INDEX.query(queryVector, { topK, returnMetadata: true, // include metadata in results returnValues: false, // skip returning raw vector values }) // results.matches is sorted by score (highest = most similar) return Response.json({ matches: results.matches.map(m => ({ id: m.id, score: m.score, metadata: m.metadata, })) }) }, }
Step 3: Metadata filtering
Filter results to a subset before computing similarity — useful for multi-tenant or categorized data:
const results = await env.VECTORIZE_INDEX.query(queryVector, { topK: 10, returnMetadata: true, filter: { category: { $eq: "documentation" }, }, }) // Compound filter const filtered = await env.VECTORIZE_INDEX.query(queryVector, { topK: 5, returnMetadata: true, filter: { language: { $eq: "en" }, published: { $eq: true }, }, })
Supported filter operators:
$eq, $ne, $lt, $lte, $gt, $gte, $in
Step 4: Namespace support
Use namespaces to isolate data for different tenants or categories within a single index:
// Insert with namespace await env.VECTORIZE_INDEX.insert([{ id: "tenant-a-doc-1", values: embedding, metadata: { text: "Document content..." }, namespace: "tenant-a", }]) // Query within a namespace const results = await env.VECTORIZE_INDEX.query(queryVector, { topK: 5, returnMetadata: true, namespace: "tenant-a", })
Step 5: Get, update, and delete vectors
// Get vectors by ID const vectors = await env.VECTORIZE_INDEX.getByIds(["doc-001", "doc-002"]) // Upsert (insert or update) await env.VECTORIZE_INDEX.upsert([{ id: "doc-001", values: newEmbedding, metadata: { updated: true }, }]) // Delete by ID await env.VECTORIZE_INDEX.deleteByIds(["doc-001", "doc-002"])
Step 6: End-to-end RAG with Workers AI
Complete RAG pipeline — embed query, search Vectorize, generate answer with LLM:
export interface Env { AI: Ai VECTORIZE_INDEX: VectorizeIndex } export default { async fetch(request: Request, env: Env): Promise<Response> { const { question } = await request.json() as { question: string } // 1. Embed the user's question const embeddingResult = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [question], }) const queryVector = embeddingResult.data[0] // 2. Find relevant documents const searchResults = await env.VECTORIZE_INDEX.query(queryVector, { topK: 3, returnMetadata: true, }) const context = searchResults.matches .map(m => m.metadata?.text as string) .filter(Boolean) .join("\n\n") // 3. Generate answer with context const answer = await env.AI.run("@cf/meta/llama-3-8b-instruct", { messages: [ { role: "system", content: `Answer the question using only the provided context.\n\nContext:\n${context}`, }, { role: "user", content: question }, ], max_tokens: 512, }) return Response.json({ answer: answer.response, sources: searchResults.matches.map(m => ({ id: m.id, score: m.score, url: m.metadata?.url, })), }) }, }
Step 7: Bulk indexing pipeline
For indexing large document collections, batch inserts for efficiency:
async function indexDocuments( documents: Array<{ id: string; text: string; metadata: Record<string, unknown> }>, env: Env, batchSize = 100 ) { for (let i = 0; i < documents.length; i += batchSize) { const batch = documents.slice(i, i + batchSize) // Embed batch const embeddingResult = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: batch.map(d => d.text), }) // Prepare vectors const vectors: VectorizeVector[] = batch.map((doc, idx) => ({ id: doc.id, values: embeddingResult.data[idx], metadata: { ...doc.metadata, text: doc.text }, })) // Insert batch await env.VECTORIZE_INDEX.insert(vectors) console.log(`Indexed ${i + batch.length}/${documents.length} documents`) } }
Manage indexes via Wrangler
# List all indexes npx wrangler vectorize list # Describe an index (dimensions, metric, vector count) npx wrangler vectorize info my-index # Delete an index npx wrangler vectorize delete my-index # Get vectors by ID (for debugging) npx wrangler vectorize get-vectors my-index --ids=doc-001,doc-002
Guidelines
- Dimensions must match exactly what your embedding model produces — mismatches cause errors at insert time.
- Use
distance for normalized text embeddings (BAAI, OpenAI); usecosine
oreuclidean
only when your model specifically recommends it.dot-product - Store the original text in
so you can return it with search results without a separate database lookup.metadata - Vectorize supports up to 100 vectors per
call — batch larger datasets.insert() - Metadata values must be strings, numbers, or booleans; nested objects are not supported in filters.
- Use namespaces for multi-tenant apps instead of separate indexes — it's cheaper and simpler.
- Vectorize indexes have eventual consistency; newly inserted vectors may not appear in queries for a few seconds.
- Combine with Workers AI for fully serverless RAG — no external embedding API keys required.