Skills pgvector

install

source · Clone the upstream repo

git clone https://github.com/TerminalSkills/skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/TerminalSkills/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/pgvector" ~/.claude/skills/terminalskills-skills-pgvector && rm -rf "$T"

manifest: skills/pgvector/SKILL.md

safety · automated scan (low risk)

This is a pattern-based risk scan, not a security review. Our crawler flagged:

references .env files

Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.

source content

pgvector

Overview

pgvector adds vector similarity search to PostgreSQL. Store embeddings alongside your regular data — no separate vector database, no data sync, no new infrastructure. Use your existing Postgres for semantic search, RAG, recommendations, and deduplication. Supports exact and approximate nearest neighbor search with IVFFlat and HNSW indexes.

When to Use

Adding semantic/vector search to an existing Postgres-backed app
RAG (Retrieval-Augmented Generation) without running Pinecone/Qdrant/Weaviate
Storing embeddings alongside relational data (users, products, documents)
Recommendation systems based on content similarity
Don't want to manage a separate vector database

Instructions

Setup

-- Enable the extension (available on Supabase, Neon, RDS, self-hosted)
CREATE EXTENSION IF NOT EXISTS vector;

Schema Design

-- Store documents with embeddings alongside regular columns
CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  title TEXT NOT NULL,
  content TEXT NOT NULL,
  metadata JSONB DEFAULT '{}',
  embedding vector(1536),          -- OpenAI ada-002 dimension
  created_at TIMESTAMPTZ DEFAULT NOW()
);

-- HNSW index for fast approximate search (recommended)
CREATE INDEX ON documents
  USING hnsw (embedding vector_cosine_ops)
  WITH (m = 16, ef_construction = 64);

-- Or IVFFlat for lower memory usage
-- CREATE INDEX ON documents
--   USING ivfflat (embedding vector_cosine_ops)
--   WITH (lists = 100);

Store Embeddings

// ingest.ts — Generate and store embeddings
import { Pool } from "pg";
import OpenAI from "openai";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const openai = new OpenAI();

async function storeDocument(title: string, content: string, metadata: object = {}) {
  // Generate embedding
  const embeddingRes = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: content,
  });
  const embedding = embeddingRes.data[0].embedding;

  // Store with embedding (pgvector accepts array format)
  await pool.query(
    `INSERT INTO documents (title, content, metadata, embedding)
     VALUES ($1, $2, $3, $4)`,
    [title, content, JSON.stringify(metadata), JSON.stringify(embedding)]
  );
}

Similarity Search

// search.ts — Find similar documents by vector distance
async function semanticSearch(query: string, limit = 5, threshold = 0.7) {
  // Embed the query
  const embeddingRes = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });
  const queryEmbedding = embeddingRes.data[0].embedding;

  // Cosine similarity search
  const result = await pool.query(
    `SELECT id, title, content, metadata,
            1 - (embedding <=> $1::vector) AS similarity
     FROM documents
     WHERE 1 - (embedding <=> $1::vector) > $2
     ORDER BY embedding <=> $1::vector
     LIMIT $3`,
    [JSON.stringify(queryEmbedding), threshold, limit]
  );

  return result.rows;
  // [{ id: 1, title: "...", content: "...", similarity: 0.89 }, ...]
}

RAG with pgvector

// rag.ts — Retrieval-Augmented Generation using pgvector
async function ragAnswer(question: string): Promise<string> {
  // 1. Find relevant documents
  const docs = await semanticSearch(question, 5);

  // 2. Build context from retrieved documents
  const context = docs.map((d) => `## ${d.title}\n${d.content}`).join("\n\n");

  // 3. Generate answer with context
  const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "system",
        content: `Answer based on the following context. If the context doesn't contain the answer, say so.\n\n${context}`,
      },
      { role: "user", content: question },
    ],
  });

  return completion.choices[0].message.content!;
}

With Drizzle ORM

// schema.ts — pgvector with Drizzle ORM
import { pgTable, text, serial, jsonb, index, timestamp } from "drizzle-orm/pg-core";
import { customType } from "drizzle-orm/pg-core";

// Custom vector type for Drizzle
const vector = customType<{ data: number[]; driverData: string }>({
  dataType: () => "vector(1536)",
  toDriver: (value) => JSON.stringify(value),
});

export const documents = pgTable("documents", {
  id: serial("id").primaryKey(),
  title: text("title").notNull(),
  content: text("content").notNull(),
  embedding: vector("embedding"),
  createdAt: timestamp("created_at").defaultNow(),
});

Examples

Example 1: Add semantic search to an existing app

User prompt: "I have a Postgres database with articles. Add semantic search so users can search by meaning, not just keywords."

The agent will add a vector column, generate embeddings for existing articles, create an HNSW index, and build a search endpoint that combines vector similarity with existing filters.

Example 2: Document Q&A with RAG

User prompt: "Build a Q&A system over our internal docs using our existing Postgres database."

The agent will chunk documents, store embeddings in pgvector, and build a RAG pipeline that retrieves relevant chunks and generates answers.

Guidelines

HNSW index for most cases — faster queries, slightly more memory than IVFFlat
Cosine distance (
<=>
) — best for normalized embeddings (OpenAI, Cohere)
L2 distance (
<->
) — for non-normalized embeddings
Dimension must match model — ada-002: 1536, text-embedding-3-small: 1536
Index after bulk insert — create index after loading data, not before
Filter + vector search — combine
```
WHERE
```
clauses with vector similarity
No separate infrastructure — one less service to manage, deploy, and pay for
Supabase has it built-in —
```
enable_extension('vector')
```
in dashboard
Chunk long documents — 500-1500 tokens per chunk for best retrieval
Re-embed when you change models — embeddings from different models aren't compatible