Skillshub sqlite-vec-skilld
ALWAYS use when writing code importing "sqlite-vec". Consult for debugging, best practices, or modifying sqlite-vec, sqlite vec.
git clone https://github.com/ComeOnOliver/skillshub
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/harlan-zw/skilld/sqlite-vec-skilld" ~/.claude/skills/comeonoliver-skillshub-sqlite-vec-skilld && rm -rf "$T"
skills/harlan-zw/skilld/sqlite-vec-skilld/SKILL.md
asg017/sqlite-vec sqlite-vec
Version: 0.1.7
Tags: latest: 0.1.7, alpha: 0.1.7-alpha.13
References: package.json — exports, entry points • README — setup, basic usage • Docs — API reference, guides • GitHub Issues — bugs, workarounds, edge cases • Releases — changelog, breaking changes, new APIs
Search
Use `skilld search` instead of grepping `.skilld/` directories: it runs a hybrid semantic + keyword search across all indexed docs, issues, and releases. If `skilld` is unavailable, use `npx -y skilld search`.
skilld search "query" -p sqlite-vec
skilld search "issues:error handling" -p sqlite-vec
skilld search "releases:deprecated" -p sqlite-vec
Filters: a `docs:`, `issues:`, or `releases:` prefix narrows results by source type.
<!-- skilld:api-changes -->
API Changes
This section documents version-specific API changes — prioritize recent major/minor releases.
- BREAKING: DELETE operations now clear vector data and free space. Before v0.1.7, a DELETE only cleared validity bits, so code that deletes rows may see different storage behavior (source)
- NEW: Distance column constraints in KNN queries. v0.1.7 adds support for `>`, `>=`, `<`, and `<=` constraints on the distance column, enabling pagination-like patterns without requiring large k values (source)
- NEW: Metadata columns in vec0 virtual tables. v0.1.6 added the ability to declare metadata columns that can be filtered in the WHERE clause of KNN queries alongside vector matching (source)
- NEW: Partition keys for internal index sharding. v0.1.6 added `partition key` syntax to internally shard vector indexes by column values (source)
- NEW: Auxiliary columns with `+` prefix. v0.1.6 added support for auxiliary columns (prefixed with `+`) that are unindexed but available for fast lookups in KNN query results (source)
- BREAKING: `vec_npy_each` table function removed from the default entrypoint. v0.1.3 moved this experimental function out as a security mitigation for CVE-2024-46488; this affects code that runs untrusted SQL or calls the rarely used `vec_npy_each` function (source)
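The v0.1.6 and v0.1.7 additions listed above can be combined in one vec0 table and one KNN query. The following is a minimal Python sketch under stated assumptions: the table `vec_docs`, its column names, and the helper `build_knn_query` are hypothetical, and the SQL is only expected to run on a connection with sqlite-vec 0.1.7 or later loaded. The helper only builds the SQL text and its parameters, so it can be inspected without the extension installed.

```python
from typing import List, Optional, Tuple

# Hypothetical schema (names are illustrative, not from the sqlite-vec docs):
#   user_id   -> partition key, shards the vector index per user
#   embedding -> the indexed vector column
#   genre     -> metadata column, filterable in the KNN WHERE clause
#   +body     -> auxiliary column, stored but not indexed
CREATE_SQL = """
CREATE VIRTUAL TABLE vec_docs USING vec0(
  doc_id INTEGER PRIMARY KEY,
  user_id INTEGER PARTITION KEY,
  embedding FLOAT[4],
  genre TEXT,
  +body TEXT
)
"""


def build_knn_query(
    query_blob: bytes,
    k: int,
    user_id: int,
    genre: Optional[str] = None,
    after_distance: Optional[float] = None,
) -> Tuple[str, List[object]]:
    """Return (sql, params) for one page of KNN results on vec_docs."""
    where = ["embedding MATCH ?", "k = ?", "user_id = ?"]
    params: List[object] = [query_blob, k, user_id]
    if genre is not None:
        # Metadata filter evaluated alongside the vector match.
        where.append("genre = ?")
        params.append(genre)
    if after_distance is not None:
        # Distance constraint (v0.1.7+): skip rows at or below this distance.
        where.append("distance > ?")
        params.append(after_distance)
    sql = (
        "SELECT doc_id, distance, body FROM vec_docs WHERE "
        + " AND ".join(where)
        + " ORDER BY distance"
    )
    return sql, params
```

To fetch the next page, pass the last row's distance from the previous page as `after_distance`. Rows tied at exactly that distance would be skipped by a strict `>` comparison, so a tie-breaking rule may be needed in practice.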
Also changed: Static linking support for SQLite 3.31.1+ · `serialize_float32()` / `serialize_int8()` Python functions added
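The Python helpers named above turn a list of numbers into the compact binary form passed to vector columns. As a stdlib-only sketch of that kind of packing: the names `pack_float32` and `pack_int8` are hypothetical stand-ins, and the idea that `serialize_float32()` and `serialize_int8()` produce the same bytes is an assumption, not something this document states.

```python
import struct
from typing import Sequence


def pack_float32(vector: Sequence[float]) -> bytes:
    """Pack floats as consecutive 4-byte values in native byte order."""
    return struct.pack(f"{len(vector)}f", *vector)


def pack_int8(vector: Sequence[int]) -> bytes:
    """Pack integers in [-128, 127] as consecutive signed bytes."""
    return struct.pack(f"{len(vector)}b", *vector)
```

A 4-dimensional float vector packs to 16 bytes and a 4-dimensional int8 vector to 4 bytes, which is the storage difference the quantization guidance relies on.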
<!-- /skilld:api-changes -->
<!-- skilld:best-practices -->
Best Practices
- Use a two-column re-scoring pattern for binary quantization: store both quantized and full-precision vectors, query the coarse index with the quantized vectors, then re-score the top candidates at full precision to recover quality lost to 1-bit quantization (source)
- Combine `vec_slice()` with `vec_normalize()` for Matryoshka embeddings: truncating dimensions requires re-normalizing afterwards to preserve embedding quality and semantic meaning (source)
- Prefer scalar quantization over binary quantization for moderate storage savings: `vec_quantize_float16` (2 bytes per value) and `vec_quantize_int8` (1 byte per value) save less space than binary quantization but retain more quality for many use cases (source)
- Use partition keys to shard large vector datasets: declare a `partition key` column in `CREATE VIRTUAL TABLE` to internally shard the vector index on that column, improving query performance by reducing search scope (source)
- Combine metadata columns (indexed) with auxiliary columns (unindexed) for efficient filtering: use regular metadata columns for dimensions you filter on in KNN WHERE clauses, and prefix columns with `+` to store related data without indexing overhead (source)
- Use distance constraints instead of oversampling for pagination: as of v0.1.7, apply `distance > threshold` or `distance < threshold` constraints in WHERE clauses to paginate through KNN results without fetching excess candidates (source)
- Monitor the k limit when performing large KNN queries: the default maximum k is 4096 (configurable) to prevent memory exhaustion; KNN results are materialized in memory, and the internal cost grows as O(k²) (source)
- Rely on v0.1.7+ for automatic DELETE cleanup: vector space is now reclaimed when enough vectors are deleted to empty a chunk (~1024 vectors); previous versions only marked entries as deleted without freeing space (source)
- Select embedding models with quantization support for better results: models like `nomic-embed-text-v1.5`, `mxbai-embed-large-v1`, and OpenAI's `text-embedding-3` are specifically trained to maintain quality after quantization and Matryoshka truncation (source)
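The re-scoring and Matryoshka practices above come down to simple vector arithmetic, which can be illustrated without the extension. The following is a pure-Python sketch, not sqlite-vec's implementation. `truncate_and_normalize` mirrors the slice-then-normalize combination, and `rescore_search` mimics the two-column pattern with a Hamming-distance coarse pass followed by full-precision cosine re-scoring. All function names, the sign-based binarization rule, and the `oversample` parameter are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def truncate_and_normalize(vec: Sequence[float], dims: int) -> List[float]:
    """Keep the first `dims` values, then rescale to unit L2 length."""
    head = list(vec[:dims])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


def binarize(vec: Sequence[float]) -> int:
    """One bit per dimension: bit i is set when vec[i] > 0 (assumed rule)."""
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0  # treat zero vectors as maximally dissimilar
    return 1.0 - dot / (na * nb)


def rescore_search(
    query: Sequence[float],
    rows: Sequence[Tuple[str, Sequence[float]]],
    k: int,
    oversample: int = 4,
) -> List[Tuple[str, float]]:
    """Return the top-k (id, cosine distance) pairs after re-scoring."""
    q_bits = binarize(query)
    # Coarse pass: shortlist k * oversample rows by Hamming distance on 1-bit codes.
    # (A real table would store the codes instead of recomputing them per query.)
    coarse = sorted(rows, key=lambda r: hamming(q_bits, binarize(r[1])))
    shortlist = coarse[: k * oversample]
    # Fine pass: re-score only the shortlist with the full-precision vectors.
    fine = sorted(
        ((rid, cosine_distance(query, vec)) for rid, vec in shortlist),
        key=lambda pair: pair[1],
    )
    return fine[:k]
```

With `oversample=1` the fine pass can only reorder the k rows the coarse pass already chose, so a larger shortlist is what lets re-scoring recover neighbours that the 1-bit codes rank poorly.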