# claude-skill-registry · embedding
## Install

**source** · Clone the upstream repo:

```sh
git clone https://github.com/majiayu000/claude-skill-registry
```

**Claude Code** · Install into `~/.claude/skills/`:

```sh
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/skills/data/embedding" ~/.claude/skills/majiayu000-claude-skill-registry-embedding \
  && rm -rf "$T"
```

manifest: `skills/data/embedding/SKILL.md`
# Embedding Skill
Standalone embedding service for semantic search across any database.
## Architecture

```
┌─────────────────────────────────────────┐
│ embedding service (:8602)               │
│ Model: EMBEDDING_MODEL env var          │
│ Device: auto (CPU/GPU)                  │
└───────────────────┬─────────────────────┘
                    │
    ┌───────────────┼───────────────┐
    ▼               ▼               ▼
 memory         edge-verifier   your-project
 skill          searches        ArangoDB/etc
```
## Quick Start

```sh
# Start the service (first run loads model, ~5-10s)
./run.sh serve

# Embed text (CLI)
./run.sh embed --text "your query here"

# Embed via HTTP (after service is running)
curl -X POST http://127.0.0.1:8602/embed \
  -H "Content-Type: application/json" \
  -d '{"text": "your query here"}'
```
## Commands

| Command | Description |
|---|---|
| `serve` | Start persistent FastAPI server |
| `embed --text` | Embed single text (uses service if running) |
| | Embed file contents |
| | Show model, device, service status |
## Configuration

| Variable | Default | Description |
|---|---|---|
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | Sentence-transformers model name |
| `EMBEDDING_DEVICE` | `auto` | Device: `auto`, `cpu`, `cuda`, … |
| | `8602` | Service port |
| `EMBEDDING_SERVICE_URL` | `http://127.0.0.1:8602` | Client connection URL |
## Swapping Models

```sh
# Use a different model for this project
export EMBEDDING_MODEL="nomic-ai/nomic-embed-text-v1"
./run.sh serve

# Or, for GPU acceleration
export EMBEDDING_MODEL="intfloat/e5-large-v2"
export EMBEDDING_DEVICE="cuda"
./run.sh serve
```
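One caveat when swapping: different models emit vectors of different sizes, so embeddings already stored in a database stop being comparable after a switch. A minimal sketch of a guard built on the `/info` payload documented below (`check_dimensions` is a hypothetical helper, not part of the skill):

```python
def check_dimensions(info: dict, expected: int) -> None:
    """Raise if the running model's vector size differs from what the
    index was built with -- mixed-dimension vectors make cosine
    similarity meaningless."""
    actual = info.get("dimensions")
    if actual != expected:
        raise ValueError(
            f"service embeds at {actual} dims (model {info.get('model')!r}), "
            f"but the index expects {expected}; re-embed or switch back"
        )

# Typically `info` would come from GET /info, e.g.
#   info = httpx.get("http://127.0.0.1:8602/info").json()
check_dimensions({"model": "all-MiniLM-L6-v2", "dimensions": 384}, 384)
```

Running the check once at startup fails fast instead of silently ranking against incompatible vectors.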
## API Endpoints

### POST /embed

Embed single text.

`{"text": "query to embed"}` → `{"vector": [0.1, 0.2, ...], "model": "all-MiniLM-L6-v2", "dimensions": 384}`
### POST /embed/batch

Embed multiple texts.

`{"texts": ["query 1", "query 2"]}` → `{"vectors": [[...], [...]], "model": "...", "count": 2}`
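For large corpora it can help to chunk the input before calling `/embed/batch`. A stdlib-only sketch; the `chunked`/`embed_all` helpers and the batch size of 64 are illustrative assumptions, not part of the skill:

```python
import json
import urllib.request

EMBED_URL = "http://127.0.0.1:8602"  # default port from the configuration table

def chunked(texts, size):
    """Split a corpus into request-sized batches so no single call
    carries an oversized JSON payload."""
    return [texts[i:i + size] for i in range(0, len(texts), size)]

def embed_all(texts, batch_size=64):
    """POST each batch to /embed/batch and concatenate the vectors,
    preserving input order."""
    vectors = []
    for batch in chunked(texts, batch_size):
        req = urllib.request.Request(
            f"{EMBED_URL}/embed/batch",
            data=json.dumps({"texts": batch}).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            vectors.extend(json.load(resp)["vectors"])
    return vectors
```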
### GET /info

Service status and configuration.

```json
{
  "model": "all-MiniLM-L6-v2",
  "device": "cuda",
  "dimensions": 384,
  "status": "ready"
}
```
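Since `status` only flips to `ready` once the model has loaded, callers can poll `/info` before sending work. A sketch with the probe injected as a callable so the wait logic stays testable; the helper name and default timeouts are assumptions:

```python
import time

def wait_until_ready(probe, timeout=30.0, interval=0.5):
    """Poll `probe` (a callable returning the /info payload, or raising
    OSError while the server is not yet accepting connections) until
    status == "ready", or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            info = probe()
            if info.get("status") == "ready":
                return info
        except OSError:
            pass  # server not listening yet -- keep polling
        time.sleep(interval)
    raise TimeoutError("embedding service did not become ready in time")
```

In practice `probe` would be something like `lambda: httpx.get("http://127.0.0.1:8602/info").json()`.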
## Integration Examples

### ArangoDB Semantic Search
```python
import httpx

# Get embedding
resp = httpx.post("http://127.0.0.1:8602/embed", json={"text": "find similar docs"})
vector = resp.json()["vector"]

# Use in AQL query
aql = """
FOR doc IN my_collection
  LET score = COSINE_SIMILARITY(doc.embedding, @vector)
  FILTER score > 0.7
  SORT score DESC
  RETURN doc
"""
```
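AQL does the scoring server-side; for a small candidate set already in hand, the same metric can be applied client-side for re-ranking. A minimal pure-Python sketch (`cosine_similarity` is an illustrative helper, not part of the skill):

```python
import math

def cosine_similarity(a, b):
    """Same score as AQL's COSINE_SIMILARITY, computed client-side --
    handy for re-ranking a few documents without another round trip."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# e.g. re-rank documents already fetched from the database:
# docs.sort(key=lambda d: cosine_similarity(vector, d["embedding"]), reverse=True)
```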
### From Memory Skill

The memory skill can consume this service by setting:

```sh
export EMBEDDING_SERVICE_URL="http://127.0.0.1:8602"
```
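A consumer would typically resolve the endpoint the same way, falling back to the documented default when the variable is unset. A minimal sketch (`service_url` is a hypothetical helper):

```python
import os

def service_url() -> str:
    """Resolve the embedding endpoint the way a consumer skill would:
    honour EMBEDDING_SERVICE_URL, fall back to the documented default."""
    return os.environ.get("EMBEDDING_SERVICE_URL", "http://127.0.0.1:8602")
```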
## Cold Start

The first invocation loads the model (~5-10 seconds); after that, embeddings return with millisecond latency. The service logs its progress:
```
[embedding] Loading model: all-MiniLM-L6-v2...
[embedding] Model loaded in 6.2s
[embedding] Service ready on http://127.0.0.1:8602
```