Asi scry
```shell
git clone https://github.com/plurigrid/asi
T=$(mktemp -d) && git clone --depth=1 https://github.com/plurigrid/asi "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/asi/skills/scry" ~/.claude/skills/plurigrid-asi-scry && rm -rf "$T"
```
plugins/asi/skills/scry/SKILL.md

Scry Skill

Scry gives you read-only SQL access to the ExoPriors public corpus (229M+ entities) via a single HTTP endpoint. You write Postgres SQL against a curated `scry.*` schema and get JSON rows back. There is no ORM, no GraphQL, no pagination token -- just SQL.

Skill generation: 2026031701
A) When to use / not use
Use this skill when:
- Searching, filtering, or aggregating content across the ExoPriors corpus
- Running lexical (BM25) or hybrid searches
- Exploring author networks, cross-platform identities, or publication patterns
- Navigating the OpenAlex academic graph (authors, citations, institutions, concepts)
- Creating shareable artifacts from query results
- Emitting structured agent judgements about entities or external references
Do NOT use this skill when:
- The user wants semantic/vector search composition or embedding algebra (use the scry-vectors skill)
- The user wants LLM-based reranking (use the scry-rerank skill)
- The user is querying their own local database
B) Golden Rules
- Context handshake first. At session start, call `GET /v1/scry/context?skill_generation=2026031701`. This endpoint is public; you do not need a key for the handshake itself. Use the returned `offerings` block for the current product summary budgets, canonical env var, default skill, and specialized skill catalog. If you need a concise shareable bootstrap prompt for another agent, use `offerings.public_agent_prompt.copy_text` instead of paraphrasing your own. If you need deeper docs, use `offerings.canonical_doc_path`, each skill's `repo_path`, and `reference_paths` instead of guessing where the maintained docs live. If you cache descriptive bootstrap context across turns or sessions, also track `surface_context_generation` and refresh when it changes. Read `lexical_search.status` as well: if it is not `healthy`, stop assuming global `scry.search*` is reliable and pivot to source-local `scry.*` / `mv_*` surfaces or semantic retrieval while the canonical BM25 index recovers. If `should_update_skill=true`, tell the user to run `npx skills update`. If the response reports `client_skill_generation: null` while you're using packaged skills, or if local instructions still mention `api.exopriors.com` or `exopriors.com/console`, treat the install as stale and ask the user to run `npx skills update` before more debugging.
- Schema first. ALWAYS call `GET /v1/scry/schema` before writing SQL. Never guess column names or types. The schema endpoint returns live column metadata and row-count estimates for every view.
- Check operational status when search looks wrong. If lexical search, materialized-view freshness, or corpus behavior seems off, call `GET /v1/scry/index-view-status` before assuming the query or schema is wrong.
- Clarify ambiguous intent before heavy queries. If the request is vague ("search Reddit for X", "find things about Y"), ask one short clarification question about the goal/output format before running expensive SQL.
- Start with a cheap probe. Before any query likely to run >5s, run `/v1/scry/estimate` and/or a tight exploratory query (`LIMIT 20` plus scoped source/window filters), then scale only after confirming relevance.
- Choose lexical vs semantic explicitly. Use lexical (`scry.search*`) for exact terms and named entities. For conceptual intent ("themes", "things like", "similar to"), route to scry-vectors first, then optionally hybridize.
- LIMIT always. Every query MUST include a LIMIT clause. Max 10,000 rows. Queries without LIMIT are rejected by the SQL validator.
- Prefer canonical surfaces with tight filters. `scry.entities` has 229M+ rows, so do not scan it blindly. Use `scry.search*` for lexical retrieval, `scry.chunk_embeddings` for chunk-level semantic retrieval, `scry.entity_embeddings` or `scry.entities_with_embeddings` only when you want one entity-level vector row per entity, `scry.embedding_coverage` to inspect public vs staged vs ready source/kind coverage, and source-native tables such as `scry.hackernews_items`, `scry.wikipedia_articles`, `scry.pubmed_papers`, `scry.repec_records`, `scry.openalex_works`, `scry.bluesky_posts`, `scry.mailing_list_messages`, and `scry.openlibrary_*` when a corpus no longer lives canonically in `scry.entities`. Reach for a specific `mv_*` convenience view only when `/v1/scry/schema` confirms it is healthy and useful for the task.
- Cross-table composition is normal. If the best records live in multiple source-native tables, combine them in one SQL statement with CTEs, `UNION ALL`, and joins through `scry.source_records`. This is the intended contract, not a workaround.
- Filter dangerous content. Always include `WHERE content_risk IS DISTINCT FROM 'dangerous'` unless the user explicitly asks for unfiltered results. Dangerous content contains adversarial prompt-injection content.
- Raw SQL, not JSON. `POST /v1/scry/query` takes `Content-Type: text/plain` with raw SQL in the body. Not JSON-wrapped SQL.
- File rough edges promptly. If Scry blocks the task, misses an obvious result set, or exposes a rough edge, submit a brief note to `POST /v1/feedback?feedback_type=suggestion|bug|other&channel=scry_skill` using `Content-Type: text/plain` by default (`text/markdown` also works). Do not silently work around it. Logged-in users can review their submissions with `GET /v1/feedback`.
For full tier limits, timeout policies, and degradation strategies, see Shared Guardrails.
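Several of the rules above (mandatory LIMIT, the 10,000-row cap, the dangerous-content filter) can be pre-checked client-side before spending a request. A hypothetical Python sketch, not part of the Scry API; the server-side validator may check more than this:

```python
import re

MAX_ROWS = 10_000  # documented validator cap


def precheck_sql(sql: str) -> list:
    """Return a list of likely validator problems before POSTing the query."""
    problems = []
    m = re.search(r"\blimit\s+(\d+)", sql, re.IGNORECASE)
    if not m:
        problems.append("missing LIMIT clause (queries without LIMIT are rejected)")
    elif int(m.group(1)) > MAX_ROWS:
        problems.append(f"LIMIT {m.group(1)} exceeds max {MAX_ROWS} rows")
    # Advisory only: stats/aggregate views may legitimately omit this filter.
    if "content_risk" not in sql.lower():
        problems.append("no content_risk filter; consider WHERE content_risk IS DISTINCT FROM 'dangerous'")
    return problems
```

Run it on every generated statement and only send queries that come back clean.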
B.1 API Key Setup (Canonical)
Recommended default for less-technical users: in the directory where you launch the agent, store `SCRY_API_KEY` in `.env` so skills and copied prompts use the same place.
Canonical key naming for this skill:
- Env var: `SCRY_API_KEY`
- Anonymous bootstrap key format: `scry_anon_*` from `POST /v1/scry/anonymous-key`
- Personal key format: personal Scry API key with Scry access
- Recommended anonymous client header: `X-Scry-Client-Tag: <short-stable-tag>`
```shell
printf '%s\n' 'SCRY_API_KEY=<your key>' >> .env
set -a && source .env && set +a
```

Verify:

```shell
echo "$SCRY_API_KEY"
```
Anonymous bootstrap flow when the user wants immediate public access without signup:
```shell
CLIENT_TAG="${SCRY_CLIENT_TAG:-dev-laptop}"
ANON_KEY="$(curl -s https://api.scry.io/v1/scry/anonymous-key -X POST \
  -H "X-Scry-Client-Tag: $CLIENT_TAG" \
  | python3 -c 'import json,sys; print(json.load(sys.stdin)["api_key"])')"
curl -s https://api.scry.io/v1/scry/schema \
  -H "Authorization: Bearer $ANON_KEY" \
  -H "X-Scry-Client-Tag: $CLIENT_TAG"
curl -s https://api.scry.io/v1/scry/query \
  -H "Authorization: Bearer $ANON_KEY" \
  -H "X-Scry-Client-Tag: $CLIENT_TAG" \
  -H "Content-Type: text/plain" \
  --data "SELECT 1 LIMIT 1"
```
Use this for fast trial access only. The anonymous bootstrap lane is intentionally generous for the first few queries and then degrades. For sustained usage, prefer a personal Scry API key. Keep the same
X-Scry-Client-Tag value on the same device when staying anonymous so the backend can distinguish a real first-use session from abuse behind shared IPs.
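A small helper keeps key hygiene and client-tag reuse consistent across calls. A Python sketch; the header names come from this document, while the helper itself (`scry_headers`) is illustrative:

```python
def scry_headers(api_key, client_tag="dev-laptop", content_type=None):
    """Build request headers for Scry; reuse the same client_tag per device."""
    headers = {
        # Strip stray whitespace/newlines (common cause of 401s; see section F)
        "Authorization": f"Bearer {api_key.strip()}",
        "X-Scry-Client-Tag": client_tag,
    }
    if content_type:
        headers["Content-Type"] = content_type
    return headers
```

Pass `content_type="text/plain"` for `/v1/scry/query` bodies and leave it unset for GET endpoints.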
If using packaged skills, keep them current:
```shell
npx skills add exopriors/skills
npx skills update
```
B.1b x402 Query-Only Access
POST /v1/scry/query still supports standard x402, but it is now an explicit
paid path rather than the default no-auth bootstrap path. Use x402 when the
user already has an x402-capable wallet/client and only needs direct paid query
execution. For public trial use, use POST /v1/scry/anonymous-key. For
schema/context, shares, judgements, feedback, or repeated multi-endpoint usage,
prefer a personal Scry API key.
If the user wants wallet-native durable identity plus a reusable key, use
POST /v1/auth/agent/signup first. That binds the wallet to a user and returns
a session token plus API key in one flow.
Minimal client shape:
```typescript
import { wrapFetchWithPayment } from 'x402-fetch';

const paidFetch = wrapFetchWithPayment(fetch, walletClient);
const resp = await paidFetch('https://api.scry.io/v1/scry/query', {
  method: 'POST',
  headers: { 'content-type': 'text/plain' },
  body: 'SELECT 1 LIMIT 1',
});
```
C) Quickstart
One end-to-end example: find recent high-scoring LessWrong posts about RLHF.
Step 1: Get dynamic context + update advisory

```
GET https://api.scry.io/v1/scry/context?skill_generation=2026031701
Authorization: Bearer $SCRY_API_KEY
```

Step 2: Get schema

```
GET https://api.scry.io/v1/scry/schema
Authorization: Bearer $SCRY_API_KEY
```

Step 3: Run query

```
POST https://api.scry.io/v1/scry/query
Authorization: Bearer $SCRY_API_KEY
Content-Type: text/plain
```

```sql
WITH hits AS (
  SELECT id
  FROM scry.search('RLHF reinforcement learning human feedback', kinds=>ARRAY['post'], limit_n=>100)
)
SELECT e.uri, e.title, e.original_author, e.original_timestamp, e.score
FROM hits h
JOIN scry.entities e ON e.id = h.id
WHERE e.source = 'lesswrong'
ORDER BY e.score DESC NULLS LAST, e.original_timestamp DESC
LIMIT 20
```
Response shape:
```json
{
  "columns": ["uri", "title", "original_author", "original_timestamp", "score"],
  "rows": [["https://...", "My RLHF Post", "author", "2025-01-15T...", 142], ...],
  "row_count": 20,
  "duration_ms": 312,
  "truncated": false
}
```
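Because rows come back positionally, a tiny adapter makes downstream handling readable. A Python sketch assuming only the response shape shown above:

```python
def rows_to_dicts(resp):
    """Zip the positional rows with the columns header of a /v1/scry/query response."""
    cols = resp["columns"]
    return [dict(zip(cols, row)) for row in resp["rows"]]
```

For example, `rows_to_dicts(resp)[0]["title"]` then reads the title of the first hit.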
Source-native cross-table example:
```sql
WITH hn AS (
  SELECT 'hackernews'::text AS source, hn_id::text AS external_id, score
  FROM scry.search_hackernews_items('interpretability', kinds => ARRAY['post'], limit_n => 20)
),
wiki AS (
  SELECT 'wikipedia'::text AS source, page_id::text AS external_id, score
  FROM scry.search_wikipedia_articles('interpretability', limit_n => 20)
),
hits AS (
  SELECT * FROM hn
  UNION ALL
  SELECT * FROM wiki
)
SELECT h.source, r.uri, r.title, h.score
FROM hits h
JOIN scry.source_records r ON r.source = h.source AND r.external_id = h.external_id
ORDER BY h.score DESC
LIMIT 20;
```
D) Decision Tree
```
User wants to search the ExoPriors corpus?
|
+-- Ambiguous / conceptual ask? --> Clarify intent first, then use
|     scry-vectors for semantic search (optionally hybridize with lexical)
|
+-- By keywords/phrases? --> scry.search() (BM25 lexical over canonical content_text)
|     +-- Specific forum? --> join/filter `source` explicitly (or use a
|     |     healthy source-local view if schema confirms it)
|     +-- Reddit? --> START with scry.reddit_subreddit_stats /
|     |     scry.reddit_clusters() / scry.reddit_embeddings
|     |     and trust /v1/scry/schema status before
|     |     using direct retrieval helpers
|     +-- Large result? --> scry.search_ids() (up to 2000 lexical IDs; join for fields)
|
+-- By structured filters (source, date, author)? --> Direct SQL on MVs
|
+-- By semantic similarity? --> (scry-vectors skill, not this one)
|
+-- Hybrid (keywords + semantic rerank)? --> scry.hybrid_search() or
|     lexical CTE + JOIN scry.chunk_embeddings
|
+-- Author/people lookup? --> scry.actors, scry.people, scry.person_accounts
|
+-- Academic graph (OpenAlex)? --> scry.openalex_find_authors(),
|     scry.openalex_find_works(), etc. (see schema-guide.md)
|
+-- Need to share results? --> POST /v1/scry/shares
|
+-- Need to emit a structured observation? --> POST /v1/scry/judgements
|
+-- Scry blocked / missing obvious results? --> POST /v1/feedback
```
E) Recipes
E0. Context handshake + skill update advisory
```shell
curl -s "https://api.scry.io/v1/scry/context?skill_generation=2026031701" \
  -H "Authorization: Bearer $SCRY_API_KEY"
```
If the response includes `"should_update_skill": true`, ask the user to run `npx skills update`.
If the response shows "client_skill_generation": null while the session is
using packaged Scry skills, or if local instructions still point at
api.exopriors.com / exopriors.com/console, stop and ask the user to run
npx skills update before deeper debugging.
If response includes "lexical_search": {"status": "rebuilding"|"degraded"|"stale"|...},
prefer source-local scry.* surfaces or scry.entities_with_embeddings and use
/v1/scry/index-view-status for detailed rebuild timing before blaming the query.
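The E0 advisories above can be reduced to a small decision helper. A Python sketch assuming only the context fields already shown (`should_update_skill`, `client_skill_generation`, `lexical_search.status`); any other fields are out of scope:

```python
def context_advisories(ctx, using_packaged_skills=True):
    """Turn a /v1/scry/context response dict into actionable advisory strings."""
    advice = []
    if ctx.get("should_update_skill"):
        advice.append("run: npx skills update")
    if using_packaged_skills and ctx.get("client_skill_generation") is None:
        advice.append("stale install suspected: run npx skills update before debugging")
    status = (ctx.get("lexical_search") or {}).get("status")
    if status and status != "healthy":
        advice.append(f"lexical search {status}: prefer source-local scry.* surfaces")
    return advice
```

Call it once per handshake and surface any returned strings to the user before issuing queries.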
E0b. Submit feedback when Scry blocks the task
```shell
curl -s "https://api.scry.io/v1/feedback?feedback_type=bug&channel=scry_skill" \
  -H "Authorization: Bearer $SCRY_API_KEY" \
  -H "Content-Type: text/plain" \
  --data $'## What happened\n- Query: ...\n- Problem: ...\n\n## Why it matters\n- ...\n\n## Suggested fix\n- ...'
```
A success response includes a receipt `id`. Logged-in users can review their own submissions with:
```shell
curl -s "https://api.scry.io/v1/feedback?limit=10" \
  -H "Authorization: Bearer $SCRY_API_KEY"
```
E1. Lexical search (BM25)
```sql
WITH c AS (
  SELECT id
  FROM scry.search('your query here', kinds=>ARRAY['post'], limit_n=>100)
)
SELECT e.uri, e.title, e.original_author, e.original_timestamp
FROM c
JOIN scry.entities e ON e.id = c.id
WHERE e.content_risk IS DISTINCT FROM 'dangerous'
LIMIT 50
```
Default `kinds` if omitted: `['post','paper','document','webpage','twitter_thread','grant']`. `scry.search()` broadens once to `kinds=>ARRAY['comment']` if that default returns 0 rows. Pass explicit `kinds` for strict scope (for example comment-only or tweet-only). For source scoping, join back to `scry.entities` and filter `source` explicitly. Healthy source-specific MVs can still be useful for source-native score fields such as `base_score`, but they are optional convenience slices rather than the default path.
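To keep `kinds` scoping explicit, the E1 pattern can be generated programmatically. A Python sketch with naive quoting (assumes trusted inputs; `build_search_sql` is illustrative, not part of the skill):

```python
def build_search_sql(query, kinds, source=None, limit=50):
    """Compose a lexical scry.search call joined back to scry.entities with explicit kinds."""
    kinds_sql = "ARRAY[" + ", ".join(f"'{k}'" for k in kinds) + "]"
    source_filter = f"AND e.source = '{source}' " if source else ""
    return (
        "WITH c AS (SELECT id FROM scry.search("
        f"'{query}', kinds=>{kinds_sql}, limit_n=>{limit * 2})) "
        "SELECT e.uri, e.title FROM c JOIN scry.entities e ON e.id = c.id "
        "WHERE e.content_risk IS DISTINCT FROM 'dangerous' "
        f"{source_filter}LIMIT {limit}"
    )
```

For real use, escape single quotes in `query` or pass it through a proper SQL quoting routine.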
E2. Reddit-specific discovery
```sql
SELECT subreddit, total_count, latest
FROM scry.reddit_subreddit_stats
WHERE subreddit IN ('MachineLearning', 'LocalLLaMA')
ORDER BY total_count DESC
LIMIT 20
```
For semantic Reddit retrieval over the embedding-covered subset, use `scry.reddit_embeddings` or `scry.search_reddit_posts_semantic(...)`. Direct retrieval helpers (`scry.reddit_posts`, `scry.reddit_comments`, `scry.mv_reddit_*`, `scry.search_reddit_posts(...)`, `scry.search_reddit_comments(...)`) are currently degraded on the public instance. Check `/v1/scry/schema` status before using them.
E3. Source-filtered materialized view query
```sql
SELECT entity_id, uri, title, original_author, score, original_timestamp
FROM scry.mv_arxiv_papers
WHERE original_timestamp >= '2025-01-01'
ORDER BY original_timestamp DESC
LIMIT 50
```
`score` is NULL for arXiv papers on the public surface. Sort by `original_timestamp`, category, or downstream citation proxies instead.
E4. Author activity across sources
```sql
SELECT e.source::text, COUNT(*) AS docs, MAX(e.original_timestamp) AS latest
FROM scry.entities e
WHERE e.original_author ILIKE '%yudkowsky%'
  AND e.content_risk IS DISTINCT FROM 'dangerous'
GROUP BY e.source::text
ORDER BY docs DESC
LIMIT 20
```
E5. Recent entity kind distribution for a source
```sql
SELECT kind::text, COUNT(*)
FROM scry.hackernews_items
WHERE original_timestamp >= '2025-01-01'
GROUP BY kind::text
ORDER BY 2 DESC
LIMIT 20
```
Source-native corpora follow the same pattern:
```sql
SELECT kind::text, COUNT(*)
FROM scry.wikipedia_articles
WHERE original_timestamp >= '2025-01-01'
GROUP BY kind::text
ORDER BY 2 DESC
LIMIT 20
```
Removing the date bound turns this into a large base-table aggregation. Run
/v1/scry/estimate first or prefer source-specific MVs when they already cover
the question.
E6. Hybrid search (lexical + semantic rerank in SQL)
```sql
WITH c AS (
  SELECT id
  FROM scry.search('deceptive alignment', kinds=>ARRAY['post'], limit_n=>200)
)
SELECT e.uri, e.title, e.original_author,
       emb.embedding_voyage4 <=> @p_deadbeef_topic AS distance
FROM c
JOIN scry.entities e ON e.id = c.id
JOIN scry.chunk_embeddings emb ON emb.entity_id = c.id AND emb.chunk_index = 0
WHERE e.content_risk IS DISTINCT FROM 'dangerous'
ORDER BY distance
LIMIT 50
```
Requires a stored embedding handle (`@p_deadbeef_topic`). See the scry-vectors skill for creating handles.
E7. Cost estimation before execution
```shell
curl -s -X POST https://api.scry.io/v1/scry/estimate \
  -H "Authorization: Bearer $SCRY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT id, title FROM scry.mv_arxiv_papers LIMIT 1000"}'
```
Returns EXPLAIN (FORMAT JSON) output. Use this for expensive queries before committing. It does not prove BM25 helper health: if `scry.search*` fails, check `/v1/scry/index-view-status` and `/v1/scry/schema` status as well.
The /v1/scry/context handshake now also exposes lexical_search.status for
cheap degraded-mode detection before you start issuing lexical helpers.
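A go/no-go check on the estimate output only needs the planner's top-level cost. A Python sketch; the cost threshold is an arbitrary assumption, not a documented quota:

```python
def total_cost(explain_json):
    """Extract the planner's total cost from EXPLAIN (FORMAT JSON) output.

    Postgres returns a one-element list whose "Plan" object carries "Total Cost".
    """
    return explain_json[0]["Plan"]["Total Cost"]


def looks_expensive(explain_json, threshold=1_000_000.0):
    """Heuristic gate: refuse to run the query unreviewed above the threshold."""
    return total_cost(explain_json) > threshold
```

If `looks_expensive(...)` is true, tighten the WHERE clause or switch to an MV before executing.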
E8. Create a shareable artifact
```shell
# 1. Run query and capture results
# 2. POST share
curl -s -X POST https://api.scry.io/v1/scry/shares \
  -H "Authorization: Bearer $SCRY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "kind": "query",
    "title": "Top RLHF posts on LessWrong",
    "summary": "20 highest-scored LW posts mentioning RLHF.",
    "payload": {
      "sql": "...",
      "result": {"columns": [...], "rows": [...]}
    }
  }'
```
Kinds: `query`, `rerank`, `insight`, `chat`, `markdown`.
Progressive update: create a stub immediately, then `PATCH /v1/scry/shares/{slug}`.
Rendered at: `https://scry.io/scry/share/{slug}`.
E9. Emit a structured agent judgement
```shell
curl -s -X POST https://api.scry.io/v1/scry/judgements \
  -H "Authorization: Bearer $SCRY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "emitter": "my-agent",
    "judgement_kind": "topic_classification",
    "target_external_ref": "arxiv:2401.12345",
    "summary": "Paper primarily about mechanistic interpretability.",
    "payload": {"primary_topic": "mech_interp", "confidence_detail": "title+abstract match"},
    "confidence": 0.88,
    "tags": ["arxiv", "mech_interp"],
    "privacy_level": "public"
  }'
```
Exactly one target is required: `target_entity_id`, `target_actor_id`, `target_judgement_id`, or `target_external_ref`.
Judgement-on-judgement: use `target_judgement_id` to chain observations.
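The exactly-one-target rule is easy to violate when assembling payloads programmatically, so a minimal client-side validator helps. Field names come from the list above; the function itself is illustrative:

```python
TARGET_FIELDS = (
    "target_entity_id",
    "target_actor_id",
    "target_judgement_id",
    "target_external_ref",
)


def validate_judgement_target(payload):
    """Raise ValueError unless the judgement payload sets exactly one target field."""
    present = [f for f in TARGET_FIELDS if payload.get(f) is not None]
    if len(present) != 1:
        raise ValueError(f"exactly one target required, got {present or 'none'}")
```

Call it right before the POST so malformed judgements fail locally instead of burning a request.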
E10. People / author lookup
```sql
-- Per-source author grouping
SELECT a.handle, a.display_name, a.source::text, COUNT(*) AS docs
FROM scry.entities e
JOIN scry.actors a ON a.id = e.author_actor_id
WHERE e.source = 'twitter'
GROUP BY a.handle, a.display_name, a.source::text
ORDER BY docs DESC
LIMIT 50
```
E11. Thread navigation (replies)
```sql
-- Find all replies to a root post
SELECT id, uri, title, original_author, original_timestamp
FROM scry.entities
WHERE anchor_entity_id = 'ROOT_ENTITY_UUID'
ORDER BY original_timestamp
LIMIT 100
```
anchor_entity_id is the root subject; parent_entity_id is the direct parent.
E12. Count estimation (safe pattern)
Avoid `COUNT(*)` on large tables. Instead, use schema endpoint row estimates or:
```sql
SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE relname = 'mv_lesswrong_posts'
LIMIT 1
```
Note: `pg_class` access is blocked on the public Scry SQL surface. Use `/v1/scry/schema` instead.
F) Error Handling
See `references/error-reference.md` for the full catalogue. Key patterns:
| HTTP | Code | Meaning | Action |
|---|---|---|---|
| 400 | | SQL parse error, missing LIMIT, bad params | Fix query |
| 401 | | Missing or invalid API key | Check key |
| 402 | | Token budget exhausted | Notify user |
| 429 | | Too many requests | Respect header |
| 503 | | Scry pool down or overloaded | Wait and retry |
Auth + timeout diagnostics for CLI users:
- If curl shows HTTP `000`, that is a client-side timeout/network abort, not a server HTTP status. Check `--max-time` and retry with `/v1/scry/estimate` first.
- If you see `401` with `"Invalid authorization format"`, check for whitespace/newlines in the key: `KEY_CLEAN="$(printf '%s' "$SCRY_API_KEY" | tr -d '\r\n')"` then use `Authorization: Bearer $KEY_CLEAN`.
Quota fallback strategy:
- If 429: wait `Retry-After` seconds, retry once.
- If 402: tell the user their token budget is exhausted.
- If 503: retry after 30s with exponential backoff (max 3 attempts).
- If query times out: simplify (use MV instead of full table, reduce LIMIT, add tighter WHERE filters).
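The fallback strategy above can be sketched as a single retry wrapper. A Python sketch assuming a requests-style response object (`status_code`, `headers`); the callable `send` is whatever issues the actual HTTP request:

```python
import time


def with_retries(send, max_503_attempts=3):
    """Retry a Scry request per the quota fallback strategy.

    send() must return an object with .status_code and .headers.
    429: wait Retry-After seconds, retry once.
    503: retry with exponential backoff starting at 30s, max 3 attempts.
    """
    resp = send()
    if resp.status_code == 429:
        time.sleep(float(resp.headers.get("Retry-After", 1)))
        return send()  # retry once, then give up either way
    delay = 30.0
    attempts = 0
    while resp.status_code == 503 and attempts < max_503_attempts:
        time.sleep(delay)
        delay *= 2  # exponential backoff
        attempts += 1
        resp = send()
    return resp
```

A 402 is deliberately not retried: the token budget is exhausted and the user has to act.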
G) Output Contract
When this skill completes a query task, return a consistent structure:
````
## Scry Result
**Query**: <natural language description>
**SQL**:
```sql
<the SQL that ran>
```
**Rows returned**: <N> (truncated: <yes/no>)
**Duration**: <N>ms

<formatted results table or summary>

**Share**: <share URL if created>
**Caveats**: <any data quality notes, e.g., "score is NULL for arXiv">
````
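The contract above can be emitted mechanically from a query response. A Python sketch assuming the response fields from section C (`row_count`, `duration_ms`, `truncated`); the formatter itself is illustrative:

```python
def format_scry_result(description, sql, resp, share_url="", caveats=""):
    """Render the skill's output contract from a /v1/scry/query response dict."""
    truncated = "yes" if resp.get("truncated") else "no"
    lines = [
        "## Scry Result",
        f"**Query**: {description}",
        "**SQL**:",
        "```sql",
        sql,
        "```",
        f"**Rows returned**: {resp['row_count']} (truncated: {truncated})",
        f"**Duration**: {resp['duration_ms']}ms",
    ]
    if share_url:
        lines.append(f"**Share**: {share_url}")
    if caveats:
        lines.append(f"**Caveats**: {caveats}")
    return "\n".join(lines)
```

Omitting `share_url` and `caveats` drops those lines rather than printing empty fields.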
Handoff Contract
Produces: JSON with `columns`, `rows`, `row_count`, `duration_ms`, `truncated`.
Feeds into:
- rerank: ensure SQL returns `id` and `content_text` columns for candidate sets
- scry-vectors: save entity IDs for embedding lookup and semantic reranking
Receives from: none (entry point for SQL-based corpus access)
Related Skills
- scry-vectors -- embed concepts as @handles, search by cosine distance, debias with vector algebra
- scry-rerank -- LLM-powered multi-attribute reranking of candidate sets via pairwise comparison
For detailed schema documentation, see
references/schema-guide.md.
For the full pattern library, see references/query-patterns.md.
For error codes and quota details, see references/error-reference.md.