Run Apify Actors (web scrapers, crawlers, automation tools) and retrieve their results using the Apify REST API with curl. Use when the user wants to scrape a website, extract data from the web, run an Apify Actor, crawl pages, or get results from Apify datasets.
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bmestanov/apify" ~/.claude/skills/clawdbot-skills-apify && rm -rf "$T"
Apify
Run any of the 17,000+ Actors on Apify Store and retrieve structured results via the REST API.
Full OpenAPI spec: openapi.json
Authentication
All requests need the APIFY_TOKEN environment variable. Use it as a Bearer token:
-H "Authorization: Bearer $APIFY_TOKEN"
Base URL:
https://api.apify.com
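To avoid repeating the header in every call, the token check and the base URL can live in two small shell functions. This is a minimal sketch: `apify_require_token` and `apify_get` are hypothetical helper names, not part of the Apify API. Only the Bearer header and the base URL come from the text above.

```shell
# Hypothetical convenience helpers (names are assumptions for illustration).
APIFY_BASE_URL="https://api.apify.com"

# Fail early with a readable message when APIFY_TOKEN is unset or empty.
apify_require_token() {
  if [ -z "${APIFY_TOKEN:-}" ]; then
    echo "APIFY_TOKEN is not set" >&2
    return 1
  fi
}

# GET an API path (for example /v2/store?limit=1) with the Bearer header attached.
apify_get() {
  apify_require_token || return 1
  curl -s "${APIFY_BASE_URL}$1" -H "Authorization: Bearer $APIFY_TOKEN"
}
```

With these defined, `apify_get "/v2/store?search=web+scraper&limit=5"` would issue the same request as the search example in step 1.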
Core workflow
1. Find the right Actor
Search the Apify Store by keyword:
curl -s "https://api.apify.com/v2/store?search=web+scraper&limit=5" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  | jq '.data.items[] | {name: (.username + "/" + .name), title, description}'
Actors are identified by username~name (tilde) in API paths, e.g. apify~web-scraper.
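The store search above prints names with a slash, while API paths expect a tilde. A one-line conversion bridges the two forms; `actor_path_id` is a hypothetical helper name used only for illustration.

```shell
# Convert a Store-style name (username/name) into the API path form (username~name).
# Only the first slash is replaced; names that already use a tilde pass through unchanged.
actor_path_id() {
  printf '%s\n' "$1" | sed 's|/|~|'
}
```

For example, `actor_path_id "apify/web-scraper"` prints `apify~web-scraper`, which can be dropped into a `/v2/acts/...` path.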
2. Get Actor README and input schema
Before running an Actor, fetch its default build to get the README (usage docs) and input schema (expected JSON fields):
curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  | jq '.data | {readme, inputSchema}'
inputSchema is a JSON-stringified object — parse it to see required/optional fields, types, defaults, and descriptions. Use this to construct valid input for the run.
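Because inputSchema arrives as a string, jq's `fromjson` is needed before its fields can be read. The sketch below assumes the parsed schema has top-level `properties` and `required` keys. That layout is an assumption about the schema, so adjust the filter if a given Actor differs. `summarize_input_schema` is a hypothetical helper name.

```shell
# Read a build response on stdin and print one tab-separated line per input field:
#   <field name> <type> <required|optional>
# Assumes the parsed schema has "properties" and (optionally) "required" keys.
summarize_input_schema() {
  jq -r '.data.inputSchema | fromjson
    | (.required // []) as $req
    | .properties | to_entries[]
    | .key as $k
    | [$k,
       (.value.type // "unknown"),
       (if any($req[]; . == $k) then "required" else "optional" end)]
    | @tsv'
}
```

Piping the output of the builds/default request above into `summarize_input_schema` gives a quick overview of which fields the run input must contain.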
You can also get the Actor's per-build OpenAPI spec (no auth required):
curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default/openapi.json"
3. Run an Actor (async — recommended for most cases)
Start the Actor and get the run object back immediately:
curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":10}'
Response includes data.id (run ID), data.defaultDatasetId, data.status.
Optional query params: ?timeout=300&memory=4096&maxItems=100&waitForFinish=60
waitForFinish (0-60): seconds the API waits before returning. Useful to avoid polling for short runs.
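When the optional parameters vary between calls, assembling the query string by hand gets error-prone. The following sketch builds the run URL from key=value arguments. `run_url` is a hypothetical helper name, and it does not URL-encode values, which is fine for the numeric parameters listed above.

```shell
# Build the "start run" URL for an Actor.
# Usage: run_url ACTOR_ID [key=value ...]
# The body runs in a subshell so its variables do not leak into the caller.
run_url() (
  actor="$1"
  shift
  url="https://api.apify.com/v2/acts/${actor}/runs"
  sep="?"
  for kv in "$@"; do
    url="${url}${sep}${kv}"
    sep="&"   # every parameter after the first is joined with &
  done
  printf '%s\n' "$url"
)
```

For instance, `run_url apify~web-scraper timeout=300 waitForFinish=60` prints the runs URL with both parameters appended, ready to pass to `curl -X POST`.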
4. Poll run status
curl -s "https://api.apify.com/v2/actor-runs/RUN_ID?waitForFinish=60" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  | jq '.data | {status, defaultDatasetId}'
Terminal statuses: SUCCEEDED, FAILED, ABORTED, TIMED-OUT.
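A polling script needs to tell terminal statuses apart from in-progress ones. One way to express the list above is a small predicate. `is_terminal_status` is a hypothetical helper name.

```shell
# Return 0 when the run status is one of the four terminal statuses, 1 otherwise.
is_terminal_status() {
  case "$1" in
    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) return 0 ;;
    *) return 1 ;;
  esac
}
```

A loop can then read `until is_terminal_status "$STATUS"; do ...; done`, re-fetching the status inside the body.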
5. Get results
Dataset items (most common — structured scraped data):
curl -s "https://api.apify.com/v2/datasets/DATASET_ID/items?clean=true&limit=100" \
  -H "Authorization: Bearer $APIFY_TOKEN"
Or directly from the run (shortcut — same parameters):
curl -s "https://api.apify.com/v2/actor-runs/RUN_ID/dataset/items?clean=true&limit=100" \
  -H "Authorization: Bearer $APIFY_TOKEN"
Params: format (json|csv|jsonl|xml|xlsx|rss), fields, omit, limit, offset, clean, desc.
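For datasets larger than one page, limit and offset can be combined into a simple paging loop. The sketch below is one possible approach, not an official recipe. It requests jsonl so each item is one line, and it stops at the first empty page. It deliberately leaves out clean=true so that an empty page reliably means the end of the dataset. `dataset_page_url` and `fetch_all_items` are hypothetical helper names.

```shell
# Build the URL for one page of dataset items in JSON Lines format.
dataset_page_url() {
  printf 'https://api.apify.com/v2/datasets/%s/items?format=jsonl&limit=%s&offset=%s\n' \
    "$1" "$2" "$3"
}

# Print every item of a dataset, one JSON object per line.
# Usage: fetch_all_items DATASET_ID [PAGE_SIZE]
# curl -f suppresses the body on HTTP errors, so an error ends the loop
# instead of being mistaken for a page of data.
fetch_all_items() (
  dataset_id="$1"
  limit="${2:-100}"
  offset=0
  while :; do
    page=$(curl -sf "$(dataset_page_url "$dataset_id" "$limit" "$offset")" \
      -H "Authorization: Bearer $APIFY_TOKEN")
    if [ -z "$page" ]; then
      break
    fi
    printf '%s\n' "$page"
    offset=$((offset + limit))
  done
)
```

Calling `fetch_all_items DATASET_ID 500 > items.jsonl` would write the whole dataset to a file in pages of 500.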
Key-value store record (screenshots, HTML, OUTPUT):
curl -s "https://api.apify.com/v2/key-value-stores/STORE_ID/records/OUTPUT" \
  -H "Authorization: Bearer $APIFY_TOKEN"
Run log:
curl -s "https://api.apify.com/v2/logs/RUN_ID" \
  -H "Authorization: Bearer $APIFY_TOKEN"
6. Run Actor synchronously (short-running Actors only)
For Actors that finish within 300 seconds, get dataset items in one call:
curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":5}'
Returns the dataset items array directly (not wrapped in data). Returns 408 if the run exceeds 300s.
Alternative: /run-sync returns the key-value store OUTPUT record instead of dataset items.
Quick recipes
Scrape a website
curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":20}'
Google search
curl -s -X POST "https://api.apify.com/v2/acts/apify~google-search-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"queries":"site:example.com openai","maxPagesPerQuery":1}'
Long-running Actor (async with polling)
# 1. Start
RUN=$(curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs?waitForFinish=60" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":500}')
RUN_ID=$(echo "$RUN" | jq -r '.data.id')

# 2. Poll until done (each request blocks for up to 60s, so no sleep is needed)
while true; do
  STATUS=$(curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID?waitForFinish=60" \
    -H "Authorization: Bearer $APIFY_TOKEN" | jq -r '.data.status')
  echo "Status: $STATUS"
  case "$STATUS" in
    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) break;;
    # Stop instead of looping forever when the response carries no status (e.g. an auth error)
    ""|null) echo "Could not read run status" >&2; break;;
  esac
done

# 3. Fetch results
curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID/dataset/items?clean=true" \
  -H "Authorization: Bearer $APIFY_TOKEN"
Abort a run
curl -s -X POST "https://api.apify.com/v2/actor-runs/RUN_ID/abort" \
  -H "Authorization: Bearer $APIFY_TOKEN"
Paid / rental Actors
Some Actors require a monthly subscription before they can be run. If the API returns a permissions or payment error for an Actor, ask the user to manually subscribe via the Apify Console:
https://console.apify.com/actors/ACTOR_ID
Replace ACTOR_ID with the Actor's ID (e.g. AhEsMsQyLfHyMLaxz). The user needs to click Start on that page to activate the subscription. Most rental Actors offer a free trial period set by the developer.
You can get the Actor ID from the store search response (data.items[].id) or from GET /v2/acts/username~name (data.id).
Error handling
- 401: APIFY_TOKEN missing or invalid.
- 404 Actor not found: check username~name format (tilde, not slash). Browse https://apify.com/store.
- 400 run-failed: check GET /v2/logs/RUN_ID for details.
- 402/403 payment required: the Actor likely requires a subscription. See "Paid / rental Actors" above.
- 408 run-timeout-exceeded: sync endpoints have a 300s limit. Use async workflow instead.
- 429 rate-limit-exceeded: retry with exponential backoff (start at 500ms, double each time).
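The backoff rule for 429 responses (start at 500ms, double each time) can be sketched as follows. The helper names `backoff_schedule_ms`, `ms_to_seconds`, and `get_with_backoff` are hypothetical, and the default of 5 retries is an assumption, not an Apify recommendation. Fractional `sleep` values work with GNU and BSD sleep but are not guaranteed by POSIX.

```shell
# Print the wait before each retry in milliseconds: 500, 1000, 2000, ...
backoff_schedule_ms() (
  retries="$1"
  delay=500
  i=0
  while [ "$i" -lt "$retries" ]; do
    printf '%s\n' "$delay"
    delay=$((delay * 2))
    i=$((i + 1))
  done
)

# Convert milliseconds to a seconds string for sleep, e.g. 500 becomes 0.500.
ms_to_seconds() {
  printf '%d.%03d\n' $(($1 / 1000)) $(($1 % 1000))
}

# GET a URL, retrying while the API answers 429.
# Prints the response body and returns 0 on any other status;
# returns 1 once the retries are exhausted.
get_with_backoff() (
  url="$1"
  max_retries="${2:-5}"
  body=$(mktemp)
  trap 'rm -f "$body"' EXIT
  # The leading 0 makes the first attempt immediate.
  for delay in 0 $(backoff_schedule_ms "$max_retries"); do
    if [ "$delay" -gt 0 ]; then
      sleep "$(ms_to_seconds "$delay")"
    fi
    code=$(curl -s -o "$body" -w '%{http_code}' "$url" \
      -H "Authorization: Bearer $APIFY_TOKEN")
    if [ "$code" != "429" ]; then
      cat "$body"
      return 0
    fi
  done
  return 1
)
```

Other status codes are passed through untouched, so the caller still needs to handle 401, 404, and the rest as described in the list above.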
Additional resources
- API docs (LLM-friendly): https://docs.apify.com/api/v2.md
- OpenAPI spec: openapi.json
- Apify Store (browse Actors): https://apify.com/store