Skills apify

Run Apify Actors (web scrapers, crawlers, automation tools) and retrieve their results using the Apify REST API with curl. Use when the user wants to scrape a website, extract data from the web, run an Apify Actor, crawl pages, or get results from Apify datasets.

install
source · Clone the upstream repo
git clone https://github.com/openclaw/skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bmestanov/apify" ~/.claude/skills/clawdbot-skills-apify && rm -rf "$T"
manifest: skills/bmestanov/apify/SKILL.md
source content

Apify

Run any of the 17,000+ Actors on Apify Store and retrieve structured results via the REST API.

Full OpenAPI spec: openapi.json

Authentication

All requests need the

APIFY_TOKEN
env var. Use it as a Bearer token:

-H "Authorization: Bearer $APIFY_TOKEN"

Base URL:

https://api.apify.com

Core workflow

1. Find the right Actor

Search the Apify Store by keyword:

curl -s "https://api.apify.com/v2/store?search=web+scraper&limit=5" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data.items[] | {name: (.username + "/" + .name), title, description}'

Actors are identified by

username~name
(tilde) in API paths, e.g.
apify~web-scraper
.

2. Get Actor README and input schema

Before running an Actor, fetch its default build to get the README (usage docs) and input schema (expected JSON fields):

curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data | {readme, inputSchema}'

inputSchema
is a JSON-stringified object — parse it to see required/optional fields, types, defaults, and descriptions. Use this to construct valid input for the run.

You can also get the Actor's per-build OpenAPI spec (no auth required):

curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default/openapi.json"

3. Run an Actor (async — recommended for most cases)

Start the Actor and get the run object back immediately:

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":10}'

Response includes

data.id
(run ID),
data.defaultDatasetId
,
data.status
.

Optional query params:

?timeout=300&memory=4096&maxItems=100&waitForFinish=60

  • waitForFinish
    (0-60): seconds the API waits before returning. Useful to avoid polling for short runs.

4. Poll run status

curl -s "https://api.apify.com/v2/actor-runs/RUN_ID?waitForFinish=60" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data | {status, defaultDatasetId}'

Terminal statuses:

SUCCEEDED
,
FAILED
,
ABORTED
,
TIMED-OUT
.

5. Get results

Dataset items (most common — structured scraped data):

curl -s "https://api.apify.com/v2/datasets/DATASET_ID/items?clean=true&limit=100" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Or directly from the run (shortcut — same parameters):

curl -s "https://api.apify.com/v2/actor-runs/RUN_ID/dataset/items?clean=true&limit=100" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Params:

format
(
json
|
csv
|
jsonl
|
xml
|
xlsx
|
rss
),
fields
,
omit
,
limit
,
offset
,
clean
,
desc
.

Key-value store record (screenshots, HTML, OUTPUT):

curl -s "https://api.apify.com/v2/key-value-stores/STORE_ID/records/OUTPUT" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Run log:

curl -s "https://api.apify.com/v2/logs/RUN_ID" \
  -H "Authorization: Bearer $APIFY_TOKEN"

6. Run Actor synchronously (short-running Actors only)

For Actors that finish within 300 seconds, get dataset items in one call:

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":5}'

Returns the dataset items array directly (not wrapped in

data
). Returns
408
if the run exceeds 300s.

Alternative:

/run-sync
returns the KVS
OUTPUT
record instead of dataset items.

Quick recipes

Scrape a website

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":20}'

Google search

curl -s -X POST "https://api.apify.com/v2/acts/apify~google-search-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"queries":"site:example.com openai","maxPagesPerQuery":1}'

Long-running Actor (async with polling)

# 1. Start
RUN=$(curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs?waitForFinish=60" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":500}')
RUN_ID=$(echo "$RUN" | jq -r '.data.id')

# 2. Poll until done
while true; do
  STATUS=$(curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID?waitForFinish=60" \
    -H "Authorization: Bearer $APIFY_TOKEN" | jq -r '.data.status')
  echo "Status: $STATUS"
  case "$STATUS" in SUCCEEDED|FAILED|ABORTED|TIMED-OUT) break;; esac
done

# 3. Fetch results
curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID/dataset/items?clean=true" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Abort a run

curl -s -X POST "https://api.apify.com/v2/actor-runs/RUN_ID/abort" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Paid / rental Actors

Some Actors require a monthly subscription before they can be run. If the API returns a permissions or payment error for an Actor, ask the user to manually subscribe via the Apify Console:

https://console.apify.com/actors/ACTOR_ID

Replace

ACTOR_ID
with the Actor's ID (e.g.
AhEsMsQyLfHyMLaxz
). The user needs to click Start on that page to activate the subscription. Most rental Actors offer a free trial period set by the developer.

You can get the Actor ID from the store search response (

data.items[].id
) or from
GET /v2/acts/username~name
(
data.id
).

Error handling

  • 401:
    APIFY_TOKEN
    missing or invalid.
  • 404 Actor not found: check
    username~name
    format (tilde, not slash). Browse https://apify.com/store.
  • 400 run-failed: check
    GET /v2/logs/RUN_ID
    for details.
  • 402/403 payment required: the Actor likely requires a subscription. See "Paid / rental Actors" above.
  • 408 run-timeout-exceeded: sync endpoints have a 300s limit. Use async workflow instead.
  • 429 rate-limit-exceeded: retry with exponential backoff (start at 500ms, double each time).

Additional resources