Skills apify

Run Apify Actors (web scrapers, crawlers, automation tools) and retrieve their results using the Apify REST API with curl. Use when the user wants to scrape a website, extract data from the web, run an Apify Actor, crawl pages, or get results from Apify datasets.

install

source · Clone the upstream repo

git clone https://github.com/openclaw/skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bmestanov/apify" ~/.claude/skills/clawdbot-skills-apify && rm -rf "$T"

manifest: skills/bmestanov/apify/SKILL.md

source content

Apify

Run any of the 17,000+ Actors on Apify Store and retrieve structured results via the REST API.

Full OpenAPI spec: openapi.json

Authentication

All requests need the `APIFY_TOKEN` environment variable. Use it as a Bearer token:

-H "Authorization: Bearer $APIFY_TOKEN"

Base URL: `https://api.apify.com`
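A minimal sketch of a wrapper that attaches the header to every call, assuming a POSIX-like shell with curl installed. The function names `apify_auth_header` and `apify_api` are hypothetical helpers introduced here for illustration; they are not part of the Apify API.

```shell
# Hypothetical helper: print the Authorization header, or fail if the token is unset.
apify_auth_header() {
  if [ -z "${APIFY_TOKEN:-}" ]; then
    echo "APIFY_TOKEN is not set" >&2
    return 1
  fi
  printf 'Authorization: Bearer %s\n' "$APIFY_TOKEN"
}

# Hypothetical helper: call an API path relative to the base URL.
# Usage: apify_api "/v2/store?limit=1"   (extra arguments are passed to curl)
apify_api() {
  path="$1"; shift
  header=$(apify_auth_header) || return 1
  curl -s "https://api.apify.com${path}" -H "$header" "$@"
}
```

Failing early on a missing token avoids sending a request that would only come back as a 401.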

Core workflow

1. Find the right Actor

Search the Apify Store by keyword:

curl -s "https://api.apify.com/v2/store?search=web+scraper&limit=5" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data.items[] | {name: (.username + "/" + .name), title, description}'

Actors are identified by `username~name` (tilde) in API paths, e.g. `apify~web-scraper`.
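Since the store search above prints names in `username/name` form, a small conversion step can help. The sketch below assumes the input contains at most one slash; `actor_path_id` is a hypothetical helper name.

```shell
# Hypothetical helper: convert "username/name" to the "username~name" form
# used in API paths. Input that has no slash is returned unchanged.
actor_path_id() {
  case "$1" in
    */*) printf '%s~%s\n' "${1%%/*}" "${1#*/}" ;;
    *)   printf '%s\n' "$1" ;;
  esac
}
```

For example, `actor_path_id apify/web-scraper` prints `apify~web-scraper`, which can be dropped into `/v2/acts/...` paths.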

2. Get Actor README and input schema

Before running an Actor, fetch its default build to get the README (usage docs) and input schema (expected JSON fields):

curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data | {readme, inputSchema}'

`inputSchema` is a JSON-stringified object: parse it to see required/optional fields, types, defaults, and descriptions. Use this to construct valid input for the run.
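One way to parse the stringified schema is jq's `fromjson` filter. This is a sketch that assumes the parsed schema exposes top-level `properties` and `required` keys; `summarize_input_schema` is a hypothetical helper name.

```shell
# Hypothetical helper: read a build response on stdin and summarize its input schema.
# Assumption: the parsed schema has top-level "properties" and "required" keys.
summarize_input_schema() {
  jq -c '.data.inputSchema | fromjson
         | {required: (.required // []), fields: (.properties // {} | keys)}'
}
```

It can be fed from the build request above, for example `curl -s ".../builds/default" -H "Authorization: Bearer $APIFY_TOKEN" | summarize_input_schema`.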

You can also get the Actor's per-build OpenAPI spec (no auth required):

curl -s "https://api.apify.com/v2/acts/apify~web-scraper/builds/default/openapi.json"

3. Run an Actor (async — recommended for most cases)

Start the Actor and get the run object back immediately:

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":10}'

Response includes `data.id` (run ID), `data.defaultDatasetId`, and `data.status`.

Optional query params: `?timeout=300&memory=4096&maxItems=100&waitForFinish=60`

`waitForFinish` (0-60): seconds the API waits before returning. Useful to avoid polling for short runs.
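A small sketch of assembling the run URL from whichever optional parameters are needed. `build_run_url` is a hypothetical helper, and it assumes the values need no URL encoding (true for the numeric parameters listed here).

```shell
# Hypothetical helper: build the "start run" URL with optional query parameters.
# Usage: build_run_url 'apify~web-scraper' timeout=300 waitForFinish=60
build_run_url() {
  actor="$1"; shift
  url="https://api.apify.com/v2/acts/${actor}/runs"
  sep='?'
  for kv in "$@"; do
    url="${url}${sep}${kv}"   # first parameter gets "?", the rest get "&"
    sep='&'
  done
  printf '%s\n' "$url"
}
```

The result can be passed straight to the POST request, e.g. `curl -s -X POST "$(build_run_url 'apify~web-scraper' waitForFinish=60)" ...`.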

4. Poll run status

curl -s "https://api.apify.com/v2/actor-runs/RUN_ID?waitForFinish=60" \
  -H "Authorization: Bearer $APIFY_TOKEN" | jq '.data | {status, defaultDatasetId}'

Terminal statuses: `SUCCEEDED`, `FAILED`, `ABORTED`, `TIMED-OUT`.
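The terminal-status check can be wrapped in a predicate so polling loops stay readable. This is a sketch; `is_terminal_status` is a hypothetical helper name, and any status not listed above is treated as still in progress.

```shell
# Hypothetical helper: succeed (exit 0) only for the terminal statuses listed above.
is_terminal_status() {
  case "$1" in
    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) return 0 ;;
    *) return 1 ;;
  esac
}
```

A loop can then read `until is_terminal_status "$STATUS"; do ...; done`, with the body re-fetching the run status.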

5. Get results

Dataset items (most common — structured scraped data):

curl -s "https://api.apify.com/v2/datasets/DATASET_ID/items?clean=true&limit=100" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Or directly from the run (shortcut — same parameters):

curl -s "https://api.apify.com/v2/actor-runs/RUN_ID/dataset/items?clean=true&limit=100" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Params: `format` (`json`, `csv`, `jsonl`, `xml`, `xlsx`, `rss`), `fields`, `omit`, `limit`, `offset`, `clean`, `desc`.
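For datasets larger than one page, `limit` and `offset` can be combined into a paging loop. The sketch below uses `format=jsonl` so each item is one line and pages can be counted with `wc -l`. The helper names and the default page size of 1000 are assumptions made for illustration.

```shell
# Hypothetical helper: fetch one page of dataset items as JSON Lines.
fetch_page() {  # args: dataset_id offset limit
  curl -s "https://api.apify.com/v2/datasets/$1/items?format=jsonl&clean=true&offset=$2&limit=$3" \
    -H "Authorization: Bearer $APIFY_TOKEN"
}

# Hypothetical helper: print every item, one JSON object per line.
# Stops when a page comes back empty or shorter than the page size.
fetch_all_items() {  # args: dataset_id [page_size]
  dataset_id="$1"; page_size="${2:-1000}"; offset=0
  while :; do
    page=$(fetch_page "$dataset_id" "$offset" "$page_size")
    [ -n "$page" ] || break
    printf '%s\n' "$page"
    count=$(( $(printf '%s\n' "$page" | wc -l) ))
    [ "$count" -ge "$page_size" ] || break
    offset=$((offset + page_size))
  done
}
```

Usage would look like `fetch_all_items DATASET_ID 500 > items.jsonl`. An error response from the API would be printed as if it were an item, so a production version should also check the HTTP status.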

Key-value store record (screenshots, HTML, OUTPUT):

curl -s "https://api.apify.com/v2/key-value-stores/STORE_ID/records/OUTPUT" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Run log:

curl -s "https://api.apify.com/v2/logs/RUN_ID" \
  -H "Authorization: Bearer $APIFY_TOKEN"

6. Run Actor synchronously (short-running Actors only)

For Actors that finish within 300 seconds, get dataset items in one call:

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":5}'

Returns the dataset items array directly (not wrapped in `data`). Returns `408` if the run exceeds 300s.

Alternative: `/run-sync` returns the KVS `OUTPUT` record instead of dataset items.
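Because the sync endpoint returns items on success and an error status on timeout, it can help to capture the HTTP status separately from the body. The sketch below uses curl's `-o` and `-w '%{http_code}'` options. `sync_outcome` and `run_sync` are hypothetical helpers, and treating every 2xx code as success is an assumption.

```shell
# Hypothetical helper: classify the HTTP status of a sync run.
sync_outcome() {
  case "$1" in
    2??) echo "ok" ;;        # assumption: any 2xx means items were returned
    408) echo "timeout" ;;   # run exceeded the sync limit
    *)   echo "error" ;;
  esac
}

# Hypothetical helper: run an Actor synchronously and write the response body to a file.
# Usage: run_sync 'apify~web-scraper' '{"startUrls":[{"url":"https://example.com"}]}' items.json
run_sync() {
  actor="$1"; input="$2"; out="$3"
  code=$(curl -s -o "$out" -w '%{http_code}' -X POST \
    "https://api.apify.com/v2/acts/${actor}/run-sync-get-dataset-items?timeout=120" \
    -H "Authorization: Bearer $APIFY_TOKEN" \
    -H "Content-Type: application/json" \
    -d "$input")
  outcome=$(sync_outcome "$code")
  [ "$outcome" = "ok" ] || echo "Sync run returned HTTP $code ($outcome); consider the async workflow" >&2
  [ "$outcome" = "ok" ]
}
```

On a `timeout` outcome the caller can fall back to starting the run asynchronously and polling, as in steps 3 to 5.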

Quick recipes

Scrape a website

curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":20}'

Google search

curl -s -X POST "https://api.apify.com/v2/acts/apify~google-search-scraper/run-sync-get-dataset-items?timeout=120" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"queries":"site:example.com openai","maxPagesPerQuery":1}'

Long-running Actor (async with polling)

# 1. Start
RUN=$(curl -s -X POST "https://api.apify.com/v2/acts/apify~web-scraper/runs?waitForFinish=60" \
  -H "Authorization: Bearer $APIFY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"startUrls":[{"url":"https://example.com"}],"maxPagesPerCrawl":500}')
RUN_ID=$(echo "$RUN" | jq -r '.data.id')

# 2. Poll until done
while true; do
  STATUS=$(curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID?waitForFinish=60" \
    -H "Authorization: Bearer $APIFY_TOKEN" | jq -r '.data.status')
  echo "Status: $STATUS"
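  case "$STATUS" in
    SUCCEEDED|FAILED|ABORTED|TIMED-OUT) break;;
    # Stop instead of looping forever if the run could not be started or read
    ""|null) echo "Could not read run status; check RUN_ID and APIFY_TOKEN" >&2; break;;
  esac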
done

# 3. Fetch results
curl -s "https://api.apify.com/v2/actor-runs/$RUN_ID/dataset/items?clean=true" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Abort a run

curl -s -X POST "https://api.apify.com/v2/actor-runs/RUN_ID/abort" \
  -H "Authorization: Bearer $APIFY_TOKEN"

Paid / rental Actors

Some Actors require a monthly subscription before they can be run. If the API returns a permissions or payment error for an Actor, ask the user to manually subscribe via the Apify Console:

https://console.apify.com/actors/ACTOR_ID

Replace `ACTOR_ID` with the Actor's ID (e.g. `AhEsMsQyLfHyMLaxz`). The user needs to click Start on that page to activate the subscription. Most rental Actors offer a free trial period set by the developer.

You can get the Actor ID from the store search response (`data.items[].id`) or from `GET /v2/acts/username~name` (`data.id`).

Error handling

401: `APIFY_TOKEN` missing or invalid.
404 Actor not found: check `username~name` format (tilde, not slash). Browse https://apify.com/store.
400 run-failed: check `GET /v2/logs/RUN_ID` for details.
402/403 payment required: the Actor likely requires a subscription. See "Paid / rental Actors" above.
408 run-timeout-exceeded: sync endpoints have a 300s limit. Use async workflow instead.
429 rate-limit-exceeded: retry with exponential backoff (start at 500ms, double each time).
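
The backoff rule for 429 responses can be sketched as follows. `backoff_delay_ms` and `retry_on_429` are hypothetical helpers, and the wrapped command is assumed to print only the HTTP status code (for example, curl with `-o /dev/null -w '%{http_code}'`).

```shell
# Delay in milliseconds before retry number N (1-based): 500, 1000, 2000, ...
backoff_delay_ms() {
  echo $((500 * (1 << ($1 - 1))))
}

# Hypothetical helper: run a command that prints an HTTP status code,
# retrying while it prints 429, for at most max_attempts tries.
retry_on_429() {  # args: max_attempts command [args...]
  max_attempts="$1"; shift
  attempt=1
  while :; do
    status=$("$@")
    if [ "$status" != "429" ] || [ "$attempt" -ge "$max_attempts" ]; then
      echo "$status"
      return 0
    fi
    ms=$(backoff_delay_ms "$attempt")
    # Assumption: sleep accepts fractional seconds (GNU and BSD do; POSIX does not require it).
    sleep "$((ms / 1000)).$(printf '%03d' $((ms % 1000)))"
    attempt=$((attempt + 1))
  done
}
```

A call might look like `retry_on_429 5 curl -s -o /dev/null -w '%{http_code}' "https://api.apify.com/v2/store?limit=1" -H "Authorization: Bearer $APIFY_TOKEN"`; the final status code is printed once the request stops being rate-limited or the attempts run out.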

Additional resources

API docs (LLM-friendly): https://docs.apify.com/api/v2.md
OpenAPI spec: openapi.json
Apify Store (browse Actors): https://apify.com/store