CLI-Anything cli-anything-safari
git clone https://github.com/HKUDS/CLI-Anything
T=$(mktemp -d) && git clone --depth=1 https://github.com/HKUDS/CLI-Anything "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/cli-anything-safari" ~/.claude/skills/hkuds-cli-anything-cli-anything-safari && rm -rf "$T"
skills/cli-anything-safari/SKILL.md
cli-anything-safari
A command-line interface for Safari browser automation on macOS. Wraps the
safari-mcp Node.js MCP server in a Python Click CLI.
Feature parity is guaranteed. Every Click command is generated automatically from
safari-mcp's tool schema (bundled as
resources/tools.json). All 84 tools are reachable with the exact
argument names and types the MCP server expects.
When to use this CLI
Each CLI invocation spawns a fresh subprocess, so there is per-call overhead. If your agent speaks MCP natively (Claude Code, Cursor, Cline, etc.), using
safari-mcp directly over MCP stdio will be faster.
Use this CLI when:
- Your agent framework does not speak MCP (Codex CLI, GitHub Copilot CLI, custom scripts, older agent frameworks).
- You need to script browser automation from bash: cli-anything-safari --json tool snapshot | jq '...'.
- You run in CI/CD and want cron-able, subprocess-friendly output.
- You're debugging interactively from Terminal.
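For custom scripts, a thin Python wrapper around the CLI is often enough. The sketch below is illustrative, not part of the harness: `build_argv` and `run_tool` are hypothetical helper names, and the flag convention (underscores become hyphens, booleans become bare switches such as --full-page) is inferred from the examples in this document rather than confirmed against the schema.

```python
import json
import subprocess

CLI = "cli-anything-safari"


def build_argv(tool, **options):
    """Map keyword options to CLI flags, e.g. source_selector -> --source-selector."""
    argv = [CLI, "--json", "tool", tool]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)  # assumed: booleans are bare switches
        elif value is not None and value is not False:
            argv.extend([flag, str(value)])
    return argv


def run_tool(tool, **options):
    """Spawn one CLI process and return its parsed JSON output."""
    proc = subprocess.run(build_argv(tool, **options),
                          capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

Each `run_tool` call still pays the per-invocation subprocess cost described above; run `tools describe <name>` to confirm the real flag names before relying on the mapping.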
Installation
Prerequisites
- macOS — Safari MCP is macOS-only.
- Safari — already installed on macOS.
- Node.js 18+: brew install node, or from https://nodejs.org/
- Python 3.10+
- Enable Apple Events for Safari: Safari → Develop → Allow JavaScript from Apple Events
Install the CLI
    cd safari/agent-harness
    pip install -e .
The first
tool call will download the safari-mcp npm package (one-time, a few MB).
Command Structure
The CLI has 5 top-level commands:
| Command | Purpose |
|---|---|
| tool | Call any of safari-mcp's 84 tools (dynamic, schema-driven) |
| tools | Inspect the bundled tool registry (list, describe, count) |
| raw | Escape hatch: call a tool by full name with raw JSON args |
| session | In-memory session state (last URL, current tab) |
| repl | Interactive REPL (default when no subcommand given) |
Usage Examples
Discover the tool surface
    # Count of tools (sanity check; must match safari-mcp's registered tools)
    cli-anything-safari tools count   # → 84

    # List every tool
    cli-anything-safari tools list
    cli-anything-safari tools list --filter click   # filter by substring

    # Full schema for one tool (JSON or human format)
    cli-anything-safari tools describe safari_scroll
    cli-anything-safari --json tools describe safari_click
Call a tool (schema-driven)
    # Navigate
    cli-anything-safari tool navigate --url https://example.com

    # Take a snapshot (preferred over screenshot: structured text with ref IDs)
    cli-anything-safari --json tool snapshot

    # Click by ref (refs come from snapshot; they expire on the next snapshot!)
    cli-anything-safari tool click --ref 0_5

    # Click by selector or visible text
    cli-anything-safari tool click --selector "#submit"
    cli-anything-safari tool click --text "Log in"

    # Fill a field
    cli-anything-safari tool fill --selector "#email" --value "user@example.com"

    # Scroll by direction/amount (NOT x/y; note the schema!)
    cli-anything-safari tool scroll --direction down --amount 500

    # Drag one element onto another
    cli-anything-safari tool drag \
      --source-selector ".card" \
      --target-selector ".trash"

    # Screenshot: returns base64 JPEG in stdout. Decode with:
    cli-anything-safari --json tool screenshot --full-page \
      | python3 -c "import sys,json,base64; \
        d=json.load(sys.stdin); \
        open('/tmp/shot.jpg','wb').write(base64.b64decode(d['data']))"

    # Save as PDF (this one writes to disk directly)
    cli-anything-safari tool save-pdf --path /tmp/page.pdf

    # Evaluate JavaScript (note: parameter is --script, not --code)
    cli-anything-safari tool evaluate --script "document.title"
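When the screenshot is consumed from a script rather than a shell pipeline, the same decoding step can live in a small function. This is a minimal sketch that assumes, as the one-liner above does, that the JSON response carries the base64 image under a `data` key; `save_screenshot` is a hypothetical helper name.

```python
import base64
import json
from pathlib import Path


def save_screenshot(cli_stdout, dest):
    """Decode the base64 image in `--json tool screenshot` output and write it to dest.

    Returns the number of bytes written.
    """
    payload = json.loads(cli_stdout)
    image = base64.b64decode(payload["data"])  # "data" key as used in the shell example
    Path(dest).write_bytes(image)
    return len(image)
```

Pass it the captured stdout of `cli-anything-safari --json tool screenshot`, for example from `subprocess.run(..., capture_output=True, text=True).stdout`.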
Navigate and read in one round-trip
cli-anything-safari --json tool navigate-and-read --url https://example.com
Form fill (bulk)
safari_fill_form takes an array of {selector, value} objects.
Pass it as a JSON string:
    cli-anything-safari tool fill-form --fields '[
      {"selector": "#email", "value": "user@example.com"},
      {"selector": "#password", "value": "hunter2"}
    ]'
Run
cli-anything-safari tools describe safari_fill_form to see the
exact schema, including any new fields safari-mcp adds upstream.
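Hand-written JSON inside shell quotes breaks easily once values contain quotes or backslashes. One way around that, sketched here under the assumption that the `{selector, value}` shape described above is the whole schema, is to build the `--fields` string with `json.dumps` and pass the arguments as a list so no shell is involved. `fill_form_argv` is a hypothetical helper.

```python
import json


def fill_form_argv(fields):
    """Build the argv for `tool fill-form` from a {selector: value} mapping."""
    payload = [{"selector": selector, "value": value}
               for selector, value in fields.items()]
    return ["cli-anything-safari", "tool", "fill-form",
            "--fields", json.dumps(payload)]
```

The resulting list can be handed straight to `subprocess.run`, which avoids a second layer of quoting.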
Network monitoring
    cli-anything-safari tool start-network-capture
    cli-anything-safari tool navigate --url https://example.com
    cli-anything-safari --json tool network
    cli-anything-safari tool performance-metrics
Storage
    cli-anything-safari tool get-cookies
    cli-anything-safari tool set-cookie --name session --value abc123 --domain example.com
    cli-anything-safari tool local-storage --key theme

    # export-storage returns JSON to stdout; there is no --path arg. Pipe to a file:
    cli-anything-safari --json tool export-storage > /tmp/storage.json
Raw JSON escape hatch
When you need to pass a complex nested object or want to drive the CLI from a pre-built JSON blob:
    cli-anything-safari raw safari_evaluate \
      --json-args '{"script":"[...document.querySelectorAll(\"a\")].map(a => a.href)"}'
Interactive REPL
cli-anything-safari
The REPL banner prints the absolute path to this SKILL.md so agents can self-discover capabilities.
JSON Output
All commands support
--json as a global flag:
    cli-anything-safari --json tool snapshot
    cli-anything-safari --json tool list-tabs
    cli-anything-safari --json tools list
State Management
The CLI maintains a small amount of in-memory state for REPL display only:
- last_url: last URL the CLI navigated to (updated after every successful tool navigate, tool navigate-and-read, or tool new-tab)
- current_tab_index: last known active tab index
There is no persistent session, no undo/redo, no document model. Every CLI invocation starts with fresh state. Safari MCP itself is stateless per-call: each
tool command spawns a fresh
npx safari-mcp subprocess, performs the action, and exits. This is a
deliberate design choice; see HARNESS.md and TEST.md for the
reasoning behind the deviation from the standard undo/redo pattern.
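The state described here is small enough to picture as a single record. The sketch below is an illustration only, not the harness's actual code: `SessionState`, `record`, and `URL_TOOLS` are hypothetical names, and since the text does not say how the tab index is refreshed, that field is left untouched.

```python
from dataclasses import dataclass
from typing import Optional

# Tools the text names as updating last_url after a successful call.
URL_TOOLS = {"navigate", "navigate-and-read", "new-tab"}


@dataclass
class SessionState:
    last_url: Optional[str] = None
    current_tab_index: Optional[int] = None

    def record(self, tool, args, ok):
        """Update display state after a tool call; failed calls change nothing."""
        if ok and tool in URL_TOOLS and "url" in args:
            self.last_url = args["url"]
```

Because the record lives only in memory, a new process always starts from `SessionState()` with both fields empty, which matches the "fresh state" behaviour described above.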
Output Formats
All commands support dual output modes:
- Human-readable (default): indented key-value text for dict results, bullet lists for arrays, plain text otherwise
- Machine-readable (--json flag): structured JSON for agent consumption
    # Human output
    cli-anything-safari tool snapshot

    # JSON output for agents
    cli-anything-safari --json tool snapshot
    cli-anything-safari --json tools list
    cli-anything-safari --json tools describe safari_click
For AI Agents
When using this CLI programmatically:
- Always use the --json flag for parseable output.
- Check return codes: 0 for success, non-zero for errors (URL validation failures, MCP call failures, invalid JSON args).
- Parse stderr for error messages; use stdout for data.
- File-handling tools have inconsistent path arg names; always check tools describe <name> first:
  - tool save-pdf --path /tmp/x.pdf
  - tool upload-file --selector ... --file-path /tmp/x.txt (note: --file-path, not --path)
  - tool export-storage: no path arg; pipe JSON output to a file
  - tool import-storage --path /tmp/x.json
  - tool screenshot / screenshot-element: return base64 in the JSON response, no path arg (decode it yourself)
- Snapshot before click: refs from tool snapshot expire on the next snapshot. Always snapshot → find ref → click in close succession.
- Discover tools via tools list: the bundled registry is the source of truth for what's available. Do not hard-code tool names that may change upstream.
- Use tools describe <name> to learn the exact schema (required args, enum choices, JSON-typed args) before constructing a call. Never assume parameter names from the description; for example, safari_evaluate takes --script (not --code) even though the description says "JavaScript code to execute".
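The return-code and stdout/stderr rules above can be folded into one small function that an agent applies to every call. This is a sketch under the stated contract only (exit code 0 means stdout holds JSON, anything else means stderr holds the message); `ToolError` and `parse_result` are hypothetical names.

```python
import json


class ToolError(RuntimeError):
    """Raised when the CLI exits non-zero; the message is taken from stderr."""


def parse_result(returncode, stdout, stderr):
    """Return parsed JSON on success, raise ToolError with the stderr text otherwise."""
    if returncode != 0:
        raise ToolError(stderr.strip() or f"exit code {returncode}")
    return json.loads(stdout)
```

A typical caller would run the CLI with `subprocess.run([...], capture_output=True, text=True)` and pass `proc.returncode`, `proc.stdout`, and `proc.stderr` through unchanged.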
Agent-Specific Guidance
Finding the right tool
Use the introspection commands. The CLI is guaranteed to reflect the MCP server 1:1:
    # Find all click-related tools
    cli-anything-safari tools list --filter click

    # Get the full schema (including every argument with type, description,
    # required/optional, enum choices, defaults)
    cli-anything-safari --json tools describe safari_click
Tool selection strategy
- tool snapshot over tool screenshot: structured text with ref IDs is orders of magnitude cheaper and carries the refs needed for clicks.
- tool click --ref over tool click --selector: refs are stable within a single snapshot, selectors may be brittle.
- tool navigate-and-read over navigate + read-page: saves one round-trip.
- tool click-and-read over click + read-page: saves one round-trip.
- tool native-click only when regular click fails with 405/403 (WAF blocks, G2, Cloudflare): it physically moves the cursor.
Refs Expire
Refs from
tool snapshot expire when you take a new snapshot:
- First snapshot: refs 0_1, 0_2, 0_3 ...
- Second snapshot: refs 1_1, 1_2, 1_3 ...
Always snapshot → click in close succession. If in doubt, snapshot again.
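An agent that caches refs can guard against using an expired one. The sketch below assumes, purely from the examples above, that the part of a ref before the underscore is a snapshot counter; that format is not confirmed by the schema, and `ref_generation` and `is_stale` are hypothetical helpers.

```python
def ref_generation(ref):
    """Return the assumed snapshot counter encoded in a ref such as '1_3'."""
    generation, _, _element = ref.partition("_")
    return int(generation)


def is_stale(ref, latest_generation):
    """True when the ref was issued by a snapshot other than the latest one."""
    return ref_generation(ref) != latest_generation
```

If the ref format turns out to differ, the safer fallback is the rule already stated: take a new snapshot before every click.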
Tab Ownership Safety
Safari MCP tracks tab ownership per session. Tools that modify a tab (navigate, click, fill) are blocked on tabs the session did not open. To operate on a specific page, always start with
tool new-tab --url ....
Error Handling
Common errors:
- npx not found → install Node.js 18+
- safari-mcp package not found on npm registry → check network
- Not macOS → harness is macOS-only
- AppleScript denied → enable "Allow JavaScript from Apple Events" in Safari → Develop
- Blocked URL scheme: file → URL validation rejected the input (by design)
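A script can turn these messages into actionable hints with a simple lookup. The sketch assumes the phrases listed above appear somewhere in the CLI's stderr text, which this document implies but does not guarantee; `ERROR_HINTS` and `hint_for` are hypothetical names.

```python
# (substring to look for in stderr, suggested remediation)
ERROR_HINTS = [
    ("npx not found", "install Node.js 18+"),
    ("safari-mcp package not found", "check network access to the npm registry"),
    ("Not macOS", "the harness only runs on macOS"),
    ("AppleScript denied",
     'enable "Allow JavaScript from Apple Events" in the Safari Develop menu'),
    ("Blocked URL scheme", "the URL was rejected by validation; this is by design"),
]


def hint_for(stderr_text):
    """Return the first matching remediation hint, or None if nothing matches."""
    lowered = stderr_text.lower()
    for needle, hint in ERROR_HINTS:
        if needle.lower() in lowered:
            return hint
    return None
```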
URL Validation
The CLI validates URLs before passing them to
safari_navigate,
safari_navigate_and_read, and safari_new_tab. Blocked schemes:
file, javascript, data, vbscript, about, chrome, safari,
webkit, x-apple, and other browser-internal schemes. The raw
command also enforces this for navigation tools.
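The scheme check can be pictured roughly as follows. This is a sketch of the general technique, not the harness's implementation: it covers only the schemes named above (the "other browser-internal schemes" are not enumerated in this document), and `check_url` and `BLOCKED_SCHEMES` are hypothetical names.

```python
from urllib.parse import urlsplit

# Only the schemes this document lists explicitly.
BLOCKED_SCHEMES = {"file", "javascript", "data", "vbscript", "about",
                   "chrome", "safari", "webkit", "x-apple"}


def check_url(url):
    """Return the URL unchanged, or raise ValueError if its scheme is blocked."""
    scheme = urlsplit(url.strip()).scheme.lower()
    if scheme in BLOCKED_SCHEMES:
        raise ValueError(f"Blocked URL scheme: {scheme}")
    return url
```

A blocklist like this lets scheme-less input through, so a stricter variant might instead allow only `http` and `https`; which approach the CLI takes is not stated here.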
Multi-Session Warning
Safari MCP enforces a single active session by killing stale Node.js processes older than 10 seconds. If you run two CLI instances at once, one will kill the other's backend. There is currently no daemon mode — for latency-sensitive workflows, drive the CLI from a long-lived Python script that imports
cli_anything.safari.utils.safari_backend.call() directly to avoid
re-spawning the subprocess on every invocation.
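Where several cooperating scripts on one machine share Safari, one way to keep their CLI calls from overlapping is an advisory file lock. This is a generic Unix technique offered as a sketch, not a feature of the harness: the lock path and the `safari_lock` helper are hypothetical, and it only protects callers that opt in to using it.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager

# Hypothetical lock location shared by all cooperating scripts.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "cli-anything-safari.lock")


@contextmanager
def safari_lock(path=LOCK_PATH):
    """Hold an exclusive advisory lock for the duration of the with-block."""
    with open(path, "w") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

Wrapping each invocation as `with safari_lock(): subprocess.run([...])` serializes the calls; it does not remove the per-call spawn cost.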
Security Considerations
URL Validation
All navigation tools (
tool navigate, tool navigate-and-read, tool new-tab, and raw safari_navigate*) pass the url argument through
utils/security.py which blocks dangerous schemes and optionally blocks
private networks (set CLI_ANYTHING_SAFARI_BLOCK_PRIVATE=1).
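How utils/security.py decides that a host is private is not described here. As a rough illustration of the idea, the sketch below uses Python's standard `ipaddress` module and treats only `localhost` and literal IP addresses; it does not resolve DNS names, and both function names are hypothetical.

```python
import ipaddress
import os


def block_private_enabled(environ=os.environ):
    """True when the opt-in environment variable from the text is set to 1."""
    return environ.get("CLI_ANYTHING_SAFARI_BLOCK_PRIVATE") == "1"


def is_private_host(host):
    """True for localhost and for literal private, loopback, or link-local IPs."""
    if host.lower() == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; this sketch does not resolve it
    return addr.is_private or addr.is_loopback or addr.is_link_local
```

A production check would also need to resolve hostnames before deciding, since a public-looking name can point at a private address.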
Tab Isolation
Safari MCP enforces per-session tab ownership upstream — tools cannot operate on tabs the session did not open.
Profile Isolation
Set
SAFARI_PROFILE env var to use a separate Safari profile for
automation:
    export SAFARI_PROFILE="Automation"
    cli-anything-safari tool navigate --url https://example.com
This keeps cookies/logins/history separate from the user's main browsing.
JavaScript Execution
tool evaluate and tool run-script run arbitrary JavaScript in the page
context. Treat untrusted input with the same care as any dynamic code
execution path.
Clipboard
tool clipboard-read and tool clipboard-write touch the system
clipboard. Be careful when running inside a user's active session —
overwriting the clipboard mid-task is disruptive.
Regenerating the tool registry
If you upgrade
safari-mcp, regenerate the bundled schema:
    python scripts/extract_tools.py \
      "$(npm root -g)/safari-mcp/index.js" \
      cli_anything/safari/resources/tools.json
The parity test (
test_parity.py) pins the expected tool count; update
it when the upstream tool list changes.
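A count-pinning check of this kind might look like the sketch below. It is not the contents of test_parity.py: the registry layout (a JSON array of objects with a `name` key) is an assumption made for illustration, and `registry_names` and `check_parity` are hypothetical helpers.

```python
import json

EXPECTED_TOOL_COUNT = 84  # the count this document states for safari-mcp 2.7.8


def registry_names(registry_json):
    """Assumes the registry is a JSON array of objects that each have a "name"."""
    return [tool["name"] for tool in json.loads(registry_json)]


def check_parity(registry_json, expected=EXPECTED_TOOL_COUNT):
    """Fail loudly if tool names repeat or the count drifts from the pinned value."""
    names = registry_names(registry_json)
    assert len(names) == len(set(names)), "duplicate tool names in registry"
    assert len(names) == expected, f"expected {expected} tools, found {len(names)}"
```

After regenerating tools.json, a failing count is the signal to review the upstream changes and then update the pinned number.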
More Information
- Full documentation: cli_anything/safari/README.md in the package
- Test coverage: cli_anything/safari/tests/TEST.md in the package
- Architecture analysis: safari/agent-harness/SAFARI.md
- Methodology: cli-anything-plugin/HARNESS.md
- MCP backend pattern: cli-anything-plugin/guides/mcp-backend.md
Version
1.0.0 — targets safari-mcp 2.7.8 (84 tools). Bundled tool registry is regenerated via
scripts/extract_tools.py when safari-mcp upgrades.