Opencli opencli-browser
Use when an agent needs to drive a real Chrome window via opencli — inspect a page, fill forms, click through logged-in flows, or extract data ad-hoc. Covers the selector-first target contract, compound form fields, stale-ref handling, network capture, and the agent-native envelopes the CLI returns. Not for writing adapters — see opencli-adapter-author for that.
git clone https://github.com/jackwener/OpenCLI
T=$(mktemp -d) && git clone --depth=1 https://github.com/jackwener/OpenCLI "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/opencli-browser" ~/.claude/skills/jackwener-opencli-opencli-browser && rm -rf "$T"
skills/opencli-browser/SKILL.mdopencli-browser
The first reader of this CLI is an agent, not a human. Every subcommand returns a structured envelope that tells you exactly what matched, how confident the match is, and what to do if it didn't. Lean on those envelopes — do not guess.
This skill is for driving a live browser to accomplish an agent task. If you are building a reusable adapter under
~/.opencli/clis/<site>/ use opencli-adapter-author instead.
Prerequisites
opencli doctor
Until
doctor is green, nothing else will work. Typical failures: Chrome not running, extension not installed, debug port blocked by 1Password / other extensions. The doctor output tells you which.
Mental model
- Selector-first target contract. Every interaction command (
,click
,type
,select
) takes oneget text/value/attributes
, which is either a numeric ref from<target>
/state
or a CSS selector. Usefind
to disambiguate multiple CSS matches.--nth <n> - Every envelope reports
andmatches_n
.match_level
ismatch_level
,exact
, orstable
— the CLI already rescued moderate DOM drift for you, but the level tells you how confident to be.reidentified - Compact output first, full payload on demand.
is a budget-aware snapshot;state
supportsget html --as json
;--depth/--children-max/--text-max
returns shape previews and you re-fetch a single body withnetwork
. If you emit a giant payload you are burning context you did not need to burn.--detail <key> - Structured errors are machine-readable. On failure the CLI emits
. Branch on{error: {code, message, hint?, candidates?}}
, not on message strings.code
Critical rules
- Always inspect before you act. Run
orstate
first. Never hard-code a ref or selector from memory across sessions — indices are per-snapshot.find - Prefer numeric ref over CSS once you have it. Numeric refs survive mild DOM shifts because the CLI fingerprints each tagged element. A CSS selector written by hand will break the first time the site re-renders.
- Read
after every write.match_level
= all good.exact
= the element is the same but some soft attrs drifted — your action still applied.stable
= the original ref was gone and the CLI found a unique replacement; double-check you hit the right element.reidentified - Use the
field for form controls. Do not regex-guess a date format, do notcompound
twice to get the fullstate
options list. The compound envelope has the format string, full option list up to 50,<select>
for overflow, andoptions_total
/accept
formultiple
.<input type=file> - Verify writes that matter. After
, runtype <target> <text>
. Afterget value <target>
, runselect
. Autocomplete widgets, React controlled inputs, and masked fields all silently eat characters. The CLI cannot detect this for you.get value
→ action →state
after a page change. Navigations, form submits, and SPA route changes invalidate refs. Take a fresh snapshot. Do not reuse refs from before the transition.state- Chain with
. A chained sequence runs in one shell so refs acquired by the first command stay live for the second. Separate shell invocations lose the session context you just set up.&&
is read-only. Wrap the JS in an IIFE and return JSON. If you need to change the page, use the structuredeval
/click
/type
/select
commands instead — they produce structured output and fingerprints,keys
does not.eval- Prefer
to screen-scraping. If a page you care about fetches its data from a JSON API, the API is almost always more reliable than scraping the rendered DOM. Capture once, inspect the shape, thennetwork
the body you need.--detail <key>
Target contract (<target>
for click / type / select / get text|value|attributes)
<target><target> ::= <numeric-ref> | <css-selector>
- Numeric ref — the
index from[N]
orstate
. Cheap, resilient to soft DOM drift.find - CSS selector — anything
accepts. Must be unambiguous on write ops, or pair withquerySelectorAll
.--nth <n>
Envelope on success
{ "clicked": true, "target": "3", "matches_n": 1, "match_level": "exact" }
{ "value": "kalevin@example.com", "matches_n": 1, "match_level": "stable" }
match_level
| level | meaning | you should |
|---|---|---|
| Fingerprint agreed on tag + strong IDs with at most one soft drift | Proceed. |
| Tag + strong IDs still agree, soft signals (aria-label, role, text) drifted | Proceed, but if what you typed/clicked matters, re-check with or . |
| Original ref was gone; a unique live element matched the fingerprint and was re-tagged with the old ref | Double-check you hit the right element before chaining more writes. |
Structured error codes
Branch on these, not on the human message:
| code | meaning |
|---|---|
| Numeric ref is no longer in the DOM. Re-. |
| Ref exists but the element at that ref changed identity. Re-. |
| CSS was rejected by . Fix the selector. |
| CSS matches 0 elements. Try with a looser selector. |
| CSS matches >1 and no . Add or narrow the selector. |
| beyond match count. |
| couldn't find an option matching that label/value. Error envelope includes of the real option labels. |
| was called on a non- element. |
Error envelope always includes
error.code and error.message. Target errors (selector_not_found, selector_ambiguous, etc.) often add error.candidates: string[] with suggested selectors. option_not_found adds error.available: string[] instead.
Command reference
Inspect
| command | purpose |
|---|---|
| Snapshot: text tree with refs, scroll hints, hidden-interactive hints, sidecar for date/select/file refs. |
| Run a CSS query and return one entry per match with . Allocates refs for matches the prior snapshot didn't tag. Cheap alternative to when you already know the selector. |
| List cross-origin iframe targets. Pass the index to on . |
| Viewport PNG. No path → base64 to stdout. Prefer when you just need structure. |
Get (read-only)
| command | returns |
|---|---|
| plain text |
| plain text |
| |
| |
| |
| Raw HTML, or structured tree. JSON tree nodes have . Truncation reported via . |
Interact
| command | notes |
|---|---|
| Returns . |
| Clicks first, then types. Returns . means a combobox/datalist popup appeared after typing — you almost always need or a follow-up on the suggestion to commit the value. |
| Matches option by label first, then value. Use from / to see exactly what labels are available. |
| , , , , etc. Runs against the focused element. |
| / . Default amount . |
Wait
browser wait selector "<css>" [--timeout ms] # wait until the selector matches browser wait text "<substring>" [--timeout ms] # wait until the text appears browser wait time <seconds> # hard sleep, last resort
Default timeout
10000 ms. SPA routes, login redirects, and lazy-loaded lists need wait before state/get.
Extract
— Run an expression in the page (or in a cross-origin frame viabrowser eval <js> [--frame N]
). Wrap in an IIFE and return JSON. Read-only: no--frame
, no clicks, no navigations. If the result is a string, stdout is the raw string; otherwise it's JSON.document.forms[0].submit()
— Markdown extraction of long-form content with a continuation cursor. Returnsbrowser extract [--selector <css>] [--chunk-size N] [--start N]
. Loop on{url, title, selector, total_chars, chunk_size, start, end, next_start_char, content}
until it isnext_start_char
. Auto-scopes tonull
/<main>
/<article>
if you don't pass<body>
.--selector
Network
browser network # shape preview + cache key list browser network --detail <key> # full body for one cached entry browser network --filter "field1,field2" # keep only entries whose body shape contains ALL fields as path segments browser network --all # include static resources (usually noise) browser network --raw # full bodies inline — large; use sparingly browser network --ttl <ms> # cache TTL (default 24h)
List entries look like
{key, method, status, url, ct, size, shape, body_truncated?}. Detail envelope is {key, url, method, status, ct, size, shape, body, body_truncated?, body_full_size?, body_truncation_reason}. Cache lives in ~/.opencli/cache/browser-network/ so you can re-inspect without re-triggering the request.
Tabs & session
| command | purpose |
|---|---|
| JSON array of . The string is the tab identity you pass as to / , or to on any subcommand. ('s placeholder is historical — the value is always .) |
| Open a new tab. Prints the new string. |
| Make a tab the default. All subcommands accept to target one without changing the default. |
| Close by . |
| History back on the active tab. |
| Close the automation window when done. |
Compound form controls
Every date/time, select, and file input carries a
compound field. Use it — do not regex attributes.
Date family
{ "control": "date", "format": "YYYY-MM-DD", "current": "2026-04-21", "min": "2026-01-01", "max": "2026-12-31" }
control is one of date | time | datetime-local | month | week. format is a concrete template string — type into the field using that exact format, or select by label if the site wraps the native input in a custom widget.
Select
{ "control": "select", "multiple": false, "current": "United States", "options": [ { "label": "United States", "value": "us", "selected": true }, { "label": "Canada", "value": "ca" } ], "options_total": 137 }
options[] is capped at 50 entries. current is always correct even when the selected option is past the cap — it's computed by scanning every option, not from the truncated list. If options_total > options.length and you need an option that isn't in options[], call browser select <target> "<label>" directly — the CLI matches against the live DOM, not the truncated list.
File
{ "control": "file", "multiple": true, "current": ["report.pdf", "cover.png"], "accept": "application/pdf,image/*" }
Do not invent file paths. Upload is done via the normal click flow — respect
accept when telling the user what to upload.
Where compounds show up
entries: inline on each match.browser find --css <sel>
tree nodes: inline on matching nodes.browser get html --as json
snapshot: in abrowser state
sidecar keyed by numeric ref, so you can tell at a glance whichcompounds (N):
entries have rich metadata.[N]
Cost guide
Think about payload size per call. Budgets exist for a reason.
| command | rough cost | when to use |
|---|---|---|
| medium (bounded by internal budget) | First call on any page, after every nav, when you need refs. |
| small | You already know the selector — one query, compact entries. |
/ | tiny | Sanity checks between steps. |
| tiny per call | Verifying one specific field. |
(raw) | can be huge | Avoid on unbounded pages. Always pair with and a budget. |
| medium | When you need to reason about structure, not a specific field. |
| large | Only when the page is visual (CAPTCHA, charts). Prefer . |
| medium per chunk | Long-form reading. Loop via . |
(default) | small | First look at APIs. |
| varies | Pull one body. |
| huge | Only after narrowed the candidate set. |
| controlled | Targeted extraction when none of the above fit. |
Rule of thumb: one
per page transition, one state
per follow-up query, one find
/get
/click
per action. If your plan involves >10 calls per page you are probably scraping instead of interacting — consider type
extract or network.
Chaining rules
Good — one shell, live session:
opencli browser open "https://news.ycombinator.com" \ && opencli browser state \ && opencli browser click 3
Bad — each line is a fresh shell, refs from call 1 are already forgotten when call 2 runs. (Only a problem if you rely on shell-scoped state; browser refs themselves persist in-page, but interleaving unrelated shells invites races.) Prefer
&& when the steps are meant to be atomic.
Never chain a write and then an immediate
state without a wait if the action causes a network round-trip — you will snapshot the pre-response DOM and make bad decisions off stale data.
Recipes
Fill a login form
opencli browser open "https://example.com/login" opencli browser state # find [N] for email, password, submit opencli browser type 4 "me@example.com" opencli browser type 5 "hunter2" opencli browser get value 4 # verify (autocomplete can eat chars) opencli browser click 6 # submit opencli browser wait selector "[data-testid=account-menu]" --timeout 15000 opencli browser state # fresh refs on the logged-in page
Pick from a long dropdown
opencli browser state # sidebar shows [12] <select name=country> opencli browser find --css "select[name=country]" # the compound.options_total is 137, but compound.current is "" — unselected. opencli browser select 12 "Uruguay" opencli browser get value 12 # { value: "uy", match_level: "exact" }
Scrape a list via network instead of DOM
opencli browser open "https://news.ycombinator.com" opencli browser network --filter "title,score" # -> find the /topstories entry, note its key opencli browser network --detail topstories-a1b2
Read a long article in chunks
opencli browser open "https://blog.example.com/long-post" opencli browser extract --chunk-size 8000 # -> content + next_start_char: 8000 opencli browser extract --start 8000 --chunk-size 8000 # ...until next_start_char is null
Cross-origin iframe
opencli browser frames # -> [{"index": 0, "url": "https://checkout.stripe.com/...", ...}] opencli browser eval "(() => document.querySelector('input[name=cardnumber]')?.value)()" --frame 0
Pitfalls
- Do not submit forms via
— modern sites intercept with JS handlers and silently drop the call. Eithereval "document.forms[0].submit()"
the submit button via its ref, or (if you know the GET URL) justclick
it directly.open - Do not reuse refs across a page transition.
for the new state, then re-wait
. Old refs will either 404 or (worse)state
onto a similarly-shaped element on the new page.reidentify
is a warning, not an error. The action went through, but if you are chaining 5 more writes that all depend on that being the right element, verify with amatch_level: reidentified
orget text
before continuing.get value- Budget-aware commands silently cap.
with default budgets will returnget html --as json
. If your downstream logic needs the whole subtree, raisetruncated: {...}
/--depth
or tighten the selector.--children-max
on aautocomplete: true
response is not an error. It means a suggestion popup is open and your value isn't committed yet. Typicallytype
to accept the first suggestion, orkeys Enter
the one you want.click
is AND-semantics on path segments.network --filter
keeps entries whose body shape contains both--filter "title,score"
andtitle
as path segments, at any depth. It is not a regex.score- Screenshots are for humans, not for agents. Use
+state
unless the page is genuinely visual (captcha, chart). Screenshots burn tokens and rarely add signal an agent can act on.find
Troubleshooting
| symptom | fix |
|---|---|
red: "Browser not connected" | Start Chrome with , or rerun the extension install. |
| Disable 1Password / other CDP-hungry extensions temporarily. |
right after | Page mutated. then retry. |
across every command | You are reusing refs from a prior page. Re-. |
succeeds but nothing happens | The element is probably a decorative wrapper stealing clicks from the real target. with a narrower selector and retry on the inner element. |
appears to finish but value is wrong | Autocomplete, masked input, or React controlled re-render. Verify with . Add or re-type. |
Giant output | Pass + . |
| Network cache seems stale | Bump down, or let it expire. The cache lives at . |
See also
— turning what you just figured out into a reusableopencli-adapter-author
.~/.opencli/clis/<site>/<command>.js
— when an existing adapter breaks, this skill walks you throughopencli-autofix
and filing a fix.OPENCLI_DIAGNOSTIC