Qaskills Vibe Check - Browser Automation
Browser automation for AI agents. Navigate pages, fill forms, click elements, take screenshots, and manage tabs — all through simple CLI commands. 2.6k+ GitHub stars.
install
source · Clone the upstream repo
git clone https://github.com/PramodDutta/qaskills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/PramodDutta/qaskills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/seed-skills/vibe-check" ~/.claude/skills/pramoddutta-qaskills-vibe-check-browser-automation && rm -rf "$T"
manifest: seed-skills/vibe-check/SKILL.md
source content
Vibium Browser Automation — CLI Reference
The `vibium` CLI automates Chrome from the command line. The browser launches automatically on first use, and daemon mode keeps it running between commands.
vibium go <url> && vibium map && vibium click @e1 && vibium map
Core Workflow
Every browser automation follows this pattern:
- Navigate: `vibium go <url>`
- Map: `vibium map` (get element refs like `@e1`, `@e2`)
- Interact: Use refs to click, fill, select, e.g. `vibium click @e1`
- Re-map: After navigation or DOM changes, get fresh refs with `vibium map`
Binary Resolution
Before running any commands, resolve the `vibium` binary path once:
- Try `vibium` directly (works if globally installed via `npm install -g vibium`)
- Fall back to `./clicker/bin/vibium` (dev environment, in project root)
- Fall back to `./node_modules/.bin/vibium` (local npm install)
Run `vibium --help` (or the resolved path) to confirm. Use the resolved path for all subsequent commands.
Windows note: Use forward slashes in paths (e.g. `./clicker/bin/vibium.exe`) and quote paths containing spaces.
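The lookup order above can be sketched as a small shell function. `resolve_vibium` is a hypothetical helper name, not part of the CLI, and the sketch assumes it runs from the project root so the relative paths resolve.

```shell
# resolve_vibium (hypothetical helper): print the first usable vibium path,
# trying the three locations in the order listed above.
resolve_vibium() {
  if command -v vibium >/dev/null 2>&1; then
    # Globally installed and on PATH.
    echo "vibium"
  elif [ -x ./clicker/bin/vibium ]; then
    # Dev environment, relative to the project root.
    echo "./clicker/bin/vibium"
  elif [ -x ./node_modules/.bin/vibium ]; then
    # Local npm install.
    echo "./node_modules/.bin/vibium"
  else
    echo "vibium binary not found" >&2
    return 1
  fi
}
```

A typical use would be `VIBIUM=$(resolve_vibium) && "$VIBIUM" --help`, then reusing `"$VIBIUM"` for every later command.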
Command Chaining
Chain commands with `&&` to run them sequentially. The chain stops on the first error:
vibium go https://example.com && vibium map && vibium click @e3 && vibium diff map
When to chain: Use `&&` for sequences that should happen back-to-back (navigate → interact → verify). Run commands separately when you need to inspect output between steps.
When NOT to chain: Don't chain commands that depend on parsing the previous output (e.g. reading map output to decide what to click). Run those separately so you can analyze the result first.
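As a rough sketch of the "don't chain" case, the map step can run on its own and its output can be inspected before choosing a ref. `ref_for_text` and `click_by_text` are hypothetical helpers. The sketch assumes the map output prints one element per line with its ref, in a shape like `@e3 [button] "Sign In"`; the real output format may differ.

```shell
# ref_for_text (hypothetical): read map output on stdin and print the first
# @eN ref found on a line that contains the given text.
ref_for_text() {
  grep -F -- "$1" | grep -o '@e[0-9][0-9]*' | head -n 1
}

# click_by_text (hypothetical): map first, inspect the output, then click.
click_by_text() {
  local map_output ref
  # Run map separately so its output can be analyzed before acting.
  map_output=$(vibium map) || return 1
  ref=$(printf '%s\n' "$map_output" | ref_for_text "$1")
  if [ -z "$ref" ]; then
    echo "no mapped element contains: $1" >&2
    return 1
  fi
  vibium click "$ref"
}
```

For a plain text lookup, `vibium find --text` is likely simpler; the point here is only that the decision step sits between two separately run commands.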
Commands
Discovery
- `vibium map`: map interactive elements with @refs (recommended before interacting)
- `vibium map --selector "nav"`: scope map to elements within a CSS subtree
- `vibium diff map`: compare current vs last map (see what changed)
Navigation
- `vibium go <url>`: go to a page
- `vibium back`: go back in history
- `vibium forward`: go forward in history
- `vibium reload`: reload the current page
- `vibium url`: print current URL
- `vibium title`: print page title
Reading Content
- `vibium text`: get all page text
- `vibium text "<selector>"`: get text of a specific element
- `vibium html`: get page HTML (use `--outer` for outerHTML)
- `vibium find "<selector>"`: find element, return `@e1` ref (clickable with `vibium click @e1`)
- `vibium find --text "Sign In"`: find element by text content → `@e1`
- `vibium find --label "Email"`: find input by label → `@e1`
- `vibium find --placeholder "Search"`: find by placeholder → `@e1`
- `vibium find --testid "submit-btn"`: find by data-testid → `@e1`
- `vibium find --xpath "//div[@class]"`: find by XPath → `@e1`
- `vibium find --alt "Logo"`: find by alt attribute → `@e1`
- `vibium find --title "Settings"`: find by title attribute → `@e1`
- `vibium find-all "<selector>"`: find all matching elements → `@e1`, `@e2`, ... (`--limit N`)
- `vibium find --role <role>`: find element by ARIA role → `@e1` (combine with `--text`, `--label`, etc.)
- `vibium eval "<js>"`: run JavaScript and print result (`--stdin` to read from stdin)
- `vibium count "<selector>"`: count matching elements
- `vibium screenshot -o file.png`: capture screenshot (`--full-page`, `--annotate`)
- `vibium a11y-tree`: accessibility tree (`--everything` for all nodes)
Interaction
- `vibium click "<selector>"`: click an element (also accepts `@ref` from map)
- `vibium dblclick "<selector>"`: double-click an element
- `vibium type "<selector>" "<text>"`: type into an input (appends to existing value)
- `vibium fill "<selector>" "<text>"`: clear field and type new text (replaces value)
- `vibium press <key> [selector]`: press a key on element or focused element
- `vibium focus "<selector>"`: focus an element
- `vibium hover "<selector>"`: hover over an element
- `vibium scroll [direction]`: scroll page (`--amount N`, `--selector`)
- `vibium scroll-into-view "<selector>"`: scroll element into view (centered)
- `vibium keys "<combo>"`: press keys (Enter, Control+a, Shift+Tab)
- `vibium select "<selector>" "<value>"`: pick a dropdown option
- `vibium check "<selector>"`: check a checkbox/radio (idempotent)
- `vibium uncheck "<selector>"`: uncheck a checkbox (idempotent)
Mouse Primitives
- `vibium mouse-click [x] [y]`: click at coordinates or current position (`--button 0|1|2`)
- `vibium mouse-move <x> <y>`: move mouse to coordinates
- `vibium mouse-down`: press mouse button (`--button 0|1|2`)
- `vibium mouse-up`: release mouse button (`--button 0|1|2`)
- `vibium drag "<source>" "<target>"`: drag from one element to another
Element State
- `vibium value "<selector>"`: get input/textarea/select value
- `vibium attr "<selector>" "<attribute>"`: get HTML attribute value
- `vibium is-visible "<selector>"`: check if element is visible (true/false)
- `vibium is-enabled "<selector>"`: check if element is enabled (true/false)
- `vibium is-checked "<selector>"`: check if checkbox/radio is checked (true/false)
Waiting
- `vibium wait "<selector>"`: wait for element (`--state visible|hidden|attached`, `--timeout ms`)
- `vibium wait-for-url "<pattern>"`: wait until URL contains substring (`--timeout ms`)
- `vibium wait-for-load`: wait until page is fully loaded (`--timeout ms`)
- `vibium wait-for-text "<text>"`: wait until text appears on page (`--timeout ms`)
- `vibium wait-for-fn "<expression>"`: wait until JS expression returns truthy (`--timeout ms`)
- `vibium sleep <ms>`: pause execution (max 30000ms)
Capture
- `vibium screenshot -o file.png`: capture screenshot (`--full-page`, `--annotate`)
- `vibium pdf -o file.pdf`: save page as PDF
Dialogs
- `vibium dialog accept [text]`: accept dialog (optionally with prompt text)
- `vibium dialog dismiss`: dismiss dialog
Emulation
- `vibium set-viewport <width> <height>`: set viewport size (`--dpr` for device pixel ratio)
- `vibium viewport`: get current viewport dimensions
- `vibium window`: get OS browser window dimensions and state
- `vibium set-window <width> <height> [x] [y]`: set window size and position (`--state`)
- `vibium emulate-media`: override CSS media features (`--color-scheme`, `--reduced-motion`, `--forced-colors`, `--contrast`, `--media`)
- `vibium set-geolocation <lat> <lng>`: override geolocation (`--accuracy`)
- `vibium set-content "<html>"`: replace page HTML (`--stdin` to read from stdin)
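As one illustration of combining emulation with capture, the sketch below takes a screenshot at several viewport sizes. `capture_viewports` is a hypothetical helper, and the three sizes and the `shot-<size>.png` filenames are arbitrary example values. It uses only `go`, `set-viewport`, and `screenshot -o` as listed in this reference.

```shell
# capture_viewports (hypothetical): screenshot one URL at several viewport sizes.
capture_viewports() {
  local url="$1" size width height
  vibium go "$url" || return 1
  for size in 375x667 768x1024 1440x900; do
    width=${size%x*}    # text before the "x"
    height=${size#*x}   # text after the "x"
    vibium set-viewport "$width" "$height" &&
      vibium screenshot -o "shot-${size}.png" || return 1
  done
}
```

Calling `capture_viewports https://example.com` would leave three PNG files in the current directory, one per size.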
Frames
- `vibium frames`: list all iframes on the page
- `vibium frame "<nameOrUrl>"`: find a frame by name or URL substring
File Upload
- `vibium upload "<selector>" <files...>`: set files on input[type=file]
Tracing
- `vibium trace start`: start recording (`--screenshots`, `--snapshots`, `--name`)
- `vibium trace stop`: stop recording and save ZIP (`-o path`)
Cookies
- `vibium cookies`: list all cookies
- `vibium cookies set <name> <value>`: set a cookie
- `vibium cookies clear`: clear all cookies
Storage State
- `vibium storage-state`: export cookies + localStorage + sessionStorage (`-o state.json`)
- `vibium restore-storage <path>`: restore state from JSON file
Downloads
- `vibium download set-dir <path>`: set download directory
Tabs
- `vibium tabs`: list open tabs
- `vibium tab-new [url]`: open new tab
- `vibium tab-switch <index|url>`: switch tab
- `vibium tab-close [index]`: close tab
Debug
- `vibium highlight "<selector>"`: highlight element visually (3 seconds)
Session
- `vibium quit`: close the browser (daemon keeps running)
- `vibium close`: alias for quit
- `vibium daemon start`: start background browser
- `vibium daemon status`: check if running
- `vibium daemon stop`: stop daemon
Common Patterns
Ref-based workflow (recommended for AI)
vibium go https://example.com
vibium map
vibium click @e1
vibium map # re-map after interaction
Verify action worked
vibium map
vibium click @e3
vibium diff map # see what changed
Read a page
vibium go https://example.com && vibium text
Fill a form (end-to-end)
vibium go https://example.com/login
vibium map
# Look at map output to identify form fields
vibium fill @e1 "user@example.com"
vibium fill @e2 "secret"
vibium click @e3
vibium wait-for-url "/dashboard"
vibium screenshot -o after-login.png
Scoped map (large pages)
vibium map --selector "nav" # Only map elements in <nav>
vibium map --selector "#sidebar" # Only map elements in #sidebar
vibium map --selector "form" # Only map form controls
Semantic find (no CSS selectors needed)
vibium find --text "Sign In" # → @e1 [button] "Sign In"
vibium find --label "Email" # → @e1 [input] placeholder="Email"
vibium click @e1 # Click the found element
vibium find --placeholder "Search..." # → @e1 [input] placeholder="Search..."
vibium find --testid "submit-btn" # → @e1 [button] "Submit"
vibium find --alt "Company logo" # → @e1 [img] alt="Company logo"
vibium find --title "Close" # → @e1 [button] title="Close"
vibium find --xpath "//a[@href='/about']" # → @e1 [a] "About"
Authentication with state persistence
# Log in once and save state
vibium go https://app.example.com/login
vibium fill "input[name=email]" "user@example.com"
vibium fill "input[name=password]" "secret"
vibium click "button[type=submit]"
vibium wait-for-url "/dashboard"
vibium storage-state -o auth.json
# Restore in a later session (skips login)
vibium restore-storage auth.json
vibium go https://app.example.com/dashboard
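The two halves of this pattern can be folded into one function that logs in only when no saved state exists. This is a minimal sketch: `ensure_logged_in` is a hypothetical helper, and the URL, selectors, credentials, and `auth.json` default are the placeholder values from the pattern above. It does not check whether a saved session has expired.

```shell
# ensure_logged_in (hypothetical): restore saved state if present,
# otherwise log in through the form and save the state for next time.
ensure_logged_in() {
  local state_file="${1:-auth.json}"
  if [ -f "$state_file" ]; then
    # A saved session exists: restore it and skip the login form.
    vibium restore-storage "$state_file"
  else
    vibium go https://app.example.com/login &&
      vibium fill "input[name=email]" "user@example.com" &&
      vibium fill "input[name=password]" "secret" &&
      vibium click "button[type=submit]" &&
      vibium wait-for-url "/dashboard" &&
      vibium storage-state -o "$state_file"
  fi
}
```

A script could then run `ensure_logged_in && vibium go https://app.example.com/dashboard`. Deleting the state file forces a fresh login.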
Extract structured data
vibium go https://example.com
vibium eval "JSON.stringify([...document.querySelectorAll('a')].map(a => ({text: a.textContent.trim(), href: a.href})))"
Check page structure without rendering
vibium go https://example.com && vibium a11y-tree
Multi-tab workflow
vibium tab-new https://docs.example.com
vibium text "h1"
vibium tab-switch 0
Annotated screenshot
vibium screenshot -o annotated.png --annotate
Inspect an element
vibium attr "a" "href"
vibium value "input[name=email]"
vibium is-visible ".modal"
Save as PDF
vibium go https://example.com && vibium pdf -o page.pdf
Eval / JavaScript
vibium eval is the escape hatch for any DOM query or mutation the CLI doesn't cover directly.
Simple expressions: use single quotes.
vibium eval 'document.title'
vibium eval 'document.querySelectorAll("li").length'
Complex scripts: use `--stdin` with a heredoc.
vibium eval --stdin <<'EOF'
const rows = [...document.querySelectorAll('table tbody tr')];
JSON.stringify(rows.map(r => {
  const cells = r.querySelectorAll('td');
  return { name: cells[0].textContent.trim(), price: cells[1].textContent.trim() };
}));
EOF
JSON output: use `--json` to get machine-readable output.
vibium eval --json 'JSON.stringify({url: location.href, title: document.title})'
Timeouts and Waiting
All interaction commands (`click`, `fill`, `type`, etc.) auto-wait for the target element to be actionable. You usually don't need explicit waits.
Use explicit waits when:
- Waiting for navigation: `vibium wait-for-url "/dashboard"` (after clicking a link that navigates)
- Waiting for content: `vibium wait-for-text "Success"` (after form submission, wait for confirmation)
- Waiting for element: `vibium wait ".modal"` (wait for a modal to appear)
- Waiting for page load: `vibium wait-for-load` (after navigation to a slow page)
- Waiting for JS condition: `vibium wait-for-fn "window.appReady === true"` (wait for app initialization)
- Fixed delay (last resort): `vibium sleep 2000` (only when no better signal exists; max 30s)
All wait commands accept `--timeout <ms>` (default varies by command).
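A minimal sketch of an explicit wait after a form submission follows. `submit_and_confirm` is a hypothetical helper, and the 10000 ms default is an arbitrary example value. It assumes that a wait command exits non-zero when its timeout elapses, which this reference does not state.

```shell
# submit_and_confirm (hypothetical): click a submit control, then wait for
# confirmation text. Assumes wait-for-text exits non-zero on timeout.
submit_and_confirm() {
  local submit_selector="$1" expected_text="$2" timeout_ms="${3:-10000}"
  # click auto-waits for the element, so no explicit wait is needed before it.
  vibium click "$submit_selector" || return 1
  # The confirmation only appears after the submission completes, so wait for it.
  if ! vibium wait-for-text "$expected_text" --timeout "$timeout_ms"; then
    echo "timed out waiting for: $expected_text" >&2
    return 1
  fi
}
```

For example, `submit_and_confirm "button[type=submit]" "Success" 5000` would click the button and wait up to five seconds for the word "Success".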
Ref Lifecycle
Refs (`@e1`, `@e2`) are invalidated when the page changes. Always re-map after:
- Clicking links or buttons that navigate
- Form submissions
- Dynamic content loading (dropdowns, modals)
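One way to make re-mapping hard to forget is a thin wrapper that always maps again after a click. `click_then_remap` is a hypothetical helper name; it uses only `vibium click` and `vibium map`.

```shell
# click_then_remap (hypothetical): click a ref, then print a fresh map so
# later commands never reuse refs from the previous page state.
click_then_remap() {
  vibium click "$1" && vibium map
}
```

After `click_then_remap @e3`, refs for the next step should be read from the newly printed map rather than from any earlier output.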
Global Flags
| Flag | Description |
|---|---|
| | Hide browser window |
| `--json` | Output as JSON |
| | Debug logging |
Tips
- All click/type/hover/fill actions auto-wait for the element to be actionable
- All selector arguments also accept `@ref` from `vibium map`
- Use `vibium map` before interacting to discover interactive elements
- Use `vibium map --selector` to reduce noise on large pages
- Use `vibium fill` to replace a field's value, `vibium type` to append to it
- Use `vibium find --text` / `--label` / `--testid` for semantic element lookup (more reliable than CSS selectors)
- Use `vibium find --role` for ARIA-role-based lookup
- Use `vibium a11y-tree` to understand page structure without visual rendering
- Use `vibium text "<selector>"` to read specific sections
- Use `vibium diff map` after interactions to see what changed
- `vibium eval` is the escape hatch for complex DOM queries
- `vibium check` / `vibium uncheck` are idempotent: safe to call without checking state first
- Screenshots save to the current directory by default (`-o` to change)
- Use `vibium storage-state` / `vibium restore-storage` to persist auth across sessions