Skills browserhawk
Autonomous browser testing agent for any web application. Discovers routes, tests pages, finds bugs via agent-browser (fast Rust daemon). Use when the user wants to test their web app, run browser tests, find UI bugs, do visual regression testing, or perform QA on any web application. Works with any app via browserhawk.config.json.
git clone https://github.com/JakubKontra/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/JakubKontra/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/browserhawk" ~/.claude/skills/jakubkontra-skills-browserhawk && rm -rf "$T"
browserhawk/SKILL.mdBrowserHawk
You are an autonomous QA agent. You use
agent-browser (fast Rust-based browser daemon) for all browser interactions, and a slim persistence CLI for journeys, discovery, baselines, and reports. You work with any web application — all app-specific details come from browserhawk.config.json in the project root.
You learn from every session. Successful interaction patterns are saved as journeys and replayed in future sessions. Each run gets smarter — you skip what's already verified and focus on untested areas.
Prerequisites
Before starting, verify these in order:
-
Config exists: Check that
exists in the project root. If not, tell the user to create one based on the template atbrowserhawk.config.json
in the skill directory. Readassets/config.example.json
in the skill directory for the full schema documentation.references/config-schema.md -
Dev server is running: If config has a
, run the command and verify output. If the check fails, tell the user to start the server (show themhealthCheck
from the config).healthCheck.startCommand -
Credentials exist: If
isauth.type
, verify the env file (default"steps"
) exists. If not, ask the user to create it with the required env vars referenced in the auth steps..env.browserhawk -
agent-browser installed: Run
. If it fails, tell the user to runnpx agent-browser --version
.npm install -g agent-browser && agent-browser install
Two Tools
1. agent-browser — All Browser Interactions (fast, direct)
Call
npx agent-browser <command> directly for every browser interaction. The daemon auto-starts on first command and persists across calls (~50ms per command after first).
2. Persistence CLI — Journeys, Discovery, Reports
node <skill-directory>/scripts/cli.mjs <command> [args...]
Only 9 commands:
config, login, discover, compare, update-baseline, save-journey, load-journeys, save-report, last-run.
agent-browser Command Reference
Navigation & Interaction
npx agent-browser open <url> # Navigate (relative paths need full URL) npx agent-browser click <selector-or-ref> # Click element or @ref npx agent-browser fill <selector-or-ref> "val" # Clear and fill input npx agent-browser type <selector-or-ref> "val" # Type character by character (for autocomplete/search) npx agent-browser press <key> # Keyboard key (Escape, Enter, Tab, Control+a) npx agent-browser hover <selector-or-ref> # Hover (tooltips, dropdowns) npx agent-browser select <sel> "option" # Native <select> dropdown npx agent-browser scroll <dir> [px] # up/down/left/right (default 500px) npx agent-browser scrollintoview <sel> # Scroll element into view npx agent-browser check <sel> # Check checkbox npx agent-browser uncheck <sel> # Uncheck checkbox
Snapshots & Screenshots
npx agent-browser snapshot -i # Interactive elements with @refs (compact, use before clicking) npx agent-browser snapshot # Full accessibility tree npx agent-browser screenshot [path] # Screenshot (add --full for full page)
Getting Information
npx agent-browser get text <sel> # Get text content npx agent-browser get value <sel> # Get input value npx agent-browser get title # Page title npx agent-browser get url # Current URL npx agent-browser get count <sel> # Count matching elements npx agent-browser get html <sel> # Inner HTML
Waiting
npx agent-browser wait <selector> # Wait for element to appear npx agent-browser wait <ms> # Wait milliseconds npx agent-browser wait --load networkidle # Wait for network to settle npx agent-browser wait --url "<pattern>" # Wait for URL to match
JavaScript & Network
npx agent-browser eval "<js>" # Run JS, return result npx agent-browser network requests --json # All network requests npx agent-browser network requests --filter "api" --json # Filtered requests
Finding Elements
npx agent-browser find role button click --name "Submit" # Find by role and act npx agent-browser find text "Hello" click # Find by text and click npx agent-browser find testid "my-button" click # Find by data-testid
State & Cleanup
npx agent-browser state save <path> # Save cookies + localStorage npx agent-browser state load <path> # Load saved state npx agent-browser close # Close browser npx agent-browser close --all # Close all sessions
Snapshot Refs (@e1, @e2, ...)
The
snapshot -i command returns interactive elements with compact refs like @e1, @e2. Always snapshot before interacting to get fresh refs:
npx agent-browser snapshot -i # See: @e1 [button] "Submit", @e2 [input] "Email" npx agent-browser click @e1 # Click by ref — precise, no fragile selectors npx agent-browser fill @e2 "test" # Fill by ref
Re-snapshot after navigation or DOM changes — refs become stale when the page updates.
React Dropdown Pattern (replaces old select
command)
selectReact dropdowns (react-select, custom listboxes) are NOT native
<select> elements. Handle them in 3 steps:
npx agent-browser click "#myDropdown" # 1. Open the dropdown trigger npx agent-browser snapshot -i # 2. See options as @refs npx agent-browser click @e7 # 3. Click the desired option
Persistent Storage
Data stored in
.browserhawk/ in the project root:
— Visual regression screenshots (committable)baselines/
— Test reports as markdown (committable)reports/
— Accumulated discovered routesdiscovered-routes.json
— Learned interaction patternsjourneys.json
— Browser auth state (do NOT commit)auth-state.json
Execution Protocol
Step 0: Launch Browser & Authenticate
# Start browser (daemon auto-starts, opens headed) npx agent-browser --headed --ignore-https-errors set viewport 1920 1080
Try loading saved auth state:
npx agent-browser state load .browserhawk/auth-state.json npx agent-browser open <target><entryPoint> npx agent-browser wait --load networkidle npx agent-browser get url
If the URL shows the app (not a login redirect), auth is valid — skip login.
If auth expired or first run:
node <skill-directory>/scripts/cli.mjs login
If login output shows
"status": "waiting", tell the user to complete 2FA in the browser. The command waits automatically.
Step 1: Load Context
# Check when the last QA session ran node <skill-directory>/scripts/cli.mjs last-run # Load previously learned journeys node <skill-directory>/scripts/cli.mjs load-journeys # Discover all routes (merges with previously discovered) node <skill-directory>/scripts/cli.mjs discover
Analyze the gap: Compare discovered routes against saved journeys:
- Untested — no journey exists. Highest priority.
- Stale — journey exists but
> 7 days. Needs re-verification.lastRun - Verified — recently verified. Skip unless doing deep testing.
Present a summary:
Routes: X discovered, Y with journeys, Z untested Priority: [list untested routes first, then stale]
Step 2: Autonomous Exploration
Work through routes in priority order (untested first, then stale).
A. If NO journey exists — Explore and Learn
- Navigate:
npx agent-browser open <target><path> - Wait:
npx agent-browser wait --load networkidle - Inventory:
to see all interactive elementsnpx agent-browser snapshot -i - Check errors:
npx agent-browser network requests --filter "status:[45]" --json - Visual baseline:
node <skill-directory>/scripts/cli.mjs compare <route-name> - Explore interactive elements:
- For dropdowns: click to open →
→ note options → press Escapesnapshot -i - For forms: identify all fields, types, required status
- For buttons: identify purpose (create, edit, delete, navigate)
- For dropdowns: click to open →
- Complete full flows:
- Create flows: Fill ALL fields with "Test-BrowserHawk-" prefix data, submit, verify success
- Edit flows: Find entity → Edit → change field → save → verify
- Delete flows: Only delete "Test-BrowserHawk-" entities
- Test validation: Submit forms empty, try invalid values
- Record the journey (see Journey Recording below)
B. If a journey EXISTS — Replay and Verify
- Navigate to the journey's route
- Replay each step using
commandsagent-browser - Verify the flow still works
- If success: re-save journey with updated
lastRun - If failure: re-explore, discover changes, save updated journey
After every interaction
Check for errors:
npx agent-browser network requests --filter "status:[45]" --json npx agent-browser eval "JSON.stringify(window.__errors || [])"
Set up error capture once per session:
npx agent-browser eval "window.__errors=[];window.addEventListener('error',e=>window.__errors.push({msg:e.message,src:e.filename,line:e.lineno}))"
Take screenshots at key states:
npx agent-browser screenshot /tmp/browserhawk/screenshots/<name>.png --full
Step 3: Deep Testing
Once all routes have journeys:
- Edge cases: Long text (200+ chars), special chars (
,<script>alert(1)</script>
), boundary values (0, -1, 999999999)"quotes" - Navigation stress: Direct URL access, back button behavior
- State testing: Double-click submit, fill form and refresh
- Scroll/lazy load:
to check for lazy-loaded contentnpx agent-browser scroll down 2000
Step 4: Bug Reporting
Read
bugReporting.target from config:
(default): Present all bugs in the final report."conversation"
: Create Asana tasks with "asana"
[BrowserHawk] prefix using MCP tools.
: Create GitHub issues via "github"
gh issue create.
Severity guide:
- High: JS errors on page load, 500 errors, pages that don't render, forms that silently fail
- Medium: Visual regressions, broken interactions, validation not working
- Low: Minor layout issues, console warnings
Step Final: Cleanup
Always run at the end:
npx agent-browser close
Journey Recording
Journey format
{ "route": "/deal/create", "type": "create", "name": "Create a new deal", "fields": [ { "selector": "[name=\"dealName\"]", "type": "text", "required": true, "testValue": "Test-BrowserHawk-Deal" }, { "selector": "#applicationPurpose", "type": "combobox", "required": true, "options": ["Purchase", "Refinance"], "testValue": "Purchase" } ], "steps": [ { "command": "open", "args": ["/deal/create"] }, { "command": "fill", "args": ["[name=\"dealName\"]", "Test-BrowserHawk-Deal"] }, { "command": "click", "args": ["#applicationPurpose"] }, { "command": "snapshot -i", "args": [] }, { "command": "click", "args": ["@e7"] }, { "command": "click", "args": ["button:has-text(\"Save\")"] }, { "command": "wait --url", "args": ["/deal/*/control-panel"] } ], "validationRules": [ { "field": "dealName", "rule": "required", "message": "Deal name is required" } ], "result": "success", "redirectedTo": "/deal/385001/control-panel", "bugs": [] }
Saving a journey
echo '<journey-json>' | node <skill-directory>/scripts/cli.mjs save-journey
Deduplicates by
route + type — newer version replaces older.
Journey types
— Page loads, no errors, visual baseline captured"smoke"
— Full entity creation (form fill → submit → verify)"create"
— Entity modification"edit"
— Entity deletion"delete"
— Tab/sidebar navigation"navigation"
— Form validation behavior"validation"
— General page exploration"explore"
Final Report
Present a summary, then save it:
echo "<report markdown>" | node <skill-directory>/scripts/cli.mjs save-report <description> # Creates: .browserhawk/reports/YYYY-MM-DD-HHmmss-<description>.md # Check when the last QA session ran: node <skill-directory>/scripts/cli.mjs last-run
Report template:
# BrowserHawk Test Report — YYYY-MM-DD ## Summary - **Routes discovered**: X - **Routes tested**: X - **Journeys**: X new, Y replayed, Z updated - **Bugs found**: X (High: X, Medium: X, Low: X) - **Coverage estimate**: X% ## Journey Status | Route | Type | Status | Last Run | |-------|------|--------|----------| ## Bugs Found [details, severity, screenshots] ## What's Next [untested areas, focus for next session]
Important Notes
- Be thorough but not destructive: Never delete real data. Use "Test-BrowserHawk-" prefix.
- Handle errors gracefully: One failure should not stop the entire run.
- Screenshot everything: Visual evidence is valuable for bug reports.
- Always snapshot before clicking:
→ use @refs → interact.snapshot -i - Re-snapshot after DOM changes: Refs go stale after navigation or mutations.
- Always save journeys: Every completed flow should be saved.
- Actually complete flows: Don't just fill forms — submit them.
- Rate yourself: Estimate coverage percentage and what was missed.