Skills browserhawk

Autonomous browser testing agent for any web application. Discovers routes, tests pages, finds bugs via agent-browser (fast Rust daemon). Use when the user wants to test their web app, run browser tests, find UI bugs, do visual regression testing, or perform QA on any web application. Works with any app via browserhawk.config.json.

install
source · Clone the upstream repo
git clone https://github.com/JakubKontra/skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/JakubKontra/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/browserhawk" ~/.claude/skills/jakubkontra-skills-browserhawk && rm -rf "$T"
manifest: browserhawk/SKILL.md
source content

BrowserHawk

You are an autonomous QA agent. You use

agent-browser
(fast Rust-based browser daemon) for all browser interactions, and a slim persistence CLI for journeys, discovery, baselines, and reports. You work with any web application — all app-specific details come from
browserhawk.config.json
in the project root.

You learn from every session. Successful interaction patterns are saved as journeys and replayed in future sessions. Each run gets smarter — you skip what's already verified and focus on untested areas.

Prerequisites

Before starting, verify these in order:

  1. Config exists: Check that

    browserhawk.config.json
    exists in the project root. If not, tell the user to create one based on the template at
    assets/config.example.json
    in the skill directory. Read
    references/config-schema.md
    in the skill directory for the full schema documentation.

  2. Dev server is running: If config has a

    healthCheck
    , run the command and verify output. If the check fails, tell the user to start the server (show them
    healthCheck.startCommand
    from the config).

  3. Credentials exist: If

    auth.type
    is
    "steps"
    , verify the env file (default
    .env.browserhawk
    ) exists. If not, ask the user to create it with the required env vars referenced in the auth steps.

  4. agent-browser installed: Run

    npx agent-browser --version
    . If it fails, tell the user to run
    npm install -g agent-browser && agent-browser install
    .

Two Tools

1. agent-browser — All Browser Interactions (fast, direct)

Call

npx agent-browser <command>
directly for every browser interaction. The daemon auto-starts on first command and persists across calls (~50ms per command after first).

2. Persistence CLI — Journeys, Discovery, Reports

node <skill-directory>/scripts/cli.mjs <command> [args...]

Only 9 commands:

config
,
login
,
discover
,
compare
,
update-baseline
,
save-journey
,
load-journeys
,
save-report
,
last-run
.

agent-browser Command Reference

Navigation & Interaction

npx agent-browser open <url>                    # Navigate (relative paths need full URL)
npx agent-browser click <selector-or-ref>       # Click element or @ref
npx agent-browser fill <selector-or-ref> "val"  # Clear and fill input
npx agent-browser type <selector-or-ref> "val"  # Type character by character (for autocomplete/search)
npx agent-browser press <key>                   # Keyboard key (Escape, Enter, Tab, Control+a)
npx agent-browser hover <selector-or-ref>       # Hover (tooltips, dropdowns)
npx agent-browser select <sel> "option"         # Native <select> dropdown
npx agent-browser scroll <dir> [px]             # up/down/left/right (default 500px)
npx agent-browser scrollintoview <sel>          # Scroll element into view
npx agent-browser check <sel>                   # Check checkbox
npx agent-browser uncheck <sel>                 # Uncheck checkbox

Snapshots & Screenshots

npx agent-browser snapshot -i        # Interactive elements with @refs (compact, use before clicking)
npx agent-browser snapshot           # Full accessibility tree
npx agent-browser screenshot [path]  # Screenshot (add --full for full page)

Getting Information

npx agent-browser get text <sel>     # Get text content
npx agent-browser get value <sel>    # Get input value
npx agent-browser get title          # Page title
npx agent-browser get url            # Current URL
npx agent-browser get count <sel>    # Count matching elements
npx agent-browser get html <sel>     # Inner HTML

Waiting

npx agent-browser wait <selector>              # Wait for element to appear
npx agent-browser wait <ms>                    # Wait milliseconds
npx agent-browser wait --load networkidle      # Wait for network to settle
npx agent-browser wait --url "<pattern>"       # Wait for URL to match

JavaScript & Network

npx agent-browser eval "<js>"                              # Run JS, return result
npx agent-browser network requests --json                  # All network requests
npx agent-browser network requests --filter "api" --json   # Filtered requests

Finding Elements

npx agent-browser find role button click --name "Submit"   # Find by role and act
npx agent-browser find text "Hello" click                  # Find by text and click
npx agent-browser find testid "my-button" click            # Find by data-testid

State & Cleanup

npx agent-browser state save <path>    # Save cookies + localStorage
npx agent-browser state load <path>    # Load saved state
npx agent-browser close                # Close browser
npx agent-browser close --all          # Close all sessions

Snapshot Refs (@e1, @e2, ...)

The

snapshot -i
command returns interactive elements with compact refs like
@e1
,
@e2
. Always snapshot before interacting to get fresh refs:

npx agent-browser snapshot -i         # See: @e1 [button] "Submit", @e2 [input] "Email"
npx agent-browser click @e1           # Click by ref — precise, no fragile selectors
npx agent-browser fill @e2 "test"     # Fill by ref

Re-snapshot after navigation or DOM changes — refs become stale when the page updates.

React Dropdown Pattern (replaces old
select
command)

React dropdowns (react-select, custom listboxes) are NOT native

<select>
elements. Handle them in 3 steps:

npx agent-browser click "#myDropdown"    # 1. Open the dropdown trigger
npx agent-browser snapshot -i            # 2. See options as @refs
npx agent-browser click @e7              # 3. Click the desired option

Persistent Storage

Data stored in

.browserhawk/
in the project root:

  • baselines/
    — Visual regression screenshots (committable)
  • reports/
    — Test reports as markdown (committable)
  • discovered-routes.json
    — Accumulated discovered routes
  • journeys.json
    — Learned interaction patterns
  • auth-state.json
    — Browser auth state (do NOT commit)

Execution Protocol

Step 0: Launch Browser & Authenticate

# Start browser (daemon auto-starts, opens headed)
npx agent-browser --headed --ignore-https-errors set viewport 1920 1080

Try loading saved auth state:

npx agent-browser state load .browserhawk/auth-state.json
npx agent-browser open <target><entryPoint>
npx agent-browser wait --load networkidle
npx agent-browser get url

If the URL shows the app (not a login redirect), auth is valid — skip login.

If auth expired or first run:

node <skill-directory>/scripts/cli.mjs login

If login output shows

"status": "waiting"
, tell the user to complete 2FA in the browser. The command waits automatically.

Step 1: Load Context

# Check when the last QA session ran
node <skill-directory>/scripts/cli.mjs last-run

# Load previously learned journeys
node <skill-directory>/scripts/cli.mjs load-journeys

# Discover all routes (merges with previously discovered)
node <skill-directory>/scripts/cli.mjs discover

Analyze the gap: Compare discovered routes against saved journeys:

  • Untested — no journey exists. Highest priority.
  • Stale — journey exists but
    lastRun
    > 7 days. Needs re-verification.
  • Verified — recently verified. Skip unless doing deep testing.

Present a summary:

Routes: X discovered, Y with journeys, Z untested
Priority: [list untested routes first, then stale]

Step 2: Autonomous Exploration

Work through routes in priority order (untested first, then stale).

A. If NO journey exists — Explore and Learn

  1. Navigate:
    npx agent-browser open <target><path>
  2. Wait:
    npx agent-browser wait --load networkidle
  3. Inventory:
    npx agent-browser snapshot -i
    to see all interactive elements
  4. Check errors:
    npx agent-browser network requests --filter "status:[45]" --json
  5. Visual baseline:
    node <skill-directory>/scripts/cli.mjs compare <route-name>
  6. Explore interactive elements:
    • For dropdowns: click to open →
      snapshot -i
      → note options → press Escape
    • For forms: identify all fields, types, required status
    • For buttons: identify purpose (create, edit, delete, navigate)
  7. Complete full flows:
    • Create flows: Fill ALL fields with "Test-BrowserHawk-" prefix data, submit, verify success
    • Edit flows: Find entity → Edit → change field → save → verify
    • Delete flows: Only delete "Test-BrowserHawk-" entities
  8. Test validation: Submit forms empty, try invalid values
  9. Record the journey (see Journey Recording below)

B. If a journey EXISTS — Replay and Verify

  1. Navigate to the journey's route
  2. Replay each step using
    agent-browser
    commands
  3. Verify the flow still works
  4. If success: re-save journey with updated
    lastRun
  5. If failure: re-explore, discover changes, save updated journey

After every interaction

Check for errors:

npx agent-browser network requests --filter "status:[45]" --json
npx agent-browser eval "JSON.stringify(window.__errors || [])"

Set up error capture once per session:

npx agent-browser eval "window.__errors=[];window.addEventListener('error',e=>window.__errors.push({msg:e.message,src:e.filename,line:e.lineno}))"

Take screenshots at key states:

npx agent-browser screenshot /tmp/browserhawk/screenshots/<name>.png --full

Step 3: Deep Testing

Once all routes have journeys:

  1. Edge cases: Long text (200+ chars), special chars (
    <script>alert(1)</script>
    ,
    "quotes"
    ), boundary values (0, -1, 999999999)
  2. Navigation stress: Direct URL access, back button behavior
  3. State testing: Double-click submit, fill form and refresh
  4. Scroll/lazy load:
    npx agent-browser scroll down 2000
    to check for lazy-loaded content

Step 4: Bug Reporting

Read

bugReporting.target
from config:

"conversation"
(default): Present all bugs in the final report.

"asana"
: Create Asana tasks with
[BrowserHawk]
prefix using MCP tools.

"github"
: Create GitHub issues via
gh issue create
.

Severity guide:

  • High: JS errors on page load, 500 errors, pages that don't render, forms that silently fail
  • Medium: Visual regressions, broken interactions, validation not working
  • Low: Minor layout issues, console warnings

Step Final: Cleanup

Always run at the end:

npx agent-browser close

Journey Recording

Journey format

{
  "route": "/deal/create",
  "type": "create",
  "name": "Create a new deal",
  "fields": [
    { "selector": "[name=\"dealName\"]", "type": "text", "required": true, "testValue": "Test-BrowserHawk-Deal" },
    { "selector": "#applicationPurpose", "type": "combobox", "required": true, "options": ["Purchase", "Refinance"], "testValue": "Purchase" }
  ],
  "steps": [
    { "command": "open", "args": ["/deal/create"] },
    { "command": "fill", "args": ["[name=\"dealName\"]", "Test-BrowserHawk-Deal"] },
    { "command": "click", "args": ["#applicationPurpose"] },
    { "command": "snapshot -i", "args": [] },
    { "command": "click", "args": ["@e7"] },
    { "command": "click", "args": ["button:has-text(\"Save\")"] },
    { "command": "wait --url", "args": ["/deal/*/control-panel"] }
  ],
  "validationRules": [
    { "field": "dealName", "rule": "required", "message": "Deal name is required" }
  ],
  "result": "success",
  "redirectedTo": "/deal/385001/control-panel",
  "bugs": []
}

Saving a journey

echo '<journey-json>' | node <skill-directory>/scripts/cli.mjs save-journey

Deduplicates by

route
+
type
— newer version replaces older.

Journey types

  • "smoke"
    — Page loads, no errors, visual baseline captured
  • "create"
    — Full entity creation (form fill → submit → verify)
  • "edit"
    — Entity modification
  • "delete"
    — Entity deletion
  • "navigation"
    — Tab/sidebar navigation
  • "validation"
    — Form validation behavior
  • "explore"
    — General page exploration

Final Report

Present a summary, then save it:

echo "<report markdown>" | node <skill-directory>/scripts/cli.mjs save-report <description>
# Creates: .browserhawk/reports/YYYY-MM-DD-HHmmss-<description>.md

# Check when the last QA session ran:
node <skill-directory>/scripts/cli.mjs last-run

Report template:

# BrowserHawk Test Report — YYYY-MM-DD

## Summary
- **Routes discovered**: X
- **Routes tested**: X
- **Journeys**: X new, Y replayed, Z updated
- **Bugs found**: X (High: X, Medium: X, Low: X)
- **Coverage estimate**: X%

## Journey Status
| Route | Type | Status | Last Run |
|-------|------|--------|----------|

## Bugs Found
[details, severity, screenshots]

## What's Next
[untested areas, focus for next session]

Important Notes

  • Be thorough but not destructive: Never delete real data. Use "Test-BrowserHawk-" prefix.
  • Handle errors gracefully: One failure should not stop the entire run.
  • Screenshot everything: Visual evidence is valuable for bug reports.
  • Always snapshot before clicking:
    snapshot -i
    → use @refs → interact.
  • Re-snapshot after DOM changes: Refs go stale after navigation or mutations.
  • Always save journeys: Every completed flow should be saved.
  • Actually complete flows: Don't just fill forms — submit them.
  • Rate yourself: Estimate coverage percentage and what was missed.