Squire Browser Use
Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, or extract information from web pages.
install
source · Clone the upstream repo
git clone https://github.com/eddiebelaval/squire
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/eddiebelaval/squire "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/browser-use" ~/.claude/skills/eddiebelaval-squire-browser-use && rm -rf "$T"
manifest: skills/browser-use/SKILL.md
Browser Automation with browser-use CLI
Core Workflows
Workflow 1: Page Interaction
- Navigate to URL with `browser-use open`
- Inspect elements with `browser-use state`
- Interact using element indices (click, type, select)
- Verify result with state or screenshot
Workflow 2: Form Filling
- Open target page
- Get element indices via state
- Fill each field using `browser-use input`
- Submit and verify success
Workflow 3: Data Extraction
- Navigate to data source
- Use Python session for complex extraction
- Scroll through paginated content
- Export results
The `browser-use` command provides fast, persistent browser automation. It maintains browser sessions across commands, enabling complex multi-step workflows.
Quick Start
    browser-use open https://example.com    # Navigate to URL
    browser-use state                       # Get page elements with indices
    browser-use click 5                     # Click element by index
    browser-use type "Hello World"          # Type text
    browser-use screenshot                  # Take screenshot
    browser-use close                       # Close browser
Core Workflow
- Navigate: `browser-use open <url>` - Opens URL (starts browser if needed)
- Inspect: `browser-use state` - Returns clickable elements with indices
- Interact: Use indices from state to interact (`browser-use click 5`, `browser-use input 3 "text"`)
- Verify: `browser-use state` or `browser-use screenshot` to confirm actions
- Repeat: Browser stays open between commands
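The cycle above can be scripted from Python. The following is a minimal sketch, not part of the skill itself: `workflow_commands` and `run_all` are hypothetical helper names, and stopping on a non-zero exit code assumes the CLI reports failures that way, which this document does not confirm.

```python
import subprocess


def workflow_commands(url, interactions):
    """Build argv lists for one navigate / inspect / interact / verify cycle."""
    commands = [
        ["browser-use", "open", url],  # Navigate (starts the browser if needed)
        ["browser-use", "state"],      # Inspect: list elements with indices
    ]
    for step in interactions:          # Interact, e.g. ("click", "5")
        commands.append(["browser-use", *step])
    commands.append(["browser-use", "state"])  # Verify the result
    return commands


def run_all(commands, runner=subprocess.run):
    """Run each command in order; return the first argv that fails, else None.

    Assumption: a failed command exits with a non-zero return code.
    """
    for argv in commands:
        result = runner(argv)
        if result.returncode != 0:
            return argv
    return None


# Build the plan only; nothing is executed until run_all(plan) is called.
plan = workflow_commands(
    "https://example.com",
    [("click", "5"), ("input", "3", "text")],
)
```

Passing a different `runner` makes it possible to dry-run the plan without a browser, which is useful when checking a script before pointing it at a real page.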
Browser Modes
    browser-use --browser chromium open <url>            # Default: headless Chromium
    browser-use --browser chromium --headed open <url>   # Visible Chromium window
    browser-use --browser real open <url>                # User's Chrome with login sessions
    browser-use --browser remote open <url>              # Cloud browser (requires API key)
- chromium: Fast, isolated, headless by default
- real: Uses your Chrome with cookies, extensions, logged-in sessions
- remote: Cloud-hosted browser with proxy support (requires BROWSER_USE_API_KEY)
Commands
Navigation
    browser-use open <url>       # Navigate to URL
    browser-use back             # Go back in history
    browser-use scroll down      # Scroll down
    browser-use scroll up        # Scroll up
Page State
    browser-use state                        # Get URL, title, and clickable elements
    browser-use screenshot                   # Take screenshot (outputs base64)
    browser-use screenshot path.png          # Save screenshot to file
    browser-use screenshot --full path.png   # Full page screenshot
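When `browser-use screenshot` is run without a path, the base64 output has to be decoded before it can be used as an image. A minimal sketch follows. It assumes the command prints only the base64 text and that the image is a PNG; neither is stated in this document, and `decode_screenshot` and `save_screenshot` are hypothetical helper names.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def decode_screenshot(stdout_text):
    """Decode base64 screenshot text into raw image bytes.

    Assumption: the text contains nothing but the base64 payload.
    """
    data = base64.b64decode(stdout_text.strip())
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return data


def save_screenshot(stdout_text, path):
    """Write the decoded image to path and return the number of bytes written."""
    data = decode_screenshot(stdout_text)
    with open(path, "wb") as handle:
        handle.write(data)
    return len(data)
```

If the goal is only a file on disk, `browser-use screenshot path.png` is simpler; decoding is mainly useful when the image is passed on in memory.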
Interactions (use indices from `browser-use state`)
    browser-use click <index>             # Click element
    browser-use type "text"               # Type text into focused element
    browser-use input <index> "text"      # Click element, then type text
    browser-use keys "Enter"              # Send keyboard keys
    browser-use keys "Control+a"          # Send key combination
    browser-use select <index> "option"   # Select dropdown option
Tab Management
    browser-use switch <tab>      # Switch to tab by index
    browser-use close-tab         # Close current tab
    browser-use close-tab <tab>   # Close specific tab
JavaScript & Data
    browser-use eval "document.title"          # Execute JavaScript, return result
    browser-use extract "all product prices"   # Extract data using LLM (requires API key)
Python Execution (Persistent Session)
    browser-use python "x = 42"               # Set variable
    browser-use python "print(x)"             # Access variable (outputs: 42)
    browser-use python "print(browser.url)"   # Access browser object
    browser-use python --vars                 # Show defined variables
    browser-use python --reset                # Clear Python namespace
    browser-use python --file script.py       # Execute Python file
The Python session maintains state across commands. The `browser` object provides:
- `browser.url` - Current page URL
- `browser.title` - Page title
- `browser.goto(url)` - Navigate
- `browser.click(index)` - Click element
- `browser.type(text)` - Type text
- `browser.screenshot(path)` - Take screenshot
- `browser.scroll()` - Scroll page
- `browser.html` - Get page HTML
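A script for `browser-use python --file` could wrap these members in a small function. The sketch below is illustrative only: `capture_after_scrolling` is a hypothetical helper, `FakeBrowser` is a stand-in used so the code runs without a browser, and passing `"down"` to `scroll` follows the data extraction example in this document rather than a documented signature.

```python
def capture_after_scrolling(browser, path, scrolls=3):
    """Scroll down a few times, save a screenshot, and summarize the page."""
    for _ in range(scrolls):
        browser.scroll("down")
    browser.screenshot(path)
    return {
        "url": browser.url,
        "title": browser.title,
        "html_length": len(browser.html),
    }


class FakeBrowser:
    """Hypothetical stand-in for the injected browser object (dry runs only)."""

    url = "https://example.com/products"
    title = "Products"
    html = "<html></html>"

    def __init__(self):
        self.calls = []  # records every method call for inspection

    def scroll(self, direction):
        self.calls.append(("scroll", direction))

    def screenshot(self, path):
        self.calls.append(("screenshot", path))


fake = FakeBrowser()
summary = capture_after_scrolling(fake, "products.png", scrolls=2)
```

Inside a real session the function would be called with the session's own `browser` object in place of `fake`.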
Agent Tasks (Requires API Key)
    browser-use run "Fill the contact form with test data"   # Run AI agent
    browser-use run "Extract all product prices" --max-steps 50
Agent tasks use an LLM to complete multi-step browser tasks autonomously. They require `BROWSER_USE_API_KEY` or a configured LLM API key (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc.).
Session Management
    browser-use sessions      # List active sessions
    browser-use close         # Close current session
    browser-use close --all   # Close all sessions
Server Control
    browser-use server status   # Check if server is running
    browser-use server stop     # Stop server
    browser-use server logs     # View server logs
Global Options
| Option | Description |
|---|---|
| `--session NAME` | Use named session (default: "default") |
| `--browser MODE` | Browser mode: chromium, real, remote |
| `--headed` | Show browser window (chromium mode) |
|  | Chrome profile (real mode only) |
| `--json` | Output as JSON |
|  | Override API key |
Session behavior: All commands without `--session` use the same "default" session. The browser stays open and is reused across commands. Use `--session NAME` to run multiple browsers in parallel.
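When driving several sessions from a script, it helps to build the argument list in one place. The sketch below uses a hypothetical helper, `browser_use_argv`. It places global options before the subcommand, as the examples in this document do; that `--json` belongs in the same position is an assumption.

```python
def browser_use_argv(*command, session=None, browser=None, headed=False, json_output=False):
    """Return an argv list with global options placed before the subcommand."""
    argv = ["browser-use"]
    if session is not None:
        argv += ["--session", session]
    if browser is not None:
        # The three modes named in this document.
        if browser not in ("chromium", "real", "remote"):
            raise ValueError(f"unknown browser mode: {browser}")
        argv += ["--browser", browser]
    if headed:
        argv.append("--headed")
    if json_output:
        argv.append("--json")  # assumed to be accepted before the subcommand
    return argv + list(command)


work = browser_use_argv("open", "https://work.example.com", session="work")
default = browser_use_argv("state")  # no --session, so the "default" session is used
```

The resulting lists can be passed directly to `subprocess.run`, which avoids shell quoting problems with URLs and typed text.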
Examples
Form Submission
    browser-use open https://example.com/contact
    browser-use state
    # Shows: [0] input "Name", [1] input "Email", [2] textarea "Message", [3] button "Submit"
    browser-use input 0 "John Doe"
    browser-use input 1 "john@example.com"
    browser-use input 2 "Hello, this is a test message."
    browser-use click 3
    browser-use state   # Verify success
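Hard-coded indices break when the page layout changes, so a script may prefer to look them up by label. The following sketch parses the element listing shown in the comment above and derives the `input` and `click` commands from it. The `[index] tag "label"` pattern is taken from that comment only; the real `state` output may be formatted differently, and `parse_state` and `form_commands` are hypothetical helpers.

```python
import re

# Matches entries such as: [0] input "Name"
ELEMENT_RE = re.compile(r'\[(\d+)\]\s+(\w+)\s+"([^"]*)"')


def parse_state(text):
    """Map each element label to its (index, tag) pair."""
    return {label: (int(index), tag) for index, tag, label in ELEMENT_RE.findall(text)}


def form_commands(state_text, values, submit_label):
    """Build input commands for each labeled field, then a click on submit."""
    elements = parse_state(state_text)
    commands = []
    for label, value in values.items():
        index, _tag = elements[label]  # KeyError if the label is not on the page
        commands.append(["browser-use", "input", str(index), value])
    submit_index, _tag = elements[submit_label]
    commands.append(["browser-use", "click", str(submit_index)])
    return commands


state_text = '[0] input "Name", [1] input "Email", [2] textarea "Message", [3] button "Submit"'
commands = form_commands(
    state_text,
    {"Name": "John Doe", "Email": "john@example.com"},
    "Submit",
)
```

Labels are assumed to be unique on the page; if two elements share a label, the later one wins in this sketch.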
Multi-Session Workflows
    browser-use --session work open https://work.example.com
    browser-use --session personal open https://personal.example.com
    browser-use --session work state       # Check work session
    browser-use --session personal state   # Check personal session
    browser-use close --all                # Close both sessions
Data Extraction with Python
    browser-use open https://example.com/products
    browser-use python "
    products = []
    for i in range(20):
        browser.scroll('down')
        # products stays empty until extraction logic is added here
    browser.screenshot('products.png')
    "
    browser-use python "print(f'Captured {len(products)} products')"
Using Real Browser (Logged-In Sessions)
    browser-use --browser real open https://gmail.com
    # Uses your actual Chrome with existing login sessions
    browser-use state   # Already logged in!
Tips
- Always run `browser-use state` first to see available elements and their indices
- Use `--headed` for debugging to see what the browser is doing
- Sessions persist - the browser stays open between commands
- Use `--json` for parsing output programmatically
- Python variables persist across `browser-use python` commands within a session
- Real browser mode preserves your login sessions and extensions
Troubleshooting
Browser won't start?
    browser-use server stop           # Stop any stuck server
    browser-use --headed open <url>   # Try with visible window
Element not found?
    browser-use state         # Check current elements
    browser-use scroll down   # Element might be below fold
    browser-use state         # Check again
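The check, scroll, check again sequence can be turned into a bounded loop. This is a sketch under assumptions: `find_with_scrolling`, `cli_state`, and `cli_scroll_down` are hypothetical helpers, and matching on a substring assumes the element's visible label appears verbatim in the `state` output.

```python
import subprocess


def cli_state():
    """Return the text printed by `browser-use state`."""
    result = subprocess.run(["browser-use", "state"], capture_output=True, text=True)
    return result.stdout


def cli_scroll_down():
    """Scroll the current page down once."""
    subprocess.run(["browser-use", "scroll", "down"])


def find_with_scrolling(label, get_state=cli_state, scroll_down=cli_scroll_down, max_scrolls=5):
    """Return True once label appears in the state text, scrolling between checks."""
    for attempt in range(max_scrolls + 1):
        if label in get_state():
            return True
        if attempt < max_scrolls:  # no point scrolling after the final check
            scroll_down()
    return False
```

The cap on scrolls keeps the loop from running indefinitely on pages that load more content each time they are scrolled.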
Session issues?
    browser-use sessions      # Check active sessions
    browser-use close --all   # Clean slate
    browser-use open <url>    # Fresh start
Cleanup
Always close the browser when done. Run this after completing browser automation:
browser-use close