Full-stack-skills agent-browser
Automates browser interactions via CLI using agent-browser by Vercel Labs. Covers navigation, clicking, form filling, snapshots, refs-based selectors, agent mode with JSON output, session management, and CDP integration. Use when the user needs to automate web browsing, scrape pages, fill forms, or integrate browser automation into AI agent workflows.
git clone https://github.com/partme-ai/full-stack-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/partme-ai/full-stack-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/dev-utils-skills/agent-browser" ~/.claude/skills/partme-ai-full-stack-skills-agent-browser && rm -rf "$T"
skills/dev-utils-skills/agent-browser/SKILL.md
When to use this skill
Use this skill whenever the user wants to:
- Automate browser interactions (click, fill, navigate, screenshot) via CLI
- Scrape web content or extract data from pages
- Build AI agent workflows that interact with websites
- Use refs-based element selection for deterministic automation
- Run browser automation in agent mode with JSON output
- Manage authenticated sessions with custom headers or CDP
How to use this skill
This skill is organized to match the agent-browser official documentation structure (https://github.com/vercel-labs/agent-browser/blob/main/README.md). When working with agent-browser:
Quick-Start Example: Snapshot → Identify → Interact
    # 1. Install (agent-browser is published by Vercel Labs under this package name)
    npm install -g agent-browser

    # 2. Open a page and take a snapshot to get element refs
    agent-browser open "https://example.com"
    agent-browser snapshot
    # Output includes refs like @e1, @e2, @e3 for each element

    # 3. Click an element by ref
    agent-browser click @e3

    # 4. Fill a form field
    agent-browser fill @e5 "hello@example.com"

    # 5. Agent mode (JSON output for programmatic use)
    agent-browser snapshot --json
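The quick-start steps above can be wrapped in two small shell functions so that a failing command stops the sequence. This is a minimal sketch, not part of agent-browser itself: the `AGENT_BROWSER_BIN` variable and the `ab`, `open_and_snapshot`, and `click_then_fill` names are hypothetical helpers introduced for illustration, and the refs passed in must come from a real snapshot.

```shell
# Hypothetical wrapper: AGENT_BROWSER_BIN is NOT an agent-browser setting.
# It only lets this sketch point at a different binary (or a stub in tests).
ab() {
  "${AGENT_BROWSER_BIN:-agent-browser}" "$@"
}

# Step 1: open the page, then print a snapshot so refs (@e1, @e2, ...)
# can be read from its output. Stops if the page cannot be opened.
# $1 = URL
open_and_snapshot() {
  ab open "$1" && ab snapshot
}

# Step 2: act on refs chosen from the snapshot. The fill only runs if
# the click succeeded.
# $1 = ref to click, $2 = ref to fill, $3 = text to type
click_then_fill() {
  ab click "$1" && ab fill "$2" "$3"
}
```

A call might look like `open_and_snapshot "https://example.com"`, followed by `click_then_fill @e3 @e5 "hello@example.com"` once the refs have been read from the snapshot. Splitting the flow in two keeps the "identify" step explicit, since refs are only known after the snapshot has been taken.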
Detailed Documentation
- Install agent-browser:
  - Load `examples/getting-started/installation.md` for installation instructions
- Quick Start:
  - Load `examples/quick-start/quick-start.md` for basic workflow examples
- Learn core commands:
  - Load `examples/commands/basic-commands.md` for basic commands (open, click, fill, etc.)
  - Load `examples/commands/advanced-commands.md` for advanced commands (snapshot, eval, etc.)
  - Load `examples/commands/get-info/` for information retrieval commands
  - Load `examples/commands/check-state/` for state checking commands
  - Load `examples/commands/find-elements/` for semantic locator commands
  - Load `examples/commands/wait/` for wait commands
  - Load `examples/commands/mouse-control/` for mouse control commands
  - Load `examples/commands/browser-settings/` for browser configuration
  - Load `examples/commands/cookies-storage/` for cookies and storage management
  - Load `examples/commands/network/` for network interception
  - Load `examples/commands/tabs-windows/` for tab and window management
  - Load `examples/commands/frames/` for iframe handling
  - Load `examples/commands/dialogs/` for dialog handling
  - Load `examples/commands/debug/` for debugging commands
  - Load `examples/commands/navigation/` for navigation commands
  - Load `examples/commands/setup/` for setup commands
- Understand selectors:
  - Load `examples/selectors/refs.md` for refs-based selection (@e1, @e2, etc.)
  - Load `examples/selectors/traditional-selectors.md` for CSS, XPath, and semantic locators
- Use agent mode:
  - Load `examples/agent-mode/introduction.md` for agent mode overview
  - Load `examples/agent-mode/optimal-workflow.md` for optimal AI workflow
  - Load `examples/agent-mode/integration.md` for integrating with AI agents
- Advanced features:
  - Load `examples/advanced/sessions.md` for session management
  - Load `examples/advanced/headed-mode.md` for debugging with visible browser
  - Load `examples/advanced/authenticated-sessions.md` for authentication via headers
  - Load `examples/advanced/custom-executable.md` for custom browser executable
  - Load `examples/advanced/cdp-mode.md` for Chrome DevTools Protocol integration
  - Load `examples/advanced/streaming.md` for browser viewport streaming
  - Load `examples/advanced/architecture.md` for architecture overview
  - Load `examples/advanced/platforms.md` for platform support
  - Load `examples/advanced/usage-with-agents.md` for AI agent integration patterns
- Configure options:
  - Load `examples/options/global-options.md` for global CLI options
  - Load `examples/options/snapshot-options.md` for snapshot-specific options
  - Load `examples/options/session-options.md` for session management options
- Reference API documentation when needed:
  - `api/commands.md`: Complete command reference
  - `api/selectors.md`: Selector reference
  - `api/options.md`: Options reference
- Use templates for quick start:
  - `templates/basic-automation.md`: Basic automation workflow
  - `templates/ai-agent-workflow.md`: AI agent workflow template
Doc mapping (one-to-one with official documentation)
- See examples and API files → https://github.com/vercel-labs/agent-browser
Examples and Templates
This skill includes detailed examples organized to match the official documentation structure. All examples are in the `examples/` directory (see mapping above).
To use examples:
- Identify the topic from the user's request
- Load the appropriate example file from the mapping above
- Follow the instructions, syntax, and best practices in that file
- Adapt the code examples to your specific use case
To use templates:
- Reference templates in the `templates/` directory for common scaffolding
- Adapt templates to your specific needs and coding style
API Reference
- Commands API: `api/commands.md` - Complete command reference with syntax and examples
- Selectors API: `api/selectors.md` - Selector types and usage reference
- Options API: `api/options.md` - All options reference
Best Practices
- Use Refs: Prefer refs (@e1, @e2) over traditional selectors for deterministic automation
- Snapshot First: Always snapshot before interacting with elements to get refs
- Agent Mode: Use the `--json` flag for machine-readable output in agent mode
- Session Management: Use `--session` to maintain state across commands
- Interactive Snapshot: Use the `-i` flag to restrict the snapshot to interactive elements
- Semantic Locators: Use semantic locators (role/name) when refs are not available
- Error Handling: Check command exit codes and error messages
- Wait for Navigation: Commands automatically wait for navigation to complete
- Headed Mode: Use `--headed` for debugging, headless for production
- CDP Integration: Use `--cdp` for Chrome DevTools Protocol integration
- Streaming: Use `AGENT_BROWSER_STREAM_PORT` for live browser preview
- Authenticated Sessions: Use `--headers` for authentication without login flows
- Custom Executable: Use `--executable-path` for serverless deployments or custom browsers
- Snapshot Options: Combine the `-i`, `-c`, `-d`, and `-s` options to optimize snapshot output
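Several of these practices (named sessions, combined snapshot options, JSON output, and exit-code checks) can be sketched together in shell. The example below is illustrative only. It assumes `--session` takes a session name and is given before the subcommand, that `-d` takes a depth number, and that `-i` and `-c` take no arguments; check `api/options.md` before relying on these forms. `AGENT_BROWSER_BIN`, `ab`, `agent_snapshot`, and `run_step` are hypothetical names, not part of the CLI.

```shell
# Hypothetical wrapper: AGENT_BROWSER_BIN is NOT an agent-browser setting.
# It only lets this sketch point at a different binary (or a stub in tests).
ab() {
  "${AGENT_BROWSER_BIN:-agent-browser}" "$@"
}

# Snapshot inside a named session, combining the snapshot options from the
# list above with JSON output.
# Assumed flag forms: --session <name>, -i, -c, -d <depth>, --json.
# $1 = session name, $2 = maximum depth
agent_snapshot() {
  ab --session "$1" snapshot -i -c -d "$2" --json
}

# Run a single agent-browser command and report its exit code on failure,
# so a calling script or agent can decide whether to retry or abort.
# Arguments are passed through unchanged, e.g. run_step click @e3
run_step() {
  if ab "$@"; then
    return 0
  else
    status=$?
    echo "agent-browser $1 failed with exit code $status" >&2
    return "$status"
  fi
}
```

For instance, `agent_snapshot checkout 3` would request a compact, depth-limited snapshot in a session named `checkout` (a made-up name), and `run_step click @e3 || exit 1` would stop a script when the click fails. The `-s` option is left out of the sketch because its argument depends on the page being automated.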
Resources
- GitHub Repository: https://github.com/vercel-labs/agent-browser
- Official README: https://github.com/vercel-labs/agent-browser/blob/main/README.md
- Agent Mode Documentation: https://agent-browser.dev/agent-mode
- Issues: https://github.com/vercel-labs/agent-browser/issues
Keywords
agent-browser, CLI browser automation, AI agents, browser automation CLI, refs, snapshot, agent mode, semantic locators, browser automation tool, command-line browser, AI agent browser, deterministic selectors, accessibility tree, browser commands, web automation CLI, sessions, headed mode, authenticated sessions, CDP mode, streaming, Chrome DevTools Protocol, Playwright, browser automation for AI