install
source · Clone the upstream repo
git clone https://github.com/ComeOnOliver/skillshub
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/TerminalSkills/skills/browser-use" ~/.claude/skills/comeonoliver-skillshub-browser-use-792485 && rm -rf "$T"
manifest:
skills/TerminalSkills/skills/browser-use/SKILL.md
source content
Browser Use — AI Browser Automation Agent
You are an expert in Browser Use, the Python library that lets AI agents control a web browser. You help developers build agents that can navigate websites, fill forms, click buttons, extract data, and complete multi-step web tasks — using vision and DOM understanding to interact with any website like a human would.
Core Capabilities
from browser_use import Agent
from langchain_openai import ChatOpenAI

# Note: `await` must run inside an async function or an async-aware REPL.
agent = Agent(
    task="Go to amazon.com, search for 'mechanical keyboard', and find the best-rated one under $100",
    llm=ChatOpenAI(model="gpt-4o"),
)
result = await agent.run()
print(result)  # "The best-rated mechanical keyboard under $100 is..."

# Multi-step tasks
agent = Agent(
    task="""
    1. Go to github.com/myorg/myrepo
    2. Click on Issues tab
    3. Create a new issue with title 'Update dependencies' and body 'Run npm audit fix'
    4. Add the label 'maintenance'
    """,
    llm=ChatOpenAI(model="gpt-4o"),
)
await agent.run()

# With custom browser config
from browser_use import BrowserConfig

config = BrowserConfig(
    headless=True,
    proxy="http://proxy:8080",
    cookies=[{"name": "session", "value": "abc123", "domain": ".example.com"}],
)
agent = Agent(task="...", llm=ChatOpenAI(model="gpt-4o"), browser_config=config)

# Extract structured data
from pydantic import BaseModel

class Product(BaseModel):
    name: str
    price: float
    rating: float

agent = Agent(
    task="Go to bestbuy.com and find the top 5 laptops. Return structured data.",
    llm=ChatOpenAI(model="gpt-4o"),
    output_model=list[Product],
)
result = await agent.run()
# result is list[Product]: validated Pydantic objects
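The examples above call await agent.run() directly, which only works where an event loop is already running (for example a notebook). In a plain script the coroutine has to be driven explicitly. The following is a minimal sketch of that pattern using only the standard library. run_sync and FakeAgent are hypothetical names introduced here for illustration; FakeAgent merely stands in for a real Agent so the snippet runs without a browser or an API key.

```python
import asyncio


def run_sync(agent):
    """Run the agent's async run() to completion and return its result."""
    # asyncio.run creates an event loop, awaits the coroutine, then closes the loop.
    return asyncio.run(agent.run())


class FakeAgent:
    """Hypothetical stand-in for an agent; it only mimics `await agent.run()`."""

    def __init__(self, task):
        self.task = task

    async def run(self):
        # A real agent would drive the browser here.
        return f"done: {self.task}"


print(run_sync(FakeAgent("example task")))
```

With a real agent the call would presumably be the same, run_sync(agent), as long as no event loop is already running in the current thread.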
Installation
pip install browser-use
playwright install
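After installing, a quick check can confirm that both packages are importable from the current environment. This is a small sketch, not part of the library: missing_modules is a hypothetical helper, and it only looks for the Python modules browser_use and playwright. It does not verify that the browser binaries fetched by playwright install are present.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(["browser_use", "playwright"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("browser_use and playwright are importable")
```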
Best Practices
- Vision model — Use a vision-capable model such as GPT-4o or Claude for best browser understanding; the agent works from both screenshots and the DOM
- Structured output — Pass output_model for typed extraction; Pydantic validation on results
- Headless mode — Use headless=True for server/CI; headless=False for debugging to watch the agent
- Cookies/auth — Pre-set cookies for authenticated sessions; agent operates as logged-in user
- Task decomposition — Write tasks as numbered steps for complex flows; agent follows the sequence
- Proxy support — Use proxies for scraping at scale; rotate IPs to avoid blocks
- Retry on failure — Browser Use auto-retries failed interactions; configure max attempts
- Combine with APIs — Use browser for sites without APIs; prefer APIs when available (faster, cheaper)
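Several of the practices above come down to preparing plain Python values before an agent is constructed. The sketch below illustrates three of them with the standard library only: numbering task steps, choosing headless mode from the environment, and turning a cookie string into the cookie dictionaries shown earlier. All three function names are hypothetical, and treating a non-empty CI environment variable as "running in CI" is an assumption that matches many CI systems but not necessarily all of them.

```python
import os
from http.cookies import SimpleCookie


def numbered_task(steps):
    """Join steps into a task string with one numbered step per line."""
    return "\n".join(f"{i}. {step.strip()}" for i, step in enumerate(steps, start=1))


def want_headless(env=None):
    """Return True when a non-empty CI variable is set (assumed CI convention)."""
    env = os.environ if env is None else env
    return bool(env.get("CI"))


def cookies_from_header(header, domain):
    """Convert a 'name=value; name2=value2' string into a list of cookie dicts."""
    jar = SimpleCookie()
    jar.load(header)
    return [
        {"name": name, "value": morsel.value, "domain": domain}
        for name, morsel in jar.items()
    ]


task = numbered_task(["Go to github.com/myorg/myrepo", "Click on Issues tab"])
print(task)
print(want_headless({"CI": "true"}))
print(cookies_from_header("session=abc123", ".example.com"))
```

The resulting values could then be passed as the task, headless, and cookies arguments used in the examples earlier in this document.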