Awesome-omni-skill browser
Control a Chrome session via Stagehand to browse, act, extract, and screenshot on demand inside the Factory CLI.
install
source · Clone the upstream repo
git clone https://github.com/diegosouzapw/awesome-omni-skill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/tools/browser" ~/.claude/skills/diegosouzapw-awesome-omni-skill-browser-a3b5b1 && rm -rf "$T"
manifest: skills/tools/browser/SKILL.md
source content
Skill: Browser
Use this skill when you need live browser automation during a Factory session—opening sites, clicking through flows, gathering structured data, or capturing screenshots.
Inputs
- Target URL or task description (natural language)
- Optional structured extraction schema (JSON field → type)
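As an illustration of the optional extraction schema, the sketch below checks an extracted record against a field → type mapping. The schema contents, the type names (`string`, `number`, `boolean`), and the helper `schema_errors` are assumptions made for this example; the skill itself does not specify which type names it accepts.

```python
# Hypothetical schema in the field -> type shape; the type names are assumed.
SCHEMA = {"title": "string", "price": "number", "in_stock": "boolean"}

# Assumed mapping from schema type names to Python types.
TYPE_MAP = {"string": str, "number": (int, float), "boolean": bool}


def schema_errors(record, schema):
    """Return a list of problems found when checking record against schema."""
    errors = []
    for field, type_name in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
            continue
        value = record[field]
        expected = TYPE_MAP[type_name]
        # bool is a subclass of int in Python, so reject it for "number".
        is_bool_as_number = type_name == "number" and isinstance(value, bool)
        if not isinstance(value, expected) or is_bool_as_number:
            errors.append(f"wrong type for {field}: expected {type_name}")
    return errors
```

An empty list means the record has every field in the schema with a matching type.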
Behavior
- Ensure Chrome is running via Stagehand with the local profile stored in `.chrome-profile`.
- Support these commands:
  - `navigate <url>`: open a page and capture a screenshot.
  - `act "<instruction>"`: perform natural-language actions.
  - `extract "<instruction>" '{"field":"type"}'`: return structured data.
  - `observe "<goal>"`: list suggested steps the agent can take.
  - `screenshot`: capture the current viewport.
  - `close`: shut down the session when finished.
- Save screenshots to `agent/browser_screenshots` and report the file path in the response.
- When tasks finish, summarize what happened plus any follow-up steps for the user.
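The `extract` command takes its schema as a JSON string, so quoting matters when the call is assembled programmatically. The sketch below builds only the argument portion of such a call; the helper name `extract_args` is hypothetical, and the program that receives these arguments is not specified here, so it is left out.

```python
import json
import shlex


def extract_args(instruction, schema):
    """Build the shell-quoted argument string for an extract call."""
    # Compact separators keep the JSON in the {"field":"type"} form.
    schema_json = json.dumps(schema, separators=(",", ":"))
    return shlex.join(["extract", instruction, schema_json])


print(extract_args("get the page title", {"title": "string"}))
# prints: extract 'get the page title' '{"title":"string"}'
```

`shlex.join` wraps the instruction in single quotes rather than double quotes; a POSIX shell treats both the same way for plain text like this.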
Verification
- If a navigation/action fails, include the error message and prompt the user for next steps.
- Before ending the session, ensure `close` has been run so Chrome processes don’t linger.
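One way to meet both verification points is a `try`/`finally` wrapper, sketched below. `run_command` is a hypothetical callable standing in for whatever actually dispatches a command to the browser session; it is not part of the skill. The sketch stops at the first failure, records the error message, and still issues `close`.

```python
def run_session(run_command, steps):
    """Run steps in order; always send close, even if a step fails."""
    log = []
    try:
        for step in steps:
            log.append(run_command(step))
    except Exception as exc:
        # Keep the error message so it can be reported to the user.
        log.append(f"error: {exc}")
    finally:
        run_command("close")
    return log
```

The returned log ends with the error entry when a step fails, which gives the agent the message to include when it asks the user for next steps.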
Notes
- This skill expects `ANTHROPIC_API_KEY` (for Stagehand) and, if used via `droid exec`, `FACTORY_API_KEY` to already be configured.
- The working directory should remain inside the cloned skill folder so relative paths resolve correctly.
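A preflight check for the two keys might look like the sketch below. The helper `missing_keys` and its `via_droid_exec` flag are hypothetical names for this example; only the environment variable names come from the notes above.

```python
import os


def missing_keys(via_droid_exec, env=None):
    """Return the names of required API keys that are unset or empty."""
    env = os.environ if env is None else env
    required = ["ANTHROPIC_API_KEY"]  # needed by Stagehand
    if via_droid_exec:
        required.append("FACTORY_API_KEY")  # needed only under droid exec
    return [name for name in required if not env.get(name)]
```

Calling `missing_keys(True)` before starting a session returns an empty list when both keys are configured.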