Skills gemini-computer-use
Build and run Gemini 2.5 Computer Use browser-control agents with Playwright. Use when a user wants to automate web browser tasks via the Gemini Computer Use model, needs an agent loop (screenshot → function_call → action → function_response), or asks to integrate safety confirmation for risky UI actions.
install
source · Clone the upstream repo
git clone https://github.com/openclaw/skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/am-will/gemini-computer-use" ~/.claude/skills/openclaw-skills-gemini-computer-use && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/am-will/gemini-computer-use" ~/.openclaw/skills/openclaw-skills-gemini-computer-use && rm -rf "$T"
manifest: skills/am-will/gemini-computer-use/SKILL.md
source content
Gemini Computer Use
Quick start
- Source the env file and set your API key:
    cp env.example env.sh
    $EDITOR env.sh
    source env.sh
- Create a virtual environment and install dependencies:
    python -m venv .venv
    source .venv/bin/activate
    pip install google-genai playwright
    playwright install chromium
- Run the agent script with a prompt:
    python scripts/computer_use_agent.py \
      --prompt "Find the latest blog post title on example.com" \
      --start-url "https://example.com" \
      --turn-limit 6
Browser selection
- Default: Playwright's bundled Chromium (no env vars required).
- Choose a channel (Chrome/Edge) with COMPUTER_USE_BROWSER_CHANNEL.
- Use a custom Chromium-based executable (e.g., Brave) with COMPUTER_USE_BROWSER_EXECUTABLE.
If both are set, COMPUTER_USE_BROWSER_EXECUTABLE takes precedence.
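The precedence rule above can be sketched as a small helper that turns the two environment variables into keyword arguments for Playwright's chromium.launch(). This is an illustration only, not the code in scripts/computer_use_agent.py; the function names resolve_launch_kwargs and launch_browser are hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_launch_kwargs(env: Optional[Mapping[str, str]] = None) -> dict:
    """Map the two browser env vars to keyword arguments for chromium.launch().

    Hypothetical helper: the real script may resolve these differently.
    """
    env = os.environ if env is None else env
    executable = env.get("COMPUTER_USE_BROWSER_EXECUTABLE", "").strip()
    channel = env.get("COMPUTER_USE_BROWSER_CHANNEL", "").strip()

    if executable:
        # A custom executable takes precedence when both variables are set.
        return {"executable_path": executable}
    if channel:
        # Playwright channel names, e.g. "chrome" or "msedge".
        return {"channel": channel}
    # Neither variable set: use Playwright's bundled Chromium.
    return {}


def launch_browser(playwright, headless: bool = True):
    """Launch Chromium from a sync_playwright() handle using the resolved options."""
    return playwright.chromium.launch(headless=headless, **resolve_launch_kwargs())
```

Returning plain keyword arguments keeps the precedence logic testable without starting a browser.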
Core workflow (agent loop)
- Capture a screenshot and send the user goal + screenshot to the model.
- Parse function_call actions in the response.
- Execute each action in Playwright.
- If a safety_decision is require_confirmation, prompt the user before executing.
- Send function_response objects containing the latest URL + screenshot.
- Repeat until the model returns only text (no actions) or you hit the turn limit.
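The loop above can be sketched independently of any SDK by injecting the model call, the browser actions, and the confirmation prompt as callables. Everything here is a hypothetical stand-in: Action, ModelTurn, run_agent_loop, and the payload dictionaries are not google-genai types, and the sketch assumes the safety decision arrives inside the function call's arguments and that a declined confirmation ends the run.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Action:
    """Hypothetical stand-in for one function_call returned by the model."""
    name: str
    args: dict = field(default_factory=dict)


@dataclass
class ModelTurn:
    """Hypothetical stand-in for one model response: optional text plus actions."""
    text: Optional[str] = None
    actions: list = field(default_factory=list)


def needs_confirmation(action: Action) -> bool:
    # Assumption: the safety decision is carried in the call's arguments.
    decision = action.args.get("safety_decision") or {}
    return decision.get("decision") == "require_confirmation"


def run_agent_loop(goal, observe, ask_model, execute, confirm, turn_limit=6):
    """Run the loop; return the model's final text, or None if it never finishes.

    observe()          -> (url, screenshot_bytes)
    ask_model(payload) -> ModelTurn
    execute(action)    -> performs the action in the browser
    confirm(action)    -> bool, asks the user about a risky action
    """
    url, shot = observe()
    payload = {"goal": goal, "url": url, "screenshot": shot}
    for _ in range(turn_limit):
        turn = ask_model(payload)
        if not turn.actions:
            return turn.text  # only text came back, so the task is done
        responses = []
        for action in turn.actions:
            if needs_confirmation(action) and not confirm(action):
                return None  # assumption: a declined action stops the run
            execute(action)
            url, shot = observe()  # fresh state after every action
            responses.append({"name": action.name, "url": url, "screenshot": shot})
        payload = {"function_responses": responses}
    return None  # turn limit reached
```

In a real agent, ask_model would wrap the Gemini call and execute would dispatch to Playwright; keeping them injectable lets the control flow be exercised with fakes.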
Operational guidance
- Run in a sandboxed browser profile or container.
- Use --exclude to block risky actions you do not want the model to take.
- Keep the viewport at 1440x900 unless you have a reason to change it.
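As a rough illustration of the last two points, the sketch below parses an --exclude flag and applies a fixed viewport. It is not the script's actual parser: it assumes --exclude may be repeated with one action name per use, and the helper names (parse_exclusions, is_allowed, new_page) and the example action names are illustrative.

```python
import argparse

# Viewport size recommended above, in Playwright's viewport dict shape.
VIEWPORT = {"width": 1440, "height": 900}


def parse_exclusions(argv=None) -> list:
    """Collect excluded action names from the command line.

    Assumption: --exclude is repeatable, e.g. --exclude drag_and_drop --exclude navigate.
    Other flags are ignored by this sketch.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--exclude", action="append", default=None)
    args, _unknown = parser.parse_known_args(argv)
    return args.exclude or []


def is_allowed(action_name: str, excluded: list) -> bool:
    """Client-side guard: refuse to execute an action the user excluded."""
    return action_name not in set(excluded)


def new_page(browser):
    """Open a page in a fresh context sized to the recommended viewport."""
    context = browser.new_context(viewport=VIEWPORT)
    return context.new_page()
```

A client-side guard like is_allowed is a second line of defense; it does not replace running the browser in a sandboxed profile or container.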
Resources
- Script: scripts/computer_use_agent.py
- Reference notes: references/google-computer-use.md
- Env template: env.example