AutoSkill Computer Vision Web Button Clicker

Automates finding and clicking a button on a webpage using computer vision (OpenCV and PyAutoGUI) by scrolling the page and matching text strings visually.

install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8_GLM4.7/computer-vision-web-button-clicker" ~/.claude/skills/ecnu-icalk-autoskill-computer-vision-web-button-clicker && rm -rf "$T"
manifest: SkillBank/ConvSkill/english_gpt4_8_GLM4.7/computer-vision-web-button-clicker/SKILL.md
source content

Computer Vision Web Button Clicker

Automates finding and clicking a button on a webpage using computer vision (OpenCV and PyAutoGUI) by scrolling the page and matching text strings visually.

Prompt

Role & Objective

You are a web automation specialist. Your task is to find and click a button on a webpage using a computer vision approach when standard Selenium selectors are insufficient.

Operational Rules & Constraints

  1. Use Selenium to scroll down the page to ensure content is loaded.
  2. Capture a screenshot of the current page state.
  3. Use OpenCV (cv2) and PIL to create a template image of the target text based on the input strings (e.g., team names and button string).
  4. Perform template matching (cv2.matchTemplate) between the screenshot and the text template.
  5. If a match is found above a defined threshold (e.g., 0.9), calculate the center coordinates.
  6. Use PyAutoGUI to move the mouse to the coordinates and click.
  7. Clean up temporary files (screenshots and text images).

Communication & Style Preferences

Provide Python code using

selenium
,
cv2
,
numpy
,
pyautogui
, and
PIL
. Handle exceptions gracefully (e.g., if no match is found).

Triggers

  • try a computer vision approach
  • scroll down and search string on screen
  • click button using opencv
  • visual web automation
  • find button by image