Skills control
Advanced desktop automation with mouse, keyboard, and screen control
git clone https://github.com/openclaw/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/breckengan/control" ~/.claude/skills/openclaw-skills-control && rm -rf "$T"
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/breckengan/control" ~/.openclaw/skills/openclaw-skills-control && rm -rf "$T"
skills/breckengan/control/SKILL.mdDesktop Control Skill
The most advanced desktop automation skill for OpenClaw. Provides pixel-perfect mouse control, lightning-fast keyboard input, screen capture, window management, and clipboard operations.
🎯 Features
Mouse Control
- ✅ Absolute positioning - Move to exact coordinates
- ✅ Relative movement - Move from current position
- ✅ Smooth movement - Natural, human-like mouse paths
- ✅ Click types - Left, right, middle, double, triple clicks
- ✅ Drag & drop - Drag from point A to point B
- ✅ Scroll - Vertical and horizontal scrolling
- ✅ Position tracking - Get current mouse coordinates
Keyboard Control
- ✅ Text typing - Fast, accurate text input
- ✅ Hotkeys - Execute keyboard shortcuts (Ctrl+C, Win+R, etc.)
- ✅ Special keys - Enter, Tab, Escape, Arrow keys, F-keys
- ✅ Key combinations - Multi-key press combinations
- ✅ Hold & release - Manual key state control
- ✅ Typing speed - Configurable WPM (instant to human-like)
Screen Operations
- ✅ Screenshot - Capture entire screen or regions
- ✅ Image recognition - Find elements on screen (via OpenCV)
- ✅ Color detection - Get pixel colors at coordinates
- ✅ Multi-monitor - Support for multiple displays
Window Management
- ✅ Window list - Get all open windows
- ✅ Activate window - Bring window to front
- ✅ Window info - Get position, size, title
- ✅ Minimize/Maximize - Control window states
Safety Features
- ✅ Failsafe - Move mouse to corner to abort
- ✅ Pause control - Emergency stop mechanism
- ✅ Approval mode - Require confirmation for actions
- ✅ Bounds checking - Prevent out-of-screen operations
- ✅ Logging - Track all automation actions
🚀 Quick Start
Installation
First, install required dependencies:
pip install pyautogui pillow opencv-python pygetwindow
Basic Usage
from skills.desktop_control import DesktopController # Initialize controller dc = DesktopController(failsafe=True) # Mouse operations dc.move_mouse(500, 300) # Move to coordinates dc.click() # Left click at current position dc.click(100, 200, button="right") # Right click at position # Keyboard operations dc.type_text("Hello from OpenClaw!") dc.hotkey("ctrl", "c") # Copy dc.press("enter") # Screen operations screenshot = dc.screenshot() position = dc.get_mouse_position()
📋 Complete API Reference
Mouse Functions
move_mouse(x, y, duration=0, smooth=True)
move_mouse(x, y, duration=0, smooth=True)Move mouse to absolute screen coordinates.
Parameters:
(int): X coordinate (pixels from left)x
(int): Y coordinate (pixels from top)y
(float): Movement time in seconds (0 = instant, 0.5 = smooth)duration
(bool): Use bezier curve for natural movementsmooth
Example:
# Instant movement dc.move_mouse(1000, 500) # Smooth 1-second movement dc.move_mouse(1000, 500, duration=1.0)
move_relative(x_offset, y_offset, duration=0)
move_relative(x_offset, y_offset, duration=0)Move mouse relative to current position.
Parameters:
(int): Pixels to move horizontally (positive = right)x_offset
(int): Pixels to move vertically (positive = down)y_offset
(float): Movement time in secondsduration
Example:
# Move 100px right, 50px down dc.move_relative(100, 50, duration=0.3)
click(x=None, y=None, button='left', clicks=1, interval=0.1)
click(x=None, y=None, button='left', clicks=1, interval=0.1)Perform mouse click.
Parameters:
(int, optional): Coordinates to click (None = current position)x, y
(str): 'left', 'right', 'middle'button
(int): Number of clicks (1 = single, 2 = double)clicks
(float): Delay between multiple clicksinterval
Example:
# Simple left click dc.click() # Double-click at specific position dc.click(500, 300, clicks=2) # Right-click dc.click(button='right')
drag(start_x, start_y, end_x, end_y, duration=0.5, button='left')
drag(start_x, start_y, end_x, end_y, duration=0.5, button='left')Drag and drop operation.
Parameters:
(int): Starting coordinatesstart_x, start_y
(int): Ending coordinatesend_x, end_y
(float): Drag durationduration
(str): Mouse button to usebutton
Example:
# Drag file from desktop to folder dc.drag(100, 100, 500, 500, duration=1.0)
scroll(clicks, direction='vertical', x=None, y=None)
scroll(clicks, direction='vertical', x=None, y=None)Scroll mouse wheel.
Parameters:
(int): Scroll amount (positive = up/left, negative = down/right)clicks
(str): 'vertical' or 'horizontal'direction
(int, optional): Position to scroll atx, y
Example:
# Scroll down 5 clicks dc.scroll(-5) # Scroll up 10 clicks dc.scroll(10) # Horizontal scroll dc.scroll(5, direction='horizontal')
get_mouse_position()
get_mouse_position()Get current mouse coordinates.
Returns:
(x, y) tuple
Example:
x, y = dc.get_mouse_position() print(f"Mouse is at: {x}, {y}")
Keyboard Functions
type_text(text, interval=0, wpm=None)
type_text(text, interval=0, wpm=None)Type text with configurable speed.
Parameters:
(str): Text to typetext
(float): Delay between keystrokes (0 = instant)interval
(int, optional): Words per minute (overrides interval)wpm
Example:
# Instant typing dc.type_text("Hello World") # Human-like typing at 60 WPM dc.type_text("Hello World", wpm=60) # Slow typing with 0.1s between keys dc.type_text("Hello World", interval=0.1)
press(key, presses=1, interval=0.1)
press(key, presses=1, interval=0.1)Press and release a key.
Parameters:
(str): Key name (see Key Names section)key
(int): Number of times to presspresses
(float): Delay between pressesinterval
Example:
# Press Enter dc.press('enter') # Press Space 3 times dc.press('space', presses=3) # Press Down arrow dc.press('down')
hotkey(*keys, interval=0.05)
hotkey(*keys, interval=0.05)Execute keyboard shortcut.
Parameters:
(str): Keys to press together*keys
(float): Delay between key pressesinterval
Example:
# Copy (Ctrl+C) dc.hotkey('ctrl', 'c') # Paste (Ctrl+V) dc.hotkey('ctrl', 'v') # Open Run dialog (Win+R) dc.hotkey('win', 'r') # Save (Ctrl+S) dc.hotkey('ctrl', 's') # Select All (Ctrl+A) dc.hotkey('ctrl', 'a')
key_down(key)
/ key_up(key)
key_down(key)key_up(key)Manually control key state.
Example:
# Hold Shift dc.key_down('shift') dc.type_text("hello") # Types "HELLO" dc.key_up('shift') # Hold Ctrl and click (for multi-select) dc.key_down('ctrl') dc.click(100, 100) dc.click(200, 100) dc.key_up('ctrl')
Screen Functions
screenshot(region=None, filename=None)
screenshot(region=None, filename=None)Capture screen or region.
Parameters:
(tuple, optional): (left, top, width, height) for partial captureregion
(str, optional): Path to save imagefilename
Returns: PIL Image object
Example:
# Full screen img = dc.screenshot() # Save to file dc.screenshot(filename="screenshot.png") # Capture specific region img = dc.screenshot(region=(100, 100, 500, 300))
get_pixel_color(x, y)
get_pixel_color(x, y)Get color of pixel at coordinates.
Returns: RGB tuple
(r, g, b)
Example:
r, g, b = dc.get_pixel_color(500, 300) print(f"Color at (500, 300): RGB({r}, {g}, {b})")
find_on_screen(image_path, confidence=0.8)
find_on_screen(image_path, confidence=0.8)Find image on screen (requires OpenCV).
Parameters:
(str): Path to template imageimage_path
(float): Match threshold (0-1)confidence
Returns:
(x, y, width, height) or None
Example:
# Find button on screen location = dc.find_on_screen("button.png") if location: x, y, w, h = location # Click center of found image dc.click(x + w//2, y + h//2)
get_screen_size()
get_screen_size()Get screen resolution.
Returns:
(width, height) tuple
Example:
width, height = dc.get_screen_size() print(f"Screen: {width}x{height}")
Window Functions
get_all_windows()
get_all_windows()List all open windows.
Returns: List of window titles
Example:
windows = dc.get_all_windows() for title in windows: print(f"Window: {title}")
activate_window(title_substring)
activate_window(title_substring)Bring window to front by title.
Parameters:
(str): Part of window title to matchtitle_substring
Example:
# Activate Chrome dc.activate_window("Chrome") # Activate VS Code dc.activate_window("Visual Studio Code")
get_active_window()
get_active_window()Get currently focused window.
Returns: Window title (str)
Example:
active = dc.get_active_window() print(f"Active window: {active}")
Clipboard Functions
copy_to_clipboard(text)
copy_to_clipboard(text)Copy text to clipboard.
Example:
dc.copy_to_clipboard("Hello from OpenClaw!")
get_from_clipboard()
get_from_clipboard()Get text from clipboard.
Returns: str
Example:
text = dc.get_from_clipboard() print(f"Clipboard: {text}")
⌨️ Key Names Reference
Alphabet Keys
'a' through 'z'
Number Keys
'0' through '9'
Function Keys
'f1' through 'f24'
Special Keys
/'enter''return'
/'esc''escape'
/'space''spacebar''tab''backspace'
/'delete''del''insert''home''end'
/'pageup''pgup'
/'pagedown''pgdn'
Arrow Keys
/'up'
/'down'
/'left''right'
Modifier Keys
/'ctrl''control''shift''alt'
/'win'
/'winleft''winright'
/'cmd'
(Mac)'command'
Lock Keys
'capslock''numlock''scrolllock'
Punctuation
/'.'
/','
/'?'
/'!'
/';'':'
/'['
/']'
/'{''}'
/'('')'
/'+'
/'-'
/'*'
/'/''='
🛡️ Safety Features
Failsafe Mode
Move mouse to any corner of the screen to abort all automation.
# Enable failsafe (enabled by default) dc = DesktopController(failsafe=True)
Pause Control
# Pause all automation for 2 seconds dc.pause(2.0) # Check if automation is safe to proceed if dc.is_safe(): dc.click(500, 500)
Approval Mode
Require user confirmation before actions:
dc = DesktopController(require_approval=True) # This will ask for confirmation dc.click(500, 500) # Prompt: "Allow click at (500, 500)? [y/n]"
🎨 Advanced Examples
Example 1: Automated Form Filling
dc = DesktopController() # Click name field dc.click(300, 200) dc.type_text("John Doe", wpm=80) # Tab to next field dc.press('tab') dc.type_text("john@example.com", wpm=80) # Tab to password dc.press('tab') dc.type_text("SecurePassword123", wpm=60) # Submit form dc.press('enter')
Example 2: Screenshot Region and Save
# Capture specific area region = (100, 100, 800, 600) # left, top, width, height img = dc.screenshot(region=region) # Save with timestamp import datetime timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S") img.save(f"capture_{timestamp}.png")
Example 3: Multi-File Selection
# Hold Ctrl and click multiple files dc.key_down('ctrl') dc.click(100, 200) # First file dc.click(100, 250) # Second file dc.click(100, 300) # Third file dc.key_up('ctrl') # Copy selected files dc.hotkey('ctrl', 'c')
Example 4: Window Automation
# Activate Calculator dc.activate_window("Calculator") time.sleep(0.5) # Type calculation dc.type_text("5+3=", interval=0.2) time.sleep(0.5) # Take screenshot of result dc.screenshot(filename="calculation_result.png")
Example 5: Drag & Drop File
# Drag file from source to destination dc.drag( start_x=200, start_y=300, # File location end_x=800, end_y=500, # Folder location duration=1.0 # Smooth 1-second drag )
⚡ Performance Tips
- Use instant movements for speed:
duration=0 - Batch operations instead of individual calls
- Cache screen positions instead of recalculating
- Disable failsafe for maximum performance (use with caution)
- Use hotkeys instead of menu navigation
⚠️ Important Notes
- Screen coordinates start at (0, 0) in top-left corner
- Multi-monitor setups may have negative coordinates for secondary displays
- Windows DPI scaling may affect coordinate accuracy
- Failsafe corners are: (0,0), (width-1, 0), (0, height-1), (width-1, height-1)
- Some applications may block simulated input (games, secure apps)
🔧 Troubleshooting
Mouse not moving to correct position
- Check DPI scaling settings
- Verify screen resolution matches expectations
- Use
to confirm dimensionsget_screen_size()
Keyboard input not working
- Ensure target application has focus
- Some apps require admin privileges
- Try increasing
for reliabilityinterval
Failsafe triggering accidentally
- Increase screen border tolerance
- Move mouse away from corners during normal use
- Disable if needed:
DesktopController(failsafe=False)
Permission errors
- Run Python with administrator privileges for some operations
- Some secure applications block automation
📦 Dependencies
- PyAutoGUI - Core automation engine
- Pillow - Image processing
- OpenCV (optional) - Image recognition
- PyGetWindow - Window management
Install all:
pip install pyautogui pillow opencv-python pygetwindow
Built for OpenClaw - The ultimate desktop automation companion 🦞