Awesome-omni-skill android-use
Control Android devices via ADB commands - tap, swipe, type, navigate apps
git clone https://github.com/diegosouzapw/awesome-omni-skill
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/development/android-use" ~/.claude/skills/diegosouzapw-awesome-omni-skill-android-use-01f723 && rm -rf "$T"
skills/development/android-use/SKILL.md
Android Device Control Skill
This skill enables you to control Android devices connected via ADB (Android Debug Bridge). You act as both the reasoning and execution engine - reading the device's UI state directly and deciding what actions to take.
Prerequisites
- Android device connected via USB with USB debugging enabled
- ADB installed and accessible in PATH
- Device authorized for debugging (accepted the "Allow USB debugging?" prompt)
Multi-Device Support
All scripts support the -s <serial> flag to target a specific device. This is essential when multiple devices are connected (e.g., a physical phone AND an emulator).
Identifying Devices
Run scripts/check-device.sh to see all connected devices:
    Multiple devices connected (2):
      [PHYSICAL] 1A051FDF6007PA - Pixel 6
      [EMULATOR] emulator-5554 - sdk_gphone64_arm64
    Use -s <serial> to specify which device to use.
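How check-device.sh tells the two device types apart is not shown here. As a rough sketch only, the labels could be derived from raw `adb devices` output, assuming locally started emulators report serials beginning with `emulator-`. The function name `classify_devices` is hypothetical, and devices attached over the network would need extra handling.

```shell
# Hypothetical helper: label each attached device from `adb devices` text on stdin.
# Assumption: emulator serials start with "emulator-"; everything else is physical.
classify_devices() {
  # Only rows whose second column is "device" are usable (skips the header,
  # blank lines, and rows in state "offline" or "unauthorized").
  awk '$2 == "device" {
         kind = ($1 ~ /^emulator-/) ? "EMULATOR" : "PHYSICAL"
         printf "[%s] %s\n", kind, $1
       }'
}

# With a real connection: adb devices | classify_devices
printf 'List of devices attached\n1A051FDF6007PA\tdevice\nemulator-5554\tdevice\n' | classify_devices
```

Rows in the `unauthorized` state are dropped on purpose, since those devices have not yet accepted the USB debugging prompt.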
Choosing the Right Device
When the user mentions:
- "phone", "my phone", "physical device" → Use the [PHYSICAL] device
- "emulator", "virtual device", "AVD" → Use the [EMULATOR] device
- If unclear, ask the user which device they want to target
Using the Serial Flag
Once you identify the target device, pass -s <serial> to ALL subsequent scripts:
    # Check specific device
    scripts/check-device.sh -s 1A051FDF6007PA
    # All actions on that device
    scripts/get-screen.sh -s 1A051FDF6007PA
    scripts/tap.sh -s 1A051FDF6007PA 540 960
    scripts/launch-app.sh -s 1A051FDF6007PA chrome
Important: Be consistent - use the same serial for all commands in a session.
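One way to keep the serial consistent is to store it once and build every command through a small wrapper. The sketch below is illustrative: `TARGET_SERIAL` and `device_cmd` are hypothetical names, and the function only prints the command it would run so it can be tried without a device.

```shell
# Hypothetical wrapper: remember one serial and add -s to every script call.
TARGET_SERIAL="1A051FDF6007PA"   # example serial used throughout this document

device_cmd() {
  # $1 = script name inside scripts/, remaining arguments = script arguments
  script="$1"
  shift
  # Prints the command line; remove "echo" to execute it instead.
  echo "scripts/$script" -s "$TARGET_SERIAL" "$@"
}

device_cmd get-screen.sh     # prints: scripts/get-screen.sh -s 1A051FDF6007PA
device_cmd tap.sh 540 960    # prints: scripts/tap.sh -s 1A051FDF6007PA 540 960
```

Keeping the serial in one variable means a session can switch devices by changing a single line.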
Core Workflow
When given a task, follow this perception-action loop:
1. Check device connection - Run scripts/check-device.sh first
   - If multiple devices: identify target based on user intent or ask
   - Note the serial number for subsequent commands
2. Get current screen state - Run scripts/get-screen.sh [-s serial] to dump UI hierarchy
3. Analyze the XML - Read the accessibility tree to understand what's on screen
4. Decide next action - Based on goal + current state, choose an action
5. Execute action - Run the appropriate script with -s serial if needed
6. Wait briefly - Allow UI to update (typically 500ms-1s)
7. Repeat - Go back to step 2 until goal is achieved
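The "wait briefly, then re-read the screen" part of this loop can be sketched as a bounded polling helper. This is a minimal illustration under assumptions: `wait_for_text` and the `GET_SCREEN` hook are hypothetical names, and the hook defaults to the get-screen.sh script described here.

```shell
# Assumed hook: the command that prints the current UI XML.
# Append "-s <serial>" to it when targeting a specific device.
GET_SCREEN="${GET_SCREEN:-scripts/get-screen.sh}"

# Poll the screen until a node with the given text appears.
# $1 = text to look for, $2 = maximum attempts, $3 = delay in seconds (default 1)
wait_for_text() {
  attempt=1
  while [ "$attempt" -le "$2" ]; do
    # Look for an exact text="..." attribute in the fresh dump.
    if $GET_SCREEN | grep -qF "text=\"$1\""; then
      return 0
    fi
    sleep "${3:-1}"
    attempt=$((attempt + 1))
  done
  return 1   # gave up: the text never appeared
}
```

A call such as `wait_for_text "Settings" 5` returns success as soon as the text shows up, and failure after five unsuccessful reads, which gives the loop a clear point to stop and report a problem.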
Reading UI XML
The get-screen.sh script outputs Android's accessibility XML. Key attributes to look for:
<node index="0" text="Settings" resource-id="com.android.settings:id/title" class="android.widget.TextView" content-desc="" bounds="[42,234][1038,345]" clickable="true" />
Important attributes:
- text - Visible text on the element
- content-desc - Accessibility description (useful for icons)
- resource-id - Unique identifier for the element
- bounds - Screen coordinates as [left,top][right,bottom]
- clickable - Whether element responds to taps
- scrollable - Whether element can be scrolled
- focused - Whether element has input focus
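When the XML needs to be processed in a script rather than read by eye, a single attribute can be pulled out of a node with sed. The helper below is a sketch: `node_attr` is a hypothetical name, and it assumes one node per input line with double-quoted attribute values.

```shell
# Hypothetical helper: print the value of one attribute from a single <node .../> line.
# $1 = attribute name (e.g. text, bounds), $2 = the node line
node_attr() {
  # The leading space keeps "desc" from matching inside "content-desc".
  printf '%s\n' "$2" | sed -n "s/.* $1=\"\([^\"]*\)\".*/\1/p"
}

node='<node index="0" text="Settings" resource-id="com.android.settings:id/title" class="android.widget.TextView" content-desc="" bounds="[42,234][1038,345]" clickable="true" />'
node_attr text "$node"        # prints: Settings
node_attr bounds "$node"      # prints: [42,234][1038,345]
```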
Calculating tap coordinates: From bounds="[left,top][right,bottom]", calculate center:
- x = (left + right) / 2
- y = (top + bottom) / 2
Example: bounds="[42,234][1038,345]" → tap at x=540, y=289 (289.5 rounded down)
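The same calculation can be done in the shell with integer arithmetic. This is a small sketch; `bounds_center` is a hypothetical helper name and it assumes non-negative coordinates.

```shell
# Hypothetical helper: compute the tap point for a bounds string.
# Prints "x y" using integer division, so a .5 result is rounded down.
bounds_center() {
  # Replace every non-digit with a space, leaving: left top right bottom
  set -- $(printf '%s' "$1" | tr -c '0-9' ' ')
  [ "$#" -eq 4 ] || return 1
  echo "$(( ($1 + $3) / 2 )) $(( ($2 + $4) / 2 ))"
}

bounds_center '[42,234][1038,345]'   # prints: 540 289
```

The output can be passed straight to the tap script, for example `scripts/tap.sh $(bounds_center '[42,234][1038,345]')`.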
Available Scripts
All scripts are in the scripts/ directory. Run them via bash.
All scripts support -s <serial> to target a specific device.
Device Management
| Script | Args | Description |
|---|---|---|
| check-device.sh | | List devices / verify connection |
| | | Wake device and dismiss lock screen |
| | | Capture screen image |
Screen Reading
| Script | Args | Description |
|---|---|---|
| get-screen.sh | | Dump UI accessibility tree |
Input Actions
| Script | Args | Description |
|---|---|---|
| tap.sh | <x> <y> | Tap at coordinates |
| type-text.sh | <text> | Type text string |
| swipe.sh | <direction> | Swipe up/down/left/right |
| key.sh | <key> | Press key (home/back/enter/recent) |
App Management
| Script | Args | Description |
|---|---|---|
| launch-app.sh | <package or name> | Launch app by package or search by name |
| | | Install APK to device |
Action Guidelines
When to tap
- Target clickable elements
- Always calculate center from bounds
- Prefer elements with clickable="true"
When to type
- After tapping a text input field
- The field should have focused="true" or class="android.widget.EditText"
- Clear existing text first if needed (select all + delete)
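If text is ever sent with raw ADB rather than the typing script, note that `adb shell input text` reads `%s` as a space. Whether type-text.sh already does this conversion is not stated here, so the helper below is only a sketch with a hypothetical name, `escape_input_text`.

```shell
# Hypothetical helper: prepare a string for `adb shell input text`.
# Only spaces are handled; quotes, &, ; and similar characters would need
# additional escaping for the device shell.
escape_input_text() {
  printf '%s\n' "$1" | sed 's/ /%s/g'
}

escape_input_text "your text here"   # prints: your%stext%shere
```

With a device attached, the result would be used as `adb shell input text "$(escape_input_text "your text here")"`.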
When to swipe
- To scroll lists or pages
- To navigate between screens (e.g., swipe left/right for tabs)
- Directions: up (scroll down), down (scroll up), left, right
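The direction names describe finger movement, which is why swiping up scrolls the content down. As an illustration, a direction can be turned into start and end points for `adb shell input swipe`. The internals of swipe.sh are not shown here, so `swipe_coords` and its 70%-to-30% travel distance are assumptions.

```shell
# Hypothetical helper: map a direction to "x1 y1 x2 y2" for a given screen size.
# Assumption: swipes run through the screen centre, from 70% to 30% of the axis.
# $1 = direction, $2 = screen width, $3 = screen height
swipe_coords() {
  cx=$(( $2 / 2 ))
  cy=$(( $3 / 2 ))
  case "$1" in
    up)    echo "$cx $(( $3 * 7 / 10 )) $cx $(( $3 * 3 / 10 ))" ;;  # finger moves up, content scrolls down
    down)  echo "$cx $(( $3 * 3 / 10 )) $cx $(( $3 * 7 / 10 ))" ;;  # finger moves down, content scrolls up
    left)  echo "$(( $2 * 7 / 10 )) $cy $(( $2 * 3 / 10 )) $cy" ;;
    right) echo "$(( $2 * 3 / 10 )) $cy $(( $2 * 7 / 10 )) $cy" ;;
    *)     return 1 ;;
  esac
}

swipe_coords up 1080 1920   # prints: 540 1344 540 576
```

With a device attached, `adb shell input swipe $(swipe_coords up 1080 1920) 300` would perform the gesture over 300 ms.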
When to use keys
- home - Return to home screen
- back - Go back / close dialogs
- enter - Submit forms / confirm
- recent - Open recent apps
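For reference, these four names correspond to standard Android key codes that `adb shell input keyevent` accepts. The mapping below is a sketch of how a script might translate them; `key_code` is a hypothetical name and key.sh itself may work differently.

```shell
# Hypothetical helper: translate a friendly key name to an Android key code name.
key_code() {
  case "$1" in
    home)   echo KEYCODE_HOME ;;
    back)   echo KEYCODE_BACK ;;
    enter)  echo KEYCODE_ENTER ;;
    recent) echo KEYCODE_APP_SWITCH ;;
    *)      echo "unknown key: $1" >&2; return 1 ;;
  esac
}

key_code recent   # prints: KEYCODE_APP_SWITCH
```

With a device attached, `adb shell input keyevent "$(key_code back)"` would send the Back key.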
When to take screenshots
- For visual debugging when XML doesn't capture enough info
- To verify visual state (colors, images, etc.)
- When the task requires visual confirmation
When to wake the device
- Before starting any task (device may have gone to sleep)
- If get-screen.sh returns empty or minimal XML
- If actions don't seem to be working (screen may be off)
- Note: Won't bypass PIN/pattern/password - user must unlock manually
Common Patterns
Opening an app
    # By package name (fastest)
    scripts/launch-app.sh com.android.chrome
    # By app name (searches installed apps)
    scripts/launch-app.sh "Chrome"
Tapping a button
- Get screen: scripts/get-screen.sh
- Find element with matching text/content-desc
- Calculate center from bounds
- Tap: scripts/tap.sh 540 289
Entering text in a field
- Tap the text field to focus it
- Wait for keyboard
- Type: scripts/type-text.sh "your text here"
- Press enter if needed: scripts/key.sh enter
Scrolling to find content
- Get screen to check if target is visible
- If not found, swipe: scripts/swipe.sh up
- Get screen again, repeat until found or reached end
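"Reached end" is not defined above. One plausible heuristic is to stop when a swipe no longer changes the screen dump. The sketch below takes that approach; `scroll_to_text`, `GET_SCREEN`, `SWIPE_UP`, and `SCROLL_DELAY` are hypothetical names, with defaults pointing at the scripts described here.

```shell
# Assumed hooks; append "-s <serial>" to each when targeting a specific device.
GET_SCREEN="${GET_SCREEN:-scripts/get-screen.sh}"
SWIPE_UP="${SWIPE_UP:-scripts/swipe.sh up}"

# Swipe until a node with the given text is visible.
# $1 = text to find, $2 = maximum number of swipes
scroll_to_text() {
  previous=""
  swipes=0
  while :; do
    current=$($GET_SCREEN)
    case "$current" in
      *"text=\"$1\""*) return 0 ;;   # target is on screen
    esac
    # Heuristic: an unchanged dump after a swipe means the list has ended.
    if [ "$swipes" -gt 0 ] && [ "$current" = "$previous" ]; then
      return 1
    fi
    if [ "$swipes" -ge "$2" ]; then
      return 1                       # swipe budget used up
    fi
    $SWIPE_UP
    sleep "${SCROLL_DELAY:-1}"
    previous=$current
    swipes=$((swipes + 1))
  done
}
```

The swipe limit guards against screens whose dump changes on every read, such as those showing a clock or an animation, where the unchanged-dump check would never trigger.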
Handling dialogs/popups
- Look for elements with text like "OK", "Allow", "Accept", "Cancel"
- Tap the appropriate button
- Or press back to dismiss: scripts/key.sh back
Error Handling
No device connected
- Check USB connection
- Verify USB debugging is enabled
- Run adb devices manually to troubleshoot
Element not found
- The UI may have changed - get fresh screen dump
- Try scrolling to find the element
- Element might be in a different screen/state
Action didn't work
- Wait longer between actions (UI might be slow)
- Verify coordinates are correct
- Check if a popup/dialog appeared
App not responding
- Press home and reopen the app
- Or force close and restart
Example Sessions
Single Device
User request: "Open Chrome and search for weather"
1. scripts/check-device.sh
   → Device connected: Pixel 6
   → Serial: 1A051FDF6007PA
   → Type: Physical
2. scripts/launch-app.sh com.android.chrome
   → Chrome launched
3. scripts/get-screen.sh
   → [Read XML, find search/URL bar]
   → Found: bounds="[0,141][1080,228]" resource-id="com.android.chrome:id/url_bar"
   → Center: x=540, y=184
4. scripts/tap.sh 540 184
   → Tapped URL bar
5. scripts/get-screen.sh
   → [Verify keyboard appeared and field is focused]
6. scripts/type-text.sh "weather"
   → Typed "weather"
7. scripts/key.sh enter
   → Pressed enter to search
8. scripts/get-screen.sh
   → [Verify search results loaded]
   → Task complete!
Multiple Devices
User request: "Open Settings on my phone" (with emulator also running)
1. scripts/check-device.sh
   → Multiple devices connected (2):
   → [PHYSICAL] 1A051FDF6007PA - Pixel 6
   → [EMULATOR] emulator-5554 - sdk_gphone64_arm64
   User said "my phone" → target the PHYSICAL device
   Serial to use: 1A051FDF6007PA
2. scripts/check-device.sh -s 1A051FDF6007PA
   → Device connected: Pixel 6
   → Serial: 1A051FDF6007PA
   → Type: Physical
   → Status: Ready
3. scripts/launch-app.sh -s 1A051FDF6007PA settings
   → Resolved 'settings' to package: com.android.settings
   → Launched: com.android.settings
4. scripts/get-screen.sh -s 1A051FDF6007PA
   → [Read XML, verify Settings app is open]
   → Task complete!
Tips
- Be patient - Android UI can be slow, wait between actions
- Read carefully - The XML tells you exactly what's on screen
- Check your work - Get screen after each action to verify state
- Use screenshots - When XML doesn't give enough context
- Start simple - Break complex tasks into small steps
- Multi-device - Always check for multiple devices first; ask user if target is unclear
- Consistent serial - Once you pick a device, use -s <serial> on ALL commands