ClaudeClaw add-image-vision

Add image vision to ClaudeClaw agents. Resizes and processes WhatsApp image attachments, then sends them to Claude as multimodal content blocks.

install
source · Clone the upstream repo
git clone https://github.com/sbusso/claudeclaw
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/sbusso/claudeclaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/add-image-vision" ~/.claude/skills/sbusso-claudeclaw-add-image-vision && rm -rf "$T"
manifest: skills/add-image-vision/SKILL.md

Image Vision Skill

Adds the ability for ClaudeClaw agents to see and understand images sent via WhatsApp. Images are downloaded, resized with sharp, saved to the group workspace, and passed to the agent as base64-encoded multimodal content blocks.
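To make "base64-encoded multimodal content block" concrete, here is a minimal shell sketch of the JSON shape that Claude's Messages API documents for base64 image blocks. It is only an illustration: the skill's actual encoding happens in src/image.ts, image_block is a hypothetical helper name, and image/jpeg is an assumed media type.

```shell
# Illustrative only: the skill's real encoding lives in src/image.ts (TypeScript).
# image_block is a hypothetical helper; image/jpeg is an assumed default media type.
image_block() {
  local file="$1" media_type="${2:-image/jpeg}"
  local data
  # base64 may wrap its output, so strip newlines to get one continuous string.
  data="$(base64 < "$file" | tr -d '\n')"
  printf '{"type":"image","source":{"type":"base64","media_type":"%s","data":"%s"}}\n' \
    "$media_type" "$data"
}
```

Calling image_block photo.jpg prints a single JSON object that could sit in a message's content array next to a text block.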

Phase 1: Pre-flight

  1. Check if src/image.ts exists; skip to Phase 3 if the skill is already applied
  2. Confirm sharp is installable (native bindings require build tools)
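The first pre-flight check can be scripted. The sketch below covers only step 1, the src/image.ts test, and does not try to verify that sharp will install. needs_image_vision is a hypothetical helper name, not something ClaudeClaw ships.

```shell
# needs_image_vision is a hypothetical helper name, not part of ClaudeClaw.
# Returns 0 when the skill still has to be applied, 1 when src/image.ts exists.
needs_image_vision() {
  local repo_root="${1:-.}"
  [ ! -f "$repo_root/src/image.ts" ]
}

if needs_image_vision .; then
  echo "src/image.ts not found: continue with Phase 2"
else
  echo "src/image.ts present: skip to Phase 3"
fi
```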

Prerequisite: WhatsApp must be installed first (skill/whatsapp merged). This skill modifies WhatsApp channel files.

Phase 2: Apply Code Changes

Ensure WhatsApp fork remote

git remote -v

If whatsapp is missing, add it:

git remote add whatsapp https://github.com/qwibitai/claudeclaw-whatsapp.git
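The check-then-add step above can be folded into one idempotent command. This is a sketch, assuming a git version that supports git remote get-url. ensure_remote is a hypothetical helper name.

```shell
# ensure_remote is a hypothetical helper: add the remote only if it is missing.
ensure_remote() {
  local name="$1" url="$2"
  # git remote get-url exits non-zero when the remote does not exist.
  git remote get-url "$name" >/dev/null 2>&1 || git remote add "$name" "$url"
}
```

Run from the repository root as ensure_remote whatsapp https://github.com/qwibitai/claudeclaw-whatsapp.git. An existing whatsapp remote is left untouched, even if it points at a different URL.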

Merge the skill branch

git fetch whatsapp skill/image-vision
git merge whatsapp/skill/image-vision || {
  git checkout --theirs package-lock.json
  git add package-lock.json
  git merge --continue
}

This merges in:

  • src/image.ts (image download, resize via sharp, base64 encoding)
  • src/image.test.ts (8 unit tests)
  • Image attachment handling in src/channels/whatsapp.ts
  • Image passing to agent in src/index.ts and src/orchestrator/container-runner.ts
  • Image content block support in agent/runner/src/index.ts
  • sharp npm dependency in package.json
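A quick way to confirm the merge brought in the files listed above is to check that each path exists. The sketch below only tests for presence and does not inspect file contents. missing_merge_files is a hypothetical helper name.

```shell
# missing_merge_files is a hypothetical helper; the paths come from the list above.
# It prints every expected file that is absent and prints nothing when all exist.
missing_merge_files() {
  local repo_root="${1:-.}" f
  for f in \
    src/image.ts \
    src/image.test.ts \
    src/channels/whatsapp.ts \
    src/index.ts \
    src/orchestrator/container-runner.ts \
    agent/runner/src/index.ts \
    package.json
  do
    [ -f "$repo_root/$f" ] || echo "$f"
  done
}
```

Empty output from missing_merge_files . means every listed file is present. Whether the image-related changes inside the pre-existing files were merged still has to be confirmed by the build and tests.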

If the merge reports conflicts, resolve them by reading the conflicted files and understanding the intent of both sides.

Validate code changes

npm install
npm run build
npx vitest run src/image.test.ts

All tests must pass and the build must be clean before proceeding.

Phase 3: Configure

  1. Rebuild the container (agent-runner changes need a rebuild):

    ./src/runtimes/docker/build.sh
    
  2. Sync agent-runner source to group caches:

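    # Copy the updated agent-runner sources into each session's cached copy
    for dir in data/sessions/*/agent-runner-src/; do
      [ -d "$dir" ] || continue  # skip the literal pattern when no cache exists
      cp agent/runner/src/*.ts "$dir"
    done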
    

Service name: Derived from the directory name: com.claudeclaw.<dirname> (macOS) / claudeclaw-<dirname> (Linux). For example, if cwd is my-assistant, the service is com.claudeclaw.my-assistant. Determine the correct service name before running service commands below.
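The naming rule above can be written as a small shell function. This is a sketch: service_name is a hypothetical helper name, and using uname -s to tell macOS from Linux is an assumption.

```shell
# service_name is a hypothetical helper implementing the naming rule above.
# Usage: service_name [directory] [os]; os defaults to the output of uname -s.
service_name() {
  local dir="${1:-$PWD}" os="${2:-$(uname -s)}"
  local name
  name="$(basename "$dir")"
  case "$os" in
    Darwin) echo "com.claudeclaw.$name" ;;
    Linux)  echo "claudeclaw-$name" ;;
    *)      echo "unsupported OS: $os" >&2; return 1 ;;
  esac
}
```

On macOS the result can be passed straight to the restart command, for example launchctl kickstart -k "gui/$(id -u)/$(service_name)".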

  3. Restart the service (macOS shown; substitute the service name determined above):
    launchctl kickstart -k gui/$(id -u)/com.claudeclaw.<dirname>
    

Phase 4: Verify

  1. Send an image in a registered WhatsApp group
  2. Check the agent responds with understanding of the image content
  3. Check logs for "Processed image attachment":
    tail -50 groups/*/logs/container-*.log
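Instead of reading the tail of every log by eye, the same check can be narrowed to the files that contain the message. This sketch uses the log path pattern and message text from the step above. logs_with_image is a hypothetical helper name.

```shell
# logs_with_image is a hypothetical helper; the log path pattern and the
# message text are taken from the verification step above.
logs_with_image() {
  local root="${1:-.}"
  # -l prints only the names of matching files; -F treats the message as a literal string.
  grep -lF "Processed image attachment" "$root"/groups/*/logs/container-*.log 2>/dev/null
}
```

It exits non-zero and prints nothing when no container log contains the message, so it can gate a scripted check.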
    

Troubleshooting

  • "Image - download failed": Check WhatsApp connection stability. The download may time out on slow connections.
  • "Image - processing failed": sharp may not be installed correctly. Run npm ls sharp to verify.
  • Agent doesn't mention image content: Check container logs for "Loaded image" messages. If missing, ensure agent-runner source was synced to group caches.