Make autonomous phone calls with AI voice using Twilio, Deepgram, and ElevenLabs
install
source · Clone the upstream repo
git clone https://github.com/openclaw/skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/arein/concierge/skills/phone-call" ~/.claude/skills/openclaw-skills-phone-call && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/arein/concierge/skills/phone-call" ~/.openclaw/skills/openclaw-skills-phone-call && rm -rf "$T"
manifest: skills/arein/concierge/skills/phone-call/SKILL.md
Phone Call Skill
Make autonomous phone calls with a goal-driven AI agent. The AI handles the conversation until the goal is achieved.
Prerequisites
- Required configuration:
concierge config set twilioAccountSid <your-sid>
concierge config set twilioAuthToken <your-token>
concierge config set twilioPhoneNumber <your-number>
concierge config set deepgramApiKey <your-key>
concierge config set elevenLabsApiKey <your-key>
concierge config set elevenLabsVoiceId <voice-id>
concierge config set anthropicApiKey <your-key>
- Optional for auto-managed ngrok:
concierge config set ngrokAuthToken <your-ngrok-token>
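The required keys can also be set in one pass from environment variables. The sketch below only prints the `concierge config set` commands so they can be reviewed first; the environment variable names (`TWILIO_ACCOUNT_SID` and so on) and the helper name `print_config_commands` are assumptions made for illustration and are not part of concierge.

```shell
# Print (not run) one `concierge config set` command per required key.
# Left column: config keys from the list above.
# Right column: hypothetical environment variable names.
print_config_commands() {
  while read -r key var; do
    # Look up the value of the variable whose name is stored in $var.
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing: $var (needed for $key)" >&2
      return 1
    fi
    printf 'concierge config set %s %s\n' "$key" "$value"
  done <<'EOF'
twilioAccountSid TWILIO_ACCOUNT_SID
twilioAuthToken TWILIO_AUTH_TOKEN
twilioPhoneNumber TWILIO_PHONE_NUMBER
deepgramApiKey DEEPGRAM_API_KEY
elevenLabsApiKey ELEVENLABS_API_KEY
elevenLabsVoiceId ELEVENLABS_VOICE_ID
anthropicApiKey ANTHROPIC_API_KEY
EOF
}
```

The function stops at the first unset variable and names it on stderr, so a partial configuration is never printed as if it were complete.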
Usage
Basic call
concierge call "+1-555-123-4567" \
  --goal "Book a hotel room for February 15" \
  --name "John Smith" \
  --email "john@example.com" \
  --customer-phone "+1-555-444-1212" \
  --context "2 nights, king bed preferred"
Interactive mode
concierge call "+1-555-123-4567" \
  --goal "Make a reservation" \
  --name "John Smith" \
  --email "john@example.com" \
  --customer-phone "+1-555-444-1212" \
  --interactive
In interactive mode, you type what the AI should say in real time.
Infrastructure behavior
- By default, the call command auto-starts ngrok and the server if the server is unavailable.
- Use --no-auto-infra to disable this and run everything manually.
- Auto-managed processes are stopped automatically when the call ends.
- Log files are written to:
~/.config/concierge/call-runs/<run-id>/server.log
~/.config/concierge/call-runs/<run-id>/ngrok.log
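When a call misbehaves, the logs of the most recent run are usually the ones of interest. The helper below is a minimal sketch that assumes each run gets its own directory under `call-runs/` (as the paths above suggest) and that the newest directory belongs to the latest call; `latest_call_logs` is a hypothetical name, not a concierge command.

```shell
# Print the server and ngrok log paths of the most recently modified run.
# The base directory defaults to the location documented above.
latest_call_logs() {
  base="${1:-$HOME/.config/concierge/call-runs}"
  # ls -t lists newest first; run ids are assumed not to contain newlines.
  run_id=$(ls -t "$base" 2>/dev/null | head -n 1)
  if [ -z "$run_id" ]; then
    echo "no runs found in $base" >&2
    return 1
  fi
  printf '%s\n' "$base/$run_id/server.log" "$base/$run_id/ngrok.log"
}
```

For example, `tail -n 50 $(latest_call_logs)` shows the end of both files for the latest run.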
Server management
# Check server status
concierge server status
# Start server
concierge server start --public-url <ngrok-url>
# Stop server
concierge server stop
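In manual mode the `<ngrok-url>` has to come from a tunnel you started yourself. One way to obtain it, sketched below, is to read the ngrok agent's local inspection API. This assumes the agent's default address `http://127.0.0.1:4040`, a JSON response containing `public_url` fields, and that the server listens on port 3000 (the port named under Troubleshooting); `first_public_url` is a hypothetical helper.

```shell
# Extract the first https public_url from ngrok's /api/tunnels JSON on stdin.
# Uses grep/cut rather than a JSON parser, so it is only a rough sketch.
first_public_url() {
  grep -o '"public_url": *"https://[^"]*"' | head -n 1 | cut -d'"' -f4
}

# Intended use (left commented out so the block has no side effects):
#   ngrok http 3000 &
#   url=$(curl -s http://127.0.0.1:4040/api/tunnels | first_public_url)
#   concierge server start --public-url "$url"
```

If the helper prints nothing, the tunnel is probably not up yet or the agent API is on a different port.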
Preflight checks
Before dialing, the system validates:
- Local runtime dependencies (ffmpeg binary + MP3 decode support, plus ngrok if auto-infra is used)
- Twilio credentials/account status/from-number availability
- Deepgram API key/auth reachability
- ElevenLabs character quota sufficiency (estimated call budget)
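The local part of these checks is easy to reproduce by hand when a preflight failure needs diagnosing. The sketch below covers only the first bullet (ffmpeg, MP3 decoding, and optionally ngrok); it is not how concierge implements its preflight, and the function names are illustrative.

```shell
# Return 0 if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Check local dependencies; pass "auto" to also require ngrok (auto-infra).
check_local_deps() {
  missing=""
  have_cmd ffmpeg || missing="$missing ffmpeg"
  if [ "${1:-}" = "auto" ]; then
    have_cmd ngrok || missing="$missing ngrok"
  fi
  # `ffmpeg -decoders` lists available decoders; look for an MP3 entry.
  if have_cmd ffmpeg && ! ffmpeg -hide_banner -decoders 2>/dev/null | grep -qi 'mp3'; then
    missing="$missing mp3-decoder"
  fi
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
  echo "local dependencies look OK"
}
```

The credential and quota checks in the remaining bullets require calls to the Twilio, Deepgram, and ElevenLabs APIs and are left to the tool itself.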
How It Works
- CLI sends a call request with goal + customer identity details
- The server places the call via Twilio
- Audio streams bidirectionally via WebSocket
- Deepgram transcribes human speech in real time
- Claude generates appropriate responses
- ElevenLabs synthesizes speech for responses
- The call continues until the goal is achieved or the human hangs up
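The control flow of the steps above is a turn loop with two exit conditions. The toy below illustrates only that loop: each input line stands in for one transcribed utterance, `respond` is a stub in place of the Claude call, and printing stands in for speech synthesis. The `GOAL_ACHIEVED` marker is an invented convention for this sketch, not something the source describes.

```shell
# Stub for response generation; a real system would call an LLM here.
respond() {
  case "$1" in
    *confirmed*) echo "GOAL_ACHIEVED Thank you, goodbye." ;;
    *) echo "I'd like to book a room, please." ;;
  esac
}

# Read one utterance per line; stop when the goal is reached
# or when input ends (standing in for the human hanging up).
run_call_loop() {
  while IFS= read -r heard; do
    reply=$(respond "$heard")
    echo "agent: ${reply#GOAL_ACHIEVED }"
    case "$reply" in
      GOAL_ACHIEVED*) echo "status: goal achieved"; return 0 ;;
    esac
  done
  echo "status: caller hung up"
}
```

Running `printf 'hello\nyour booking is confirmed\n' | run_call_loop` ends with `status: goal achieved`, while input that never mentions a confirmation ends with `status: caller hung up`.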
Examples
Book a hotel reservation
concierge call "+1-800-HILTON" \
  --goal "Book a room for 2 nights" \
  --name "Sarah Johnson" \
  --email "sarah@example.com" \
  --customer-phone "+1-555-000-2222" \
  --context "Check-in: March 10, Guest: Sarah Johnson, King bed, non-smoking"
Make a restaurant reservation
concierge call "+1-555-DINER" \
  --goal "Reserve a table for dinner" \
  --name "Garcia" \
  --email "garcia@example.com" \
  --customer-phone "+1-555-000-3333" \
  --context "Party of 4, 7:30 PM, Saturday, name: Garcia"
Cancel an appointment
concierge call "+1-555-DOCTOR" \
  --goal "Cancel appointment" \
  --name "Mike Chen" \
  --email "mike@example.com" \
  --customer-phone "+1-555-000-4444" \
  --context "Patient: Mike Chen, Appointment on Tuesday at 2 PM"
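The numbers in these examples are placeholders written with hyphens and vanity letters. The source does not say whether concierge normalizes such input, so it may be safer to pass plain digits in E.164 style. The sketch below maps letters to keypad digits and strips everything except `+` and digits; `to_e164` is a hypothetical helper and does not validate number length or country code.

```shell
# Convert a human-friendly number to "+digits" form:
# letters become keypad digits, separators are removed.
to_e164() {
  printf '%s' "$1" \
    | tr 'a-z' 'A-Z' \
    | tr 'A-Z' '22233344455566677778889999' \
    | tr -cd '+0-9'
  printf '\n'
}
```

For example, `to_e164 "+1-555-123-4567"` prints `+15551234567`, and the result can be passed as the first argument to `concierge call`.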
Supported Voice IDs
Some popular ElevenLabs voices:
- EXAVITQu4vr4xnSDxMaL - Rachel (default, conversational female)
- pNInz6obpgDQGcFmaJgB - Adam (conversational male)
- 21m00Tcm4TlvDq8ikWAM - Rachel (narration)
- AZnzlk1XvdvUeBnXmlld - Domi (young female)
Set your preferred voice:
concierge config set elevenLabsVoiceId <voice-id>
Latency
Target voice-to-voice latency: < 500ms
- Deepgram STT: ~150ms
- Response generation: ~100-200ms
- ElevenLabs TTS: ~75ms
- Network: ~50ms
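Adding up the per-stage estimates above gives roughly 375 to 475 ms, depending on response generation time, which leaves at least 25 ms of headroom under the 500 ms target. The arithmetic can be checked with a few lines of shell; `latency_total` is an illustrative helper name.

```shell
# Sum per-stage latencies given in milliseconds.
latency_total() {
  total=0
  for ms in "$@"; do
    total=$((total + ms))
  done
  echo "$total"
}

# Stages: Deepgram STT, response generation, ElevenLabs TTS, network.
best=$(latency_total 150 100 75 50)    # fastest response generation
worst=$(latency_total 150 200 75 50)   # slowest response generation
echo "best=${best}ms worst=${worst}ms target=500ms"
if [ "$worst" -lt 500 ]; then
  echo "within target"
else
  echo "over target"
fi
```

These are the document's approximate figures, so the result is a budget estimate and not a measurement.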
Troubleshooting
Server won't start
- Check all config keys are set: concierge config show
- If using manual mode, ensure ngrok is running and URL is correct
- Check port 3000 is available
Call not connecting
- Verify Twilio phone number is active
- Check Twilio account has sufficient balance
- Ensure ngrok URL is publicly accessible (manual mode)
TTS fails mid-call
- Check ElevenLabs quota/credits.
- The preflight check usually catches this before dialing.
- If it still happens, reduce prompt/context length or top up ElevenLabs.
Audio quality issues
- ElevenLabs uses optimized phone call settings
- Deepgram uses the phone call model
- Audio is at 8kHz (telephone quality)