Output sanitization for agent responses - prevents accidental secret leaks
git clone https://github.com/openclaw/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/arc-claw-bot/arc-shield" ~/.claude/skills/openclaw-skills-arc-shield && rm -rf "$T"
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/arc-claw-bot/arc-shield" ~/.openclaw/skills/openclaw-skills-arc-shield && rm -rf "$T"
arc-shield
Output sanitization for agent responses. Scans ALL outbound messages for leaked secrets, tokens, keys, passwords, and PII before they leave the agent.
⚠️ This is NOT an input scanner; clawdefender already handles that. This is an OUTPUT filter for catching things your agent accidentally includes in its own responses.
Why You Need This
Agents have access to sensitive data: 1Password vaults, environment variables, config files, wallet keys. Sometimes they accidentally include these in responses when:
- Debugging and showing full command output
- Copying file contents that contain secrets
- Generating code examples with real credentials
- Summarizing logs that include tokens
Arc-shield catches these leaks before they reach Discord, Signal, X, or any external channel.
What It Detects
🔴 CRITICAL (blocks in --strict mode)
- API Keys & Tokens: 1Password (ops_*), GitHub (ghp_*), OpenAI (sk-*), Stripe, AWS, Bearer tokens
- Passwords: Assignments like password=... or passwd: ...
- Private Keys: Ethereum (0x + 64 hex), SSH keys, PGP blocks
- Wallet Mnemonics: 12/24 word recovery phrases
- PII: Social Security Numbers, credit card numbers
- Platform Tokens: Slack, Telegram, Discord
🟠 HIGH (warns loudly)
- High-entropy strings: Shannon entropy > 4.5 for strings > 16 chars (catches novel secret patterns)
- Credit cards: 16-digit card numbers
- Base64 credentials: Long base64 strings that look like tokens
🟡 WARN (informational)
- Secret file paths: ~/.secrets/*, paths containing "password", "token", "key"
- Environment variables: ENV_VAR=secret_value exports
- Database URLs: Connection strings with credentials
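To make the "connection strings with credentials" item concrete, here is a minimal sketch of one way to spot a URL that embeds a username or password. It uses Python's standard urllib.parse rather than a regex. The helper name url_has_credentials is hypothetical, and this is not necessarily how arc-shield itself detects database URLs.

```python
from urllib.parse import urlsplit


def url_has_credentials(url: str) -> bool:
    """Return True if the URL embeds a username or password before the host."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example a broken IPv6 host) are not flagged here.
        return False
    return bool(parts.username or parts.password)


if __name__ == "__main__":
    for candidate in (
        "postgres://admin:s3cret@db.example.com:5432/app",
        "https://example.com/docs",
    ):
        print(candidate, "->", url_has_credentials(candidate))
```

A check like this also flags URLs that carry only a username, so a real filter would probably combine it with a severity decision or an allowlist.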
Installation
    cd ~/.openclaw/workspace/skills
    git clone <arc-shield-repo> arc-shield
    chmod +x arc-shield/scripts/*.sh arc-shield/scripts/*.py
Or download as a skill bundle.
Usage
Command-line
    # Scan agent output before sending
    cat agent-response.txt | arc-shield.sh

    # Block if critical secrets found (use before external messaging)
    echo "Message text" | arc-shield.sh --strict || echo "BLOCKED"

    # Redact secrets and return sanitized text
    cat response.txt | arc-shield.sh --redact

    # Full report
    arc-shield.sh --report < conversation.log

    # Python version with entropy detection
    cat message.txt | output-guard.py --strict
Integration with OpenClaw Agents
Pre-send hook (recommended)
Add to your messaging skill or wrapper:
    #!/bin/bash
    # send-message.sh wrapper
    MESSAGE="$1"
    CHANNEL="$2"

    # Sanitize output
    SANITIZED=$(echo "$MESSAGE" | arc-shield.sh --strict --redact)
    EXIT_CODE=$?

    if [[ $EXIT_CODE -eq 1 ]]; then
      echo "ERROR: Message contains critical secrets and was blocked." >&2
      exit 1
    fi

    # Send sanitized message
    openclaw message send --channel "$CHANNEL" "$SANITIZED"
Manual pipe
Before any external message:
    # Generate response
    RESPONSE=$(agent-generate-response)

    # Sanitize
    CLEAN=$(echo "$RESPONSE" | arc-shield.sh --redact)

    # Send
    signal send "$CLEAN"
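If your agent's send path is written in Python rather than bash, the same pipe-then-check idea can be sketched with subprocess. This is a minimal sketch, not part of arc-shield: the sanitize function and BlockedMessage exception are hypothetical names, and the filter command is passed in by the caller (for example ["arc-shield.sh", "--strict", "--redact"], assuming the script is on your PATH).

```python
import subprocess


class BlockedMessage(Exception):
    """Raised when the filter command exits with a non-zero status."""


def sanitize(message: str, filter_cmd: list[str]) -> str:
    """Pipe message through filter_cmd on stdin and return the filter's stdout."""
    result = subprocess.run(
        filter_cmd,
        input=message,
        capture_output=True,
        text=True,
        check=False,  # inspect the exit code ourselves instead of raising
    )
    if result.returncode != 0:
        # Treat every non-zero exit as a block, not only exit code 1.
        raise BlockedMessage(result.stderr.strip() or "message blocked")
    return result.stdout
```

Raising on any non-zero exit is a deliberate fail-closed choice: if the filter crashes, the caller stops instead of sending something unchecked.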
Testing
    cd skills/arc-shield/tests
    ./run-tests.sh
Includes test cases for:
- Real leaked patterns (1Password tokens, Instagram passwords, wallet mnemonics)
- False positive prevention (normal URLs, email addresses, file paths)
- Redaction accuracy
- Strict mode blocking
Configuration
Patterns are defined in config/patterns.conf:

    CRITICAL|GitHub PAT|ghp_[a-zA-Z0-9]{36,}
    CRITICAL|OpenAI Key|sk-[a-zA-Z0-9]{20,}
    WARN|Secret Path|~\/\.secrets\/[^\s]*
Edit to add custom patterns or adjust severity levels.
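To show how a file in this SEVERITY|Name|regex format can be consumed, here is a minimal Python sketch that parses the lines and scans a string. The names Pattern, parse_patterns and scan are hypothetical, and skipping blank lines and "#" comments is an assumption. The real scripts may load the file differently and may use a different regex dialect.

```python
import re
from typing import NamedTuple


class Pattern(NamedTuple):
    severity: str
    name: str
    regex: re.Pattern


def parse_patterns(text: str) -> list[Pattern]:
    """Parse SEVERITY|Name|regex lines, skipping blanks and '#' comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first two pipes only: the regex itself may contain '|'.
        severity, name, regex = line.split("|", 2)
        patterns.append(Pattern(severity, name, re.compile(regex)))
    return patterns


def scan(text: str, patterns: list[Pattern]) -> list[tuple[str, str, str]]:
    """Return (severity, name, matched_text) for every match, in pattern order."""
    return [
        (p.severity, p.name, m.group(0))
        for p in patterns
        for m in p.regex.finditer(text)
    ]
```

Splitting with a maximum of two splits matters as soon as a custom pattern uses alternation such as (foo|bar).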
Modes
| Mode | Behavior | Exit Code | Use Case |
|---|---|---|---|
| Default | Pass through + warnings to stderr | 0 | Development, logging |
| --strict | Block on CRITICAL findings | 1 if critical | Production outbound messages |
| --redact | Replace secrets with a placeholder | 0 | Safe logging, auditing |
| --report | Analysis only, no pass-through | 0 | Auditing conversations |
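The table can be read as a small decision function. The sketch below models the four modes and their exit codes in Python. It is an illustration only: the two patterns, the "[REDACTED]" placeholder, the truncated report format, and the choice to emit nothing when strict mode blocks are all assumptions, not the scripts' actual behavior.

```python
import re
import sys

# Hypothetical stand-in patterns and placeholder, for illustration only.
PATTERNS = [
    ("CRITICAL", re.compile(r"ghp_[a-zA-Z0-9]{36,}")),
    ("WARN", re.compile(r"~/\.secrets/\S*")),
]
PLACEHOLDER = "[REDACTED]"


def run_mode(text: str, mode: str = "default") -> tuple[str, int]:
    """Return (stdout_text, exit_code) for one of the four modes in the table."""
    findings = [(sev, m.group(0)) for sev, rx in PATTERNS for m in rx.finditer(text)]
    if mode == "strict":
        # Block (exit 1, no output) only when a CRITICAL finding is present.
        blocked = any(sev == "CRITICAL" for sev, _ in findings)
        return ("" if blocked else text, 1 if blocked else 0)
    if mode == "redact":
        for _, rx in PATTERNS:
            text = rx.sub(PLACEHOLDER, text)
        return text, 0
    if mode == "report":
        # Show only a short prefix so the report does not repeat the secret.
        return "\n".join(f"{sev}: {hit[:4]}..." for sev, hit in findings), 0
    # Default: pass the text through unchanged; warnings go to stderr.
    for sev, hit in findings:
        print(f"{sev}: {hit[:4]}...", file=sys.stderr)
    return text, 0
```

Only strict mode ever returns a non-zero exit code here, which is why it is the one to gate outbound sends on.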
Entropy Detection
The Python version (output-guard.py) includes Shannon entropy analysis to catch secrets that don't match regex patterns:

    # Detects high-entropy strings like:
    kJ8nM2pQ5rT9vWxY3zA6bC4dE7fG1hI0   # Novel API key format
    Zm9vOmJhcg==                       # Base64 credentials

Threshold: 4.5 bits per character (configurable with --entropy-threshold)
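For readers unfamiliar with the measure, here is a minimal sketch of a Shannon entropy check. It assumes the threshold is in bits per character and applies the length rule from the detection list (strings longer than 16 characters). The function names are hypothetical, and output-guard.py may tokenize and score text differently.

```python
import math
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Return the Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def looks_high_entropy(token: str, threshold: float = 4.5, min_len: int = 16) -> bool:
    """Flag tokens longer than min_len whose entropy exceeds threshold."""
    return len(token) > min_len and shannon_entropy(token) > threshold


if __name__ == "__main__":
    for token in ("kJ8nM2pQ5rT9vWxY3zA6bC4dE7fG1hI0", "aaaaaaaaaaaaaaaaaaaaaaaa"):
        print(token, round(shannon_entropy(token), 2), looks_high_entropy(token))
```

Under this length rule a 12-character string such as Zm9vOmJhcg== is too short to be flagged by entropy alone, so short base64 credentials presumably rely on a separate pattern.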
Performance
- Bash version: ~10ms for typical message (< 1KB)
- Python version: ~50ms with entropy analysis
- Zero external dependencies: bash + Python stdlib only
Fast enough to run on every outbound message without noticeable delay.
Real-World Catches
From our own agent sessions:
    # 1Password token
    "ops_eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."

    # Instagram password in debug output
    "instagram login: user@example.com / MyInsT@Gr4mP4ss!"

    # Wallet mnemonic in file listing
    "cat ~/.secrets/wallet-recovery-phrase.txt abandon ability able about above absent absorb abstract..."

    # GitHub PAT in git config
    "[remote "origin"] url = https://ghp_abc123:@github.com/user/repo"
All blocked by arc-shield before reaching external channels.
Best Practices
- Always use --strict for external messages (Discord, Signal, X, email)
- Use --redact for logs you want to review later
- Run tests after adding custom patterns to check for false positives
- Pipe through both bash and Python versions for maximum coverage: echo "$MESSAGE" | arc-shield.sh --strict | output-guard.py --strict
- Don't rely on this alone: educate your agent to avoid including secrets in the first place (see AGENTS.md output sanitization directive)
Limitations
- Context-free: Can't distinguish between "here's my password: X" (bad) and "set your password to X" (instruction)
- No semantic understanding: Won't catch "my token is in the previous message"
- Pattern-based: New secret formats require pattern updates
Use in combination with agent instructions and careful prompt engineering.
Integration Example
Full OpenClaw agent integration:
    # In your agent's message wrapper
    send_external_message() {
      local message="$1"
      local channel="$2"

      # Pre-flight sanitization
      if ! echo "$message" | arc-shield.sh --strict > /dev/null 2>&1; then
        echo "ERROR: Message blocked by arc-shield (contains secrets)" >&2
        return 1
      fi

      # Double-check with entropy detection
      if ! echo "$message" | output-guard.py --strict > /dev/null 2>&1; then
        echo "ERROR: High-entropy secret detected" >&2
        return 1
      fi

      # Safe to send
      openclaw message send --channel "$channel" "$message"
    }
Troubleshooting
False positives on normal text:
- Adjust entropy threshold: output-guard.py --entropy-threshold 5.0
- Edit config/patterns.conf to refine regex patterns
- Add exceptions to the pattern file
Secrets not detected:
- Check pattern file for coverage
- Run with --report to see what's being scanned
- Test with tests/run-tests.sh using your sample
- Consider lowering entropy threshold (but watch for false positives)
Performance issues:
- Use bash version only (skip entropy detection)
- Limit input size with head -c 10000
- Run in background: arc-shield.sh --report &
Contributing
Add new patterns to config/patterns.conf following the format:

    SEVERITY|Category Name|regex_pattern

Test with tests/run-tests.sh before deploying.
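Before adding a line to the pattern file, it can help to try the regex against a few strings it should and should not match. The following is a minimal sketch of such a check. The check_pattern_line helper is hypothetical, the severity set is taken from the three tiers described above, and Python's re dialect may differ from the one the bash script uses, so treat it as a first pass before tests/run-tests.sh.

```python
import re

# Severity tiers as described in this README; assumed to be the accepted values.
VALID_SEVERITIES = {"CRITICAL", "HIGH", "WARN"}


def check_pattern_line(line, should_match, should_not_match):
    """Validate one SEVERITY|Category Name|regex line against sample strings.

    Returns a list of problems; an empty list means the line looks usable.
    """
    parts = line.strip().split("|", 2)
    if len(parts) != 3:
        return ["expected three '|'-separated fields"]
    severity, _name, regex = parts
    problems = []
    if severity not in VALID_SEVERITIES:
        problems.append(f"unknown severity: {severity}")
    try:
        compiled = re.compile(regex)
    except re.error as exc:
        return problems + [f"regex does not compile: {exc}"]
    problems += [f"missed: {s}" for s in should_match if not compiled.search(s)]
    problems += [f"false positive: {s}" for s in should_not_match if compiled.search(s)]
    return problems
```

For example, a bare pattern like key would be reported as a false positive on ordinary words such as "monkey", which is a hint to anchor or lengthen it.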
License
MIT — use freely, protect your secrets.
Remember: Arc-shield is your safety net, not your strategy. Train your agent to never include secrets in responses. This tool catches mistakes, not malice.