# Babysitter content-moderation-api

Content moderation API integration using OpenAI Moderation, Perspective API, and others.
## Install

Clone the upstream repo:

```sh
git clone https://github.com/a5c-ai/babysitter
```

Claude Code: install into `~/.claude/skills/`:

```sh
T=$(mktemp -d) && git clone --depth=1 https://github.com/a5c-ai/babysitter "$T" && mkdir -p ~/.claude/skills && cp -r "$T/library/specializations/ai-agents-conversational/skills/content-moderation-api" ~/.claude/skills/a5c-ai-babysitter-content-moderation-api && rm -rf "$T"
```
Manifest: `library/specializations/ai-agents-conversational/skills/content-moderation-api/SKILL.md`
# Content Moderation API Skill

## Capabilities
- Integrate OpenAI Moderation API
- Set up Perspective API for toxicity detection
- Configure moderation thresholds
- Implement content filtering pipelines
- Design moderation response handling
- Create moderation logging and reporting
## Target Processes
- content-moderation-safety
- system-prompt-guardrails
## Implementation Details

### Moderation APIs
- OpenAI Moderation: Hate, violence, self-harm, sexual content
- Perspective API: Toxicity, insult, profanity, threat
- Azure Content Safety: Text and image moderation
- LlamaGuard: Open-source safety classifier
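As a minimal sketch of working with the first of these, assuming the standard OpenAI Moderation response shape (a `results` list whose entries carry per-category `category_scores`), a helper that extracts the categories exceeding a threshold might look like the following; the function name and the 0.5 default are illustrative, not part of the skill:

```python
def flagged_categories(moderation_response: dict, threshold: float = 0.5) -> dict:
    """Return {category: score} for every category whose score meets or
    exceeds the threshold in an OpenAI Moderation API response body."""
    result = moderation_response["results"][0]
    return {
        category: score
        for category, score in result["category_scores"].items()
        if score >= threshold
    }

# Example response body, truncated to two categories for illustration.
response = {
    "results": [{
        "flagged": True,
        "categories": {"hate": True, "violence": False},
        "category_scores": {"hate": 0.91, "violence": 0.02},
    }]
}

print(flagged_categories(response))  # → {'hate': 0.91}
```

Keeping the parsing separate from the API call makes the threshold logic easy to unit-test without network access.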
### Configuration Options
- API credentials and endpoints
- Category thresholds
- Action policies (block, warn, flag)
- Logging configuration
- Fallback behavior
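One way to express these options is a small config object plus a policy function. The names, defaults, and schema below are illustrative assumptions, not the skill's actual configuration format:

```python
from dataclasses import dataclass, field

@dataclass
class ModerationConfig:
    # Per-category score thresholds; scores at or above trigger the action.
    thresholds: dict = field(default_factory=lambda: {"hate": 0.8, "violence": 0.7})
    # Action taken when any threshold is crossed: "block", "warn", or "flag".
    action: str = "block"
    # Behavior if the moderation API call fails: "allow" or "block".
    fallback: str = "allow"

def decide(scores: dict, config: ModerationConfig) -> str:
    """Map category scores to an action under the configured thresholds."""
    for category, limit in config.thresholds.items():
        if scores.get(category, 0.0) >= limit:
            return config.action
    return "allow"
```

For example, `decide({"hate": 0.9}, ModerationConfig())` crosses the hate threshold and returns `"block"`, while low scores fall through to `"allow"`.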
## Best Practices

- Set category thresholds appropriate to your application and audience
- Handle edge cases (API errors, empty or non-text input) gracefully
- Log every moderation decision for auditing and appeals
- Review thresholds regularly against observed false positives and negatives
- Layer multiple moderation checks rather than relying on a single classifier
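The last two practices can be combined in a pipeline where each layer votes in order and every verdict is logged. This is a sketch under assumed conventions (verdict strings `"allow"`/`"flag"`/`"block"`, first non-allow layer wins); the stub layers stand in for real API calls:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("moderation")

def moderate(text: str, layers) -> str:
    """Run text through moderation layers in order; the first layer that
    returns a non-"allow" verdict decides the outcome. Every verdict is
    logged so decisions can be audited later."""
    for name, check in layers:
        verdict = check(text)  # expected: "allow", "flag", or "block"
        log.info("layer=%s verdict=%s text=%r", name, verdict, text[:80])
        if verdict != "allow":
            return verdict
    return "allow"

# Stub layers standing in for real checks: a cheap keyword filter first,
# then a (pretend) classifier that flags overly long input.
layers = [
    ("keyword", lambda t: "block" if "badword" in t else "allow"),
    ("classifier", lambda t: "flag" if len(t) > 500 else "allow"),
]
```

Ordering cheap deterministic filters before paid API calls keeps latency and cost down for obviously bad input.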
## Dependencies
- openai
- google-api-python-client (Perspective API client)
- azure-ai-contentsafety