Awesome-llm-skills resemble-detect
Deepfake detection and media safety — detect AI-generated audio, images, video, and text, trace synthesis sources, apply watermarks, verify speaker identity, and analyze media intelligence using Resemble AI
git clone https://github.com/Prat011/awesome-llm-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/Prat011/awesome-llm-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/resemble-detect" ~/.claude/skills/prat011-awesome-llm-skills-resemble-detect && rm -rf "$T"
Resemble Detect: Deepfake Detection & Media Safety
Analyze audio, image, video, and text for synthetic manipulation, AI-generated content, watermarks, speaker identity, and media intelligence using the Resemble AI platform.
Core Principle — THE IRON LAW
"NEVER DECLARE MEDIA AS REAL OR FAKE WITHOUT A COMPLETED DETECTION RESULT."
Do not guess, infer, or speculate about media authenticity. Every authenticity claim must be backed by a completed Resemble detect job with a returned label, score, and status: "completed". If the detection is still processing, wait. If it failed, say so; do not substitute your own judgment.
When to Use
Use this skill whenever the user's request involves any of these:
- Checking if audio, video, image, or text is AI-generated or manipulated
- Detecting deepfakes in any media format
- Verifying media authenticity or provenance
- Identifying which AI platform synthesized audio (source tracing)
- Applying or detecting watermarks on media
- Analyzing media for speaker info, emotion, transcription, or misinformation
- Asking natural-language questions about detection results
- Matching or verifying speaker identity against known voice profiles
- Detecting AI-generated or machine-written text
- Any mention of: "deepfake", "fake detection", "synthetic media", "voice verification", "watermark", "media forensics", "authenticity check", "source tracing", "is this real", "AI-written text", "text detection"
Do NOT use for text-to-speech generation, voice cloning, or speech-to-text transcription — those are separate Resemble capabilities.
Capability Decision Tree
| User wants to... | Use this | API endpoint |
|---|---|---|
| Check if media is AI-generated / deepfake | Deepfake Detection | POST /detect |
| Know which AI platform made fake audio | Audio Source Tracing | POST /detect with audio_source_tracing: true flag |
| Get speaker info, emotion, transcription from media | Intelligence | POST /intelligence |
| Ask questions about a completed detection | Detect Intelligence | POST /detects/{detect_uuid}/intelligence |
| Apply an invisible watermark to media | Watermark Apply | POST /watermark/apply |
| Check if media contains a watermark | Watermark Detect | POST /watermark/detect |
| Verify a speaker's identity against known profiles | Identity Search | POST /identity/search |
| Check if text is AI-generated | Text Detection | POST /text_detect |
| Create a voice identity profile for future matching | Identity Create | POST /identity |
When multiple capabilities apply (e.g., user wants deepfake detection AND intelligence), combine them in a single POST /detect call using the intelligence: true flag rather than making separate requests.
Required Setup
- API Key: Bearer token from the Resemble AI dashboard
- Base URL: https://app.resemble.ai/api/v2
- Auth Header: Authorization: Bearer <RESEMBLE_API_KEY>
- Media Requirement: All media must be at a publicly accessible HTTPS URL
If the user provides a local file path instead of a URL, inform them the file must be hosted at a public HTTPS URL first. Do not attempt to upload local files to the API.
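The URL requirement above can be checked before any request is made. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the check only confirms the string is an absolute https:// URL. It cannot confirm that the URL is actually reachable from the public internet.

```python
from urllib.parse import urlparse


def is_https_url(candidate: str) -> bool:
    """Return True only for absolute https:// URLs that include a host."""
    parsed = urlparse(candidate)
    return parsed.scheme == "https" and bool(parsed.netloc)


def require_media_url(candidate: str) -> str:
    """Hypothetical guard: reject local paths and non-HTTPS URLs up front."""
    if not is_https_url(candidate):
        raise ValueError(
            f"{candidate!r} is not an HTTPS URL; host the file at a public "
            "HTTPS URL before submitting it"
        )
    return candidate
```

Calling require_media_url("/home/user/clip.wav") raises ValueError, which is the point at which to tell the user the file must be hosted first.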
MCP Tools Available
When the Resemble MCP server is connected, use these tools instead of raw API calls:
| Tool | Purpose |
|---|---|
| resemble_docs_lookup | Get comprehensive docs for any detect sub-topic |
| | Search across all documentation |
| resemble_api_endpoint | Get exact OpenAPI spec for any endpoint |
| | Find endpoints by keyword |
| | Read specific documentation pages |
| | List all available topics |
Tool usage pattern: Use resemble_docs_lookup with topic "detect" to get the full picture, then resemble_api_endpoint for exact request/response schemas before making API calls.
Phase 1: Deepfake Detection
The core capability. Submit any audio, image, or video for AI-generated content analysis.
Submit a Detection
POST /detect Content-Type: application/json Authorization: Bearer <API_KEY> { "url": "https://example.com/media.mp4", "visualize": true, "intelligence": true, "audio_source_tracing": true }
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | HTTPS URL to audio, image, or video file |
| callback_url | string | No | Webhook URL for async completion notification |
| visualize | boolean | No | Generate heatmap/visualization artifacts |
| intelligence | boolean | No | Run multimodal intelligence analysis alongside detection |
| audio_source_tracing | boolean | No | Identify which AI platform synthesized fake audio |
| | integer | No | Audio/video analysis window size in seconds (1–4, default 2) |
| | number | No | Start of segment to analyze (seconds) |
| | number | No | End of segment to analyze (seconds) |
| | string | No | or (for face-swap detection) |
| use_reverse_search | boolean | No | Enable reverse image search (image only) |
| | boolean | No | Enable out-of-distribution detection |
| zero_retention_mode | boolean | No | Auto-delete media after detection completes |
Supported formats:
- Audio: WAV, MP3, OGG, M4A, FLAC
- Video: MP4, MOV, AVI, WMV
- Image: JPG, PNG, GIF, WEBP
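As a rough illustration of the submission described above, the sketch below builds the POST /detect request with Python's standard urllib. The helper name build_detect_request and the placeholder key are assumptions for illustration. The base URL, headers, and flag names are taken from this document. The request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://app.resemble.ai/api/v2"  # base URL from Required Setup


def build_detect_request(api_key: str, media_url: str, **flags) -> urllib.request.Request:
    """Build (but do not send) a POST /detect request with optional flags."""
    payload = {"url": media_url, **flags}
    return urllib.request.Request(
        f"{BASE_URL}/detect",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_detect_request(
    "YOUR_API_KEY",  # placeholder, not a real key
    "https://example.com/media.mp4",
    visualize=True,
    intelligence=True,
    audio_source_tracing=True,
)
# To actually submit (requires a valid key and network access):
# with urllib.request.urlopen(req) as resp:
#     created = json.load(resp)
```

Keeping request construction separate from sending makes it easy to inspect the payload first, and to combine flags in one call rather than issuing separate requests.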
Poll for Results
Detection is asynchronous. Poll GET /detect/{uuid} until status is "completed" or "failed".
GET /detect/{uuid} Authorization: Bearer <API_KEY>
Polling best practice: Start at 2s intervals, back off to 5s, then 10s. Most detections complete within 10–60 seconds depending on media length.
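One way to implement the polling schedule above is sketched below. The fetch callable is hypothetical: it stands in for whatever performs GET /detect/{uuid}, and it is assumed to return the detection object as a dict with a top-level "status" key (the real response may nest it). The number of polls spent at each interval is also an assumption.

```python
import time
from typing import Callable, Iterator


def poll_delays() -> Iterator[float]:
    """Yield 2s a few times, then 5s a few times, then 10s indefinitely."""
    for _ in range(3):  # how long to stay at each step is an assumption
        yield 2.0
    for _ in range(3):
        yield 5.0
    while True:
        yield 10.0


def wait_for_detection(
    fetch: Callable[[], dict],
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until status is completed or failed, backing off between calls."""
    waited = 0.0
    for delay in poll_delays():
        result = fetch()
        if result.get("status") in ("completed", "failed"):
            return result
        if waited + delay > timeout:
            raise TimeoutError(f"detection still not finished after {waited:.0f}s")
        sleep(delay)
        waited += delay
    raise AssertionError("unreachable: poll_delays() never ends")
```

The sleep parameter is injectable so the loop can be exercised without real waiting. A "failed" status is returned to the caller rather than raised, so it can be reported as a failure instead of being replaced with a guess.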
Reading Results by Media Type
Audio results, in metrics:
{ "label": "fake", "score": ["0.92", "0.88", "0.95"], "consistency": "0.91", "aggregated_score": "0.92", "image": "https://..." }
- label: "fake" or "real", the verdict
- score: Per-chunk prediction scores (array)
- aggregated_score: Overall confidence (0.0–1.0, higher = more likely synthetic)
- consistency: How consistent the prediction is across chunks
- image: Visualization heatmap URL (if visualize: true)
Image results, in image_metrics:
{ "type": "ImageAnalysis", "label": "fake", "score": 0.87, "image": "https://...", "ifl": { "score": 0.82, "heatmap": "https://..." }, "reverse_image_search_sources": [ { "url": "...", "title": "...", "verdict": "known_fake", "similarity": 0.95 } ] }
- label / score: Verdict and confidence
- ifl: Invisible Frequency Layer analysis with heatmap
- reverse_image_search_sources: Known sources found online (if use_reverse_search: true)
Video results, in video_metrics:
{ "label": "fake", "score": 0.89, "certainty": 0.91, "children": [ { "type": "VideoResult", "conclusion": "Fake", "score": 0.89, "timestamp": 2.5, "children": [...] } ] }
- Hierarchical tree of frame-level and segment-level results
- Each child has timestamp, score, certainty, and may have nested children
- Video with audio track returns both metrics (audio) and video_metrics (visual)
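Because video_metrics is a tree, a small recursive walk is a convenient way to find where in the video the strongest signal occurs. The sketch below assumes each node is a dict with an optional "children" list, as in the sample above. The nested values in the example data are invented for illustration, since the sample elides them.

```python
from typing import Iterator, Optional


def iter_nodes(node: dict, depth: int = 0) -> Iterator[tuple[int, dict]]:
    """Depth-first traversal yielding (depth, node) for the whole result tree."""
    yield depth, node
    for child in node.get("children", []):
        yield from iter_nodes(child, depth + 1)


def most_suspicious(video_metrics: dict) -> Optional[dict]:
    """Return the timestamped node with the highest score, or None if there is none."""
    timed = [n for _, n in iter_nodes(video_metrics) if "timestamp" in n and "score" in n]
    return max(timed, key=lambda n: n["score"], default=None)


# Illustrative data only: the nested children are made up for this example.
sample = {
    "label": "fake", "score": 0.89, "certainty": 0.91,
    "children": [
        {"type": "VideoResult", "conclusion": "Fake", "score": 0.89, "timestamp": 2.5,
         "children": [{"score": 0.95, "timestamp": 2.75, "certainty": 0.9}]},
        {"score": 0.4, "timestamp": 5.0},
    ],
}
peak = most_suspicious(sample)
```

Here peak is the nested node at timestamp 2.75, which could be reported alongside the top-level label and score.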
Interpreting Scores
| Score Range | Interpretation |
|---|---|
| 0.0 – 0.3 | Strong indication of authentic/real media |
| 0.3 – 0.5 | Inconclusive — recommend additional analysis |
| 0.5 – 0.7 | Likely synthetic — flag for review |
| 0.7 – 1.0 | High confidence synthetic/AI-generated |
Always present scores with context. Say "The detection returned a score of 0.87, indicating high confidence that this audio is AI-generated" — never just "it's fake."
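The score table can be applied mechanically, as in the sketch below. The table's ranges share their endpoints, so this sketch assumes a boundary value (0.3, 0.5, 0.7) belongs to the higher band. That is a cautious choice made here, not something the table specifies. The function names are hypothetical.

```python
def score_band(score) -> str:
    """Map a 0.0-1.0 score (float or numeric string) to an interpretation band."""
    value = float(score)  # audio scores appear as strings in the sample results
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"score out of range: {value}")
    if value < 0.3:
        return "strong indication of authentic media"
    if value < 0.5:
        return "inconclusive"
    if value < 0.7:
        return "likely synthetic"
    return "high confidence synthetic"


def describe(score, media_kind: str) -> str:
    """Produce a sentence that always carries the score alongside the band."""
    value = float(score)
    return f"The detection returned a score of {value:.2f} for this {media_kind}: {score_band(value)}."
```

For example, describe("0.87", "audio") yields a sentence that includes both the number and its band, never the bare label.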
Phase 2: Intelligence — Media Analysis
Analyze media for rich structured insights independent of or alongside detection.
Standalone Intelligence
POST /intelligence Content-Type: application/json Authorization: Bearer <API_KEY> { "url": "https://example.com/audio.mp3", "json": true }
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | One of | HTTPS URL to media file |
| | string | One of | Token from secure upload (alternative to URL) |
| | string | No | UUID of existing detect to associate |
| | string | No | , , or (auto-detected) |
| json | boolean | No | Return structured fields (default: false for audio/video, true for image) |
| callback_url | string | No | Webhook for async mode |
Audio/Video structured response (json: true):
- speaker_info: speaker description (age, gender)
- language / dialect: detected language
- emotion: detected emotional state
- speaking_style: conversational, formal, etc.
- context: inferred context of the speech
- message: content summary
- abnormalities: anomalies detected in the media
- transcription: full transcript
- translation: translation if non-English
- misinformation: misinformation analysis
Image structured response:
- scene_description: what the image shows
- subjects: people/objects identified
- authenticity_analysis: visual authenticity assessment
- context_and_setting: environment description
- abnormalities: visual anomalies
- misinformation: misinformation analysis
Detect Intelligence — Ask Questions About Results
After a detection completes, ask natural-language questions about it:
POST /detects/{detect_uuid}/intelligence Content-Type: application/json Authorization: Bearer <API_KEY> { "query": "How confident is the model that this audio is fake?" }
This returns a question UUID. Poll GET /detects/{detect_uuid}/intelligence/{question_uuid} until status is "completed" to get the answer.
Good questions to suggest:
- "Summarize the detection results in plain language"
- "What specific indicators suggest this is AI-generated?"
- "How do the audio and video detection results differ?"
- "What is the confidence level and what does it mean?"
- "Are there any inconsistencies in the analysis?"
Status flow: pending → processing → completed (or failed)
Prerequisite: The detection must have status: "completed". Submitting a question against a processing or failed detection returns a 422 error.
Phase 3: Audio Source Tracing
When audio is detected as synthetic (label: "fake"), identify which AI platform generated it.
Enable it by setting audio_source_tracing: true in the POST /detect request.
Result appears in the detection response under audio_source_tracing:
{ "label": "elevenlabs", "error_message": null }
Known source labels include: resemble_ai, elevenlabs, real, and others as the model expands.
Important: Source tracing only runs when audio is labeled as "fake". If the audio is "real", no source tracing result will appear.
Standalone query:
- GET /audio_source_tracings: list all source tracing reports
- GET /audio_source_tracings/{uuid}: get specific report
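Since the tracing block is absent for audio labeled "real", code reading it should treat a missing block as normal rather than as an error. A minimal sketch follows, assuming the detection object is a dict carrying the audio_source_tracing block shown above. The function name is hypothetical.

```python
from typing import Optional


def traced_source(detection: dict) -> Optional[str]:
    """Return the source platform label, or None when no tracing result exists."""
    tracing = detection.get("audio_source_tracing")
    if not tracing:
        return None  # expected when the audio was labeled "real"
    if tracing.get("error_message"):
        raise RuntimeError(f"source tracing failed: {tracing['error_message']}")
    return tracing.get("label")
```

A None result should be reported as "no source tracing result available", not as evidence about the audio's origin.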
Phase 4: Watermarking
Apply invisible watermarks to media for provenance tracking, or detect existing watermarks.
Apply a Watermark
POST /watermark/apply Content-Type: application/json Authorization: Bearer <API_KEY> Prefer: wait { "url": "https://example.com/image.png", "strength": 0.3, "custom_message": "my-organization" }
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | Yes | HTTPS URL to media file |
| strength | number | No | Watermark strength 0.0–1.0 (image/video only, default 0.2) |
| custom_message | string | No | Custom message to embed (image/video only, default "resembleai") |
- Add Prefer: wait header for synchronous response
- Without it, poll GET /watermark/apply/{uuid}/result
- Response includes watermarked_media URL to download the watermarked file
Detect a Watermark
POST /watermark/detect Content-Type: application/json Authorization: Bearer <API_KEY> Prefer: wait { "url": "https://example.com/suspect-image.png" }
Audio detection result:
{ "has_watermark": true, "confidence": 0.95 }
Image/Video detection result:
{ "has_watermark": true }
Phase 5: Identity — Speaker Verification (Beta)
Create voice identity profiles and match incoming audio against them.
Beta feature — requires joining the preview program. Inform the user if they encounter access errors.
Create an Identity Profile
POST /identity Content-Type: application/json Authorization: Bearer <API_KEY> { "audio_url": "https://example.com/known-speaker.wav", "name": "Jane Doe" }
Search Against Known Identities
POST /identity/search Content-Type: application/json Authorization: Bearer <API_KEY> { "audio_url": "https://example.com/unknown-speaker.wav", "top_k": 5 }
Response:
{ "success": true, "item": [ { "uuid": "...", "name": "Jane Doe", "confidence": 0.92, "distance": 0.08 } ] }
Lower distance = closer match. Higher confidence = stronger match.
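A search response like the one above can be reduced to a single candidate as sketched below. The min_confidence cutoff of 0.8 is an arbitrary value chosen for illustration, not a threshold recommended by the API. The function name is hypothetical.

```python
from typing import Optional


def best_match(response: dict, min_confidence: float = 0.8) -> Optional[dict]:
    """Return the highest-confidence identity, or None if nothing clears the cutoff."""
    if not response.get("success"):
        return None
    candidates = response.get("item") or []
    if not candidates:
        return None
    top = max(candidates, key=lambda match: match["confidence"])
    return top if top["confidence"] >= min_confidence else None


search_response = {
    "success": True,
    "item": [
        {"uuid": "...", "name": "Jane Doe", "confidence": 0.92, "distance": 0.08},
    ],
}
match = best_match(search_response)
```

Returning None for weak matches keeps the caller from presenting a low-confidence candidate as a verified identity.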
Phase 6: Text Detection
Detect whether text content is AI-generated or human-written.
Beta feature: requires the detect_beta_user role or a billing plan that includes the dfd_text product.
Submit a Text Detection
POST /text_detect Content-Type: application/json Authorization: Bearer <API_KEY>
Add the Prefer: wait header for a synchronous (blocking) response. Without it, the job runs asynchronously; poll or use a callback.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| | string | Yes | Text to analyze (max 100,000 characters) |
| | string | No | Always use (default) |
| | float | No | Decision threshold 0.0–1.0 (default: 0.5) |
| callback_url | string | No | Webhook URL for async completion notification |
| privacy_mode | boolean | No | If true, text content is not stored after analysis |
Response:
{ "success": true, "item": { "uuid": "abc-123", "status": "completed", "prediction": "ai", "confidence": 0.91, "text_content": "This is some text to analyze.", "privacy_mode": false, "created_at": "...", "updated_at": "..." } }
- prediction: "ai" or "human", the verdict
- confidence: 0.0–1.0, higher = more confident in the prediction
- status: "processing", "completed", or "failed"
Poll for Results
If you did not use Prefer: wait, poll until status is "completed" or "failed":
GET /text_detect/{uuid} Authorization: Bearer <API_KEY>
List Text Detections
GET /text_detect Authorization: Bearer <API_KEY>
Returns paginated text detections for the team.
Callback
If callback_url was provided, a POST is sent on completion:
{ "success": true, "item": { ... } }
On failure:
{ "success": false, "item": { ... }, "error": "Error message here" }
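On the receiving end, the two callback shapes above can be told apart by the success flag. The sketch below parses a raw callback body. It assumes the elided item object mirrors the item in the text detection response (uuid, prediction, confidence), which the document does not confirm. The function name is hypothetical.

```python
import json


def handle_callback(body: bytes) -> str:
    """Turn a raw callback body into a one-line summary for logging or display."""
    payload = json.loads(body)
    item = payload.get("item") or {}
    uuid = item.get("uuid", "<unknown>")
    if not payload.get("success"):
        error = payload.get("error", "no error message provided")
        return f"Text detection {uuid} failed: {error}"
    return (
        f"Text detection {uuid} completed: "
        f"{item.get('prediction')} (confidence {item.get('confidence')})"
    )
```

In a real webhook endpoint this function would sit behind HTTPS and request authentication, in line with the callback security note in this document.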
Recommended Workflows
Full Media Forensics (Most Thorough)
For a comprehensive analysis, combine all capabilities:
- Submit detection with all flags enabled: { "url": "https://example.com/suspect.mp4", "visualize": true, "intelligence": true, "audio_source_tracing": true, "use_reverse_search": true }
- Poll until status: "completed"
- Read metrics / image_metrics / video_metrics for the verdict
- Read intelligence.description for structured media analysis
- If audio labeled "fake", check audio_source_tracing.label for the source platform
- Ask follow-up questions via Detect Intelligence if anything needs clarification
- Check for watermarks via POST /watermark/detect if provenance is relevant
Quick Authenticity Check (Fastest)
For a fast pass/fail:
- Submit minimal detection: { "url": "..." }
- Poll until complete
- Check label and aggregated_score (audio) or label and score (image/video)
- Report result with score context
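The "check" step of the quick workflow can be sketched as a function that pulls the label and score from whichever metrics blocks are present. It assumes the completed detection is a dict with status, metrics, image_metrics, and video_metrics at the top level. The real response may wrap these differently. The function name is hypothetical.

```python
def quick_verdicts(detection: dict) -> dict:
    """Return {"audio"|"image"|"video": (label, score)} for a completed detection."""
    if detection.get("status") != "completed":
        # No verdict may be reported without a completed detection result.
        raise ValueError("detection is not completed; no verdict can be reported")
    verdicts = {}
    audio = detection.get("metrics")
    if audio:
        verdicts["audio"] = (audio["label"], float(audio["aggregated_score"]))
    for name, key in (("image", "image_metrics"), ("video", "video_metrics")):
        visual = detection.get(key)
        if visual:
            verdicts[name] = (visual["label"], float(visual["score"]))
    return verdicts
```

A video with an audio track produces both an "audio" and a "video" entry, so the two verdicts can be reported side by side rather than collapsed into one.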
Provenance Pipeline (Content Creators)
For creators who want to prove their content is authentic:
- Apply watermark to original content: POST /watermark/apply
- Distribute watermarked media
- Later, verify provenance: POST /watermark/detect against any copy
Red Flags — Stop and Reassess
- Declaring authenticity without a detection result: Never say media is real or fake based on visual/auditory inspection alone
- Ignoring the score and reporting only the label: A "fake" label with score 0.51 means something very different from score 0.95
- Submitting local file paths to the API: The API requires publicly accessible HTTPS URLs (does not apply to text detection)
- Sending text longer than 100,000 characters to text detection: Split into chunks or inform the user of the limit
- Polling too aggressively: Start at 2s intervals, then back off to 5s and 10s; do not loop at <1s
- Asking Detect Intelligence questions before detection completes: Results in a 422 error
- Expecting source tracing on "real" audio: Source tracing only runs on audio labeled "fake"
- Treating beta features (Identity, Text Detection) as production-ready: Warn users about beta status
- Ignoring zero_retention_mode for sensitive media: Always suggest this flag when the user indicates the media is sensitive or private
- Making multiple separate API calls when flags can combine: Use intelligence: true and audio_source_tracing: true on the detection call instead of separate requests
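For the text-length red flag in the list above, splitting can be done as sketched below. The sketch counts characters with Python's len(), which is an assumption about how the API measures the 100,000-character limit. It prefers to break at a space so words are not cut in half. The function name is hypothetical.

```python
MAX_CHARS = 100_000  # limit stated for text detection


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters, breaking at spaces when possible."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + limit, len(text))
        if end < len(text):
            cut = text.rfind(" ", start, end)  # last space inside the window
            if cut > start:
                end = cut + 1  # keep the space with the preceding chunk
        chunks.append(text[start:end])
        start = end
    return chunks
```

Each chunk would then be submitted as its own text detection. The per-chunk predictions are separate results and should be reported as such rather than averaged into a single verdict.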
Response Presentation Guidelines
When presenting results to users:
- Lead with the verdict — "The detection indicates this audio is likely AI-generated (score: 0.87)"
- Provide score context — Use the score interpretation table above
- Mention limitations — Detection is probabilistic, not absolute proof
- Include actionable next steps — Suggest intelligence queries, source tracing, or watermark checks as appropriate
- For inconclusive results (0.3–0.5) — Explicitly state the result is inconclusive and recommend additional analysis with different parameters or manual review
- Never present detection as legal evidence — Detection results are analytical tools, not forensic certifications
Error Handling
| Error | Cause | Resolution |
|---|---|---|
| 400 | Invalid request body or missing url | Check required parameters |
| 401 | Invalid or missing API key | Verify the Authorization: Bearer <RESEMBLE_API_KEY> header |
| 404 | Detection UUID not found | Verify the UUID from the creation response |
| 422 | Detection not completed (for Intelligence) | Wait for detection to reach status "completed" |
| 429 | Rate limited | Back off and retry with exponential delay |
| 500 | Server error | Retry once, then report to user |
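The table above can be turned into a small decision function, sketched below. The retry cap of five attempts and the 1-second base delay are assumptions chosen for illustration. Only the retry-or-stop behaviour per status code comes from the table. The function name is hypothetical.

```python
def next_action(status_code: int, attempt: int, max_retries: int = 5) -> tuple:
    """Return ("retry", delay_seconds) or ("stop", reason).

    `attempt` is the number of retries already made for this request (0-based).
    """
    if status_code == 429:
        if attempt < max_retries:
            return ("retry", float(2 ** attempt))  # 1s, 2s, 4s, ... exponential delay
        return ("stop", "rate limited; retries exhausted")
    if status_code == 500:
        if attempt == 0:
            return ("retry", 1.0)  # retry once, then report to the user
        return ("stop", "server error persisted after one retry")
    reasons = {
        400: "invalid request; check required parameters",
        401: "invalid or missing API key",
        404: "detection UUID not found",
        422: "detection not completed yet",
    }
    return ("stop", reasons.get(status_code, f"unexpected status {status_code}"))
```

Separating the decision from the HTTP call keeps the policy easy to read and adjust without touching request code.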
Privacy & Compliance Notes
- Zero retention mode: Set zero_retention_mode: true to auto-delete media after analysis. The URL is redacted and media_deleted is set to true post-completion.
- Text privacy mode: Set privacy_mode: true on text detection to prevent text content from being stored after analysis.
- Data handling: Media URLs and text content are stored by default. For GDPR/compliance-sensitive workflows, enable zero retention (media) or privacy mode (text).
- Callback security: If using callback_url, ensure the endpoint is HTTPS and authenticated on the receiving end.