Awesome-llm-skills resemble-detect

Deepfake detection and media safety — detect AI-generated audio, images, video, and text, trace synthesis sources, apply watermarks, verify speaker identity, and analyze media intelligence using Resemble AI

install
source · Clone the upstream repo
git clone https://github.com/Prat011/awesome-llm-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Prat011/awesome-llm-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/resemble-detect" ~/.claude/skills/prat011-awesome-llm-skills-resemble-detect && rm -rf "$T"
manifest: resemble-detect/SKILL.md
source content

Resemble Detect — Deepfake Detection & Media Safety

Analyze audio, image, video, and text for synthetic manipulation, AI-generated content, watermarks, speaker identity, and media intelligence using the Resemble AI platform.

Core Principle — THE IRON LAW

"NEVER DECLARE MEDIA AS REAL OR FAKE WITHOUT A COMPLETED DETECTION RESULT."

Do not guess, infer, or speculate about media authenticity. Every authenticity claim must be backed by a completed Resemble detect job with a returned label, score, and status: "completed". If the detection is still processing, wait. If it failed, say so — do not substitute your own judgment.

When to Use

Use this skill whenever the user's request involves any of these:

  • Checking if audio, video, image, or text is AI-generated or manipulated
  • Detecting deepfakes in any media format
  • Verifying media authenticity or provenance
  • Identifying which AI platform synthesized audio (source tracing)
  • Applying or detecting watermarks on media
  • Analyzing media for speaker info, emotion, transcription, or misinformation
  • Asking natural-language questions about detection results
  • Matching or verifying speaker identity against known voice profiles
  • Detecting AI-generated or machine-written text
  • Any mention of: "deepfake", "fake detection", "synthetic media", "voice verification", "watermark", "media forensics", "authenticity check", "source tracing", "is this real", "AI-written text", "text detection"

Do NOT use for text-to-speech generation, voice cloning, or speech-to-text transcription — those are separate Resemble capabilities.

Capability Decision Tree

| User wants to... | Use this | API endpoint |
| --- | --- | --- |
| Check if media is AI-generated / deepfake | Deepfake Detection | POST /detect |
| Know which AI platform made fake audio | Audio Source Tracing | POST /detect with the audio_source_tracing flag |
| Get speaker info, emotion, transcription from media | Intelligence | POST /intelligence |
| Ask questions about a completed detection | Detect Intelligence | POST /detects/{uuid}/intelligence |
| Apply an invisible watermark to media | Watermark Apply | POST /watermark/apply |
| Check if media contains a watermark | Watermark Detect | POST /watermark/detect |
| Verify a speaker's identity against known profiles | Identity Search | POST /identity/search |
| Check if text is AI-generated | Text Detection | POST /text_detect |
| Create a voice identity profile for future matching | Identity Create | POST /identity |

When multiple capabilities apply (e.g., the user wants deepfake detection AND intelligence), combine them in a single POST /detect call using the intelligence: true flag rather than making separate requests.

Required Setup

  • API Key: Bearer token from the Resemble AI dashboard
  • Base URL: https://app.resemble.ai/api/v2
  • Auth Header: Authorization: Bearer <RESEMBLE_API_KEY>
  • Media Requirement: All media must be at a publicly accessible HTTPS URL

If the user provides a local file path instead of a URL, inform them the file must be hosted at a public HTTPS URL first. Do not attempt to upload local files to the API.

MCP Tools Available

When the Resemble MCP server is connected, use these tools instead of raw API calls:

| Tool | Purpose |
| --- | --- |
| resemble_docs_lookup | Get comprehensive docs for any detect sub-topic |
| resemble_search | Search across all documentation |
| resemble_api_endpoint | Get exact OpenAPI spec for any endpoint |
| resemble_api_search | Find endpoints by keyword |
| resemble_get_page | Read specific documentation pages |
| resemble_list_topics | List all available topics |

Tool usage pattern: Use resemble_docs_lookup with topic "detect" to get the full picture, then resemble_api_endpoint for exact request/response schemas before making API calls.


Phase 1: Deepfake Detection

The core capability. Submit any audio, image, or video for AI-generated content analysis.

Submit a Detection

POST /detect
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "url": "https://example.com/media.mp4",
  "visualize": true,
  "intelligence": true,
  "audio_source_tracing": true
}

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | HTTPS URL to audio, image, or video file |
| callback_url | string | No | Webhook URL for async completion notification |
| visualize | boolean | No | Generate heatmap/visualization artifacts |
| intelligence | boolean | No | Run multimodal intelligence analysis alongside detection |
| audio_source_tracing | boolean | No | Identify which AI platform synthesized fake audio |
| frame_length | integer | No | Audio/video analysis window size in seconds (1–4, default 2) |
| start_region | number | No | Start of segment to analyze (seconds) |
| end_region | number | No | End of segment to analyze (seconds) |
| model_types | string | No | "image" or "talking_head" (for face-swap detection) |
| use_reverse_search | boolean | No | Enable reverse image search (image only) |
| use_ood_detector | boolean | No | Enable out-of-distribution detection |
| zero_retention_mode | boolean | No | Auto-delete media after detection completes |

Supported formats:

  • Audio: WAV, MP3, OGG, M4A, FLAC
  • Video: MP4, MOV, AVI, WMV
  • Image: JPG, PNG, GIF, WEBP

Poll for Results

Detection is asynchronous. Poll

GET /detect/{uuid}
until
status
is
"completed"
or
"failed"
.

GET /detect/{uuid}
Authorization: Bearer <API_KEY>

Polling best practice: Start at 2s intervals, back off to 5s, then 10s. Most detections complete within 10–60 seconds depending on media length.
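
A minimal submit-and-poll sketch in Python, assuming the requests library, a RESEMBLE_API_KEY environment variable, and that both endpoints wrap their payload in an item object (an assumption; confirm the exact shape via resemble_api_endpoint):

import os
import time
import requests

BASE_URL = "https://app.resemble.ai/api/v2"
HEADERS = {"Authorization": f"Bearer {os.environ['RESEMBLE_API_KEY']}"}

def submit_detection(media_url: str) -> str:
    """Submit media for deepfake detection and return the detection UUID."""
    resp = requests.post(f"{BASE_URL}/detect", headers=HEADERS,
                         json={"url": media_url, "visualize": True})
    resp.raise_for_status()
    return resp.json()["item"]["uuid"]  # "item" wrapper is an assumption

def poll_detection(uuid: str) -> dict:
    """Poll GET /detect/{uuid}, backing off from 2s to 5s to 10s intervals."""
    delays = [2] * 5 + [5] * 6 + [10] * 30
    for delay in delays:
        time.sleep(delay)
        resp = requests.get(f"{BASE_URL}/detect/{uuid}", headers=HEADERS)
        resp.raise_for_status()
        item = resp.json()["item"]  # "item" wrapper is an assumption
        if item["status"] in ("completed", "failed"):
            return item
    raise TimeoutError(f"Detection {uuid} did not finish in time")

result = poll_detection(submit_detection("https://example.com/media.mp4"))
print(result["status"], result.get("metrics", {}).get("label"))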

Reading Results by Media Type

Audio results — in metrics:

{
  "label": "fake",
  "score": ["0.92", "0.88", "0.95"],
  "consistency": "0.91",
  "aggregated_score": "0.92",
  "image": "https://..."
}
  • label: "fake" or "real" — the verdict
  • score: Per-chunk prediction scores (array)
  • aggregated_score: Overall confidence (0.0–1.0, higher = more likely synthetic)
  • consistency: How consistent the prediction is across chunks
  • image: Visualization heatmap URL (if visualize: true)

Image results — in image_metrics:

{
  "type": "ImageAnalysis",
  "label": "fake",
  "score": 0.87,
  "image": "https://...",
  "ifl": { "score": 0.82, "heatmap": "https://..." },
  "reverse_image_search_sources": [
    { "url": "...", "title": "...", "verdict": "known_fake", "similarity": 0.95 }
  ]
}
  • label / score: Verdict and confidence
  • ifl: Invisible Frequency Layer analysis with heatmap
  • reverse_image_search_sources: Known sources found online (if use_reverse_search: true)

Video results — in video_metrics:

{
  "label": "fake",
  "score": 0.89,
  "certainty": 0.91,
  "children": [
    {
      "type": "VideoResult",
      "conclusion": "Fake",
      "score": 0.89,
      "timestamp": 2.5,
      "children": [...]
    }
  ]
}
  • Hierarchical tree of frame-level and segment-level results
  • Each child has timestamp, score, certainty, and may have nested children
  • Video with an audio track returns both metrics (audio) and video_metrics (visual)

Interpreting Scores

| Score Range | Interpretation |
| --- | --- |
| 0.0 – 0.3 | Strong indication of authentic/real media |
| 0.3 – 0.5 | Inconclusive — recommend additional analysis |
| 0.5 – 0.7 | Likely synthetic — flag for review |
| 0.7 – 1.0 | High confidence synthetic/AI-generated |

Always present scores with context. Say "The detection returned a score of 0.87, indicating high confidence that this audio is AI-generated" — never just "it's fake."
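
As a sketch, the interpretation bands above can be applied in code before presenting a verdict (the helper name and wording are illustrative, not part of the API):

def interpret_score(score: float) -> str:
    """Map a detection score to the interpretation bands in the table above."""
    if score < 0.3:
        return "strong indication of authentic/real media"
    if score < 0.5:
        return "inconclusive; recommend additional analysis"
    if score < 0.7:
        return "likely synthetic; flag for review"
    return "high confidence synthetic/AI-generated"

# Present the score with context rather than a bare label.
score = 0.87
print(f"The detection returned a score of {score:.2f}, {interpret_score(score)}.")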


Phase 2: Intelligence — Media Analysis

Analyze media for rich structured insights independent of or alongside detection.

Standalone Intelligence

POST /intelligence
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "url": "https://example.com/audio.mp3",
  "json": true
}

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | One of | HTTPS URL to media file |
| media_token | string | One of | Token from secure upload (alternative to URL) |
| detect_id | string | No | UUID of existing detect to associate |
| media_type | string | No | "audio", "video", or "image" (auto-detected) |
| json | boolean | No | Return structured fields (default: false for audio/video, true for image) |
| callback_url | string | No | Webhook for async mode |

Audio/Video structured response (json: true):

  • speaker_info — speaker description (age, gender)
  • language / dialect — detected language
  • emotion — detected emotional state
  • speaking_style — conversational, formal, etc.
  • context — inferred context of the speech
  • message — content summary
  • abnormalities — anomalies detected in the media
  • transcription — full transcript
  • translation — translation if non-English
  • misinformation — misinformation analysis

Image structured response:

  • scene_description — what the image shows
  • subjects — people/objects identified
  • authenticity_analysis — visual authenticity assessment
  • context_and_setting — environment description
  • abnormalities — visual anomalies
  • misinformation — misinformation analysis
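
A standalone intelligence sketch, assuming the requests library and a RESEMBLE_API_KEY environment variable; how the completed result is retrieved (polling vs. callback_url) and the exact placement of the structured fields are assumptions to verify against the docs:

import os
import requests

BASE_URL = "https://app.resemble.ai/api/v2"
HEADERS = {"Authorization": f"Bearer {os.environ['RESEMBLE_API_KEY']}"}

# Submit a standalone intelligence job; json=True requests structured fields.
resp = requests.post(f"{BASE_URL}/intelligence", headers=HEADERS,
                     json={"url": "https://example.com/audio.mp3", "json": True})
resp.raise_for_status()
job = resp.json()
print(job)  # contains the identifier needed to retrieve the result later

# Once the job has completed (retrieve it by polling or via callback_url),
# read the structured fields listed above; top-level placement is assumed.
def summarize(result: dict) -> str:
    return (f"Speaker: {result.get('speaker_info')} | "
            f"Emotion: {result.get('emotion')} | "
            f"Language: {result.get('language')}\n"
            f"Transcript: {result.get('transcription')}")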

Detect Intelligence — Ask Questions About Results

After a detection completes, ask natural-language questions about it:

POST /detects/{detect_uuid}/intelligence
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "query": "How confident is the model that this audio is fake?"
}

This returns a question UUID. Poll GET /detects/{detect_uuid}/intelligence/{question_uuid} until status is "completed" to get the answer.

Good questions to suggest:

  • "Summarize the detection results in plain language"
  • "What specific indicators suggest this is AI-generated?"
  • "How do the audio and video detection results differ?"
  • "What is the confidence level and what does it mean?"
  • "Are there any inconsistencies in the analysis?"

Status flow: pending → processing → completed (or failed)

Prerequisite: The detection must have status: "completed". Submitting a question against a processing or failed detection returns a 422 error.
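
A question-and-poll sketch for Detect Intelligence, assuming the requests library and a RESEMBLE_API_KEY environment variable; the item wrapper and the location of the answer field are assumptions:

import os
import time
import requests

BASE_URL = "https://app.resemble.ai/api/v2"
HEADERS = {"Authorization": f"Bearer {os.environ['RESEMBLE_API_KEY']}"}

def ask_about_detection(detect_uuid: str, query: str) -> str:
    """Ask a natural-language question about a completed detection and wait for the answer."""
    resp = requests.post(f"{BASE_URL}/detects/{detect_uuid}/intelligence",
                         headers=HEADERS, json={"query": query})
    resp.raise_for_status()
    question_uuid = resp.json()["item"]["uuid"]  # "item" wrapper is an assumption

    while True:
        time.sleep(2)
        poll = requests.get(
            f"{BASE_URL}/detects/{detect_uuid}/intelligence/{question_uuid}",
            headers=HEADERS)
        poll.raise_for_status()
        item = poll.json()["item"]  # "item" wrapper is an assumption
        if item["status"] == "completed":
            return item["answer"]
        if item["status"] == "failed":
            raise RuntimeError("Intelligence query failed")

answer = ask_about_detection("<detect_uuid>",
                             "What specific indicators suggest this is AI-generated?")
print(answer)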


Phase 3: Audio Source Tracing

When audio is detected as synthetic (label: "fake"), identify which AI platform generated it.

Enable it by setting audio_source_tracing: true in the POST /detect request.

Result appears in the detection response under audio_source_tracing:

{
  "label": "elevenlabs",
  "error_message": null
}

Known source labels include resemble_ai, elevenlabs, real, and others as the model expands.

Important: Source tracing only runs when audio is labeled as "fake". If the audio is "real", no source tracing result will appear.

Standalone query:

  • GET /audio_source_tracings — list all source tracing reports
  • GET /audio_source_tracings/{uuid} — get a specific report

Phase 4: Watermarking

Apply invisible watermarks to media for provenance tracking, or detect existing watermarks.

Apply a Watermark

POST /watermark/apply
Content-Type: application/json
Authorization: Bearer <API_KEY>
Prefer: wait

{
  "url": "https://example.com/image.png",
  "strength": 0.3,
  "custom_message": "my-organization"
}

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | HTTPS URL to media file |
| strength | number | No | Watermark strength 0.0–1.0 (image/video only, default 0.2) |
| custom_message | string | No | Custom message to embed (image/video only, default "resembleai") |

  • Add the Prefer: wait header for a synchronous response
  • Without it, poll GET /watermark/apply/{uuid}/result
  • The response includes a watermarked_media URL to download the watermarked file

Detect a Watermark

POST /watermark/detect
Content-Type: application/json
Authorization: Bearer <API_KEY>
Prefer: wait

{
  "url": "https://example.com/suspect-image.png"
}

Audio detection result:

{ "has_watermark": true, "confidence": 0.95 }

Image/Video detection result:

{ "has_watermark": true }

Phase 5: Identity — Speaker Verification (Beta)

Create voice identity profiles and match incoming audio against them.

Beta feature — requires joining the preview program. Inform the user if they encounter access errors.

Create an Identity Profile

POST /identity
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "audio_url": "https://example.com/known-speaker.wav",
  "name": "Jane Doe"
}

Search Against Known Identities

POST /identity/search
Content-Type: application/json
Authorization: Bearer <API_KEY>

{
  "audio_url": "https://example.com/unknown-speaker.wav",
  "top_k": 5
}

Response:

{
  "success": true,
  "item": [
    { "uuid": "...", "name": "Jane Doe", "confidence": 0.92, "distance": 0.08 }
  ]
}

Lower distance = closer match. Higher confidence = stronger match.
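
An enrollment-and-search sketch for the beta Identity endpoints, assuming the requests library and a RESEMBLE_API_KEY environment variable; the response shape follows the example above:

import os
import requests

BASE_URL = "https://app.resemble.ai/api/v2"
HEADERS = {"Authorization": f"Bearer {os.environ['RESEMBLE_API_KEY']}"}

# Enroll a known speaker as an identity profile.
create = requests.post(f"{BASE_URL}/identity", headers=HEADERS,
                       json={"audio_url": "https://example.com/known-speaker.wav",
                             "name": "Jane Doe"})
create.raise_for_status()

# Match an unknown recording against enrolled profiles.
search = requests.post(f"{BASE_URL}/identity/search", headers=HEADERS,
                       json={"audio_url": "https://example.com/unknown-speaker.wav",
                             "top_k": 5})
search.raise_for_status()

for match in search.json().get("item", []):
    # Lower distance means a closer match; higher confidence means a stronger match.
    print(f"{match['name']}: confidence={match['confidence']}, distance={match['distance']}")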


Phase 6: Text Detection

Detect whether text content is AI-generated or human-written.

Beta feature — requires the detect_beta_user role or a billing plan that includes the dfd_text product.

Submit a Text Detection

POST /text_detect
Content-Type: application/json
Authorization: Bearer <API_KEY>

Add the Prefer: wait header for a synchronous (blocking) response. Without it, the job runs asynchronously — poll or use a callback.

Parameters:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | Text to analyze (max 100,000 characters) |
| thinking | string | No | Always use "low" (default) |
| threshold | float | No | Decision threshold 0.0–1.0 (default: 0.5) |
| callback_url | string | No | Webhook URL for async completion notification |
| privacy_mode | boolean | No | If true, text content is not stored after analysis |

Response:

{
  "success": true,
  "item": {
    "uuid": "abc-123",
    "status": "completed",
    "prediction": "ai",
    "confidence": 0.91,
    "text_content": "This is some text to analyze.",
    "privacy_mode": false,
    "created_at": "...",
    "updated_at": "..."
  }
}
  • prediction: "ai" or "human" — the verdict
  • confidence: 0.0–1.0, higher = more confident in the prediction
  • status: "processing", "completed", or "failed"

Poll for Results

If you did not use Prefer: wait, poll until status is "completed" or "failed":

GET /text_detect/{uuid}
Authorization: Bearer <API_KEY>

List Text Detections

GET /text_detect
Authorization: Bearer <API_KEY>

Returns paginated text detections for the team.

Callback

If callback_url was provided, a POST is sent on completion:

{ "success": true, "item": { ... } }

On failure:

{ "success": false, "item": { ... }, "error": "Error message here" }

Recommended Workflows

Full Media Forensics (Most Thorough)

For a comprehensive analysis, combine all capabilities:

  1. Submit detection with all flags enabled:
    {
      "url": "https://example.com/suspect.mp4",
      "visualize": true,
      "intelligence": true,
      "audio_source_tracing": true,
      "use_reverse_search": true
    }
    
  2. Poll until status: "completed"
  3. Read metrics / image_metrics / video_metrics for the verdict (a result-reading sketch follows this list)
  4. Read intelligence.description for structured media analysis
  5. If the audio is labeled "fake", check audio_source_tracing.label for the source platform
  6. Ask follow-up questions via Detect Intelligence if anything needs clarification
  7. Check for watermarks via POST /watermark/detect if provenance is relevant
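
A result-reading sketch for this workflow (see step 3), assuming a completed detection payload that exposes metrics, image_metrics, video_metrics, and audio_source_tracing at the top level; that placement is an assumption, and field names follow the earlier examples:

def report(detection: dict) -> None:
    """Summarize a completed full-forensics detection by media type."""
    if "metrics" in detection:  # audio track present
        m = detection["metrics"]
        print(f"Audio: {m['label']} (aggregated score {m['aggregated_score']})")
        tracing = detection.get("audio_source_tracing")
        if tracing and m["label"] == "fake":
            print(f"  Likely source platform: {tracing['label']}")
    if "image_metrics" in detection:
        im = detection["image_metrics"]
        print(f"Image: {im['label']} (score {im['score']})")
    if "video_metrics" in detection:
        vm = detection["video_metrics"]
        print(f"Video: {vm['label']} (score {vm['score']}, certainty {vm['certainty']})")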

Quick Authenticity Check (Fastest)

For a fast pass/fail:

  1. Submit a minimal detection: { "url": "..." }
  2. Poll until complete
  3. Check label and aggregated_score (audio) or label and score (image/video)
  4. Report the result with score context

Provenance Pipeline (Content Creators)

For creators who want to prove their content is authentic:

  1. Apply a watermark to the original content: POST /watermark/apply
  2. Distribute the watermarked media
  3. Later, verify provenance: POST /watermark/detect against any copy

Red Flags — Stop and Reassess

  • Declaring authenticity without a detection result — Never say media is real or fake based on visual/auditory inspection alone
  • Ignoring the score and reporting only the label — A "fake" label with score 0.51 means something very different from score 0.95
  • Submitting local file paths to the API — The API requires publicly accessible HTTPS URLs (does not apply to text detection)
  • Sending text longer than 100,000 characters to text detection — Split into chunks or inform the user of the limit
  • Polling too aggressively — Start at 2s intervals, back off exponentially; do not loop at <1s
  • Asking Detect Intelligence questions before detection completes — Results in 422 error
  • Expecting source tracing on "real" audio — Source tracing only runs on audio labeled "fake"
  • Treating beta features (Identity) as production-ready — Warn users about beta status
  • Ignoring zero_retention_mode for sensitive media — Always suggest this flag when the user indicates the media is sensitive or private
  • Making multiple separate API calls when flags can combine — Use intelligence: true and audio_source_tracing: true on the detection call instead of separate requests

Response Presentation Guidelines

When presenting results to users:

  1. Lead with the verdict — "The detection indicates this audio is likely AI-generated (score: 0.87)"
  2. Provide score context — Use the score interpretation table above
  3. Mention limitations — Detection is probabilistic, not absolute proof
  4. Include actionable next steps — Suggest intelligence queries, source tracing, or watermark checks as appropriate
  5. For inconclusive results (0.3–0.5) — Explicitly state the result is inconclusive and recommend additional analysis with different parameters or manual review
  6. Never present detection as legal evidence — Detection results are analytical tools, not forensic certifications

Error Handling

| Error | Cause | Resolution |
| --- | --- | --- |
| 400 | Invalid request body or missing url | Check required parameters |
| 401 | Invalid or missing API key | Verify RESEMBLE_API_KEY |
| 404 | Detection UUID not found | Verify the UUID from the creation response |
| 422 | Detection not completed (for Intelligence) | Wait for detection to reach completed status |
| 429 | Rate limited | Back off and retry with exponential delay |
| 500 | Server error | Retry once, then report to user |

Privacy & Compliance Notes

  • Zero retention mode: Set zero_retention_mode: true to auto-delete media after analysis. The URL is redacted and media_deleted is set to true post-completion.
  • Text privacy mode: Set privacy_mode: true on text detection to prevent text content from being stored after analysis.
  • Data handling: Media URLs and text content are stored by default. For GDPR/compliance-sensitive workflows, enable zero retention (media) or privacy mode (text).
  • Callback security: If using callback_url, ensure the endpoint is HTTPS and authenticated on the receiving end.