Knowledge-work-plugins scribe

Reference skill for Zoom AI Services Scribe. Use after routing to a transcription workflow when handling uploaded or stored media, Build-platform JWT auth, fast mode transcription, batch jobs, or transcript pipeline design.

install

  • source · clone the upstream repo:
    git clone https://github.com/anthropics/knowledge-work-plugins
  • Claude Code · install into ~/.claude/skills/:
    T=$(mktemp -d) && git clone --depth=1 https://github.com/anthropics/knowledge-work-plugins "$T" && mkdir -p ~/.claude/skills && cp -r "$T/partner-built/zoom-plugin/skills/scribe" ~/.claude/skills/anthropics-knowledge-work-plugins-scribe && rm -rf "$T"
  • manifest: partner-built/zoom-plugin/skills/scribe/SKILL.md
source content

Zoom AI Services Scribe

Background reference for Zoom AI Services Scribe across:

  • synchronous single-file transcription (POST /aiservices/scribe/transcribe)
  • asynchronous batch jobs (/aiservices/scribe/jobs*)
  • browser microphone pseudo-streaming via repeated short file uploads
  • webhook-driven batch status updates
  • Build-platform JWT generation and credential handling

Official docs:

Routing Guardrail

  • If the user needs uploaded or stored media transcribed into text, route here first.
  • If the user needs live meeting media without file-based upload/batch jobs, route to ../rtms/SKILL.md.
  • If the user needs Zoom REST API inventory for AI Services paths, chain ../rest-api/SKILL.md.
  • If the user needs webhook signature patterns or generic HMAC receiver hardening, optionally chain ../webhooks/SKILL.md.

Quick Links

  1. concepts/auth-and-processing-modes.md
  2. scenarios/high-level-scenarios.md
  3. examples/fast-mode-node.md
  4. examples/batch-webhook-pipeline.md
  5. references/api-reference.md
  6. references/environment-variables.md
  7. references/samples-validation.md
  8. references/versioning-and-drift.md
  9. troubleshooting/common-drift-and-breaks.md
  10. RUNBOOK.md

Core Workflow

  1. Get Build-platform credentials and generate an HS256 JWT.
  2. Choose fast mode for one short file or batch mode for stored archives / large sets.
  3. Submit the transcription request.
  4. For batch jobs, poll job/file status or receive webhook notifications.
  5. Persist and post-process transcript JSON.
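Step 1 of the workflow can be sketched in Node. This is a minimal HS256 signer only; the exact claim set the Build platform expects must be confirmed against references/api-reference.md — `iss`, `iat`, and `exp` are generic JWT claims used here as an assumption, and `clientId`/`secret` are placeholder credential names.

```typescript
import { createHmac } from "node:crypto";

// Encode a buffer or string as base64url (no padding), per JWT conventions.
function base64url(input: Buffer | string): string {
  return Buffer.from(input)
    .toString("base64")
    .replace(/=+$/, "")
    .replace(/\+/g, "-")
    .replace(/\//g, "_");
}

// Minimal HS256 JWT sketch. Claim names are an assumption; verify the
// required claims in references/api-reference.md before relying on this.
function buildJwt(clientId: string, secret: string, ttlSeconds = 300): string {
  const header = base64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const now = Math.floor(Date.now() / 1000);
  const payload = base64url(
    JSON.stringify({ iss: clientId, iat: now, exp: now + ttlSeconds }),
  );
  const signature = base64url(
    createHmac("sha256", secret).update(`${header}.${payload}`).digest(),
  );
  return `${header}.${payload}.${signature}`;
}
```

Keep the TTL short and generate a fresh token per request batch rather than caching one long-lived token in the browser.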

Hosted Fast-Mode Guardrail

  • The formal fast-mode API limits are 100 MB and 2 hours, but hosted browser flows can still time out before the upstream response returns.
  • Current deployed-sample observations:
    • ~17.2 MB MP4 completed in about 26s
    • ~38.6 MB MP4 completed in about 26-37s
    • ~59.2 MB MP4 completed in about 32-34s on the backend
    • some ~59.2 MB browser requests still surfaced as frontend 504 while backend logs later showed 200
  • Treat frontend 504 plus backend 200 as a browser/edge timeout race, not an automatic transcription failure.
  • For hosted UIs, prefer an async request/polling wrapper for fast mode instead of holding the browser open for the full upstream response.
  • For larger or less predictable media, prefer batch mode even when the file is still within the formal fast-mode size limit.
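The recommended async request/polling wrapper can be sketched generically. The assumption here is that your own backend accepts the upload, returns a request id immediately, and exposes a status check — the submit/status routes are hypothetical wrapper endpoints on your server, not part of the scribe API surface; only the polling loop is shown.

```typescript
type PollStatus<T> = { done: false } | { done: true; value: T };

// Poll a status check instead of holding one long browser request open,
// so a frontend/edge 504 can never mask a backend 200.
async function pollUntilDone<T>(
  check: () => Promise<PollStatus<T>>,
  intervalMs = 2000,
  maxAttempts = 60,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await check();
    if (status.done) return status.value;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("transcription did not finish within the polling window");
}
```

In the browser, `check` would GET your hypothetical status route for the request id; the loop bounds (interval, attempt cap) should be tuned to the observed backend completion times above.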

Browser Microphone Pattern

  • scribe does not expose a documented real-time streaming API surface.
  • If you want a browser microphone experience, use pseudo-streaming:
    1. capture microphone audio in short chunks
    2. upload each chunk through the async fast-mode wrapper
    3. poll for completion
    4. append chunk transcripts in sequence
  • Recommended starting cadence:
    • chunk size: 5 seconds
    • acceptable range: 5-10 seconds
    • in-flight chunk requests: 2-3
  • This is a practical UI pattern for incremental transcript updates, not a substitute for rtms.
  • Treat this as a fallback demo pattern, not the preferred production architecture.
  • It adds repeated upload overhead, chunk-boundary drift, browser codec/container variability, and transcript stitching complexity.
  • If the user asks for actual live stream ingestion, low-latency continuous media, or server-push media transport, route to ../rtms/SKILL.md instead.
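The chunk scheduler behind steps 2-4 can be sketched as follows. `transcribeChunk` is a placeholder for your async fast-mode wrapper (not a documented scribe call); the point is bounding in-flight uploads to the recommended 2-3 while still stitching transcripts in capture order.

```typescript
// Upload chunks with at most `maxInFlight` concurrent requests, but
// append the resulting transcripts strictly in capture order.
async function transcribeChunksInOrder(
  chunks: Uint8Array[],
  transcribeChunk: (chunk: Uint8Array) => Promise<string>,
  maxInFlight = 3,
): Promise<string> {
  const results: string[] = new Array(chunks.length);
  let next = 0;
  // Each worker claims the next chunk index synchronously, so indices
  // are never duplicated even though uploads overlap.
  async function worker(): Promise<void> {
    while (next < chunks.length) {
      const i = next++;
      results[i] = await transcribeChunk(chunks[i]);
    }
  }
  const workers = Array.from(
    { length: Math.min(maxInFlight, chunks.length) },
    () => worker(),
  );
  await Promise.all(workers);
  return results.join(" ");
}
```

This deliberately ignores chunk-boundary drift and codec variability, which is part of why the bullets above call this a fallback demo pattern rather than a production architecture.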

Endpoint Surface

Mode   Method  Path                                     Use
Fast   POST    /aiservices/scribe/transcribe            Synchronous transcription for one file
Batch  POST    /aiservices/scribe/jobs                  Submit asynchronous batch job
Batch  GET     /aiservices/scribe/jobs                  List jobs
Batch  GET     /aiservices/scribe/jobs/{jobId}          Inspect job summary/state
Batch  DELETE  /aiservices/scribe/jobs/{jobId}          Cancel queued/processing job
Batch  GET     /aiservices/scribe/jobs/{jobId}/files    Inspect per-file results
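The batch rows of the table can be captured as small request builders. Only the methods and paths come from the table; the Bearer-token header convention is an assumption to confirm in references/api-reference.md, and the base URL is deliberately left to the caller.

```typescript
type ApiCall = { method: "GET" | "POST" | "DELETE"; path: string };

// Builders for the batch endpoint surface; jobId is URL-encoded so
// unexpected characters cannot alter the path.
function submitJob(): ApiCall {
  return { method: "POST", path: "/aiservices/scribe/jobs" };
}
function listJobs(): ApiCall {
  return { method: "GET", path: "/aiservices/scribe/jobs" };
}
function getJob(jobId: string): ApiCall {
  return { method: "GET", path: `/aiservices/scribe/jobs/${encodeURIComponent(jobId)}` };
}
function cancelJob(jobId: string): ApiCall {
  return { method: "DELETE", path: `/aiservices/scribe/jobs/${encodeURIComponent(jobId)}` };
}
function listJobFiles(jobId: string): ApiCall {
  return { method: "GET", path: `/aiservices/scribe/jobs/${encodeURIComponent(jobId)}/files` };
}

// Assumed auth shape: Bearer + Build-platform JWT. Verify before use.
function authHeaders(jwt: string): Record<string, string> {
  return { Authorization: `Bearer ${jwt}` };
}
```

Centralizing the paths this way keeps polling code (GET job, GET files) and cleanup code (DELETE job) from drifting apart when the surface changes.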

High-Level Scenarios

  • On-demand clip transcription after a user uploads one recording.
  • Batch transcription of stored S3 call archives.
  • Webhook-driven ETL pipeline that writes transcripts to your database/search index.
  • Re-transcription of Zoom-managed recordings after exporting them to your own storage.
  • Offline compliance or QA workflows that need timestamps, channel separation, and speaker hints.
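For the webhook-driven ETL scenario, the receiver should verify the payload before writing anything downstream. The header name and hex-HMAC format below are generic HMAC-receiver assumptions, not the documented scribe scheme — confirm the actual signature pattern via ../webhooks/SKILL.md first.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verify an HMAC-SHA256 signature over the raw request body using a
// constant-time comparison. Signature format (hex) is an assumption.
function verifySignature(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  let given: Buffer;
  try {
    given = Buffer.from(signatureHex, "hex");
  } catch {
    return false;
  }
  return given.length === expected.length && timingSafeEqual(given, expected);
}
```

Only after verification should the handler parse the batch status update and upsert transcript rows into the database/search index.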

Chaining

Operations

  • RUNBOOK.md - 5-minute preflight and debugging checklist.