Name: IMA TTS Generator
Author: openclaw

install

source · Clone the upstream repo

git clone https://github.com/openclaw/skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/allenfancy-gan/ima-tts-ai" ~/.claude/skills/openclaw-skills-ima-tts-generator && rm -rf "$T"

OpenClaw · Install into ~/.openclaw/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/allenfancy-gan/ima-tts-ai" ~/.openclaw/skills/openclaw-skills-ima-tts-generator && rm -rf "$T"

manifest: skills/allenfancy-gan/ima-tts-ai/SKILL.md

IMA TTS AI — Text-to-Speech Generator

For complete API documentation, security details, all parameters, the speaker list, and Python examples, read SKILL-DETAIL.md.

Model ID Reference (CRITICAL)

Friendly Name	model_id	Notes
Seed TTS 2.0	`seed-tts-2.0`	✅ Default and only supported model

Sub-models (via extra-params):

`seed-tts-2.0-expressive`: More expressive, emotional (default)
`seed-tts-2.0-standard`: More stable, neutral

When User Says "帮我制作旁白/配音" ("Help me make a narration/voice-over")

Must ask first:

Question	Parameter	Required
Text or script to read aloud	`prompt`	✅ Yes

Recommend asking:

Question	Parameter	Options
Voice/speaker	`speaker`	魅力苏菲, Vivi, 云舟, 大壹, etc. (see SKILL-DETAIL.md)

Optional:

Question	Parameter	Range
Emotion/mood	`audio_params.emotion`	neutral, sad, angry
Speech rate	`audio_params.speech_rate`	[-50, 100], 0=normal
Volume	`audio_params.loudness_rate`	[-50, 100], 0=normal
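
The optional parameters above can be validated before they are sent. The following is a minimal sketch, assuming the ranges and emotion values in the table are the complete set (SKILL-DETAIL.md may list more); `build_audio_params` is a hypothetical helper, not part of the skill's scripts.

```python
from typing import Any, Dict, Optional

# Values copied from the "Optional" table; the real list may be longer.
KNOWN_EMOTIONS = {"neutral", "sad", "angry"}
RATE_MIN, RATE_MAX = -50, 100  # shared by speech_rate and loudness_rate


def build_audio_params(
    emotion: Optional[str] = None,
    speech_rate: Optional[int] = None,
    loudness_rate: Optional[int] = None,
) -> Dict[str, Any]:
    """Return an audio_params dict containing only the fields that were set."""
    params: Dict[str, Any] = {}
    if emotion is not None:
        if emotion not in KNOWN_EMOTIONS:
            raise ValueError(f"unknown emotion: {emotion!r}")
        params["emotion"] = emotion
    for name, value in (("speech_rate", speech_rate), ("loudness_rate", loudness_rate)):
        if value is None:
            continue  # leave unset fields out so the service default applies
        if not RATE_MIN <= value <= RATE_MAX:
            raise ValueError(f"{name} must be in [{RATE_MIN}, {RATE_MAX}], got {value}")
        params[name] = value
    return params
```

Omitting unset fields, rather than sending 0, keeps the request minimal and leaves defaults to the service.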

User Input Parsing

User says	Parameter	Value
旁白/配音/朗读 (narration/voice-over/read aloud)	prompt + speaker	Ask for content first
女声/female	speaker	e.g. `zh_female_vv_uranus_bigtts`
男声/male	speaker	e.g. `zh_male_sophie_uranus_bigtts`
语速快/慢 (faster/slower)	audio_params.speech_rate	Positive value for faster, negative for slower
expressive/standard	model	Sub-model selection
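
One way to apply the mapping above is simple keyword matching. This is a rough sketch under stated assumptions: `parse_hints` is a hypothetical helper, the keyword lists are illustrative, and the step of 20 for speech rate is an assumed magnitude (the table only specifies the sign).

```python
from typing import Any, Dict

FAST_WORDS = ("语速快", "fast")
SLOW_WORDS = ("语速慢", "slow")
RATE_STEP = 20  # assumed magnitude; the source only says positive/negative


def parse_hints(text: str) -> Dict[str, Any]:
    """Extract speaker, speech-rate, and sub-model hints from a user message."""
    lowered = text.lower()
    hints: Dict[str, Any] = {}

    # Check "female" before "male" because "female" contains "male".
    if "女声" in text or "female" in lowered:
        hints["speaker"] = "zh_female_vv_uranus_bigtts"
    elif "男声" in text or "male" in lowered:
        hints["speaker"] = "zh_male_sophie_uranus_bigtts"

    if any(word in lowered for word in FAST_WORDS):
        hints["audio_params"] = {"speech_rate": RATE_STEP}
    elif any(word in lowered for word in SLOW_WORDS):
        hints["audio_params"] = {"speech_rate": -RATE_STEP}

    if "expressive" in lowered:
        hints["model"] = "seed-tts-2.0-expressive"
    elif "standard" in lowered:
        hints["model"] = "seed-tts-2.0-standard"
    return hints
```

Substring matching is deliberately naive; a message with no recognized keyword yields an empty dict, which signals that the user should be asked.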

Script Usage

# List available TTS models
python3 {baseDir}/scripts/ima_tts_create.py --api-key $IMA_API_KEY --list-models

# Generate speech (default model: seed-tts-2.0)
python3 {baseDir}/scripts/ima_tts_create.py \
  --api-key $IMA_API_KEY \
  --model-id seed-tts-2.0 \
  --prompt "Text to be spoken here." \
  --user-id {user_id} \
  --output-json

# With speaker and emotion
python3 {baseDir}/scripts/ima_tts_create.py \
  --api-key $IMA_API_KEY \
  --model-id seed-tts-2.0 \
  --prompt "阳光青年音色测试，你好世界。" \
  --extra-params '{"model":"seed-tts-2.0-expressive","speaker":"zh_male_sophie_uranus_bigtts","audio_params":{"emotion":"neutral"}}' \
  --user-id {user_id} \
  --output-json
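
When the script is launched from Python rather than a shell, passing the arguments as a list avoids quoting problems with the JSON in `--extra-params`. The sketch below only assembles the argument list from the flags shown above; `build_command` is a hypothetical helper, and the script's output format is not assumed here.

```python
import json
from typing import Any, Dict, List, Optional


def build_command(
    script_path: str,
    api_key: str,
    prompt: str,
    user_id: str,
    extra_params: Optional[Dict[str, Any]] = None,
    model_id: str = "seed-tts-2.0",
) -> List[str]:
    """Assemble the argv list for ima_tts_create.py using the documented flags."""
    cmd = [
        "python3", script_path,
        "--api-key", api_key,
        "--model-id", model_id,
        "--prompt", prompt,
    ]
    if extra_params:
        # ensure_ascii=False keeps any non-ASCII values readable in the argument.
        cmd += ["--extra-params", json.dumps(extra_params, ensure_ascii=False)]
    cmd += ["--user-id", user_id, "--output-json"]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, capture_output=True, text=True)` without `shell=True`.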

Sending Results to User

# ✅ CORRECT: Use remote URL directly
message(action="send", media=audio_url, caption="✅ 语音合成成功！\n• 模型：[Name]\n• 耗时：[X]s\n• 积分：[N pts]\n\n🔗 原始链接：[url]")

# ❌ WRONG: Never download to local file

UX Protocol (Brief)

Pre-generation: "🔊 开始语音合成… 模型：[Name]，预计[X~Y]秒，消耗[N]积分"
Progress: Every 10-15s: "⏳ 语音合成中… [P]%"
Success: Send audio via `media=audio_url` and include the link in the caption
Failure: Natural language error + suggest retry. See SKILL-DETAIL.md for error translation.

Never expose to users: script names, API endpoints, attribute_id, or technical parameter names.

Environment

Base URL: https://api.imastudio.com

Headers:

Authorization: Bearer $IMA_API_KEY
x-app-source: ima_skills
x_app_language: en
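
For callers that talk to the API directly, the headers above could be assembled as follows. This is a minimal sketch: `build_headers` is a hypothetical helper, and the header names are reproduced exactly as written here, including the underscore in `x_app_language`.

```python
import os
from typing import Dict, Optional

BASE_URL = "https://api.imastudio.com"


def build_headers(api_key: Optional[str] = None, language: str = "en") -> Dict[str, str]:
    """Build the request headers, reading IMA_API_KEY from the environment if needed."""
    key = api_key or os.environ.get("IMA_API_KEY")
    if not key:
        raise RuntimeError("IMA_API_KEY is not set")
    return {
        "Authorization": f"Bearer {key}",
        "x-app-source": "ima_skills",
        "x_app_language": language,  # spelled with underscores, as in the source
    }
```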

Core Flow

`GET /open/v1/product/list?app=ima&platform=web&category=text_to_speech` → get `attribute_id`, `credit`
`POST /open/v1/tasks/create` → get `task_id`
`POST /open/v1/tasks/detail` → poll every 2-5s until `resource_status==1`

MANDATORY: Always query the product list first; `attribute_id` is required.
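
The polling step of this flow can be sketched as a small loop. The request and response shapes are not specified here, so the sketch takes a caller-supplied `fetch_detail` callable (hypothetical) that performs the `POST /open/v1/tasks/detail` call and returns a dict exposing `resource_status`; the 120-second timeout is also an assumption.

```python
import time
from typing import Any, Callable, Dict


def wait_for_task(
    fetch_detail: Callable[[], Dict[str, Any]],
    interval_s: float = 3.0,    # within the documented 2-5s polling window
    timeout_s: float = 120.0,   # assumed upper bound, not from the source
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until resource_status == 1, then return the final detail dict."""
    deadline = clock() + timeout_s
    while True:
        detail = fetch_detail()
        if detail.get("resource_status") == 1:
            return detail
        if clock() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays; other status values (for example failures) would need handling according to SKILL-DETAIL.md.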

Estimated Generation Time

Model	Estimated Time	Poll Every
seed-tts-2.0	5~30s	3s

User Preference Memory

Storage: ~/.openclaw/memory/ima_prefs.json

Save when the user explicitly says "用XXX音色" ("use the XXX voice") / "默认用XXX" ("default to XXX")
Clear when the user says "换个音色" ("change the voice") / "推荐一个" ("recommend one")
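
The save and clear rules could be implemented along these lines. The layout of `ima_prefs.json` is not described here, so the single `speaker` key is an assumption, and `load_prefs`, `save_speaker`, and `clear_speaker` are hypothetical helpers.

```python
import json
from pathlib import Path
from typing import Any, Dict

PREFS_PATH = Path.home() / ".openclaw" / "memory" / "ima_prefs.json"


def load_prefs(path: Path = PREFS_PATH) -> Dict[str, Any]:
    """Return stored preferences, or an empty dict if the file is missing or invalid."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def _write_prefs(prefs: Dict[str, Any], path: Path) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(prefs, ensure_ascii=False, indent=2), encoding="utf-8")


def save_speaker(speaker: str, path: Path = PREFS_PATH) -> None:
    """Remember the speaker after an explicit request such as "默认用XXX"."""
    prefs = load_prefs(path)
    prefs["speaker"] = speaker  # assumed key name
    _write_prefs(prefs, path)


def clear_speaker(path: Path = PREFS_PATH) -> None:
    """Forget the speaker after a request such as "换个音色"."""
    prefs = load_prefs(path)
    if prefs.pop("speaker", None) is not None:
        _write_prefs(prefs, path)
```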

Popular Speakers (Quick Reference)

Category	Speaker Name	speaker ID
General	魅力苏菲	`zh_male_sophie_uranus_bigtts`
General	Vivi	`zh_female_vv_uranus_bigtts`
General	云舟	`zh_male_m191_uranus_bigtts`
Video dubbing	大壹	`zh_male_dayi_uranus_bigtts`
Role-play	知性灿灿	`zh_female_cancan_uranus_bigtts`

Full speaker list: See volcengine_tts_timbre_list.json in the project or SKILL-DETAIL.md.

⚠️ Important: Use the native format (`*_uranus_bigtts`), NOT the `BV*_streaming` format.
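
A quick format check can catch the wrong ID style before a request is made. This sketch only tests the two patterns named above; it does not confirm that an ID exists in the speaker list, and `check_speaker_id` is a hypothetical helper.

```python
import re

NATIVE_SUFFIX = "_uranus_bigtts"
LEGACY_PATTERN = re.compile(r"BV.*_streaming")


def check_speaker_id(speaker_id: str) -> str:
    """Classify a speaker ID as 'native', 'legacy', or 'unknown' by its shape."""
    if speaker_id.endswith(NATIVE_SUFFIX) and len(speaker_id) > len(NATIVE_SUFFIX):
        return "native"
    if LEGACY_PATTERN.fullmatch(speaker_id):
        return "legacy"  # BV*_streaming style, which should not be used
    return "unknown"
```

For example, `check_speaker_id("zh_female_vv_uranus_bigtts")` returns `"native"`, while an ID of the form `BV..._streaming` returns `"legacy"` and should be replaced with a native ID.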