git clone https://github.com/chanjing-ai/chan-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/chanjing-ai/chan-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/chanjing-tts" ~/.claude/skills/chanjing-ai-chan-skills-chanjing-tts && rm -rf "$T"
skills/chanjing-tts/SKILL.md
Chanjing TTS
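Overview
Calls the Chanjing TTS Open API to list voices, create a synthesis task, poll the task, and download the audio from the URL the API returns. The scripts do not depend on ffmpeg/ffprobe.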
Runtime dependencies
- python3 and the scripts/*.py files in the same repository
- No ffmpeg/ffprobe gating
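Environment variables and machine-readable declarations
- Environment variable keys and descriptions: manifest.yaml (environment section) and this document
- Variables, credentials, and the compliance fields permissions, clientPermissions, and agentPolicy: manifest.yaml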
Usage commands
- ClawHub (the slug in the registry is authoritative): clawhub run chanjing-tts
- This repository: python skills/chanjing-tts/scripts/create_task.py … (see How to Use below)
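Registration and review (single source of truth)
For the primary credential, download behavior, the omission of primaryEnv, and similar matters, manifest.yaml is authoritative. From How to Use onward, this document describes the API steps.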
When to Use This Skill
Use this skill when the user needs to generate audio from text.
Chanjing TTS supports:
- both Chinese and English
- multiple system voices
- adjustment of speech speed
- sentence-level timestamps in the result
How to Use This Skill
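Prerequisite (permission check): before running this Skill, AK/SK and token validation must be completed through chanjing-credentials-guard. This Skill and the guard share the same credentials (~/.chanjing/credentials.json). When no credentials are present, the scripts run open_login_page.py, which opens the AK/SK registration/login page in the default browser and prints the configuration command. For the credential and review cross-reference, see manifest.yaml.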
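As a rough illustration of the shared-credentials prerequisite, the sketch below reads the AK/SK pair from the credentials file. It assumes the file is a JSON object with top-level `app_id` and `secret_key` keys; the function name `load_credentials` is hypothetical and is not part of the skill's scripts.

```python
import json
from pathlib import Path

# Location named in this document; the JSON layout is an assumption.
DEFAULT_CREDENTIALS_PATH = Path.home() / ".chanjing" / "credentials.json"


def load_credentials(path=DEFAULT_CREDENTIALS_PATH):
    """Return (app_id, secret_key) read from the credentials file.

    Assumes a JSON object with top-level "app_id" and "secret_key" keys.
    Raises FileNotFoundError if the file is absent and KeyError if either
    key is missing or empty.
    """
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    missing = [key for key in ("app_id", "secret_key") if not data.get(key)]
    if missing:
        raise KeyError(f"credentials file is missing: {', '.join(missing)}")
    return data["app_id"], data["secret_key"]
```

A caller could catch `FileNotFoundError` or `KeyError` and fall back to the guard's login flow instead of failing outright.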
Security & credentials (reference)
See credentials and clientPermissions in manifest.yaml (and the top-level compliance field permissions).
The workflow calls several APIs, all under the base URL "https://open-api.chanjing.cc". All requests and responses use JSON. Encode and decode text as UTF-8 throughout this task.
- Obtain an access_token, which is required for all subsequent API calls
- List all voice IDs and select one to use
- Call the Create Speech API and record the task_id
- Poll the Query Speech Status API until success, then download the generated audio file using the url in the response
Obtain AccessToken
Read app_id and secret_key from ~/.chanjing/credentials.json. If no valid token is available, call:

    POST /open/v1/access_token
    Content-Type: application/json

Request body (using the locally configured app_id and secret_key):
{ "app_id": "<read from credentials.json>", "secret_key": "<read from credentials.json>" }
Response example:
{ "trace_id": "8ff3fcd57b33566048ef28568c6cee96", "code": 0, "msg": "success", "data": { "access_token": "1208CuZcV1Vlzj8MxqbO0kd1Wcl4yxwoHl6pYIzvAGoP3DpwmCCa73zmgR5NCrNu", "expire_in": 1721289220 } }
Response field description:
| First-level Field | Second-level Field | Description |
|---|---|---|
| code | | Response status code |
| msg | | Response message |
| data | | Response data |
| | access_token | Valid for one day; the previous token is invalidated |
| | expire_in | Token expiration time |
Response Status Code Description
| Code | Description |
|---|---|
| 0 | Success |
| 400 | Invalid parameter format |
| 40000 | Parameter error |
| 50000 | System internal error |
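The token request above can be sketched with the standard library alone. This is a minimal illustration, not the skill's own implementation: `post_json` and `get_access_token` are hypothetical names, and the `post` parameter exists only so the network call can be swapped out in tests.

```python
import json
import urllib.request

BASE_URL = "https://open-api.chanjing.cc"


def post_json(url, payload, headers=None):
    """POST a JSON body and decode the JSON reply as UTF-8."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", **(headers or {})},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def get_access_token(app_id, secret_key, post=post_json):
    """Return (access_token, expire_in); raise RuntimeError on a non-zero code."""
    reply = post(
        f"{BASE_URL}/open/v1/access_token",
        {"app_id": app_id, "secret_key": secret_key},
    )
    if reply.get("code") != 0:
        raise RuntimeError(
            f"access_token failed: {reply.get('code')} {reply.get('msg')}"
        )
    data = reply["data"]
    return data["access_token"], data["expire_in"]
```

Because a new token invalidates the previous one, callers that share credentials may prefer to reuse a still-valid token rather than request a fresh one on every run.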
Select a Voice ID
Obtain all available voice IDs via API, and select one that fits the task at hand. The dialect/accent can be deduced from the voice name.

    GET /open/v1/list_common_audio
    access_token: {{access_token}}

Use the following request body:
{ "page": 1, "size": 100 }
Response example:
```json
{
  "trace_id": "25eb6794ffdaaf3672c25ed9efbe49c6",
  "code": 0,
  "msg": "success",
  "data": {
    "list": [
      {
        "id": "f9248f3b1b42447fb9282829321cfcf2",
        "grade": 0,
        "name": "带货小芸",
        "gender": "female",
        "lang": "multilingual",
        "desc": "",
        "speed": 1,
        "pitch": 1,
        "audition": "https://res.chanjing.cc/chanjing/res/upload/ms/2025-06-05/7945e0474b8cb526e884ee7e28e4af8d.wav"
      },
      {
        "id": "f5e69c1bbe414bec860da3294e177625",
        "grade": 0,
        "name": "方言口音老奶奶",
        "gender": "female",
        "lang": "multilingual",
        "desc": "",
        "speed": 1,
        "pitch": 1,
        "audition": "https://res.chanjing.cc/chanjing/res/upload/ms/2025-04-30/1b248ad05953028db5a6bcba9a951164.wav"
      },
      ...
    ],
    "page_info": {
      "page": 1,
      "size": 100,
      "total_count": 98,
      "total_page": 1
    }
  }
}
```
Response field description:
| First-level Field | Second-level Field | Third-level Field | Description |
|---|---|---|---|
| code | | | Response status code |
| msg | | | Response message |
| data | | | Response data |
| | list | | Public voice list data |
| | | id | Voice ID |
| | | name | Voice name; if it includes a place name, the generated speech is in that dialect |
| | | gender | Gender |
| | | lang | Language |
| | | desc | Description |
| | | speed | Speech speed |
| | | pitch | Pitch |
| | | audition | Audition link |
| | | grade | Grade |
Response status code description:
| Code | Description |
|---|---|
| 0 | Response successful |
| 10400 | AccessToken verification failed |
| 40000 | Parameter error |
| 50000 | System internal error |
| 51000 | System internal error |
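A possible way to fetch the list and pick a voice is sketched below. Two assumptions are worth flagging: `page` and `size` are sent as query parameters (the text above calls them a request body, which is unusual for a GET), and the helper names `fetch_voices` and `pick_voice` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://open-api.chanjing.cc"


def fetch_voices(access_token, page=1, size=100):
    """Fetch one page of public voices and return data["list"].

    Assumption: page/size travel as query parameters on the GET request.
    """
    query = urllib.parse.urlencode({"page": page, "size": size})
    request = urllib.request.Request(
        f"{BASE_URL}/open/v1/list_common_audio?{query}",
        headers={"access_token": access_token},
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        reply = json.loads(response.read().decode("utf-8"))
    if reply.get("code") != 0:
        raise RuntimeError(
            f"list_common_audio failed: {reply.get('code')} {reply.get('msg')}"
        )
    return reply["data"]["list"]


def pick_voice(voices, gender=None, name_contains=None):
    """Return the id of the first voice matching every given filter, or None."""
    for voice in voices:
        if gender is not None and voice.get("gender") != gender:
            continue
        if name_contains is not None and name_contains not in voice.get("name", ""):
            continue
        return voice["id"]
    return None
```

For example, `pick_voice(voices, name_contains="方言")` would select the first voice whose name mentions a dialect, following the naming hint in the field table.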
Create Speech API
Submit a speech creation task, which returns a task ID for later polling.

    POST /open/v1/create_audio_task
    access_token: {{access_token}}
    Content-Type: application/json

Request body example:
{ "audio_man": "89843d52ccd04e2d854decd28d6143ce", "speed": 1, "pitch": 1, "text": { "text": "Hello, I am your AI assistant." } }
Request field description:
| Parameter Name | Type | Nested Key | Required | Example | Description |
|---|---|---|---|---|---|
| audio_man | string | | Yes | 89843d52ccd04e2d854decd28d6143ce | Voice ID |
| speed | number | | Yes | 1 | Speech speed: 0.5 (slow) - 2 (fast) |
| pitch | number | | Yes | 1 | Just set to 1 |
| text | object | text | Yes | Hello, I am your AI assistant. | Rich text, length must be less than 4000 characters |
| aigc_watermark | bool | | No | false | Whether to add visible watermark to audio, default to false |
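The constraints in the table can be checked locally before the request is sent. The sketch below builds the request body and validates it. The name `build_create_payload` is hypothetical, and it assumes the 4000-character limit counts Python string characters, which may differ from how the server counts.

```python
def build_create_payload(audio_man, text, speed=1.0, aigc_watermark=False):
    """Build the create_audio_task request body, validating documented limits.

    Raises ValueError when a documented constraint is violated.
    """
    audio_man = audio_man.strip()  # stray whitespace would corrupt the voice ID
    if not audio_man:
        raise ValueError("audio_man (voice ID) is required")
    if not 0.5 <= speed <= 2:
        raise ValueError("speed must be between 0.5 and 2")
    # Assumption: the limit is measured in Python str characters.
    if not text or len(text) >= 4000:
        raise ValueError("text must be non-empty and shorter than 4000 characters")
    return {
        "audio_man": audio_man,
        "speed": speed,
        "pitch": 1,  # the table says to leave pitch at 1
        "text": {"text": text},
        "aigc_watermark": aigc_watermark,
    }
```

Failing fast on the client avoids spending a request on input that would likely come back as a parameter error.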
Response example:
{ "trace_id": "dd09f123a25b43cf2119a2449daea6de", "code": 0, "msg": "success", "data": { "task_id": "88f635dd9b8e4a898abb9d4679e0edc8" } }
Response field description:
| Field | Description |
|---|---|
| code | Response status code |
| msg | Response message |
| task_id | Task ID, to be used in subsequent polling step |
Response status code description:
| code | Description |
|---|---|
| 0 | Response successful |
| 400 | Invalid parameter format |
| 10400 | AccessToken verification failed |
| 40000 | Parameter error |
| 40001 | Exceeds QPS limit |
| 40002 | Production duration reached limit |
| 50000 | System internal error |
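One way to turn the status codes above into actionable errors is sketched here. `ChanjingError` and `extract_task_id` are hypothetical names, and the code-to-message mapping simply mirrors the table.

```python
# Status codes for the Create Speech API, copied from the table above.
CREATE_TASK_ERRORS = {
    400: "Invalid parameter format",
    10400: "AccessToken verification failed",
    40000: "Parameter error",
    40001: "Exceeds QPS limit",
    40002: "Production duration reached limit",
    50000: "System internal error",
}


class ChanjingError(RuntimeError):
    """Raised when the API replies with a non-zero status code."""

    def __init__(self, code, message):
        super().__init__(f"[{code}] {message}")
        self.code = code


def extract_task_id(reply):
    """Return data.task_id from a create_audio_task reply, or raise ChanjingError."""
    code = reply.get("code")
    if code != 0:
        description = CREATE_TASK_ERRORS.get(code) or reply.get("msg") or "Unknown error"
        raise ChanjingError(code, description)
    return reply["data"]["task_id"]
```

Keeping the numeric code on the exception lets a caller branch on it, for instance refreshing the token after 10400 or backing off after 40001. Both reactions are suggestions, not behavior described by the API.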
Poll Query Speech Status API
Poll the following API until speech is generated.

    POST /open/v1/audio_task_state
    access_token: {{access_token}}
    Content-Type: application/json

Request example:
{ "task_id": "88f635dd9b8e4a898abb9d4679e0edc8" }
Request field description:
| Parameter Name | Type | Required | Example | Description |
|---|---|---|---|---|
| task_id | string | Yes | 88f789dd9b8e4a121abb9d4679e0edc8 | Speech synthesis task ID |
Response example:
{ "trace_id": "ab18b14574bbcc31df864099d474080e", "code": 0, "msg": "success", "data": { "id": "9546a0fb1f0a4ae3b5c7489b77e4a94d", "type": "tts", "status": 9, "text": [ "猫在跌落时能够在空中调整身体,通常能够四脚着地,这种”猫右自己“反射显示了它们惊人的身体协调能力和灵活性。核磁共振成像技术通过利用人体细胞中氢原子的磁性来生成详细的内部图像,为医学诊断提供了重要工具。" ], "full": { "url": "https://cy-cds-test-innovation.cds8.cn/chanjing/res/upload/tts/2025-04-08/093a59021d85a72d28a491f21820ece4.wav", "path": "093a59013d85a72d28a491f21820ece4.wav", "duration": 18.81 }, "slice": null, "errMsg": "", "errReason": "", "subtitles": [ { "key": "20c53ff8cce9831a8d9c347263a400a54d72be15", "start_time": 0, "end_time": 2.77, "subtitle": "猫在跌落时能够在空中调整身体" }, { "key": "e19f481b6cd2219225fa4ff67836448e054b2271", "start_time": 2.77, "end_time": 4.49, "subtitle": "通常能够四脚着地" }, { "key": "140beae4046bd7a99fbe4706295c19aedfeeb843", "start_time": 4.49, "end_time": 5.73, "subtitle": "这种,猫右自己" }, { "key": "e851881271876ab5a90f4be754fde2dc6b5498fd", "start_time": 5.73, "end_time": 7.97, "subtitle": "反射显示了它们惊人的身体" }, { "key": "fbb0b4138bad189b9fc02669fe1f95116e9991b4", "start_time": 7.97, "end_time": 9.45, "subtitle": "协调能力和灵活性" }, { "key": "f73404d135feaf84dd8fbea13af32eac847ac26d", "start_time": 9.45, "end_time": 12.49, "subtitle": "核磁共振成像技术通过利用人体" }, { "key": "e18827931223962e477b14b2b8046947039ac222", "start_time": 12.49, "end_time": 14.77, "subtitle": "细胞中氢原子的磁性来生成" }, { "key": "d137bf2b0c8b7a39e3f6753b7cf5d92bd877d2d9", "start_time": 14.77, "end_time": 15.97, "subtitle": "详细的内部图像" }, { "key": "0773911ae0dbaa763a64352abdb6bdac3ff8f149", "start_time": 15.97, "end_time": 18.41, "subtitle": "为医学诊断提供了重要工具" } ] } }
Response field description:
| First-level Field | Second-level Field | Third-level Field | Description |
|---|---|---|---|
| code | | | Response status code |
| msg | | | Response message |
| data | id | | Audio ID |
| | type | | |
| | status | | 1: generating; 9: completed |
| | text | | Speech text |
| | full | url | URL to download the generated audio file |
| | | path | |
| | | duration | Audio duration |
| | slice | | |
| | errMsg | | Error message |
| | errReason | | Error reason |
| | subtitles (array type) | key | Subtitle ID |
| | | start_time | Subtitle start time |
| | | end_time | Subtitle end time |
| | | subtitle | Subtitle text |
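The sentence-level timestamps in `subtitles` can be reused outside the API, for example as a subtitle file. The sketch below converts the array to SubRip (SRT) text. It assumes `start_time` and `end_time` are expressed in seconds, which the example values suggest but the table does not state, and both function names are hypothetical.

```python
def format_timestamp(seconds):
    """Format a number of seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def subtitles_to_srt(subtitles):
    """Convert the API's subtitles array into SRT text.

    Assumption: start_time and end_time are in seconds.
    """
    blocks = []
    for index, item in enumerate(subtitles, start=1):
        start = format_timestamp(item["start_time"])
        end = format_timestamp(item["end_time"])
        blocks.append(f"{index}\n{start} --> {end}\n{item['subtitle']}\n")
    return "\n".join(blocks)
```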
Response status code description:
| code | Description |
|---|---|
| 0 | Response successful |
| 10400 | AccessToken verification failed |
| 40000 | Parameter error |
| 50000 | System internal error |
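The poll-then-download step might look like the sketch below. It is illustrative only: the function names, the 2-second interval, and the 300-second timeout are arbitrary choices, and treating a non-empty `errMsg` as failure is an assumption, since only statuses 1 and 9 are documented. `query_state` stands for any callable that performs the `POST /open/v1/audio_task_state` request and returns the decoded JSON reply.

```python
import shutil
import time
import urllib.request

STATUS_COMPLETED = 9  # documented: 1 = generating, 9 = completed


def wait_for_audio_url(query_state, task_id, interval=2.0, timeout=300.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until the task completes and return data["full"]["url"].

    Raises RuntimeError on an API error or reported task error, and
    TimeoutError when the deadline passes first.
    """
    deadline = clock() + timeout
    while True:
        reply = query_state(task_id)
        if reply.get("code") != 0:
            raise RuntimeError(
                f"audio_task_state failed: {reply.get('code')} {reply.get('msg')}"
            )
        data = reply["data"]
        if data.get("status") == STATUS_COMPLETED:
            return data["full"]["url"]
        # Assumption: a non-empty errMsg means the task has failed.
        if data.get("errMsg"):
            raise RuntimeError(
                f"task failed: {data['errMsg']} {data.get('errReason', '')}".strip()
            )
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout} seconds")
        sleep(interval)


def download(url, destination):
    """Stream the audio at url into destination and return destination."""
    with urllib.request.urlopen(url, timeout=60) as response, \
            open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
    return destination
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting, and a fixed interval is the simplest policy. A backoff schedule would be an easy substitution if the API rate-limits polling.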
Scripts
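This Skill ships scripts (skills/chanjing-tts/scripts/) with a built-in permission check: they use the same configuration file as chanjing-credentials-guard. When no AK/SK is configured, they run the guard's open_login_page.py script, which opens the registration/login page in the browser and prints the configuration command.
| Script | Description |
|---|---|
| list_voices.py | Lists public voices; prints an id/name table by default and can optionally output the full data |
| create_task.py | Creates a TTS task and prints the task_id |
| poll_task.py | Polls the task until it completes and prints the audio download URL (full.url) |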
Example (run from the project root or the skill directory):

    # 1. List the available voices and pick an id
    python skills/chanjing-tts/scripts/list_voices.py

    # 2. Create the synthesis task
    TASK_ID=$(python skills/chanjing-tts/scripts/create_task.py \
      --audio-man "f9248f3b1b42447fb9282829321cfcf2" \
      --text "Hello, I am your AI assistant.")

    # 3. Poll until completion to get the audio download link
    python skills/chanjing-tts/scripts/poll_task.py --task-id "$TASK_ID"