git clone https://github.com/didclawapp-ai/didclaw
T=$(mktemp -d) && git clone --depth=1 https://github.com/didclawapp-ai/didclaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/didclaw-ui/skills/glmocr-table" ~/.claude/skills/didclawapp-ai-didclaw-glmocr-table && rm -rf "$T"
didclaw-ui/skills/glmocr-table/SKILL.md
GLM-OCR Table Recognition Skill
Extract tables from images and PDFs and convert them to Markdown format using the ZhiPu GLM-OCR layout parsing API.
When to Use
- Extract tables from images or scanned documents
- Convert table images to Markdown or another editable format
- Recognize complex tables with merged cells
- Parse financial statements, invoices, and reports that contain tables
- User mentions "extract table", "recognize table", "表格识别", "提取表格", "表格OCR", "表格转文字"
Key Features
- Complex table support: Handles merged cells, nested tables, multi-row headers
- Markdown output: Tables are output in clean Markdown format, easy to edit and convert
- Multi-page PDF: Supports batch extraction from multi-page PDF documents
- Local file & URL: Supports both local files and remote URLs
Resource Links
| Resource | Link |
|---|---|
| Get API Key | Zhipu Open Platform API Keys (https://bigmodel.cn/usercenter/proj-mgmt/apikeys) |
| API Docs | Layout Parsing |
Prerequisites
API Key Setup (Required)
This script reads the key from the ZHIPU_API_KEY environment variable. The same key can be shared with other Zhipu skills, but reuse is optional.
Get Key: Visit the Zhipu Open Platform API Keys page (https://bigmodel.cn/usercenter/proj-mgmt/apikeys) to create or copy your key.
Setup options (choose one):
- Global config (recommended): Set once in openclaw.json under env.vars, and all Zhipu skills will share it:
  { "env": { "vars": { "ZHIPU_API_KEY": "your-key" } } }
- Skill-level config: Set for this skill only in openclaw.json:
  { "skills": { "entries": { "glmocr-table": { "env": { "ZHIPU_API_KEY": "your-key" } } } } }
- Shell environment variable: Add to ~/.zshrc:
  export ZHIPU_API_KEY="your-key"
💡 If you have already configured a key for other Zhipu skills (such as glmocr, glmv-caption, or glm-image-generation), they share the same ZHIPU_API_KEY, so no additional configuration is needed.
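Whichever setup option is used, the script ultimately sees the key as an environment variable. The lookup can be sketched roughly as below. This is an illustration only, not the skill's actual code: the helper name resolve_api_key is hypothetical, and the sketch assumes any openclaw.json values have already been exported into the process environment.

```python
import os
from typing import Mapping, Optional

KEY_URL = "https://bigmodel.cn/usercenter/proj-mgmt/apikeys"


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return ZHIPU_API_KEY from the environment, or raise if unset or blank.

    Hypothetical helper: the real script may organize this differently.
    """
    env = os.environ if env is None else env
    key = env.get("ZHIPU_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            f"ZHIPU_API_KEY not configured. Get your API key at: {KEY_URL}"
        )
    return key
```

Passing a plain dict as env makes the check easy to exercise without touching the real environment.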
Security & Transparency
- Environment variables used:
  - ZHIPU_API_KEY (required)
  - GLM_OCR_TIMEOUT (optional timeout in seconds)
- Fixed endpoint: https://open.bigmodel.cn/api/paas/v4/layout_parsing
- No custom API URL override: this avoids accidental key exfiltration via redirected endpoints.
- Raw upstream response is not returned by default: use --include-raw only when needed for debugging.
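As a rough sketch of how these settings might be read, the snippet below hard-codes the endpoint and parses the optional timeout. The function name read_timeout and the 60-second default are assumptions made for illustration; the source does not state the script's real default.

```python
import os
from typing import Mapping, Optional

# Fixed endpoint from the list above; deliberately not configurable.
ENDPOINT = "https://open.bigmodel.cn/api/paas/v4/layout_parsing"

# Assumed default; the real script's default timeout is not documented here.
DEFAULT_TIMEOUT_SECONDS = 60.0


def read_timeout(env: Optional[Mapping[str, str]] = None) -> float:
    """Parse GLM_OCR_TIMEOUT as positive seconds, else use the default."""
    env = os.environ if env is None else env
    raw = env.get("GLM_OCR_TIMEOUT", "").strip()
    if not raw:
        return DEFAULT_TIMEOUT_SECONDS
    try:
        value = float(raw)
    except ValueError:
        # Non-numeric input falls back rather than crashing the CLI.
        return DEFAULT_TIMEOUT_SECONDS
    return value if value > 0 else DEFAULT_TIMEOUT_SECONDS
```

Keeping ENDPOINT a module constant, with no environment override, mirrors the stated goal of never sending the key to a redirected host.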
⛔ MANDATORY RESTRICTIONS ⛔
- ONLY use GLM-OCR API: Execute the script python scripts/glm_ocr_cli.py
- NEVER parse tables yourself: Do NOT try to extract tables using built-in vision or any other method
- NEVER offer alternatives: Do NOT suggest "I can try to recognize it" or similar
- IF API fails: Display the error message and STOP immediately
- NO fallback methods: Do NOT attempt table extraction any other way
📋 Output Display Rules
After running the script, present the OCR result clearly and safely.
- Show extracted table Markdown (text) in full
- Summarization is allowed, but do not hide important extraction failures
- If layout_details contains table-related entries, you may highlight them
- If the result file is saved, tell the user the file path
- Show raw upstream response only when explicitly requested or debugging (--include-raw)
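The display rules can be sketched as a small formatting function. The name render_result and the exact wording of the messages are hypothetical; only the ok, text, and error fields come from the documented response format.

```python
from typing import Any, Dict, Optional


def render_result(result: Dict[str, Any], saved_path: Optional[str] = None) -> str:
    """Build the user-facing message from the CLI's JSON result."""
    if not result.get("ok"):
        # Surface the failure verbatim rather than hiding it.
        return f"Extraction failed: {result.get('error')}"
    parts = [result.get("text") or ""]  # full Markdown, never truncated
    if saved_path:
        parts.append(f"Result saved to: {saved_path}")
    return "\n\n".join(parts)
```

The raw upstream response is intentionally left out of this sketch, since it should appear only when explicitly requested.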
How to Use
Extract from URL
python scripts/glm_ocr_cli.py --file-url "https://example.com/table.png"
Extract from Local File
python scripts/glm_ocr_cli.py --file /path/to/table.png
Save Result to File
python scripts/glm_ocr_cli.py --file table.png --output result.json --pretty
Include Raw Upstream Response (Debug Only)
python scripts/glm_ocr_cli.py --file table.png --output result.json --include-raw
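When the script is driven from another Python program, the same invocations can be assembled programmatically. The sketch below is a hypothetical wrapper: build_command and run_ocr are illustrative names, and run_ocr assumes the CLI prints its JSON result to stdout, which the source does not state explicitly.

```python
import json
import subprocess
import sys
from typing import Any, Dict, List, Optional


def build_command(source: str, output: Optional[str] = None,
                  pretty: bool = False, include_raw: bool = False) -> List[str]:
    """Assemble the argument list for scripts/glm_ocr_cli.py."""
    is_url = source.startswith(("http://", "https://"))
    cmd = [sys.executable, "scripts/glm_ocr_cli.py",
           "--file-url" if is_url else "--file", source]
    if output:
        cmd += ["--output", output]
    if pretty:
        cmd.append("--pretty")
    if include_raw:
        cmd.append("--include-raw")
    return cmd


def run_ocr(source: str) -> Dict[str, Any]:
    """Run the CLI and parse stdout as JSON (assumption: JSON goes to stdout)."""
    proc = subprocess.run(build_command(source), capture_output=True, text=True)
    return json.loads(proc.stdout)
```

Passing the arguments as a list avoids shell quoting problems with URLs and paths that contain spaces.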
CLI Reference
python {baseDir}/scripts/glm_ocr_cli.py (--file-url URL | --file PATH) [--output FILE] [--pretty] [--include-raw]
| Parameter | Required | Description |
|---|---|---|
| --file-url | One of | URL to image/PDF |
| --file | One of | Local file path to image/PDF |
| --output | No | Save result JSON to file |
| --pretty | No | Pretty-print JSON output |
| --include-raw | No | Include raw upstream API response in the result field (debug only) |
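An interface with this shape, where exactly one of two source flags is required, is commonly expressed with argparse's mutually exclusive groups. The following is a minimal sketch under that assumption, not the script's actual parser; build_parser is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Sketch of a parser matching the documented flags."""
    parser = argparse.ArgumentParser(prog="glm_ocr_cli.py")
    # Exactly one of --file-url / --file must be supplied.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--file-url", metavar="URL", help="URL to image/PDF")
    source.add_argument("--file", metavar="PATH",
                        help="Local file path to image/PDF")
    parser.add_argument("--output", metavar="FILE",
                        help="Save result JSON to file")
    parser.add_argument("--pretty", action="store_true",
                        help="Pretty-print JSON output")
    parser.add_argument("--include-raw", action="store_true",
                        help="Include raw upstream API response (debug only)")
    return parser
```

With this layout, supplying both source flags or neither makes argparse exit with a usage error before any network call is made.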
Response Format
{ "ok": true, "text": "| Column 1 | Column 2 |\n|----------|----------|\n| Data | Data |", "layout_details": [...], "result": null, "error": null, "source": "/path/to/file", "source_type": "file", "raw_result_included": false }
Key fields:
- ok: whether extraction succeeded
- text: extracted text in Markdown (use this for display)
- layout_details: layout analysis details
- error: error details on failure
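Since text holds a Markdown table, downstream code that wants rows and cells has to split it. The helper below is a simplified sketch for plain pipe-delimited tables; markdown_table_to_rows is a hypothetical name, and escaped pipes or merged cells are not handled.

```python
import json
from typing import List


def markdown_table_to_rows(markdown: str) -> List[List[str]]:
    """Split a simple pipe-delimited Markdown table into rows of cells.

    The separator row (|---|---|) is dropped.
    """
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip text outside the table
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue  # header separator row
        rows.append(cells)
    return rows


# Trimmed-down version of the sample response shown above.
response = json.loads(
    '{"ok": true, "error": null, '
    '"text": "| Column 1 | Column 2 |\\n|----------|----------|\\n| Data | Data |"}'
)
if response["ok"]:
    print(markdown_table_to_rows(response["text"]))
    # prints [['Column 1', 'Column 2'], ['Data', 'Data']]
```

The resulting list of rows can be handed to csv.writer if a spreadsheet-friendly file is needed.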
Error Handling
API key not configured:
ZHIPU_API_KEY not configured. Get your API key at: https://bigmodel.cn/usercenter/proj-mgmt/apikeys
→ Show exact error to user, guide them to configure
Authentication failed (401/403): API key invalid/expired → reconfigure
Rate limit (429): Quota exhausted → inform user to wait
File not found: Local file missing → check path
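The status-code cases above can be summarized as a small lookup. This is an illustrative sketch only: advice_for_status is a hypothetical helper, and the message wording is not taken from the script.

```python
from typing import Optional


def advice_for_status(status: Optional[int]) -> str:
    """Map an HTTP status from the API to the guidance listed above."""
    if status in (401, 403):
        return "API key invalid or expired: reconfigure ZHIPU_API_KEY."
    if status == 429:
        return "Rate limit or quota exhausted: wait before retrying."
    # Any other failure: show the exact error and stop, with no fallback.
    return "Show the exact error message and stop."
```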