Skills wechat-article-fetcher
install
source · Clone the upstream repo
git clone https://github.com/wulaosiji/skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/wulaosiji/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/wechat-article-fetcher" ~/.claude/skills/wulaosiji-skills-wechat-article-fetcher && rm -rf "$T"
manifest:
wechat-article-fetcher/SKILL.md
source content
🚀 Migrated to wulaosiji/founder-skills as china-content-research.
This version is kept for backward compatibility. For the latest updates, use the founder-skills version.
WeChat Article Fetcher
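A fetcher for WeChat Official Account articles, built on Playwright. It can get past WeChat's anti-scraping measures and retrieve the full article content.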
When to Use
Use This Skill When
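- You need the full content of a WeChat Official Account article
- You need to fetch articles from several Official Accounts in a batch
- You want to save a WeChat article as Markdown
- You are collecting Official Account articles as research material
- A WeChat article needs to be archived offline
- Other methods (such as web_fetch) cannot retrieve the content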
Do NOT Use This Skill If
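- The link has expired or is no longer valid
- The article requires login to view
- You need to fetch a large volume of articles (keep the request rate under control)
- The target is a temporary share link (contains the tempkey parameter)
- The network connection is unstable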
Typical Trigger Phrases
Chinese:
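- "抓取公众号文章" (fetch an Official Account article)
- "获取微信文章内容" (get WeChat article content)
- "下载公众号文章" (download an Official Account article)
- "微信文章转Markdown" (WeChat article to Markdown)
- "批量抓取微信文章" (batch-fetch WeChat articles)
- "保存公众号文章" (save an Official Account article)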
English:
- "Fetch WeChat article"
- "Scrape WeChat content"
- "Download WeChat article"
- "WeChat to Markdown"
- "Batch fetch WeChat"
- "Archive WeChat article"
Workflow
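Step 1: Prepare the URL
- Make sure the URL is valid (not a temporary link)
- Check whether the link has expired
- Confirm the article is publicly accessible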
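Part of this preparation can be checked offline before a browser is launched. The sketch below is illustrative and not part of the skill: `check_wechat_url` is a hypothetical helper, and the expected host `mp.weixin.qq.com` is an assumption taken from the example URLs in this document. It cannot tell whether a link has expired or requires login; those still have to be confirmed by opening the article.

```python
from urllib.parse import urlparse, parse_qs

# Assumption: article links live on this host, as in the examples in this document.
WECHAT_HOST = "mp.weixin.qq.com"


def check_wechat_url(url: str) -> list[str]:
    """Return a list of problems found; an empty list means the URL looks usable."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("URL must start with http:// or https://")
    if parsed.hostname != WECHAT_HOST:
        problems.append(f"host is not {WECHAT_HOST}")
    # Temporary share links carry a tempkey query parameter and expire.
    if "tempkey" in parse_qs(parsed.query, keep_blank_values=True):
        problems.append("temporary share link (tempkey parameter present)")
    return problems
```

An empty return value only means that no obvious problem was found in the URL itself.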
Step 2: Run the fetch
    from skills.wechat_article_fetcher.wechat_fetcher import fetch_wechat_article

    result = fetch_wechat_article(url="https://mp.weixin.qq.com/s/xxxxx")
Step 3: Handle the result
    if result['success']:
        print(f"Title: {result['title']}")
        print(f"Content: {result['content'][:500]}")
Step 4: Save to a file (optional)
- Markdown format
- Plain-text format
- Import into another system
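The Python API already accepts an `output_path`, so a separate save step is only needed when the result is post-processed first. As a minimal sketch of that case, the hypothetical helper below writes a result dict to Markdown. The file layout (title heading, source line, fetch time, body) is an assumption for illustration and may differ from what the skill itself writes.

```python
from pathlib import Path


def save_as_markdown(result: dict, output_path: str) -> Path:
    """Write a fetched article to a Markdown file and return its path."""
    if not result.get("success"):
        raise ValueError("refusing to save a failed fetch")
    path = Path(output_path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create missing folders
    lines = [
        f"# {result['title']}",
        "",
        f"Source: {result['url']}",
        f"Fetched: {result['fetch_time']}",
        "",
        result["content"],
        "",
    ]
    path.write_text("\n".join(lines), encoding="utf-8")
    return path
```

Writing with an explicit UTF-8 encoding matters here because article text is usually Chinese.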
Guardrails
Anti-Patterns
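- ❌ Fetching so frequently that anti-scraping measures are triggered
- ❌ Fetching paid or private content
- ❌ Not handling fetch failures
- ❌ Using expired temporary links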
Limitations
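- Requires Chromium to be installed (about 100 MB)
- The browser must be downloaded on first run
- Images and video are not processed; only text is extracted
- Temporary share links expire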
Technical Constraints
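- Browser dependency: requires Playwright and Chromium
- Link lifetime: temporary links (with tempkey) expire
- Content limits: text only; media is not downloaded
- Rate limits: keep the fetch rate under control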
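The skill does not document a built-in rate limiter, so the rate limit is left to the caller. One possible approach, sketched here under that assumption, is a wrapper that keeps a minimum gap between consecutive calls. `throttled` is a hypothetical helper; no interval is documented, so a suitable `min_interval` is a guess to be tuned.

```python
import time
from typing import Callable


def throttled(fetch: Callable[[str], dict], min_interval: float,
              clock: Callable[[], float] = time.monotonic,
              sleep: Callable[[float], None] = time.sleep) -> Callable[[str], dict]:
    """Wrap `fetch` so consecutive calls start at least `min_interval` seconds apart."""
    last_start = None

    def wrapper(url: str) -> dict:
        nonlocal last_start
        if last_start is not None:
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)  # pause only for the remaining part of the interval
        last_start = clock()
        return fetch(url)

    return wrapper
```

In use, `fetch_wechat_article` would be passed as `fetch`. The `clock` and `sleep` parameters exist only so the wrapper can be tested without real waiting.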
Installation
    pip install playwright
    playwright install chromium
Usage
Command Line
    # Basic usage
    python3 skills/wechat-article-fetcher/wechat_fetcher.py \
        "https://mp.weixin.qq.com/s/xxxxx"

    # Save to a file
    python3 skills/wechat-article-fetcher/wechat_fetcher.py \
        "https://mp.weixin.qq.com/s/xxxxx" \
        -o "output/article.md"
Python API
    from skills.wechat_article_fetcher.wechat_fetcher import fetch_wechat_article

    result = fetch_wechat_article(
        url="https://mp.weixin.qq.com/s/xxxxx",
        output_path="output/article.md"
    )

    if result['success']:
        print(f"Title: {result['title']}")
        print(f"Content: {result['content'][:500]}")
Return Format
    {
        'title': 'Article title',
        'content': 'Full body text (plain text)',
        'url': 'Original link',
        'fetch_time': '2026-02-12 11:00:00',
        'success': True
    }
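Downstream code may want to confirm that a result has this shape before using it. The following sketch is one way to do that. `EXPECTED_FIELDS` and `validate_result` are hypothetical names, and the field types and timestamp format are inferred from the single example above. The shape of a failed result is not documented, so a failed fetch may legitimately lack some of these keys.

```python
from datetime import datetime

# Field names and types inferred from the documented example (an assumption).
EXPECTED_FIELDS = {
    "title": str,
    "content": str,
    "url": str,
    "fetch_time": str,
    "success": bool,
}


def validate_result(result: dict) -> list[str]:
    """Return a list of problems; an empty list means the dict matches the example shape."""
    problems = []
    for key, expected in EXPECTED_FIELDS.items():
        if key not in result:
            problems.append(f"missing key: {key}")
        elif not isinstance(result[key], expected):
            problems.append(f"{key} should be {expected.__name__}")
    if isinstance(result.get("fetch_time"), str):
        try:
            datetime.strptime(result["fetch_time"], "%Y-%m-%d %H:%M:%S")
        except ValueError:
            problems.append("fetch_time is not in YYYY-MM-DD HH:MM:SS format")
    return problems
```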
Why Playwright?
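| Method | Feasibility | Notes |
|---|---|---|
| | ❌ | Blocked by WeChat's anti-scraping; only the title can be retrieved |
| (Chrome) | ⚠️ | Requires a connection to an already-authorized Chrome instance |
| Playwright | ✅ | Simulates a real browser; stable and reliable |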
Scraping Strategy
- Launch headless Chromium
- Open the target URL and wait for the network to go idle
- Run JavaScript to extract the body (selector #js_content)
- Return the plain-text content
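The four steps above can be sketched with Playwright's synchronous API. This is an illustration of the strategy, not the code in `wechat_fetcher.py`. The names `EXTRACT_JS`, `build_result`, and `fetch_sketch` are hypothetical, as is the 30-second timeout, and treating empty body text as a failure is an assumption.

```python
from datetime import datetime

# JavaScript run in the page: read the text of the article body container.
EXTRACT_JS = """() => {
    const el = document.querySelector('#js_content');
    return el ? el.innerText.trim() : '';
}"""


def build_result(title: str, content: str, url: str) -> dict:
    """Assemble a dict in the documented return format."""
    return {
        "title": title,
        "content": content,
        "url": url,
        "fetch_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "success": bool(content),  # assumption: empty body counts as failure
    }


def fetch_sketch(url: str, timeout_ms: int = 30000) -> dict:
    # Imported lazily so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Wait until network activity settles so the body has rendered.
            page.goto(url, wait_until="networkidle", timeout=timeout_ms)
            title = page.title()
            content = page.evaluate(EXTRACT_JS)
        finally:
            browser.close()
    return build_result(title, content, url)
```

Returning an empty string when `#js_content` is missing keeps the failure visible in `success` instead of raising inside the page.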
Use Cases
Use Case 1: Collecting source material for a book project
    import os
    from datetime import datetime

    from skills.wechat_article_fetcher.wechat_fetcher import fetch_wechat_article

    url = "https://mp.weixin.qq.com/s/xxxxx"
    filename = f"article-{datetime.now().strftime('%Y%m%d')}.md"
    output_path = f"01-Projects/book-openclaw/01-raw-materials/community-cases/{filename}"

    # Create the target folder if it does not exist yet
    os.makedirs(os.path.dirname(output_path), exist_ok=True)
    result = fetch_wechat_article(url, output_path)
Use Case 2: Batch fetching
    urls = [
        "https://mp.weixin.qq.com/s/xxx1",
        "https://mp.weixin.qq.com/s/xxx2",
        "https://mp.weixin.qq.com/s/xxx3",
    ]

    for url in urls:
        result = fetch_wechat_article(url)
        print(f"✅ {result['title']}" if result['success'] else f"❌ Failed: {url}")
Use Case 3: Sub-agent task
    Use the wechat-article-fetcher skill to fetch the link
    https://mp.weixin.qq.com/s/xxxxx
    and save it to 01-Projects/book-openclaw/01-raw-materials/community-cases/
Troubleshooting
Issue: Chromium Not Found
playwright install chromium
Issue: Empty Content
- Check that the link is valid (not a temporary link)
- Temporary share links (with the tempkey parameter) expire
- Try the original permanent link
Issue: Timeout Error
- Usually a network problem; retrying is enough
- WeChat's servers are occasionally slow to respond
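Retrying can be automated with a small wrapper. The sketch below assumes nothing about how the skill reports a timeout: it handles both a raised exception and a result with `success` set to False. `fetch_with_retry` is a hypothetical helper, the `error` key it returns is not part of the documented return format, and the attempt count and delays are placeholder values.

```python
import time
from typing import Callable


def fetch_with_retry(fetch: Callable[[str], dict], url: str, attempts: int = 3,
                     base_delay: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call `fetch(url)` up to `attempts` times, doubling the wait after each failure."""
    last_error = None
    for attempt in range(attempts):
        try:
            result = fetch(url)
            if result.get("success"):
                return result
            last_error = "fetch reported success=False"
        except Exception as exc:  # the real exception type is not documented
            last_error = repr(exc)
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 2s, 4s, ... with the defaults
    # Hypothetical failure shape: 'error' is added here for diagnostics only.
    return {"url": url, "success": False, "error": last_error}
```

Doubling the delay gives a slow server time to recover without hammering it, which also fits the guidance on keeping the fetch rate low.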
Related Skills
| Skill | Relationship | Use Case |
|---|---|---|
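| content-extractor | General-purpose alternative | Content extraction across multiple platforms |
| document-hub | Downstream processing | Generate Word/PDF files from articles |
| | Format conversion | Archive articles as PDF |
| md-to-wechat | Reverse operation | Publish Markdown to an Official Account |
| long-form-writer | Content processing | Write from article source material |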
Changelog
v1.0.0 (2026-02-12)
- Initial release
- WeChat article fetching implemented on Playwright
- Supports the command line and a Python API
About UniqueClub
Part of the UniqueClub toolkit - a collection of skills for AI-powered content creation and automation.