Actionbook article-exporter
git clone https://github.com/actionbook/actionbook
T=$(mktemp -d) && git clone --depth=1 https://github.com/actionbook/actionbook "$T" && mkdir -p ~/.claude/skills && cp -r "$T/playground/article-exporter" ~/.claude/skills/actionbook-actionbook-article-exporter && rm -rf "$T"
Article Exporter - Export Articles to Obsidian
Version: 0.5.0 | Last Updated: 2026-03-13
You are an expert at web content archiving and Obsidian workflow automation.
Lessons from Failed Exports
These rules were extracted from real export failures. Each one prevents a specific class of error:
- Twitter/X needs AI reformatting: `fetch` returns flat text because Twitter uses a custom UI without semantic HTML. The AI reformatting step reconstructs headings, lists, and code blocks. See references/twitter-handling.md.
- Ask for output path first: users have different vault locations. Assuming a default creates files in the wrong place and wastes time moving them.
- Check actionbook version >= 0.9.1: the `--wait-hint` parameter was added in 0.9.1. Without it, dynamic content (SPAs, lazy-loaded pages) returns empty or partial results.
- Wait after navigation: use `--wait-hint heavy` for Twitter, Medium, and other dynamic sites. Without it, the page hasn't finished rendering when content is extracted.
- Rate limit batch exports: a 3-5s delay between requests prevents being flagged as a bot (ToS compliance).
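The version requirement above can be checked before any fetch is attempted. The following is a minimal sketch, assuming `actionbook --version` prints a string that contains a dotted version number such as `0.9.1` (the exact output format is an assumption). `version_ge` is a hypothetical helper name, not part of the actionbook CLI.

```shell
# version_ge A B: succeed when dotted version A is >= version B.
# Hypothetical helper; compares numeric fields left to right.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, x, "."); nb = split(b, y, ".")
    n = (na > nb) ? na : nb
    for (i = 1; i <= n; i++) {
      xi = (i <= na) ? x[i] + 0 : 0
      yi = (i <= nb) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# Assumption: the version is the first dotted triple in the CLI output.
installed=""
if command -v actionbook >/dev/null 2>&1; then
  installed=$(actionbook --version 2>/dev/null | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)
fi

if [ -n "$installed" ] && version_ge "$installed" "0.9.1"; then
  echo "actionbook $installed supports --wait-hint"
else
  echo "actionbook >= 0.9.1 is required (found: ${installed:-none})"
fi
```

Comparing field by field matters here: a plain string comparison would rank `0.10.0` below `0.9.1`.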
Quick Reference
| Task | Command | Success Criteria |
|---|---|---|
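| Check deps | `actionbook --version` | Shows version >= 0.9.1 |
| Fetch article | `actionbook browser fetch "$URL" --wait-hint heavy` | Returns plain text (AI reformats to Markdown in Step 1b) |
| Translate | AI session directly | README_CN.md created |
| Open in Obsidian | `obsidian-cli open "$REL_PATH/index.md"` | File opens in Obsidian |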
Complete Export Workflow
Goal: Export web article to Obsidian directory with images and optional translation
Success criteria:
- Article directory created with README.md
- All images downloaded to images/
- index.md navigation file created
- Optional: README_CN.md translation
- Opened in Obsidian (if obsidian-cli available)
Step 1: Fetch Article Content
Execution: Direct (Bash)

    # Fetch article as readability text (with log cleaning)
    actionbook browser fetch "$URL" --wait-hint heavy 2>/dev/null | \
      sed '/^[[:space:]]*$/d;/^\x1b\[/d;/^INFO/d' > /tmp/article_raw.txt

Success criteria:
- `/tmp/article_raw.txt` exists and size > 0 bytes
- Content contains the article's main text
The fetch command returns readability-extracted plain text (not Markdown). AI reformatting in Step 1b is always needed to produce proper Markdown.
Rules:
- Use `--wait-hint heavy` for Twitter, Medium, and other dynamic content
- Use `--wait-hint light` for static blogs
- `2>/dev/null` suppresses stderr logs
- `sed` deletes empty lines, lines that begin with an ANSI escape sequence, and lines that begin with `INFO`
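The hint rules above can be folded into a small helper that picks the value from the URL's host. This is a sketch only: `pick_wait_hint` and `STATIC_HOSTS` are hypothetical names, `blog.example.com` is a placeholder, and defaulting every unlisted host to `heavy` is an assumption (slower, but it avoids truncated output when a site turns out to be dynamic).

```shell
# Space-separated hosts known to be static blogs (placeholder value).
STATIC_HOSTS="blog.example.com"

# pick_wait_hint URL: print the --wait-hint value to use for URL.
# Assumption: any host not listed in STATIC_HOSTS is treated as dynamic.
pick_wait_hint() {
  host=$(printf '%s\n' "$1" | sed -E 's|^[a-zA-Z]+://||' | cut -d/ -f1)
  for h in $STATIC_HOSTS; do
    if [ "$host" = "$h" ]; then
      echo light
      return 0
    fi
  done
  echo heavy
}

pick_wait_hint "https://x.com/user/status/1"     # prints heavy
pick_wait_hint "https://blog.example.com/post"   # prints light
```

The result can then be passed to the fetch command from this step, for example `--wait-hint "$(pick_wait_hint "$URL")"`.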
Twitter/X Special Handling
Twitter uses non-semantic HTML, so `fetch` output loses all structure (headings become flat text, code blocks disappear). If the URL contains `x.com` or `twitter.com`, pay extra attention to structure reconstruction in Step 1b. See references/twitter-handling.md.
Step 1b: AI Reformat to Markdown
Execution: Direct (AI session)
Read `/tmp/article_raw.txt` and convert the plain text into well-structured Markdown. Save the result to `/tmp/article.md`.
Reformatting rules:
- Reconstruct headings (`#`, `##`, `###`) from the text structure
- Preserve original image URLs as Markdown image references (`![alt](url)`)
- Format code blocks, lists, tables, and blockquotes
- Keep the original article title as the first `# H1` heading
Success criteria:
- `/tmp/article.md` exists and starts with `# <Title>`
- Image URLs are preserved as Markdown image syntax
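The success criteria above can be checked mechanically before moving on to Step 2. A minimal sketch, assuming the reformatted file is plain Markdown with the title on its first line; `check_reformatted` is a hypothetical helper name, and the demo file content is made up.

```shell
# check_reformatted FILE: verify the Step 1b success criteria.
# Prints the number of Markdown image references found.
check_reformatted() {
  file=$1
  if [ ! -s "$file" ]; then
    echo "missing or empty: $file" >&2
    return 1
  fi
  # The first line must be the H1 title ("# <Title>").
  if ! head -n 1 "$file" | grep -q '^# '; then
    echo "first line is not an H1 heading" >&2
    return 1
  fi
  # Count image references of the form ![alt](url).
  count=$(grep -o '!\[[^]]*\]([^)]*)' "$file" | wc -l | tr -d ' ')
  echo "images: $count"
}

# Demo on a throwaway file.
demo=$(mktemp)
printf '# Sample Title\n\n![diagram](https://example.com/a.png)\n' > "$demo"
check_reformatted "$demo"   # prints "images: 1"
rm -f "$demo"
```

If the check fails, repeating Step 1b is usually cheaper than debugging an empty title in Step 2.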
Step 2: Extract Metadata
Execution: Direct (Bash)

    # Extract title (first H1 heading from AI-reformatted markdown)
    TITLE=$(grep -m 1 "^# " /tmp/article.md | sed 's/^# //')

    # Extract image URLs (filter out data: URLs)
    IMAGE_URLS=$(grep -o '!\[[^]]*\]([^)]*)' /tmp/article.md | \
      sed -E 's/!\[[^]]*\]\(([^)]*)\)/\1/' | \
      grep -v '^data:')

Success criteria:
- `$TITLE` is non-empty
- `$IMAGE_URLS` count matches expected (use `wc -l`)
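One pitfall when counting with `wc -l`: piping an empty `$IMAGE_URLS` through `echo` still yields one blank line, so an article without images would be reported as having one. A minimal sketch of a count that ignores blank lines; `count_urls` is a hypothetical helper name and the sample URLs are placeholders.

```shell
# count_urls LIST: print how many non-empty lines LIST contains.
count_urls() {
  # grep -c . counts lines with at least one character; it exits 1 on
  # zero matches, so "|| true" keeps the function's status at 0.
  printf '%s\n' "$1" | grep -c . || true
}

sample=$(printf '%s\n' \
  'https://example.com/a.png' \
  'https://example.com/b.jpg')

count_urls "$sample"   # prints 2
count_urls ""          # prints 0
```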
Step 3: Ask Output Directory
Execution: [human]
Human checkpoint: Confirm output location before creating files
Ask user: "Where should I save the exported article?"
Suggested paths:
- `~/Work/Write/Articles` (default)
- `~/Documents/Obsidian/Articles`
- `~/Notes/Imported`
- (or custom path from the `$output_dir` argument)
Success criteria: User confirms output directory
Artifacts: `$OUTPUT_DIR` variable set
Step 4: Create Directory Structure
Execution: Direct (Bash)

    # Use argument if provided, otherwise use confirmed path
    OUTPUT_DIR="${output_dir:-$USER_CONFIRMED_PATH}"

    # Sanitize title for directory name
    SAFE_TITLE=$(echo "$TITLE" | sed 's/[/:*?"<>|]//g' | cut -c1-100 | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')

    # Create output directory
    ARTICLE_DIR="$OUTPUT_DIR/$SAFE_TITLE"
    mkdir -p "$ARTICLE_DIR/images"

Success criteria:
- Directory `$ARTICLE_DIR` exists
- Subdirectory `images/` exists
- Directory is writable
Rules:
- Remove special characters: `/ : * ? " < > |`
- Limit title length to 100 characters
- Trim leading/trailing whitespace
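To make the three rules concrete, here is the same pipeline wrapped in a function and applied to a made-up title. `sanitize_title` is a hypothetical wrapper name; the `sed` and `cut` stages are the ones used in this step.

```shell
# sanitize_title TITLE: remove special characters, cap the length at
# 100, then trim surrounding whitespace.
sanitize_title() {
  printf '%s\n' "$1" \
    | sed 's/[/:*?"<>|]//g' \
    | cut -c1-100 \
    | sed 's/^[[:space:]]*//;s/[[:space:]]*$//'
}

sanitize_title '  What is "AI"? A/B tests: a <short> guide  '
# prints: What is AI AB tests a short guide
```

Some `cut` implementations count bytes rather than characters for `-c`, so a title containing multibyte characters may be cut mid-character; it is worth checking the result for non-ASCII titles.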
Step 5: Download Images (Parallel if possible)
Execution: Direct (Bash)

    counter=1
    for url in $IMAGE_URLS; do
      # Take the first matching extension; default to .jpg when none is found
      ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' | head -n 1)
      ext="${ext:-.jpg}"
      curl -L -s "$url" -o "$ARTICLE_DIR/images/image_${counter}${ext}"

      # Check file size (detect 0-byte failures)
      if [ ! -s "$ARTICLE_DIR/images/image_${counter}${ext}" ]; then
        # Try alternative format (Twitter)
        curl -L -s "${url}?format=jpg&name=orig" -o "$ARTICLE_DIR/images/image_${counter}.jpg"
      fi

      counter=$((counter + 1))
    done

Success criteria:
- All image files exist and size > 0 bytes
- File count matches `$IMAGE_URLS` count
Rules:
- Use `curl -L` to follow redirects
- Check file size after download
- Try alternative formats for Twitter images
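The success criteria for this step (file count matches, no 0-byte files) can be verified after the loop finishes. A minimal sketch; `check_images` is a hypothetical helper name and the demo directory is a throwaway.

```shell
# check_images DIR EXPECTED: succeed when DIR holds exactly EXPECTED
# files and none of them is empty.
check_images() {
  total=$(find "$1" -type f | wc -l | tr -d ' ')
  empty=$(find "$1" -type f -size 0 | wc -l | tr -d ' ')
  echo "files: $total, empty: $empty"
  [ "$total" -eq "$2" ] && [ "$empty" -eq 0 ]
}

# Demo: one good file and one simulated 0-byte download.
demo_dir=$(mktemp -d)
printf 'data' > "$demo_dir/image_1.png"
: > "$demo_dir/image_2.jpg"
check_images "$demo_dir" 2 || echo "image check failed"
rm -rf "$demo_dir"
```

Because the Twitter retry in this step writes to a separate `.jpg` file, a failed first attempt can leave an empty file behind; a check like this one would flag it.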
Step 6: Update Image References
Execution: Direct (Bash)

    # Replace remote URLs with local paths
    counter=1
    for url in $IMAGE_URLS; do
      # Same extension logic as Step 5: first match, default .jpg
      ext=$(echo "$url" | grep -oE '\.(jpg|jpeg|png|gif|webp|svg)' | head -n 1)
      ext="${ext:-.jpg}"
      sed -i.bak "s|$url|./images/image_${counter}${ext}|g" /tmp/article.md
      counter=$((counter + 1))
    done

    # Save updated markdown
    cp /tmp/article.md "$ARTICLE_DIR/README.md"
    rm -f /tmp/article.md.bak   # -f: no .bak file exists when the article has no images

Success criteria:
- `README.md` contains `./images/image_N.*` references
- No remote URLs remain in image links
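The second criterion can be tested with a single `grep`. A minimal sketch, assuming image links use the standard `![alt](url)` form; `remote_image_links` is a hypothetical helper name and the demo content is made up.

```shell
# remote_image_links FILE: print image references that still point at
# http(s) URLs; succeed only when there are none.
remote_image_links() {
  if grep -oE '!\[[^]]*\]\(https?://[^)]*\)' "$1"; then
    return 1   # at least one remote reference remains
  fi
  return 0
}

# Demo: one rewritten link and one that was missed.
demo=$(mktemp)
printf '![a](./images/image_1.png)\n![b](https://example.com/b.jpg)\n' > "$demo"
remote_image_links "$demo" || echo "remote links remain"
rm -f "$demo"
```

Ordinary hyperlinks are left alone on purpose; only image syntax is matched.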
Step 7: AI Translation (Optional)
Execution: Direct (AI session)
Human checkpoint: Ask user: "Do you want to translate the article? (y/n)"
If yes:
- Read `$ARTICLE_DIR/README.md`
- Translate using AI capabilities (no external API)
- Write to `$ARTICLE_DIR/README_CN.md` (or other language code)
Translation Prompt Template:

    Translate the following Markdown article to [LANGUAGE] while preserving:
    - All Markdown formatting (headings, lists, code blocks, tables)
    - Image references exactly as-is
    - Links and URLs unchanged
    - Code blocks and technical terms in original language

    Only output the translated Markdown content.

    ---

    [Paste README.md content]

Success criteria: Translation file exists and size ≈ original ± 20%
Supported languages: en, zh, es, fr, de, ja, ko
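The size criterion for the translation (original ± 20%) can be expressed with integer arithmetic. A minimal sketch; `size_within_20pct` is a hypothetical helper name, and the demo files are generated filler rather than real articles.

```shell
# size_within_20pct ORIGINAL TRANSLATION: succeed when the translation's
# byte size is within 20% of the original's.
size_within_20pct() {
  orig=$(wc -c < "$1" | tr -d ' ')
  trans=$(wc -c < "$2" | tr -d ' ')
  [ "$orig" -gt 0 ] || return 1
  # Integer form of: 0.8 * orig <= trans <= 1.2 * orig
  [ $((trans * 10)) -ge $((orig * 8)) ] && [ $((trans * 10)) -le $((orig * 12)) ]
}

# Demo: 100-byte original, 110-byte translation.
orig_file=$(mktemp)
trans_file=$(mktemp)
awk 'BEGIN { for (i = 0; i < 100; i++) printf "a" }' > "$orig_file"
awk 'BEGIN { for (i = 0; i < 110; i++) printf "a" }' > "$trans_file"
if size_within_20pct "$orig_file" "$trans_file"; then
  echo "size check passed"
else
  echo "size check failed: review the translation"
fi
rm -f "$orig_file" "$trans_file"
```

Byte counts vary by script (CJK characters typically take three bytes each in UTF-8), so a failed check is a prompt to review the translation, not proof that it is truncated.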
Step 8: Create Navigation Index
Execution: Direct (Bash)

    # Auto-detect source from the URL's host (matching on the host avoids
    # false positives such as netflix.com matching *x.com*)
    HOST=$(echo "$URL" | sed -E 's|^https?://||' | cut -d/ -f1)
    case "$HOST" in
      x.com|*.x.com|twitter.com|*.twitter.com) SOURCE="X" ;;
      medium.com|*.medium.com) SOURCE="Medium" ;;
      dev.to) SOURCE="Dev.to" ;;
      openai.com|*.openai.com) SOURCE="OpenAI Blog" ;;
      substack.com|*.substack.com) SOURCE="Substack" ;;
      github.com|*.github.com) SOURCE="GitHub" ;;
      *) SOURCE="$HOST" ;;
    esac

    # Create index.md
    cat > "$ARTICLE_DIR/index.md" <<EOF
    # $TITLE

    > **Export Date**: $(date +%Y-%m-%d)
    > **Original URL**: $URL
    > **Source**: $SOURCE

    ## 📚 Language Versions

    - 🇬🇧 **English**: [[README]]
    - 🇨🇳 **Chinese**: [[README_CN]] <!-- if translated -->

    ## 📊 Metadata

    | Property | Value |
    |----------|-------|
    | **Source** | $SOURCE |
    | **Images** | $(ls "$ARTICLE_DIR/images/" 2>/dev/null | wc -l | tr -d ' ') images |
    | **Export Tool** | actionbook CLI |
    | **Export Date** | $(date +%Y-%m-%d) |

    ---

    **Exported using**: actionbook browser automation + AI assistant
    EOF

Success criteria: `index.md` exists with metadata table
Step 9: Open in Obsidian
Execution: Direct (Bash)

    if command -v obsidian-cli &> /dev/null; then
      VAULT_ROOT="$OUTPUT_DIR"
      REL_PATH=$(echo "$ARTICLE_DIR" | sed "s|$VAULT_ROOT/||")
      obsidian-cli open "$REL_PATH/index.md"
      echo "✓ Opened in Obsidian: $REL_PATH/index.md"
    else
      # Fallback: Open in file manager
      case "$(uname)" in
        Darwin) open "$ARTICLE_DIR" ;;
        Linux) xdg-open "$ARTICLE_DIR" ;;
        CYGWIN*|MINGW*|MSYS*) start "$ARTICLE_DIR" ;;
      esac
      echo "⚠️ Install obsidian-cli for automatic opening: npm install -g obsidian-cli"
    fi

Success criteria:
- File opens in Obsidian OR directory opens in file manager
- User sees success message
Step 10: Report Success
Execution: Direct (Output)

    echo ""
    echo "════════════════════════════════════════════"
    echo "✓ Article exported successfully!"
    echo ""
    echo "📁 Location: $ARTICLE_DIR"
    echo "📄 Files:"
    echo " - README.md (original)"
    [ -f "$ARTICLE_DIR/README_CN.md" ] && echo " - README_CN.md (translation)"
    echo " - index.md (navigation)"
    # Count images under the article directory, not the current directory
    echo "🖼️ Images: $(ls "$ARTICLE_DIR/images/" 2>/dev/null | wc -l | tr -d ' ') files"
    echo "════════════════════════════════════════════"

Common Issues
| Issue | Cause | Solution |
|---|---|---|
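| "actionbook: command not found" | CLI not installed | Install the actionbook CLI |
| "unknown flag: --wait-hint" | Version < 0.9.1 | Upgrade actionbook to 0.9.1 or later |
| Twitter format broken | `fetch` loses structure | Use AI reformatting (see references/twitter-handling.md) |
| Images 0 bytes | URL expired | Retry with `?format=jpg&name=orig` appended (Twitter images, as in Step 5) |
| obsidian-cli not found | Not installed | `npm install -g obsidian-cli` |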
| Batch export blocked | Too fast, flagged as bot | Add 3-5s between requests |
Detailed troubleshooting: See ./references/troubleshooting.md
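For the batch-export case, the 3-5s delay can be built into the loop that walks the URL list. This is a minimal sketch: `pick_delay`, `export_batch`, and `export_one` are hypothetical names, and `export_one` is only a stand-in for the per-article workflow (Steps 1-10).

```shell
# pick_delay: print a random whole number of seconds between 3 and 5.
pick_delay() {
  awk 'BEGIN { srand(); print 3 + int(rand() * 3) }'
}

# export_one URL: placeholder for the real per-article export.
export_one() {
  echo "exporting $1"
}

# export_batch FILE: process each URL listed in FILE (one per line),
# sleeping between requests but not before the first one.
export_batch() {
  first=1
  while IFS= read -r url; do
    [ -n "$url" ] || continue   # skip blank lines
    if [ "$first" -eq 0 ]; then
      sleep "$(pick_delay)"
    fi
    first=0
    export_one "$url"
  done < "$1"
}
```

Usage would be `export_batch urls.txt`, where `urls.txt` is a hypothetical file with one article URL per line. Varying the delay rather than using a fixed interval is a design choice here, not a requirement stated by the skill.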
Edge Cases Handled
- Long titles → Auto-truncate to 100 chars
- Special characters → Sanitized (`/ : * ? " < > |` removed)
- No images → Steps 5-6 skip gracefully
- 0-byte images → Auto-retry with alternative formats
- Data URLs → Filtered out in Step 2
When Using This Skill
- Check dependencies first: `actionbook --version` >= 0.9.1
- Test with one article: verify before batch processing
- Twitter/X requires special handling: see references/twitter-handling.md
- Respect ToS: personal use only, rate limit batch exports
References (Progressive Disclosure)
For detailed documentation, see:
- `./references/twitter-handling.md`: Twitter/X special handling (AI reformatting)
- `./references/batch-export.md`: Batch export with rate limiting
- `./references/troubleshooting.md`: Detailed troubleshooting guide
- `./references/obsidian-setup.md`: obsidian-cli setup and configuration
- `./references/supported-websites.md`: Complete website compatibility list