Openakita jimliu/baoyu-skills@baoyu-url-to-markdown

Fetch any URL and convert to markdown using Chrome CDP. Supports two modes - auto-capture on page load, or wait for user signal (for pages requiring login). Use when user wants to save a webpage as markdown.

install
source · Clone the upstream repo
git clone https://github.com/openakita/openakita
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/openakita/openakita "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/baoyu-url-to-markdown" ~/.claude/skills/openakita-openakita-jimliu-baoyu-skills-baoyu-url-to-markdown && rm -rf "$T"
manifest: skills/baoyu-url-to-markdown/SKILL.md
source content

URL to Markdown

Fetches any URL via Chrome CDP and converts HTML to clean markdown.

Script Directory

Important: All scripts are located in the

scripts/
subdirectory of this skill.

Agent Execution Instructions:

  1. Determine this SKILL.md file's directory path as
    SKILL_DIR
  2. Script path =
    ${SKILL_DIR}/scripts/<script-name>.ts
  3. Replace all
    ${SKILL_DIR}
    in this document with the actual path

Script Reference:

ScriptPurpose
scripts/main.ts
CLI entry point for URL fetching

Preferences (EXTEND.md)

Use Bash to check EXTEND.md existence (priority order):

# Check project-level first
test -f .baoyu-skills/baoyu-url-to-markdown/EXTEND.md && echo "project"

# Then user-level (cross-platform: $HOME works on macOS/Linux/WSL)
test -f "$HOME/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md" && echo "user"

┌────────────────────────────────────────────────────────┬───────────────────┐ │ Path │ Location │ ├────────────────────────────────────────────────────────┼───────────────────┤ │ .baoyu-skills/baoyu-url-to-markdown/EXTEND.md │ Project directory │ ├────────────────────────────────────────────────────────┼───────────────────┤ │ $HOME/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md │ User home │ └────────────────────────────────────────────────────────┴───────────────────┘

┌───────────┬───────────────────────────────────────────────────────────────────────────┐ │ Result │ Action │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Found │ Read, parse, apply settings │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Not found │ Use defaults │ └───────────┴───────────────────────────────────────────────────────────────────────────┘

EXTEND.md Supports: Default output directory | Default capture mode | Timeout settings

Features

  • Chrome CDP for full JavaScript rendering
  • Two capture modes: auto or wait-for-user
  • Clean markdown output with metadata
  • Handles login-required pages via wait mode

Usage

# Auto mode (default) - capture when page loads
npx -y bun ${SKILL_DIR}/scripts/main.ts <url>

# Wait mode - wait for user signal before capture
npx -y bun ${SKILL_DIR}/scripts/main.ts <url> --wait

# Save to specific file
npx -y bun ${SKILL_DIR}/scripts/main.ts <url> -o output.md

Options

OptionDescription
<url>
URL to fetch
-o <path>
Output file path (default: auto-generated)
--wait
Wait for user signal before capturing
--timeout <ms>
Page load timeout (default: 30000)

Capture Modes

ModeBehaviorUse When
Auto (default)Capture on network idlePublic pages, static content
Wait (
--wait
)
User signals when readyLogin-required, lazy loading, paywalls

Wait mode workflow:

  1. Run with
    --wait
    → script outputs "Press Enter when ready"
  2. Ask user to confirm page is ready
  3. Send newline to stdin to trigger capture

Output Format

YAML front matter with

url
,
title
,
description
,
author
,
published
,
captured_at
fields, followed by converted markdown content.

Output Directory

url-to-markdown/<domain>/<slug>.md
  • <slug>
    : From page title or URL path (kebab-case, 2-6 words)
  • Conflict resolution: Append timestamp
    <slug>-YYYYMMDD-HHMMSS.md

Environment Variables

VariableDescription
URL_CHROME_PATH
Custom Chrome executable path
URL_DATA_DIR
Custom data directory
URL_CHROME_PROFILE_DIR
Custom Chrome profile directory

Troubleshooting: Chrome not found → set

URL_CHROME_PATH
. Timeout → increase
--timeout
. Complex pages → try
--wait
mode.

Extension Support

Custom configurations via EXTEND.md. See Preferences section for paths and supported options.