Cc-skills dump-channel

Use when user wants to archive, dump, or back up an entire Telegram channel or chat history to NDJSON with all media files downloaded. Full history extraction with resume support.

install
source · Clone the upstream repo
git clone https://github.com/terrylica/cc-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/terrylica/cc-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/tlg/skills/dump-channel" ~/.claude/skills/terrylica-cc-skills-dump-channel && rm -rf "$T"
manifest: plugins/tlg/skills/dump-channel/SKILL.md

Dump Telegram Channel History

Archive a complete Telegram channel/group/chat to NDJSON + downloaded media files.

Self-Evolving Skill: This skill improves through use. If instructions are wrong, parameters drifted, or a workaround was needed — fix this file immediately, don't defer. Only update for real, reproducible issues.

Preflight

  1. Session must exist: ~/.local/share/telethon/<profile>.session
    • If missing, run /tlg:setup first
  2. User must be subscribed to (or a member of) the target channel/chat
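
The first preflight check can be automated before invoking the dump. The following is a minimal sketch that only tests for the session file at the path given above; the helper names session_path and session_ready are hypothetical and are not part of tg-cli.py.

```python
from __future__ import annotations

from pathlib import Path


def session_path(profile: str, base: Path | None = None) -> Path:
    """Return the expected Telethon session file for a profile.

    `base` defaults to ~/.local/share/telethon, the location named in Preflight.
    """
    root = base if base is not None else Path.home() / ".local" / "share" / "telethon"
    return root / f"{profile}.session"


def session_ready(profile: str, base: Path | None = None) -> bool:
    """True if the session file exists; otherwise /tlg:setup should be run first."""
    return session_path(profile, base).is_file()
```

This does not verify channel membership (the second check), which can only be confirmed against Telegram itself.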

Usage

/usr/bin/env bash << 'EOF'
SCRIPT="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/cc-skills/plugins/tlg}/scripts/tg-cli.py"

# Full dump: NDJSON + all media (photos, videos, documents)
uv run --python 3.13 "$SCRIPT" dump @ChannelName ./output/ChannelName

# NDJSON only (skip media downloads — much faster)
uv run --python 3.13 "$SCRIPT" dump @ChannelName ./output/ChannelName --no-media

# Dump by numeric chat ID
uv run --python 3.13 "$SCRIPT" dump -1001234567890 ./output/MyChannel

# Use a different profile
uv run --python 3.13 "$SCRIPT" -p missterryli dump @ChannelName ./output/ChannelName
EOF

Parameters

| Parameter | Type | Description |
|---|---|---|
| chat | string/int | Channel username (@name) or numeric chat ID |
| output | path | Output directory (messages.ndjson + media/ created inside) |
| --no-media | flag | Skip media downloads, produce NDJSON only |

Output Structure

output/ChannelName/
├── messages.ndjson   ← one JSON object per line, chronological (oldest first)
└── media/
    ├── 6.jpg         ← named by message ID for cross-referencing
    ├── 12.png
    ├── 45.mp4
    └── ...
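
Because media files sit next to messages.ndjson and are referenced by name, a dump can be checked for completeness after the fact. The sketch below assumes the layout shown above and the media_file field described in the record schema; the function name missing_media is hypothetical.

```python
import json
from pathlib import Path


def missing_media(dump_dir: Path) -> list[str]:
    """List media_file names referenced in messages.ndjson but absent from media/."""
    missing = []
    with (dump_dir / "messages.ndjson").open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            name = record.get("media_file")
            if name and not (dump_dir / "media" / name).exists():
                missing.append(name)
    return missing
```

A non-empty result after a --no-media run is expected only if the NDJSON still records media_file names in that mode, which this document does not state.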

NDJSON Record Schema

Each line is a JSON object with these fields:

| Field | Type | Description |
|---|---|---|
| id | int | Telegram message ID |
| date | string | ISO 8601 timestamp with timezone |
| text | string/null | Full message text (no truncation) |
| has_media | bool | Whether message contains media |
| media_type | string/null | Telethon class name (MessageMediaPhoto, etc.) |
| media_file | string/null | Filename in media/ dir (e.g., "6.jpg") |
| views | int/null | View count (channels only) |
| forwards | int/null | Forward count |
| reply_to_msg_id | int/null | Parent message ID if reply |
| grouped_id | int/null | Album group ID (shared across album messages) |
| edit_date | string/null | ISO 8601 timestamp of last edit |
| sender.id | int | Sender's Telegram user/channel ID |
| sender.name | string | Display name (channel title or user first name) |
| sender.username | string/null | @username if set |
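
The schema can be spot-checked line by line with the standard library. This is a sketch under one stated assumption: the table does not say whether sender.id, sender.name, and sender.username are stored as a nested "sender" object or as literal dotted keys, so the hypothetical helper sender_field accepts either form.

```python
import json

# Top-level field names taken from the schema table above.
TOP_LEVEL_FIELDS = {
    "id", "date", "text", "has_media", "media_type", "media_file",
    "views", "forwards", "reply_to_msg_id", "grouped_id", "edit_date",
}


def missing_fields(line: str) -> set[str]:
    """Return the schema fields that are absent from one NDJSON line."""
    record = json.loads(line)
    return TOP_LEVEL_FIELDS - set(record)


def sender_field(record: dict, key: str):
    """Read sender.<key> from a nested object or, failing that, a dotted key."""
    sender = record.get("sender")
    if isinstance(sender, dict):
        return sender.get(key)
    return record.get(f"sender.{key}")
```

Running missing_fields over the first few lines of a fresh dump is a quick way to notice if the tg-cli.py output has drifted from this table.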

Resume Support

Re-running the same command skips already-downloaded media files (checks dest.exists()). The NDJSON is fully rewritten each run. This makes it safe to resume interrupted downloads.
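
The skip-if-present behaviour described above can be illustrated as follows. This is not the code in tg-cli.py; it is a minimal sketch of the same dest.exists() idea, with a hypothetical fetch callable standing in for the real Telethon download.

```python
from pathlib import Path
from typing import Callable, Iterable


def download_missing(
    names: Iterable[str],
    media_dir: Path,
    fetch: Callable[[str, Path], None],
) -> list[str]:
    """Fetch only the media files that are not already on disk."""
    media_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for name in names:
        dest = media_dir / name
        if dest.exists():
            continue  # already downloaded on a previous run
        fetch(name, dest)
        fetched.append(name)
    return fetched
```

One consequence of an existence-only check is that a file left truncated by an interrupted transfer would also be skipped; whether tg-cli.py guards against that is not stated here.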

Querying the Output

# jq: find all GOLD BUY signals with chart screenshots
jq 'select(.text != null and (.text | test("GOLD.*BUY")) and .media_file != null)' messages.ndjson

# DuckDB: aggregate by date
duckdb -c "SELECT date::DATE as day, count(*) FROM read_ndjson('messages.ndjson') GROUP BY day ORDER BY day"

# Python/Polars
import polars as pl
df = pl.read_ndjson("messages.ndjson")
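
Where neither jq nor DuckDB is available, the jq filter above can be approximated with the Python standard library. The sketch below roughly mirrors that filter (non-null text matching a regex, and a non-null media_file); the function name matching_with_media is hypothetical, and regex dialect differences between jq and Python's re module may matter for more complex patterns.

```python
import json
import re
from typing import Iterable, Iterator


def matching_with_media(lines: Iterable[str], pattern: str) -> Iterator[dict]:
    """Yield records whose text matches `pattern` and that have a media file."""
    regex = re.compile(pattern)
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        text = record.get("text")
        if text is not None and regex.search(text) and record.get("media_file") is not None:
            yield record
```

Usage: open messages.ndjson with encoding="utf-8" and pass the file object as lines, for example matching_with_media(fh, "GOLD.*BUY").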

Performance Notes

  • ~3000 messages + 1700 media files takes ~3-5 minutes
  • Telegram may briefly disconnect mid-download (Server closed the connection); Telethon auto-reconnects
  • For very large channels (10k+ messages), expect 10-15 minutes with media

Recommended Storage Pattern

For git-tracked projects, gitignore the media folder:

# data/telegram/.gitignore
*/media/

This keeps the NDJSON (metadata) in version control while keeping large media files local-only.

Anti-Patterns

  • Don't dump channels you're not subscribed to — Telethon needs access via your account
  • Don't run multiple dumps concurrently on the same profile — session file contention
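
One way to enforce the second rule in a wrapper script is an exclusive lock file per profile. This is a hypothetical guard, not something tg-cli.py is known to provide; the name profile_lock, the lock directory, and the .dump.lock suffix are all assumptions made for illustration.

```python
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def profile_lock(profile: str, lock_dir: Path):
    """Hold an exclusive lock file for a profile while a dump is running."""
    lock_dir.mkdir(parents=True, exist_ok=True)
    lock_path = lock_dir / f"{profile}.dump.lock"
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError(
            f"a dump already appears to be running for profile {profile!r}"
        ) from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for debugging
        yield lock_path
    finally:
        os.close(fd)
        lock_path.unlink(missing_ok=True)
```

A crashed process can leave a stale lock file behind, so a real wrapper would need a way to detect and clear one; that is left out of this sketch.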

Post-Execution Reflection

After this skill completes, check before closing:

  1. Did the command succeed? — If not, fix the instruction or error table that caused the failure.
  2. Did parameters or output change? — If tg-cli.py's interface drifted, update Usage examples and Parameters table to match.
  3. Was a workaround needed? — If you had to improvise (different flags, extra steps), update this SKILL.md so the next invocation doesn't need the same workaround.

Only update if the issue is real and reproducible — not speculative.