Cc-skills dump-channel

Use when user wants to archive, dump, or back up an entire Telegram channel or chat history to NDJSON with all media files downloaded. Full history extraction with resume support.

install

source · Clone the upstream repo

git clone https://github.com/terrylica/cc-skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/terrylica/cc-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/tlg/skills/dump-channel" ~/.claude/skills/terrylica-cc-skills-dump-channel && rm -rf "$T"

manifest: plugins/tlg/skills/dump-channel/SKILL.md

source content

Dump Telegram Channel History

Archive a complete Telegram channel/group/chat to NDJSON + downloaded media files.

Self-Evolving Skill: This skill improves through use. If instructions are wrong, parameters drifted, or a workaround was needed — fix this file immediately, don't defer. Only update for real, reproducible issues.

Preflight

Session must exist:

~/.local/share/telethon/<profile>.session

If missing, run `/tlg:setup` first.

User must be subscribed to (or a member of) the target channel/chat
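
The session check above can be scripted before starting a dump. This is a minimal sketch, assuming the session path layout shown in this section; `session_path` and `session_exists` are hypothetical helper names and are not part of tg-cli.py.

```python
from pathlib import Path
from typing import Optional


def session_path(profile: str, base: Optional[Path] = None) -> Path:
    """Return the expected Telethon session file for a profile.

    `base` defaults to ~/.local/share/telethon, the location named above.
    """
    root = base if base is not None else Path.home() / ".local" / "share" / "telethon"
    return root / f"{profile}.session"


def session_exists(profile: str, base: Optional[Path] = None) -> bool:
    """True if the session file for `profile` is present on disk."""
    return session_path(profile, base).is_file()


if __name__ == "__main__":
    profile = "missterryli"  # profile name taken from the Usage example
    if not session_exists(profile):
        print(f"No session at {session_path(profile)}; run /tlg:setup first")
```

The check only confirms that the file exists. It does not verify that the session is still authorized or that the account can access the target chat.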

Usage

/usr/bin/env bash << 'EOF'
SCRIPT="${CLAUDE_PLUGIN_ROOT:-$HOME/.claude/plugins/marketplaces/cc-skills/plugins/tlg}/scripts/tg-cli.py"

# Full dump: NDJSON + all media (photos, videos, documents)
uv run --python 3.13 "$SCRIPT" dump @ChannelName ./output/ChannelName

# NDJSON only (skip media downloads — much faster)
uv run --python 3.13 "$SCRIPT" dump @ChannelName ./output/ChannelName --no-media

# Dump by numeric chat ID
uv run --python 3.13 "$SCRIPT" dump -1001234567890 ./output/MyChannel

# Use a different profile
uv run --python 3.13 "$SCRIPT" -p missterryli dump @ChannelName ./output/ChannelName
EOF

Parameters

Parameter	Type	Description
`chat`	string/int	Channel username (`@name`) or numeric chat ID
`output`	path	Output directory (`messages.ndjson` and `media/` are created inside)
`--no-media`	flag	Skip media downloads, produce NDJSON only

Output Structure

output/ChannelName/
├── messages.ndjson   ← one JSON object per line, chronological (oldest first)
└── media/
    ├── 6.jpg         ← named by message ID for cross-referencing
    ├── 12.png
    ├── 45.mp4
    └── ...
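
Because media files are named by message ID, the `media/` folder can be indexed without reading the NDJSON. The following is a rough sketch under that naming assumption (`<id>.<ext>`); `index_media` is a hypothetical helper, not something the skill provides.

```python
from pathlib import Path
from typing import Dict, Union


def index_media(dump_dir: Union[str, Path]) -> Dict[int, Path]:
    """Map message ID -> media file path, relying on the <id>.<ext> naming."""
    media_dir = Path(dump_dir) / "media"
    index: Dict[int, Path] = {}
    if not media_dir.is_dir():
        # A --no-media dump may have no media folder at all.
        return index
    for path in sorted(media_dir.iterdir()):
        # Ignore anything whose name is not a plain numeric message ID.
        if path.is_file() and path.stem.isdigit():
            index[int(path.stem)] = path
    return index
```

With the layout above, `index_media("output/ChannelName")` would map message ID 6 to the path of `6.jpg`.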

NDJSON Record Schema

Each line is a JSON object with these fields:

Field	Type	Description
`id`	int	Telegram message ID
`date`	string	ISO 8601 timestamp with timezone
`text`	string/null	Full message text (no truncation)
`has_media`	bool	Whether message contains media
`media_type`	string/null	Telethon class name (MessageMediaPhoto, etc.)
`media_file`	string/null	Filename in media/ dir (e.g., "6.jpg")
`views`	int/null	View count (channels only)
`forwards`	int/null	Forward count
`reply_to_msg_id`	int/null	Parent message ID if reply
`grouped_id`	int/null	Album group ID (shared across album messages)
`edit_date`	string/null	ISO 8601 timestamp of last edit
`sender.id`	int	Sender's Telegram user/channel ID
`sender.name`	string	Display name (channel title or user first name)
`sender.username`	string/null	@username if set
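
A standard-library sketch for reading records with this schema is shown below. The table lists `sender.id`, `sender.name`, and `sender.username` in dotted form, and it is an assumption here that these may be stored either as a nested `sender` object or as flat dotted keys, so the helper accepts both. `iter_messages` and `sender_field` are hypothetical names.

```python
import json
from typing import Any, Dict, Iterator


def iter_messages(path: str) -> Iterator[Dict[str, Any]]:
    """Yield one dict per non-empty line of an NDJSON file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def sender_field(record: Dict[str, Any], key: str) -> Any:
    """Read sender.<key> whether it is stored nested or as a flat dotted key."""
    sender = record.get("sender")
    if isinstance(sender, dict):
        return sender.get(key)
    return record.get(f"sender.{key}")


def with_media(path: str) -> Iterator[Dict[str, Any]]:
    """Yield only records that reference a downloaded media file."""
    for record in iter_messages(path):
        if record.get("media_file") is not None:
            yield record
```

Nullable fields such as `text`, `views`, and `media_file` arrive as `None` in Python, so comparisons should guard against that before calling string or numeric operations.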

Resume Support

Re-running the same command skips already-downloaded media files (checks `dest.exists()`). The NDJSON is fully rewritten each run. This makes it safe to resume interrupted downloads.
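
The skip-if-present behavior can be illustrated with a small sketch. This is not the code from tg-cli.py; `download_if_missing` and its `fetch` callback are hypothetical, and writing to a temporary `.part` file before renaming is an addition made here so that an interrupted write is never mistaken for a finished file.

```python
from pathlib import Path
from typing import Callable, Union


def download_if_missing(dest: Union[str, Path], fetch: Callable[[], bytes]) -> bool:
    """Write fetch() to dest unless dest already exists.

    Returns True if a download happened, False if it was skipped.
    """
    dest = Path(dest)
    if dest.exists():
        # Same check the skill describes: an existing file means "already done".
        return False
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".part")
    tmp.write_bytes(fetch())
    tmp.replace(dest)  # rename only after the full payload is on disk
    return True
```

Calling it twice with the same destination fetches once and skips the second time, which is the property that makes re-running a dump cheap.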

Querying the Output

# jq: find all GOLD BUY signals with chart screenshots
jq 'select(.text != null and (.text | test("GOLD.*BUY")) and .media_file != null)' messages.ndjson

# DuckDB: aggregate by date
duckdb -c "SELECT date::DATE as day, count(*) FROM read_ndjson('messages.ndjson') GROUP BY day ORDER BY day"

# Python/Polars
import polars as pl
df = pl.read_ndjson("messages.ndjson")

Performance Notes

~3000 messages + 1700 media files takes ~3-5 minutes
Telegram may briefly disconnect mid-download (`Server closed the connection`); Telethon auto-reconnects
For very large channels (10k+ messages), expect 10-15 minutes with media

Recommended Storage Pattern

For git-tracked projects, gitignore the media folder:

# data/telegram/.gitignore
*/media/

This keeps the NDJSON (metadata) in version control while keeping large media files local-only.

Anti-Patterns

Don't dump channels you're not subscribed to — Telethon needs access via your account
Don't run multiple dumps concurrently on the same profile — session file contention

Post-Execution Reflection

After this skill completes, check before closing:

Did the command succeed? — If not, fix the instruction or error table that caused the failure.
Did parameters or output change? — If tg-cli.py's interface drifted, update Usage examples and Parameters table to match.
Was a workaround needed? — If you had to improvise (different flags, extra steps), update this SKILL.md so the next invocation doesn't need the same workaround.

Only update if the issue is real and reproducible — not speculative.