Claude-skill-registry git-leak-recovery
This skill provides guidance for recovering secrets or sensitive data from git repositories (including orphaned commits, reflog, and unreachable objects) and subsequently cleaning up those secrets from git history. It should be used when tasks involve finding leaked credentials, recovering data from git history, or ensuring secrets are completely removed from a repository's object store.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/git-leak-recovery" ~/.claude/skills/majiayu000-claude-skill-registry-git-leak-recovery && rm -rf "$T"
skills/data/git-leak-recovery/SKILL.md
Git Leak Recovery
Overview
This skill covers two related but distinct operations: (1) recovering secrets or data that may exist in orphaned commits, reflog entries, or unreachable git objects, and (2) thoroughly cleaning repositories to ensure sensitive data cannot be recovered. Both operations require understanding git's internal storage mechanisms beyond standard high-level commands.
Workflow Decision Tree
Is the goal to RECOVER or CLEAN UP a secret?
│
├─► RECOVER secret from git history
│   └─► Follow "Recovery Workflow" section
│
└─► CLEAN UP / ensure secret is removed
    └─► Follow "Cleanup Workflow" section
    └─► CRITICAL: Follow "Verification Workflow" - superficial checks are insufficient
Recovery Workflow
Step 1: Identify Potential Secret Locations
Search these locations in order of likelihood:
- Reflog - Records all ref updates, even after commits are "deleted"

      git reflog --all
      git reflog show HEAD

- Unreachable/Orphaned commits - Commits not referenced by any branch

      git fsck --unreachable --no-reflogs
      git fsck --lost-found  # Writes dangling objects to .git/lost-found/

- Dangling objects - Objects not referenced by anything

      git fsck --dangling

- Stashes - Often forgotten storage location

      git stash list
      git stash show -p stash@{N}

- Git notes - Metadata attached to commits

      git notes list
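The first two locations can be swept in one pass. The following is a minimal sketch, not part of the original skill: `list_candidate_commits` is a hypothetical helper name, and it assumes a POSIX shell with `git`, `awk`, and `sort` on the PATH. It prints every commit hash that is still recorded in a reflog (which includes stash entries) or that `git fsck` reports as unreachable.

```shell
# Hypothetical helper (not from the original skill): print every commit hash
# worth inspecting, one per line, de-duplicated.
# Usage: list_candidate_commits [repo-path]
list_candidate_commits() {
  repo="${1:-.}"
  {
    # Commits still recorded in any reflog, including refs/stash.
    git -C "$repo" log --walk-reflogs --all --format='%H' 2>/dev/null || true
    # Commits that no ref points to (reflogs deliberately ignored).
    git -C "$repo" fsck --unreachable --no-reflogs 2>/dev/null |
      awk '$1 == "unreachable" && $2 == "commit" {print $3}'
  } | sort -u
}
```

Each hash it prints can then be inspected by hand, for example with `git show <commit-hash>`.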
Step 2: Examine Suspicious Objects
Once object hashes are identified:
    # View commit content
    git show <commit-hash>

    # View any object type
    git cat-file -p <object-hash>

    # Determine object type
    git cat-file -t <object-hash>
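When `git fsck` returns many hashes, a one-line summary per object helps decide which ones to open first. The sketch below is an illustration under assumptions: `summarize_unreachable` is a hypothetical helper name, and the output layout (`<hash> <type> <size> [commit subject]`) is chosen here, not prescribed by the skill.

```shell
# Hypothetical helper (not from the original skill): print
# "<hash> <type> <size-in-bytes> [commit subject]" for each unreachable object.
# Usage: summarize_unreachable [repo-path]
summarize_unreachable() {
  repo="${1:-.}"
  git -C "$repo" fsck --unreachable --no-reflogs 2>/dev/null |
    awk '$1 == "unreachable" {print $3}' |
    while read -r hash; do
      type=$(git -C "$repo" cat-file -t "$hash")
      size=$(git -C "$repo" cat-file -s "$hash")
      detail=""
      # Only commits have a subject line worth showing.
      if [ "$type" = "commit" ]; then
        detail=$(git -C "$repo" log -1 --format='%s' "$hash")
      fi
      echo "$hash $type $size${detail:+ $detail}"
    done
}
```

Blobs are usually the most interesting entries, since they hold file contents; `git cat-file -p <object-hash>` prints them in full.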
Step 3: Extract and Save
After locating the secret:
    # Save specific file from a commit
    git show <commit>:<path/to/file> > recovered_file.txt

    # Or restore that file from the commit into the working tree and index
    git checkout <commit> -- <path/to/file>
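If the exact path is not known in advance, the files a commit added or modified can be extracted in bulk without touching the working tree. This is a sketch under stated assumptions: `recover_commit_files` is a hypothetical helper name, the commit is assumed not to be a merge, and file names are assumed to contain no newlines or characters that git would quote.

```shell
# Hypothetical helper (not from the original skill): copy every file that a
# commit added or modified into an output directory.
# Usage: recover_commit_files <commit> <output-dir> [repo-path]
recover_commit_files() {
  commit="$1"
  outdir="$2"
  repo="${3:-.}"
  # List paths added (A) or modified (M) by the commit; --root makes this
  # work for a commit that has no parent.
  git -C "$repo" diff-tree --no-commit-id --name-only -r --root \
      --diff-filter=AM "$commit" |
    while IFS= read -r path; do
      mkdir -p "$outdir/$(dirname "$path")"
      git -C "$repo" show "$commit:$path" > "$outdir/$path"
    done
}
```

For example, `recover_commit_files <commit> ./recovered` leaves the current checkout untouched and writes the recovered files under `./recovered`.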
Cleanup Workflow
Step 1: Remove References
Remove all references that point to commits containing the secret:
    # Delete reflog entries
    git reflog expire --expire=now --all

    # Remove backup refs (created by filter-branch, etc.)
    rm -rf .git/refs/original/

    # Remove stashes if they contain secrets
    git stash drop stash@{N}

    # Remove notes if applicable
    git notes remove <commit>
Step 2: Garbage Collection
Force immediate garbage collection:
    git gc --prune=now --aggressive
Step 3: Handle Pack Files
Pack files may retain objects even after gc:
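    # Repack aggressively
    git repack -a -d -f --depth=250 --window=250

    # Or, for thorough cleanup, move the packs out, unpack them one at a time,
    # and repack (a bare "< *.pack" fails when more than one pack exists)
    mv .git/objects/pack/*.pack .
    rm -f .git/objects/pack/*.idx
    for p in ./*.pack; do git unpack-objects < "$p"; done
    rm ./*.pack
    git gc --prune=now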
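The cleanup steps can be chained into one function. The sketch below is one possible arrangement, not the skill's canonical script: `purge_unreachable` is a hypothetical helper name, and it assumes the secret is no longer reachable from any branch, tag, stash, or note. Backup refs are deleted with `git update-ref -d` rather than `rm -rf`, because removing the `.git/refs/original/` directory only deletes loose ref files and leaves any backup ref that has been packed into `.git/packed-refs`.

```shell
# Hypothetical helper (not from the original skill): drop backup refs and
# reflog entries, then prune every object that is no longer reachable.
# Usage: purge_unreachable [repo-path]
purge_unreachable() {
  repo="${1:-.}"
  # Delete refs/original/* whether they are stored loose or packed.
  git -C "$repo" for-each-ref --format='%(refname)' refs/original/ |
    while IFS= read -r ref; do
      git -C "$repo" update-ref -d "$ref"
    done
  # Expire every reflog entry so old commits lose their last reference.
  git -C "$repo" reflog expire --expire=now --expire-unreachable=now --all
  # Repack and prune immediately instead of waiting for the grace period.
  git -C "$repo" gc --prune=now --aggressive --quiet
}
```

After it runs, `git cat-file -e <commit-hash>` should fail for a commit that was only kept alive by the reflog or a backup ref.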
Verification Workflow
CRITICAL: Standard verification commands are often insufficient. Follow this thorough approach.
Common Pitfall: Superficial Verification
These commands are NOT sufficient for security-sensitive cleanup:
    # INSUFFICIENT: Plain-text search; git objects are compressed, so it cannot see inside them
    grep -r "secret" .

    # INSUFFICIENT: Greps fsck output (SHA hashes), not object content
    git fsck --unreachable | grep "secret"
Thorough Verification Steps
- Search all reachable commit content:

      git log --all -p -S "secret_pattern" --

- Search unreachable objects content (see references/verification_commands.md):

      # List unreachable objects and search their content
      git fsck --unreachable --no-reflogs 2>/dev/null | awk '{print $3}' | while read -r obj; do
        git cat-file -p "$obj" 2>/dev/null | grep -q "secret_pattern" && echo "Found in: $obj"
      done

- Search loose objects directly:

      find .git/objects -type f -name '[0-9a-f]*' | while read -r f; do
        dir=$(dirname "$f" | xargs basename)
        file=$(basename "$f")
        hash="${dir}${file}"
        git cat-file -p "$hash" 2>/dev/null | grep "secret_pattern" && echo "Found in loose object: $hash"
      done

- Search pack files:

      for pack in .git/objects/pack/*.idx; do
        git verify-pack -v "$pack" 2>/dev/null | awk '{print $1}' | while read -r obj; do
          git cat-file -p "$obj" 2>/dev/null | grep "secret_pattern" && echo "Found in pack: $obj"
        done
      done

- Verify no backup refs exist:

      ls -la .git/refs/original/ 2>/dev/null
      find .git -name "*.orig" -o -name "*_BACKUP_*"

- Check for any remaining refs:

      git for-each-ref --format='%(refname)'
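The loose-object and pack-file searches can also be combined into a single pass, because `git cat-file --batch-all-objects` enumerates every object in the store whether it is loose or packed, reachable or not. The following is a sketch under assumptions: `secret_in_object_store` is a hypothetical helper name, the pattern is treated as a fixed string, and the repository is small enough that listing all hashes in one command substitution is acceptable.

```shell
# Hypothetical helper (not from the original skill): report every object whose
# content contains the given fixed string.
# Returns 0 if the string was found anywhere, 1 if the store is clean.
# Usage: secret_in_object_store <fixed-string> [repo-path]
secret_in_object_store() {
  pattern="$1"
  repo="${2:-.}"
  status=1
  # Enumerate ALL objects: loose and packed, reachable and unreachable.
  for hash in $(git -C "$repo" cat-file --batch-all-objects \
                  --batch-check='%(objectname)'); do
    if git -C "$repo" cat-file -p "$hash" 2>/dev/null |
         grep -qF -e "$pattern"; then
      echo "FOUND in $hash"
      status=0
    fi
  done
  return "$status"
}
```

A cleanup can be treated as verified only when this returns 1, for example `secret_in_object_store "secret_pattern" || echo "object store is clean"`.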
Common Pitfalls
| Pitfall | Why It Happens | Solution |
|---|---|---|
| Grepping fsck output | `git fsck` outputs hashes, not content | Use `git cat-file -p` on each object |
| Missing pack files | Objects may be packed, not loose | Search both loose objects and pack files |
| Forgetting reflog | Reflog preserves "deleted" commits | `git reflog expire --expire=now --all` |
| Ignoring backup refs | filter-branch creates refs/original/ | Remove `.git/refs/original/` |
| Overlooking stashes | Stashes are separate ref namespace | Check `git stash list` |
| Missing git notes | Notes attach to commits separately | Check `git notes list` |
| Shallow verification | Working directory != git object store | Search object store directly |
Resources
references/
- verification_commands.md - Comprehensive verification scripts for thorough cleanup verification
These references provide ready-to-use command sequences for the verification workflow, which is the most error-prone part of git leak recovery tasks.