Hacktricks-skills ai-fuzzing-assistant

AI-assisted fuzzing and vulnerability discovery. Use this skill whenever the user wants to generate fuzzing seeds, evolve grammars, analyze crashes, create proof-of-vulnerability exploits, or generate patches for discovered bugs. Trigger on mentions of fuzzing, AFL++, libFuzzer, vulnerability discovery, crash analysis, exploit generation, or security testing with LLMs.

install

Clone the upstream repo:

git clone https://github.com/abelrguezr/hacktricks-skills

manifest: skills/AI/AI-Assisted-Fuzzing-and-Vulnerability-Discovery/SKILL.MD

AI-Assisted Fuzzing & Vulnerability Discovery

This skill helps you leverage large language models to supercharge traditional vulnerability research pipelines. It covers seed generation, grammar evolution, crash analysis, exploit generation, and AI-guided patching.

When to Use This Skill

Use this skill when you need to:

  • Generate semantically valid fuzzing seeds for complex input formats (SQL, URLs, binary protocols)
  • Evolve fuzzing grammars based on coverage feedback
  • Analyze crashes and generate proof-of-vulnerability (PoV) exploits
  • Create mutation dictionaries for directed fuzzing
  • Cluster crash signatures and generate unified patches
  • Set up an end-to-end AI-assisted vulnerability discovery workflow

Core Techniques

1. LLM-Generated Seed Inputs

Traditional fuzzers mutate bytes blindly. LLMs can generate syntax-correct, security-relevant inputs that reach deeper code paths faster.

Use the seed generator script:

python scripts/gen_seeds.py --format <format> --count <N> --output <file>

Supported formats:

  • sql: SQL injection payloads
  • xss: Cross-site scripting payloads
  • path: Path traversal payloads
  • url: URL manipulation payloads
  • custom: Custom format (provide your own prompt)

Example:

python scripts/gen_seeds.py --format sql --count 200 --output seeds/
afl-fuzz -i seeds/ -o findings/ -- ./target @@

Note: afl-fuzz expects -i to be a directory of seed files, so write the generated seeds into a directory rather than a single file.

Tips:

  • Ask for diverse payload lengths and encodings (UTF-8, URL-encoded, UTF-16-LE)
  • Keep payloads under common length limits (≤256 bytes)
  • Regenerate with modified prompts to target specific vulnerabilities
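The encoding tip above can be sketched as a small post-processing pass over generated payloads; `expand_encodings` is a hypothetical helper, not part of gen_seeds.py:

```python
from urllib.parse import quote

MAX_LEN = 256  # common length limit from the tips above

def expand_encodings(payload: str) -> list[bytes]:
    """Return UTF-8, URL-encoded, and UTF-16-LE variants of one payload."""
    variants = [
        payload.encode("utf-8"),
        quote(payload).encode("ascii"),  # URL-encoded form
        payload.encode("utf-16-le"),
    ]
    # Drop variants over the length limit so targets don't reject them early.
    return [v for v in variants if len(v) <= MAX_LEN]

seeds = [v for p in ["' OR 1=1--", "admin'--"] for v in expand_encodings(p)]
```

Each seed would then be written to its own file in the fuzzer's input directory.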

2. Grammar-Evolution Fuzzing

Let the LLM evolve a grammar based on coverage feedback instead of just generating seeds.

Workflow:

  1. Generate initial grammar via prompt
  2. Fuzz for N minutes, collect coverage metrics
  3. Feed uncovered areas back to LLM for grammar refinement
  4. Repeat until coverage plateaus
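The four-step loop above can be sketched as follows; `fuzz_epoch` and `refine_grammar` are stand-ins for the fuzzer run and the LLM call, wired here to a stubbed coverage table:

```python
def evolve(grammar, fuzz_epoch, refine_grammar,
           max_epochs=5, coverage_threshold=0.01):
    """Refine a grammar until the coverage gain per epoch falls below the threshold."""
    best = fuzz_epoch(grammar)  # fraction of program edges covered
    for _ in range(max_epochs):
        candidate = refine_grammar(grammar, best)
        cov = fuzz_epoch(candidate)
        if cov - best < coverage_threshold:
            break  # coverage has plateaued; stop spending tokens
        grammar, best = candidate, cov
    return grammar, best

# Stub run: each refinement improves coverage until a 30% ceiling.
cov_of = {"g0": 0.12, "g1": 0.14, "g2": 0.30, "g3": 0.30}
g, c = evolve("g0",
              fuzz_epoch=lambda g: cov_of[g],
              refine_grammar=lambda g, _: f"g{int(g[1:]) + 1}")
```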

Use the grammar evolution script:

python scripts/evolve_grammar.py \
  --grammar grammar.txt \
  --coverage-report coverage.json \
  --output grammar_v2.txt

Key parameters:

  • --max-epochs: number of refinement iterations (default: 5)
  • --coverage-threshold: stop when Δcoverage < threshold (default: 0.01)
  • --diff-mode: use diff/patch instructions for efficient edits

Example prompt for grammar refinement:

The previous grammar triggered 12% of program edges.
Functions not reached: parse_auth, handle_upload.
Add or modify rules to cover these areas.
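A hedged sketch of how such a prompt could be assembled from a coverage report; the field names here are assumptions, not the actual coverage.json schema:

```python
def build_refinement_prompt(report: dict) -> str:
    """Turn a coverage report into a grammar-refinement prompt for the LLM."""
    pct = round(report["edge_coverage"] * 100)
    missed = ", ".join(report["uncovered_functions"])
    return (f"The previous grammar triggered {pct}% of program edges.\n"
            f"Functions not reached: {missed}.\n"
            f"Add or modify rules to cover these areas.")

prompt = build_refinement_prompt({
    "edge_coverage": 0.12,
    "uncovered_functions": ["parse_auth", "handle_upload"],
})
```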

3. Agent-Based PoV Generation

After finding a crash, you need a deterministic proof-of-vulnerability.

Use the crash analyzer script:

python scripts/analyze_crashes.py \
  --crash-db crashes/ \
  --target ./binary \
  --output povs/

What it does:

  1. Reads crash signatures (PC, input slice, sanitizer messages)
  2. Attempts to reproduce locally with gdb
  3. Generates minimal exploit payloads
  4. Validates in sandbox
  5. Saves working PoVs, re-queues failures as fuzzing seeds
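Step 1 above (reading crash signatures) can be sketched by parsing sanitizer output and bucketing duplicates; the report lines below are illustrative ASan-style messages:

```python
import re

def crash_signature(report: str):
    """Extract a (bug type, faulting PC) pair from an ASan-style report line."""
    m = re.search(r"ERROR: AddressSanitizer: (\S+) on .* pc (0x[0-9a-f]+)", report)
    return (m.group(1), m.group(2)) if m else None

reports = [
    "==1==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000011 pc 0x4011a2",
    "==2==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000019 pc 0x4011a2",
    "==3==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 pc 0x401337",
]
# Different inputs hitting the same (type, pc) collapse into one crash bucket.
unique = {crash_signature(r) for r in reports}
```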

Output structure:

povs/
├── crash_001/
│   ├── input.bin          # Minimal triggering input
│   ├── gdb-session.txt    # Reproduction steps
│   └── analysis.md        # Vulnerability explanation
└── failed_seeds.txt       # Re-queued for fuzzing

4. Directed Fuzzing with Mutation Dictionaries

Fine-tuned code models can suggest targeted mutation patterns for specific functions.

Generate mutation dictionaries:

python scripts/gen_seeds.py \
  --format custom \
  --prompt "Give mutation dictionary entries likely to break memory safety in sprintf wrapper" \
  --output mutations.txt

Example output:

{"pattern": "%99999999s"}
{"pattern": "AAAAAAAA....<1024>....%n"}

Integrate with AFL++:

afl-fuzz -i seeds.txt -o findings/ \
  -x mutations.txt \
  -- ./target @@

5. AI-Guided Patching

Super Patches

Cluster crash signatures and generate unified patches that fix multiple bugs from a common root cause.

python scripts/analyze_crashes.py \
  --crash-db crashes/ \
  --mode super-patch \
  --output patches/

Prompt template:

Here are N stack traces + file snippets.
Identify the shared mistake and generate a unified diff fixing all occurrences.
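Clustering before prompting can be sketched by grouping traces on their innermost stack frame; the crash IDs and frame names below are illustrative:

```python
from collections import defaultdict

def cluster_by_top_frame(traces: dict[str, list[str]]) -> dict[str, list[str]]:
    """Group crash IDs by the innermost stack frame of each trace."""
    clusters = defaultdict(list)
    for crash_id, frames in traces.items():
        clusters[frames[0]].append(crash_id)
    return dict(clusters)

clusters = cluster_by_top_frame({
    "crash_001": ["copy_field", "parse_header", "main"],
    "crash_002": ["copy_field", "parse_body", "main"],
    "crash_003": ["handle_upload", "main"],
})
# crash_001 and crash_002 share a root cause in copy_field -> one super patch.
```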

Speculative Patch Queue

Interleave confirmed PoV-validated patches with speculative patches at a tunable ratio.

Configuration:

{
  "confirmed_ratio": 1,
  "speculative_ratio": 2,
  "penalty_threshold": 0.3
}
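With confirmed_ratio 1 and speculative_ratio 2, the queue emits one confirmed patch followed by two speculative ones, repeating until both pools drain; a minimal sketch:

```python
def interleave(confirmed, speculative, confirmed_ratio=1, speculative_ratio=2):
    """Emit patches in repeating blocks: <ratio> confirmed, then <ratio> speculative."""
    queue, c, s = [], iter(confirmed), iter(speculative)
    while True:
        block = [next(c, None) for _ in range(confirmed_ratio)]
        block += [next(s, None) for _ in range(speculative_ratio)]
        block = [p for p in block if p is not None]  # drop exhausted slots
        if not block:
            return queue
        queue += block

order = interleave(["C1", "C2"], ["S1", "S2", "S3"])
```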

End-to-End Workflow

graph TD
    subgraph Discovery
        A[LLM Seed/Grammar Gen] --> B[Fuzzer]
        C[Fine-Tuned Model Dicts] --> B
    end
    B --> D[Crash DB]
    D --> E[Agent PoV Gen]
    E -->|valid PoV| PatchQueue
    D -->|cluster| F[LLM Super-Patch]
    PatchQueue --> G[Patch Submitter]

Recommended sequence:

  1. Generate seeds with gen_seeds.py
  2. Run a fuzzer (AFL++, libFuzzer, Honggfuzz)
  3. Collect crashes in the crash database
  4. Run analyze_crashes.py for PoV generation
  5. Generate patches with super-patch mode
  6. Submit patches and monitor scoring
  7. Feed failed PoVs back as fuzzing seeds
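The last step of the sequence, re-queuing failed PoVs, can be sketched as copying entries from failed_seeds.txt into the fuzzer's input directory; the demo below runs in a temporary workspace with illustrative payloads:

```python
from pathlib import Path
import tempfile

def requeue_failed(failed_list: Path, seed_dir: Path) -> int:
    """Write each failed PoV attempt back into the fuzzer's seed directory."""
    seed_dir.mkdir(parents=True, exist_ok=True)
    n = 0
    for n, line in enumerate(failed_list.read_text().splitlines(), 1):
        (seed_dir / f"requeued_{n:04d}").write_bytes(line.encode())
    return n

# Demo workspace; a real run would use povs/failed_seeds.txt and seeds/.
ws = Path(tempfile.mkdtemp())
(ws / "failed_seeds.txt").write_text("payload_a\npayload_b\n")
count = requeue_failed(ws / "failed_seeds.txt", ws / "seeds")
```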

Best Practices

Seed Generation

  • Diversify encodings: Ask for UTF-8, URL-encoded, UTF-16-LE variants
  • Respect limits: Keep payloads under common length thresholds
  • Single script: Request self-contained Python scripts to avoid JSON parsing issues

Grammar Evolution

  • Budget tokens: Each refinement costs tokens; set reasonable limits
  • Use diffs: Prefer patch instructions over full rewrites
  • Stop early: Halt when coverage improvement plateaus (Δ < 0.01)

Crash Analysis

  • Parallelize: Spawn multiple agents with different models/temperatures
  • Validate: Always test PoVs in sandbox before submission
  • Feedback loop: Failed attempts become new fuzzing seeds

Patching

  • Cluster first: Group crashes by signature before patching
  • Cost model: Track penalties vs. points to tune speculative ratio
  • Unified diffs: Prefer single patches fixing multiple bugs
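The cost-model tip above can be made concrete: a speculative patch is only worth submitting while its expected score stays positive. The point and penalty values below are assumptions for illustration, not a real scoring scheme:

```python
def expected_value(p_correct: float, points: float, penalty: float) -> float:
    """Expected score of submitting one speculative (unvalidated) patch."""
    return p_correct * points - (1 - p_correct) * penalty

# Assumed scoring: +1.0 for a correct patch, -0.5 penalty for a wrong one.
ev = expected_value(p_correct=0.4, points=1.0, penalty=0.5)
worth_submitting = ev > 0  # tune speculative_ratio down when this flips negative
```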

Integration with Existing Tools

AFL++

# Generate seeds
python scripts/gen_seeds.py --format sql --output seeds/

# Run with mutation dictionary
afl-fuzz -i seeds/ -o findings/ -x mutations.txt -- ./target @@

libFuzzer

# Generate grammar
python scripts/evolve_grammar.py --grammar grammar.txt

# Compile the fuzz harness
clang -fsanitize=fuzzer,address -o fuzzer fuzzer.cpp

# Run against a corpus directory seeded with grammar-generated inputs
# (libFuzzer treats positional arguments as corpus directories)
./fuzzer corpus/

Honggfuzz

# Generate seeds
python scripts/gen_seeds.py --format custom --prompt "..." --output seeds/

# Run (honggfuzz uses ___FILE___ as its input-file placeholder, not @@)
honggfuzz -i seeds/ -- ./target ___FILE___

Troubleshooting

Seeds not triggering new coverage:

  • Increase payload diversity (ask for more encodings)
  • Try grammar evolution instead of static seeds
  • Check if target has input validation blocking malformed inputs

Grammar not improving:

  • Verify coverage metrics are accurate
  • Increase refinement epochs
  • Try different LLM or temperature settings

PoV generation failing:

  • Check crash reproducibility manually first
  • Increase agent count for parallel attempts
  • Lower temperature for more deterministic outputs

Patches being rejected:

  • Validate PoVs before patching
  • Reduce speculative patch ratio
  • Review crash clustering for false positives
