Claude-skill-registry google-continuous-fuzzing
Apply Google's continuous fuzzing methodology using OSS-Fuzz and ClusterFuzz. Emphasizes coverage-guided fuzzing, automated bug triage, and integration into CI/CD. Use when building robust testing infrastructure or finding security vulnerabilities at scale.
```sh
git clone https://github.com/majiayu000/claude-skill-registry
```

```sh
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/continuous-fuzzing" ~/.claude/skills/majiayu000-claude-skill-registry-google-continuous-fuzzing && rm -rf "$T"
```
skills/data/continuous-fuzzing/SKILL.md

Google Continuous Fuzzing
Overview
Google's continuous fuzzing infrastructure (OSS-Fuzz + ClusterFuzz) has found over 10,000 bugs in 1,000+ open source projects, including critical security vulnerabilities like Heartbleed-class bugs. This technique turns fuzzing from a one-time activity into a continuous quality gate.
References
- Paper: "OSS-Fuzz - Google's continuous fuzzing service for open source software" (USENIX Security '17)
- Documentation: https://google.github.io/oss-fuzz/
- ClusterFuzz: https://google.github.io/clusterfuzz/
Core Philosophy
"Fuzzing should be continuous, not a one-time event."
"Every bug found by fuzzing is a bug not found by attackers."
Fuzzing is most effective when it runs continuously against the latest code, with automatic bug reporting and regression tracking.
Key Concepts
Coverage-Guided Fuzzing
Traditional fuzzing generates random inputs. Coverage-guided fuzzing keeps any input that increases code coverage:

```
Corpus → Mutate → Execute → Measure Coverage → Keep interesting inputs
   ↑                                                      |
   └──────────────────────────────────────────────────────┘
```
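The loop above can be sketched as a toy mutation loop. This is illustrative only: `toy_target` and its set-of-branches "coverage" are stand-ins for an instrumented binary, and real fuzzers use far richer mutation strategies.

```python
import random


def mutate(data: bytes) -> bytes:
    """Append or flip a random byte (real fuzzers use many more strategies)."""
    buf = bytearray(data)
    if not buf or random.random() < 0.3:
        buf.append(random.randrange(256))
    else:
        i = random.randrange(len(buf))
        buf[i] = random.randrange(256)
    return bytes(buf)


def fuzz_loop(target, seeds, iterations=2000):
    """Keep any mutated input that reaches new coverage."""
    corpus = list(seeds)
    seen = set()
    for data in corpus:
        seen |= target(data)
    for _ in range(iterations):
        candidate = mutate(random.choice(corpus))
        cov = target(candidate)
        if cov - seen:                # reached edges we have not seen before
            corpus.append(candidate)  # keep the "interesting" input
            seen |= cov
    return corpus, seen


def toy_target(data: bytes):
    """Toy target whose 'coverage' is the set of branches taken."""
    cov = {"entry"}
    if len(data) > 1:
        cov.add("len>1")
        if data[0] == 0x42:
            cov.add("magic")
    return cov
```

The key property is that the corpus only ever grows with inputs that exercised new behavior, so later mutations start from progressively deeper states.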
The Fuzzing Pipeline
- Build: Compile with sanitizers (ASan, MSan, UBSan)
- Fuzz: Run fuzzers continuously on cluster
- Triage: Automatically deduplicate and file bugs
- Reproduce: Generate minimal reproducer
- Verify: Confirm fix eliminates the bug
- Regress: Add reproducer to regression corpus
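The triage step can be approximated by hashing the top frames of each crash stack, so that many crashing inputs collapse into one bug report. This is a simplified sketch of the idea (real ClusterFuzz triage also considers crash type and sanitizer output); the frame format assumed here is the sanitizer style `#N 0xADDR in func file:line`.

```python
import hashlib
import re


def crash_signature(stack_trace: str, top_frames: int = 3) -> str:
    """Deduplicate crashes by hashing the top N stack frames.

    Addresses are stripped so ASLR does not break deduplication.
    """
    frames = []
    for line in stack_trace.splitlines():
        m = re.match(r"\s*#\d+\s+0x[0-9a-f]+\s+in\s+(\S+)", line)
        if m:
            frames.append(m.group(1))
        if len(frames) == top_frames:
            break
    return hashlib.sha256("|".join(frames).encode()).hexdigest()[:16]


def triage(crashes):
    """Group crash files by signature; one bug report per group."""
    groups = {}
    for name, trace in crashes.items():
        groups.setdefault(crash_signature(trace), []).append(name)
    return groups
```

Two crashes at the same call path but different addresses get the same signature, while a crash in a different function gets its own group.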
When Implementing
Always
- Use sanitizers (AddressSanitizer, MemorySanitizer, UndefinedBehaviorSanitizer)
- Build seed corpus from existing tests and real inputs
- Integrate fuzzing into CI/CD pipeline
- Track coverage metrics over time
- Minimize reproducers for easier debugging
- Keep regression tests for all found bugs
Never
- Fuzz only once and declare victory
- Ignore crashes in dependencies
- Skip sanitizers to "improve performance"
- Discard valuable corpus data
- Treat fuzzing as separate from testing
Prefer
- LibFuzzer/AFL++ over basic random testing
- Structure-aware fuzzing for complex formats
- Continuous fuzzing over periodic runs
- Automated triage over manual analysis
- Coverage metrics over time-based metrics
Implementation Patterns
Basic Fuzz Target (C/C++)
```cpp
// fuzz_target.cc
// A fuzz target is a function that takes arbitrary bytes
#include <stdint.h>
#include <stddef.h>

// Your library headers
#include "parser.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  // Call the function under test with fuzzer-provided data
  parse_input(data, size);

  // Return 0 - non-zero return values are reserved
  return 0;
}

// Build with:
//   clang++ -g -fsanitize=address,fuzzer fuzz_target.cc parser.cc -o fuzzer
// Run with:
//   ./fuzzer corpus_dir/
```
Fuzz Target with Structure
```cpp
// Structure-aware fuzzing for better coverage
#include <stdint.h>
#include <stddef.h>
#include <string.h>

// Your library headers (declare process_packet)

// Fuzz a function expecting a specific structure
struct Header {
  uint32_t magic;
  uint32_t version;
  uint32_t length;
  uint8_t flags;
};

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  // Need at least header size
  if (size < sizeof(Header)) {
    return 0;
  }

  Header header;
  memcpy(&header, data, sizeof(Header));

  // Constrain to valid magic (helps fuzzer find deeper paths)
  if (header.magic != 0xDEADBEEF) {
    return 0;
  }

  // Constrain length to available data
  size_t payload_size = size - sizeof(Header);
  if (header.length > payload_size) {
    header.length = payload_size;
  }

  const uint8_t *payload = data + sizeof(Header);

  // Now fuzz with valid-looking input
  process_packet(&header, payload, header.length);
  return 0;
}
```
Python Fuzzing with Atheris
```python
#!/usr/bin/env python3
# fuzz_json_parser.py
import atheris
import sys

# Import the module to fuzz
import json


def test_one_input(data):
    """Fuzz target: called with random bytes"""
    fdp = atheris.FuzzedDataProvider(data)

    # Convert bytes to string for JSON parsing
    json_str = fdp.ConsumeUnicodeNoSurrogates(
        fdp.ConsumeIntInRange(0, 1024)
    )

    try:
        # This should never crash, only raise ValueError
        json.loads(json_str)
    except (json.JSONDecodeError, ValueError):
        pass  # Expected for invalid input
    except Exception:
        # Unexpected exception = potential bug
        raise


def main():
    atheris.Setup(sys.argv, test_one_input)
    atheris.Fuzz()


if __name__ == "__main__":
    main()

# Run with:
#   python fuzz_json_parser.py corpus_dir/ -max_len=1024
```
Go Fuzzing (Native)
```go
// fuzz_test.go
// Go 1.18+ has built-in fuzzing support
package parser

import (
	"testing"
)

func FuzzParseInput(f *testing.F) {
	// Seed corpus with known inputs
	f.Add([]byte("valid input"))
	f.Add([]byte("{\"key\": \"value\"}"))
	f.Add([]byte(""))

	f.Fuzz(func(t *testing.T, data []byte) {
		// Call function under test
		result, err := ParseInput(data)
		if err != nil {
			// Errors are fine, panics are not
			return
		}

		// Optionally verify invariants
		if result != nil && result.Length < 0 {
			t.Errorf("negative length: %d", result.Length)
		}
	})
}

// Run with:
//   go test -fuzz=FuzzParseInput -fuzztime=60s
```
OSS-Fuzz Integration
```dockerfile
# Dockerfile for OSS-Fuzz integration
FROM gcr.io/oss-fuzz-base/base-builder

RUN apt-get update && apt-get install -y \
    make \
    autoconf \
    automake \
    libtool

# Clone your project
RUN git clone --depth 1 https://github.com/your/project.git
WORKDIR project
COPY build.sh $SRC/
```
```sh
#!/bin/bash
# build.sh - OSS-Fuzz build script

# Build the library with fuzzing instrumentation
./configure
make clean
make -j$(nproc) CC="$CC" CXX="$CXX" CFLAGS="$CFLAGS" CXXFLAGS="$CXXFLAGS"

# Build fuzz targets
$CXX $CXXFLAGS $LIB_FUZZING_ENGINE \
    fuzz_target.cc -o $OUT/fuzz_target \
    -I. libproject.a

# Copy seed corpus
zip -j $OUT/fuzz_target_seed_corpus.zip seeds/*

# Copy dictionary if available
cp project.dict $OUT/fuzz_target.dict
```
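The dictionary copied above uses the libFuzzer/AFL dictionary syntax: one quoted token per line, with an optional name and `\xNN` escapes for binary bytes. The tokens below are illustrative only; the magic bytes assume the `0xDEADBEEF` header check from the structure-aware example on a little-endian target.

```
# project.dict - tokens the fuzzer can splice into inputs
magic="\xef\xbe\xad\xde"
kw_begin="BEGIN"
kw_end="END"
version_field="version"
```

Dictionaries let the fuzzer get past exact-match comparisons (magic numbers, keywords) that random mutation would rarely satisfy.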
Corpus Management
```python
# corpus_manager.py
# Manage and minimize fuzzing corpus
import subprocess
import hashlib
from pathlib import Path


class CorpusManager:
    def __init__(self, corpus_dir: str):
        self.corpus_dir = Path(corpus_dir)
        self.corpus_dir.mkdir(exist_ok=True)

    def add(self, data: bytes) -> str:
        """Add input to corpus with content-based filename"""
        hash_name = hashlib.sha256(data).hexdigest()[:16]
        path = self.corpus_dir / hash_name
        if not path.exists():
            path.write_bytes(data)
        return str(path)

    def minimize(self, fuzzer_binary: str) -> int:
        """Minimize corpus using the fuzzer's merge feature"""
        minimized_dir = self.corpus_dir.parent / "corpus_minimized"
        minimized_dir.mkdir(exist_ok=True)

        # LibFuzzer's -merge=1 copies only coverage-increasing inputs
        subprocess.run([
            fuzzer_binary,
            "-merge=1",
            str(minimized_dir),
            str(self.corpus_dir)
        ], capture_output=True)

        return len(list(minimized_dir.iterdir()))

    def get_coverage_report(self, fuzzer_binary: str) -> dict:
        """Generate coverage report for corpus"""
        # Run existing inputs only; -runs=0 generates no new ones
        result = subprocess.run([
            fuzzer_binary,
            "-runs=0",
            str(self.corpus_dir)
        ], capture_output=True, text=True)

        # Parse coverage from result.stderr here
        # (actual implementation depends on sanitizer output format)
        return {"corpus_size": len(list(self.corpus_dir.iterdir()))}
```
CI/CD Integration
```yaml
# .github/workflows/fuzz.yml
name: Continuous Fuzzing

on:
  push:
    branches: [main]
  schedule:
    - cron: '0 0 * * *'  # Daily

jobs:
  fuzz:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Build fuzzer
        run: |
          clang++ -g -O1 \
            -fsanitize=address,fuzzer \
            -fno-omit-frame-pointer \
            fuzz_target.cc -o fuzzer

      - name: Download corpus
        uses: actions/cache@v3
        with:
          path: corpus
          key: fuzz-corpus-${{ github.sha }}
          restore-keys: fuzz-corpus-

      - name: Run fuzzer
        run: |
          mkdir -p corpus
          timeout 600 ./fuzzer corpus/ -max_total_time=600 || true

      - name: Upload crash artifacts
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: crashes
          path: crash-*
          if-no-files-found: ignore

      - name: Check for crashes
        run: |
          if ls crash-* 1> /dev/null 2>&1; then
            echo "Crashes found!"
            exit 1
          fi
```
Mental Model
Google's fuzzing approach asks:
- Is this running continuously? One-time fuzzing misses regression bugs
- Are sanitizers enabled? Crashes without sanitizers miss real bugs
- Is the corpus growing? Coverage should increase over time
- Are bugs being tracked? Automatic filing and deduplication
- Are fixes verified? Reproducers become regression tests
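The "is the corpus growing?" question can be answered mechanically by parsing libFuzzer's status lines, which report edge coverage as `cov: N`. A sketch, assuming that log format (the sample values in the test are made up, and `is_plateaued` is a hypothetical helper, not a libFuzzer feature):

```python
import re

# libFuzzer status lines look like:
#   #4096  NEW    cov: 512 ft: 600 corp: 12/1024b exec/s: 2048 ...
COV_RE = re.compile(r"cov:\s+(\d+)")


def coverage_trend(log_text: str):
    """Extract the sequence of edge-coverage values from a fuzzer log."""
    return [int(m.group(1)) for m in COV_RE.finditer(log_text)]


def is_plateaued(trend, window=5):
    """Flag a run whose coverage stopped growing over the last `window` reads."""
    if len(trend) < window + 1:
        return False
    return trend[-1] <= trend[-1 - window]
```

A plateaued trend suggests the fuzzer needs better seeds, a dictionary, or a structure-aware target rather than more CPU time.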
Signature Moves
- Coverage-guided mutation (LibFuzzer, AFL++)
- Sanitizer builds (ASan, MSan, UBSan, TSan)
- Automatic corpus management and minimization
- CI/CD integration for every commit
- Regression corpus from found bugs
- Structure-aware fuzzing for protocols