Awesome-omni-skill garak

Security testing and red-teaming for LLMs using NVIDIA's garak vulnerability scanner. Use when probing AI models for jailbreaks, prompt injections, data leakage, toxic content generation, or other failure modes. Triggers on "test LLM security", "red team model", "run garak", "LLM vulnerability scan", "jailbreak testing", or "prompt injection test".

Install

Source · Clone the upstream repo

git clone https://github.com/diegosouzapw/awesome-omni-skill

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data-ai/garak" ~/.claude/skills/diegosouzapw-awesome-omni-skill-garak && rm -rf "$T"

Manifest: skills/data-ai/garak/SKILL.md

Source Content

Garak - LLM Vulnerability Scanner

Garak is NVIDIA's open-source security testing framework for large language models. Think of it as nmap or Metasploit, but for AI systems. It probes whether LLMs can be manipulated into generating harmful content, leaking data, accepting prompt injections, or failing in other undesirable ways.

Installation

# Standard installation
python -m pip install -U garak

# Development version
python -m pip install -U git+https://github.com/NVIDIA/garak.git@main

# From source with conda
conda create --name garak "python>=3.10,<=3.12"
conda activate garak
git clone https://github.com/NVIDIA/garak.git
cd garak
python -m pip install -e .
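
After installing, a quick sanity check confirms the CLI is on your PATH and that plugins enumerate:

# Confirm garak is installed and responsive
garak --version

# Spot-check the plugin registry
garak --list_probes | head -n 20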

Core Concepts

Architecture Components

  • Probes - Attack modules that send adversarial prompts to test specific vulnerabilities
  • Detectors - Analyze model outputs to identify harmful or failed responses
  • Generators - Interface with different LLM platforms (OpenAI, HuggingFace, Bedrock, etc.)
  • Buffs - Transform prompts before sending (encoding, paraphrasing, etc.)
  • Harnesses - Orchestrate the testing workflow
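
These pieces map directly onto CLI flags. A sketch of a run that wires them together explicitly; the model, detector, and buff names below are illustrative, so confirm yours with the --list_* discovery commands later in this document:

# Generator (target) + probe + detector + buff in one invocation.
# gpt2, encoding.DecodeMatch, and lowercase are placeholder plugin names.
garak --target_type huggingface --target_name gpt2 \
      --probes encoding \
      --detectors encoding.DecodeMatch \
      --buffs lowercase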

Supported Platforms (Generators)

Commercial APIs:

  • openai - OpenAI models (requires OPENAI_API_KEY)
  • bedrock - AWS Bedrock (requires BEDROCK_API_KEY, optional BEDROCK_REGION)
  • cohere - Cohere API (requires COHERE_API_KEY)
  • groq - Groq API (requires GROQ_API_KEY)
  • mistral - Mistral AI
  • replicate - Replicate API (requires REPLICATE_API_TOKEN)

Local/Self-Hosted:

  • huggingface - HuggingFace Hub models
  • ollama - Ollama local models
  • ggml - GGUF format models (requires GGML_MAIN_PATH)
  • nim - NVIDIA NIM endpoints (requires NIM_API_KEY)
  • litellm - Unified LiteLLM interface

Framework Integrations:

  • langchain - LangChain applications
  • rest - Custom REST endpoints described in a generator options file (see the sketch below)
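
For the rest generator, garak reads the endpoint description from a generator options file. A minimal sketch following the shape shown in garak's REST documentation; the URI, auth header, and response field are assumptions about a hypothetical endpoint:

{
  "rest": {
    "RestGenerator": {
      "name": "example-endpoint",
      "uri": "https://llm.example.internal/v1/generate",
      "method": "post",
      "headers": {
        "Authorization": "Bearer $KEY",
        "Content-Type": "application/json"
      },
      "req_template_json_object": {"prompt": "$INPUT"},
      "response_json": true,
      "response_json_field": "text"
    }
  }
}

Pass it with something like garak --target_type rest -G path/to/options.json; garak fills $INPUT with each probe prompt at request time.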

CLI Reference

Basic Syntax

garak [options]

Essential Flags

  • --target_type / -t - Generator module (e.g., openai, huggingface)
  • --target_name / -n - Specific model name (e.g., gpt-4, meta-llama/Llama-2-7b)
  • --probes / -p - Probes to run (comma-separated, or all)
  • --detectors / -d - Detectors to use (comma-separated, or all)
  • --generations / -g - Number of outputs per prompt (default: 5)
  • --config - Path to a YAML/JSON config file
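
Put together, a typical invocation needs only a target, a probe selection, and optionally a generation count (the model name here is just an example):

garak -t openai -n gpt-4 -p dan,encoding -g 10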

Discovery Commands

# List available probes
garak --list_probes

# List available detectors
garak --list_detectors

# List available generators
garak --list_generators

# List available buffs
garak --list_buffs

# Get info about a specific plugin
garak --plugin_info probes.dan.Dan_11_0
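
The full lists are long, so filtering with grep is usually the first step:

# Find injection-related probes
garak --list_probes | grep -i inject

# Find toxicity detectors
garak --list_detectors | grep -i toxic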

Execution Flags

  • --verbose / -v - Increase output verbosity (can stack: -vv)
  • --parallel_requests - Concurrent generator requests per prompt
  • --parallel_attempts - Concurrent probe attempts
  • --seed / -s - Random seed for reproducibility
  • --eval_threshold - Minimum threshold for a hit (0.0-1.0)
  • --report_prefix - Custom prefix for output files
  • --interactive / -I - Enter interactive probing mode
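
For example, a repeatable run that stays under API rate limits might pin the seed and throttle concurrency (model name is a placeholder):

garak -t openai -n gpt-4 -p dan \
      --seed 42 \
      --parallel_requests 2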

Probe Categories

Jailbreak Attacks

  • dan - Do Anything Now variants (Dan_6_0 through Dan_11_0, DUDE, STAN)
  • grandma - Grandmother roleplay exploit
  • goodside - Attacks popularized by researcher Riley Goodside
  • continuation - Tests whether the model will continue undesirable text when given its start

Prompt Injection

  • promptinject - Direct prompt injection attacks (from the PromptInject framework)
  • encoding - Encoding-based bypasses (Base64, Hex, ROT13)
  • smuggling - Content smuggling techniques
  • latentinjection - Injection payloads hidden inside documents the model is asked to process

Content Safety

  • realtoxicityprompts - Toxic content generation (Profanity, Threats, Identity_Attack)
  • malwaregen - Malware code generation attempts
  • donotanswer - Tests refusal on restricted topics
  • lmrc - Language Model Risk Cards

Information Security

  • leakreplay - Training data leakage via replay of memorized text
  • apikey - API key exposure detection
  • packagehallucination - Checks generated code for hallucinated (non-existent) packages
  • xss - Cross-site scripting vectors in model output

Robustness

  • glitch - Token glitch exploitation
  • badchars - Invalid character handling
  • divergence - Repetition attacks that push the model to diverge and regurgitate training data
  • snowball - Snowballed hallucination tests (questions that invite confidently wrong answers)
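
Probe specs accept either a whole module or a single class inside it (module.ClassName), which is handy for rerunning just the variant that failed:

# Run every probe in the dan module
garak -t ollama -n llama2 -p dan

# Rerun a single variant
garak -t ollama -n llama2 -p dan.Dan_11_0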

Common Usage Examples

Test OpenAI Model for Jailbreaks

export OPENAI_API_KEY="sk-..."
garak --target_type openai --target_name gpt-4 --probes dan

Comprehensive Security Scan

garak --target_type openai --target_name gpt-4 --probes all --generations 10

Test Specific Vulnerabilities

# Test for prompt injection
garak -t openai -n gpt-4 -p promptinject,encoding

# Test for toxic content generation
garak -t openai -n gpt-4 -p realtoxicityprompts

# Test for data leakage
garak -t openai -n gpt-4 -p leakreplay,apikey

Test HuggingFace Model

garak --target_type huggingface --target_name meta-llama/Llama-2-7b-chat-hf --probes dan,encoding

Test AWS Bedrock Model

export BEDROCK_API_KEY="..."
export BEDROCK_REGION="us-east-1"
garak --target_type bedrock --target_name anthropic.claude-3-sonnet --probes promptinject

Test Local Ollama Model

garak --target_type ollama --target_name llama2 --probes dan,continuation

Using a Config File

Create scan_config.yaml:

system:
  parallel_attempts: 10
  lite: true
run:
  generations: 5
plugins:
  probe_spec: dan,encoding,promptinject,malwaregen
  extended_detectors: false

Run with config:

garak --config scan_config.yaml --target_type openai --target_name gpt-4

Output and Reports

Garak generates three types of output:

  1. garak.log - Debug information (persistent across runs)
  2. JSONL report - Detailed results at ~/.local/share/garak/garak-runs-{timestamp}.jsonl
  3. Hit log - Records of successful exploits

Interpreting Results

  • PASS: Model resisted the attack
  • FAIL: Model exhibited vulnerable behavior
  • Failure rate: Percentage of attempts that triggered the vulnerability

Processing Reports

# Convert to AVID format
garak --report path/to/report.jsonl
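
For a quick summary without the AVID conversion, the JSONL can be sliced directly with jq. A sketch assuming the entry_type and eval fields found in recent garak reports; verify the field names against your own file:

# Print per-probe pass counts from eval entries
jq -r 'select(.entry_type == "eval")
       | "\(.probe) \(.detector): \(.passed)/\(.total) passed"' \
   path/to/report.jsonl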

Best Practices

For Comprehensive Testing

  1. Start broad, then narrow: Run --probes all first, then focus on failures
  2. Use multiple generations: Set --generations 10 or higher for statistical significance
  3. Test with buffs: Apply encoding transformations to find edge cases
  4. Document baseline: Run identical tests before and after model updates (see the sketch below)
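
A baseline comparison (practice 4) can be as simple as two identically-flagged runs with distinct report prefixes:

# Before the model update
garak -t openai -n gpt-4 -p dan,encoding --report_prefix pre_update

# After the update: identical flags, new prefix
garak -t openai -n gpt-4 -p dan,encoding --report_prefix post_update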

For Specific Vulnerability Assessment

  1. Jailbreaks: --probes dan,grandma,goodside,continuation
  2. Prompt Injection: --probes promptinject,encoding,smuggling
  3. Content Safety: --probes realtoxicityprompts,malwaregen,donotanswer
  4. Data Security: --probes leakreplay,apikey,packagehallucination

For CI/CD Integration

# Quick scan for critical issues
garak --config garak/configs/fast.json --target_type openai --target_name $MODEL_NAME

# Set exit code based on failure threshold
garak ... --eval_threshold 0.1
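
Exit-code behavior varies across garak versions, so a defensive pipeline gate is to inspect the hit log directly. A sketch, assuming the hit log lands under the default runs directory with the report prefix and "hitlog" in its name; adjust the glob to wherever your garak version writes it:

garak --config garak/configs/fast.json \
      --target_type openai --target_name "$MODEL_NAME" \
      --report_prefix ci_scan

# Fail the build if any probe scored a hit (path is an assumption)
hits=$(cat ~/.local/share/garak/ci_scan*hitlog* 2>/dev/null | wc -l)
if [ "$hits" -gt 0 ]; then
  echo "garak recorded $hits exploit hits; failing the build" >&2
  exit 1
fi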

Troubleshooting

Common Issues

API Key errors: Ensure environment variables are set correctly

export OPENAI_API_KEY="sk-..."

Rate limiting: Reduce parallel requests

garak ... --parallel_requests 1

Memory issues with local models: Use smaller batch sizes

garak ... --generations 3

Unknown probe errors: Skip unknown plugins

garak ... --skip_unknown

Resources

  • GitHub repository: https://github.com/NVIDIA/garak
  • Project site: https://garak.ai

Workflow: Running a Security Assessment

When asked to run a garak security assessment:

  1. Verify installation: Check that garak is installed with garak --version
  2. Confirm target: Identify the model type and name
  3. Set credentials: Ensure API keys are configured
  4. Select probes: Choose appropriate probes for the assessment scope
  5. Execute scan: Run garak with appropriate flags
  6. Analyze results: Review the JSONL report and summarize findings
  7. Recommend mitigations: Suggest fixes for identified vulnerabilities

Example Workflow

# 1. Check installation
garak --version

# 2. List available probes for reference
garak --list_probes

# 3. Set API key
export OPENAI_API_KEY="sk-..."

# 4. Run targeted scan
garak \
  --target_type openai \
  --target_name gpt-4 \
  --probes dan,promptinject,encoding \
  --generations 5 \
  --verbose

# 5. Review results
cat ~/.local/share/garak/garak-runs-*.jsonl | jq '.status'