# Awesome-omni-skill garak

Security testing and red-teaming for LLMs using NVIDIA's garak vulnerability scanner. Use when probing AI models for jailbreaks, prompt injections, data leakage, toxic content generation, or other failure modes. Triggers on "test LLM security", "red team model", "run garak", "LLM vulnerability scan", "jailbreak testing", or "prompt injection test".

```sh
# Clone the repository
git clone https://github.com/diegosouzapw/awesome-omni-skill

# Or install just this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data-ai/garak" ~/.claude/skills/diegosouzapw-awesome-omni-skill-garak && rm -rf "$T"
```
skills/data-ai/garak/SKILL.md

# Garak - LLM Vulnerability Scanner
Garak is NVIDIA's open-source security testing framework for large language models. Think of it as nmap or Metasploit, but for AI systems. It probes whether LLMs can be manipulated into generating harmful content, leaking data, accepting prompt injections, or failing in other undesirable ways.
## Installation

```sh
# Standard installation
python -m pip install -U garak

# Development version
python -m pip install -U git+https://github.com/NVIDIA/garak.git@main

# From source with conda
conda create --name garak "python>=3.10,<=3.12"
conda activate garak
git clone https://github.com/NVIDIA/garak.git
cd garak
python -m pip install -e .
```
## Core Concepts

### Architecture Components
| Component | Purpose |
|---|---|
| Probes | Attack modules that send adversarial prompts to test specific vulnerabilities |
| Detectors | Analyze model outputs to identify harmful or failed responses |
| Generators | Interface with different LLM platforms (OpenAI, HuggingFace, Bedrock, etc.) |
| Buffs | Transform prompts before sending (encoding, paraphrasing, etc.) |
| Harnesses | Orchestrate the testing workflow |
### Supported Platforms (Generators)

Commercial APIs:
- `openai` - OpenAI models (requires `OPENAI_API_KEY`)
- `bedrock` - AWS Bedrock (requires `BEDROCK_REGION`, optional `BEDROCK_API_KEY`)
- `cohere` - Cohere API (requires `COHERE_API_KEY`)
- `groq` - Groq API (requires `GROQ_API_KEY`)
- `mistral` - Mistral AI
- `replicate` - Replicate API (requires `REPLICATE_API_TOKEN`)

Local/Self-Hosted:
- `huggingface` - HuggingFace Hub models
- `ollama` - Ollama local models
- `ggml` - GGUF format models (requires `GGML_MAIN_PATH`)
- `nim` - NVIDIA NIM endpoints (requires `NIM_API_KEY`)
- `litellm` - Unified LiteLLM interface

Framework Integrations:
- `langchain` - LangChain applications
- `rest` - Custom REST endpoints via YAML config
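For the `rest` generator, garak reads the endpoint description from a config file. The sketch below is an assumption based on garak's documented `RestGenerator` options (`uri`, `method`, `headers`, `req_template_json_object`, `response_json`, `response_json_field`); the endpoint URL is hypothetical, and `$INPUT` / `$KEY` are garak's placeholders for the probe prompt and the API key. Verify the exact field names against the REST generator docs for your garak version.

```yaml
rest:
  RestGenerator:
    uri: "https://example.internal/v1/chat"  # hypothetical endpoint
    method: post
    headers:
      Authorization: "Bearer $KEY"           # $KEY is substituted from the REST_API_KEY env var
      Content-Type: "application/json"
    req_template_json_object:
      prompt: "$INPUT"                       # $INPUT is replaced with each probe prompt
    response_json: true
    response_json_field: "text"              # where the model's reply appears in the response body
```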
## CLI Reference

### Basic Syntax

```sh
garak [options]
```

### Essential Flags

| Flag | Description |
|---|---|
| `--target_type` | Generator module (e.g., `openai`, `huggingface`) |
| `--target_name` | Specific model name (e.g., `gpt-4`, `llama2`) |
| `--probes` | Probes to run (comma-separated, or `all`) |
| `--detectors` | Detectors to use (comma-separated, or `all`) |
| `--generations` | Number of outputs per prompt (default: 5) |
| `--config` | Path to YAML/JSON config file |
### Discovery Commands

```sh
# List available probes
garak --list_probes

# List available detectors
garak --list_detectors

# List available generators
garak --list_generators

# List available buffs
garak --list_buffs

# Get info about a specific plugin
garak --plugin_info probes.dan.Dan_11_0
```
### Execution Flags

| Flag | Description |
|---|---|
| `-v`, `--verbose` | Increase output verbosity (can stack: `-vv`) |
| `--parallel_requests` | Concurrent generator requests per prompt |
| `--parallel_attempts` | Concurrent probe attempts |
| `--seed` | Random seed for reproducibility |
| `--eval_threshold` | Minimum threshold for a hit (0.0-1.0) |
| `--report_prefix` | Custom prefix for output files |
| `--interactive` | Enter interactive probing mode |
## Probe Categories

### Jailbreak Attacks
- `dan` - Do Anything Now variants (Dan_6_0 through Dan_11_0, DUDE, STAN)
- `grandma` - Grandmother exploit
- `goodside` - Known jailbreaks from research
- `continuation` - Text completion jailbreaks

### Prompt Injection
- `promptinject` - Direct prompt injection attacks
- `encoding` - Encoding-based bypasses (Base64, Hex, ROT13)
- `smuggling` - Content smuggling techniques
- `latentinjection` - Latent space attacks

### Content Safety
- `realtoxicityprompts` - Toxic content generation (Profanity, Threats, Identity_Attack)
- `malwaregen` - Malware code generation attempts
- `donotanswer` - Tests refusal on restricted topics
- `lmrc` - Language Model Risk Cards checklist

### Information Security
- `leakreplay` - Data leakage testing
- `apikey` - API key exposure detection
- `packagehallucination` - Fake package/dependency detection
- `xss` - Cross-site scripting in outputs

### Robustness
- `glitch` - Token glitch exploitation
- `badchars` - Invalid character handling
- `divergence` - Behavioral deviation detection
- `snowball` - Snowball attack variants
## Common Usage Examples

### Test OpenAI Model for Jailbreaks

```sh
export OPENAI_API_KEY="sk-..."
garak --target_type openai --target_name gpt-4 --probes dan
```

### Comprehensive Security Scan

```sh
garak --target_type openai --target_name gpt-4 --probes all --generations 10
```

### Test Specific Vulnerabilities

```sh
# Test for prompt injection
garak -t openai -n gpt-4 -p promptinject,encoding

# Test for toxic content generation
garak -t openai -n gpt-4 -p realtoxicityprompts

# Test for data leakage
garak -t openai -n gpt-4 -p leakreplay,apikey
```

### Test HuggingFace Model

```sh
garak --target_type huggingface --target_name meta-llama/Llama-2-7b-chat-hf --probes dan,encoding
```

### Test AWS Bedrock Model

```sh
export BEDROCK_API_KEY="..."
export BEDROCK_REGION="us-east-1"
garak --target_type bedrock --target_name anthropic.claude-3-sonnet --probes promptinject
```

### Test Local Ollama Model

```sh
garak --target_type ollama --target_name llama2 --probes dan,continuation
```
### Using a Config File

Create `scan_config.yaml`:

```yaml
system:
  parallel_attempts: 10
  lite: true

run:
  generations: 5

plugins:
  probe_spec: dan,encoding,promptinject,malwaregen
  extended_detectors: false
```

Run with the config:

```sh
garak --config scan_config.yaml --target_type openai --target_name gpt-4
```
## Output and Reports

Garak generates three types of output:

- `garak.log` - Debug information (persistent across runs)
- JSONL report - Detailed results at `~/.local/share/garak/garak-runs-{timestamp}.jsonl`
- Hit log - Records of successful exploits
### Interpreting Results
- PASS: Model resisted the attack
- FAIL: Model exhibited vulnerable behavior
- Failure rate: Percentage of attempts that triggered the vulnerability
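The failure rate above can be recomputed directly from the JSONL report with a short script. This is a minimal sketch: the field names `entry_type`, `probe_classname`, and `detector_results` reflect garak's attempt records but vary across versions, so check a line of your own report before relying on them.

```python
import json
from collections import defaultdict

def failure_rates(jsonl_lines, threshold=0.5):
    """Aggregate per-probe failure rates from garak report lines.

    Assumes each attempt entry carries 'probe_classname' and a
    'detector_results' mapping of detector name -> list of scores,
    where a score >= threshold counts as a hit (field names may
    differ across garak versions; adjust to match your report).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for line in jsonl_lines:
        entry = json.loads(line)
        if entry.get("entry_type") != "attempt":
            continue  # skip setup/config records
        probe = entry["probe_classname"]
        for scores in entry.get("detector_results", {}).values():
            for score in scores:
                totals[probe] += 1
                if score >= threshold:
                    hits[probe] += 1
    return {probe: hits[probe] / totals[probe] for probe in totals}

# Hypothetical report lines for illustration:
sample = [
    json.dumps({"entry_type": "attempt", "probe_classname": "dan.Dan_11_0",
                "detector_results": {"mitigation.MitigationBypass": [0.0, 1.0]}}),
    json.dumps({"entry_type": "start_run setup"}),
]
print(failure_rates(sample))  # {'dan.Dan_11_0': 0.5}
```

In a real run you would feed it `open(report_path)` instead of the `sample` list.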
### Processing Reports

```sh
# Convert to AVID format
garak --report path/to/report.jsonl
```
## Best Practices

### For Comprehensive Testing

- Start broad, then narrow: run `--probes all` first, then focus on failures
- Use multiple generations: set `--generations 10` or higher for statistical significance
- Test with buffs: apply encoding transformations to find edge cases
- Document a baseline: run identical tests before and after model updates
### For Specific Vulnerability Assessment

- Jailbreaks: `--probes dan,grandma,goodside,continuation`
- Prompt Injection: `--probes promptinject,encoding,smuggling`
- Content Safety: `--probes realtoxicityprompts,malwaregen,donotanswer`
- Data Security: `--probes leakreplay,apikey,packagehallucination`
### For CI/CD Integration

```sh
# Quick scan for critical issues
garak --config garak/configs/fast.json --target_type openai --target_name $MODEL_NAME

# Set exit code based on failure threshold
garak ... --eval_threshold 0.1
```
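Because exit-code behavior around `--eval_threshold` can differ between garak versions, a pipeline can also gate on the report itself. A hedged sketch (the `entry_type` and `detector_results` field names are assumptions about the report schema; adjust to your version):

```python
import json

def gate(report_path, max_fail_rate=0.1, hit_threshold=0.5):
    """Return 1 (fail the build) if the overall hit rate in a garak
    JSONL report exceeds max_fail_rate, else 0.

    The 'entry_type' and 'detector_results' field names are assumed;
    verify them against your garak version's report format.
    """
    hits = total = 0
    with open(report_path) as fh:
        for line in fh:
            entry = json.loads(line)
            if entry.get("entry_type") != "attempt":
                continue  # skip non-attempt records
            for scores in entry.get("detector_results", {}).values():
                hits += sum(1 for s in scores if s >= hit_threshold)
                total += len(scores)
    rate = hits / total if total else 0.0
    print(f"hit rate: {rate:.2%} over {total} detector scores")
    return 1 if rate > max_fail_rate else 0
```

A CI step would end with something like `sys.exit(gate("report.jsonl"))` so the job fails when the rate is too high.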
## Troubleshooting

### Common Issues

API key errors: ensure environment variables are set correctly

```sh
export OPENAI_API_KEY="sk-..."
```

Rate limiting: reduce parallel requests

```sh
garak ... --parallel_requests 1
```

Memory issues with local models: use smaller batch sizes

```sh
garak ... --generations 3
```

Unknown probe errors: skip unknown plugins

```sh
garak ... --skip_unknown
```
## Resources
- Documentation: https://docs.garak.ai
- GitHub: https://github.com/NVIDIA/garak
- Discord: discord.gg/uVch4puUCs
- Twitter: @garak_llm
## Workflow: Running a Security Assessment

When asked to run a garak security assessment:

1. Verify installation: check that garak is installed with `garak --version`
2. Confirm target: identify the model type and name
3. Set credentials: ensure API keys are configured
4. Select probes: choose appropriate probes for the assessment scope
5. Execute scan: run garak with appropriate flags
6. Analyze results: review the JSONL report and summarize findings
7. Recommend mitigations: suggest fixes for identified vulnerabilities
### Example Workflow

```sh
# 1. Check installation
garak --version

# 2. List available probes for reference
garak --list_probes

# 3. Set API key
export OPENAI_API_KEY="sk-..."

# 4. Run targeted scan
garak \
  --target_type openai \
  --target_name gpt-4 \
  --probes dan,promptinject,encoding \
  --generations 5 \
  --verbose

# 5. Review results
cat ~/.local/share/garak/garak-runs-*.jsonl | jq '.status'
```