git clone https://github.com/ComeOnOliver/skillshub
T=$(mktemp -d) && git clone --depth=1 https://github.com/ComeOnOliver/skillshub "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/TerminalSkills/skills/pentagi" ~/.claude/skills/comeonoliver-skillshub-pentagi && rm -rf "$T"
skills/TerminalSkills/skills/pentagi/SKILL.mdPentAGI
Overview
PentAGI is a fully autonomous AI agent system for penetration testing. It deploys a multi-agent architecture where specialized AI agents (research, development, infrastructure) collaborate to plan, execute, and report security assessments. All operations run in sandboxed Docker containers with 20+ professional security tools (nmap, metasploit, sqlmap, nikto, gobuster, etc.). Features a knowledge graph (Neo4j + Graphiti) for persistent learning across engagements, web intelligence via built-in browser, and comprehensive monitoring with Grafana/Langfuse. Self-hosted — your data stays on your infrastructure.
Instructions
Step 1: Quick Deployment
# Clone the repository git clone https://github.com/vxcontrol/pentagi.git cd pentagi # Copy and configure environment cp .env.example .env
# .env — Essential configuration # LLM Provider (choose one) OPENAI_API_KEY=sk-... # OpenAI # ANTHROPIC_API_KEY=sk-ant-... # Anthropic # OLLAMA_SERVER_URL=http://host:11434 # Local Ollama # Primary model for the main agent LLM_MODEL=gpt-4o # or claude-3-5-sonnet, llama3.1 LLM_PROVIDER=openai # openai, anthropic, ollama, bedrock, gemini, deepseek # Search provider for web intelligence TAVILY_API_KEY=tvly-... # Tavily (recommended) # GOOGLE_SEARCH_API_KEY=... # or Google Custom Search # SEARXNG_URL=http://localhost:8080 # or self-hosted SearXNG # Security — change these in production POSTGRES_PASSWORD=your-secure-password SECRET_KEY=your-secret-key-min-32-chars
# Deploy the full stack docker compose up -d # Access the web UI open http://localhost:3000
The stack deploys: React frontend, Go backend (GraphQL API), PostgreSQL with pgvector, Neo4j knowledge graph, security tools container, web scraper, and monitoring (Grafana + Langfuse).
Step 2: Configure AI Agents
PentAGI uses a team of specialized agents that collaborate on the assessment.
# Agent architecture (configured via UI or API) # # Primary Agent (Orchestrator) # ├── Researches target, plans attack phases # ├── Delegates to specialists: # │ ├── Research Agent — OSINT, web scraping, CVE lookup # │ ├── Development Agent — exploit modification, payload crafting # │ └── Infrastructure Agent — container management, tool setup # ├── Executes security tools in sandboxed containers # └── Generates vulnerability reports # # Each agent has access to: # - 20+ security tools (nmap, metasploit, sqlmap, nikto, etc.) # - Web browser for research # - Knowledge graph for persistent memory # - Previous engagement learnings
Step 3: Start a Penetration Test via Web UI
1. Open http://localhost:3000 2. Create a new engagement: - Target: IP address, domain, or CIDR range - Scope: Which services/ports to test - Rules of engagement: What's allowed (e.g., no DoS, no data exfiltration) - Objective: "Full security assessment" or specific focus 3. The AI agent: - Performs reconnaissance (nmap, whois, DNS enumeration) - Identifies services and versions - Searches for known vulnerabilities (CVE databases) - Attempts exploitation with appropriate tools - Documents findings with evidence - Generates a vulnerability report
Step 4: GraphQL API Integration
// Integrate PentAGI into your security pipeline via GraphQL const PENTAGI_URL = 'http://localhost:3000/graphql' // Create a new engagement const createEngagement = await fetch(PENTAGI_URL, { method: 'POST', headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${API_TOKEN}`, }, body: JSON.stringify({ query: ` mutation CreateTask($input: CreateTaskInput!) { createTask(input: $input) { id status createdAt } } `, variables: { input: { target: '192.168.1.0/24', objective: 'Perform a comprehensive security assessment of the internal network segment. Focus on identifying exposed services, default credentials, unpatched vulnerabilities, and potential lateral movement paths.', scope: ['port-scan', 'service-enum', 'vuln-scan', 'web-app-test'], constraints: ['no-dos', 'no-data-exfil', 'business-hours-only'], }, }, }), }) // Monitor progress const checkStatus = await fetch(PENTAGI_URL, { method: 'POST', headers: { 'Content-Type': 'application/json', 'Authorization': `Bearer ${API_TOKEN}` }, body: JSON.stringify({ query: ` query TaskStatus($id: ID!) { task(id: $id) { id status progress currentPhase findings { severity title description evidence remediation } logs { timestamp agent action output } } } `, variables: { id: engagement.id }, }), })
Step 5: Knowledge Graph — Persistent Learning
# PentAGI remembers findings across engagements via Neo4j + Graphiti # # After each engagement, the knowledge graph stores: # - Vulnerability patterns found per technology stack # - Successful exploitation techniques # - Network topology relationships # - Service fingerprints and their known weaknesses # # In future engagements, the agent queries this knowledge to: # - Prioritize attack vectors that worked before on similar targets # - Skip techniques known to fail on specific configurations # - Correlate findings across multiple assessments # - Identify systemic issues across the organization
Step 6: Monitoring and Reporting
# Grafana dashboards — real-time monitoring open http://localhost:3001 # Dashboards include: # - Active agent operations and tool execution # - Token usage and LLM cost tracking # - Container resource utilization # - Engagement timeline and progress # Langfuse — LLM observability open http://localhost:3002 # Track: # - Agent reasoning chains # - Prompt effectiveness # - Token usage per engagement phase # - Model performance comparison
# Export vulnerability report curl -H "Authorization: Bearer $API_TOKEN" \ "http://localhost:3000/api/v1/tasks/$TASK_ID/report" \ -o vulnerability-report.pdf # Report includes: # - Executive summary # - Detailed findings with CVSS scores # - Evidence (screenshots, command output) # - Remediation recommendations # - Risk matrix
Guidelines
- Always get written authorization before running PentAGI against any target. Unauthorized penetration testing is illegal.
- Deploy on an isolated network segment — PentAGI's sandboxed containers contain offensive tools.
- Use
to enforce rules of engagement — prevent DoS, data exfiltration, or out-of-scope testing.constraints - Start with
for local/private assessments — no data leaves your infrastructure.Ollama - The knowledge graph improves over time — run PentAGI consistently to build organizational security intelligence.
- Review agent actions in real-time via the web UI — autonomous doesn't mean unsupervised.
- PentAGI complements manual testing — use it for initial reconnaissance and known vulnerability scanning, then have humans investigate complex logic flaws.
- Resource requirements: 8GB+ RAM, 4+ CPU cores. GPU optional (only for local LLM via Ollama).
- Langfuse integration helps optimize LLM costs — track which models give best results per phase.