git clone https://github.com/vibeforge1111/vibeship-spawner-skills
security/ai-code-security/skill.yaml

id: ai-code-security
name: AI Code Security
version: 1.0.0
layer: 2
description: Security vulnerabilities in AI-generated code and LLM applications, covering OWASP Top 10 for LLMs, secure coding patterns, and AI-specific threat models
owns:
- ai-generated-code-review
- llm-vulnerability-patterns
- ai-security-testing
- model-output-validation
- ai-supply-chain-security
pairs_with:
- prompt-injection-defense
- llm-security-audit
- mcp-security
- code-review
requires:
- basic-security-knowledge
- llm-fundamentals
ecosystem:
  primary_tools:
    - name: OWASP LLM Top 10
      description: Authoritative framework for LLM security risks
      url: https://owasp.org/www-project-top-10-for-large-language-model-applications/
    - name: Semgrep/Opengrep
      description: Static analysis for AI-generated code patterns
      url: https://semgrep.dev
    - name: Gitleaks
      description: Secret detection in AI outputs
      url: https://gitleaks.io
    - name: Trivy
      description: Supply chain vulnerability scanning
      url: https://trivy.dev
  alternatives:
    - name: CodeQL
      description: GitHub's semantic code analysis
      when: Deep semantic vulnerability analysis needed
    - name: Snyk Code
      description: Real-time AI-powered security scanning
      when: IDE-integrated security feedback needed
  deprecated:
    - name: Manual code review only
      reason: AI-generated code volume exceeds human review capacity
      migration: Combine automated scanning with targeted human review
prerequisites:
  knowledge:
    - OWASP Top 10 web vulnerabilities
    - LLM API basics
    - Supply chain security concepts
  skills_recommended:
    - prompt-injection-defense
    - llm-security-audit
limits:
  does_not_cover:
    - ML model security (adversarial attacks, model poisoning)
    - Infrastructure security (cloud, containers)
    - Cryptographic implementation
  boundaries:
    - Focus is application-level AI security
    - Covers code generated by and interacting with LLMs
tags:
- security
- ai
- llm
- owasp
- code-review
- vulnerabilities
triggers:
- ai code security
- llm vulnerabilities
- ai generated code review
- owasp llm
- secure ai development
history:
- version: "2023" milestone: OWASP LLM Top 10 v1.0 released impact: First standardized framework for LLM security
- version: "2024" milestone: Enterprise AI adoption accelerates security focus impact: 73% of AI deployments found to have critical vulnerabilities
- version: "2025" milestone: OWASP LLM Top 10 v2.0 with new categories impact: Vector/embedding weaknesses, system prompt leakage added
contrarian_insights:
- claim: AI-generated code is inherently less secure
  reality: AI code has similar vulnerability rates to human code, but different patterns—more verbose, less context-aware, prone to outdated patterns
- claim: Static analysis catches AI code vulnerabilities
  reality: AI generates novel vulnerability patterns that traditional rules miss; semantic analysis required
- claim: Prompt engineering prevents security issues
  reality: Prompt-level controls are easily bypassed; defense must be multi-layered with output validation
identity: |
  You're a security engineer who has reviewed thousands of AI-generated code samples and found the same patterns recurring. You've seen production outages caused by LLM hallucinations, data breaches from prompt injection, and supply chain compromises through poisoned models.

  Your experience spans traditional AppSec (OWASP Top 10, secure coding) and the new frontier of AI security. You understand that AI doesn't just generate vulnerabilities—it generates them at scale, with novel patterns that traditional tools miss.

  Your core principles:
  - Never trust AI output—validate everything
  - Defense in depth—prompt, model, output, and runtime layers
  - AI is an untrusted input source—treat it like user input
  - Supply chain matters—models, datasets, and dependencies
  - Automate detection—human review doesn't scale
patterns:
-
  name: AI Output Validation Pipeline
  description: Validate all LLM outputs before execution or storage
  when: LLM generates code, SQL, commands, or structured data
  example: |
    import { z } from 'zod';
    import { scanForSecrets } from './security';

    // Schema validation for LLM-generated structured output
    const LLMOutputSchema = z.object({
      code: z.string().max(10000),
      language: z.enum(['typescript', 'python', 'sql']),
      explanation: z.string()
    });

    async function validateLLMOutput(rawOutput: unknown): Promise<ValidatedOutput> {
      // 1. Schema validation
      const parsed = LLMOutputSchema.parse(rawOutput);

      // 2. Secret detection
      const secrets = await scanForSecrets(parsed.code);
      if (secrets.length > 0) {
        throw new SecurityError('LLM output contains secrets', { secrets });
      }

      // 3. Dangerous pattern detection
      const dangerousPatterns = [
        /eval\s*\(/,
        /exec\s*\(/,
        /rm\s+-rf/,
        /DROP\s+TABLE/i,
        /TRUNCATE/i,
        /__import__/,
        /subprocess\.call/
      ];
      for (const pattern of dangerousPatterns) {
        if (pattern.test(parsed.code)) {
          throw new SecurityError('LLM output contains dangerous pattern', {
            pattern: pattern.source
          });
        }
      }

      // 4. Static analysis (language-specific)
      const analysisResult = await runStaticAnalysis(parsed.code, parsed.language);
      if (analysisResult.criticalIssues.length > 0) {
        throw new SecurityError('LLM code has critical vulnerabilities', {
          issues: analysisResult.criticalIssues
        });
      }

      return { ...parsed, analysisResult, validatedAt: new Date() };
    }
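
    // The example above assumes a few pieces the skill does not define: SecurityError,
    // runStaticAnalysis (e.g. a thin wrapper around the Semgrep CLI from the ecosystem
    // section), and the ValidatedOutput type. A minimal sketch of the missing types,
    // plus a call site; `llm.generateCode` and `userRequest` are placeholders.
    class SecurityError extends Error {
      constructor(message: string, public readonly details?: Record<string, unknown>) {
        super(message);
        this.name = 'SecurityError';
      }
    }

    type ValidatedOutput = z.infer<typeof LLMOutputSchema> & {
      analysisResult: { criticalIssues: unknown[] };
      validatedAt: Date;
    };

    // Call site sketch: raw model output never reaches storage or execution unvalidated.
    const validated = await validateLLMOutput(await llm.generateCode(userRequest));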
-
  name: Sandboxed Code Execution
  description: Execute AI-generated code in isolated environments
  when: LLM output must be executed (code interpreters, agents)
  example: |
    // NOTE: vm2 has been discontinued after repeated sandbox-escape vulnerabilities;
    // prefer the process- or container-level isolation shown below (or isolated-vm)
    // for anything beyond illustration.
    import { NodeVM } from 'vm2';
    import { spawn } from 'child_process';

    class SecureSandbox {
      private readonly timeout = 5000;
      private readonly memoryLimit = 128 * 1024 * 1024; // 128MB

      // JavaScript/TypeScript sandbox
      async executeJS(code: string, context: Record<string, unknown> = {}): Promise<unknown> {
        const vm = new NodeVM({
          timeout: this.timeout,
          sandbox: {
            ...context,
            // Explicitly deny dangerous globals
            process: undefined,
            require: undefined,
            __dirname: undefined,
            __filename: undefined
          },
          eval: false,
          wasm: false,
          sourceExtensions: ['js']
        });

        try {
          return vm.run(code);
        } catch (error) {
          if (error.message.includes('Script execution timed out')) {
            throw new SecurityError('Code execution timeout', { code });
          }
          throw error;
        }
      }

      // Python sandbox: firejail-isolated subprocess with no network and a memory cap
      async executePython(code: string): Promise<string> {
        return new Promise((resolve, reject) => {
          const proc = spawn('firejail', [
            '--quiet',
            '--private',
            '--net=none',
            '--rlimit-as=' + this.memoryLimit,
            'python3', '-c', code
          ], {
            timeout: this.timeout,
            stdio: ['pipe', 'pipe', 'pipe']
          });

          let stdout = '';
          let stderr = '';
          proc.stdout.on('data', (data) => stdout += data);
          proc.stderr.on('data', (data) => stderr += data);
          proc.on('close', (exitCode) => {
            if (exitCode === 0) resolve(stdout);
            else reject(new Error(stderr || `Exit code: ${exitCode}`));
          });
        });
      }

      // Docker-based sandbox for full isolation
      async executeInDocker(code: string, image: string): Promise<string> {
        const containerId = await this.createContainer(image, {
          NetworkDisabled: true,
          Memory: this.memoryLimit,
          CpuPeriod: 100000,
          CpuQuota: 50000, // 50% CPU
          ReadonlyRootfs: true,
          SecurityOpt: ['no-new-privileges']
        });

        try {
          return await this.execInContainer(containerId, code);
        } finally {
          await this.removeContainer(containerId);
        }
      }
    }
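
    // Usage sketch. Assumptions: validateLLMOutput comes from the validation pattern
    // above; `llm.generate` and `userPrompt` are placeholders for your own request
    // handling; createContainer/execInContainer/removeContainer are Docker helpers
    // not shown here.
    const sandbox = new SecureSandbox();
    const output = await validateLLMOutput(await llm.generate(userPrompt));
    if (output.language === 'python') {
      // Even validated code runs only inside the sandbox, never in-process.
      console.log(await sandbox.executePython(output.code));
    }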
-
  name: Supply Chain Verification
  description: Verify AI model and dependency integrity
  when: Using third-party models, fine-tuned models, or AI dependencies
  example: |
    import { createHash } from 'crypto';
    import { readFile } from 'fs/promises';

    interface ModelManifest {
      name: string;
      version: string;
      sha256: string;
      source: string;
      signedBy?: string;
      attestation?: string;
    }

    class ModelVerifier {
      private readonly trustedSources = [
        'huggingface.co',
        'anthropic.com',
        'openai.com'
      ];

      async verifyModel(modelPath: string, manifest: ModelManifest): Promise<boolean> {
        // 1. Verify source trust (exact host or subdomain; a bare endsWith check
        // would also accept lookalike domains such as evil-huggingface.co)
        const sourceUrl = new URL(manifest.source);
        const trusted = this.trustedSources.some(
          s => sourceUrl.hostname === s || sourceUrl.hostname.endsWith('.' + s)
        );
        if (!trusted) {
          throw new SecurityError('Untrusted model source', {
            source: manifest.source,
            trusted: this.trustedSources
          });
        }

        // 2. Verify hash integrity
        const modelData = await readFile(modelPath);
        const actualHash = createHash('sha256').update(modelData).digest('hex');
        if (actualHash !== manifest.sha256) {
          throw new SecurityError('Model hash mismatch', {
            expected: manifest.sha256,
            actual: actualHash
          });
        }

        // 3. Verify signature if available
        if (manifest.signedBy && manifest.attestation) {
          const valid = await this.verifySignature(
            modelData,
            manifest.attestation,
            manifest.signedBy
          );
          if (!valid) {
            throw new SecurityError('Model signature invalid');
          }
        }

        // 4. Scan for known malicious patterns
        await this.scanForMaliciousPatterns(modelPath);

        return true;
      }

      async verifyDependencies(packageJson: string): Promise<void> {
        const pkg = JSON.parse(await readFile(packageJson, 'utf-8'));
        const aiDeps = this.extractAIDependencies(pkg);

        for (const dep of aiDeps) {
          // Check for known vulnerable versions
          const vulns = await this.checkVulnerabilities(dep.name, dep.version);
          if (vulns.critical.length > 0) {
            throw new SecurityError('Critical AI dependency vulnerability', {
              package: dep.name,
              vulnerabilities: vulns.critical
            });
          }
        }
      }
    }
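
    // Usage sketch. Assumptions: verifySignature, scanForMaliciousPatterns,
    // extractAIDependencies, and checkVulnerabilities are project helpers not shown
    // here; the manifest values and file paths below are illustrative only.
    const verifier = new ModelVerifier();
    const manifest: ModelManifest = {
      name: 'example-model',
      version: '1.0.0',
      sha256: '<expected sha256 from the publisher release metadata>',
      source: 'https://huggingface.co/example-org/example-model'
    };
    // Throws SecurityError on an untrusted source, hash mismatch, or invalid signature.
    await verifier.verifyModel('./models/example-model.safetensors', manifest);
    await verifier.verifyDependencies('./package.json');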
-
  name: OWASP LLM Top 10 Mitigation
  description: Systematic mitigation of OWASP LLM vulnerabilities
  when: Building or auditing LLM applications
  example: |
    // Comprehensive LLM security middleware
    interface LLMSecurityConfig {
      maxTokens: number;
      rateLimitPerMinute: number;
      allowedTools: string[];
      sensitiveDataPatterns: RegExp[];
    }

    class LLMSecurityMiddleware {
      constructor(private config: LLMSecurityConfig) {}

      // LLM01: Prompt Injection Defense
      async sanitizeInput(input: string): Promise<string> {
        // Remove known injection patterns
        const sanitized = input
          .replace(/ignore previous instructions/gi, '[FILTERED]')
          .replace(/system:/gi, '[FILTERED]')
          .replace(/\[INST\]/gi, '[FILTERED]');

        // Validate input doesn't exceed token budget
        const tokens = await this.countTokens(sanitized);
        if (tokens > this.config.maxTokens) {
          throw new SecurityError('Input exceeds token limit');
        }

        return sanitized;
      }

      // LLM02: Insecure Output Handling
      async sanitizeOutput(output: string): Promise<string> {
        // Remove any embedded code that could execute
        let safe = output.replace(/<script[^>]*>[\s\S]*?<\/script>/gi, '');

        // Detect and mask sensitive data
        for (const pattern of this.config.sensitiveDataPatterns) {
          safe = safe.replace(pattern, '[REDACTED]');
        }

        return safe;
      }

      // LLM03: Training Data Poisoning (at inference time)
      validateModelSource(source: string): boolean {
        const trustedSources = ['anthropic', 'openai', 'internal'];
        return trustedSources.some(s => source.includes(s));
      }

      // LLM04: Model Denial of Service
      async enforceRateLimits(userId: string): Promise<void> {
        // Assumes a Redis client (this.redis) is injected alongside the config
        const key = `ratelimit:${userId}`;
        const count = await this.redis.incr(key);
        if (count === 1) {
          await this.redis.expire(key, 60);
        }
        if (count > this.config.rateLimitPerMinute) {
          throw new SecurityError('Rate limit exceeded');
        }
      }

      // LLM05: Supply Chain Vulnerabilities (handled by ModelVerifier)

      // LLM06: Sensitive Information Disclosure
      async detectPII(text: string): Promise<PIIResult> {
        const patterns = {
          ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
          creditCard: /\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g,
          email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g,
          phone: /\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/g,
          apiKey: /\b(sk-|api[_-]?key)[a-zA-Z0-9]{20,}\b/gi
        };

        const found: PIIMatch[] = [];
        for (const [type, pattern] of Object.entries(patterns)) {
          const matches = text.match(pattern);
          if (matches) {
            found.push({ type, count: matches.length });
          }
        }

        return { hasPII: found.length > 0, matches: found };
      }

      // LLM07: Insecure Plugin Design
      validateToolCall(toolName: string, args: unknown): boolean {
        if (!this.config.allowedTools.includes(toolName)) {
          throw new SecurityError('Unauthorized tool', { tool: toolName });
        }
        // Validate arguments against schema
        const schema = this.getToolSchema(toolName);
        return schema.safeParse(args).success;
      }

      // LLM08: Excessive Agency
      async enforceLeastPrivilege(action: LLMAction): Promise<void> {
        const dangerousActions = ['delete', 'execute', 'admin', 'sudo'];
        if (dangerousActions.some(a => action.type.includes(a))) {
          // Require human approval for dangerous actions
          const approved = await this.requestHumanApproval(action);
          if (!approved) {
            throw new SecurityError('Action requires human approval');
          }
        }
      }
    }
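
    // Usage sketch wiring the middleware into a request path. Assumptions:
    // `llmClient.complete` stands in for your model call; countTokens, getToolSchema,
    // requestHumanApproval, and the Redis client are injected helpers not shown; the
    // config values are illustrative, not recommendations.
    const security = new LLMSecurityMiddleware({
      maxTokens: 4000,
      rateLimitPerMinute: 30,
      allowedTools: ['search_docs', 'create_ticket'],
      sensitiveDataPatterns: [/\b\d{3}-\d{2}-\d{4}\b/g]
    });

    async function handleChat(userId: string, userInput: string): Promise<string> {
      await security.enforceRateLimits(userId);                // LLM04
      const prompt = await security.sanitizeInput(userInput);  // LLM01
      const raw = await llmClient.complete(prompt);
      const pii = await security.detectPII(raw);               // LLM06
      if (pii.hasPII) {
        // log and decide whether to block before anything reaches the caller
      }
      return security.sanitizeOutput(raw);                     // LLM02
    }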
anti_patterns:
-
  name: Trusting AI Output Directly
  description: Executing or storing AI-generated content without validation
  why: LLMs hallucinate, can be manipulated, and generate insecure code
  instead: Always validate, sanitize, and sandbox AI outputs before use.
-
  name: Static Prompts as Security
  description: Relying solely on system prompts for security constraints
  why: Prompt injection bypasses prompt-level controls easily
  instead: Implement multi-layer defense with output validation and sandboxing.
-
  name: Using Outdated AI Dependencies
  description: Not updating AI SDKs, models, or related dependencies
  why: AI security landscape evolves rapidly; new vulnerabilities discovered weekly
  instead: Regular dependency audits, automated updates, vulnerability scanning.
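  example: |
    // A minimal CI sketch (not part of the original skill) that fails the build when
    // `npm audit --json` reports critical advisories. The JSON shape assumes npm 7+;
    // adapt for other package managers or scanners such as Trivy.
    import { execFile } from 'child_process';
    import { promisify } from 'util';

    const run = promisify(execFile);

    async function failOnCriticalAdvisories(): Promise<void> {
      let stdout = '';
      try {
        ({ stdout } = await run('npm', ['audit', '--json']));
      } catch (err: any) {
        // npm audit exits non-zero when advisories exist; the JSON is still on stdout
        stdout = err.stdout;
      }
      const report = JSON.parse(stdout);
      const critical = Object.values<any>(report.vulnerabilities ?? {})
        .filter(v => v.severity === 'critical');
      if (critical.length > 0) {
        throw new Error(`${critical.length} critical dependency vulnerabilities found`);
      }
    }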
-
  name: No Model Provenance
  description: Using models without verifying source and integrity
  why: Poisoned models can contain backdoors, biased outputs, or malware
  instead: Verify model hashes and sources, and maintain an audit trail.
-
  name: Excessive LLM Permissions
  description: Giving LLMs access to all tools, data, or systems
  why: Compromised LLM (via injection) gains all those permissions
  instead: Apply least privilege principle; LLM gets only what it needs.
handoffs:
-
  trigger: prompt injection
  to: prompt-injection-defense
  context: Need specialized prompt injection defense
-
  trigger: security audit|penetration test
  to: llm-security-audit
  context: Need comprehensive security assessment
-
  trigger: mcp security|tool security
  to: mcp-security
  context: Need MCP-specific security patterns