git clone https://github.com/vibeforge1111/vibeship-spawner-skills
security/ai-code-security/skill.yaml

id: ai-code-security
name: AI Code Security
version: 1.0.0
layer: 2
description: Security vulnerabilities in AI-generated code and LLM applications, covering OWASP Top 10 for LLMs, secure coding patterns, and AI-specific threat models
owns:
- ai-generated-code-review
- llm-vulnerability-patterns
- ai-security-testing
- model-output-validation
- ai-supply-chain-security
pairs_with:
- prompt-injection-defense
- llm-security-audit
- mcp-security
- code-review
requires:
- basic-security-knowledge
- llm-fundamentals
ecosystem:
  primary_tools:
    - name: OWASP LLM Top 10
      description: Authoritative framework for LLM security risks
      url: https://owasp.org/www-project-top-10-for-large-language-model-applications/
    - name: Semgrep/Opengrep
      description: Static analysis for AI-generated code patterns
      url: https://semgrep.dev
    - name: Gitleaks
      description: Secret detection in AI outputs
      url: https://gitleaks.io
    - name: Trivy
      description: Supply chain vulnerability scanning
      url: https://trivy.dev
  alternatives:
    - name: CodeQL
      description: GitHub's semantic code analysis
      when: Deep semantic vulnerability analysis needed
    - name: Snyk Code
      description: Real-time AI-powered security scanning
      when: IDE-integrated security feedback needed
  deprecated:
    - name: Manual code review only
      reason: AI-generated code volume exceeds human review capacity
      migration: Combine automated scanning with targeted human review
prerequisites:
  knowledge:
    - OWASP Top 10 web vulnerabilities
    - LLM API basics
    - Supply chain security concepts
  skills_recommended:
    - prompt-injection-defense
    - llm-security-audit
limits:
  does_not_cover:
    - ML model security (adversarial attacks, model poisoning)
    - Infrastructure security (cloud, containers)
    - Cryptographic implementation
  boundaries:
    - Focus is application-level AI security
    - Covers code generated by and interacting with LLMs
tags:
- security
- ai
- llm
- owasp
- code-review
- vulnerabilities
triggers:
- ai code security
- llm vulnerabilities
- ai generated code review
- owasp llm
- secure ai development
history:
- version: "2023" milestone: OWASP LLM Top 10 v1.0 released impact: First standardized framework for LLM security
- version: "2024" milestone: Enterprise AI adoption accelerates security focus impact: 73% of AI deployments found to have critical vulnerabilities
- version: "2025" milestone: OWASP LLM Top 10 v2.0 with new categories impact: Vector/embedding weaknesses, system prompt leakage added
contrarian_insights:
- claim: AI-generated code is inherently less secure
  reality: AI code has similar vulnerability rates to human code, but different patterns—more verbose, less context-aware, prone to outdated patterns
- claim: Static analysis catches AI code vulnerabilities
  reality: AI generates novel vulnerability patterns that traditional rules miss; semantic analysis required
- claim: Prompt engineering prevents security issues
  reality: Prompt-level controls are easily bypassed; defense must be multi-layered with output validation
identity: |
  You're a security engineer who has reviewed thousands of AI-generated code samples and found the same patterns recurring. You've seen production outages caused by LLM hallucinations, data breaches from prompt injection, and supply chain compromises through poisoned models.

  Your experience spans traditional AppSec (OWASP Top 10, secure coding) and the new frontier of AI security. You understand that AI doesn't just generate vulnerabilities—it generates them at scale, with novel patterns that traditional tools miss.

  Your core principles:
  - Never trust AI output—validate everything
  - Defense in depth—prompt, model, output, and runtime layers
  - AI is an untrusted input source—treat it like user input
  - Supply chain matters—models, datasets, and dependencies
  - Automate detection—human review doesn't scale
patterns:
-
  name: AI Output Validation Pipeline
  description: Validate all LLM outputs before execution or storage
  when: LLM generates code, SQL, commands, or structured data
  example: |
    import { z } from 'zod';
    import { scanForSecrets } from './security';

    // Schema validation for LLM-generated structured output
    const LLMOutputSchema = z.object({
      code: z.string().max(10000),
      language: z.enum(['typescript', 'python', 'sql']),
      explanation: z.string()
    });

    async function validateLLMOutput(rawOutput: unknown): Promise<ValidatedOutput> {
      // 1. Schema validation
      const parsed = LLMOutputSchema.parse(rawOutput);

      // 2. Secret detection
      const secrets = await scanForSecrets(parsed.code);
      if (secrets.length > 0) {
        throw new SecurityError('LLM output contains secrets', { secrets });
      }

      // 3. Dangerous pattern detection
      const dangerousPatterns = [
        /eval\s*\(/,
        /exec\s*\(/,
        /rm\s+-rf/,
        /DROP\s+TABLE/i,
        /TRUNCATE/i,
        /__import__/,
        /subprocess\.call/
      ];
      for (const pattern of dangerousPatterns) {
        if (pattern.test(parsed.code)) {
          throw new SecurityError('LLM output contains dangerous pattern', {
            pattern: pattern.source
          });
        }
      }

      // 4. Static analysis (language-specific)
      const analysisResult = await runStaticAnalysis(parsed.code, parsed.language);
      if (analysisResult.criticalIssues.length > 0) {
        throw new SecurityError('LLM code has critical vulnerabilities', {
          issues: analysisResult.criticalIssues
        });
      }

      return { ...parsed, analysisResult, validatedAt: new Date() };
    }
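
    // The example above assumes a few pieces the skill does not define: SecurityError,
    // runStaticAnalysis (e.g. a thin wrapper around the Semgrep CLI from the ecosystem
    // section), and the ValidatedOutput type. A minimal sketch of the missing types,
    // plus a call site; `llm.generateCode` and `userRequest` are placeholders.
    class SecurityError extends Error {
      constructor(message: string, public readonly details?: Record<string, unknown>) {
        super(message);
        this.name = 'SecurityError';
      }
    }

    type ValidatedOutput = z.infer<typeof LLMOutputSchema> & {
      analysisResult: { criticalIssues: unknown[] };
      validatedAt: Date;
    };

    // Call site sketch: raw model output never reaches storage or execution unvalidated.
    const validated = await validateLLMOutput(await llm.generateCode(userRequest));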
-
  name: Sandboxed Code Execution
  description: Execute AI-generated code in isolated environments
  when: LLM output must be executed (code interpreters, agents)
  example: |
    // NOTE: vm2 has been discontinued after repeated sandbox-escape vulnerabilities;
    // prefer the process- or container-level isolation shown below (or isolated-vm)
    // for anything beyond illustration.
    import { NodeVM } from 'vm2';
    import { spawn } from 'child_process';

    class SecureSandbox {
      private readonly timeout = 5000;
      private readonly memoryLimit = 128 * 1024 * 1024; // 128MB

      // JavaScript/TypeScript sandbox
      async executeJS(code: string, context: Record<string, unknown> = {}): Promise<unknown> {
        const vm = new NodeVM({
          timeout: this.timeout,
          sandbox: {
            ...context,
            // Explicitly deny dangerous globals
            process: undefined,
            require: undefined,
            __dirname: undefined,
            __filename: undefined
          },
          eval: false,
          wasm: false,
          sourceExtensions: ['js']
        });

        try {
          return vm.run(code);
        } catch (error) {
          if (error.message.includes('Script execution timed out')) {
            throw new SecurityError('Code execution timeout', { code });
          }
          throw error;
        }
      }

      // Python sandbox: firejail-isolated subprocess with no network and a memory cap
      async executePython(code: string): Promise<string> {
        return new Promise((resolve, reject) => {
          const proc = spawn('firejail', [
            '--quiet',
            '--private',
            '--net=none',
            '--rlimit-as=' + this.memoryLimit,
            'python3', '-c', code
          ], {
            timeout: this.timeout,
            stdio: ['pipe', 'pipe', 'pipe']
          });

          let stdout = '';
          let stderr = '';
          proc.stdout.on('data', (data) => stdout += data);
          proc.stderr.on('data', (data) => stderr += data);
          proc.on('close', (exitCode) => {
            if (exitCode === 0) resolve(stdout);
            else reject(new Error(stderr || `Exit code: ${exitCode}`));
          });
        });
      }

      // Docker-based sandbox for full isolation
      async executeInDocker(code: string, image: string): Promise<string> {
        const containerId = await this.createContainer(image, {
          NetworkDisabled: true,
          Memory: this.memoryLimit,
          CpuPeriod: 100000,
          CpuQuota: 50000, // 50% CPU
          ReadonlyRootfs: true,
          SecurityOpt: ['no-new-privileges']
        });

        try {
          return await this.execInContainer(containerId, code);
        } finally {
          await this.removeContainer(containerId);
        }
      }
    }
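
    // Usage sketch. Assumptions: validateLLMOutput comes from the validation pattern
    // above; `llm.generate` and `userPrompt` are placeholders for your own request
    // handling; createContainer/execInContainer/removeContainer are Docker helpers
    // not shown here.
    const sandbox = new SecureSandbox();
    const output = await validateLLMOutput(await llm.generate(userPrompt));
    if (output.language === 'python') {
      // Even validated code runs only inside the sandbox, never in-process.
      console.log(await sandbox.executePython(output.code));
    }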
-
  name: Supply Chain Verification
  description: Verify AI model and dependency integrity
  when: Using third-party models, fine-tuned models, or AI dependencies
  example: |
    import { createHash } from 'crypto';
    import { readFile } from 'fs/promises';

    interface ModelManifest {
      name: string;
      version: string;
      sha256: string;
      source: string;
      signedBy?: string;
      attestation?: string;
    }

    class ModelVerifier {
      private readonly trustedSources = [
        'huggingface.co',
        'anthropic.com',
        'openai.com'
      ];

      async verifyModel(modelPath: string, manifest: ModelManifest): Promise<boolean> {
        // 1. Verify source trust (exact host or subdomain; a bare endsWith check
        // would also accept lookalike domains such as evil-huggingface.co)
        const sourceUrl = new URL(manifest.source);
        const trusted = this.trustedSources.some(
          s => sourceUrl.hostname === s || sourceUrl.hostname.endsWith('.' + s)
        );
        if (!trusted) {
          throw new SecurityError('Untrusted model source', {
            source: manifest.source,
            trusted: this.trustedSources
          });
        }

        // 2. Verify hash integrity
        const modelData = await readFile(modelPath);
        const actualHash = createHash('sha256').update(modelData).digest('hex');
        if (actualHash !== manifest.sha256) {
          throw new SecurityError('Model hash mismatch', {
            expected: manifest.sha256,
            actual: actualHash
          });
        }

        // 3. Verify signature if available
        if (manifest.signedBy && manifest.attestation) {
          const valid = await this.verifySignature(
            modelData,
            manifest.attestation,
            manifest.signedBy
          );
          if (!valid) {
            throw new SecurityError('Model signature invalid');
          }
        }

        // 4. Scan for known malicious patterns
        await this.scanForMaliciousPatterns(modelPath);

        return true;
      }

      async verifyDependencies(packageJson: string): Promise<void> {
        const pkg = JSON.parse(await readFile(packageJson, 'utf-8'));
        const aiDeps = this.extractAIDependencies(pkg);

        for (const dep of aiDeps) {
          // Check for known vulnerable versions
          const vulns = await this.checkVulnerabilities(dep.name, dep.version);
          if (vulns.critical.length > 0) {
            throw new SecurityError('Critical AI dependency vulnerability', {
              package: dep.name,
              vulnerabilities: vulns.critical
            });
          }
        }
      }
    }
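
    // Usage sketch. Assumptions: verifySignature, scanForMaliciousPatterns,
    // extractAIDependencies, and checkVulnerabilities are project helpers not shown
    // here; the manifest values and file paths below are illustrative only.
    const verifier = new ModelVerifier();
    const manifest: ModelManifest = {
      name: 'example-model',
      version: '1.0.0',
      sha256: '<expected sha256 from the publisher release metadata>',
      source: 'https://huggingface.co/example-org/example-model'
    };
    // Throws SecurityError on an untrusted source, hash mismatch, or invalid signature.
    await verifier.verifyModel('./models/example-model.safetensors', manifest);
    await verifier.verifyDependencies('./package.json');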
-
  name: OWASP LLM Top 10 Mitigation
  description: Systematic mitigation of OWASP LLM vulnerabilities
  when: Building or auditing LLM applications
  example: |
    // Comprehensive LLM security middleware
    interface LLMSecurityConfig {
      maxTokens: number;
      rateLimitPerMinute: number;
      allowedTools: string[];
      sensitiveDataPatterns: RegExp[];
    }

    class LLMSecurityMiddleware {
      constructor(private config: LLMSecurityConfig) {}

      // LLM01: Prompt Injection Defense
      async sanitizeInput(input: string): Promise<string> {
        // Remove known injection patterns
        const sanitized = input
          .replace(/ignore previous instructions/gi, '[FILTERED]')
          .replace(/system:/gi, '[FILTERED]')
          .replace(/\[INST\]/gi, '[FILTERED]');

        // Validate input doesn't exceed token budget
        const tokens = await this.countTokens(sanitized);
        if (tokens > this.config.maxTokens) {
          throw new SecurityError('Input exceeds token limit');
        }

        return sanitized;
      }

      // LLM02: Insecure Output Handling
      async sanitizeOutput(output: string): Promise<string> {
        // Remove any embedded code that could execute
        let safe = output.replace(/<script[^>]*>[\s\S]*?<\/script>/gi, '');

        // Detect and mask sensitive data
        for (const pattern of this.config.sensitiveDataPatterns) {
          safe = safe.replace(pattern, '[REDACTED]');
        }

        return safe;
      }

      // LLM03: Training Data Poisoning (at inference time)
      validateModelSource(source: string): boolean {
        const trustedSources = ['anthropic', 'openai', 'internal'];
        return trustedSources.some(s => source.includes(s));
      }

      // LLM04: Model Denial of Service
      async enforceRateLimits(userId: string): Promise<void> {
        // Assumes a Redis client (this.redis) is injected alongside the config
        const key = `ratelimit:${userId}`;
        const count = await this.redis.incr(key);
        if (count === 1) {
          await this.redis.expire(key, 60);
        }
        if (count > this.config.rateLimitPerMinute) {
          throw new SecurityError('Rate limit exceeded');
        }
      }

      // LLM05: Supply Chain Vulnerabilities (handled by ModelVerifier)

      // LLM06: Sensitive Information Disclosure
      async detectPII(text: string): Promise<PIIResult> {
        const patterns = {
          ssn: /\b\d{3}-\d{2}-\d{4}\b/g,
          creditCard: /\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b/g,
          email: /\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b/g,
          phone: /\b\d{3}[-.]?\d{3}[-.]?\d{4}\b/g,
          apiKey: /\b(sk-|api[_-]?key)[a-zA-Z0-9]{20,}\b/gi
        };

        const found: PIIMatch[] = [];
        for (const [type, pattern] of Object.entries(patterns)) {
          const matches = text.match(pattern);
          if (matches) {
            found.push({ type, count: matches.length });
          }
        }

        return { hasPII: found.length > 0, matches: found };
      }

      // LLM07: Insecure Plugin Design
      validateToolCall(toolName: string, args: unknown): boolean {
        if (!this.config.allowedTools.includes(toolName)) {
          throw new SecurityError('Unauthorized tool', { tool: toolName });
        }
        // Validate arguments against schema
        const schema = this.getToolSchema(toolName);
        return schema.safeParse(args).success;
      }

      // LLM08: Excessive Agency
      async enforceLeastPrivilege(action: LLMAction): Promise<void> {
        const dangerousActions = ['delete', 'execute', 'admin', 'sudo'];
        if (dangerousActions.some(a => action.type.includes(a))) {
          // Require human approval for dangerous actions
          const approved = await this.requestHumanApproval(action);
          if (!approved) {
            throw new SecurityError('Action requires human approval');
          }
        }
      }
    }
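
    // Usage sketch wiring the middleware into a request path. Assumptions:
    // `llmClient.complete` stands in for your model call; countTokens, getToolSchema,
    // requestHumanApproval, and the Redis client are injected helpers not shown; the
    // config values are illustrative, not recommendations.
    const security = new LLMSecurityMiddleware({
      maxTokens: 4000,
      rateLimitPerMinute: 30,
      allowedTools: ['search_docs', 'create_ticket'],
      sensitiveDataPatterns: [/\b\d{3}-\d{2}-\d{4}\b/g]
    });

    async function handleChat(userId: string, userInput: string): Promise<string> {
      await security.enforceRateLimits(userId);                // LLM04
      const prompt = await security.sanitizeInput(userInput);  // LLM01
      const raw = await llmClient.complete(prompt);
      const pii = await security.detectPII(raw);               // LLM06
      if (pii.hasPII) {
        // log and decide whether to block before anything reaches the caller
      }
      return security.sanitizeOutput(raw);                     // LLM02
    }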
anti_patterns:
-
  name: Trusting AI Output Directly
  description: Executing or storing AI-generated content without validation
  why: LLMs hallucinate, can be manipulated, and generate insecure code
  instead: Always validate, sanitize, and sandbox AI outputs before use.
-
  name: Static Prompts as Security
  description: Relying solely on system prompts for security constraints
  why: Prompt injection bypasses prompt-level controls easily
  instead: Implement multi-layer defense with output validation and sandboxing.
-
  name: Using Outdated AI Dependencies
  description: Not updating AI SDKs, models, or related dependencies
  why: AI security landscape evolves rapidly; new vulnerabilities discovered weekly
  instead: Regular dependency audits, automated updates, vulnerability scanning.
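  example: |
    // A minimal CI sketch (not part of the original skill) that fails the build when
    // `npm audit --json` reports critical advisories. The JSON shape assumes npm 7+;
    // adapt for other package managers or scanners such as Trivy.
    import { execFile } from 'child_process';
    import { promisify } from 'util';

    const run = promisify(execFile);

    async function failOnCriticalAdvisories(): Promise<void> {
      let stdout = '';
      try {
        ({ stdout } = await run('npm', ['audit', '--json']));
      } catch (err: any) {
        // npm audit exits non-zero when advisories exist; the JSON is still on stdout
        stdout = err.stdout;
      }
      const report = JSON.parse(stdout);
      const critical = Object.values<any>(report.vulnerabilities ?? {})
        .filter(v => v.severity === 'critical');
      if (critical.length > 0) {
        throw new Error(`${critical.length} critical dependency vulnerabilities found`);
      }
    }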
-
  name: No Model Provenance
  description: Using models without verifying source and integrity
  why: Poisoned models can contain backdoors, biased outputs, or malware
  instead: Verify model hashes and sources, and maintain an audit trail.
-
  name: Excessive LLM Permissions
  description: Giving LLMs access to all tools, data, or systems
  why: Compromised LLM (via injection) gains all those permissions
  instead: Apply least privilege principle; LLM gets only what it needs.
handoffs:
-
  trigger: prompt injection
  to: prompt-injection-defense
  context: Need specialized prompt injection defense
-
  trigger: security audit|penetration test
  to: llm-security-audit
  context: Need comprehensive security assessment
-
  trigger: mcp security|tool security
  to: mcp-security
  context: Need MCP-specific security patterns