```bash
git clone https://github.com/Intense-Visions/harness-engineering
T=$(mktemp -d) && git clone --depth=1 https://github.com/Intense-Visions/harness-engineering "$T" && mkdir -p ~/.claude/skills && cp -r "$T/agents/skills/claude-code/harness-security-review" ~/.claude/skills/intense-visions-harness-engineering-harness-security-review-db0be2 && rm -rf "$T"
```
`agents/skills/claude-code/harness-security-review/SKILL.md`

# Harness Security Review
Deep security audit combining mechanical scanning with AI-powered vulnerability analysis. OWASP baseline + stack-adaptive rules + optional threat modeling.
## When to Use
- Before a release or security-sensitive merge
- After updating dependencies (supply chain risk)
- When auditing a new or unfamiliar codebase
- When `on_pr` triggers fire on security-sensitive paths
- NOT for quick pre-commit checks (use harness-pre-commit-review for that)
- NOT for general code review (use harness-code-review for that)
## Scope Adaptation
This skill adapts its behavior based on invocation context — standalone or as part of the code review pipeline.
### Detection
Check for `pipelineContext` in `.harness/handoff.json`. If present, run in changed-files mode. Otherwise, run in full mode.

```bash
# Check for pipeline context
cat .harness/handoff.json 2>/dev/null | grep -q '"pipelineContext"'
```
### Changed-Files Mode (Code Review Pipeline)
When invoked from the code review pipeline (Phase 4 fan-out, security slot):
- Phase 1 (SCAN): SKIPPED. The mechanical security scan already ran in code review Phase 2. Read the mechanical findings from `PipelineContext.findings` where `domain === 'security'` instead of re-running `run_security_scan`.
- Phase 2 (REVIEW): Run OWASP baseline + stack-adaptive analysis on changed files only, plus their direct imports (for data flow tracing). The changed file list is provided in the context bundle from the pipeline.
- Phase 3 (THREAT-MODEL): SKIPPED unless the `--deep` flag was passed through from code review.
- Phase 4 (REPORT): SKIPPED. Return findings as `ReviewFinding[]` to the pipeline. The pipeline handles output formatting (Phase 7).

Findings returned in this mode must use the `ReviewFinding` schema with populated security fields (`cweId`, `owaspCategory`, `confidence`, `remediation`, `references`).
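As a sketch only — the authoritative `ReviewFinding` schema lives in the pipeline, so the exact field types and the extra `file`/`line`/`severity` fields below are assumptions, with only the five named security fields grounded in this doc:

```typescript
// Assumed shape for illustration; verify against the pipeline's real schema.
interface ReviewFinding {
  file: string;                              // assumed field
  line: number;                              // assumed field
  severity: "error" | "warning" | "info";    // assumed field
  cweId: string;                             // e.g. "CWE-89"
  owaspCategory: string;                     // e.g. "A03:2021 Injection"
  confidence: "high" | "medium" | "low";
  remediation: string;
  references: string[];
}

const finding: ReviewFinding = {
  file: "src/db.ts",
  line: 45,
  severity: "error",
  cweId: "CWE-89",
  owaspCategory: "A03:2021 Injection",
  confidence: "high",
  remediation: "Use parameterized queries",
  references: ["https://owasp.org/Top10/A03_2021-Injection/"],
};
console.log(finding.cweId);
```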
### Full Mode (Standalone)
When invoked directly (no `PipelineContext`):
- All phases run as documented below (Phase 1 through Phase 4).
- Output is the standalone security report format.
- This is the existing behavior — no changes.
## Principle: Layered Security
This skill follows the Deterministic-vs-LLM Responsibility Split principle. The mechanical scanner runs first and catches what patterns can catch. The AI review then looks for semantic issues that patterns miss — user input flowing through multiple functions to a dangerous sink, missing authorization checks, logic flaws in authentication flows.
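A minimal sketch of such a multi-hop flow, with hypothetical function names (not a real API) chosen to mirror the kind of trace the AI review performs:

```typescript
// Illustrative only: getUserHandler, formatQuery, and execute are
// hypothetical names, not part of any real codebase or library.

// Stand-in sink that just echoes the final query string.
function execute(query: string): string {
  return query;
}

// Hop 2: the tainted value is interpolated far from the entry point.
function formatQuery(id: string): string {
  return `SELECT * FROM users WHERE id = '${id}'`; // dangerous sink
}

// Hop 1: user-controlled input enters the system here.
function getUserHandler(params: { id: string }): string {
  return execute(formatQuery(params.id));
}

// Neither function looks suspicious in isolation; only tracing `id`
// across both hops reveals the injection.
console.log(getUserHandler({ id: "1' OR '1'='1" }));
```

A single-line pattern scanner sees one function that builds a string and another that forwards a parameter; the vulnerability only exists in their composition.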
## Process

### Phase 1: SCAN — Mechanical Security Scanner (full mode only)
Note: This phase is skipped in changed-files mode. See Scope Adaptation above.
Run the built-in security scanner against the project.
1. Run the scanner. Use the CLI command:

   ```bash
   harness check-security
   ```

   For machine-readable output, add `--json`. For scanning only changed files, add `--changed-only`.
2. Review findings. Categorize by severity:
   - Error (blocking): Must fix before merge — secrets, injection, eval, weak crypto
   - Warning (review): Should fix — CORS wildcards, disabled TLS, path traversal patterns
   - Info (note): Consider — HTTP URLs, missing security headers
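One way to turn the error-severity gate into an automated check. The JSON shape below is an assumption for illustration — inspect the real `--json` output before relying on it:

```shell
# Hypothetical sample of scanner output; the real format may differ.
findings='[{"ruleId":"SEC-SEC-002","severity":"error"},{"ruleId":"SEC-NET-001","severity":"warning"}]'

# Count error-severity findings and fail the gate if any exist.
errors=$(printf '%s' "$findings" | grep -o '"severity":"error"' | wc -l | tr -d ' ')
if [ "$errors" -gt 0 ]; then
  echo "FAIL: $errors error-severity finding(s) — blocking"
else
  echo "PASS"
fi
```

In practice the `findings` variable would come from `harness check-security --json`, and a JSON-aware tool like `jq` would be sturdier than `grep`.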
3. Report mechanical findings. Present each finding with:
   - Rule ID and name
   - File, line number, matched code
   - Remediation guidance
   - CWE/OWASP reference
### Phase 2: REVIEW — AI-Powered Security Analysis
After mechanical scanning, perform deeper AI analysis.
#### OWASP Baseline (always runs)
Review the codebase against OWASP Top 10 and CWE Top 25:
- Injection (CWE-89, CWE-78, CWE-79): Look for user input flowing to SQL queries, shell commands, or HTML output without sanitization. Trace data flow across function boundaries — patterns only catch single-line issues.
- Broken Authentication (CWE-287): Check for weak session management, missing MFA enforcement, hardcoded credentials, predictable tokens.
- Sensitive Data Exposure (CWE-200): Look for PII logged to console/files, sensitive data in error messages, missing encryption for data at rest or in transit.
- Broken Access Control (CWE-862): Check for missing authorization on API endpoints, IDOR vulnerabilities, privilege escalation paths.
- Security Misconfiguration (CWE-16): Check for debug mode in production configs, default credentials, overly permissive CORS, missing security headers.
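To make the Broken Access Control item concrete, here is a minimal IDOR sketch with hypothetical names (not from any codebase under review): the handler is authenticated, but ownership is never checked.

```typescript
// Illustrative names only.
interface Session { userId: string }

const documents: Record<string, { ownerId: string; body: string }> = {
  "doc-1": { ownerId: "alice", body: "alice's notes" },
};

// BAD (CWE-862): authentication proves who the caller is, but any
// logged-in user can read any document by guessing its id.
function getDocumentInsecure(_session: Session, docId: string) {
  return documents[docId];
}

// FIXED: authorize against the resource owner before returning it.
function getDocumentSecure(session: Session, docId: string) {
  const doc = documents[docId];
  if (!doc || doc.ownerId !== session.userId) throw new Error("forbidden");
  return doc;
}
```

This is the pattern behind "authenticated so we don't need checks" rationalizations: authentication and authorization are separate controls.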
#### Stack-Adaptive Review (based on detected tech)
After the OWASP baseline, add stack-specific checks:
- Node.js: Prototype pollution via `Object.assign` or spread on user input, `__proto__` injection, unhandled promise rejections exposing stack traces
- Express: Missing helmet, rate limiting, CSRF protection, body parser limits
- React: XSS via `dangerouslySetInnerHTML`, sensitive data in client state, insecure `postMessage` listeners
- Go: Race conditions in concurrent handlers, `unsafe.Pointer` usage, format string injection
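The Node.js prototype-pollution item can be demonstrated with a minimal hand-rolled deep merge — a common helper, sketched here for illustration:

```typescript
// BAD: recursing into every key, including "__proto__", lets attacker
// JSON walk up to Object.prototype and plant properties there.
function unsafeMerge(target: any, source: any): any {
  for (const key of Object.keys(source)) {
    if (typeof source[key] === "object" && source[key] !== null) {
      target[key] = unsafeMerge(target[key] ?? {}, source[key]);
    } else {
      target[key] = source[key];
    }
  }
  return target;
}

// JSON.parse creates "__proto__" as an ordinary own key, so this
// payload survives parsing and reaches the merge intact.
const payload = JSON.parse('{"__proto__": {"isAdmin": true}}');
unsafeMerge({}, payload);

// Every object in the process now inherits the planted property.
console.log(({} as any).isAdmin); // true after the merge above
```

A safe variant skips `__proto__`, `constructor`, and `prototype` keys, or merges into a null-prototype target (`Object.create(null)`).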
#### Insecure Defaults Analysis
For each configuration variable that controls a security feature (auth, encryption, TLS, CORS, rate limiting), verify:
- Does the feature fail-closed (error/deny) when configuration is missing?
- Or does it fail-open (degrade to permissive/disabled)?
- Trace fallback chains: `config.x ?? env.Y ?? default` — is the final default secure?
Patterns the mechanical `SEC-DEF-*` rules cannot catch (focus here):

- Multi-line fallback chains where the insecure default is not adjacent to the security variable name
- Conditional logic that enables security features only in specific environments (e.g., `if (isProd) enableTLS()`)
- Error handlers that swallow failures in auth, session, or token validation code (multi-line `catch` blocks)
- Silent type coercions that convert truthy env vars to falsy values
Rationalizations to reject (adapted from Trail of Bits):
- "The default is only used in development" — production deployments inherit defaults when config is missing
- "The env var will always be set" — missing env vars are the #1 cause of fail-open in production
- "The catch block will be filled in later" — empty auth catch blocks ship to production
- "It's behind a feature flag" — feature flags can be inadvertently enabled or disabled
### Phase 3: THREAT-MODEL (optional, `--deep` flag; full mode or explicit `--deep` in pipeline)
When invoked with `--deep`, build a lightweight threat model:
1. Identify entry points. Find all HTTP routes, API endpoints, message handlers, CLI commands, and file upload handlers.
2. Map trust boundaries. Where does data cross from untrusted (user input, external APIs) to trusted (database queries, file system, internal services)?
3. Trace data flows. For each entry point, trace how user-controlled data flows through the system. Use the knowledge graph if available (`query_graph`, `get_relationships`).
4. Identify threat scenarios. For each trust boundary crossing, ask:
   - What if this input is malicious?
   - What is the worst-case impact?
   - What controls are in place?
5. Report threat model. Present as a table:
| Entry Point | Data Flow | Trust Boundary | Threats | Controls | Risk |
|---|---|---|---|---|---|
### Phase 4: REPORT — Consolidated Findings
Produce a unified security report:
```
Security Review: [PASS/WARN/FAIL]

Mechanical Scanner:
- Scanned: N files, M rules applied
- Coverage: baseline/enhanced
- Errors: N | Warnings: N | Info: N
[List each finding with rule ID, file:line, severity, and remediation]

AI Review:
- OWASP Baseline: [findings or "No issues found"]
- Stack-Adaptive ([detected stacks]): [findings or "No issues found"]

[If --deep] Threat Model:
- Entry points: N
- Trust boundaries: N
- High-risk flows: [list]
```
## Harness Integration
- `harness check-security` — Run the mechanical scanner via CLI. Use `--json` for machine-readable output.
- `harness validate` — Standard project health check
- `query_graph` / `get_relationships` — Used in the threat modeling phase for data flow tracing
- `get_impact` — Understand blast radius of security-sensitive changes
## Rationalizations to Reject
| Rationalization | Reality |
|---|---|
| "The scanner didn't flag it so it must be fine" | Mechanical scanners catch pattern-level issues. They cannot trace user input across multiple function calls to a dangerous sink, detect authorization logic flaws, or evaluate whether a fallback chain fails open. The AI review phase exists precisely because scanners miss semantic vulnerabilities. |
| "This endpoint is behind authentication so we don't need to validate input" | Authentication and input validation are orthogonal controls. Authenticated users can still send malicious payloads. Authenticated SQL injection, SSRF, and path traversal are well-documented attack patterns against internal-only endpoints. |
| "The vulnerability requires knowing our internal schema to exploit" | Security through obscurity is not a control. Internal schema details leak through error messages, API responses, documentation, and employee turnover. Rate the vulnerability based on its impact assuming the attacker knows the system. |
| "We'll add rate limiting and input validation later once the feature ships" | Security controls added after deployment require re-testing and re-review. Shipping without them creates an exposure window and establishes technical debt that is systematically deprioritized once the feature is live. |
| "That's an OWASP theoretical risk — our app isn't targeted by sophisticated attackers" | OWASP findings are exploited by automated scanners, not just sophisticated attackers. Opportunistic bots continuously probe for SQL injection, XSS, and auth bypass. Unpatched OWASP Top 10 issues are routinely exploited within hours of exposure. |
## Gates
- Mechanical scanner must run before AI review. The scanner catches what patterns can catch; AI reviews what remains.
- Error-severity findings are blocking. The report must be FAIL if any error-severity finding exists.
- AI review must reference specific code. No vague warnings like "consider improving security." Every finding must point to a file, line, and specific issue.
- Threat model is optional. Only runs with `--deep`. Do not run it unless explicitly requested.
## Success Criteria
- Mechanical scanner ran and produced findings (or confirmed clean)
- AI review covered OWASP Top 10 baseline
- Stack-adaptive checks matched the detected technology
- Every finding includes file, line, CWE reference, and remediation
- Report follows the structured format
- Error-severity findings result in FAIL status
## Escalation
- Scanner finds secrets in committed code: Flag immediately. Recommend rotating the compromised credentials. This is urgent regardless of other findings.
- AI review finds a critical vulnerability (RCE, SQLi, auth bypass): Mark as blocking. Do not approve the PR. Provide exact remediation code.
- Conflict between scanner and AI review: If the scanner flags something the AI thinks is a false positive, include both perspectives in the report. Let the human decide.
- Scope too large for meaningful review: If the project has >1000 source files, recommend scoping the review to changed files or a specific subsystem.
## Examples

### Example: Clean Scan
```
Security Review: PASS

Mechanical Scanner:
- Scanned: 42 files, 22 rules applied
- Coverage: baseline
- Errors: 0 | Warnings: 0 | Info: 0

AI Review:
- OWASP Baseline: No issues found
- Stack-Adaptive (node, express): No issues found
```
### Example: Findings Detected
```
Security Review: FAIL

Mechanical Scanner:
- Scanned: 42 files, 22 rules applied
- Coverage: baseline
- Errors: 2 | Warnings: 1 | Info: 0

Findings:
1. [SEC-SEC-002] ERROR src/config.ts:12 — Hardcoded API key or secret detected
   Remediation: Use environment variables: process.env.API_KEY
2. [SEC-INJ-002] ERROR src/db.ts:45 — SQL query built with string concatenation
   Remediation: Use parameterized queries: query("SELECT * FROM users WHERE id = $1", [id])
3. [SEC-NET-001] WARNING src/cors.ts:8 — CORS wildcard origin allows any website to make requests
   Remediation: Restrict CORS to specific trusted origins

AI Review:
- OWASP Baseline: 1 finding — user input from req.params.id flows through formatQuery() to db.execute() without sanitization (confirms SEC-INJ-002 with data flow trace)
- Stack-Adaptive (node, express): Missing helmet middleware, missing rate limiting on /api/* routes
```
### Example: Deep Audit with Threat Model
```
Security Review: WARN

Mechanical Scanner:
- Scanned: 120 files, 30 rules applied
- Coverage: baseline
- Errors: 0 | Warnings: 2 | Info: 3

AI Review:
- OWASP Baseline: No critical issues
- Stack-Adaptive (node, react): localStorage used for session token (SEC-REACT-001)

Threat Model:
- Entry points: 12 (8 REST endpoints, 2 WebSocket handlers, 2 CLI commands)
- Trust boundaries: 4 (client→API, API→database, API→external service, CLI→filesystem)
- High-risk flows:
  1. POST /api/upload → file stored to disk without size limit or type validation
  2. WebSocket message handler passes user data to eval-like template engine
```