Learn-skills.dev hipaa-guardian
This skill should be used when the user asks to "scan for PHI", "detect PII", "HIPAA compliance check", "audit for protected health information", "find sensitive healthcare data", "generate HIPAA audit report", "check code for PHI leakage", "scan logs for PHI", "check authentication on PHI endpoints", "scan FHIR resources", "check HL7 messages", or mentions PHI detection, HIPAA compliance, healthcare data privacy, medical record security, logging PHI violations, authentication checks for health data, or healthcare data formats (FHIR, HL7, CDA).
git clone https://github.com/NeverSight/learn-skills.dev
T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/1mangesh1/dev-skills/hipaa-guardian" ~/.claude/skills/neversight-learn-skills-dev-hipaa-guardian && rm -rf "$T"
data/skills-md/1mangesh1/dev-skills/hipaa-guardian/SKILL.md
HIPAA Guardian
A comprehensive PHI/PII detection and HIPAA compliance skill for AI agents, with a strong focus on developer code security patterns. Detects all 18 HIPAA Safe Harbor identifiers in data files and source code, provides risk scoring, maps findings to HIPAA regulations, and generates audit reports with remediation guidance.
Capabilities
- PHI/PII Detection - Scan data files for the 18 HIPAA Safe Harbor identifiers
- Code Scanning - Detect PHI in source code, comments, test fixtures, configs
- Auth Gate Detection - Find API endpoints exposing PHI without authentication
- Log Safety Audit - Detect PHI leaking into log statements
- Classification - Classify findings as PHI, PII, or sensitive_nonPHI
- Risk Scoring - Score findings 0-100 based on sensitivity and exposure
- HIPAA Mapping - Map each finding to specific HIPAA rules
- Audit Reports - Generate findings.json, audit reports, and playbooks
- Remediation - Provide step-by-step remediation with code examples
- Control Checks - Validate security controls are in place
Usage
/hipaa-guardian [command] [path] [options]
Commands
- scan <path> - Scan files or directories for PHI/PII
- scan-code <path> - Scan source code for PHI leakage
- scan-auth <path> - Check API endpoints for missing authentication before PHI access
- scan-logs <path> - Detect PHI patterns in logging statements
- scan-response <path> - Check API responses for unmasked PHI exposure
- audit <path> - Generate full HIPAA compliance audit report
- controls <path> - Check security controls in a project
- report - Generate report from existing findings
Options
- --format <type> - Output format: json, markdown, csv (default: markdown)
- --output <file> - Write results to file
- --severity <level> - Minimum severity: low, medium, high, critical
- --include <patterns> - File patterns to include
- --exclude <patterns> - File patterns to exclude
- --synthetic - Treat all data as synthetic (default for safety)
Workflow
When invoked, follow this workflow:
Step 1: Determine Scan Scope
Ask the user to specify:
- Target path (file, directory, or glob pattern)
- Scan type (data files, source code, or both)
- Whether data is synthetic/test data or potentially real PHI
Step 2: File Discovery
Use Glob to find relevant files:
    # For data files
    Glob: **/*.{json,csv,txt,log,xml,hl7,fhir}

    # For source code
    Glob: **/*.{py,js,ts,tsx,java,cs,go,rb,sql,sh}

    # For config files
    Glob: **/*.{env,yaml,yml,json,xml,ini,conf}
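Outside an agent with a Glob tool, the same discovery step can be approximated in plain Python. The sketch below is an illustration only: `discover_files` and the three extension sets are hypothetical names, with the extensions copied from the glob patterns above.

```python
from pathlib import Path

# Extension groups copied from the glob patterns in this step.
DATA_EXTS = {".json", ".csv", ".txt", ".log", ".xml", ".hl7", ".fhir"}
CODE_EXTS = {".py", ".js", ".ts", ".tsx", ".java", ".cs", ".go", ".rb", ".sql", ".sh"}
CONFIG_EXTS = {".env", ".yaml", ".yml", ".json", ".xml", ".ini", ".conf"}


def discover_files(root, extensions):
    """Return sorted file paths under root whose suffix (or whole name) is in extensions."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        # The name check catches dotfiles such as ".env", which may have no suffix.
        if p.is_file() and (p.suffix.lower() in extensions or p.name.lower() in extensions)
    )
```

Calling `discover_files("path/to/project", CODE_EXTS)` would return the source files to hand to the code-scanning step.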
Step 3: PHI Detection
For each file, scan for the 18 HIPAA identifiers using patterns from references/detection-patterns.md:
- Names - Patient, provider, relative names
- Geographic - Addresses, cities, ZIP codes
- Dates - DOB, admission, discharge, death dates
- Phone Numbers - All formats
- Fax Numbers - All formats
- Email Addresses - All formats
- SSN - Social Security Numbers
- MRN - Medical Record Numbers
- Health Plan IDs - Insurance identifiers
- Account Numbers - Financial accounts
- License Numbers - Driver's license, professional
- Vehicle IDs - VIN, license plates
- Device IDs - Serial numbers, UDI
- URLs - Web addresses
- IP Addresses - Network identifiers
- Biometric - Fingerprints, retinal, voice
- Photos - Full-face images
- Other Unique IDs - Any other identifying numbers
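As a rough illustration of pattern-based detection, the sketch below scans text for three of the identifier types listed above. The regular expressions are simplified assumptions written for this example; they are not the patterns in references/detection-patterns.md and will produce both false positives and misses on real data.

```python
import re

# Illustrative patterns only (assumed for this sketch, US-style formats).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def scan_text(text):
    """Return (line_number, identifier_type, matched_text) tuples, in line order."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((lineno, kind, match.group()))
    return findings
```

The matched text is returned here only so the caller can hash it; under the guardrails of this skill it should not be written to any report.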
Step 4: Classification
Classify each finding:
- PHI - Health information linkable to individual
- PII - Personally identifiable but not health-related
- sensitive_nonPHI - Sensitive but not individually identifiable
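The three classes above can be read as a small decision rule. The sketch below is one possible reading, assuming each finding has already been judged on two yes/no questions; the function name and its parameters are hypothetical, and a real classifier would need surrounding context to answer those questions.

```python
def classify(identifies_individual, health_related):
    """Map two yes/no judgments about a finding onto the three classes."""
    if identifies_individual and health_related:
        return "PHI"  # health information linkable to an individual
    if identifies_individual:
        return "PII"  # identifies a person but carries no health information
    return "sensitive_nonPHI"  # sensitive, but not tied to an individual
```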
Step 5: Risk Scoring
Calculate risk score (0-100) using methodology from references/risk-scoring.md:
Risk Score = (Sensitivity × 0.35) + (Exposure × 0.25) + (Volume × 0.20) + (Identifiability × 0.20)
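The weighted sum can be sketched directly in Python. This assumes each of the four factors is itself scored on a 0-100 scale, which the formula implies but does not state; how the factors are derived is left to references/risk-scoring.md. The name `risk_score` is hypothetical.

```python
# Weights taken from the formula above; they sum to 1.0.
WEIGHTS = {
    "sensitivity": 0.35,
    "exposure": 0.25,
    "volume": 0.20,
    "identifiability": 0.20,
}


def risk_score(sensitivity, exposure, volume, identifiability):
    """Combine four 0-100 factor scores into a single 0-100 risk score."""
    factors = {
        "sensitivity": sensitivity,
        "exposure": exposure,
        "volume": volume,
        "identifiability": identifiability,
    }
    for name, value in factors.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return round(sum(WEIGHTS[name] * value for name, value in factors.items()))
```

For example, factors of 100, 80, 40, and 100 give 35 + 20 + 8 + 20 = 83.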
Step 6: HIPAA Mapping
Map findings to HIPAA rules from references:
- references/privacy-rule.md - 45 CFR 164.500-534 (Privacy Rule)
- references/security-rule.md - 45 CFR 164.302-318 (Security Rule)
- references/breach-rule.md - 45 CFR 164.400-414 (Breach Notification Rule)
Step 7: Generate Output
Create structured output following examples/sample-finding.json format:

    {
      "id": "F-YYYYMMDD-NNNN",
      "timestamp": "ISO-8601",
      "file": "path/to/file",
      "line": 123,
      "field": "field.path",
      "value_hash": "sha256:...",
      "classification": "PHI|PII|sensitive_nonPHI",
      "identifier_type": "ssn|mrn|dob|...",
      "confidence": 0.95,
      "risk_score": 85,
      "hipaa_rules": [...],
      "remediation": [...],
      "status": "open"
    }
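A finding in this shape might be assembled as follows. This is a sketch, not the code in scripts/detect-phi.py: `build_finding` and its parameters are hypothetical, and the key point it illustrates is that only a SHA-256 digest of the detected value is kept, never the value itself.

```python
import hashlib
from datetime import datetime, timezone


def build_finding(seq, file, line, field, value, classification,
                  identifier_type, confidence, risk_score):
    """Build one finding record; the raw value is hashed and then discarded."""
    now = datetime.now(timezone.utc)
    digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
    return {
        "id": f"F-{now:%Y%m%d}-{seq:04d}",
        "timestamp": now.isoformat(),
        "file": file,
        "line": line,
        "field": field,
        "value_hash": f"sha256:{digest}",
        "classification": classification,
        "identifier_type": identifier_type,
        "confidence": confidence,
        "risk_score": risk_score,
        "hipaa_rules": [],   # filled in by the HIPAA mapping step
        "remediation": [],   # filled in by the remediation step
        "status": "open",
    }
```

One design caveat: a plain hash of a low-entropy value such as an SSN can be reversed by brute force, so a keyed hash (for example `hmac` with a secret key) may be a safer choice if the findings file leaves the scanning environment.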
Code Scanning
When scanning source code, look for:
1. Hardcoded PHI in Source
- String literals containing SSN, MRN, names, dates
- Variable assignments with sensitive values
- Database seed/fixture data
2. PHI in Comments
- Example data in code comments
- TODO comments with patient info
- Documentation strings with real data
3. Test Data Leakage
- Test fixtures with real PHI
- Mock data files with actual patient info
- Integration test data
4. Configuration Files
- .env files with PHI
- Connection strings with embedded credentials
- API responses cached with PHI
5. SQL Files
- INSERT statements with PHI
- Sample queries with real patient data
- Database dumps
See references/code-scanning.md for detailed patterns.
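To illustrate the first two categories (hardcoded values and comments), the sketch below uses Python's standard `tokenize` module to look only inside string literals and comments of Python source. It is an assumption-laden example rather than the logic of scripts/scan-code.py: the SSN pattern is simplified, only Python files are handled, and f-strings are not covered on Python 3.12 and later because they are tokenized differently.

```python
import io
import re
import tokenize

# Simplified SSN-like pattern, assumed for this sketch.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scan_python_source(source):
    """Return (line_number, kind) for string literals and comments holding an SSN-like value."""
    findings = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            kind = "comment"
        elif tok.type == tokenize.STRING:
            kind = "string_literal"
        else:
            continue  # identifiers, numbers, operators are ignored
        if SSN_RE.search(tok.string):
            findings.append((tok.start[0], kind))
    return findings
```

Restricting the search to these token types avoids flagging, for instance, arithmetic such as `123 - 45 - 6789` in code.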
Security Control Checks
Verify these controls are in place:
Access Controls
- Role-based access control (RBAC) implemented
- Minimum necessary access principle applied
- Access logging enabled
Encryption
- Data encrypted at rest (AES-256)
- Data encrypted in transit (TLS 1.2+)
- Encryption keys properly managed
Audit Controls
- Audit logging implemented
- Log integrity protected
- Retention policies defined
Code Security
- .gitignore excludes sensitive files
- Pre-commit hooks scan for PHI
- Secrets management in place
- Data masking in logs
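The last control, data masking in logs, could look like the following in a Python service. This is a minimal sketch using the standard `logging.Filter` hook; the class name, the redaction marker, and the single SSN pattern are assumptions for illustration, and the patterns in references/logging-safety.md may differ.

```python
import logging
import re

# Simplified SSN-like pattern, assumed for this sketch.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


class PhiMaskingFilter(logging.Filter):
    """Replace SSN-like values in a log record before any handler formats it."""

    def filter(self, record):
        # Render the message with its arguments first, then mask the result.
        message = record.getMessage()
        record.msg = SSN_RE.sub("[REDACTED-SSN]", message)
        record.args = ()  # arguments are already merged into msg
        return True  # keep the record, now masked
```

Attaching it with `handler.addFilter(PhiMaskingFilter())` applies the masking to every record that handler emits.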
Output Formats
findings.json
Structured array of all findings with full metadata.
audit_report.md
Human-readable report with:
- Executive summary
- Findings by severity
- HIPAA compliance status
- Risk assessment
- Recommendations
playbook.md
Step-by-step remediation guide:
- Prioritized actions
- Code examples
- Verification steps
Security Guardrails
- Default Synthetic Mode - Assumes data is synthetic unless confirmed otherwise
- No PHI Storage - Never stores detected PHI values, only hashes
- Redaction - All example outputs redact actual values
- Warning Prompts - Warns before processing potentially real PHI
- Audit Trail - Logs all scans (without PHI values)
References
- references/hipaa-identifiers.md - All 18 HIPAA Safe Harbor identifiers
- references/detection-patterns.md - Regex patterns for PHI detection
- references/code-scanning.md - Code scanning patterns and rules
- references/healthcare-formats.md - FHIR, HL7, CDA detection patterns
- references/privacy-rule.md - HIPAA Privacy Rule (45 CFR 164.500-534)
- references/security-rule.md - HIPAA Security Rule (45 CFR 164.302-318)
- references/breach-rule.md - Breach Notification Rule (45 CFR 164.400-414)
- references/risk-scoring.md - Risk scoring methodology
- references/auth-patterns.md - Authentication gate patterns for PHI endpoints
- references/logging-safety.md - PHI-safe logging patterns and filters
- references/api-security.md - API response masking and field-level auth
CI/CD Integration
Pre-Commit Hook Installation
    # Install the pre-commit hook
    cp scripts/pre-commit-hook.sh .git/hooks/pre-commit
    chmod +x .git/hooks/pre-commit

    # Or using pre-commit framework
    # Add to .pre-commit-config.yaml:
    repos:
      - repo: local
        hooks:
          - id: hipaa-guardian
            name: HIPAA Guardian PHI Scan
            entry: python scripts/detect-phi.py
            language: python
            types: [file]
            pass_filenames: true
Environment Variables
    # Configure pre-commit behavior
    export HIPAA_BLOCK_ON_CRITICAL=true  # Block commits with critical findings
    export HIPAA_BLOCK_ON_HIGH=true      # Block commits with high severity findings
    export HIPAA_SCAN_DATA=true          # Scan data files
    export HIPAA_SCAN_CODE=true          # Scan source code
    export HIPAA_VERBOSE=false           # Enable verbose output
GitHub Actions Integration
    # .github/workflows/hipaa-scan.yml
    name: HIPAA PHI Scan
    on: [push, pull_request]
    jobs:
      scan:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-python@v5
            with:
              python-version: '3.11'
          - name: Run PHI Scan
            run: |
              python scripts/detect-phi.py . --format markdown --output phi-report.md
          - name: Upload Report
            uses: actions/upload-artifact@v4
            with:
              name: phi-scan-report
              path: phi-report.md
Healthcare Data Format Support
Supported Formats
| Format | Extensions | Detection |
|---|---|---|
| FHIR R4 | , | Resource type, identifiers |
| HL7 v2.x | , | MSH, PID, DG1 segments |
| CDA/C-CDA | , , | ClinicalDocument, patientRole |
| X12 EDI | , , | Transaction set headers |
High-Risk FHIR Resources
- Patient - Demographics, identifiers, contacts
- Condition - Diagnoses, health conditions
- Observation - Lab results, vitals
- MedicationRequest - Prescriptions
- DiagnosticReport - Test results
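A quick triage pass over FHIR JSON can flag these resource types before any field-level scanning. The sketch below relies on two facts of the FHIR JSON format (every resource carries a `resourceType` key, and a Bundle lists its members under `entry[].resource`); the function name and constant are hypothetical, and nested or contained resources are not handled.

```python
import json

# Resource types listed above as high risk.
HIGH_RISK_RESOURCE_TYPES = {
    "Patient", "Condition", "Observation", "MedicationRequest", "DiagnosticReport",
}


def high_risk_resources(fhir_json):
    """Return the high-risk resourceType values found in a FHIR resource or Bundle."""
    doc = json.loads(fhir_json)
    if doc.get("resourceType") == "Bundle":
        resources = [entry.get("resource", {}) for entry in doc.get("entry", [])]
    else:
        resources = [doc]
    return [
        r["resourceType"] for r in resources
        if r.get("resourceType") in HIGH_RISK_RESOURCE_TYPES
    ]
```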
HL7 v2 PHI Segments
- PID - Patient Identification (SSN in PID-19)
- DG1 - Diagnosis Information
- OBX - Observation/Result Values
- IN1 - Insurance Information
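To show how these segments can be located, the sketch below splits an HL7 v2 message into segments and fields. It assumes the default delimiters (carriage return between segments, `|` between fields) rather than reading them from MSH-1 and MSH-2, and the function names are hypothetical. For segments other than MSH, the field index after splitting on `|` matches the HL7 field number, so PID-19 is index 19.

```python
# Segment IDs listed above as carrying PHI.
PHI_SEGMENTS = {"PID", "DG1", "OBX", "IN1"}


def phi_segments(message):
    """Return the fields of each PHI-bearing segment, in message order."""
    # Segments end with a carriage return; tolerate newlines from text editors.
    raw_segments = [s for s in message.replace("\n", "\r").split("\r") if s]
    parsed = [segment.split("|") for segment in raw_segments]
    return [fields for fields in parsed if fields[0] in PHI_SEGMENTS]


def pid_field(message, index):
    """Return PID-<index> from the first PID segment, or None if absent or empty."""
    for fields in phi_segments(message):
        if fields[0] == "PID" and len(fields) > index:
            return fields[index] or None
    return None
```

With this, `pid_field(message, 19)` returns the SSN field when one is populated, which can then be hashed and reported like any other finding.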
Examples
- examples/sample-finding.json - Example finding output format
- examples/sample-audit-report.md - Example audit report
- examples/synthetic-phi-data.json - Test data for validation
Scripts
- scripts/detect-phi.py - PHI/PII detection in data files (supports FHIR, HL7, CDA formats)
- scripts/scan-code.py - Code scanning for PHI leakage
- scripts/scan-auth.py - Authentication gate detection for PHI endpoints
- scripts/scan-logs.py - PHI detection in logging statements
- scripts/scan-response.py - API response PHI exposure detection
- scripts/generate-report.py - Report generation script
- scripts/validate-controls.sh - Control validation script
- scripts/pre-commit-hook.sh - Git pre-commit hook for CI/CD integration