Claude-Skills qa-browser-automation

install

source · Clone the upstream repo

git clone https://github.com/borghei/Claude-Skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/borghei/Claude-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/engineering/qa-browser-automation" ~/.claude/skills/borghei-claude-skills-qa-browser-automation && rm -rf "$T"

manifest: engineering/qa-browser-automation/SKILL.md

source content

QA Browser Automation

The agent drives Chrome MCP for live browser testing and uses four Python tools for deterministic health scoring, accessibility auditing, visual regression tracking, and report generation.

Quick Start

# Score QA findings (0-100 weighted across 10 categories)
python scripts/qa_health_scorer.py findings.json --threshold 85 --baseline .qa-baselines/latest.json --save-baseline --json

# Audit HTML for WCAG 2.1 violations
python scripts/accessibility_auditor.py page.html --level AA --json

# Track visual regressions
python scripts/visual_regression_tracker.py --init --baseline-dir ./baselines
python scripts/visual_regression_tracker.py --register ./baselines
python scripts/visual_regression_tracker.py --baseline ./baselines --current ./screenshots --threshold 5

# Generate full QA report
python scripts/test_report_generator.py session_data.json --format markdown -o report.md

Tools Overview

Tool	Input	Output
`qa_health_scorer.py`	Findings JSON	Score 0-100, grade A-F, category breakdown, trend data
`accessibility_auditor.py`	HTML file (or stdin)	WCAG violations by level with remediation guidance
`visual_regression_tracker.py`	Baseline + current screenshot dirs	Pass/fail per page, change percentages
`test_report_generator.py`	Session data JSON	Markdown or JSON report with recommendations

All tools support `--json` for machine output. Health scorer and regression tracker return exit code 1 on failure (CI-friendly).
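As a rough illustration of the exit-code contract, a CI step could wrap the scorer in a small gate. This is a minimal sketch: `scorer_command` and `passes_gate` are hypothetical helper names, and only the flags already shown in Quick Start are used.

```python
import subprocess
import sys


def scorer_command(findings_path: str, threshold: int) -> list[str]:
    """Build the health scorer invocation shown in Quick Start."""
    return [
        sys.executable, "scripts/qa_health_scorer.py",
        findings_path, "--threshold", str(threshold), "--json",
    ]


def passes_gate(command: list[str]) -> bool:
    """Run a command and treat any nonzero exit code as a failed gate."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode == 0


# Usage in a CI step (not executed here):
#   if not passes_gate(scorer_command("findings.json", 85)):
#       sys.exit(1)
```

Treating every nonzero code as a failure also catches crashes and missing files, which is usually the safer default for a pipeline gate.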

Workflow 1: Full Application QA Sweep (11 Phases)

Phase 1-2: Pre-flight and authentication.

Verify `git status` is clean. Abort if dirty.
Create session directory: `.qa-sessions/{timestamp}/`
Authenticate via Chrome MCP if needed.
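The pre-flight checks above can be sketched as two small helpers. The sketch assumes the clean-tree check reads the output of `git status --porcelain` (which prints nothing for a clean tree) and that the timestamp uses a compact `YYYYMMDDTHHMMSS` form; the source does not specify either, and both function names are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def tree_is_clean(porcelain_output: str) -> bool:
    """Return True when `git status --porcelain` printed no entries."""
    return porcelain_output.strip() == ""


def session_dir(root: Path, now: datetime) -> Path:
    """Build .qa-sessions/{timestamp}/ under root (timestamp format is assumed)."""
    return root / ".qa-sessions" / now.strftime("%Y%m%dT%H%M%S")


# Usage (not executed here):
#   out = subprocess.run(["git", "status", "--porcelain"],
#                        capture_output=True, text=True).stdout
#   if not tree_is_clean(out):
#       raise SystemExit("working tree is dirty; aborting QA session")
#   session_dir(Path.cwd(), datetime.now()).mkdir(parents=True)
```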

Phase 3-4: Orient and explore.

Use `mcp__claude-in-chrome__read_page` to build sitemap/page map.
Navigate each route. Check `read_console_messages` for errors, `read_network_requests` for 4xx/5xx.
Test all forms with valid data, empty submissions, and boundary values.

Phase 5: State testing.

Verify loading states (skeleton screens, not blank), empty states (guides to first action), error states, success states, partial states.
Four shadow paths per interaction: happy path, nil input, empty input, error upstream.

Phase 6: Cross-device and security.

Resize to 320px, 768px, 1024px, 1440px, 1920px.
Check touch targets (44x44px min), layout shifts.
Verify security headers (CSP, HSTS, X-Frame-Options), cookie flags.
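The header and cookie checks could look roughly like the sketch below. It assumes response headers are available as a plain dict and a cookie as its raw `Set-Cookie` value. The three cookie flags checked (`Secure`, `HttpOnly`, `SameSite`) are an assumption, since the source only says "cookie flags".

```python
REQUIRED_HEADERS = (
    "Content-Security-Policy",    # CSP
    "Strict-Transport-Security",  # HSTS
    "X-Frame-Options",
)

# Assumed set of flags; the source does not list which ones are checked.
EXPECTED_COOKIE_FLAGS = ("Secure", "HttpOnly", "SameSite")


def missing_security_headers(headers: dict[str, str]) -> list[str]:
    """Return required headers absent from a response (case-insensitive)."""
    present = {name.lower() for name in headers}
    return [h for h in REQUIRED_HEADERS if h.lower() not in present]


def missing_cookie_flags(set_cookie: str) -> list[str]:
    """Return expected flags absent from one Set-Cookie header value."""
    # Skip the first segment, which is the name=value pair itself.
    attributes = {
        part.strip().split("=", 1)[0].lower()
        for part in set_cookie.split(";")[1:]
    }
    return [f for f in EXPECTED_COOKIE_FLAGS if f.lower() not in attributes]
```

For example, `missing_cookie_flags("sid=abc; Secure; SameSite=Lax")` returns `["HttpOnly"]`.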

Phase 7-8: Document and score.

Record every finding with screenshot evidence. No finding without evidence.
Classify by severity (P0-P4) and category (10 categories).

Run:

python scripts/qa_health_scorer.py findings.json --baseline .qa-baselines/latest.json
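Before scoring, it can help to reject incomplete findings instead of letting them fall back to defaults. This is a minimal sketch under assumptions: the `severity` and `category` keys come from the Troubleshooting table, while the `screenshot` key name and the `"P0"` to `"P4"` string encoding are hypothetical.

```python
SEVERITIES = ("P0", "P1", "P2", "P3", "P4")  # assumed string encoding


def finding_problems(finding: dict) -> list[str]:
    """List reasons a finding is not ready to be scored."""
    problems = []
    if finding.get("severity") not in SEVERITIES:
        problems.append("severity must be one of P0-P4")
    if not finding.get("category"):
        problems.append("category is missing")
    # "screenshot" is a hypothetical key for the required evidence.
    if not finding.get("screenshot"):
        problems.append("no screenshot evidence")
    return problems
```

An empty list means the finding carries severity, category, and evidence.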

Phase 9: Triage and fix loop.

P3/P4: AUTO-FIX, commit atomically, verify.
P0/P1/P2: ASK, present evidence, propose fix, wait for approval.
After each fix: re-run the check. If it fails: `git revert`.
Hard stop at 50 fixes.
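The triage rules above can be sketched as a loop that takes the fix, verify, and revert steps as callbacks. Everything here is illustrative: the function name, the `id` key, and the outcome labels are hypothetical, and in practice `revert` would wrap `git revert`.

```python
from typing import Callable

AUTO_FIX_SEVERITIES = {"P3", "P4"}


def run_fix_loop(
    findings: list[dict],
    apply_fix: Callable[[dict], None],
    verify: Callable[[dict], bool],
    revert: Callable[[dict], None],
    max_fixes: int = 50,
) -> list[tuple[str, str]]:
    """Auto-fix P3/P4 findings, defer P0-P2, and stop at the fix cap."""
    outcomes = []
    fixes_attempted = 0
    for finding in findings:
        if finding["severity"] not in AUTO_FIX_SEVERITIES:
            # P0/P1/P2: present evidence and wait for approval.
            outcomes.append((finding["id"], "ask"))
            continue
        if fixes_attempted >= max_fixes:
            break  # hard stop
        apply_fix(finding)
        fixes_attempted += 1
        if verify(finding):
            outcomes.append((finding["id"], "fixed"))
        else:
            revert(finding)
            outcomes.append((finding["id"], "reverted"))
    return outcomes
```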

Phase 10-11: Regression check and report.

Re-visit fixed pages. Verify no new errors.

Generate report:

python scripts/test_report_generator.py session.json --save-baseline

Validation checkpoint: Health score >= 85. Zero P0 findings. WCAG AA >= 95%.
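The checkpoint can be expressed as a single check that reports which criteria failed. This is a sketch with a hypothetical function name; the three thresholds are the ones stated above.

```python
def checkpoint_failures(
    score: float, p0_count: int, wcag_aa_percent: float
) -> list[str]:
    """Return the validation criteria that are not met (empty means pass)."""
    failures = []
    if score < 85:
        failures.append("health score below 85")
    if p0_count != 0:
        failures.append("P0 findings present")
    if wcag_aa_percent < 95:
        failures.append("WCAG AA below 95%")
    return failures
```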

Workflow 2: Visual Regression Testing

# Set up baseline
python scripts/visual_regression_tracker.py --init --baseline-dir ./baselines
# Capture and register screenshots
python scripts/visual_regression_tracker.py --register ./baselines
# After changes, compare
python scripts/visual_regression_tracker.py --baseline ./baselines --current ./screenshots --threshold 5 --json
# Accept intentional changes
python scripts/visual_regression_tracker.py --update-baseline --baseline ./baselines --current ./screenshots

Pages exceeding the threshold (default 5%) are flagged as regressions. The tracker uses SHA-256 hashing and byte-level comparison.
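One plausible way to combine hashing with byte-level comparison is sketched below. It is not the tracker's actual algorithm, which the source does not show. Here the SHA-256 digest is an assumed fast path for identical files, and the change percentage is an assumed definition: differing bytes over the longer file's length.

```python
import hashlib


def change_percent(baseline: bytes, current: bytes) -> float:
    """Estimate how much two screenshot files differ, as a percentage."""
    # Identical digests mean identical content: skip the byte scan.
    if hashlib.sha256(baseline).digest() == hashlib.sha256(current).digest():
        return 0.0
    longest = max(len(baseline), len(current))
    differing = sum(a != b for a, b in zip(baseline, current))
    # Bytes present in only one file count as changed.
    differing += abs(len(baseline) - len(current))
    return 100.0 * differing / longest


def is_regression(percent: float, threshold: float = 5.0) -> bool:
    """A page regresses when its change exceeds the threshold."""
    return percent > threshold
```

Byte-level comparison of compressed formats such as PNG can overstate small visual changes, so a threshold tuned for one format may not carry over to another.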

Workflow 3: Accessibility Audit

python scripts/accessibility_auditor.py page.html --level AA --json
curl -s https://example.com | python scripts/accessibility_auditor.py - --level AAA

What gets checked by level:

A (Must Fix): Alt text, page language, form labels, headings, duplicate IDs, autoplay media
AA (Should Fix): Color contrast (4.5:1 text, 3:1 large), heading hierarchy, focus visible, error identification
AAA (Nice to Have): Enhanced contrast (7:1), extended audio, reading level

Each violation includes: WCAG criterion, severity, element selector, and remediation guidance.
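To make the violation shape concrete, here is a minimal sketch of two Level A checks using Python's standard `html.parser`. It does not reproduce `accessibility_auditor.py`: the class name, dict keys, and severity label are hypothetical, and only missing alt text (WCAG 1.1.1) and missing page language (WCAG 3.1.1) are covered.

```python
from html.parser import HTMLParser


class LevelAChecker(HTMLParser):
    """Collects missing-alt and missing-lang violations while parsing."""

    def __init__(self) -> None:
        super().__init__()
        self.violations: list[dict] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Only fires when an <html> tag is present in the input.
        if tag == "html" and not attributes.get("lang"):
            self.violations.append({
                "criterion": "3.1.1",
                "severity": "must-fix",
                "selector": "html",
                "remediation": "Add a lang attribute to the html element.",
            })
        # alt="" is allowed (decorative image); only a missing attribute fails.
        if tag == "img" and "alt" not in attributes:
            self.violations.append({
                "criterion": "1.1.1",
                "severity": "must-fix",
                "selector": "img",
                "remediation": "Add an alt attribute describing the image.",
            })


def audit(html: str) -> list[dict]:
    """Parse an HTML string and return the violations found."""
    checker = LevelAChecker()
    checker.feed(html)
    checker.close()
    return checker.violations
```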

Testing Tiers

Tier	Duration	Scope
Quick	30s	Console errors, broken links, basic a11y, mobile resize
Standard	2-5 min	+ Top 10 routes, forms, contrast, Core Web Vitals
Deep	10-20 min	+ Full sitemap, state testing, WCAG AA, performance, visual regression, security headers
Exhaustive	30+ min	+ Every element, WCAG AAA, all pages performance, 5 breakpoints, auth edge cases, memory leaks

Health Scoring System

10 weighted categories, score 0-100:

Category	Weight	Measures
Functional	18%	Forms, CRUD, navigation flows
Accessibility	13%	WCAG compliance, keyboard nav
Console Errors	12%	JS errors, unhandled rejections
UX Flow	12%	Logical navigation, clear feedback
Performance	12%	Core Web Vitals within thresholds
Visual Consistency	10%	Layout shifts, alignment, z-index
Broken Links	8%	HTTP 4xx/5xx, dead anchors
Content Quality	5%	Spelling, placeholder text, truncation
Security Headers	5%	CSP, HSTS, cookie flags
Mobile Responsive	5%	Breakpoints, touch targets, no h-scroll

Severity deductions: P0: -30, P1: -18, P2: -10, P3: -4, P4: -1.

Grades: A (90-100), B (80-89), C (70-79), D (60-69), F (0-59).
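The source gives weights, deductions, and grade bands but not how they combine. The sketch below assumes that each category starts at 100, loses the severity deduction for every finding in that category (floored at 0), and that the overall score is the weighted average. Category keys other than `functional` are hypothetical spellings.

```python
WEIGHTS = {
    "functional": 18, "accessibility": 13, "console_errors": 12,
    "ux_flow": 12, "performance": 12, "visual_consistency": 10,
    "broken_links": 8, "content_quality": 5, "security_headers": 5,
    "mobile_responsive": 5,
}  # percentages; they sum to 100
DEDUCTIONS = {"P0": 30, "P1": 18, "P2": 10, "P3": 4, "P4": 1}


def health_score(findings: list[dict]) -> float:
    """Weighted average of per-category scores (assumed combination rule)."""
    category_scores = {category: 100 for category in WEIGHTS}
    for finding in findings:
        category = finding["category"]
        penalty = DEDUCTIONS[finding["severity"]]
        category_scores[category] = max(0, category_scores[category] - penalty)
    return sum(category_scores[c] * w for c, w in WEIGHTS.items()) / 100


def grade(score: float) -> str:
    """Map a 0-100 score onto the A-F bands."""
    for letter, floor in (("A", 90), ("B", 80), ("C", 70), ("D", 60)):
        if score >= floor:
            return letter
    return "F"
```

Under these assumptions, a single P0 functional finding drops that category to 70 and the overall score to 94.6, which still grades as A. That is why the validation checkpoint separately requires zero P0 findings.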

Safety Controls

Clean working tree required -- abort if `git status` is dirty.
Max 50 fixes per session -- hard stop.
Risk accumulator -- component (+5), style (+2), config (+8), revert (+15). Stop at 25% of budget.
WTF heuristic -- 3 consecutive fix verification failures = stop entirely.
Atomic commits -- one fix = one commit: `fix(qa): [P{severity}] {description}`
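The stop conditions above can be sketched as a small session guard. Several details are assumptions rather than documented behavior: the source does not define the risk "budget", so a placeholder budget of 100 points with a stop at 25% is used, and a failed verification is assumed to add the revert penalty on top of the fix's own risk.

```python
from dataclasses import dataclass

RISK_POINTS = {"component": 5, "style": 2, "config": 8, "revert": 15}


@dataclass
class SessionGuard:
    max_fixes: int = 50
    risk_budget: int = 100       # placeholder; not defined in the source
    stop_fraction: float = 0.25  # "stop at 25% of budget"
    max_consecutive_failures: int = 3
    fixes: int = 0
    risk: int = 0
    consecutive_failures: int = 0

    def record(self, kind: str, verified: bool) -> None:
        """Record one fix attempt of the given kind and its verification result."""
        self.fixes += 1
        self.risk += RISK_POINTS[kind]
        if verified:
            self.consecutive_failures = 0
        else:
            # Assumed: a failed fix is reverted and adds the revert penalty.
            self.consecutive_failures += 1
            self.risk += RISK_POINTS["revert"]

    def should_stop(self) -> bool:
        """True once any of the three stop conditions is reached."""
        return (
            self.fixes >= self.max_fixes
            or self.risk >= self.risk_budget * self.stop_fraction
            or self.consecutive_failures >= self.max_consecutive_failures
        )
```

With the placeholder budget, three verified config fixes accumulate 24 points and the session continues; a fourth reaches 32 and stops it.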

Troubleshooting

Problem	Cause	Solution
Scorer exits code 1 with no errors	Score below `--threshold` (default 70)	Check score in output; lower the threshold or fix findings
Auditor reports `parse-error`	Malformed HTML	Verify file is complete; check curl is not returning redirect
Regression tracker 100% change on all pages	Baseline manifest empty	Run `--init` then `--register` before comparing
Findings default to P3/functional	Missing `severity` or `category` keys	Include both keys in each finding dict
Chrome MCP returns stale content after SPA nav	DOM updated without full page load	Wait for transition, call `read_page` again

References

Guide	Path
Browser Testing Methodology	`references/browser_testing_methodology.md`
WCAG Compliance Guide	`references/wcag_compliance_guide.md`
Performance Benchmarks	`references/performance_benchmarks.md`

Integration Points

Skill	Integration
`code-reviewer`	Health score and findings in PR review context
`senior-frontend`	Visual regression baselines align with component library
`senior-devops`	Health score gates CI/CD via exit code
`senior-secops`	Security header findings escalate to security review
`incident-commander`	P0 findings trigger incident response

Last Updated: April 2026
Version: 2.1.0