Claude-skill-registry harshjudge

AI-native E2E testing orchestration for Claude Code. Use when creating, running, or managing end-to-end test scenarios with visual evidence capture. Activates for tasks involving E2E tests, browser automation testing, test scenario creation, test execution with screenshots, or checking test status.

Install

Source: clone the upstream repo

    git clone https://github.com/majiayu000/claude-skill-registry

Claude Code: install into ~/.claude/skills/

    T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/harshjudge" ~/.claude/skills/majiayu000-claude-skill-registry-harshjudge && rm -rf "$T"

Manifest: skills/data/harshjudge/SKILL.md

HarshJudge E2E Testing

AI-native E2E testing with MCP tools and visual evidence capture.

Core Principles

  1. Evidence First: Screenshot before and after every action
  2. Fail Fast: Stop on error, report with context
  3. Complete Runs: Always call completeRun, even on failure
  4. Step Isolation: Each step executes in its own spawned agent for token efficiency
  5. Knowledge Accumulation: Learnings go to prd.md, not scenarios

Step-Based Execution

HarshJudge uses a step-based agent pattern for token-efficient test execution:

Main Agent                    Step Agents (spawned per step)
    │
    ├─ startRun(scenarioSlug)
    │      ↓
    │  Returns: runId, steps[]
    │
    ├─► Spawn Agent: Step 01 ──────────────────────► Execute actions
    │      │                                              │
    │      │ ◄─────────────────────────────────── Return: { status, evidencePaths }
    │      │
    │   completeStep(runId, "01", status)
    │      │
    ├─► Spawn Agent: Step 02 ──────────────────────► Execute actions
    │      │                                              │
    │      │ ◄─────────────────────────────────── Return: { status, evidencePaths }
    │      │
    │   completeStep(runId, "02", status)
    │      │
    │   ... (repeat for each step)
    │
    └─ completeRun(runId, finalStatus)

Benefits:

  • Each step agent has isolated context (no token accumulation)
  • Large outputs (screenshots, logs) saved to files, not returned
  • Main agent only receives concise summaries
  • Automatic token optimization without manual management
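
The flow above can be sketched as a small orchestration loop. This is a minimal illustration, not HarshJudge's implementation: the four callables are hypothetical stand-ins for the startRun, completeStep, and completeRun tools and for whatever mechanism spawns a step agent, and it assumes steps[] is a list of step ID strings such as "01".

```python
from typing import Callable, Dict


def run_scenario(
    scenario_slug: str,
    start_run: Callable[[str], Dict],                   # stand-in for startRun
    spawn_step_agent: Callable[[str, str, str], Dict],  # hypothetical agent spawner
    complete_step: Callable[[str, str, str], None],     # stand-in for completeStep
    complete_run: Callable[[str, str], None],           # stand-in for completeRun
) -> str:
    """Run every step in its own agent and always finalize the run."""
    run = start_run(scenario_slug)       # assumed shape: {"runId": ..., "steps": [...]}
    run_id = run["runId"]
    final_status = "fail"                # pessimistic default until all steps pass
    previous = "first step"
    try:
        for step_id in run["steps"]:
            # The step agent returns only a concise summary, never raw evidence.
            result = spawn_step_agent(scenario_slug, step_id, previous)
            complete_step(run_id, step_id, result["status"])
            if result["status"] != "pass":
                break                    # fail fast: skip the remaining steps
            previous = result["status"]
        else:
            final_status = "pass"        # loop finished without a break
    finally:
        # Runs on success, on a failed step, and on an unexpected exception.
        complete_run(run_id, final_status)
    return final_status
```

Putting completeRun in a finally block is one way to honor the "Complete Runs" principle: the run is closed out even if a step agent raises.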

Workflows

| Intent | Reference | Key Tools |
|---|---|---|
| Initialize project | references/setup.md | initProject |
| Create scenario | references/create.md | createScenario |
| Run scenario | references/run.md | startRun, completeStep, completeRun |
| Fix failed test | references/iterate.md | getStatus, createScenario |
| Check status | references/status.md | getStatus |

Project Structure

.harshJudge/
  config.yaml              # Project configuration
  prd.md                   # Product requirements (from assets/prd.md template)
  scenarios/{slug}/
    meta.yaml              # Scenario definition + run statistics
    steps/                 # Individual step files
      01-step-slug.md      # Step 01 details
      02-step-slug.md      # Step 02 details
      ...
    runs/{runId}/          # Run history
      result.json          # Run result with per-step data
      step-01/evidence/    # Step 01 evidence
      step-02/evidence/    # Step 02 evidence
      ...
  snapshots/               # Inspection tool outputs (token-saving pattern)
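
The layout above maps directly onto a couple of path helpers. This is a sketch under the assumption that paths are built relative to the project root; the function names are hypothetical and not part of HarshJudge.

```python
from pathlib import PurePosixPath

# Root directory name taken from the structure shown above.
ROOT = PurePosixPath(".harshJudge")


def step_file(scenario_slug: str, step_id: str, step_slug: str) -> PurePosixPath:
    """Path of one step definition, e.g. steps/01-step-slug.md."""
    return ROOT / "scenarios" / scenario_slug / "steps" / f"{step_id}-{step_slug}.md"


def evidence_dir(scenario_slug: str, run_id: str, step_id: str) -> PurePosixPath:
    """Directory holding the evidence for one step of one run."""
    return (
        ROOT / "scenarios" / scenario_slug / "runs" / run_id
        / f"step-{step_id}" / "evidence"
    )
```

For example, evidence_dir("login", "abc123", "01") yields .harshJudge/scenarios/login/runs/abc123/step-01/evidence, where "login" and "abc123" are made-up values.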

Quick Reference

HarshJudge MCP Tools

| Tool | Purpose |
|---|---|
| initProject | Initialize project (spawns dashboard) |
| createScenario | Create/update scenario with step files |
| toggleStar | Toggle/set scenario starred status |
| startRun | Start test run, returns step list |
| recordEvidence | Capture evidence for a step |
| completeStep | Complete a step, get next step ID |
| completeRun | Finalize run with status |
| getStatus | Check project or scenario status |
| openDashboard / closeDashboard | Manage dashboard server |

Playwright MCP Tools

| Tool | Purpose |
|---|---|
| browser_navigate | Navigate to URL |
| browser_snapshot | Get accessibility tree (use before click/type) |
| browser_click | Click element using ref |
| browser_type | Type into input using ref |
| browser_take_screenshot | Capture screenshot for evidence |
| browser_console_messages | Get console logs |
| browser_network_requests | Get network activity |
| browser_wait_for | Wait for text/condition |

Step Agent Prompt Template

When spawning an agent for each step:

Execute step {stepId} of scenario {scenarioSlug}:

## Step Content
{content from steps/{stepId}-{slug}.md}

## Project Context
Base URL: {from config.yaml}
Auth: {from prd.md if needed}

## Previous Step
Status: {pass|fail|first step}

## Your Task
1. Execute the actions using Playwright MCP tools
2. Use browser_snapshot before clicking to get element refs
3. Capture before/after screenshots using browser_take_screenshot
4. Record evidence using recordEvidence with step={stepNumber}

Return ONLY a JSON object:
{
  "status": "pass" | "fail",
  "evidencePaths": ["path1.png", "path2.png"],
  "error": null | "error message"
}

DO NOT return full evidence content. DO NOT explain your work.
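
Because the step agent is asked to return only a JSON object, the main agent can check the reply before passing its status to completeStep. The following is a minimal sketch of such a check; the function name is hypothetical, and the strictness of the checks is an assumption based on the shape shown in the template.

```python
import json
from typing import Any, Dict


def parse_step_result(raw: str) -> Dict[str, Any]:
    """Parse a step agent reply and check it against the template's shape.

    Raises ValueError on malformed input (json.JSONDecodeError is a subclass).
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("step result must be a JSON object")

    status = data.get("status")
    if status not in ("pass", "fail"):
        raise ValueError(f"unexpected status: {status!r}")

    paths = data.get("evidencePaths")
    if not isinstance(paths, list) or not all(isinstance(p, str) for p in paths):
        raise ValueError("evidencePaths must be a list of strings")

    error = data.get("error")
    if error is not None and not isinstance(error, str):
        raise ValueError("error must be null or a string")

    # Keep only the three documented fields so the summary stays small.
    return {"status": status, "evidencePaths": paths, "error": error}
```

Treating a malformed reply as an error, rather than guessing at its meaning, is consistent with the fail-fast principle.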

Error Handling

On ANY error:

  1. STOP - Do not proceed
  2. Report - Tool, params, error, resolution
  3. Check prd.md - Is this a known pattern?
  4. Do NOT retry - Unless user instructs
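
The "Report" step lists four items: tool, params, error, and resolution. One way to keep reports uniform is a small record type, sketched below. The class, the function, and the output layout are illustrative assumptions, not a format defined by HarshJudge.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ErrorReport:
    tool: str                # name of the tool that failed
    params: Dict[str, Any]   # parameters passed to that tool
    error: str               # error message as received
    resolution: str          # suggested next action for the user


def format_error_report(report: ErrorReport) -> str:
    """Render the four report fields as one line each."""
    return "\n".join([
        f"Tool: {report.tool}",
        f"Params: {json.dumps(report.params, sort_keys=True)}",
        f"Error: {report.error}",
        f"Resolution: {report.resolution}",
    ])
```

The report is only formatted here; nothing is retried, which matches step 4.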