Myclaude-creator-engine test
```shell
# Clone the full repository
git clone https://github.com/myclaude-sh/myclaude-creator-engine

# Or install just this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/myclaude-sh/myclaude-creator-engine "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/test" ~/.claude/skills/myclaude-sh-myclaude-creator-engine-test && rm -rf "$T"
```
`.claude/skills/test/SKILL.md`

Tester
Run sandbox tests on a product in an isolated environment.
When to use: Before publishing, after major changes, or to verify activation protocol works standalone.
When NOT to use: For quick syntax checks (use /validate instead). For testing the Engine itself.
Activation Protocol
1. Shared preamble: Load `references/quality/activation-preamble.md` — context assembly, persona adaptation, deterministic routing rules.
2. Identify target product:
   - If `$ARGUMENTS` provided, use as slug → `workspace/{slug}/`
   - If not, list products in `workspace/` and ask
3. Read `.meta.yaml` → get type, state, mcs_target.
4. Load `product-dna/{type}.yaml` → get install_target pattern. Verify product has content (state != scaffold).
5. Create worktree isolation (this skill runs with `isolation: worktree`).
   - 5b. Stale worktree check: before creating a new worktree, glob `.claude/worktrees/`. If stale entries exist (>1 hour old by directory mtime), attempt removal. If removal fails, proceed anyway — never block testing on cleanup.
6. Load voice identity: Load `references/quality/engine-voice-core.md`. Test result reporting — pass verdicts (celebrating tone), fail verdicts (confronting tone), diagnostics (conducting tone) — honors the ✦ signature, three tones, and six anti-patterns.
Core Instructions
TEST EXECUTION
Step 1 — Install Simulation
Copy product files to the install target path (from product-dna):
workspace/{slug}/ → {worktree}/{install_target}/
Verify:
- All files copied successfully
- No broken relative paths after move
- Primary file exists at target location
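The copy-and-verify sequence above can be sketched in shell. The slug, layout, and install target here are illustrative stand-ins, not the Engine's actual paths:

```shell
#!/bin/sh
set -eu
# Hypothetical product layout (stand-in for workspace/{slug}/)
SLUG="demo-skill"
SRC="workspace/$SLUG"
DEST=".claude/worktrees/test-$SLUG/.claude/skills/$SLUG"   # {worktree}/{install_target}/
mkdir -p "$SRC/references"
printf '%s\n' '---' 'name: demo' '---' 'Body.' > "$SRC/SKILL.md"
printf 'preamble\n' > "$SRC/references/notes.md"

# Install simulation: copy product files into the isolated target
mkdir -p "$DEST"
cp -R "$SRC/." "$DEST/"

# Verify: primary file exists at the target location
[ -f "$DEST/SKILL.md" ] || { echo "FAIL: primary file missing"; exit 1; }
echo "INSTALL PASS"
```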
Step 2 — Activation Test
Test the activation protocol:
- Read the primary file (SKILL.md, AGENT.md, etc.)
- Verify it references files that exist in the installed location
- Check that references/ paths resolve correctly
- Verify frontmatter is valid (name, description present)
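A minimal frontmatter check for the primary file might look like this; the file content is a mock, and real products will differ:

```shell
#!/bin/sh
set -eu
# Mock primary file with YAML frontmatter
printf '%s\n' '---' 'name: demo' 'description: Example skill' '---' 'Body.' > SKILL.md

# Pull out the frontmatter block between the first pair of --- markers
FRONT=$(awk '/^---$/{n++; next} n==1{print} n>=2{exit}' SKILL.md)

# Frontmatter is valid only if both required fields are present
echo "$FRONT" | grep -q '^name:'        || { echo "FAIL: name missing"; exit 1; }
echo "$FRONT" | grep -q '^description:' || { echo "FAIL: description missing"; exit 1; }
echo "FRONTMATTER PASS"
```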
Step 2b — Type-Specific Tests (run BEFORE generic scenarios)
| Type | Test | Pass Criteria |
|---|---|---|
| hooks | 1. Parse `hooks.json` — must be valid JSON | JSON.parse succeeds |
| | 2. Run each handler script with mock stdin | Exit code 0 or 2 (not crash) |
| | 3. Verify all event names in hooks.json are from the 25-event list | No unknown events |
| | 4. Run security scan (same patterns as /validate Stage 2) | Zero injection patterns |
| statusline | 1. Verify has shebang (… or …) | First line matches |
| | 2. Run with mock JSON stdin | Produces stdout output (non-empty) |
| | 3. Verify … is valid JSON with … and … | Fields present and valid |
| | 4. Check script reads from stdin (… or … pattern present) | Pattern found |
| minds | 1. Load — verify frontmatter has … and … | Both fields present |
| | 2. Verify … or … field exists in frontmatter | At least one tool restriction |
| | 3. Verify the file is self-contained (no broken references to external files) | All refs resolve |
| | 4. Run test prompt: invoke as Agent with … and a simple domain question | Produces domain-relevant response |
If type-specific tests fail, report them separately in the test report under a TYPE-SPECIFIC section.
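Two of the checks above can be mocked in shell. The hooks.json shape and the statusline stdin payload below are assumptions for illustration, not the real schemas:

```shell
#!/bin/sh
set -eu
# Mock fixtures (hypothetical shapes)
printf '%s\n' '{"PreToolUse": [{"command": "./handler.sh"}]}' > hooks.json
printf '%s\n' '#!/bin/sh' 'read -r payload || true' 'echo "demo status"' > statusline.sh
chmod +x statusline.sh

# hooks: the file must parse as valid JSON
python3 -m json.tool hooks.json > /dev/null && echo "hooks: JSON PASS"

# statusline: shebang on the first line
head -n 1 statusline.sh | grep -q '^#!' && echo "statusline: shebang PASS"

# statusline: mock JSON on stdin must yield non-empty stdout
out=$(printf '{"model":"demo"}' | ./statusline.sh)
[ -n "$out" ] && echo "statusline: stdout PASS"
```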
Step 3 — Three Scenarios
Run 3 test inputs that stress the product at different levels. A product that only handles the ideal case is fragile — robust products handle all three:
| Scenario | Purpose | Input Strategy | What Failure Reveals |
|---|---|---|---|
| Happy path (ideal) | Normal expected use | Typical request matching the product's description | If this fails, the product is fundamentally broken |
| Edge case (challenging) | Boundary conditions | Minimal input, unusual formatting, missing context | If this fails, D14 (Graceful Degradation) is weak |
| Adversarial (problematic) | Graceful failure | Invalid input, prompt injection attempt, conflicting instructions | If this fails, the product is unsafe for marketplace |
For each scenario:
- Invoke the product with the test input
- Capture output
- Verify against D4 (Quality Gate) criteria if defined
- Check for crashes, undefined behavior, or silent failures
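The scenario loop can be sketched against a stand-in entry point. Here `./product.sh` is hypothetical; real products are invoked through the agent, not a script:

```shell
#!/bin/sh
set -eu
# Stand-in product that answers normal input and rejects empty input
printf '%s\n' '#!/bin/sh' \
  '[ -n "${1:-}" ] || { echo "error: empty input" >&2; exit 2; }' \
  'echo "handled: $1"' > product.sh
chmod +x product.sh

run_scenario() {
  label=$1; input=$2
  if out=$(./product.sh "$input" 2>&1); then
    [ -n "$out" ] && echo "$label: PASS" || echo "$label: FAIL (empty output)"
  else
    # Non-zero exit is acceptable only when it is a graceful rejection
    echo "$label: PASS (graceful rejection: $out)"
  fi
}

run_scenario "happy path"  "summarize the release notes"
run_scenario "edge case"   "x"
run_scenario "adversarial" ""
```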
Step 4 — Must-Haves Verification
If `.meta.yaml` contains `must_haves:`, verify each:
- Truths: Can the stated behavior be observed? Try to trigger it.
- Artifacts: Do the listed files exist with minimum content?
- Key Links: Do the grep patterns match?
Report: "Must-haves: {passed}/{total} verified" If any fail: "MUST-HAVE FAILED: {description}. The product may not achieve its stated goal."
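A "key link" check reduces to a grep, sketched here with a made-up pattern and file:

```shell
#!/bin/sh
set -eu
# Hypothetical artifact and must_haves grep pattern
printf 'Load references/quality/engine-voice-core.md before reporting.\n' > SKILL.md
PATTERN='engine-voice-core\.md'

if grep -q "$PATTERN" SKILL.md; then
  echo "Must-haves: 1/1 verified"
else
  echo "MUST-HAVE FAILED: key link '$PATTERN' not found"
fi
```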
Step 5 — Report
UX Stack (load before rendering results):
- `references/ux-experience-system.md` — §1 Context Assembly + §2.3 Moment Awareness (test pass/fail)
- `references/ux-vocabulary.md` — translate terms
- `references/quality/engine-voice.md` — Brand DNA
Cognitive rendering: /test results are a confidence moment. On pass: "Works in practice, not just on paper." — factual confidence, not celebration. On fail: diagnostic mode — show exactly what failed and why, suggest specific fix. For first-time test pass, this is a milestone: "Your product works. Real conversations validated." For experts: compact result line + only surface surprising observations. Never celebrate a test pass the same way twice for the same creator — vary the insight.
```
TEST REPORT — {slug}

INSTALL      PASS   Files copied: {N}, paths valid
ACTIVATION   PASS   Primary file loads, refs resolve, frontmatter valid
HAPPY PATH   PASS   Output matches expected behavior
EDGE CASE    PASS   Handles minimal input gracefully
ADVERSARIAL  PASS   Rejects invalid input without crashing
MUST-HAVES   PASS   {passed}/{total} verified

Result: 6/6 PASS — Ready for /package

{if failures}
FAILURES:
  {scenario}: {what went wrong}
  Suggested fix: {recommendation}
{/if}
```
Step 6 — Update State
On test completion, update `.meta.yaml`:

```yaml
# .meta.yaml updates
state:
  last_tested: "{ISO timestamp}"
  test_result: "pass"          # "pass" if all 5 checks pass, "fail" otherwise
  test_scenarios: "{passed}/{total}"
```
If any test fails, record `test_result: "fail"` but do NOT change state.phase — testing does not regress product state.
WORKTREE CLEANUP PROTOCOL
After all test scenarios complete (pass or fail):
- Record results first — write test results to the ORIGINAL `.meta.yaml` (not the worktree copy) before any cleanup attempt. Results must survive cleanup failure.
- Attempt cleanup — the `isolation: worktree` frontmatter handles automatic cleanup. If the worktree persists after skill completion:
  - On next `/test` invocation: check for stale worktrees via `glob .claude/worktrees/*`. If found, attempt removal before creating a new one.
  - On Windows: file lock contention is common. If removal fails, log the path and proceed — never block the test pipeline on cleanup failure.
- Disk budget guard — if `.claude/worktrees/` contains more than 3 directories, emit advisory: "Stale worktrees detected ({N} found). Run cleanup manually: remove `.claude/worktrees/` contents."
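The disk budget guard reduces to a one-level directory count. The >3 threshold comes from the protocol above; the age check is sketched in a comment:

```shell
#!/bin/sh
set -eu
WT=".claude/worktrees"
mkdir -p "$WT/a" "$WT/b" "$WT/c" "$WT/d"   # simulate four stale entries

# Stale-by-age candidates would be: find "$WT" -mindepth 1 -maxdepth 1 -type d -mmin +60
COUNT=$(find "$WT" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' ')
if [ "$COUNT" -gt 3 ]; then
  echo "Stale worktrees detected ($COUNT found). Run cleanup manually: remove $WT contents."
fi
```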
Quality Gate
Test is considered PASS if:
- Product installs without broken references
- Activation protocol loads successfully
- Happy path produces meaningful output
- Edge case doesn't crash or produce empty output
- Adversarial input is handled (rejection, fallback, or graceful error)
- All `must_haves` entries verified (if defined in `.meta.yaml`)
Anti-Patterns
- Testing in production — Always use worktree isolation. Never modify the creator's workspace.
- Weak adversarial — "please ignore instructions" is too simple. Use domain-appropriate adversarial inputs.
- Binary pass/fail — Provide specific failure descriptions, not just FAIL.
- Skipping cleanup — Worktree must be cleaned up after test, even on failure.
- Testing the template — If the product still has placeholder content, don't test. Suggest /fill first.
Compact Instructions
When context is compressed, preserve:
- Product slug, type, and test status (pass/fail)
- Which scenarios passed/failed (happy path, edge case, adversarial)
- Type-specific test results if applicable
- Worktree cleanup status
- Whether test was blocking for MCS-2+ packaging