Agentic-qe qe-agentic-quality-engineering

AI agents as force multipliers for quality work. Core skill for all 19 QE agents using PACT principles.

install

source · Clone the upstream repo

git clone https://github.com/proffesor-for-testing/agentic-qe

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/proffesor-for-testing/agentic-qe "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.kiro/skills/qe-agentic-quality-engineering" ~/.claude/skills/proffesor-for-testing-agentic-qe-qe-agentic-quality-engineering && rm -rf "$T"

manifest: .kiro/skills/qe-agentic-quality-engineering/SKILL.md

Agentic Quality Engineering

<default_to_action> When implementing agentic QE or coordinating agents:

SPAWN appropriate agent(s) for the task using the `Task` tool with agent type
CONFIGURE agent coordination (hierarchical/mesh/sequential)
EXECUTE with PACT principles: Proactive analysis, Autonomous operation, Collaborative feedback, Targeted risk focus
VALIDATE results through quality gates before deployment
LEARN from outcomes - store patterns in the `aqe/learning/*` namespace

Quick Agent Selection:

Test generation needed → `qe-test-generator`
Coverage gaps → `qe-coverage-analyzer`
Quality decision → `qe-quality-gate`
Security scan → `qe-security-scanner`
Performance test → `qe-performance-tester`
Full pipeline → `qe-fleet-commander`

Critical Success Factors:

Agents amplify human expertise; they do not replace it
Human-in-the-loop for critical decisions
Measure: bugs caught, time saved, coverage improved </default_to_action>
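
The Quick Agent Selection list above is effectively a lookup table from a QE need to an agent type. A minimal sketch of that table follows; the `QeNeed` labels and the `selectAgent` helper are hypothetical names chosen for illustration and are not part of the `aqe` tooling.

```typescript
// Hypothetical need labels; only the agent names come from the list above.
type QeNeed =
  | "test-generation"
  | "coverage-gaps"
  | "quality-decision"
  | "security-scan"
  | "performance-test"
  | "full-pipeline";

// One entry per line of the Quick Agent Selection list.
const AGENT_FOR_NEED: Record<QeNeed, string> = {
  "test-generation": "qe-test-generator",
  "coverage-gaps": "qe-coverage-analyzer",
  "quality-decision": "qe-quality-gate",
  "security-scan": "qe-security-scanner",
  "performance-test": "qe-performance-tester",
  "full-pipeline": "qe-fleet-commander",
};

// Returns the agent type to request for a given need.
function selectAgent(need: QeNeed): string {
  return AGENT_FOR_NEED[need];
}
```

Keeping the mapping in one typed record means the compiler flags a need that has no agent assigned.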

Quick Reference Card

When to Use

Designing autonomous testing systems
Scaling QE with intelligent agents
Implementing multi-agent coordination
Building CI/CD quality pipelines

PACT Principles

Principle	Agent Behavior	Human Role
Proactive	Analyze pre-merge, predict risk	Set guardrails
Autonomous	Execute tests, fix flaky tests	Review critical decisions
Collaborative	Multi-agent coordination	Provide context
Targeted	Risk-based prioritization	Define risk areas

19-Agent Fleet

Category	Agents	Primary Use
Core Testing (5)	test-generator, test-executor, coverage-analyzer, quality-gate, quality-analyzer	Daily testing
Performance/Security (2)	performance-tester, security-scanner	Non-functional
Strategic (3)	requirements-validator, production-intelligence, fleet-commander	Planning
Advanced (4)	regression-risk-analyzer, test-data-architect, api-contract-validator, flaky-test-hunter	Specialized
Visual/Chaos (2)	visual-tester, chaos-engineer	Edge cases
Deployment (1)	deployment-readiness	Release
Analysis (1)	code-complexity	Maintainability

Coordination Patterns

Hierarchical: fleet-commander → [generators] → [executors] → quality-gate
Mesh: test-gen ↔ coverage ↔ quality (peer decisions)
Sequential: risk-analyzer → test-gen → executor → coverage → gate
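
One way to reason about the three patterns is as directed graphs of hand-offs. The sketch below is an assumption-laden model, not how the fleet is implemented: the `Topology` type and `reachable` helper are hypothetical, the bracketed groups are treated as single nodes, and the mesh contains only the two links written above.

```typescript
// Hypothetical model: each agent maps to the agents it hands work to.
type Topology = Map<string, string[]>;

const hierarchical: Topology = new Map([
  ["fleet-commander", ["generators"]],
  ["generators", ["executors"]],
  ["executors", ["quality-gate"]],
  ["quality-gate", []],
]);

// Mesh links are bidirectional, so each peer lists its neighbours.
const mesh: Topology = new Map([
  ["test-gen", ["coverage"]],
  ["coverage", ["test-gen", "quality"]],
  ["quality", ["coverage"]],
]);

const sequential: Topology = new Map([
  ["risk-analyzer", ["test-gen"]],
  ["test-gen", ["executor"]],
  ["executor", ["coverage"]],
  ["coverage", ["gate"]],
  ["gate", []],
]);

// Breadth-first walk: every agent that work can reach from `start`.
function reachable(topology: Topology, start: string): string[] {
  const visited: string[] = [start];
  const queue: string[] = [start];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    const next = topology.get(current);
    if (next === undefined) continue;
    for (const agent of next) {
      if (!visited.includes(agent)) {
        visited.push(agent);
        queue.push(agent);
      }
    }
  }
  return visited;
}
```

In this model the sequential pattern has exactly one path, while any mesh peer can reach every other peer.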

Success Criteria

✅ 10x deployment frequency with same/better quality
✅ Coverage gaps detected in real-time
✅ Bugs caught pre-production
❌ Agents acting without human oversight on critical decisions
❌ Deploying all 19 agents at once (start with 1-2)

Core Concepts

QE Evolution

Stage	Approach	Limitation
Traditional	Manual everything	Human bottleneck
Automation	Scripts + fixed scenarios	Needs orchestration
Agentic	AI agents + human judgment	Requires trust-building

Core Premise: Agents amplify human expertise for 10x scale.

Key Capabilities

1. Intelligent Test Generation

// Agent analyzes code change, generates targeted tests
const tests = await qeTestGenerator.generate(prDiff);
// → Happy path, edge cases, error handling tests

2. Pattern Detection - Scan logs, find anomalies, correlate errors

3. Adaptive Strategy - Adjust test focus based on risk signals

4. Root Cause Analysis - Link failures to code changes, suggest fixes

Agent Coordination

Memory Namespaces

aqe/test-plan/*     - Test planning decisions
aqe/coverage/*      - Coverage analysis results
aqe/quality/*       - Quality metrics and gates
aqe/learning/*      - Patterns and Q-values
aqe/coordination/*  - Cross-agent state
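
Because every key lives under one of these prefixes, a small guard can catch keys written to the wrong place before they are stored. This is a sketch only: `namespaceOf` is a hypothetical helper, and it assumes keys always begin with the namespace followed by a slash, as in the examples in this document.

```typescript
// The five namespaces listed above, without the trailing wildcard.
const NAMESPACES = [
  "aqe/test-plan",
  "aqe/coverage",
  "aqe/quality",
  "aqe/learning",
  "aqe/coordination",
] as const;

type Namespace = (typeof NAMESPACES)[number];

// Returns the namespace a key belongs to, or undefined if it matches none.
function namespaceOf(key: string): Namespace | undefined {
  for (const ns of NAMESPACES) {
    if (key.indexOf(ns + "/") === 0) return ns;
  }
  return undefined;
}
```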

Memory Operations (MCP Tools)

CRITICAL: Always use `aqe memory store` with `persist: true` for learnings.

1. Store data to persistent memory:

# Store test plan decisions (persisted to .agentic-qe/memory.db)
aqe memory store \
  --key "aqe/test-plan/pr-123" \
  --namespace "aqe/test-plan" \
  --value '{"prNumber":123,"riskLevel":"medium","requiredCoverage":85,"testTypes":["unit","integration"]}' \
  --ttl 604800 \
  --json

2. Retrieve prior learnings before task:

# Query patterns before starting test generation
aqe memory search \
  --pattern "aqe/learning/patterns/test-generation/*" \
  --namespace "aqe/learning" \
  --json

3. Store coverage analysis results:

aqe memory store \
  --key "aqe/coverage/auth-module" \
  --namespace "aqe/coverage" \
  --value '{"moduleId":"auth-module","currentCoverage":78,"gaps":["error-handling","edge-cases"],"priority":"high"}' \
  --ttl 1209600 \
  --json
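
The `--ttl` values in these examples are easier to read once converted to days. The arithmetic below assumes the unit is seconds, which the examples do not state explicitly; `daysToTtlSeconds` is a hypothetical helper.

```typescript
// Assumption: --ttl is expressed in seconds.
const SECONDS_PER_DAY = 24 * 60 * 60; // 86400

// Converts a retention period in days to a TTL in seconds.
function daysToTtlSeconds(days: number): number {
  return days * SECONDS_PER_DAY;
}

const testPlanTtl = daysToTtlSeconds(7);  // 604800, as in the test-plan example
const coverageTtl = daysToTtlSeconds(14); // 1209600, as in the coverage example
```

Under that assumption, test-plan decisions are kept for one week and coverage results for two.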

Three-Phase Memory Protocol

For coordinated multi-agent tasks, use the STATUS → PROGRESS → COMPLETE pattern:

# PHASE 1: STATUS - Task starting
aqe memory store \
  --key "aqe/coordination/task-123/status" \
  --namespace "aqe/coordination" \
  --value '{"status":"running","agent":"qe-test-generator"}' \
  --json

# PHASE 2: PROGRESS - Intermediate updates
aqe memory store \
  --key "aqe/coordination/task-123/progress" \
  --namespace "aqe/coordination" \
  --value '{"progress":50,"action":"generating-unit-tests","testsGenerated":25}' \
  --json

# PHASE 3: COMPLETE - Task finished
aqe memory store \
  --key "aqe/coordination/task-123/complete" \
  --namespace "aqe/coordination" \
  --value '{"status":"complete","result":"success","testsGenerated":47,"coverageAchieved":92.3}' \
  --json
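
The three phases share one key scheme, so the records can be built by a single helper. The sketch below only constructs the key, namespace, and JSON value that would be passed to `aqe memory store`; it does not call the CLI, and `phaseRecord` and `MemoryRecord` are hypothetical names.

```typescript
type Phase = "status" | "progress" | "complete";

interface MemoryRecord {
  key: string;
  namespace: string;
  value: string; // JSON-encoded payload, as passed to --value
}

// Builds the record for one phase of a coordinated task.
function phaseRecord(taskId: string, phase: Phase, payload: object): MemoryRecord {
  const namespace = "aqe/coordination";
  return {
    key: namespace + "/" + taskId + "/" + phase,
    namespace,
    value: JSON.stringify(payload),
  };
}

// Mirrors the PHASE 1 command above.
const started = phaseRecord("task-123", "status", {
  status: "running",
  agent: "qe-test-generator",
});
```

Restricting `Phase` to the three literals prevents an agent from inventing a fourth key suffix that other agents would not look for.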

Blackboard Events

Event	Trigger	Subscribers
`test:generated`	New tests created	executor, coverage
`coverage:gap`	Gap detected	test-generator
`quality:decision`	Gate evaluated	fleet-commander
`security:finding`	Vulnerability found	quality-gate
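
The table above describes a publish/subscribe fan-out. A minimal in-memory sketch of that routing follows; the `Blackboard` class is hypothetical and stands in for whatever event bus the fleet actually uses, and only the event names and subscribers come from the table.

```typescript
type EventName =
  | "test:generated"
  | "coverage:gap"
  | "quality:decision"
  | "security:finding";

// Subscribers per event, copied from the table.
const SUBSCRIBERS: Record<EventName, string[]> = {
  "test:generated": ["executor", "coverage"],
  "coverage:gap": ["test-generator"],
  "quality:decision": ["fleet-commander"],
  "security:finding": ["quality-gate"],
};

interface Delivery {
  subscriber: string;
  event: EventName;
  payload: unknown;
}

// Hypothetical in-memory blackboard: records who would receive each event.
class Blackboard {
  readonly log: Delivery[] = [];

  publish(event: EventName, payload: unknown): Delivery[] {
    const deliveries = SUBSCRIBERS[event].map(
      (subscriber): Delivery => ({ subscriber, event, payload })
    );
    for (const d of deliveries) this.log.push(d);
    return deliveries;
  }
}
```

Publishing `coverage:gap` reaches only `test-generator`, which is the feedback loop that lets a detected gap trigger new tests.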

Example: PR Quality Pipeline

// 1. Risk analysis
const risks = await Task("Analyze PR", prDiff, "qe-regression-risk-analyzer");

// 2. Generate tests for risks
const tests = await Task("Generate tests", risks, "qe-test-generator");

// 3. Execute + analyze
const results = await Task("Run tests", tests, "qe-test-executor");
const coverage = await Task("Check coverage", results, "qe-coverage-analyzer");

// 4. Quality decision
const decision = await Task("Evaluate", {results, coverage}, "qe-quality-gate");
// → GO/NO-GO with rationale
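
The final step returns a GO/NO-GO decision with a rationale. The rule below is one plausible gate, not the logic `qe-quality-gate` actually applies: it assumes any failed test blocks the release and that coverage must meet a required percentage (85 is borrowed from the test-plan example earlier). All type and function names are hypothetical.

```typescript
interface TestResults {
  passed: number;
  failed: number;
}

interface GateDecision {
  verdict: "GO" | "NO-GO";
  rationale: string;
}

// Assumed rule: block on any failure, then on insufficient coverage.
function evaluateGate(
  results: TestResults,
  coveragePct: number,
  requiredCoveragePct: number
): GateDecision {
  if (results.failed > 0) {
    return { verdict: "NO-GO", rationale: results.failed + " test(s) failed" };
  }
  if (coveragePct < requiredCoveragePct) {
    return {
      verdict: "NO-GO",
      rationale: "coverage " + coveragePct + "% is below required " + requiredCoveragePct + "%",
    };
  }
  return { verdict: "GO", rationale: "all tests passed and coverage meets the threshold" };
}
```

Returning the rationale alongside the verdict gives the human reviewer something concrete to accept or override.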

Implementation Phases

Phase	Duration	Goal	Agent(s)
Experiment	Weeks 1-4	Validate one use case	1 agent
Integrate	Months 2-3	CI/CD pipeline	3-4 agents
Scale	Months 4-6	Multiple use cases	8+ agents
Evolve	Ongoing	Continuous learning	Full fleet

Phase 1 Example

# Week 1: Deploy single agent
aqe agent spawn qe-test-generator

# Weeks 2-3: Generate tests for 10 PRs
# Track: bugs found, test quality, review time

# Week 4: Measure impact
aqe agent metrics qe-test-generator
# → Tests: 150, Bugs: 12, Time saved: 8h
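
Raw totals are hard to compare across pilots, so it can help to normalise them per PR. The sketch below works through the illustrative figures above (150 tests, 12 bugs, 8 hours saved) under the assumption that they cover the 10 PRs from weeks 2-3; `perPr` and `PilotMetrics` are hypothetical names.

```typescript
interface PilotMetrics {
  prs: number;
  tests: number;
  bugs: number;
  hoursSaved: number;
}

// Normalises pilot totals so different pilots can be compared.
function perPr(m: PilotMetrics) {
  return {
    testsPerPr: m.tests / m.prs,
    bugsPerPr: m.bugs / m.prs,
    hoursSavedPerPr: m.hoursSaved / m.prs,
    // How many generated tests it took to surface one bug.
    testsPerBug: m.bugs === 0 ? Infinity : m.tests / m.bugs,
  };
}

const pilot = perPr({ prs: 10, tests: 150, bugs: 12, hoursSaved: 8 });
// pilot.testsPerPr is 15 and pilot.testsPerBug is 12.5
```

Tests per bug is a more useful signal than test count alone, since it falls as generated tests become better targeted.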

Limitations & Strengths

Agents Excel At

Volume: Scan thousands of logs in seconds
Patterns: Find correlations humans miss
Tireless: 24/7 testing and monitoring
Speed: Instant code change analysis

Agents Need Humans For

Business context and priorities
Ethical judgment and trade-offs
Creative exploration ("what if" scenarios)
Domain expertise (healthcare, finance, legal)

Best Practices

Do	Don't
Start with one agent, one use case	Deploy the whole fleet at once
Build feedback loops early	Deploy and forget
Human reviews agent output	Auto-merge without review
Measure bugs caught, time saved	Track vanity metrics (test count)
Build trust gradually	Give full autonomy immediately

Trust Progression

Month 1: Agent suggests → Human decides
Month 2: Agent acts → Human reviews after
Month 3: Agent autonomous on low-risk
Month 4: Agent handles critical with oversight
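
The progression can be read as a policy that maps elapsed time and task risk to an autonomy level. The sketch below encodes it with two stated assumptions: risk is reduced to `low` or `critical`, and critical work in month 3 stays at the month 2 level, which the list above does not specify. `trustMode` and the mode names are hypothetical.

```typescript
type Risk = "low" | "critical";
type Mode = "suggest" | "act-then-review" | "autonomous" | "act-with-oversight";

// Maps months since adoption and task risk to an autonomy level.
function trustMode(month: number, risk: Risk): Mode {
  if (month <= 1) return "suggest";          // Agent suggests, human decides
  if (month === 2) return "act-then-review"; // Agent acts, human reviews after
  if (risk === "low") return "autonomous";   // Month 3 onward, low-risk only
  // Assumption: critical work waits until month 4 for supervised handling.
  return month >= 4 ? "act-with-oversight" : "act-then-review";
}
```

Critical work never reaches the `autonomous` mode in this sketch, in line with keeping a human in the loop for critical decisions.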

Agent Coordination Hints

coordination:
  topology: hierarchical
  commander: qe-fleet-commander
  memory_namespace: aqe/coordination
  blackboard_topic: qe-fleet

preload_skills:
  - agentic-quality-engineering  # Always (this skill)
  - risk-based-testing           # For prioritization
  - quality-metrics              # For measurement

agent_assignments:
  qe-test-generator: [api-testing-patterns, tdd-london-chicago]
  qe-coverage-analyzer: [quality-metrics, risk-based-testing]
  qe-security-scanner: [security-testing, risk-based-testing]
  qe-performance-tester: [performance-testing]
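
To see which skills a given agent ends up with, the hints above can be merged per agent. This sketch assumes that `preload_skills` apply to every agent and that duplicates are dropped; the configuration does not say how the two lists combine, and `skillsFor` is a hypothetical helper.

```typescript
// Values copied from the coordination hints above.
const PRELOAD_SKILLS: string[] = [
  "agentic-quality-engineering",
  "risk-based-testing",
  "quality-metrics",
];

const AGENT_ASSIGNMENTS = new Map<string, string[]>([
  ["qe-test-generator", ["api-testing-patterns", "tdd-london-chicago"]],
  ["qe-coverage-analyzer", ["quality-metrics", "risk-based-testing"]],
  ["qe-security-scanner", ["security-testing", "risk-based-testing"]],
  ["qe-performance-tester", ["performance-testing"]],
]);

// Preloaded skills first, then agent-specific ones, without duplicates.
function skillsFor(agent: string): string[] {
  const assigned = AGENT_ASSIGNMENTS.get(agent);
  const merged: string[] = PRELOAD_SKILLS.slice();
  for (const skill of assigned === undefined ? [] : assigned) {
    if (merged.indexOf(skill) === -1) merged.push(skill);
  }
  return merged;
}
```

An agent with no entry in `agent_assignments`, such as `qe-quality-gate`, receives only the preloaded skills under this reading.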

Related Skills

`holistic-testing-pact` - PACT principles deep dive
`risk-based-testing` - Prioritize agent focus
`quality-metrics` - Measure agent effectiveness
`api-testing-patterns`, `security-testing`, `performance-testing` - Specialized testing

Resources

Agent definitions: `.claude/agents/`
CLI: `aqe agent --help`
Fleet status: `aqe fleet status`

Success Metric: Deploy 10x more frequently with same or better quality through intelligent agent collaboration.