Agentic-qe qe-agentic-quality-engineering

AI agents as force multipliers for quality work. Core skill for all 19 QE agents using PACT principles.

install
source · Clone the upstream repo
git clone https://github.com/proffesor-for-testing/agentic-qe
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/proffesor-for-testing/agentic-qe "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.kiro/skills/qe-agentic-quality-engineering" ~/.claude/skills/proffesor-for-testing-agentic-qe-qe-agentic-quality-engineering && rm -rf "$T"
manifest: .kiro/skills/qe-agentic-quality-engineering/SKILL.md
source content

Agentic Quality Engineering

<default_to_action> When implementing agentic QE or coordinating agents:

  1. SPAWN appropriate agent(s) for the task using the Task tool with the agent type
  2. CONFIGURE agent coordination (hierarchical/mesh/sequential)
  3. EXECUTE with PACT principles: Proactive analysis, Autonomous operation, Collaborative feedback, Targeted risk focus
  4. VALIDATE results through quality gates before deployment
  5. LEARN from outcomes - store patterns in the aqe/learning/* namespace

Quick Agent Selection:

  • Test generation needed → qe-test-generator
  • Coverage gaps → qe-coverage-analyzer
  • Quality decision → qe-quality-gate
  • Security scan → qe-security-scanner
  • Performance test → qe-performance-tester
  • Full pipeline → qe-fleet-commander

Critical Success Factors:

  • Agents amplify human expertise; they do not replace it
  • Human-in-the-loop for critical decisions
  • Measure: bugs caught, time saved, coverage improved </default_to_action>
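
The Quick Agent Selection list above can be read as a simple lookup from a need to an agent name. The following is a minimal sketch of that idea; the need labels (such as "coverage-gaps") and the selectAgent helper are hypothetical and are not part of the aqe CLI or the Task tool.

```javascript
// Hypothetical lookup mirroring the "Quick Agent Selection" list.
// The need labels on the left are illustrative; only the agent names come from the skill.
const AGENT_FOR_NEED = new Map([
  ["test-generation", "qe-test-generator"],
  ["coverage-gaps", "qe-coverage-analyzer"],
  ["quality-decision", "qe-quality-gate"],
  ["security-scan", "qe-security-scanner"],
  ["performance-test", "qe-performance-tester"],
  ["full-pipeline", "qe-fleet-commander"],
]);

// Returns the agent name for a need, or throws so an unmapped need is never silently ignored.
function selectAgent(need) {
  const agent = AGENT_FOR_NEED.get(need);
  if (agent === undefined) {
    throw new Error(`No agent mapped for need: ${need}`);
  }
  return agent;
}

console.log(selectAgent("coverage-gaps"));
```

Throwing on an unknown need keeps a human in the loop for cases the list does not cover, rather than guessing an agent.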

Quick Reference Card

When to Use

  • Designing autonomous testing systems
  • Scaling QE with intelligent agents
  • Implementing multi-agent coordination
  • Building CI/CD quality pipelines

PACT Principles

Principle     | Agent Behavior                  | Human Role
Proactive     | Analyze pre-merge, predict risk | Set guardrails
Autonomous    | Execute tests, fix flaky tests  | Review critical
Collaborative | Multi-agent coordination        | Provide context
Targeted      | Risk-based prioritization       | Define risk areas

19-Agent Fleet

Category                 | Agents | Primary Use
Core Testing (5)         | test-generator, test-executor, coverage-analyzer, quality-gate, quality-analyzer | Daily testing
Performance/Security (2) | performance-tester, security-scanner | Non-functional
Strategic (3)            | requirements-validator, production-intelligence, fleet-commander | Planning
Advanced (4)             | regression-risk-analyzer, test-data-architect, api-contract-validator, flaky-test-hunter | Specialized
Visual/Chaos (2)         | visual-tester, chaos-engineer | Edge cases
Deployment (1)           | deployment-readiness | Release
Analysis (1)             | code-complexity | Maintainability

Coordination Patterns

Hierarchical: fleet-commander → [generators] → [executors] → quality-gate
Mesh: test-gen ↔ coverage ↔ quality (peer decisions)
Sequential: risk-analyzer → test-gen → executor → coverage → gate
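
To make the sequential pattern concrete, here is a minimal sketch in which each stage receives the previous stage's output. The stage functions are hypothetical stubs standing in for real agents, and the values they return (the "auth" risk, the coverage of 80) are placeholders chosen only so the example runs.

```javascript
// Sequential topology: run stages in order, feeding each one the previous result.
function runSequential(stages, input) {
  const trace = [];
  let value = input;
  for (const { name, run } of stages) {
    value = run(value);
    trace.push(name);
  }
  return { value, trace };
}

// Hypothetical stubs; real agents would be invoked here instead.
const stages = [
  { name: "risk-analyzer", run: (diff) => ({ diff, risks: ["auth"] }) },
  { name: "test-gen", run: (ctx) => ({ ...ctx, tests: ctx.risks.map((r) => `test-${r}`) }) },
  { name: "executor", run: (ctx) => ({ ...ctx, executed: ctx.tests.length }) },
  { name: "coverage", run: (ctx) => ({ ...ctx, coverage: 80 }) },
  { name: "gate", run: (ctx) => ({ ...ctx, decision: ctx.coverage >= 80 ? "GO" : "NO-GO" }) },
];

const outcome = runSequential(stages, "example diff");
console.log(outcome.trace.join(" → "));
console.log(outcome.value.decision);
```

The hierarchical and mesh patterns differ mainly in who decides the next step (a commander or the peers themselves), so they would need a different driver than this simple loop.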

Success Criteria

✅ 10x deployment frequency with same/better quality
✅ Coverage gaps detected in real-time
✅ Bugs caught pre-production
❌ Agents acting without human oversight on critical decisions
❌ Deploying all 19 agents at once (start with 1-2)


Core Concepts

QE Evolution

Stage       | Approach                  | Limitation
Traditional | Manual everything         | Human bottleneck
Automation  | Scripts + fixed scenarios | Needs orchestration
Agentic     | AI agents + human judgment | Requires trust-building

Core Premise: Agents amplify human expertise for 10x scale.

Key Capabilities

1. Intelligent Test Generation

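// Illustrative pseudocode: qeTestGenerator stands in for the qe-test-generator agent
// and prDiff for the diff of the pull request under review.
// The agent analyzes the code change and generates targeted tests.
const tests = await qeTestGenerator.generate(prDiff);
// → Happy path, edge cases, error handling tests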

2. Pattern Detection - Scan logs, find anomalies, correlate errors

3. Adaptive Strategy - Adjust test focus based on risk signals

4. Root Cause Analysis - Link failures to code changes, suggest fixes


Agent Coordination

Memory Namespaces

aqe/test-plan/*     - Test planning decisions
aqe/coverage/*      - Coverage analysis results
aqe/quality/*       - Quality metrics and gates
aqe/learning/*      - Patterns and Q-values
aqe/coordination/*  - Cross-agent state
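
Because every key lives under one of these namespaces, a small helper can keep key construction consistent across agents. This is a sketch only: memoryKey is a hypothetical helper, not something the aqe CLI provides, and the rule that key parts must be non-empty and free of spaces is an assumption made for the example.

```javascript
// Namespaces listed in this section.
const NAMESPACES = [
  "aqe/test-plan",
  "aqe/coverage",
  "aqe/quality",
  "aqe/learning",
  "aqe/coordination",
];

// Hypothetical helper: joins a known namespace with one or more key parts.
function memoryKey(namespace, ...parts) {
  if (!NAMESPACES.includes(namespace)) {
    throw new Error(`Unknown namespace: ${namespace}`);
  }
  if (parts.length === 0 || parts.some((p) => p === "" || p.includes(" "))) {
    throw new Error("Key parts must be non-empty and contain no spaces");
  }
  return [namespace, ...parts].join("/");
}

console.log(memoryKey("aqe/coordination", "task-123", "status"));
```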

Memory Operations (MCP Tools)

CRITICAL: Always use aqe memory store with persist: true for learnings.

1. Store data to persistent memory:

# Store test plan decisions (persisted to .agentic-qe/memory.db)
aqe memory store \
  --key "aqe/test-plan/pr-123" \
  --namespace "aqe/test-plan" \
  --value '{"prNumber":123,"riskLevel":"medium","requiredCoverage":85,"testTypes":["unit","integration"]}' \
  --ttl 604800 \
  --json

2. Retrieve prior learnings before task:

# Query patterns before starting test generation
aqe memory search \
  --pattern "aqe/learning/patterns/test-generation/*" \
  --namespace "aqe/learning" \
  --json

3. Store coverage analysis results:

aqe memory store \
  --key "aqe/coverage/auth-module" \
  --namespace "aqe/coverage" \
  --value '{"moduleId":"auth-module","currentCoverage":78,"gaps":["error-handling","edge-cases"],"priority":"high"}' \
  --ttl 1209600 \
  --json
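
The --ttl values in the two store commands are easier to read as durations. Assuming the flag is expressed in seconds (the section does not state the unit), 604800 corresponds to 7 days and 1209600 to 14 days. The helper name below is hypothetical.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Converts a retention period in days to a TTL, assuming the TTL unit is seconds.
function daysToTtlSeconds(days) {
  return days * SECONDS_PER_DAY;
}

console.log(daysToTtlSeconds(7));  // 604800, the test-plan TTL above
console.log(daysToTtlSeconds(14)); // 1209600, the coverage TTL above
```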

Three-Phase Memory Protocol

For coordinated multi-agent tasks, use the STATUS → PROGRESS → COMPLETE pattern:

# PHASE 1: STATUS - Task starting
aqe memory store \
  --key "aqe/coordination/task-123/status" \
  --namespace "aqe/coordination" \
  --value '{"status":"running","agent":"qe-test-generator"}' \
  --json

# PHASE 2: PROGRESS - Intermediate updates
aqe memory store \
  --key "aqe/coordination/task-123/progress" \
  --namespace "aqe/coordination" \
  --value '{"progress":50,"action":"generating-unit-tests","testsGenerated":25}' \
  --json

# PHASE 3: COMPLETE - Task finished
aqe memory store \
  --key "aqe/coordination/task-123/complete" \
  --namespace "aqe/coordination" \
  --value '{"status":"complete","result":"success","testsGenerated":47,"coverageAchieved":92.3}' \
  --json
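
The three phases above share one key layout, aqe/coordination/<task>/<phase>. The sketch below builds the key, namespace, and JSON value for each phase and refuses to move backwards (for example, writing progress after complete). createTaskTracker is a hypothetical helper; it only prepares the arguments and does not call the aqe CLI.

```javascript
const PHASES = ["status", "progress", "complete"];

// Hypothetical tracker: returns a function that prepares one memory entry per phase
// and enforces that phases never go backwards. Repeating a phase is allowed,
// so several progress updates can be recorded.
function createTaskTracker(taskId) {
  let lastIndex = -1;
  return function record(phase, value) {
    const index = PHASES.indexOf(phase);
    if (index === -1) {
      throw new Error(`Unknown phase: ${phase}`);
    }
    if (index < lastIndex) {
      throw new Error(`Phase ${phase} cannot follow ${PHASES[lastIndex]}`);
    }
    lastIndex = index;
    return {
      key: `aqe/coordination/${taskId}/${phase}`,
      namespace: "aqe/coordination",
      value: JSON.stringify(value),
    };
  };
}

const record = createTaskTracker("task-123");
const started = record("status", { status: "running", agent: "qe-test-generator" });
console.log(started.key);
```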

Blackboard Events

Event            | Trigger             | Subscribers
test:generated   | New tests created   | executor, coverage
coverage:gap     | Gap detected        | test-generator
quality:decision | Gate evaluated      | fleet-commander
security:finding | Vulnerability found | quality-gate
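
The event table above amounts to a publish/subscribe routing table. The following in-memory sketch shows how an event could be routed to its subscribers; createBlackboard, register, and publish are hypothetical names for illustration and do not describe how the fleet's blackboard is actually implemented.

```javascript
// Routing table copied from the Blackboard Events table.
const SUBSCRIBERS = new Map([
  ["test:generated", ["executor", "coverage"]],
  ["coverage:gap", ["test-generator"]],
  ["quality:decision", ["fleet-commander"]],
  ["security:finding", ["quality-gate"]],
]);

// Hypothetical in-memory blackboard: agents register a handler, publish fans out by event.
function createBlackboard() {
  const handlers = new Map(); // agent name -> handler function
  return {
    register(agent, handler) {
      handlers.set(agent, handler);
    },
    // Delivers the payload to every registered subscriber and returns who received it.
    publish(event, payload) {
      const delivered = [];
      for (const agent of SUBSCRIBERS.get(event) ?? []) {
        const handler = handlers.get(agent);
        if (handler) {
          handler(event, payload);
          delivered.push(agent);
        }
      }
      return delivered;
    },
  };
}

const board = createBlackboard();
const received = [];
board.register("executor", (event, payload) => received.push(`executor:${payload.count}`));
board.register("coverage", (event, payload) => received.push(`coverage:${payload.count}`));
console.log(board.publish("test:generated", { count: 25 }));
```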

Example: PR Quality Pipeline

// 1. Risk analysis
const risks = await Task("Analyze PR", prDiff, "qe-regression-risk-analyzer");

// 2. Generate tests for risks
const tests = await Task("Generate tests", risks, "qe-test-generator");

// 3. Execute + analyze
const results = await Task("Run tests", tests, "qe-test-executor");
const coverage = await Task("Check coverage", results, "qe-coverage-analyzer");

// 4. Quality decision
const decision = await Task("Evaluate", {results, coverage}, "qe-quality-gate");
// → GO/NO-GO with rationale

Implementation Phases

Phase      | Duration   | Goal                  | Agent(s)
Experiment | Weeks 1-4  | Validate one use case | 1 agent
Integrate  | Months 2-3 | CI/CD pipeline        | 3-4 agents
Scale      | Months 4-6 | Multiple use cases    | 8+ agents
Evolve     | Ongoing    | Continuous learning   | Full fleet

Phase 1 Example

# Week 1: Deploy single agent
aqe agent spawn qe-test-generator

# Weeks 2-3: Generate tests for 10 PRs
# Track: bugs found, test quality, review time

# Week 4: Measure impact
aqe agent metrics qe-test-generator
# → Tests: 150, Bugs: 12, Time saved: 8h

Limitations & Strengths

Agents Excel At

  • Volume: Scan thousands of logs in seconds
  • Patterns: Find correlations humans miss
  • Tireless: 24/7 testing and monitoring
  • Speed: Instant code change analysis

Agents Need Humans For

  • Business context and priorities
  • Ethical judgment and trade-offs
  • Creative exploration ("what if" scenarios)
  • Domain expertise (healthcare, finance, legal)

Best Practices

Do                                  | Don't
Start with one agent, one use case  | Deploy all 19 at once
Build feedback loops early          | Deploy and forget
Human reviews agent output          | Auto-merge without review
Measure bugs caught, time saved     | Track vanity metrics (test count)
Build trust gradually               | Give full autonomy immediately

Trust Progression

Month 1: Agent suggests → Human decides
Month 2: Agent acts → Human reviews after
Month 3: Agent autonomous on low-risk
Month 4: Agent handles critical with oversight

Agent Coordination Hints

coordination:
  topology: hierarchical
  commander: qe-fleet-commander
  memory_namespace: aqe/coordination
  blackboard_topic: qe-fleet

preload_skills:
  - agentic-quality-engineering  # Always (this skill)
  - risk-based-testing           # For prioritization
  - quality-metrics              # For measurement

agent_assignments:
  qe-test-generator: [api-testing-patterns, tdd-london-chicago]
  qe-coverage-analyzer: [quality-metrics, risk-based-testing]
  qe-security-scanner: [security-testing, risk-based-testing]
  qe-performance-tester: [performance-testing]
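
One way to read the hints above is that each agent loads the preloaded skills plus its own assignments. The sketch below resolves that list for a given agent. It assumes preload_skills applies to every agent and that duplicates should be collapsed; neither is stated in the hints, and skillsFor is a hypothetical helper.

```javascript
// Values copied from the coordination hints above.
const PRELOAD_SKILLS = ["agentic-quality-engineering", "risk-based-testing", "quality-metrics"];
const AGENT_ASSIGNMENTS = new Map([
  ["qe-test-generator", ["api-testing-patterns", "tdd-london-chicago"]],
  ["qe-coverage-analyzer", ["quality-metrics", "risk-based-testing"]],
  ["qe-security-scanner", ["security-testing", "risk-based-testing"]],
  ["qe-performance-tester", ["performance-testing"]],
]);

// Assumption: preloaded skills come first, then agent-specific ones, without duplicates.
function skillsFor(agent) {
  const assigned = AGENT_ASSIGNMENTS.get(agent) ?? [];
  return [...new Set([...PRELOAD_SKILLS, ...assigned])];
}

console.log(skillsFor("qe-performance-tester"));
```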

Related Skills

  • holistic-testing-pact - PACT principles deep dive
  • risk-based-testing - Prioritize agent focus
  • quality-metrics - Measure agent effectiveness
  • api-testing-patterns, security-testing, performance-testing - Specialized testing

Resources

  • Agent definitions: .claude/agents/
  • CLI: aqe agent --help
  • Fleet status: aqe fleet status

Success Metric: Deploy 10x more frequently with same or better quality through intelligent agent collaboration.