afrexai-ux-research-engine
Complete UX Research & Design system — user discovery, persona building, journey mapping, usability testing, research synthesis, and design validation. Zero dependencies.
```bash
# Clone the full skills repo...
git clone https://github.com/openclaw/skills

# ...or install only this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/1kalin/afrexai-ux-research-engine" ~/.claude/skills/clawdbot-skills-afrexai-ux-research-engine && rm -rf "$T"
```
`skills/1kalin/afrexai-ux-research-engine/SKILL.md`

UX Research Engine ⚡
Complete UX research methodology — from discovery to validated design decisions. No scripts, no APIs, no dependencies. Pure agent skill.
Phase 1: Research Planning
Research Brief YAML
project: "[Product/Feature Name]" research_question: "[What do we need to learn?]" business_context: objective: "[Business goal this research supports]" decision: "[What decision will this research inform?]" stakeholders: ["PM", "Design Lead", "Engineering"] deadline: "YYYY-MM-DD" scope: product_area: "[Feature/flow being studied]" user_segment: "[Who are we studying?]" geographic: "[Regions/markets]" methodology: "[See selection matrix below]" sample_size: "[See calculator below]" timeline: planning: "Week 1" recruiting: "Week 1-2" fieldwork: "Week 2-3" analysis: "Week 3-4" reporting: "Week 4" budget: participant_incentives: "$X" tools: "$X" total: "$X" success_criteria: - "[Specific insight we need]" - "[Confidence level required]" - "[Actionable output format]"
Method Selection Matrix
| Method | Best For | Sample Size | Time | Cost | Confidence |
|---|---|---|---|---|---|
| User Interviews | Deep "why" understanding, exploring unknowns | 5-15 | 2-4 weeks | $$ | High (qualitative) |
| Usability Testing | Finding interaction problems, validating flows | 5-8 per round | 1-2 weeks | $$ | High (behavioral) |
| Surveys | Quantifying attitudes, measuring satisfaction | 100-400+ | 1-2 weeks | $ | High (statistical) |
| Card Sorting | Information architecture, navigation labels | 15-30 (open), 30+ (closed) | 1 week | $ | Medium |
| Diary Studies | Long-term behavior, context of use | 10-15 | 2-6 weeks | $$$ | High (longitudinal) |
| A/B Testing | Comparing specific design variants | 1000+ per variant | 1-4 weeks | $ | Very High |
| Contextual Inquiry | Understanding real environment, workflows | 4-8 | 2-3 weeks | $$$ | Very High |
| Tree Testing | Validating IA without visual design | 50+ | 1 week | $ | High |
| First-Click Testing | Navigation effectiveness | 30-50 | 1 week | $ | Medium |
| Concept Testing | Early-stage idea validation | 8-15 | 1-2 weeks | $$ | Medium |
| Heuristic Evaluation | Expert review of existing UI | 3-5 evaluators | 2-3 days | $ | Medium |
| Competitive UX Audit | Understanding market standards | N/A | 1 week | $ | Low-Medium |
Decision Tree: Which Method?
```
Do you know WHAT the problem is?
├── NO → Generative Research
│   ├── Need context? → Contextual Inquiry
│   ├── Need attitudes? → User Interviews
│   ├── Need behaviors over time? → Diary Study
│   └── Need broad patterns? → Survey (exploratory)
│
└── YES → Evaluative Research
    ├── Have a prototype/product?
    │   ├── YES → Usability Testing
    │   │   ├── Early concept → Concept Test (paper/low-fi)
    │   │   ├── Key flow → Task-based Usability Test
    │   │   └── Comparing options → A/B Test
    │   └── NO →
    │       ├── Testing IA → Card Sort / Tree Test
    │       └── Testing content → First-Click Test
    └── Need expert opinion fast? → Heuristic Evaluation
```
Sample Size Calculator
Qualitative (interviews, usability):
- 5 users find ~85% of usability issues (Nielsen)
- 8-12 for thematic saturation in interviews
- 15+ for diverse populations or complex domains
- Rule: keep going until you hear the same things 3x
Quantitative (surveys):
| Population | 90% Confidence ±5% | 95% Confidence ±5% | 99% Confidence ±5% |
|---|---|---|---|
| 100 | 74 | 80 | 87 |
| 500 | 176 | 217 | 285 |
| 1,000 | 214 | 278 | 399 |
| 10,000 | 264 | 370 | 622 |
| 100,000+ | 271 | 384 | 660 |
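The table above follows the standard proportion-estimate formula with a finite population correction. A minimal Python sketch (the function name is illustrative) that reproduces the 95% confidence ±5% column to within a point of rounding:

```python
import math

def survey_sample_size(population: int, z: float = 1.96,
                       margin: float = 0.05, p: float = 0.5) -> int:
    """Responses needed to estimate a proportion.

    n0 = z^2 * p(1-p) / margin^2, then a finite population
    correction. p=0.5 is the conservative (largest-variance) choice.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))

for pop in (100, 500, 1000, 10000, 100000):
    print(pop, survey_sample_size(pop))  # 80, 218, 278, 370, 383
```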
A/B Tests:
- MDE (Minimum Detectable Effect) drives sample size
- 5% MDE, 80% power, 95% confidence → ~1,600 per variant (derivation sketched below)
- 2% MDE → ~10,000 per variant
- Always run for full business cycles (min 1 week)
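Those per-variant figures follow from the standard two-proportion power calculation. A minimal sketch under the conservative assumption of a 50% baseline rate (function name is illustrative):

```python
import math

def ab_sample_size(mde: float, baseline: float = 0.5,
                   z_alpha: float = 1.96, z_beta: float = 0.84) -> int:
    """Per-variant sample size for detecting an absolute lift of `mde`.

    z_alpha=1.96 gives 95% confidence (two-sided); z_beta=0.84 gives
    80% power. baseline=0.5 maximizes variance, so this is worst-case.
    """
    variance = baseline * (1 - baseline)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * variance / mde ** 2)

print(ab_sample_size(0.05))  # 1568 — the ~1,600 rule of thumb
print(ab_sample_size(0.02))  # 9800 — the ~10,000 rule of thumb
```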
Phase 2: Participant Recruiting
Screener Template
```yaml
screener:
  title: "[Study Name] Participant Screener"
  target_profile:
    demographics:
      age_range: "[e.g., 25-45]"
      location: "[e.g., US-based]"
      language: "[e.g., English-fluent]"
    behavioral:
      product_usage: "[e.g., Uses [product] 3+ times/week]"
      experience_level: "[e.g., 1+ year with similar tools]"
      recent_activity: "[e.g., Made a purchase in last 30 days]"
    psychographic:
      decision_maker: "[e.g., Primary household purchaser]"
      tech_comfort: "[e.g., Comfortable with mobile apps]"
  screening_questions:
    - question: "How often do you use [product category]?"
      type: "single-select"
      options: ["Daily", "Weekly", "Monthly", "Rarely", "Never"]
      qualify: ["Daily", "Weekly"]
      disqualify: ["Never"]
    - question: "Which of these tools do you currently use?"
      type: "multi-select"
      options: ["Tool A", "Tool B", "Tool C", "None"]
      qualify_min: 1
    - question: "What is your primary role?"
      type: "single-select"
      options: ["Developer", "Designer", "PM", "Marketing", "Other"]
      qualify: ["Developer", "Designer", "PM"]
    - question: "Have you participated in a UX study in the last 6 months?"
      type: "single-select"
      options: ["Yes", "No"]
      disqualify: ["Yes"]  # Avoid professional participants
  anti_patterns:
    - "Works at a competitor or in UX research"
    - "Family/friends of team members"
    - "Participated in study for this product before"
  incentive: "$75 for 60-min session"
  recruiting_channels:
    - channel: "Existing user database"
      quality: "★★★★★"
      cost: "Free"
    - channel: "UserTesting.com / UserInterviews.com"
      quality: "★★★★"
      cost: "$50-150/participant"
    - channel: "Social media recruitment"
      quality: "★★★"
      cost: "Free-$$"
    - channel: "Craigslist / local posting"
      quality: "★★"
      cost: "$"
```
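Applying a screener like this is mechanical. The sketch below is a hypothetical evaluator, assuming answers are keyed by question text and using the qualify / disqualify / qualify_min fields from the YAML above:

```python
def screen(answers: dict, questions: list) -> bool:
    """True if a respondent qualifies. `questions` mirrors the
    screening_questions list above; `answers` maps question text
    to the selected option (or a list of options for multi-select)."""
    for q in questions:
        raw = answers.get(q["question"])
        picked = raw if isinstance(raw, list) else [raw]
        # Any disqualifying answer rejects immediately
        if any(a in q.get("disqualify", []) for a in picked):
            return False
        # Single-select qualify: answer must be on the qualify list
        if "qualify" in q and not any(a in q["qualify"] for a in picked):
            return False
        # Multi-select: need at least qualify_min real selections
        if "qualify_min" in q:
            if len([a for a in picked if a and a != "None"]) < q["qualify_min"]:
                return False
    return True
```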
Recruiting Quality Checklist
- Screener doesn't lead (no "right" answers obvious)
- Mix of demographics within target segment
- No more than 20% from single recruiting source
- At least 1 "edge case" participant (power user, new user, accessibility needs)
- Over-recruit by 20% for no-shows
- Consent form prepared and sent in advance
- Incentive delivery method confirmed
Phase 3: User Interviews
Interview Guide Template
```markdown
# Interview Guide: [Study Name]
Duration: 60 minutes
Moderator: [Name]

## Setup (5 min)
- Thank participant, confirm recording consent
- "There are no right or wrong answers — we're learning from YOUR experience"
- "Feel free to be critical — honest feedback helps us improve"
- "I didn't design this, so you won't hurt my feelings"

## Warm-Up (5 min)
- "Tell me about your role and what a typical day looks like"
- "How does [product area] fit into your work?"

## Core Questions (35 min)

### Context & Current Behavior
1. "Walk me through the last time you [did the task we're studying]"
   - Probe: "What happened next?"
   - Probe: "How did that make you feel?"
   - Probe: "What would you have preferred to happen?"
2. "What tools/methods do you currently use for [task]?"
   - Probe: "What do you like about that approach?"
   - Probe: "What frustrates you?"
   - Probe: "How long have you been doing it this way?"
3. "Can you show me how you typically [task]?" (if remote: screen share)

### Pain Points & Needs
4. "What's the hardest part about [task]?"
   - Probe: "How often does that happen?"
   - Probe: "What do you do when that happens?"
   - Probe: "How much time/money does that cost you?"
5. "If you could wave a magic wand and change one thing about [experience], what would it be?"
6. "Tell me about a time when [process] went really wrong"
   - Probe: "What was the impact?"
   - Probe: "How was it resolved?"

### Mental Models
7. "How would you explain [concept] to a colleague?"
8. "What do you expect to happen when you [action]?"
9. "Where would you look for [information/feature]?"

### Priorities & Trade-offs
10. "If you had to choose between [speed vs accuracy / ease vs power], which matters more? Why?"

## Concept Reaction (10 min) — if applicable
- Show prototype/concept
- "What's your first impression?"
- "What would you use this for?"
- "What's missing?"
- "Would this replace what you currently use? Why/why not?"

## Wrap-Up (5 min)
- "Is there anything else about [topic] we should know?"
- "Who else should we talk to about this?"
- Thank participant, confirm incentive delivery
```
Interview Quality Rules
- 80/20 rule: Participant talks 80%, you talk 20%
- Never ask "Would you use this?" — people can't predict future behavior
- Ask about past behavior, not hypothetical futures
- Follow the energy — when they get animated, dig deeper
- Silence is a tool — pause 5 seconds after they answer; they'll elaborate
- "Tell me more about that" — your most powerful phrase
- Watch for say/do gaps — note when claimed behavior contradicts observed behavior
- Record everything — audio minimum, video ideal, notes always
Note-Taking Template (Per Interview)
```yaml
participant:
  id: "P01"
  date: "YYYY-MM-DD"
  demographics: "[age, role, experience level]"
  session_duration: "58 min"
key_quotes:
  - quote: "[Exact words]"
    timestamp: "12:34"
    context: "[What prompted this]"
    theme: "[Emerging theme tag]"
observations:
  behaviors:
    - "[What they DID, not what they said]"
  emotions:
    - "[Frustration when..., delight when..., confusion at...]"
  workarounds:
    - "[Creative solutions they've built]"
pain_points:
  - pain: "[Specific problem]"
    severity: "[1-5]"
    frequency: "[daily/weekly/monthly/rarely]"
    current_solution: "[How they cope]"
needs:
  - need: "[Unmet need identified]"
    type: "[functional/emotional/social]"
    evidence: "[Quote or behavior that reveals this]"
surprises:
  - "[Anything unexpected — these are gold]"
moderator_notes:
  - "[Post-session reflection, what to adjust for next interview]"
```
Phase 4: Persona Building
Data-Driven Persona Template
```yaml
persona:
  name: "[Realistic name — not cutesy]"
  photo: "[Representative stock photo description]"
  archetype: "[1-3 word label, e.g., 'The Overwhelmed Manager']"
  demographics:
    age: "[Range or specific]"
    role: "[Job title / life stage]"
    experience: "[Years with product/domain]"
    tech_proficiency: "[Novice / Intermediate / Advanced / Expert]"
    environment: "[Office / remote / mobile / field]"
  # MOST IMPORTANT SECTION
  goals:
    primary: "[The #1 thing they're trying to accomplish]"
    secondary:
      - "[Supporting goal]"
      - "[Supporting goal]"
    underlying: "[The emotional/social need behind the functional goal]"
  frustrations:
    - frustration: "[Specific pain point]"
      frequency: "[How often — from research data]"
      severity: "[1-5]"
      current_workaround: "[What they do today]"
      evidence: "[P03, P07, P11 mentioned this]"
  behaviors:
    usage_pattern: "[When, where, how often they engage]"
    decision_process: "[How they evaluate options]"
    information_sources: "[Where they learn / get help]"
    social_influence: "[Who influences their decisions]"
  key_workflows:
    - "[Task 1 — frequency — duration]"
    - "[Task 2 — frequency — duration]"
  mental_models:
    - "[How they think about [concept] — often surprising]"
    - "[Vocabulary they use — not our jargon]"
  motivations:
    gains: "[What success looks like to them]"
    fears: "[What failure looks like]"
    triggers: "[What prompts them to act]"
    barriers: "[What stops them from acting]"
  quotes:
    - "\"[Real quote from research that captures this persona]\""
    - "\"[Another revealing quote]\""
  design_implications:
    must_have:
      - "[Feature/quality this persona absolutely needs]"
    should_have:
      - "[Important but not dealbreaker]"
    must_avoid:
      - "[Things that will drive this persona away]"
    communication_style: "[How to talk to this persona]"
  data_sources:
    interviews: "[# of participants who map to this persona]"
    survey_segment: "[% of survey respondents]"
    analytics_cohort: "[Behavioral data that identifies this group]"
```
Persona Validation Checklist
- Based on real research data, not assumptions
- Represents a meaningful segment (not 1 outlier)
- Goals are specific enough to design for
- Frustrations include frequency + severity (not just a list)
- Contains at least 2 real quotes
- Design implications are actionable
- Reviewed with 3+ stakeholders
- Cross-checked against analytics data
- Does NOT describe everyone (a good persona excludes people)
Anti-Personas (Who We're NOT Designing For)
```yaml
anti_persona:
  name: "[Label]"
  description: "[Who this is]"
  why_excluded: "[Business reason — too small a segment, wrong market, etc.]"
  risk_if_included: "[What happens to the product if we try to serve them too]"
```
Phase 5: Journey Mapping
Journey Map Template
```yaml
journey_map:
  title: "[Persona] — [Goal/Scenario]"
  persona: "[Which persona]"
  scenario: "[Specific situation triggering this journey]"
  stages:
    - stage: "1. Awareness / Trigger"
      duration: "[Time in this stage]"
      goals: "[What they want to accomplish]"
      actions:
        - "[Step they take]"
        - "[Step they take]"
      touchpoints:
        - "[Where they interact — website, app, email, phone, in-person]"
      thoughts:
        - "\"[What they're thinking — from research]\""
      emotions:
        rating: 3  # 1=frustrated, 3=neutral, 5=delighted
        feeling: "[Curious but uncertain]"
      pain_points:
        - "[Problem encountered]"
      opportunities:
        - "[How we could improve this moment]"
    - stage: "2. Consideration / Research"
      # ... same structure
    - stage: "3. Decision / Sign-Up"
      # ... same structure
    - stage: "4. Onboarding / First Use"
      # ... same structure
    - stage: "5. Regular Use / Value Realization"
      # ... same structure
    - stage: "6. Expansion / Advocacy (or Churn)"
      # ... same structure
  moments_of_truth:
    - moment: "[Critical make-or-break interaction]"
      stage: "[Which stage]"
      current_experience: "[What happens now — score 1-5]"
      desired_experience: "[What should happen — score 1-5]"
      gap: "[Difference = priority]"
  service_blueprint_layer:  # Optional — behind-the-scenes
    - stage: "[Stage name]"
      frontstage: "[What user sees]"
      backstage: "[What team does]"
      support_systems: "[Tools/processes involved]"
      failure_points: "[Where things break down]"
```
Emotion Curve Scoring
Plot emotions across the journey:
```
5 ★ Delighted  ──────────╮          ╭──
4 ☺ Happy                │          │
3 😐 Neutral   ──╮       │    ╭─────╯
2 😟 Frustrated  │       │    │
1 😤 Angry       ╰───────╯────╯
                Stage1  Stage2  Stage3  Stage4  Stage5
```
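If per-stage ratings are captured as numbers (1-5, as in the journey map YAML), the curve can be generated instead of hand-drawn. A rough sketch; the stage keys and layout are illustrative:

```python
def emotion_curve(scores: dict) -> str:
    """Render per-stage emotion ratings (1-5) as a simple text chart."""
    labels = {5: "Delighted", 4: "Happy", 3: "Neutral",
              2: "Frustrated", 1: "Angry"}
    rows = []
    for level in range(5, 0, -1):  # top row = delighted
        cells = ["  *   " if s == level else "      "
                 for s in scores.values()]
        rows.append(f"{level} {labels[level]:<11}" + "".join(cells))
    rows.append(" " * 13 + "".join(f"{k:^6}" for k in scores))
    return "\n".join(rows)

print(emotion_curve({"S1": 3, "S2": 1, "S3": 1, "S4": 3, "S5": 5}))
```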
Journey Map Quality Rules
- Based on research, not assumptions (note data source for each insight)
- One persona per map (don't average)
- Include BOTH functional and emotional dimensions
- Identify "moments of truth" — the 2-3 interactions that make or break the experience
- Prioritize opportunities by gap size (desired minus current)
- Include backstage/blueprint layer for service design
Phase 6: Usability Testing
Test Plan Template
```yaml
usability_test:
  study_name: "[Name]"
  objective: "[What design question are we answering?]"
  format:
    type: "[Moderated / Unmoderated]"
    location: "[Remote / In-person / Lab]"
    device: "[Desktop / Mobile / Tablet / Cross-device]"
    duration: "60 min"
    recording: "[Screen + audio + face camera]"
  prototype:
    fidelity: "[Paper / Wireframe / Hi-fi / Live product]"
    tool: "[Figma / InVision / Live URL]"
    scope: "[Which flows are testable]"
    known_limitations: "[What won't work in the prototype]"
  participants:
    target: 5-8
    criteria: "[From screener — link to Phase 2]"
    incentive: "$75"
  tasks:
    - task_id: "T1"
      scenario: "You need to [context]. Using this app, [goal]."
      success_criteria:
        - "[Specific completion definition]"
      time_limit: "5 min"
      priority: "critical"  # critical / important / nice-to-know
      metrics:
        - completion_rate
        - time_on_task
        - error_count
        - satisfaction_rating
    - task_id: "T2"
      scenario: "[Next task...]"
      # ... same structure
  post_task_questions:
    - "On a scale of 1-7, how easy was that? (SEQ)"
    - "What did you expect to happen when you [action]?"
    - "Was anything confusing?"
  post_test_questions:
    - "SUS (System Usability Scale) — 10 questions"
    - "What was the easiest part?"
    - "What was the most frustrating part?"
    - "Would you use this? Why/why not?"
    - "What's missing?"
```
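The plan above ends with a SUS questionnaire; scoring it follows the standard published formula (odd items contribute response − 1, even items contribute 5 − response, and the sum is multiplied by 2.5). A minimal sketch:

```python
def sus_score(responses: list) -> float:
    """SUS score (0-100) from ten 1-5 Likert responses, in order."""
    assert len(responses) == 10
    total = sum((r - 1) if i % 2 == 0 else (5 - r)  # i=0 is item 1 (odd)
                for i, r in enumerate(responses))
    return total * 2.5

print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 2]))  # 80.0
```

A score of 68 is commonly cited as the SUS average, which is why the impact table in Phase 10 targets pre/post deltas rather than absolute values.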
Task Writing Rules
- Set the scene — give context, not instructions ("You want to book a flight to NYC next Friday" NOT "Click the search button")
- Don't use interface words — say "find" not "navigate to," say "purchase" not "add to cart and checkout"
- Make it realistic — use scenarios from actual research data
- One goal per task — don't combine ("book a flight AND a hotel")
- Order: easy → hard — build confidence before complex tasks
Severity Rating Scale
| Severity | Label | Definition | Action |
|---|---|---|---|
| 0 | Not a problem | Disagreement among evaluators, no real issue | None |
| 1 | Cosmetic | Noticed but doesn't affect task completion | Fix if time allows |
| 2 | Minor | Causes hesitation or minor inefficiency | Schedule fix |
| 3 | Major | Causes significant difficulty, workarounds needed | Fix before launch |
| 4 | Catastrophic | Prevents task completion entirely | Fix immediately |
Usability Finding Template
```yaml
finding:
  id: "UF-001"
  title: "[Short descriptive title]"
  severity: 3  # 0-4
  frequency: "4/5 participants"
  task: "T2"
  observation: "[What happened — factual, behavioral]"
  evidence:
    - participant: "P01"
      behavior: "[What they did]"
      quote: "\"[What they said]\""
      timestamp: "14:22"
    - participant: "P03"
      behavior: "[What they did]"
  root_cause: "[Why this happened — mental model mismatch, visibility, feedback, etc.]"
  recommendation:
    change: "[Specific design change]"
    rationale: "[Why this will fix it]"
    effort: "[S/M/L]"
    impact: "[High/Medium/Low]"
  heuristic_violated: "[Which Nielsen heuristic, if applicable]"
```
Nielsen's 10 Heuristics (Quick Reference)
| # | Heuristic | What to Check |
|---|---|---|
| 1 | Visibility of system status | Loading indicators, progress bars, confirmation messages |
| 2 | Match real world | Labels match user language, not internal jargon |
| 3 | User control & freedom | Undo, back, cancel, exit are easy to find |
| 4 | Consistency & standards | Same action = same result everywhere |
| 5 | Error prevention | Confirmations, constraints, smart defaults |
| 6 | Recognition > recall | Options visible, not memorized |
| 7 | Flexibility & efficiency | Shortcuts for experts, simple for novices |
| 8 | Aesthetic & minimalist | No unnecessary information competing for attention |
| 9 | Error recovery | Clear error messages with solutions, not codes |
| 10 | Help & documentation | Searchable, task-focused, concise |
Heuristic Evaluation Scorecard
Rate each heuristic 1-5 per screen/flow:
```yaml
heuristic_audit:
  screen: "[Screen/Flow name]"
  evaluator: "[Name]"
  date: "YYYY-MM-DD"
  scores:
    visibility_of_status: 4
    real_world_match: 3
    user_control: 2
    consistency: 4
    error_prevention: 3
    recognition_over_recall: 4
    flexibility_efficiency: 2
    aesthetic_minimal: 3
    error_recovery: 1
    help_documentation: 2
  total: 28  # out of 50
  grade: "C"  # A=45+, B=38+, C=28+, D=20+, F=<20
  critical_issues:
    - heuristic: "Error recovery"
      location: "[Where]"
      issue: "[What's wrong]"
      fix: "[Recommendation]"
```
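Totaling and grading the scorecard is mechanical. A minimal sketch using the cutoffs quoted above (A=45+, B=38+, C=28+, D=20+); the function name is illustrative:

```python
def heuristic_grade(scores: dict) -> tuple:
    """Sum ten 1-5 heuristic scores (max 50) and map to a letter grade."""
    total = sum(scores.values())
    for cutoff, grade in ((45, "A"), (38, "B"), (28, "C"), (20, "D")):
        if total >= cutoff:
            return total, grade
    return total, "F"

# The example audit above: total 28 -> grade C
print(heuristic_grade({
    "visibility_of_status": 4, "real_world_match": 3, "user_control": 2,
    "consistency": 4, "error_prevention": 3, "recognition_over_recall": 4,
    "flexibility_efficiency": 2, "aesthetic_minimal": 3,
    "error_recovery": 1, "help_documentation": 2}))  # (28, 'C')
```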
Phase 7: Research Synthesis
Affinity Mapping Process
- Extract: Pull every observation, quote, behavior onto individual notes
- Cluster: Group similar notes (bottom-up, not top-down)
- Label: Name each cluster with a theme (use participant language)
- Hierarchy: Group clusters into meta-themes
- Prioritize: Rank by frequency × impact (scoring sketch below)
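Step 5 can be made concrete with a simple score. A sketch over hypothetical theme records, using the share of participants as frequency and a 1-3 impact scale:

```python
# Hypothetical themes from a 12-participant study
themes = [
    {"name": "Settings hard to find", "frequency": 7 / 12, "impact": 3},
    {"name": "Jargon in labels",      "frequency": 5 / 12, "impact": 2},
    {"name": "Slow exports",          "frequency": 3 / 12, "impact": 3},
]
# Rank by frequency x impact, highest score first
for t in sorted(themes, key=lambda t: t["frequency"] * t["impact"],
                reverse=True):
    print(f'{t["frequency"] * t["impact"]:.2f}  {t["name"]}')
```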
Theme Template
```yaml
theme:
  name: "[Theme label — use participant language]"
  description: "[2-3 sentence summary]"
  evidence:
    participant_count: "8/12 participants"
    segments_affected: ["Persona A", "Persona B"]
    quotes:
      - participant: "P03"
        quote: "\"[Exact quote]\""
      - participant: "P07"
        quote: "\"[Exact quote]\""
    behaviors_observed:
      - "[What they did]"
      - "[Pattern across participants]"
    data_points:
      - "[Any quantitative support — survey %, analytics, etc.]"
  impact:
    on_users: "[How this affects their experience]"
    on_business: "[Revenue, retention, acquisition, support cost impact]"
    severity: "High"  # High / Medium / Low
  insight: "[The 'so what' — what does this mean for design?]"
  recommendations:
    - recommendation: "[Specific, actionable change]"
      effort: "M"
      impact: "High"
      confidence: "High"  # based on evidence strength
```
Insight Formula
Every insight must follow: Observation + Evidence + So What + Now What
"Users consistently [OBSERVATION] — seen in [X/Y participants, with supporting quotes]. This matters because [SO WHAT — impact on goals/business]. We should [NOW WHAT — specific recommendation]."
**Bad insight:** "Users found the navigation confusing"

**Good insight:** "7 of 12 participants couldn't find the settings page within 30 seconds. 4 looked in the profile menu, 2 used search, 1 gave up. This maps to 15% of support tickets ('How do I change my password'). Moving settings to the top-level nav and adding a search shortcut would reduce discovery time and cut related support volume."
Research Scoring Rubric (0-100)
| Dimension | Weight | Criteria |
|---|---|---|
| Methodology Rigor | 20% | Right method for question, adequate sample, proper recruiting |
| Data Quality | 15% | Rich observations, real quotes, behavioral evidence |
| Analysis Depth | 20% | Beyond surface themes, root causes identified, patterns across segments |
| Insight Actionability | 25% | Specific recommendations, effort/impact rated, prioritized |
| Presentation Clarity | 10% | Stakeholders can understand and act without explanation |
| Business Connection | 10% | Findings connected to business metrics and goals |
Scoring bands (a weighted-total sketch follows the list):
- 90-100: Publication-quality research
- 75-89: Strong actionable research
- 60-74: Adequate — some gaps in methodology or analysis
- 40-59: Weak — findings are surface-level or poorly supported
- Below 40: Redo — methodology flaws undermine findings
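A minimal sketch of the weighted total, assuming each dimension is itself rated 0-100 (the names and example ratings are illustrative):

```python
WEIGHTS = {"methodology_rigor": 0.20, "data_quality": 0.15,
           "analysis_depth": 0.20, "insight_actionability": 0.25,
           "presentation_clarity": 0.10, "business_connection": 0.10}

def research_score(ratings: dict) -> float:
    """Weighted 0-100 quality score per the rubric above."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    return sum(WEIGHTS[dim] * ratings[dim] for dim in WEIGHTS)

print(research_score({
    "methodology_rigor": 85, "data_quality": 70, "analysis_depth": 80,
    "insight_actionability": 90, "presentation_clarity": 75,
    "business_connection": 60}))  # 79.5 -> "Strong actionable research"
```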
Phase 8: Research Report
Executive Summary Template
```markdown
# [Study Name] — Research Report

## TL;DR (3 bullet max)
- [Most important finding + recommendation]
- [Second most important finding + recommendation]
- [Third most important finding + recommendation]

## Study Overview
- **Method:** [e.g., 12 semi-structured interviews + 5 usability tests]
- **Participants:** [e.g., 12 mid-market SaaS PMs, 2-8 years experience]
- **Duration:** [e.g., 3 weeks, Jan 5-26 2026]
- **Confidence:** [High / Medium / Low — based on sample + methodology]

## Key Findings

### Finding 1: [Title] ⚠️ [Severity: Critical/High/Medium/Low]
**What we found:** [2-3 sentences with evidence]
**Why it matters:** [Business impact]
**Recommendation:** [Specific action]
**Effort:** [S/M/L] | **Impact:** [High/Med/Low]

### Finding 2: [Title]
...

## Personas Updated
[Link to updated persona YAML files]

## Journey Map
[Link to journey map]

## Design Recommendations (Prioritized)
| # | Recommendation | Finding | Effort | Impact | Priority |
|---|---------------|---------|--------|--------|----------|
| 1 | [Action] | F1 | S | High | P0 — Do now |
| 2 | [Action] | F3 | M | High | P1 — Next sprint |
| 3 | [Action] | F2 | L | Medium | P2 — Backlog |

## What We Still Don't Know
- [Open questions for future research]
- [Hypotheses to validate]

## Appendix
- Screener criteria
- Interview guide
- Raw data location
- Participant demographics
```
Phase 9: Design Validation
Design Critique Framework (CAMPS)
| Dimension | Questions to Ask |
|---|---|
| Clarity | Can users understand what this is and what to do within 5 seconds? |
| Alignment | Does this solve the problem identified in research? For the right persona? |
| Mental Model | Does it match how users think about this task? (from interview data) |
| Priority | Does the visual hierarchy match user task priority? |
| Simplicity | Can anything be removed without losing function? |
Prototype Review Checklist
```yaml
design_review:
  screen: "[Screen name]"
  reviewer: "[Name]"
  date: "YYYY-MM-DD"
  research_alignment:
    - check: "Addresses top pain point from research"
      status: "✅ / ❌ / ⚠️"
      notes: "[Which finding this addresses]"
    - check: "Uses language from user interviews (not internal jargon)"
      status: "✅ / ❌ / ⚠️"
    - check: "Matches mental model revealed in research"
      status: "✅ / ❌ / ⚠️"
    - check: "Works for primary persona AND doesn't break for secondary"
      status: "✅ / ❌ / ⚠️"
  usability:
    - check: "Primary action is visually dominant"
      status: "✅ / ❌ / ⚠️"
    - check: "Error states designed and messaged"
      status: "✅ / ❌ / ⚠️"
    - check: "Empty states designed (first use, no data, no results)"
      status: "✅ / ❌ / ⚠️"
    - check: "Loading states designed"
      status: "✅ / ❌ / ⚠️"
    - check: "Edge cases handled (long text, missing data, permissions)"
      status: "✅ / ❌ / ⚠️"
  accessibility:
    - check: "Color contrast meets WCAG AA (4.5:1 text, 3:1 UI)"
      status: "✅ / ❌ / ⚠️"
    - check: "Touch targets ≥44px"
      status: "✅ / ❌ / ⚠️"
    - check: "Information not conveyed by color alone"
      status: "✅ / ❌ / ⚠️"
    - check: "Logical reading/tab order"
      status: "✅ / ❌ / ⚠️"
    - check: "Alt text for meaningful images"
      status: "✅ / ❌ / ⚠️"
  overall_score: "[1-5]"
  ship_decision: "Ready / Needs changes / Needs testing / Needs research"
```
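The contrast check in the accessibility list is fully mechanical. A minimal sketch of the WCAG 2.x relative-luminance and contrast-ratio formulas for 8-bit sRGB colors:

```python
def _luminance(rgb: tuple) -> float:
    """WCAG relative luminance from an (r, g, b) tuple of 0-255 ints."""
    def lin(c):
        c /= 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (lin(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: tuple, bg: tuple) -> float:
    """WCAG contrast ratio; AA needs >= 4.5 for body text, >= 3.0 for UI."""
    hi, lo = sorted((_luminance(fg), _luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)

print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 2))        # 21.0
print(round(contrast_ratio((118, 118, 118), (255, 255, 255)), 2))  # ~4.54
```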
Phase 10: Research Operations
Research Repository Structure
```
research/
├── YYYY/
│   ├── Q1/
│   │   ├── [study-name]/
│   │   │   ├── plan.yaml           # Research brief
│   │   │   ├── screener.yaml       # Recruiting criteria
│   │   │   ├── guide.md            # Interview/test guide
│   │   │   ├── notes/              # Per-participant notes
│   │   │   │   ├── P01.yaml
│   │   │   │   └── P02.yaml
│   │   │   ├── synthesis/          # Themes, affinity maps
│   │   │   ├── personas/           # Updated personas
│   │   │   ├── journey-maps/       # Updated maps
│   │   │   ├── report.md           # Final report
│   │   │   └── recordings/         # Session recordings (link)
│   │   └── [next-study]/
│   └── Q2/
├── personas/                       # Master persona library
│   ├── persona-a.yaml
│   └── persona-b.yaml
├── journey-maps/                   # Master journey maps
├── insights-database.yaml          # Cross-study insight tracker
└── research-calendar.yaml          # Planned studies
```
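The "Set up research repository" command boils down to directory creation. A hypothetical scaffolding sketch for the layout above (the paths and example study name are illustrative):

```python
from pathlib import Path

STUDY_DIRS = ("notes", "synthesis", "personas", "journey-maps", "recordings")
STUDY_FILES = ("plan.yaml", "screener.yaml", "guide.md", "report.md")

def scaffold(root: str, year: str, quarter: str, study: str) -> None:
    """Create one study's folder tree plus the shared top-level files."""
    base = Path(root) / year / quarter / study
    for d in STUDY_DIRS:
        (base / d).mkdir(parents=True, exist_ok=True)
    for f in STUDY_FILES:
        (base / f).touch()
    for shared in ("personas", "journey-maps"):
        (Path(root) / shared).mkdir(exist_ok=True)
    for f in ("insights-database.yaml", "research-calendar.yaml"):
        (Path(root) / f).touch()

scaffold("research", "2026", "Q1", "onboarding-study")
```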
Cross-Study Insight Tracker
```yaml
insights_database:
  - insight_id: "INS-001"
    theme: "[Category]"
    insight: "[The insight]"
    first_found: "2026-01-15"
    studies: ["Study A", "Study C", "Study F"]
    evidence_strength: "Strong"  # 3+ studies
    status: "Addressed"  # Open / In Progress / Addressed / Won't Fix
    design_response: "[What was done]"
    impact_measured: "[Before/after metric if available]"
```
Research Impact Tracking
| Metric | How to Measure | Target |
|---|---|---|
| Findings → shipped features | % of recommendations implemented within 2 quarters | >60% |
| Pre/post usability scores | SUS score before vs after changes | +10 points |
| Support ticket reduction | Related ticket volume after design change | -25% |
| Task completion rate | Usability test success rate over time | >85% |
| Time on task | Average task time trend | Decreasing |
| Stakeholder confidence | Post-study survey: "How useful was this?" | >4/5 |
Quick Commands
| Command | What It Does |
|---|---|
| "Plan a research study for [topic]" | Generate research brief YAML |
| "Build a screener for [audience]" | Generate screening questionnaire |
| "Create interview guide for [topic]" | Generate interview questions and structure |
| "Build persona from [data/notes]" | Synthesize data into persona YAML |
| "Map the journey for [persona + goal]" | Generate journey map |
| "Plan usability test for [prototype]" | Generate test plan with tasks |
| "Run heuristic evaluation of [screen/flow]" | Score against Nielsen's 10 |
| "Synthesize findings from [study]" | Generate themes and insights |
| "Write research report for [study]" | Generate executive summary and recommendations |
| "Score this research [report/study]" | Evaluate against quality rubric |
| "Review this design against research" | CAMPS critique + alignment check |
| "Set up research repository" | Create folder structure and templates |
Edge Cases
Small Budget / No Recruiting Budget
- Guerrilla testing: coffee shop intercepts (5 min tests, buy them a coffee)
- Internal users: use colleagues from different departments (not product/design team)
- Social media: post in relevant communities for volunteers
- Existing users: email opt-in for research panel
Remote-Only Research
- Video call with screen share (Zoom, Google Meet)
- Async: Loom recordings of tasks + written responses
- Unmoderated: UserTesting.com, Maze, Lookback
- Diary studies: use messaging apps (WhatsApp, Telegram) for daily check-ins
Stakeholder Pushback ("We don't have time for research")
- "5 users, 1 week, 3 critical findings" — the minimum viable study
- Pair research with existing touchpoints (support calls, sales demos)
- Frame as risk reduction: "Would you rather discover this before or after launch?"
- Show past research ROI (support ticket reduction, conversion improvement)
Conflicting Findings
- Check sample composition — different segments may have different needs
- Prioritize by business impact: which segment is more valuable?
- Run a survey to quantify: "60% prefer A, 40% prefer B"
- Consider designing for both (progressive disclosure, personalization)
International / Cross-Cultural Research
- Don't just translate — localize scenarios and contexts
- Account for cultural response bias (e.g., reluctance to criticize in some cultures)
- Use local moderators when possible
- Adjust incentives to local norms
- Watch for design patterns that don't transfer (icons, colors, reading direction)
Accessibility Research
- Recruit participants with disabilities (screen reader users, motor impairments, cognitive differences)
- Test with actual assistive technology, not simulation
- Include in regular studies (at least 1 participant with accessibility needs per study)
- WCAG compliance testing is NOT a substitute for research with disabled users
Built by AfrexAI — Autonomous Intelligence for Business