afrexai-ux-research-engine
Complete UX Research & Design system — user discovery, persona building, journey mapping, usability testing, research synthesis, and design validation. Zero dependencies.
```bash
# Clone the full skills repo...
git clone https://github.com/openclaw/skills

# ...or install only this skill into ~/.claude/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/1kalin/afrexai-ux-research-engine" ~/.claude/skills/clawdbot-skills-afrexai-ux-research-engine && rm -rf "$T"
```
`skills/1kalin/afrexai-ux-research-engine/SKILL.md`

UX Research Engine ⚡
Complete UX research methodology — from discovery to validated design decisions. No scripts, no APIs, no dependencies. Pure agent skill.
Phase 1: Research Planning
Research Brief YAML
project: "[Product/Feature Name]" research_question: "[What do we need to learn?]" business_context: objective: "[Business goal this research supports]" decision: "[What decision will this research inform?]" stakeholders: ["PM", "Design Lead", "Engineering"] deadline: "YYYY-MM-DD" scope: product_area: "[Feature/flow being studied]" user_segment: "[Who are we studying?]" geographic: "[Regions/markets]" methodology: "[See selection matrix below]" sample_size: "[See calculator below]" timeline: planning: "Week 1" recruiting: "Week 1-2" fieldwork: "Week 2-3" analysis: "Week 3-4" reporting: "Week 4" budget: participant_incentives: "$X" tools: "$X" total: "$X" success_criteria: - "[Specific insight we need]" - "[Confidence level required]" - "[Actionable output format]"
Method Selection Matrix
| Method | Best For | Sample Size | Time | Cost | Confidence |
|---|---|---|---|---|---|
| User Interviews | Deep "why" understanding, exploring unknowns | 5-15 | 2-4 weeks | $$ | High (qualitative) |
| Usability Testing | Finding interaction problems, validating flows | 5-8 per round | 1-2 weeks | $$ | High (behavioral) |
| Surveys | Quantifying attitudes, measuring satisfaction | 100-400+ | 1-2 weeks | $ | High (statistical) |
| Card Sorting | Information architecture, navigation labels | 15-30 (open), 30+ (closed) | 1 week | $ | Medium |
| Diary Studies | Long-term behavior, context of use | 10-15 | 2-6 weeks | $$$ | High (longitudinal) |
| A/B Testing | Comparing specific design variants | 1000+ per variant | 1-4 weeks | $ | Very High |
| Contextual Inquiry | Understanding real environment, workflows | 4-8 | 2-3 weeks | $$$ | Very High |
| Tree Testing | Validating IA without visual design | 50+ | 1 week | $ | High |
| First-Click Testing | Navigation effectiveness | 30-50 | 1 week | $ | Medium |
| Concept Testing | Early-stage idea validation | 8-15 | 1-2 weeks | $$ | Medium |
| Heuristic Evaluation | Expert review of existing UI | 3-5 evaluators | 2-3 days | $ | Medium |
| Competitive UX Audit | Understanding market standards | N/A | 1 week | $ | Low-Medium |
Decision Tree: Which Method?
```
Do you know WHAT the problem is?
├── NO → Generative Research
│   ├── Need context? → Contextual Inquiry
│   ├── Need attitudes? → User Interviews
│   ├── Need behaviors over time? → Diary Study
│   └── Need broad patterns? → Survey (exploratory)
│
└── YES → Evaluative Research
    ├── Have a prototype/product?
    │   ├── YES → Usability Testing
    │   │   ├── Early concept → Concept Test (paper/low-fi)
    │   │   ├── Key flow → Task-based Usability Test
    │   │   └── Comparing options → A/B Test
    │   └── NO →
    │       ├── Testing IA → Card Sort / Tree Test
    │       └── Testing content → First-Click Test
    └── Need expert opinion fast? → Heuristic Evaluation
```
Sample Size Calculator
Qualitative (interviews, usability):
- 5 users find ~85% of usability issues (Nielsen)
- 8-12 for thematic saturation in interviews
- 15+ for diverse populations or complex domains
- Rule: keep going until you hear the same things 3x
Quantitative (surveys):
| Population | 90% Confidence ±5% | 95% Confidence ±5% | 99% Confidence ±5% |
|---|---|---|---|
| 100 | 74 | 80 | 87 |
| 500 | 176 | 217 | 285 |
| 1,000 | 214 | 278 | 399 |
| 10,000 | 264 | 370 | 622 |
| 100,000+ | 271 | 384 | 660 |
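The table above follows the standard proportion-estimate formula with a finite population correction. A minimal Python sketch (the function name is illustrative) that reproduces the 95% confidence ±5% column to within a point of rounding:

```python
import math

def survey_sample_size(population: int, z: float = 1.96,
                       margin: float = 0.05, p: float = 0.5) -> int:
    """Responses needed to estimate a proportion.

    n0 = z^2 * p(1-p) / margin^2, then a finite population
    correction. p=0.5 is the conservative (largest-variance) choice.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))

for pop in (100, 500, 1000, 10000, 100000):
    print(pop, survey_sample_size(pop))  # 80, 218, 278, 370, 383
```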
A/B Tests:
- MDE (Minimum Detectable Effect) drives sample size
- 5% MDE, 80% power, 95% confidence → ~1,600 per variant (derivation sketched below)
- 2% MDE → ~10,000 per variant
- Always run for full business cycles (min 1 week)
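Those per-variant figures follow from the standard two-proportion power calculation. A minimal sketch under the conservative assumption of a 50% baseline rate (function name is illustrative):

```python
import math

def ab_sample_size(mde: float, baseline: float = 0.5,
                   z_alpha: float = 1.96, z_beta: float = 0.84) -> int:
    """Per-variant sample size for detecting an absolute lift of `mde`.

    z_alpha=1.96 gives 95% confidence (two-sided); z_beta=0.84 gives
    80% power. baseline=0.5 maximizes variance, so this is worst-case.
    """
    variance = baseline * (1 - baseline)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * variance / mde ** 2)

print(ab_sample_size(0.05))  # 1568 — the ~1,600 rule of thumb
print(ab_sample_size(0.02))  # 9800 — the ~10,000 rule of thumb
```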
Phase 2: Participant Recruiting
Screener Template
```yaml
screener:
  title: "[Study Name] Participant Screener"
  target_profile:
    demographics:
      age_range: "[e.g., 25-45]"
      location: "[e.g., US-based]"
      language: "[e.g., English-fluent]"
    behavioral:
      product_usage: "[e.g., Uses [product] 3+ times/week]"
      experience_level: "[e.g., 1+ year with similar tools]"
      recent_activity: "[e.g., Made a purchase in last 30 days]"
    psychographic:
      decision_maker: "[e.g., Primary household purchaser]"
      tech_comfort: "[e.g., Comfortable with mobile apps]"
  screening_questions:
    - question: "How often do you use [product category]?"
      type: "single-select"
      options: ["Daily", "Weekly", "Monthly", "Rarely", "Never"]
      qualify: ["Daily", "Weekly"]
      disqualify: ["Never"]
    - question: "Which of these tools do you currently use?"
      type: "multi-select"
      options: ["Tool A", "Tool B", "Tool C", "None"]
      qualify_min: 1
    - question: "What is your primary role?"
      type: "single-select"
      options: ["Developer", "Designer", "PM", "Marketing", "Other"]
      qualify: ["Developer", "Designer", "PM"]
    - question: "Have you participated in a UX study in the last 6 months?"
      type: "single-select"
      options: ["Yes", "No"]
      disqualify: ["Yes"]  # Avoid professional participants
  anti_patterns:
    - "Works at a competitor or in UX research"
    - "Family/friends of team members"
    - "Participated in study for this product before"
  incentive: "$75 for 60-min session"
  recruiting_channels:
    - channel: "Existing user database"
      quality: "★★★★★"
      cost: "Free"
    - channel: "UserTesting.com / UserInterviews.com"
      quality: "★★★★"
      cost: "$50-150/participant"
    - channel: "Social media recruitment"
      quality: "★★★"
      cost: "Free-$$"
    - channel: "Craigslist / local posting"
      quality: "★★"
      cost: "$"
```
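Applying a screener like this is mechanical. The sketch below is a hypothetical evaluator, assuming answers are keyed by question text and using the qualify / disqualify / qualify_min fields from the YAML above:

```python
def screen(answers: dict, questions: list) -> bool:
    """True if a respondent qualifies. `questions` mirrors the
    screening_questions list above; `answers` maps question text
    to the selected option (or a list of options for multi-select)."""
    for q in questions:
        raw = answers.get(q["question"])
        picked = raw if isinstance(raw, list) else [raw]
        # Any disqualifying answer rejects immediately
        if any(a in q.get("disqualify", []) for a in picked):
            return False
        # Single-select qualify: answer must be on the qualify list
        if "qualify" in q and not any(a in q["qualify"] for a in picked):
            return False
        # Multi-select: need at least qualify_min real selections
        if "qualify_min" in q:
            if len([a for a in picked if a and a != "None"]) < q["qualify_min"]:
                return False
    return True
```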
Recruiting Quality Checklist
- Screener doesn't lead (no "right" answers obvious)
- Mix of demographics within target segment
- No more than 20% from single recruiting source
- At least 1 "edge case" participant (power user, new user, accessibility needs)
- Over-recruit by 20% for no-shows
- Consent form prepared and sent in advance
- Incentive delivery method confirmed
Phase 3: User Interviews
Interview Guide Template
```markdown
# Interview Guide: [Study Name]
Duration: 60 minutes
Moderator: [Name]

## Setup (5 min)
- Thank participant, confirm recording consent
- "There are no right or wrong answers — we're learning from YOUR experience"
- "Feel free to be critical — honest feedback helps us improve"
- "I didn't design this, so you won't hurt my feelings"

## Warm-Up (5 min)
- "Tell me about your role and what a typical day looks like"
- "How does [product area] fit into your work?"

## Core Questions (35 min)

### Context & Current Behavior
1. "Walk me through the last time you [did the task we're studying]"
   - Probe: "What happened next?"
   - Probe: "How did that make you feel?"
   - Probe: "What would you have preferred to happen?"
2. "What tools/methods do you currently use for [task]?"
   - Probe: "What do you like about that approach?"
   - Probe: "What frustrates you?"
   - Probe: "How long have you been doing it this way?"
3. "Can you show me how you typically [task]?" (if remote: screen share)

### Pain Points & Needs
4. "What's the hardest part about [task]?"
   - Probe: "How often does that happen?"
   - Probe: "What do you do when that happens?"
   - Probe: "How much time/money does that cost you?"
5. "If you could wave a magic wand and change one thing about [experience], what would it be?"
6. "Tell me about a time when [process] went really wrong"
   - Probe: "What was the impact?"
   - Probe: "How was it resolved?"

### Mental Models
7. "How would you explain [concept] to a colleague?"
8. "What do you expect to happen when you [action]?"
9. "Where would you look for [information/feature]?"

### Priorities & Trade-offs
10. "If you had to choose between [speed vs accuracy / ease vs power], which matters more? Why?"

## Concept Reaction (10 min) — if applicable
- Show prototype/concept
- "What's your first impression?"
- "What would you use this for?"
- "What's missing?"
- "Would this replace what you currently use? Why/why not?"

## Wrap-Up (5 min)
- "Is there anything else about [topic] we should know?"
- "Who else should we talk to about this?"
- Thank participant, confirm incentive delivery
```
Interview Quality Rules
- 80/20 rule: Participant talks 80%, you talk 20%
- Never ask "Would you use this?" — people can't predict future behavior
- Ask about past behavior, not hypothetical futures
- Follow the energy — when they get animated, dig deeper
- Silence is a tool — pause 5 seconds after they answer; they'll elaborate
- "Tell me more about that" — your most powerful phrase
- Watch for say/do gaps — note when claimed behavior contradicts observed behavior
- Record everything — audio minimum, video ideal, notes always
Note-Taking Template (Per Interview)
```yaml
participant:
  id: "P01"
  date: "YYYY-MM-DD"
  demographics: "[age, role, experience level]"
  session_duration: "58 min"
key_quotes:
  - quote: "[Exact words]"
    timestamp: "12:34"
    context: "[What prompted this]"
    theme: "[Emerging theme tag]"
observations:
  behaviors:
    - "[What they DID, not what they said]"
  emotions:
    - "[Frustration when..., delight when..., confusion at...]"
  workarounds:
    - "[Creative solutions they've built]"
pain_points:
  - pain: "[Specific problem]"
    severity: "[1-5]"
    frequency: "[daily/weekly/monthly/rarely]"
    current_solution: "[How they cope]"
needs:
  - need: "[Unmet need identified]"
    type: "[functional/emotional/social]"
    evidence: "[Quote or behavior that reveals this]"
surprises:
  - "[Anything unexpected — these are gold]"
moderator_notes:
  - "[Post-session reflection, what to adjust for next interview]"
```
Phase 4: Persona Building
Data-Driven Persona Template
```yaml
persona:
  name: "[Realistic name — not cutesy]"
  photo: "[Representative stock photo description]"
  archetype: "[1-3 word label, e.g., 'The Overwhelmed Manager']"
  demographics:
    age: "[Range or specific]"
    role: "[Job title / life stage]"
    experience: "[Years with product/domain]"
    tech_proficiency: "[Novice / Intermediate / Advanced / Expert]"
    environment: "[Office / remote / mobile / field]"
  # MOST IMPORTANT SECTION
  goals:
    primary: "[The #1 thing they're trying to accomplish]"
    secondary:
      - "[Supporting goal]"
      - "[Supporting goal]"
    underlying: "[The emotional/social need behind the functional goal]"
  frustrations:
    - frustration: "[Specific pain point]"
      frequency: "[How often — from research data]"
      severity: "[1-5]"
      current_workaround: "[What they do today]"
      evidence: "[P03, P07, P11 mentioned this]"
  behaviors:
    usage_pattern: "[When, where, how often they engage]"
    decision_process: "[How they evaluate options]"
    information_sources: "[Where they learn / get help]"
    social_influence: "[Who influences their decisions]"
  key_workflows:
    - "[Task 1 — frequency — duration]"
    - "[Task 2 — frequency — duration]"
  mental_models:
    - "[How they think about [concept] — often surprising]"
    - "[Vocabulary they use — not our jargon]"
  motivations:
    gains: "[What success looks like to them]"
    fears: "[What failure looks like]"
    triggers: "[What prompts them to act]"
    barriers: "[What stops them from acting]"
  quotes:
    - "\"[Real quote from research that captures this persona]\""
    - "\"[Another revealing quote]\""
  design_implications:
    must_have:
      - "[Feature/quality this persona absolutely needs]"
    should_have:
      - "[Important but not dealbreaker]"
    must_avoid:
      - "[Things that will drive this persona away]"
    communication_style: "[How to talk to this persona]"
  data_sources:
    interviews: "[# of participants who map to this persona]"
    survey_segment: "[% of survey respondents]"
    analytics_cohort: "[Behavioral data that identifies this group]"
```
Persona Validation Checklist
- Based on real research data, not assumptions
- Represents a meaningful segment (not 1 outlier)
- Goals are specific enough to design for
- Frustrations include frequency + severity (not just a list)
- Contains at least 2 real quotes
- Design implications are actionable
- Reviewed with 3+ stakeholders
- Cross-checked against analytics data
- Does NOT describe everyone (a good persona excludes people)
Anti-Personas (Who We're NOT Designing For)
```yaml
anti_persona:
  name: "[Label]"
  description: "[Who this is]"
  why_excluded: "[Business reason — too small a segment, wrong market, etc.]"
  risk_if_included: "[What happens to the product if we try to serve them too]"
```
Phase 5: Journey Mapping
Journey Map Template
```yaml
journey_map:
  title: "[Persona] — [Goal/Scenario]"
  persona: "[Which persona]"
  scenario: "[Specific situation triggering this journey]"
  stages:
    - stage: "1. Awareness / Trigger"
      duration: "[Time in this stage]"
      goals: "[What they want to accomplish]"
      actions:
        - "[Step they take]"
        - "[Step they take]"
      touchpoints:
        - "[Where they interact — website, app, email, phone, in-person]"
      thoughts:
        - "\"[What they're thinking — from research]\""
      emotions:
        rating: 3  # 1=frustrated, 3=neutral, 5=delighted
        feeling: "[Curious but uncertain]"
      pain_points:
        - "[Problem encountered]"
      opportunities:
        - "[How we could improve this moment]"
    - stage: "2. Consideration / Research"
      # ... same structure
    - stage: "3. Decision / Sign-Up"
      # ... same structure
    - stage: "4. Onboarding / First Use"
      # ... same structure
    - stage: "5. Regular Use / Value Realization"
      # ... same structure
    - stage: "6. Expansion / Advocacy (or Churn)"
      # ... same structure
  moments_of_truth:
    - moment: "[Critical make-or-break interaction]"
      stage: "[Which stage]"
      current_experience: "[What happens now — score 1-5]"
      desired_experience: "[What should happen — score 1-5]"
      gap: "[Difference = priority]"
  service_blueprint_layer:  # Optional — behind-the-scenes
    - stage: "[Stage name]"
      frontstage: "[What user sees]"
      backstage: "[What team does]"
      support_systems: "[Tools/processes involved]"
      failure_points: "[Where things break down]"
```
Emotion Curve Scoring
Plot emotions across the journey:
```
5 ★ Delighted  ──────────╮          ╭──
4 ☺ Happy                │          │
3 😐 Neutral   ──╮       │    ╭─────╯
2 😟 Frustrated  │       │    │
1 😤 Angry       ╰───────╯────╯
                Stage1  Stage2  Stage3  Stage4  Stage5
```
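If per-stage ratings are captured as numbers (1-5, as in the journey map YAML), the curve can be generated instead of hand-drawn. A rough sketch; the stage keys and layout are illustrative:

```python
def emotion_curve(scores: dict) -> str:
    """Render per-stage emotion ratings (1-5) as a simple text chart."""
    labels = {5: "Delighted", 4: "Happy", 3: "Neutral",
              2: "Frustrated", 1: "Angry"}
    rows = []
    for level in range(5, 0, -1):  # top row = delighted
        cells = ["  *   " if s == level else "      "
                 for s in scores.values()]
        rows.append(f"{level} {labels[level]:<11}" + "".join(cells))
    rows.append(" " * 13 + "".join(f"{k:^6}" for k in scores))
    return "\n".join(rows)

print(emotion_curve({"S1": 3, "S2": 1, "S3": 1, "S4": 3, "S5": 5}))
```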
Journey Map Quality Rules
- Based on research, not assumptions (note data source for each insight)
- One persona per map (don't average)
- Include BOTH functional and emotional dimensions
- Identify "moments of truth" — the 2-3 interactions that make or break the experience
- Prioritize opportunities by gap size (desired minus current)
- Include backstage/blueprint layer for service design
Phase 6: Usability Testing
Test Plan Template
```yaml
usability_test:
  study_name: "[Name]"
  objective: "[What design question are we answering?]"
  format:
    type: "[Moderated / Unmoderated]"
    location: "[Remote / In-person / Lab]"
    device: "[Desktop / Mobile / Tablet / Cross-device]"
    duration: "60 min"
    recording: "[Screen + audio + face camera]"
  prototype:
    fidelity: "[Paper / Wireframe / Hi-fi / Live product]"
    tool: "[Figma / InVision / Live URL]"
    scope: "[Which flows are testable]"
    known_limitations: "[What won't work in the prototype]"
  participants:
    target: 5-8
    criteria: "[From screener — link to Phase 2]"
    incentive: "$75"
  tasks:
    - task_id: "T1"
      scenario: "You need to [context]. Using this app, [goal]."
      success_criteria:
        - "[Specific completion definition]"
      time_limit: "5 min"
      priority: "critical"  # critical / important / nice-to-know
      metrics:
        - completion_rate
        - time_on_task
        - error_count
        - satisfaction_rating
    - task_id: "T2"
      scenario: "[Next task...]"
      # ... same structure
  post_task_questions:
    - "On a scale of 1-7, how easy was that? (SEQ)"
    - "What did you expect to happen when you [action]?"
    - "Was anything confusing?"
  post_test_questions:
    - "SUS (System Usability Scale) — 10 questions"
    - "What was the easiest part?"
    - "What was the most frustrating part?"
    - "Would you use this? Why/why not?"
    - "What's missing?"
```
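The plan above ends with a SUS questionnaire; scoring it follows the standard published formula (odd items contribute response − 1, even items contribute 5 − response, and the sum is multiplied by 2.5). A minimal sketch:

```python
def sus_score(responses: list) -> float:
    """SUS score (0-100) from ten 1-5 Likert responses, in order."""
    assert len(responses) == 10
    total = sum((r - 1) if i % 2 == 0 else (5 - r)  # i=0 is item 1 (odd)
                for i, r in enumerate(responses))
    return total * 2.5

print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 2]))  # 80.0
```

A score of 68 is commonly cited as the SUS average, which is why the impact table in Phase 10 targets pre/post deltas rather than absolute values.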
Task Writing Rules
- Set the scene — give context, not instructions ("You want to book a flight to NYC next Friday" NOT "Click the search button")
- Don't use interface words — say "find" not "navigate to," say "purchase" not "add to cart and checkout"
- Make it realistic — use scenarios from actual research data
- One goal per task — don't combine ("book a flight AND a hotel")
- Order: easy → hard — build confidence before complex tasks
Severity Rating Scale
| Severity | Label | Definition | Action |
|---|---|---|---|
| 0 | Not a problem | Disagreement among evaluators, no real issue | None |
| 1 | Cosmetic | Noticed but doesn't affect task completion | Fix if time allows |
| 2 | Minor | Causes hesitation or minor inefficiency | Schedule fix |
| 3 | Major | Causes significant difficulty, workarounds needed | Fix before launch |
| 4 | Catastrophic | Prevents task completion entirely | Fix immediately |
Usability Finding Template
```yaml
finding:
  id: "UF-001"
  title: "[Short descriptive title]"
  severity: 3  # 0-4
  frequency: "4/5 participants"
  task: "T2"
  observation: "[What happened — factual, behavioral]"
  evidence:
    - participant: "P01"
      behavior: "[What they did]"
      quote: "\"[What they said]\""
      timestamp: "14:22"
    - participant: "P03"
      behavior: "[What they did]"
  root_cause: "[Why this happened — mental model mismatch, visibility, feedback, etc.]"
  recommendation:
    change: "[Specific design change]"
    rationale: "[Why this will fix it]"
    effort: "[S/M/L]"
    impact: "[High/Medium/Low]"
  heuristic_violated: "[Which Nielsen heuristic, if applicable]"
```
Nielsen's 10 Heuristics (Quick Reference)
| # | Heuristic | What to Check |
|---|---|---|
| 1 | Visibility of system status | Loading indicators, progress bars, confirmation messages |
| 2 | Match real world | Labels match user language, not internal jargon |
| 3 | User control & freedom | Undo, back, cancel, exit are easy to find |
| 4 | Consistency & standards | Same action = same result everywhere |
| 5 | Error prevention | Confirmations, constraints, smart defaults |
| 6 | Recognition > recall | Options visible, not memorized |
| 7 | Flexibility & efficiency | Shortcuts for experts, simple for novices |
| 8 | Aesthetic & minimalist | No unnecessary information competing for attention |
| 9 | Error recovery | Clear error messages with solutions, not codes |
| 10 | Help & documentation | Searchable, task-focused, concise |
Heuristic Evaluation Scorecard
Rate each heuristic 1-5 per screen/flow:
```yaml
heuristic_audit:
  screen: "[Screen/Flow name]"
  evaluator: "[Name]"
  date: "YYYY-MM-DD"
  scores:
    visibility_of_status: 4
    real_world_match: 3
    user_control: 2
    consistency: 4
    error_prevention: 3
    recognition_over_recall: 4
    flexibility_efficiency: 2
    aesthetic_minimal: 3
    error_recovery: 1
    help_documentation: 2
  total: 28  # out of 50
  grade: "C"  # A=45+, B=38+, C=28+, D=20+, F=<20
  critical_issues:
    - heuristic: "Error recovery"
      location: "[Where]"
      issue: "[What's wrong]"
      fix: "[Recommendation]"
```
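Totaling and grading the scorecard is mechanical. A minimal sketch using the cutoffs quoted above (A=45+, B=38+, C=28+, D=20+); the function name is illustrative:

```python
def heuristic_grade(scores: dict) -> tuple:
    """Sum ten 1-5 heuristic scores (max 50) and map to a letter grade."""
    total = sum(scores.values())
    for cutoff, grade in ((45, "A"), (38, "B"), (28, "C"), (20, "D")):
        if total >= cutoff:
            return total, grade
    return total, "F"

# The example audit above: total 28 -> grade C
print(heuristic_grade({
    "visibility_of_status": 4, "real_world_match": 3, "user_control": 2,
    "consistency": 4, "error_prevention": 3, "recognition_over_recall": 4,
    "flexibility_efficiency": 2, "aesthetic_minimal": 3,
    "error_recovery": 1, "help_documentation": 2}))  # (28, 'C')
```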
Phase 7: Research Synthesis
Affinity Mapping Process
- Extract: Pull every observation, quote, behavior onto individual notes
- Cluster: Group similar notes (bottom-up, not top-down)
- Label: Name each cluster with a theme (use participant language)
- Hierarchy: Group clusters into meta-themes
- Prioritize: Rank by frequency × impact (scoring sketch below)
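Step 5 can be made concrete with a simple score. A sketch over hypothetical theme records, using the share of participants as frequency and a 1-3 impact scale:

```python
# Hypothetical themes from a 12-participant study
themes = [
    {"name": "Settings hard to find", "frequency": 7 / 12, "impact": 3},
    {"name": "Jargon in labels",      "frequency": 5 / 12, "impact": 2},
    {"name": "Slow exports",          "frequency": 3 / 12, "impact": 3},
]
# Rank by frequency x impact, highest score first
for t in sorted(themes, key=lambda t: t["frequency"] * t["impact"],
                reverse=True):
    print(f'{t["frequency"] * t["impact"]:.2f}  {t["name"]}')
```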
Theme Template
```yaml
theme:
  name: "[Theme label — use participant language]"
  description: "[2-3 sentence summary]"
  evidence:
    participant_count: "8/12 participants"
    segments_affected: ["Persona A", "Persona B"]
    quotes:
      - participant: "P03"
        quote: "\"[Exact quote]\""
      - participant: "P07"
        quote: "\"[Exact quote]\""
    behaviors_observed:
      - "[What they did]"
      - "[Pattern across participants]"
    data_points:
      - "[Any quantitative support — survey %, analytics, etc.]"
  impact:
    on_users: "[How this affects their experience]"
    on_business: "[Revenue, retention, acquisition, support cost impact]"
    severity: "High"  # High / Medium / Low
  insight: "[The 'so what' — what does this mean for design?]"
  recommendations:
    - recommendation: "[Specific, actionable change]"
      effort: "M"
      impact: "High"
      confidence: "High"  # based on evidence strength
```
Insight Formula
Every insight must follow: Observation + Evidence + So What + Now What
"Users consistently [OBSERVATION] — seen in [X/Y participants, with supporting quotes]. This matters because [SO WHAT — impact on goals/business]. We should [NOW WHAT — specific recommendation]."
**Bad insight:** "Users found the navigation confusing"

**Good insight:** "7 of 12 participants couldn't find the settings page within 30 seconds. 4 looked in the profile menu, 2 used search, 1 gave up. This maps to 15% of support tickets ('How do I change my password'). Moving settings to the top-level nav and adding a search shortcut would reduce discovery time and cut related support volume."
Research Scoring Rubric (0-100)
| Dimension | Weight | Criteria |
|---|---|---|
| Methodology Rigor | 20% | Right method for question, adequate sample, proper recruiting |
| Data Quality | 15% | Rich observations, real quotes, behavioral evidence |
| Analysis Depth | 20% | Beyond surface themes, root causes identified, patterns across segments |
| Insight Actionability | 25% | Specific recommendations, effort/impact rated, prioritized |
| Presentation Clarity | 10% | Stakeholders can understand and act without explanation |
| Business Connection | 10% | Findings connected to business metrics and goals |
Scoring bands (a weighted-total sketch follows the list):
- 90-100: Publication-quality research
- 75-89: Strong actionable research
- 60-74: Adequate — some gaps in methodology or analysis
- 40-59: Weak — findings are surface-level or poorly supported
- Below 40: Redo — methodology flaws undermine findings
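A minimal sketch of the weighted total, assuming each dimension is itself rated 0-100 (the names and example ratings are illustrative):

```python
WEIGHTS = {"methodology_rigor": 0.20, "data_quality": 0.15,
           "analysis_depth": 0.20, "insight_actionability": 0.25,
           "presentation_clarity": 0.10, "business_connection": 0.10}

def research_score(ratings: dict) -> float:
    """Weighted 0-100 quality score per the rubric above."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
    return sum(WEIGHTS[dim] * ratings[dim] for dim in WEIGHTS)

print(research_score({
    "methodology_rigor": 85, "data_quality": 70, "analysis_depth": 80,
    "insight_actionability": 90, "presentation_clarity": 75,
    "business_connection": 60}))  # 79.5 -> "Strong actionable research"
```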
Phase 8: Research Report
Executive Summary Template
```markdown
# [Study Name] — Research Report

## TL;DR (3 bullet max)
- [Most important finding + recommendation]
- [Second most important finding + recommendation]
- [Third most important finding + recommendation]

## Study Overview
- **Method:** [e.g., 12 semi-structured interviews + 5 usability tests]
- **Participants:** [e.g., 12 mid-market SaaS PMs, 2-8 years experience]
- **Duration:** [e.g., 3 weeks, Jan 5-26 2026]
- **Confidence:** [High / Medium / Low — based on sample + methodology]

## Key Findings

### Finding 1: [Title] ⚠️ [Severity: Critical/High/Medium/Low]
**What we found:** [2-3 sentences with evidence]
**Why it matters:** [Business impact]
**Recommendation:** [Specific action]
**Effort:** [S/M/L] | **Impact:** [High/Med/Low]

### Finding 2: [Title]
...

## Personas Updated
[Link to updated persona YAML files]

## Journey Map
[Link to journey map]

## Design Recommendations (Prioritized)
| # | Recommendation | Finding | Effort | Impact | Priority |
|---|---------------|---------|--------|--------|----------|
| 1 | [Action] | F1 | S | High | P0 — Do now |
| 2 | [Action] | F3 | M | High | P1 — Next sprint |
| 3 | [Action] | F2 | L | Medium | P2 — Backlog |

## What We Still Don't Know
- [Open questions for future research]
- [Hypotheses to validate]

## Appendix
- Screener criteria
- Interview guide
- Raw data location
- Participant demographics
```
Phase 9: Design Validation
Design Critique Framework (CAMPS)
| Dimension | Questions to Ask |
|---|---|
| Clarity | Can users understand what this is and what to do within 5 seconds? |
| Alignment | Does this solve the problem identified in research? For the right persona? |
| Mental Model | Does it match how users think about this task? (from interview data) |
| Priority | Does the visual hierarchy match user task priority? |
| Simplicity | Can anything be removed without losing function? |
Prototype Review Checklist
```yaml
design_review:
  screen: "[Screen name]"
  reviewer: "[Name]"
  date: "YYYY-MM-DD"
  research_alignment:
    - check: "Addresses top pain point from research"
      status: "✅ / ❌ / ⚠️"
      notes: "[Which finding this addresses]"
    - check: "Uses language from user interviews (not internal jargon)"
      status: "✅ / ❌ / ⚠️"
    - check: "Matches mental model revealed in research"
      status: "✅ / ❌ / ⚠️"
    - check: "Works for primary persona AND doesn't break for secondary"
      status: "✅ / ❌ / ⚠️"
  usability:
    - check: "Primary action is visually dominant"
      status: "✅ / ❌ / ⚠️"
    - check: "Error states designed and messaged"
      status: "✅ / ❌ / ⚠️"
    - check: "Empty states designed (first use, no data, no results)"
      status: "✅ / ❌ / ⚠️"
    - check: "Loading states designed"
      status: "✅ / ❌ / ⚠️"
    - check: "Edge cases handled (long text, missing data, permissions)"
      status: "✅ / ❌ / ⚠️"
  accessibility:
    - check: "Color contrast meets WCAG AA (4.5:1 text, 3:1 UI)"
      status: "✅ / ❌ / ⚠️"
    - check: "Touch targets ≥44px"
      status: "✅ / ❌ / ⚠️"
    - check: "Information not conveyed by color alone"
      status: "✅ / ❌ / ⚠️"
    - check: "Logical reading/tab order"
      status: "✅ / ❌ / ⚠️"
    - check: "Alt text for meaningful images"
      status: "✅ / ❌ / ⚠️"
  overall_score: "[1-5]"
  ship_decision: "Ready / Needs changes / Needs testing / Needs research"
```
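The contrast check in the accessibility list is fully mechanical. A minimal sketch of the WCAG 2.x relative-luminance and contrast-ratio formulas for 8-bit sRGB colors:

```python
def _luminance(rgb: tuple) -> float:
    """WCAG relative luminance from an (r, g, b) tuple of 0-255 ints."""
    def lin(c):
        c /= 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (lin(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: tuple, bg: tuple) -> float:
    """WCAG contrast ratio; AA needs >= 4.5 for body text, >= 3.0 for UI."""
    hi, lo = sorted((_luminance(fg), _luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)

print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 2))        # 21.0
print(round(contrast_ratio((118, 118, 118), (255, 255, 255)), 2))  # ~4.54
```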
Phase 10: Research Operations
Research Repository Structure
```
research/
├── YYYY/
│   ├── Q1/
│   │   ├── [study-name]/
│   │   │   ├── plan.yaml           # Research brief
│   │   │   ├── screener.yaml       # Recruiting criteria
│   │   │   ├── guide.md            # Interview/test guide
│   │   │   ├── notes/              # Per-participant notes
│   │   │   │   ├── P01.yaml
│   │   │   │   └── P02.yaml
│   │   │   ├── synthesis/          # Themes, affinity maps
│   │   │   ├── personas/           # Updated personas
│   │   │   ├── journey-maps/       # Updated maps
│   │   │   ├── report.md           # Final report
│   │   │   └── recordings/         # Session recordings (link)
│   │   └── [next-study]/
│   └── Q2/
├── personas/                       # Master persona library
│   ├── persona-a.yaml
│   └── persona-b.yaml
├── journey-maps/                   # Master journey maps
├── insights-database.yaml          # Cross-study insight tracker
└── research-calendar.yaml          # Planned studies
```
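The "Set up research repository" command boils down to directory creation. A hypothetical scaffolding sketch for the layout above (the paths and example study name are illustrative):

```python
from pathlib import Path

STUDY_DIRS = ("notes", "synthesis", "personas", "journey-maps", "recordings")
STUDY_FILES = ("plan.yaml", "screener.yaml", "guide.md", "report.md")

def scaffold(root: str, year: str, quarter: str, study: str) -> None:
    """Create one study's folder tree plus the shared top-level files."""
    base = Path(root) / year / quarter / study
    for d in STUDY_DIRS:
        (base / d).mkdir(parents=True, exist_ok=True)
    for f in STUDY_FILES:
        (base / f).touch()
    for shared in ("personas", "journey-maps"):
        (Path(root) / shared).mkdir(exist_ok=True)
    for f in ("insights-database.yaml", "research-calendar.yaml"):
        (Path(root) / f).touch()

scaffold("research", "2026", "Q1", "onboarding-study")
```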
Cross-Study Insight Tracker
```yaml
insights_database:
  - insight_id: "INS-001"
    theme: "[Category]"
    insight: "[The insight]"
    first_found: "2026-01-15"
    studies: ["Study A", "Study C", "Study F"]
    evidence_strength: "Strong"  # 3+ studies
    status: "Addressed"  # Open / In Progress / Addressed / Won't Fix
    design_response: "[What was done]"
    impact_measured: "[Before/after metric if available]"
```
Research Impact Tracking
| Metric | How to Measure | Target |
|---|---|---|
| Findings → shipped features | % of recommendations implemented within 2 quarters | >60% |
| Pre/post usability scores | SUS score before vs after changes | +10 points |
| Support ticket reduction | Related ticket volume after design change | -25% |
| Task completion rate | Usability test success rate over time | >85% |
| Time on task | Average task time trend | Decreasing |
| Stakeholder confidence | Post-study survey: "How useful was this?" | >4/5 |
Quick Commands
| Command | What It Does |
|---|---|
| "Plan a research study for [topic]" | Generate research brief YAML |
| "Build a screener for [audience]" | Generate screening questionnaire |
| "Create interview guide for [topic]" | Generate interview questions and structure |
| "Build persona from [data/notes]" | Synthesize data into persona YAML |
| "Map the journey for [persona + goal]" | Generate journey map |
| "Plan usability test for [prototype]" | Generate test plan with tasks |
| "Run heuristic evaluation of [screen/flow]" | Score against Nielsen's 10 |
| "Synthesize findings from [study]" | Generate themes and insights |
| "Write research report for [study]" | Generate executive summary and recommendations |
| "Score this research [report/study]" | Evaluate against quality rubric |
| "Review this design against research" | CAMPS critique + alignment check |
| "Set up research repository" | Create folder structure and templates |
Edge Cases
Small Budget / No Recruiting Budget
- Guerrilla testing: coffee shop intercepts (5 min tests, buy them a coffee)
- Internal users: use colleagues from different departments (not product/design team)
- Social media: post in relevant communities for volunteers
- Existing users: email opt-in for research panel
Remote-Only Research
- Video call with screen share (Zoom, Google Meet)
- Async: Loom recordings of tasks + written responses
- Unmoderated: UserTesting.com, Maze, Lookback
- Diary studies: use messaging apps (WhatsApp, Telegram) for daily check-ins
Stakeholder Pushback ("We don't have time for research")
- "5 users, 1 week, 3 critical findings" — the minimum viable study
- Pair research with existing touchpoints (support calls, sales demos)
- Frame as risk reduction: "Would you rather discover this before or after launch?"
- Show past research ROI (support ticket reduction, conversion improvement)
Conflicting Findings
- Check sample composition — different segments may have different needs
- Prioritize by business impact: which segment is more valuable?
- Run a survey to quantify: "60% prefer A, 40% prefer B"
- Consider designing for both (progressive disclosure, personalization)
International / Cross-Cultural Research
- Don't just translate — localize scenarios and contexts
- Account for cultural response bias (e.g., reluctance to criticize in some cultures)
- Use local moderators when possible
- Adjust incentives to local norms
- Watch for design patterns that don't transfer (icons, colors, reading direction)
Accessibility Research
- Recruit participants with disabilities (screen reader users, motor impairments, cognitive differences)
- Test with actual assistive technology, not simulation
- Include in regular studies (at least 1 participant with accessibility needs per study)
- WCAG compliance testing is NOT a substitute for research with disabled users
Built by AfrexAI — Autonomous Intelligence for Business