Agent-skills uxui-evaluator

Evaluate interface descriptions against 168 research-backed UX/UI principles. Returns structured findings with severity, remediation, and business impact. API key optional — enriched output requires uxuiprinciples.com API Access.

install
source · Clone the upstream repo
git clone https://github.com/uxuiprinciples/agent-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/uxuiprinciples/agent-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/uxui-evaluator" ~/.claude/skills/uxuiprinciples-agent-skills-uxui-evaluator && rm -rf "$T"
manifest: uxui-evaluator/SKILL.md
source content
[toolbox.lookup_principle]
description = "Fetch principle metadata by slug from the uxuiprinciples API. Returns code, title, aiSummary, businessImpact, tags, and difficulty. Pro tier returns all 168 principles; free tier returns 12."
command = "curl"
args = ["-s", "-H", "Authorization: Bearer ${UXUI_API_KEY}", "https://uxuiprinciples.com/api/v1/principles?slug={slug}&include_content=false"]

[toolbox.list_principles_by_part]
description = "List all principles for a framework part. Parts: part-1 through part-6."
command = "curl"
args = ["-s", "-H", "Authorization: Bearer ${UXUI_API_KEY}", "https://uxuiprinciples.com/api/v1/principles?part={part}"]

[toolbox.audit]
description = "Run a full structured audit of an interface description against 168 UX principles. Returns findings, severity, remediation, smells detected, strengths, and an overall score. Requires API key (pro tier)."
command = "curl"
args = ["-s", "-X", "POST", "-H", "Authorization: Bearer ${UXUI_API_KEY}", "-H", "Content-Type: application/json", "-d", "{\"description\": \"{input}\"}", "https://uxuiprinciples.com/api/v1/audit"]

What This Skill Does

You evaluate interface descriptions against the uxuiprinciples framework: 168 research-backed UX/UI principles organized across 6 parts. You return structured JSON findings, not prose. Each finding names a specific principle, assigns a severity, states what is violated and why, and gives a concrete remediation.

When

UXUI_API_KEY
is set, call
audit
first. It returns a fully structured result directly from the API. Use
lookup_principle
and
list_principles_by_part
to enrich individual findings further, or when
audit
is not available.

When no

UXUI_API_KEY
is set, apply the framework using internal knowledge and note the limitation in your output.

Framework Structure

The 6-part taxonomy covers:

PartDomainKey Principles
Part 1Cognitive FoundationsCognitive Load (F.1.1.02), Miller's Law, Chunking, Hick's Law (F.2.2.03), Working Memory, Serial Position, Peak-End Rule
Part 2Visual DesignVisual Hierarchy (F.2.1.01), Gestalt Laws (Proximity, Similarity, Closure, Continuity), Figure-Ground, Contrast, Whitespace
Part 3Interaction DesignProgressive Disclosure (F.3.1.01), Fitts's Law (F.4.1.01), Error Prevention, Feedback Loops, Affordances, Microinteractions
Part 4Information ArchitectureNavigation Patterns, Mental Models, Recognition vs Recall, Wayfinding, Search, Labeling
Part 5AI and Emerging InterfacesConversational Flow (F.5.1.01), AI Transparency (F.5.2.01), Cognitive Load Calibration for AI, Automation Bias Prevention
Part 6Human-Centered DesignAccessibility, Inclusive Design, Trust Signals, Emotional Design, Ethical Patterns

Principle codes follow the format

F.[part].[chapter].[sequence]
. Example:
F.1.1.02
is Part 1, Chapter 1, Principle 02 (Cognitive Load).

Evaluation Workflow

Follow these steps in order. Do not skip steps.

Step 1: Classify the Interface

Identify the interface type from the description. Use one of:

dashboard
,
form
,
onboarding
,
modal
,
navigation
,
settings
,
landing-page
,
checkout
,
empty-state
,
data-table
,
ai-chat
,
mobile-app
,
email
,
documentation
.

If the type is ambiguous, pick the closest match and note it in

interface_type_note
.

Step 2: Select Relevant Parts

Based on interface type, prioritize which framework parts to evaluate:

  • dashboard: Parts 1, 2, 4
  • form / checkout: Parts 1, 3, 4
  • onboarding: Parts 1, 3, 4, 6
  • navigation: Parts 1, 2, 4
  • ai-chat: Parts 1, 5, 6
  • modal: Parts 1, 3
  • landing-page: Parts 2, 3, 4, 6

Always evaluate Part 1 (Cognitive Foundations) for every interface type.

Step 3: Identify Violations

For each selected part, scan the description for signals that a principle is violated, at risk, or well-applied. Look for:

  • Information density signals (number of elements, options, steps)
  • Visual organization signals (hierarchy, grouping, whitespace)
  • Interaction signals (CTAs, affordances, feedback)
  • Trust and clarity signals (copy, error messages, empty states)
  • AI-specific signals (confidence displays, human override points)

Step 4: Enrich with Toolbox (if API key is set)

Preferred path: Call

audit
with
{"description": "<interface description>"}
. The response is a fully structured audit result — use it directly as your output. Skip Steps 5 and 6 if
audit
returns a 200.

Fallback path: If

audit
is unavailable or returns an error, call
lookup_principle
for each violation found in Step 3. Use the returned
aiSummary
and
businessImpact
fields to populate
message
and
business_impact
.

If all tool calls fail or return non-200, continue without enrichment. Set

api_enriched: false
.

Slugs for common principles:

  • cognitive-load
    ,
    hicks-law
    ,
    millers-law
    ,
    chunking
    ,
    working-memory
  • progressive-disclosure
    ,
    fitts-law
    ,
    serial-position-effect
  • visual-hierarchy
    ,
    law-of-proximity
    ,
    figure-ground
  • recognition-rather-than-recall
    ,
    mental-model
  • cognitive-load-calibration-ai
    ,
    automation-bias-prevention

Step 5: Score and Band

Score from 0 to 100. Start at 100 and deduct:

  • critical
    finding: -15 points
  • warning
    finding: -7 points
  • suggestion
    finding: -3 points

Band thresholds:

  • 85-100:
    excellent
  • 65-84:
    good
  • 40-64:
    fair
  • 0-39:
    poor

Cap deductions at 0 (score cannot go below 0).

Step 6: Output JSON

Return exactly this structure. No prose before or after the JSON block.

{
  "interface_type": "string",
  "interface_type_note": "string or null",
  "overall_score": 0,
  "band": "poor|fair|good|excellent",
  "findings": [
    {
      "id": "finding-1",
      "principle": {
        "code": "F.1.1.02",
        "slug": "cognitive-load",
        "title": "Cognitive Load",
        "part": "part-1"
      },
      "severity": "critical|warning|suggestion",
      "message": "Specific, actionable description of what is violated and why it matters.",
      "remediation": "Concrete fix with measurable outcome.",
      "business_impact": "String from principle data, or null if not enriched."
    }
  ],
  "strengths": [
    {
      "principle": {
        "code": "string",
        "slug": "string",
        "title": "string"
      },
      "message": "What the interface is doing well."
    }
  ],
  "priority_fixes": ["finding-1", "finding-2"],
  "api_enriched": true,
  "api_note": "null or 'Install the uxuiprinciples API key for enriched findings with citations and business impact data. See uxuiprinciples.com/pricing'"
}

priority_fixes
lists finding IDs in recommended fix order: critical first, then warnings that most affect the primary user action.

Severity Guidelines

SeverityWhen to Use
critical
The violation directly blocks task completion or causes abandonment. Cognitive overload past 7 items, no error feedback, missing primary CTA, inaccessible contrast.
warning
The violation degrades the experience and will measurably reduce conversion or satisfaction. Suboptimal choice count, unclear hierarchy, missing affordances.
suggestion
An improvement opportunity. The interface works but violates a principle in a way that would improve metrics if fixed. Microcopy, spacing, progressive disclosure opportunities.

Edge Cases

Minimal description (under 30 words): Ask one clarifying question before proceeding. "What is the primary action a user should complete on this screen?" Then evaluate with the information you have, noting gaps in

interface_type_note
.

No violations found: Return at least 2 strengths. Set

findings: []
. Score 100, band
excellent
. This is valid output.

Multiple interface types in one description (e.g., dashboard + settings sidebar): Identify the dominant interface type. Add a note in

interface_type_note
. Evaluate the dominant type.

AI interface without AI-specific signals: Skip Part 5 evaluation. Do not fabricate AI-related findings.

Vague copy like "users can see their data": Do not hallucinate specifics. Evaluate what you can observe from the description. Flag the vagueness as a suggestion under recognition vs recall if applicable.

API returns 403 (free tier, principle requires pro): Fall back to internal knowledge for that principle. Note in

api_enriched: false
.

Examples

Example 1: Dashboard with Overload Issues

Input:

Admin dashboard with 15 KPI cards, 4 filter dropdowns, a data table showing 50 rows, 3 chart widgets, and a sidebar navigation with 12 items.

Expected output structure:

{
  "interface_type": "dashboard",
  "interface_type_note": null,
  "overall_score": 43,
  "band": "fair",
  "findings": [
    {
      "id": "finding-1",
      "principle": {
        "code": "F.1.1.02",
        "slug": "cognitive-load",
        "title": "Cognitive Load",
        "part": "part-1"
      },
      "severity": "critical",
      "message": "15 simultaneous KPI cards exceeds working memory capacity (7±2 items). Users cannot identify priority signals, increasing decision time and error rates.",
      "remediation": "Group KPIs into 3-5 thematic sections. Surface the 5 most critical metrics above the fold. Move secondary metrics to an expandable section or secondary view.",
      "business_impact": "Reduced complexity drives 500% productivity increase and faster task completion."
    },
    {
      "id": "finding-2",
      "principle": {
        "code": "F.2.2.03",
        "slug": "hicks-law",
        "title": "Hick's Law",
        "part": "part-1"
      },
      "severity": "warning",
      "message": "12 sidebar navigation items exceed the optimal 5-9 range for complex decisions. Each extra item adds ~150ms decision time per visit.",
      "remediation": "Collapse infrequent navigation items under a 'More' group or settings section. Keep primary navigation to 5-7 items.",
      "business_impact": "Simplified navigation reduces time-to-action and improves activation metrics."
    }
  ],
  "strengths": [],
  "priority_fixes": ["finding-1", "finding-2"],
  "api_enriched": false,
  "api_note": "Install the uxuiprinciples API key for enriched findings with citations and business impact data. See uxuiprinciples.com/pricing"
}

Example 2: Minimal Input

Input:

Login page.

Expected behavior: Ask one clarifying question. "What elements does the login page contain? For example: email/password fields, social login buttons, 'forgot password' link, error states."

Completion Criteria

The skill output is complete when:

  1. interface_type
    is set to one of the allowed values
  2. Every finding has a
    principle.code
    in
    F.X.X.XX
    format
  3. Every finding has a non-empty
    message
    and
    remediation
  4. severity
    is one of:
    critical
    ,
    warning
    ,
    suggestion
  5. overall_score
    is between 0 and 100
  6. band
    matches the score threshold
  7. priority_fixes
    lists only IDs that exist in
    findings
  8. api_enriched
    accurately reflects whether toolbox calls succeeded
  9. The output is valid JSON with no prose before or after