Learn-skills.dev harness-engineering

Set up and improve harness engineering (AGENTS.md, docs/, lint rules, eval systems, project-level prompt engineering) for AI-agent-friendly codebases. Triggers on: new/empty project setup for AI agents, AGENTS.md or CLAUDE.md creation, harness engineering questions, making agents work better on a codebase. ALSO triggers when users are frustrated or complaining about agent quality — e.g. 'the agent keeps ignoring conventions', 'it never follows instructions', 'why does it keep doing X', 'the agent is broken' — because poor agent output almost always signals harness gaps, not model problems. Covers: context engineering, architectural constraints, multi-agent coordination, evaluation, long-running agent harness, and diagnosis of agent quality issues.

install

source · Clone the upstream repo

git clone https://github.com/NeverSight/learn-skills.dev

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/10xchengtu/harness-engineering/harness-engineering" ~/.claude/skills/neversight-learn-skills-dev-harness-engineering && rm -rf "$T"

manifest: data/skills-md/10xchengtu/harness-engineering/harness-engineering/SKILL.md
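
One way to confirm that the Claude Code install landed where expected is to look for the manifest file. The helper below is a hypothetical sketch, not part of the skill. It assumes the copied directory holds `SKILL.md` at its top level, as the manifest path suggests, and that the destination directory did not already exist before the copy.

```python
from pathlib import Path

# Directory name used as the copy destination in the install command above.
SKILL_DIR_NAME = "neversight-learn-skills-dev-harness-engineering"


def skill_installed(skills_root: Path) -> bool:
    """Return True if SKILL.md exists directly inside the skill directory.

    Assumption: the manifest sits at the top level of the copied directory.
    """
    return (skills_root / SKILL_DIR_NAME / "SKILL.md").is_file()


if __name__ == "__main__":
    root = Path.home() / ".claude" / "skills"
    print("installed" if skill_installed(root) else "not found")
```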

source content

Harness Engineering

Harness = the operating system for AI agents working on your project. Model is CPU, context window is RAM, harness is OS.

Core Principle

Start simple, add complexity only when needed. Every harness component encodes an assumption about what the model can't do alone. Pressure-test these assumptions — they expire as models improve. Build for deletion.

When This Skill Activates

| Signal | Action |
|---|---|
| Empty/new project | → Full project setup (Section 1) |
| User frustrated with agent | → Diagnose & fix harness gaps (Section 7) |
| Existing project needs improvement | → Assess & incrementally improve |
| Explicit harness question | → Reference relevant sections |
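
The signal-to-action table can be read as a simple lookup. The sketch below is illustrative only: the `Signal` enum and `route` function are hypothetical names, and how a signal gets detected in a conversation is left out entirely.

```python
from enum import Enum


class Signal(Enum):
    """Activation signals, named after the rows of the table above."""

    NEW_PROJECT = "Empty/new project"
    FRUSTRATED_USER = "User frustrated with agent"
    NEEDS_IMPROVEMENT = "Existing project needs improvement"
    HARNESS_QUESTION = "Explicit harness question"


# Hypothetical routing table: one entry per table row.
ACTIONS = {
    Signal.NEW_PROJECT: "Full project setup (Section 1)",
    Signal.FRUSTRATED_USER: "Diagnose & fix harness gaps (Section 7)",
    Signal.NEEDS_IMPROVEMENT: "Assess & incrementally improve",
    Signal.HARNESS_QUESTION: "Reference relevant sections",
}


def route(signal: Signal) -> str:
    """Return the action the skill takes for a given signal."""
    return ACTIONS[signal]
```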

Workflow

For New Projects

1. Assess: What's the project? Tech stack? Team size? How will agents be used?
2. Setup: Create foundational harness files → read `references/01-project-setup.md`
3. Context: Design information architecture → read `references/02-context-engineering.md`
4. Constraints: Add guardrails and linters → read `references/03-constraints.md`
5. Evaluate: Set up feedback loops → read `references/05-eval-feedback.md`

If the project involves multiple agents or long-running tasks → also read `references/04-multi-agent.md` and `references/06-long-running.md`, respectively.
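
The new-project workflow amounts to a reading order over the reference files. A minimal sketch follows, assuming the two conditional references are read after the four core ones; the source does not state where they fall in the sequence. `reading_order` is a hypothetical helper, not part of the skill.

```python
def reading_order(multi_agent: bool = False, long_running: bool = False) -> list[str]:
    """List the reference files to read for a new project, in order.

    The Assess step has no reference file, so the list starts at Setup.
    """
    order = [
        "references/01-project-setup.md",        # Setup
        "references/02-context-engineering.md",  # Context
        "references/03-constraints.md",          # Constraints
        "references/05-eval-feedback.md",        # Evaluate
    ]
    # Assumption: conditional references come after the core steps.
    if multi_agent:
        order.append("references/04-multi-agent.md")
    if long_running:
        order.append("references/06-long-running.md")
    return order
```

For example, `reading_order(long_running=True)` returns the four core files followed by `references/06-long-running.md`.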

For Diagnosis (Agent Not Performing Well)

1. Read `references/07-diagnosis.md` immediately
2. Identify which harness layer is failing
3. Apply targeted fix from the relevant reference

For Incremental Improvement

Assess current harness maturity, identify weakest layer, improve one layer at a time.
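
Picking the weakest layer can be sketched as a minimum over per-layer scores. The skill does not define a scoring scheme, so the 0 to 3 maturity scale here is an assumption, as is the `weakest_layer` helper. The layer names come from the quick-reference table in this document.

```python
# Layer names as listed in the Harness Layers quick reference.
LAYERS = [
    "Project Setup",
    "Context Engineering",
    "Constraints & Guardrails",
    "Multi-Agent Architecture",
    "Eval & Feedback",
    "Long-Running Tasks",
    "Diagnosis",
]


def weakest_layer(scores: dict[str, int]) -> str:
    """Return the layer with the lowest score (hypothetical 0-3 scale).

    Ties go to the layer that appears first in LAYERS.
    """
    missing = [layer for layer in LAYERS if layer not in scores]
    if missing:
        raise ValueError(f"no score for: {', '.join(missing)}")
    return min(LAYERS, key=lambda layer: scores[layer])
```

Returning a single layer matches the advice to improve one layer at a time: fix it, re-score, and call the function again.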

Harness Layers (Quick Reference)

| Layer | What | Reference |
|---|---|---|
| Project Setup | AGENTS.md, docs/, directory conventions | `01-project-setup.md` |
| Context Engineering | What info agents see, progressive disclosure, working state | `02-context-engineering.md` |
| Constraints & Guardrails | Linters, type systems, architecture enforcement, safe autonomy | `03-constraints.md` |
| Multi-Agent Architecture | Agent separation, coordination protocols, delegation patterns | `04-multi-agent.md` |
| Eval & Feedback | Testing, grading, GC agents, observability | `05-eval-feedback.md` |
| Long-Running Tasks | Progress tracking, context resets, handoff artifacts | `06-long-running.md` |
| Diagnosis | When agents fail: identify root cause in harness, not model | `07-diagnosis.md` |
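
The layer table doubles as a checklist of reference files. The sketch below reports which ones are absent from a skill directory. It assumes the files live in a `references/` subdirectory, as the paths in the Workflow section indicate; `missing_references` is a hypothetical helper.

```python
from pathlib import Path

# Layer -> reference file, copied from the table above.
LAYER_REFERENCES = {
    "Project Setup": "01-project-setup.md",
    "Context Engineering": "02-context-engineering.md",
    "Constraints & Guardrails": "03-constraints.md",
    "Multi-Agent Architecture": "04-multi-agent.md",
    "Eval & Feedback": "05-eval-feedback.md",
    "Long-Running Tasks": "06-long-running.md",
    "Diagnosis": "07-diagnosis.md",
}


def missing_references(skill_dir: Path) -> list[str]:
    """Return reference file names not found under skill_dir/references."""
    refs = skill_dir / "references"
    return [
        name
        for name in LAYER_REFERENCES.values()
        if not (refs / name).is_file()
    ]
```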

Self-Update Protocol

When you discover a new reusable harness pattern during a project:

1. Identify which reference file it belongs to (or whether it needs a new one)
2. Add the pattern with: what it solves, when to use it, how to implement it
3. Keep it concise: no fluff, just the pattern
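
The protocol can be sketched as a small append routine. The three fields come from the steps above. The `###` heading, the bullet layout, and the function names are assumptions, since the source does not specify how reference files are formatted.

```python
from pathlib import Path


def format_pattern(name: str, solves: str, when: str, how: str) -> str:
    """Render one pattern entry with the three fields the protocol asks for."""
    fields = {"name": name, "solves": solves, "when": when, "how": how}
    for label, value in fields.items():
        if not value.strip():
            raise ValueError(f"'{label}' must not be empty")
    # Hypothetical layout: a heading followed by three short bullets.
    return (
        f"\n### {name.strip()}\n\n"
        f"- What it solves: {solves.strip()}\n"
        f"- When to use it: {when.strip()}\n"
        f"- How to implement it: {how.strip()}\n"
    )


def add_pattern(reference_file: Path, name: str, solves: str, when: str, how: str) -> None:
    """Append a pattern entry to an existing or new reference file."""
    entry = format_pattern(name, solves, when, how)
    with reference_file.open("a", encoding="utf-8") as fh:
        fh.write(entry)
```

Appending rather than rewriting keeps existing patterns intact. Choosing the target file (step 1) stays a manual decision in this sketch.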