# Auto-GPT pr-address

Address PR review comments and loop until CI is green and all comments are resolved. TRIGGER when the user asks to address comments, fix PR feedback, respond to reviewers, or babysit/monitor a PR.
```bash
git clone https://github.com/Significant-Gravitas/AutoGPT

# Or copy only this skill into ~/.claude/skills:
T=$(mktemp -d) && git clone --depth=1 https://github.com/Significant-Gravitas/AutoGPT "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/pr-address" ~/.claude/skills/significant-gravitas-auto-gpt-pr-address && rm -rf "$T"
```

`.claude/skills/pr-address/SKILL.md`

# PR Address
## Find the PR

```bash
gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
gh pr view {N}
```
## Read the PR description
Understand the Why / What / How before addressing comments — you need context to make good fixes:

```bash
gh pr view {N} --json body --jq '.body'
```
## Fetch comments (all sources)

### 1. Inline review threads — GraphQL (primary source of actionable items)
⚠️ WARNING — PAGINATE ALL PAGES BEFORE ADDRESSING ANYTHING

`reviewThreads(first: 100)` returns at most 100 threads per page AND returns threads oldest-first. On a PR with many review cycles (e.g. 373 threads), the oldest 100–200 threads are from past cycles and are all already resolved. Filtering page 1 client-side with `select(.isResolved == false)` therefore yields 0 results — even though pages 2–4 contain many unresolved threads from recent review cycles. This is the most common failure mode: the agent fetches page 1, sees 0 unresolved after filtering, stops pagination, and reports "done" — while hundreds of unresolved threads sit on later pages.
One observed PR had 142 total threads: page 1 returned 0 unresolved (all old/resolved), while pages 2–3 had 111 unresolved. Another with 373 threads across 4 pages also had page 1 entirely resolved.
The rule: ALWAYS paginate until `hasNextPage == false`, regardless of the per-page unresolved count. Never stop early because a page returns 0 unresolved.
Step 1 — Fetch total count and sanity-check the newest threads:

```bash
# Get total count and the newest 100 threads (last: 100 returns newest-first)
gh api graphql -f query='
{
  repository(owner: "Significant-Gravitas", name: "AutoGPT") {
    pullRequest(number: {N}) {
      reviewThreads { totalCount }
      newest: reviewThreads(last: 100) { nodes { isResolved } }
    }
  }
}' | jq '{
  total: .data.repository.pullRequest.reviewThreads.totalCount,
  newest_unresolved: [.data.repository.pullRequest.newest.nodes[] | select(.isResolved == false)] | length
}'
```
If `total > 100`, you have multiple pages — you must paginate all of them regardless of what `newest_unresolved` shows. The `last: 100` check is a sanity signal only; the full loop below is mandatory.
Step 2 — Collect all unresolved thread IDs across all pages:

```bash
# Accumulate all unresolved threads — loop until hasNextPage == false
CURSOR=""
ALL_THREADS="[]"
while true; do
  AFTER=${CURSOR:+", after: \"$CURSOR\""}
  PAGE=$(gh api graphql -f query="
  {
    repository(owner: \"Significant-Gravitas\", name: \"AutoGPT\") {
      pullRequest(number: {N}) {
        reviewThreads(first: 100${AFTER}) {
          pageInfo { hasNextPage endCursor }
          nodes {
            id
            isResolved
            path
            line
            comments(last: 1) { nodes { databaseId body author { login } } }
          }
        }
      }
    }
  }")
  # Append unresolved nodes from this page
  PAGE_THREADS=$(echo "$PAGE" | jq '[.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved == false)]')
  ALL_THREADS=$(echo "$ALL_THREADS $PAGE_THREADS" | jq -s 'add')
  HAS_NEXT=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.hasNextPage')
  CURSOR=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.endCursor')
  [ "$HAS_NEXT" = "false" ] && break
done

# Reverse so newest threads (last pages) are addressed first — GitHub returns oldest-first
# and the most recent review cycle's comments are the ones blocking approval.
ALL_THREADS=$(echo "$ALL_THREADS" | jq 'reverse')
echo "Total unresolved threads: $(echo "$ALL_THREADS" | jq 'length')"
echo "$ALL_THREADS" | jq '[.[] | {id, path, line, body: .comments.nodes[0].body[:200]}]'
```
Step 3 — Address every thread in `ALL_THREADS`, then resolve.
Only after this loop completes (all pages fetched, count confirmed) should you begin making fixes.
Why reverse? GraphQL returns threads oldest-first and exposes no `orderBy` option. A PR with 373 threads has ~4 pages; threads from the latest review cycle land on the last pages. Processing in reverse ensures the newest, most blocking comments are addressed first — the earlier pages mostly contain outdated threads from prior cycles.
Filter to unresolved threads only — skip any thread where `isResolved: true`. `comments(last: 1)` returns the most recent comment in the thread — act on that; it reflects the reviewer's final ask. Use the thread `id` (a Relay global ID) to track threads across polls.
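One way to implement that cross-poll tracking: keep a baseline of `{id, last}` pairs per unresolved thread and diff each new snapshot against it. A minimal jq sketch, where the snapshot shape and thread ids are illustrative:

```shell
# Hypothetical baseline vs. current snapshots: id = thread id, last = newest comment databaseId
BASELINE='[{"id":"T1","last":101},{"id":"T2","last":202}]'
CURRENT='[{"id":"T1","last":101},{"id":"T2","last":205},{"id":"T3","last":300}]'

# A thread needs action if its id is new, or its last comment changed since the baseline
NEEDS_ACTION=$(jq -nc --argjson base "$BASELINE" --argjson cur "$CURRENT" '
  ($base | map({(.id): .last}) | add // {}) as $b
  | [$cur[] | select(($b[.id] // null) != .last) | .id]')
echo "$NEEDS_ACTION"   # ["T2","T3"]: T2 got a new reply, T3 is a new thread
```

Re-record the baseline after addressing the flagged threads, so the next poll only surfaces genuinely new activity.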
### 2. Top-level reviews — REST (MUST paginate)

```bash
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
```
CRITICAL — always `--paginate`. Reviews default to 30 per page, and PRs can have 80–170+ reviews (mostly empty resolution events). Without pagination you miss reviews past position 30 — including autogpt-reviewer's structured review, which is typically posted after several CI runs and sits well beyond the first page.
Two things to extract:
- Overall state: look for `CHANGES_REQUESTED` or `APPROVED` reviews.
- Actionable feedback: non-empty bodies only. Empty-body reviews are thread-resolution events — they indicate progress but have no feedback to act on.
Where each reviewer posts:
- `autogpt-reviewer` — posts detailed structured reviews ("Blockers", "Should Fix", "Nice to Have") as top-level reviews. Not present on every PR. Address ALL items.
- `sentry[bot]` — posts bug predictions as inline threads. Fix real bugs, explain false positives.
- `coderabbitai[bot]` — posts summaries as top-level reviews AND actionable items as inline threads. Address the actionable items.
- Human reviewers — can post in any source. Address ALL non-empty feedback.
### 3. PR conversation comments — REST

```bash
gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate
```
Mostly contains bot summaries (`coderabbitai[bot]`), CI/conflict detection (`github-actions[bot]`), and author status updates. Scan for non-empty messages from human reviewers other than the PR author — those are the ones that need a response.
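That filter can be sketched with jq over the REST response — `user.login`, `user.type`, and `body` are real fields on issue comments; the sample payload and author login are illustrative:

```shell
# Sample issue-comment payload (illustrative); real data comes from the gh api call above
COMMENTS='[
  {"user":{"login":"coderabbitai[bot]","type":"Bot"},"body":"Walkthrough summary..."},
  {"user":{"login":"alice","type":"User"},"body":"Please rename this function."},
  {"user":{"login":"pr-author","type":"User"},"body":"Pushed fixes!"}
]'

# Keep non-empty comments from humans other than the PR author
NEEDS_REPLY=$(echo "$COMMENTS" | jq --arg author "pr-author" \
  '[.[] | select(.user.type != "Bot" and .user.login != $author and (.body | length) > 0)]')
echo "$NEEDS_REPLY" | jq -r '.[].user.login'   # alice
```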
## For each unaddressed comment
CRITICAL: The only valid sequence is fix → commit → push → reply → resolve. Never resolve a thread without a real code commit.
Resolving a thread via `resolveReviewThread` without an actual fix is the most common failure mode — it makes unresolved counts drop without any real change, producing a false "done" signal. If the issue was genuinely a false positive (no code change needed), reply explaining why and then resolve. Otherwise:
Address comments one at a time: fix → commit → push → inline reply → resolve.
- Read the referenced code, make the fix (or reply explaining why it's not needed)
- Commit and push the fix
- Reply inline (not as a new top-level comment) referencing the fixing commit — this is what resolves the conversation for bot reviewers (coderabbitai, sentry):
Use a markdown commit link so GitHub renders it as a clickable reference. Always get the full SHA with `git rev-parse HEAD` after committing — never copy a SHA from a previous commit or hardcode one:

```bash
FULL_SHA=$(git rev-parse HEAD)
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID}/replies \
  -f body="🤖 Fixed in [${FULL_SHA:0:9}](https://github.com/Significant-Gravitas/AutoGPT/commit/${FULL_SHA}): <description>"
```
| Comment type | How to reply |
|---|---|
| Inline review (`pulls/{N}/comments`) | Reply in-thread: `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID}/replies -f body="..."` |
| Conversation (`issues/{N}/comments`) | Post a conversation comment: `gh pr comment {N} --repo Significant-Gravitas/AutoGPT --body "..."` |
## What counts as a valid resolution

Only two situations justify calling `resolveReviewThread`:
- Real code fix: you changed the code, committed + pushed, and replied with the SHA. The commit diff must actually address the concern — not just touch the same file.
- Genuine false positive: the reviewer's concern does not apply to this code, and you can give a specific technical reason (e.g. "Not applicable — `sdk_cwd` is pre-validated by `_make_sdk_cwd()`, which applies normpath + prefix assertion before reaching this point").
Anti-patterns that look resolved but aren't — never do these:
- "Accepted, tracked as follow-up" — a deferral, not a fix. The concern is still open. Do not resolve.
- "Acknowledged" or "Same as above" — acknowledgements, not fixes. Do not resolve.
- "Fixed in abc1234" where `abc1234` is a commit that doesn't actually change the flagged line/logic — dishonest. Verify `git show abc1234 -- path/to/file` changes the right thing before posting.
- Resolving without replying — the reviewer never sees what happened.
When in doubt: if a code change is needed, make it. A deferred issue means the thread stays open until the follow-up PR is merged.
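A small guard worth running before posting any "Fixed in …" reply: check that the commit actually touches the flagged file. A sketch, with an illustrative helper name:

```shell
# Fail loudly if the commit named in a "Fixed in" reply doesn't touch the flagged file
verify_fix_touches() {
  local sha=$1 path=$2
  if git show --name-only --pretty=format: "$sha" | grep -qx "$path"; then
    echo "OK: $sha changes $path"
  else
    echo "WARNING: $sha does not touch $path, do not claim it as the fix" >&2
    return 1
  fi
}
```

Run it as `verify_fix_touches "$FULL_SHA" path/to/file` right before the reply API call. It only proves the file was touched, not the flagged logic, so still eyeball the `git show` diff for the specific concern.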
## Codecov coverage
Codecov patch target is 80% on changed lines. Checks are informational (not blocking) but should be green.
### Running coverage locally

Backend (from `autogpt_platform/backend/`):

```bash
poetry run pytest -s -vv --cov=backend --cov-branch --cov-report term-missing
```

Frontend (from `autogpt_platform/frontend/`):

```bash
pnpm vitest run --coverage
```
### When codecov/patch fails

- Find uncovered files: `git diff --name-only $(gh pr view --json baseRefName --jq '.baseRefName')...HEAD`
- For each uncovered file — extract inline logic to `helpers.py`/`helpers.ts` and test those (highest ROI). Colocate tests as `*_test.py` (backend) or `__tests__/*.test.ts` (frontend).
- Run coverage locally to verify, commit, push.
## Format and commit
After fixing, format the changed code:
- Backend (from `autogpt_platform/backend/`): `poetry run format`
- Frontend (from `autogpt_platform/frontend/`): `pnpm format && pnpm lint && pnpm types`
If API routes changed, regenerate the frontend client:
```bash
cd autogpt_platform/backend && poetry run rest &
REST_PID=$!
trap "kill $REST_PID 2>/dev/null" EXIT

WAIT=0
until curl -sf http://localhost:8006/health > /dev/null 2>&1; do
  sleep 1
  WAIT=$((WAIT+1))
  [ $WAIT -ge 60 ] && echo "Timed out" && exit 1
done

cd ../frontend && pnpm generate:api:force
kill $REST_PID 2>/dev/null; trap - EXIT
```
Never manually edit files in `src/app/api/__generated__/`.
Then commit and push immediately — never batch commits without pushing. Each fix should be visible on GitHub right away so CI can start and reviewers can see progress.
Never push empty commits (`git commit --allow-empty`) to re-trigger CI or bot checks. When a check fails, investigate the root cause (unchecked PR checklist, unaddressed review comments, code issues) and fix it directly. Empty commits add noise to git history.
For backend commits in worktrees: `poetry run git commit` (runs the pre-commit hooks).
## Coverage
Codecov enforces patch coverage on new/changed lines — new code you write must be tested. Before pushing, verify you haven't left new lines uncovered:
```bash
cd autogpt_platform/backend
poetry run pytest --cov=. --cov-report=term-missing {path/to/changed/module}
```
Look for lines marked `miss` — those are uncovered. Add tests for any new code you wrote as part of addressing comments.
Rules:
- New code you add should have tests
- Don't remove existing tests when fixing comments
- If a reviewer asks you to delete code, also delete its tests, but verify coverage hasn't dropped on remaining lines
## The loop

address comments → format → commit → push → wait for CI (while addressing new comments) → fix failures → push → re-check comments after CI settles → repeat until: all comments addressed AND CI green AND no new comments arriving
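The loop's exit condition (two consecutive quiet polls with green CI, detailed in the polling section) can be sketched as a shell skeleton. `ci_status` and `has_new_comments` are hypothetical helpers wrapping the `gh pr checks` and comment-diff queries:

```shell
# Skeleton of the outer loop; ci_status/has_new_comments are hypothetical helpers
poll_until_done() {
  local green_quiet=0
  while [ "$green_quiet" -lt 2 ]; do
    if [ "$(ci_status)" = "green" ] && ! has_new_comments; then
      green_quiet=$((green_quiet + 1))   # one more consecutive green+quiet poll
    else
      green_quiet=0                      # CI not green or new comments: reset the exit counter
    fi
    if [ "$green_quiet" -lt 2 ]; then
      sleep 30
    fi
  done
  echo "done: CI green and no new comments for 2 consecutive polls"
}
```

The counter reset is the important part: any failure or new comment sends you back to addressing work, and the 2-poll grace period catches bot reviews that land just after CI settles.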
## Polling for CI + new comments
After pushing, poll for both CI status and new comments in a single loop. Do not use `gh pr checks --watch` — it blocks the tool and prevents reacting to new comments while CI is running.

Note: `gh pr checks --watch --fail-fast` is tempting, but it blocks the entire Bash tool call, meaning the agent cannot check for or address new comments until CI fully completes. Always poll manually instead.
Polling loop — repeat every 30 seconds:
- Check CI status:

  ```bash
  gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket,name,link
  ```

  Parse the results: if every check has a `bucket` of "pass" or "skipping", CI is green. If any has "fail", CI has failed. Otherwise CI is still pending.
- Check for merge conflicts:

  ```bash
  gh pr view {N} --repo Significant-Gravitas/AutoGPT --json mergeable --jq '.mergeable'
  ```

  If the result is `"CONFLICTING"`, the PR has a merge conflict — see "Resolving merge conflicts" below. If `"UNKNOWN"`, GitHub is still computing mergeability — wait and re-check on the next poll.
- Check for new/changed comments (all three sources):
  - Inline threads — re-run the GraphQL query from "Fetch comments". For each unresolved thread, record `{thread_id, last_comment_databaseId}` as your baseline. On each poll, action is needed if:
    - A new thread `id` appears that wasn't in the baseline (new thread), OR
    - An existing thread's `last_comment_databaseId` has changed (new reply on an existing thread)
  - Conversation comments: `gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate` — compare the total count and the newest `id` against the baseline. Filter to non-empty, non-bot messages that aren't author status updates.
  - Top-level reviews: `gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate` — watch for new non-empty reviews (`CHANGES_REQUESTED` or `COMMENTED` with a body). Compare the total count and the newest `id` against the baseline.
- React in this precedence order (first match wins):

| What happened | Action |
|---|---|
| Merge conflict detected | See "Resolving merge conflicts" below. |
| Mergeability is `UNKNOWN` | GitHub is still computing mergeability. Sleep 30 seconds, then restart polling from the top. |
| New comments detected | Address them (fix → commit → push → reply). After pushing, re-fetch all comments to update your baseline, then restart this polling loop from the top (new commits invalidate CI status). |
| CI failed (bucket == "fail") | Get the failed check links from the `gh pr checks` output, extract the run ID from each link, and read the failure logs (e.g. `gh run view {RUN_ID} --log-failed`). Fix → commit → push → restart polling. |
| CI green + no new comments | Do not exit immediately. Bots (coderabbitai, sentry) often post reviews shortly after CI settles. Continue polling for 2 more cycles (60s) after CI goes green. Only exit after 2 consecutive green+quiet polls. |
| CI pending + no new comments | Sleep 30 seconds, then poll again. |
The loop ends when: CI fully green + all comments addressed + 2 consecutive polls with no new comments after CI settled.
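The green/fail/pending classification from the CI-status step can be sketched with jq. The `bucket` values come from `gh pr checks --json bucket`; the sample payload below is illustrative:

```shell
# Sample `gh pr checks {N} --json bucket,name,link` output (illustrative)
CHECKS='[{"bucket":"pass","name":"lint"},{"bucket":"skipping","name":"docs"},{"bucket":"pending","name":"tests"}]'

# fail beats pending beats green: any fail => failed; all pass/skipping => green; else pending
if echo "$CHECKS" | jq -e 'any(.[]; .bucket == "fail")' > /dev/null; then
  STATUS=failed
elif echo "$CHECKS" | jq -e 'all(.[]; .bucket == "pass" or .bucket == "skipping")' > /dev/null; then
  STATUS=green
else
  STATUS=pending
fi
echo "CI status: $STATUS"   # pending, because the tests check hasn't finished
```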
## Resolving merge conflicts
- Identify the PR's target branch and remote:

  ```bash
  gh pr view {N} --repo Significant-Gravitas/AutoGPT --json baseRefName --jq '.baseRefName'
  git remote -v  # find the remote pointing to Significant-Gravitas/AutoGPT (typically 'upstream' in forks, 'origin' for direct contributors)
  ```

- Pull the latest base branch with a 3-way merge:

  ```bash
  git pull {base-remote} {base-branch} --no-rebase
  ```

- Resolve the conflicting files, then verify no conflict markers remain:

  ```bash
  if grep -R -n -E '^(<<<<<<<|=======|>>>>>>>)' <conflicted-files>; then
    echo "Unresolved conflict markers found — resolve before proceeding."
    exit 1
  fi
  ```

- Stage and push:

  ```bash
  git add <conflicted-files>
  git commit -m "Resolve merge conflicts with {base-branch}"
  git push
  ```

- Restart the polling loop from the top — new commits reset CI status.
## GitHub abuse rate limits

Two distinct rate limits exist — they have different causes and recovery times:

| HTTP code | Cause | Recovery |
|---|---|---|
| 403 | Secondary rate limit — too many write operations (comments, mutations) in a short window | Wait 2–3 minutes. 60s is often not enough. |
| 429 | Primary rate limit — too many API calls per hour | Wait until the rate-limit reset header timestamp. |
Prevention: add `sleep 3` between individual thread reply API calls. When posting >20 replies, increase to `sleep 5`.
Recovery from secondary rate limit (403):
- Stop all API writes immediately
- Wait 2 minutes minimum (not 60s — secondary limits are stricter)
- Resume with `sleep 3` between each call
- If the 403 persists after 2 minutes, wait another 2 minutes before retrying
Never batch all replies in a tight loop — always space them out.
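The recovery steps above can be folded into a retry helper. A hedged sketch — the function name and the 3-attempt cap are illustrative; the waits mirror the guidance above:

```shell
# Retry an API write, backing off 2 minutes when GitHub's secondary limit (403) kicks in
post_with_backoff() {
  local endpoint=$1 body=$2 attempt=0
  until gh api "$endpoint" -f body="$body"; do
    attempt=$((attempt + 1))
    if [ "$attempt" -ge 3 ]; then
      echo "Giving up on $endpoint after 3 attempts" >&2
      return 1
    fi
    echo "Write failed (likely 403 secondary limit), waiting 120s before retry #$attempt"
    sleep 120   # secondary limits need 2+ minutes, not 60s
  done
  sleep 3       # space out the next write
}
```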
## Parallel thread resolution

When a PR has more than 10 unresolved threads, addressing one commit per thread is slow. Use this strategy instead:

### Group by file, batch per commit
- Sort `ALL_THREADS` by `path` — threads in the same file can share a single commit.
- Fix all threads in one file → `git commit` → `git push` → reply to all those threads with the same SHA → resolve them all.
- Move to the next file group and repeat.
This reduces the commit count from N (one per thread) to the number of files touched, which is usually 3–5 instead of 15–30.
### Posting replies concurrently (for large batches)
For truly independent thread groups (different files, no shared logic), you can post replies in parallel using background subshells — but always space out API writes:
```bash
# Post replies to a batch of threads concurrently, 3s apart
(
  sleep 3
  gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID1}/replies \
    -f body="🤖 Fixed in [${FULL_SHA:0:9}](https://github.com/Significant-Gravitas/AutoGPT/commit/${FULL_SHA}): ..."
) &
(
  sleep 6
  gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID2}/replies \
    -f body="🤖 Fixed in [${FULL_SHA:0:9}](https://github.com/Significant-Gravitas/AutoGPT/commit/${FULL_SHA}): ..."
) &
wait  # wait for all background replies before resolving
```
Then resolve sequentially (GraphQL mutations):
```bash
for THREAD_ID in "$THREAD1" "$THREAD2" "$THREAD3"; do
  gh api graphql -f query="mutation { resolveReviewThread(input: {threadId: \"${THREAD_ID}\"}) { thread { isResolved } } }"
  sleep 3
done
```
Always sleep 3s between individual API writes — GitHub's secondary rate limit (403) triggers on bursts of >20 writes. Increase to `sleep 5` when posting more than 20 replies in a batch.
## Resolving threads via GraphQL

Use `resolveReviewThread` only after the commit is pushed and the reply is posted:

```bash
gh api graphql -f query='mutation { resolveReviewThread(input: {threadId: "THREAD_ID"}) { thread { isResolved } } }'
```
Never call this mutation before committing the fix. The orchestrator will verify actual unresolved counts via GraphQL after you output `ORCHESTRATOR:DONE` — false resolutions will be caught and you will be re-briefed.
## Verify actual count before outputting ORCHESTRATOR:DONE

Before claiming "0 unresolved threads", always query GitHub directly — don't rely on your own bookkeeping. Paginate all pages — a single `first: 100` query misses threads beyond page 1:
```bash
# Step 1: get total thread count
gh api graphql -f query='
{
  repository(owner: "Significant-Gravitas", name: "AutoGPT") {
    pullRequest(number: {N}) {
      reviewThreads { totalCount }
    }
  }
}' | jq '.data.repository.pullRequest.reviewThreads.totalCount'

# Step 2: paginate all pages, count truly unresolved
CURSOR=""; UNRESOLVED=0
while true; do
  AFTER=${CURSOR:+", after: \"$CURSOR\""}
  PAGE=$(gh api graphql -f query="
  {
    repository(owner: \"Significant-Gravitas\", name: \"AutoGPT\") {
      pullRequest(number: {N}) {
        reviewThreads(first: 100${AFTER}) {
          pageInfo { hasNextPage endCursor }
          nodes { isResolved }
        }
      }
    }
  }")
  UNRESOLVED=$(( UNRESOLVED + $(echo "$PAGE" | jq '[.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved==false)] | length') ))
  HAS_NEXT=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.hasNextPage')
  CURSOR=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.endCursor')
  [ "$HAS_NEXT" = "false" ] && break
done
echo "Unresolved threads: $UNRESOLVED"
```
Only output `ORCHESTRATOR:DONE` after this loop reports 0.