Auto-GPT pr-address

Address PR review comments and loop until CI green and all comments resolved. TRIGGER when user asks to address comments, fix PR feedback, respond to reviewers, or babysit/monitor a PR.

install
source · Clone the upstream repo
git clone https://github.com/Significant-Gravitas/AutoGPT
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Significant-Gravitas/AutoGPT "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/pr-address" ~/.claude/skills/significant-gravitas-auto-gpt-pr-address && rm -rf "$T"
manifest: .claude/skills/pr-address/SKILL.md
source content

PR Address

Find the PR

gh pr list --head $(git branch --show-current) --repo Significant-Gravitas/AutoGPT
gh pr view {N}

Read the PR description

Understand the Why / What / How before addressing comments — you need context to make good fixes:

gh pr view {N} --json body --jq '.body'

Fetch comments (all sources)

1. Inline review threads — GraphQL (primary source of actionable items)

⚠️ WARNING — PAGINATE ALL PAGES BEFORE ADDRESSING ANYTHING

reviewThreads(first: 100)
returns at most 100 threads per page AND returns threads oldest-first. On a PR with many review cycles (e.g. 373 threads), the oldest 100–200 threads are from past cycles and are all already resolved. Filtering client-side with
select(.isResolved == false)
on page 1 therefore yields 0 results — even though pages 2–4 contain many unresolved threads from recent review cycles.

This is the most common failure mode: agent fetches page 1, sees 0 unresolved after filtering, stops pagination, reports "done" — while hundreds of unresolved threads sit on later pages.

One observed PR had 142 total threads: page 1 returned 0 unresolved (all old/resolved), while pages 2–3 had 111 unresolved. Another with 373 threads across 4 pages also had page 1 entirely resolved.

The rule: ALWAYS paginate to

hasNextPage == false
regardless of the per-page unresolved count. Never stop early because a page returns 0 unresolved.

Step 1 — Fetch total count and sanity-check the newest threads:

# Get total count and the newest 100 threads (last: 100 returns newest-first)
gh api graphql -f query='
{
  repository(owner: "Significant-Gravitas", name: "AutoGPT") {
    pullRequest(number: {N}) {
      reviewThreads { totalCount }
      newest: reviewThreads(last: 100) {
        nodes { isResolved }
      }
    }
  }
}' | jq '{ total: .data.repository.pullRequest.reviewThreads.totalCount, newest_unresolved: [.data.repository.pullRequest.newest.nodes[] | select(.isResolved == false)] | length }'

If

total > 100
, you have multiple pages — you must paginate all of them regardless of what
newest_unresolved
shows. The
last: 100
check is a sanity signal only; the full loop below is mandatory.

Step 2 — Collect all unresolved thread IDs across all pages:

# Accumulate all unresolved threads — loop until hasNextPage == false
CURSOR=""
ALL_THREADS="[]"
while true; do
  AFTER=${CURSOR:+", after: \"$CURSOR\""}
  PAGE=$(gh api graphql -f query="
  {
    repository(owner: \"Significant-Gravitas\", name: \"AutoGPT\") {
      pullRequest(number: {N}) {
        reviewThreads(first: 100${AFTER}) {
          pageInfo { hasNextPage endCursor }
          nodes {
            id
            isResolved
            path
            line
            comments(last: 1) {
              nodes { databaseId body author { login } }
            }
          }
        }
      }
    }
  }")
  # Append unresolved nodes from this page
  PAGE_THREADS=$(echo "$PAGE" | jq '[.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved == false)]')
  ALL_THREADS=$(echo "$ALL_THREADS $PAGE_THREADS" | jq -s 'add')
  HAS_NEXT=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.hasNextPage')
  CURSOR=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.endCursor')
  [ "$HAS_NEXT" = "false" ] && break
done

# Reverse so newest threads (last pages) are addressed first — GitHub returns oldest-first
# and the most recent review cycle's comments are the ones blocking approval.
ALL_THREADS=$(echo "$ALL_THREADS" | jq 'reverse')

echo "Total unresolved threads: $(echo "$ALL_THREADS" | jq 'length')"
echo "$ALL_THREADS" | jq '[.[] | {id, path, line, body: .comments.nodes[0].body[:200]}]'

Step 3 — Address every thread in

ALL_THREADS
, then resolve.

Only after this loop completes (all pages fetched, count confirmed) should you begin making fixes.

Why reverse? GraphQL returns threads oldest-first and exposes no

orderBy
option. A PR with 373 threads has ~4 pages; threads from the latest review cycle land on the last pages. Processing in reverse ensures the newest, most blocking comments are addressed first — the earlier pages mostly contain outdated threads from prior cycles.

Filter to unresolved threads only — skip any thread where

isResolved: true
.
comments(last: 1)
returns the most recent comment in the thread — act on that; it reflects the reviewer's final ask. Use the thread
id
(Relay global ID) to track threads across polls.

2. Top-level reviews — REST (MUST paginate)

gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate

CRITICAL — always

--paginate
. Reviews default to 30 per page. PRs can have 80–170+ reviews (mostly empty resolution events). Without pagination you miss reviews past position 30 — including
autogpt-reviewer
's structured review which is typically posted after several CI runs and sits well beyond the first page.

Two things to extract:

  • Overall state: look for
    CHANGES_REQUESTED
    or
    APPROVED
    reviews.
  • Actionable feedback: non-empty bodies only. Empty-body reviews are thread-resolution events — they indicate progress but have no feedback to act on.

Where each reviewer posts:

  • autogpt-reviewer
    — posts detailed structured reviews ("Blockers", "Should Fix", "Nice to Have") as top-level reviews. Not present on every PR. Address ALL items.
  • sentry[bot]
    — posts bug predictions as inline threads. Fix real bugs, explain false positives.
  • coderabbitai[bot]
    — posts summaries as top-level reviews AND actionable items as inline threads. Address actionable items.
  • Human reviewers — can post in any source. Address ALL non-empty feedback.

3. PR conversation comments — REST

gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate

Mostly contains: bot summaries (

coderabbitai[bot]
), CI/conflict detection (
github-actions[bot]
), and author status updates. Scan for non-empty messages from non-bot human reviewers that aren't the PR author — those are the ones that need a response.

For each unaddressed comment

CRITICAL: The only valid sequence is fix → commit → push → reply → resolve. Never resolve a thread without a real code commit.

Resolving a thread via

resolveReviewThread
without an actual fix is the most common failure mode — it makes unresolved counts drop without any real change, producing a false "done" signal. If the issue was genuinely a false positive (no code change needed), reply explaining why and then resolve. Otherwise:

Address comments one at a time: fix → commit → push → inline reply → resolve.

  1. Read the referenced code, make the fix (or reply explaining why it's not needed)
  2. Commit and push the fix
  3. Reply inline (not as a new top-level comment) referencing the fixing commit — this is what resolves the conversation for bot reviewers (coderabbitai, sentry):

Use a markdown commit link so GitHub renders it as a clickable reference. Always get the full SHA with

git rev-parse HEAD
after committing — never copy a SHA from a previous commit or hardcode one:

FULL_SHA=$(git rev-parse HEAD)
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID}/replies \
  -f body="🤖 Fixed in [${FULL_SHA:0:9}](https://github.com/Significant-Gravitas/AutoGPT/commit/${FULL_SHA}): <description>"
Comment typeHow to reply
Inline review (
pulls/{N}/comments
)
gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID}/replies -f body="🤖 Fixed in [abc1234](https://github.com/Significant-Gravitas/AutoGPT/commit/FULL_SHA): <description>"
Conversation (
issues/{N}/comments
)
gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments -f body="🤖 Fixed in [abc1234](https://github.com/Significant-Gravitas/AutoGPT/commit/FULL_SHA): <description>"

What counts as a valid resolution

Only two situations justify calling

resolveReviewThread
:

  1. Real code fix: you changed the code, committed + pushed, and replied with the SHA. The commit diff must actually address the concern — not just touch the same file.
  2. Genuine false positive: the reviewer's concern does not apply to this code, and you can give a specific technical reason (e.g. "Not applicable —
    sdk_cwd
    is pre-validated by
    _make_sdk_cwd()
    which applies normpath + prefix assertion before reaching this point").

Anti-patterns that look resolved but aren't — never do these:

  • "Accepted, tracked as follow-up"
    — a deferral, not a fix. The concern is still open. Do not resolve.
  • "Acknowledged"
    or
    "Same as above"
    — these are acknowledgements, not fixes. Do not resolve.
  • "Fixed in abc1234"
    where
    abc1234
    is a commit that doesn't actually change the flagged line/logic — dishonest. Verify
    git show abc1234 -- path/to/file
    changes the right thing before posting.
  • Resolving without replying — the reviewer never sees what happened.

When in doubt: if a code change is needed, make it. A deferred issue means the thread stays open until the follow-up PR is merged.

Codecov coverage

Codecov patch target is 80% on changed lines. Checks are informational (not blocking) but should be green.

Running coverage locally

Backend (from

autogpt_platform/backend/
):

poetry run pytest -s -vv --cov=backend --cov-branch --cov-report term-missing

Frontend (from

autogpt_platform/frontend/
):

pnpm vitest run --coverage

When codecov/patch fails

  1. Find uncovered files:
    git diff --name-only $(gh pr view --json baseRefName --jq '.baseRefName')...HEAD
  2. For each uncovered file — extract inline logic to
    helpers.ts
    /
    helpers.py
    and test those (highest ROI). Colocate tests as
    *_test.py
    (backend) or
    __tests__/*.test.ts
    (frontend).
  3. Run coverage locally to verify, commit, push.

Format and commit

After fixing, format the changed code:

  • Backend (from
    autogpt_platform/backend/
    ):
    poetry run format
  • Frontend (from
    autogpt_platform/frontend/
    ):
    pnpm format && pnpm lint && pnpm types

If API routes changed, regenerate the frontend client:

cd autogpt_platform/backend && poetry run rest &
REST_PID=$!
trap "kill $REST_PID 2>/dev/null" EXIT
WAIT=0; until curl -sf http://localhost:8006/health > /dev/null 2>&1; do sleep 1; WAIT=$((WAIT+1)); [ $WAIT -ge 60 ] && echo "Timed out" && exit 1; done
cd ../frontend && pnpm generate:api:force
kill $REST_PID 2>/dev/null; trap - EXIT

Never manually edit files in

src/app/api/__generated__/
.

Then commit and push immediately — never batch commits without pushing. Each fix should be visible on GitHub right away so CI can start and reviewers can see progress.

Never push empty commits (

git commit --allow-empty
) to re-trigger CI or bot checks. When a check fails, investigate the root cause (unchecked PR checklist, unaddressed review comments, code issues) and fix those directly. Empty commits add noise to git history.

For backend commits in worktrees:

poetry run git commit
(pre-commit hooks).

Coverage

Codecov enforces patch coverage on new/changed lines — new code you write must be tested. Before pushing, verify you haven't left new lines uncovered:

cd autogpt_platform/backend
poetry run pytest --cov=. --cov-report=term-missing {path/to/changed/module}

Look for lines marked

miss
— those are uncovered. Add tests for any new code you wrote as part of addressing comments.

Rules:

  • New code you add should have tests
  • Don't remove existing tests when fixing comments
  • If a reviewer asks you to delete code, also delete its tests, but verify coverage hasn't dropped on remaining lines

The loop

address comments → format → commit → push
→ wait for CI (while addressing new comments) → fix failures → push
→ re-check comments after CI settles
→ repeat until: all comments addressed AND CI green AND no new comments arriving

Polling for CI + new comments

After pushing, poll for both CI status and new comments in a single loop. Do not use

gh pr checks --watch
— it blocks the tool and prevents reacting to new comments while CI is running.

Note:

gh pr checks --watch --fail-fast
is tempting but it blocks the entire Bash tool call, meaning the agent cannot check for or address new comments until CI fully completes. Always poll manually instead.

Polling loop — repeat every 30 seconds:

  1. Check CI status:
gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket,name,link

Parse the results: if every check has

bucket
of
"pass"
or
"skipping"
, CI is green. If any has
"fail"
, CI has failed. Otherwise CI is still pending.

  1. Check for merge conflicts:
gh pr view {N} --repo Significant-Gravitas/AutoGPT --json mergeable --jq '.mergeable'

If the result is

"CONFLICTING"
, the PR has a merge conflict — see "Resolving merge conflicts" below. If
"UNKNOWN"
, GitHub is still computing mergeability — wait and re-check next poll.

  1. Check for new/changed comments (all three sources):

    Inline threads — re-run the GraphQL query from "Fetch comments". For each unresolved thread, record

    {thread_id, last_comment_databaseId}
    as your baseline. On each poll, action is needed if:

    • A new thread
      id
      appears that wasn't in the baseline (new thread), OR
    • An existing thread's
      last_comment_databaseId
      has changed (new reply on existing thread)

    Conversation comments:

    gh api repos/Significant-Gravitas/AutoGPT/issues/{N}/comments --paginate
    

    Compare total count and newest

    id
    against baseline. Filter to non-empty, non-bot, non-author-update messages.

    Top-level reviews:

    gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/reviews --paginate
    

    Watch for new non-empty reviews (

    CHANGES_REQUESTED
    or
    COMMENTED
    with body). Compare total count and newest
    id
    against baseline.

  2. React in this precedence order (first match wins):

What happenedAction
Merge conflict detectedSee "Resolving merge conflicts" below.
Mergeability is
UNKNOWN
GitHub is still computing mergeability. Sleep 30 seconds, then restart polling from the top.
New comments detectedAddress them (fix → commit → push → reply). After pushing, re-fetch all comments to update your baseline, then restart this polling loop from the top (new commits invalidate CI status).
CI failed (bucket == "fail")Get failed check links:
gh pr checks {N} --repo Significant-Gravitas/AutoGPT --json bucket,link --jq '.[] | select(.bucket == "fail") | .link'
. Extract run ID from link (format:
.../actions/runs/<run-id>/job/...
), read logs with
gh run view <run-id> --repo Significant-Gravitas/AutoGPT --log-failed
. Fix → commit → push → restart polling.
CI green + no new commentsDo not exit immediately. Bots (coderabbitai, sentry) often post reviews shortly after CI settles. Continue polling for 2 more cycles (60s) after CI goes green. Only exit after 2 consecutive green+quiet polls.
CI pending + no new commentsSleep 30 seconds, then poll again.

The loop ends when: CI fully green + all comments addressed + 2 consecutive polls with no new comments after CI settled.

Resolving merge conflicts

  1. Identify the PR's target branch and remote:
gh pr view {N} --repo Significant-Gravitas/AutoGPT --json baseRefName --jq '.baseRefName'
git remote -v   # find the remote pointing to Significant-Gravitas/AutoGPT (typically 'upstream' in forks, 'origin' for direct contributors)
  1. Pull the latest base branch with a 3-way merge:
git pull {base-remote} {base-branch} --no-rebase
  1. Resolve conflicting files, then verify no conflict markers remain:
if grep -R -n -E '^(<<<<<<<|=======|>>>>>>>)' <conflicted-files>; then
  echo "Unresolved conflict markers found — resolve before proceeding."
  exit 1
fi
  1. Stage and push:
git add <conflicted-files>
git commit -m "Resolve merge conflicts with {base-branch}"
git push
  1. Restart the polling loop from the top — new commits reset CI status.

GitHub abuse rate limits

Two distinct rate limits exist — they have different causes and recovery times:

ErrorHTTP codeCauseRecovery
{"code":"abuse"}
403Secondary rate limit — too many write operations (comments, mutations) in a short windowWait 2–3 minutes. 60s is often not enough.
{"message":"API rate limit exceeded"}
429Primary rate limit — too many API calls per hourWait until
X-RateLimit-Reset
header timestamp

Prevention: Add

sleep 3
between individual thread reply API calls. When posting >20 replies, increase to
sleep 5
.

Recovery from secondary rate limit (403):

  1. Stop all API writes immediately
  2. Wait 2 minutes minimum (not 60s — secondary limits are stricter)
  3. Resume with
    sleep 3
    between each call
  4. If 403 persists after 2 min, wait another 2 min before retrying

Never batch all replies in a tight loop — always space them out.

Parallel thread resolution

When a PR has more than 10 unresolved threads, addressing one commit per thread is slow. Use this strategy instead:

Group by file, batch per commit

  1. Sort
    ALL_THREADS
    by
    path
    — threads in the same file can share a single commit.
  2. Fix all threads in one file →
    git commit
    git push
    → reply to all those threads with the same SHA → resolve them all.
  3. Move to the next file group and repeat.

This reduces N commits to (number of files touched), which is usually 3–5 instead of 15–30.

Posting replies concurrently (for large batches)

For truly independent thread groups (different files, no shared logic), you can post replies in parallel using background subshells — but always space out API writes:

# Post replies to a batch of threads concurrently, 3s apart
(
  sleep 3
  gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID1}/replies \
    -f body="🤖 Fixed in [${FULL_SHA:0:9}](https://github.com/Significant-Gravitas/AutoGPT/commit/${FULL_SHA}): ..."
) &
(
  sleep 6
  gh api repos/Significant-Gravitas/AutoGPT/pulls/{N}/comments/{ID2}/replies \
    -f body="🤖 Fixed in [${FULL_SHA:0:9}](https://github.com/Significant-Gravitas/AutoGPT/commit/${FULL_SHA}): ..."
) &
wait  # wait for all background replies before resolving

Then resolve sequentially (GraphQL mutations):

for THREAD_ID in "$THREAD1" "$THREAD2" "$THREAD3"; do
  gh api graphql -f query="mutation { resolveReviewThread(input: {threadId: \"${THREAD_ID}\"}) { thread { isResolved } } }"
  sleep 3
done

Always sleep 3s between individual API writes — GitHub's secondary rate limit (403) triggers on bursts of >20 writes. Increase to

sleep 5
when posting more than 20 replies in a batch.

Resolving threads via GraphQL

Use

resolveReviewThread
only after the commit is pushed and the reply is posted:

gh api graphql -f query='mutation { resolveReviewThread(input: {threadId: "THREAD_ID"}) { thread { isResolved } } }'

Never call this mutation before committing the fix. The orchestrator will verify actual unresolved counts via GraphQL after you output

ORCHESTRATOR:DONE
— false resolutions will be caught and you will be re-briefed.

Verify actual count before outputting ORCHESTRATOR:DONE

Before claiming "0 unresolved threads", always query GitHub directly — don't rely on your own bookkeeping. Paginate all pages — a single

first: 100
query misses threads beyond page 1:

# Step 1: get total thread count
gh api graphql -f query='
{
  repository(owner: "Significant-Gravitas", name: "AutoGPT") {
    pullRequest(number: {N}) {
      reviewThreads { totalCount }
    }
  }
}' | jq '.data.repository.pullRequest.reviewThreads.totalCount'

# Step 2: paginate all pages, count truly unresolved
CURSOR=""; UNRESOLVED=0
while true; do
  AFTER=${CURSOR:+", after: \"$CURSOR\""}
  PAGE=$(gh api graphql -f query="
  {
    repository(owner: \"Significant-Gravitas\", name: \"AutoGPT\") {
      pullRequest(number: {N}) {
        reviewThreads(first: 100${AFTER}) {
          pageInfo { hasNextPage endCursor }
          nodes { isResolved }
        }
      }
    }
  }")
  UNRESOLVED=$(( UNRESOLVED + $(echo "$PAGE" | jq '[.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved==false)] | length') ))
  HAS_NEXT=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.hasNextPage')
  CURSOR=$(echo "$PAGE" | jq -r '.data.repository.pullRequest.reviewThreads.pageInfo.endCursor')
  [ "$HAS_NEXT" = "false" ] && break
done
echo "Unresolved threads: $UNRESOLVED"

Only output

ORCHESTRATOR:DONE
after this loop reports 0.