Claude-skill-registry review-blocking

Discover functions that block the main thread, then micro-audit each for internal optimization opportunities

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/other/other/review-blocking" ~/.claude/skills/majiayu000-claude-skill-registry-review-blocking && rm -rf "$T"
manifest: skills/other/other/review-blocking/SKILL.md
source content

Enter planning mode. Two-phase audit: first discover which functions block the main thread using mechanical criteria, then micro-audit each function's internals for optimization opportunities. Use an agent team for the audit phase — the per-function work is embarrassingly parallel.

How This Differs from review-latency

`review-latency` traces paths end-to-end (window change → store, user action → pixels) and cares about ordering, compound cost, and whether something is on the critical path. This skill opens each function in isolation and asks "can this be faster internally?" without caring about callers, frequency, or path position. The different framing surfaces different findings — path-tracing skips a function that's "only called once," but micro-auditing still catches the redundant array copy inside it.

Phase 1 — Discovery

Find functions that block the main thread using these criteria. The criteria are mechanical — use grep and query tools, not judgment.

Tier 1 — Always audit

These functions block the main thread by definition:

  • Functions containing `Critical "On"` — they hold an interrupt lock. Every microsecond inside Critical delays pending hotkey/timer callbacks.
  • Timer callbacks — they interrupt other work. Use `query_timers.ps1` to inventory all timer callbacks in `src/core/` and `src/gui/`.
  • Hotkey callbacks — entry points for user input. Grep for `Hotkey(` registrations and the callback functions they reference.
  • WinEvent callbacks — `DllCall("SetWinEventHook"` callbacks. These fire on every focus change and window create/destroy.

Tier 2 — Audit if non-trivial

These functions have cost that scales:

  • Functions with loops over arrays/maps — `for` loops and `Loop` blocks iterating collections. Cost scales with collection size.
  • Functions with 3+ DllCalls or ComCalls — each DllCall/ComCall has marshaling overhead (~1-2μs), so three or more in one function is non-trivial. Post-#177, `Shader_PreRender` has 40+ ComCalls per invocation (D3D11 state setup, compute dispatch, Draw, GPU readback) and runs N times per frame (once per active shader layer). `FX_PreRenderShaderLayers` loops over all active layers, calling `Shader_PreRender` for each.
  • Functions that build or copy collections — `Array()`, `Map()`, `.Push()`, `.Clone()` inside a function body. Allocation cost.

Tier 3 — Direct callees of Tier 1/2

For each Tier 1/2 function, identify its direct callees (depth 1 only) using `query_callchain.ps1 -Depth 1` (or `query_function_visibility.ps1`). Add those to the audit list. Do NOT recurse further — if a helper's helper is slow, the parent function's audit will surface the call cost.

Discovery output

Produce a categorized function list:

| Function | File | Tier | Reason |
| --- | --- | --- | --- |
| `_WEH_WinEventProc` | `winevent_hook.ahk` | 1 | WinEvent callback + Critical |
| `GUI_Repaint` | `gui_paint.ahk` | 1 | Timer callback |
| `Blacklist_IsMatch` | `blacklist.ahk` | 3 | Called by `Blacklist_IsWindowEligible` (Tier 2) |

Scope

Only functions in `src/core/` and `src/gui/` (plus `src/shared/` files they call into). Exclude `src/lib/` (third-party), `src/editors/` (not on main thread during Alt-Tab), and `src/pump/` (separate process).

Use `query_state.ps1` to determine if a blocking function is only reachable from specific state machine branches — a function that blocks only during `IDLE` is less critical than one that blocks during `ACTIVE`.

Phase 2 — Micro-Audit

Use an agent team. Split the discovery list into batches and assign each batch to a parallel agent. Each agent reads the function body (use `query_function.ps1 <funcName>`) and audits it in isolation.

What to look for inside each function

The rule: changes must be invisible to callers. Same inputs → same outputs → same side effects. Only internal implementation changes.

Redundant work:

  • Computing a value that's already available (passed as param, stored in a global, computed earlier in the function)
  • Re-reading a property/field multiple times when the value can't change mid-function
  • Building a string or array that's immediately discarded or only partially used

Allocation waste:

  • Creating arrays/maps that could be pre-allocated `static` buffers reused across calls — EXCEPT in functions reachable during the STA pump (paint path, COM callers, QPC). Reentrancy overwrites the static buffer mid-use. See `ahk-patterns.md` Hot Path Resource Rules.
  • String concatenation in loops (each concat allocates a new string in AHK)
  • `.Push()` in a loop when the final size is known — reserve up front (AHK v2 supports `arr.Capacity := n`) or use a pre-allocated buffer
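For the string-concatenation case, a before/after sketch (hypothetical `BuildList_*` functions; AHK v2's `VarSetStrCapacity` reserves the buffer once instead of reallocating on every `.=`):

```autohotkey
; Before: each .= may reallocate and copy the whole string so far
BuildList_Before(items) {
    s := ""
    for item in items
        s .= item.name ","
    return s
}

; After: reserve capacity once — same inputs, same output, same side effects
BuildList_After(items) {
    s := ""
    VarSetStrCapacity(&s, items.Length * 32)  ; rough per-item estimate (assumption)
    for item in items
        s .= item.name ","
    return s
}
```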

Loop inefficiencies:

  • Hoistable work inside loop bodies (expressions that don't depend on the loop variable)
  • Map/array lookups repeated per-iteration that could be cached in a local
  • Redundant `HasOwnProp()` / `.Has()` checks before access when the key is guaranteed to exist

DllCall overhead:

  • Multiple DllCalls that could be combined (e.g., separate Get/Set calls that could be one call with more params)
  • Missing `"Cdecl"` on callbacks (forces slower marshaling)
  • String parameters that could be pre-encoded to avoid per-call UTF-16 conversion

Critical section duration:

  • Work inside `Critical "On"` ... `Critical "Off"` that doesn't need the interrupt lock
  • Computations that could be moved before or after the Critical section
  • Note: Do NOT suggest removing Critical sections or restructuring them — that's `review-criticals` territory. Only suggest moving non-critical work outside the existing boundaries.
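A sketch of the allowed transformation — the lock boundaries stay exactly where they were, only the non-critical formatting moves out (hypothetical `LogFocusChange_*` and `g_FocusLog`):

```autohotkey
; Before: string formatting runs under the interrupt lock
LogFocusChange_Before(hwnd) {
    global g_FocusLog
    Critical "On"
    g_FocusLog .= FormatTime(, "HH:mm:ss") " focus " hwnd "`n"  ; formatting needs no lock
    Critical "Off"
}

; After: only the shared-state append stays inside the existing boundaries
LogFocusChange_After(hwnd) {
    global g_FocusLog
    entry := FormatTime(, "HH:mm:ss") " focus " hwnd "`n"       ; built outside the lock
    Critical "On"
    g_FocusLog .= entry
    Critical "Off"
}
```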

Cache opportunities:

  • Pure computations (same input → same output) that are called repeatedly with the same arguments
  • `static` locals for values computed once and reused across calls (DPI scaling factors, compiled regex patterns, etc.)
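A `static`-local sketch (hypothetical `CaptionHeight`; `SysGet(4)`/`SysGet(33)` are SM_CYCAPTION/SM_CYSIZEFRAME). Note the staleness caveat from the Validation section: cached system metrics can go stale on DPI change, so verify the caller's lifetime before proposing this.

```autohotkey
; Hypothetical: a pure, repeat-called computation cached in a static local.
; The static initializer runs only on the first call in AHK v2.
CaptionHeight() {
    static h := SysGet(4) + SysGet(33)   ; SM_CYCAPTION + SM_CYSIZEFRAME
    return h
}
```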

What NOT to flag

  • Algorithmic changes that alter the function's contract — changing sort order, filtering differently, returning different data
  • Moving work to a different function or restructuring call patterns — that's `review-latency` territory
  • Removing Critical sections or changing their scope — that's `review-criticals` territory
  • Buffer loops suitable for MCode — that's `review-mcode` territory (but do note if you spot one, as a cross-reference)

Micro-audit output (per function)

For each function, agents report:

| Finding | Line(s) | Current | Proposed | Est. Saving | Complexity |
| --- | --- | --- | --- | --- | --- |
| `title` local read twice from `rec.title` | 45, 52 | Two property accesses | Cache in local `t := rec.title` | ~0.5μs | Trivial |
| String concat in 30-iteration loop | 60-65 | `s .= item.name ","` per iteration | Pre-size or use Array+Join | ~5-15μs | Low |

Est. Saving — be honest. If it's sub-microsecond, say so. The user wants the full picture, not a filtered one.

Complexity — Trivial (one-line), Low (few lines, obvious), Medium (structural change within function), High (requires new static/cache infrastructure).

Validation

After the agent team reports, validate every finding yourself. Read the actual function code.

For each candidate:

  1. Cite evidence: "I verified by reading `file.ahk` lines X–Y" with actual code quoted.
  2. Confirm invisibility: Verify the change doesn't alter the function's output or side effects. If there's any doubt, flag it.
  3. Check for existing optimization: This codebase has been through optimization passes. The "redundant" lookup might be intentional (value can change mid-function due to callbacks). The "unnecessary" allocation might be required (buffer reuse would be unsafe due to STA pump reentrancy — COM calls dispatch arbitrary callbacks through `Critical "On"`, so static buffers in any function reachable from the paint/COM path can be overwritten mid-use).
  4. Counter-argument: "What would make this optimization counterproductive?" — Does caching a value risk staleness? Does hoisting out of a loop change behavior when the loop body has side effects?
  5. Observed vs inferred: Did you read the function body, or infer the issue from the function name?

Plan Format

Section 1 — Discovery results:

| Function | File | Tier | Reason |
| --- | --- | --- | --- |
| ... | ... | ... | ... |

Total: N functions across M files.

Section 2 — Micro-audit findings (grouped by file):

For each file, list all findings across its functions:

winevent_hook.ahk

| Function | Finding | Line(s) | Est. Saving | Complexity | Fix |
| --- | --- | --- | --- | --- | --- |
| `_WEH_WinEventProc` | Redundant `WinGetTitle` call | 88 | ~50μs | Trivial | Cache in local |

gui_paint.ahk

| Function | Finding | Line(s) | Est. Saving | Complexity | Fix |
| --- | --- | --- | --- | --- | --- |

Section 3 — Cross-references to other review skills:

| Finding | Function | Relevant Skill | Note |
| --- | --- | --- | --- |
| NumGet loop over icon buffer | `_ExtractAlpha` | review-mcode | Candidate for native C |
| Critical section holds through file I/O | `_SaveState` | review-race-conditions | Scope may be too wide |

Order Section 2 by estimated total savings per file (sum of all findings in that file), highest first.

Ignore any existing plans — create a fresh one.