Claude-skill-registry review-blocking
Discover functions that block the main thread, then micro-audit each for internal optimization opportunities
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/other/other/review-blocking" ~/.claude/skills/majiayu000-claude-skill-registry-review-blocking && rm -rf "$T"
skills/other/other/review-blocking/SKILL.mdEnter planning mode. Two-phase audit: first discover which functions block the main thread using mechanical criteria, then micro-audit each function's internals for optimization opportunities. Use an agent team for the audit phase — the per-function work is embarrassingly parallel.
How This Differs from review-latency
review-latency traces paths end-to-end (window change → store, user action → pixels) and cares about ordering, compound cost, and whether something is on the critical path. This skill opens each function in isolation and asks "can this be faster internally?" without caring about callers, frequency, or path position. The different framing surfaces different findings — path-tracing skips a function that's "only called once," but micro-auditing still catches the redundant array copy inside it.
Phase 1 — Discovery
Find functions that block the main thread using these criteria. The criteria are mechanical — use grep and query tools, not judgment.
Tier 1 — Always audit
These functions block the main thread by definition:
- Functions containing
— they hold an interrupt lock. Every microsecond inside Critical delays pending hotkey/timer callbacks.Critical "On" - Timer callbacks — they interrupt other work. Use
to inventory all timer callbacks inquery_timers.ps1
andsrc/core/
.src/gui/ - Hotkey callbacks — entry points for user input. Grep for
registrations and the callback functions they reference.Hotkey( - WinEvent callbacks —
callbacks. These fire on every focus change, window create/destroy.DllCall("SetWinEventHook"
Tier 2 — Audit if non-trivial
These functions have cost that scales:
- Functions with loops over arrays/maps —
loops,for
blocks iterating collections. Cost scales with collection size.Loop - Functions with 3+ DllCalls or ComCalls — each DllCall/ComCall has marshaling overhead (~1-2μs). Three or more in one function means the overhead is non-trivial. Post-#177,
has 40+ ComCalls per invocation (D3D11 state setup, compute dispatch, Draw, GPU readback) and runs N times per frame (once per active shader layer).Shader_PreRender
loops over all active layers callingFX_PreRenderShaderLayers
each.Shader_PreRender - Functions that build or copy collections —
,Array()
,Map()
,.Push()
inside a function body. Allocation cost..Clone()
Tier 3 — Direct callees of Tier 1/2
For each Tier 1/2 function, identify its direct callees (depth 1 only) using
query_callchain.ps1 -Depth 1 (or query_function_visibility.ps1). Add those to the audit list. Do NOT recurse further — if a helper's helper is slow, the parent function's audit will surface the call cost.
Discovery output
Produce a categorized function list:
| Function | File | Tier | Reason |
|---|---|---|---|
| | 1 | WinEvent callback + Critical |
| | 1 | Timer callback |
| | 3 | Called by (Tier 2) |
Scope
Only functions in
src/core/ and src/gui/ (plus src/shared/ files they call into). Exclude src/lib/ (third-party), src/editors/ (not on main thread during Alt-Tab), src/pump/ (separate process).
Use
query_state.ps1 to determine if a blocking function is only reachable from specific state machine branches — a function that blocks only during IDLE is less critical than one that blocks during ACTIVE.
Phase 2 — Micro-Audit
Use an agent team. Split the discovery list into batches and assign each batch to a parallel agent. Each agent reads the function body (use
query_function.ps1 <funcName>) and audits it in isolation.
What to look for inside each function
The rule: changes must be invisible to callers. Same inputs → same outputs → same side effects. Only internal implementation changes.
Redundant work:
- Computing a value that's already available (passed as param, stored in a global, computed earlier in the function)
- Re-reading a property/field multiple times when the value can't change mid-function
- Building a string or array that's immediately discarded or only partially used
Allocation waste:
- Creating arrays/maps that could be pre-allocated
buffers reused across calls — EXCEPT in functions reachable during STA pump (paint path, COM callers, QPC). Reentrancy overwrites the static buffer mid-use. Seestatic
Hot Path Resource Rules.ahk-patterns.md - String concatenation in loops (each concat allocates a new string in AHK)
in a loop when the final size is known (pre-size with.Push()
if AHK supports it, or use a pre-allocated buffer)Array(n)
Loop inefficiencies:
- Hoistable work inside loop bodies (expressions that don't depend on the loop variable)
- Map/array lookups repeated per-iteration that could be cached in a local
- Redundant
/HasOwnProp()
checks before access when the key is guaranteed to exist.Has()
DllCall overhead:
- Multiple DllCalls that could be combined (e.g., separate Get/Set calls that could be one call with more params)
- Missing
on callbacks (forces slower marshaling)"Cdecl" - String parameters that could be pre-encoded to avoid per-call UTF-16 conversion
Critical section duration:
- Work inside
...Critical "On"
that doesn't need the interrupt lockCritical "Off" - Computations that could be moved before Critical or after Critical
- Note: Do NOT suggest removing Critical sections or restructuring them — that's
territory. Only suggest moving non-critical work outside the existing boundaries.review-criticals
Cache opportunities:
- Pure computations (same input → same output) that are called repeatedly with the same arguments
locals for values computed once and reused across calls (DPI scaling factors, compiled regex patterns, etc.)static
What NOT to flag
- Algorithmic changes that alter the function's contract — changing sort order, filtering differently, returning different data
- Moving work to a different function or restructuring call patterns — that's
territoryreview-latency - Removing Critical sections or changing their scope — that's
territoryreview-criticals - Buffer loops suitable for MCode — that's
territory (but do note if you spot one, as a cross-reference)review-mcode
Micro-audit output (per function)
For each function, agents report:
| Finding | Line(s) | Current | Proposed | Est. Saving | Complexity |
|---|---|---|---|---|---|
local read twice from | 45, 52 | Two property accesses | Cache in local | ~0.5μs | Trivial |
| String concat in 30-iteration loop | 60-65 | per iteration | Pre-size or use Array+Join | ~5-15μs | Low |
Est. Saving — be honest. If it's sub-microsecond, say so. The user wants the full picture, not a filtered one.
Complexity — Trivial (one-line), Low (few lines, obvious), Medium (structural change within function), High (requires new static/cache infrastructure).
Validation
After the agent team reports, validate every finding yourself. Read the actual function code.
For each candidate:
- Cite evidence: "I verified by reading
lines X–Y" with actual code quoted.file.ahk - Confirm invisibility: Verify the change doesn't alter the function's output or side effects. If there's any doubt, flag it.
- Check for existing optimization: This codebase has been through optimization passes. The "redundant" lookup might be intentional (value can change mid-function due to callbacks). The "unnecessary" allocation might be required (buffer reuse would be unsafe due to STA pump reentrancy — COM calls dispatch arbitrary callbacks through
, so static buffers in any function reachable from the paint/COM path can be overwritten mid-use).Critical "On" - Counter-argument: "What would make this optimization counterproductive?" — Does caching a value risk staleness? Does hoisting out of a loop change behavior when the loop body has side effects?
- Observed vs inferred: Did you read the function body, or infer the issue from the function name?
Plan Format
Section 1 — Discovery results:
| Function | File | Tier | Reason |
|---|---|---|---|
| ... | ... | ... | ... |
Total: N functions across M files.
Section 2 — Micro-audit findings (grouped by file):
For each file, list all findings across its functions:
winevent_hook.ahk
winevent_hook.ahk| Function | Finding | Line(s) | Est. Saving | Complexity | Fix |
|---|---|---|---|---|---|
| Redundant call | 88 | ~50μs | Trivial | Cache in local |
gui_paint.ahk
gui_paint.ahk| Function | Finding | Line(s) | Est. Saving | Complexity | Fix |
|---|
Section 3 — Cross-references to other review skills:
| Finding | Function | Relevant Skill | Note |
|---|---|---|---|
| NumGet loop over icon buffer | | | Candidate for native C |
| Critical section holds through file I/O | | | Scope may be too wide |
Order Section 2 by estimated total savings per file (sum of all findings in that file), highest first.
Ignore any existing plans — create a fresh one.