Claude-skill-registry Localizing Variables
Declare variables in smallest possible scope, initialize close to first use, minimize span and live time
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/localizing-variables" ~/.claude/skills/majiayu000-claude-skill-registry-localizing-variables && rm -rf "$T"
Localizing Variables
Overview
The Principle of Proximity: Keep related actions together. Declare variables in the smallest scope possible, initialize them close to where they're first used, and keep all references to a variable close together.
Core principle: Minimize the window of vulnerability. The smaller the scope and the closer the references, the less can go wrong and the easier code is to understand.
Goal: Reduce what you must keep in mind at any one time.
When to Use
Apply to every variable you declare:
- When declaring variables
- When initializing variables
- When reviewing code with scattered variable usage
- When refactoring to improve clarity
Warning signs:
- Variables declared at top of function, used at bottom
- All variables initialized together, far from first use
- Large gap between variable declaration and use
- Must scroll to see variable declaration and usage together
- Variables have function/class scope when could be more local
- Can't see all uses of variable on one screen
Key Concepts
Scope
How widely visible a variable is:
- Block scope - visible only within {} or indented block (smallest)
- Loop scope - visible only within loop
- Function scope - visible throughout function
- Class scope - visible to all methods in class
- Module scope - visible throughout file
- Global scope - visible everywhere (largest, avoid)
Rule: Start with smallest scope. Expand only if necessary.
Span
Distance between successive references to a variable:
    a = 0      # First reference
    b = 0      # 1 line between references to a
    c = 0      # 2 lines between references to a
    a = b + c  # Second reference
    # Span of a: 2 lines
Goal: Minimize span. Keep references close together.
Live Time
Total statements between first and last reference:
    recordIndex = 0   # Line 2 - first reference
    # ... 25 lines of other code (lines 3-27) ...
    recordIndex += 1  # Line 28 - last reference
    # Live time: 28 - 2 + 1 = 27 statements
Goal: Minimize live time. Reduce window of vulnerability.
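The two definitions above reduce to simple arithmetic on the line numbers where a variable is referenced. A minimal sketch, assuming the references are already available as a sorted list of line numbers (the function names `spans`, `average_span`, and `live_time` are illustrative, not part of any library):

```python
def spans(ref_lines):
    """Lines lying strictly between each pair of successive references."""
    return [later - earlier - 1 for earlier, later in zip(ref_lines, ref_lines[1:])]


def average_span(ref_lines):
    """Mean span; 0.0 when the variable is referenced fewer than twice."""
    gaps = spans(ref_lines)
    return sum(gaps) / len(gaps) if gaps else 0.0


def live_time(ref_lines):
    """Statements from first to last reference, counting both ends."""
    return ref_lines[-1] - ref_lines[0] + 1


# 'a' from the span example: referenced on lines 1 and 4
print(spans([1, 4]))       # [2]
# recordIndex from the live-time example: referenced on lines 2 and 28
print(live_time([2, 28]))  # 27
```

A variable can have a short span and a long live time at once (many references spread over a long routine), which is why both numbers are worth tracking.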
The Principle of Proximity
Keep related actions together:
❌ Bad (declarations far from use):
    def process_data():
        # All declarations at top
        index = 0
        total = 0
        done = False
        result = []

        # 20 lines later...
        while index < count:
            index += 1

        # 30 lines later...
        while not done:
            if total > threshold:
                done = True

        # 40 lines later...
        result.append(final_value)
        return result
Live times: index=25, total=35, done=35, result=40. Average: 34 lines.
✅ Good (declare close to use):
    def process_data():
        # Declare index right before loop that uses it
        index = 0
        while index < count:
            index += 1

        # Declare total and done right before loop that uses them
        total = 0
        done = False
        while not done:
            if total > threshold:
                done = True

        # Declare result right before use
        result = []
        result.append(final_value)
        return result
Live times: index=3, total=5, done=5, result=2. Average: 4 lines.
Improvement: 34 → 4 average live time (8.5x better)
Aggressive Scope Minimization
Technique 1: Declare at Point of First Use
Languages like C++, Java, Python, JavaScript allow this:
❌ Bad:
    def calculate_report():
        total = 0  # Declared at top
        count = 0
        average = 0.0

        # 10 lines later...
        total = sum(values)
        count = len(values)
        average = total / count if count > 0 else 0.0
✅ Good:
    def calculate_report():
        # 10 lines of other work...

        # Declare right before use
        total = sum(values)
        count = len(values)
        average = total / count if count > 0 else 0.0
Technique 2: Use Block Scope
In languages supporting block scope, use it:
    # Pseudocode: braces mark a block scope, as in C++, Java, or JavaScript (let/const).
    # Python has no block scope; a small function gives the same effect there.

    # Process old data - variables scoped to this block
    {
        old_data = get_old_data()
        old_total = sum(old_data)
        print_summary(old_data, old_total)
    }  # old_data, old_total die here

    # Process new data - fresh variables, no collision
    {
        new_data = get_new_data()
        new_total = sum(new_data)
        print_summary(new_data, new_total)
    }  # new_data, new_total die here
Technique 3: Initialize at Declaration
❌ Bad:
    # Declare and initialize separately
    user_count: int
    user_count = 0
    active_users: list
    active_users = []
✅ Good:
    # Initialize when declaring
    user_count = 0
    active_users = []
Technique 4: Use Loop-Scoped Variables
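Many languages support declaring loop variables in the loop. In C++, Java, and JavaScript (with let) the variable is then visible only inside the loop. Python introduces the variable in the loop statement too, but keeps it bound in the enclosing function after the loop ends, so treat it as loop-local by convention:

    # i is introduced by the loop that uses it
    for i in range(count):
        process(i)
    # Do not rely on i here (Python still holds its last value;
    # block-scoped languages would report an error)

    # Another loop can reuse i without conflict
    for i in range(other_count):
        process_other(i)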
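For Python specifically, it helps to see what the interpreter actually does with a loop variable. The sketch below shows that a `for` variable stays bound after the loop, and that moving the loop into a small helper keeps the name out of the caller. The function names `leaky`, `process_all`, and `tight` are hypothetical, chosen only for this illustration:

```python
def leaky(count):
    for i in range(count):
        pass
    # In Python the loop variable is still bound after the loop ends.
    return i  # last value the loop assigned (requires count > 0)


def process_all(count):
    """Hypothetical helper: the loop variable lives only inside this function."""
    return [i * 2 for i in range(count)]


def tight(count):
    doubled = process_all(count)     # no loop variable appears in this function
    return doubled, "i" in locals()  # second item checks that i is not a local here


print(leaky(3))  # 2
print(tight(3))  # ([0, 2, 4], False)
```

Extracting the loop is the closest Python gets to true loop scope; in block-scoped languages the compiler enforces the same boundary without the extra function.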
Technique 5: Group Related Statements
Keep statements working with same variables together:
❌ Bad (scattered):
    old_data = get_old_data()
    new_data = get_new_data()
    old_total = sum(old_data)
    new_total = sum(new_data)
    print_old_summary(old_data, old_total)
    print_new_summary(new_data, new_total)

Must track 4 variables simultaneously
✅ Good (grouped):
    # Group 1: Old data (track 2 variables)
    old_data = get_old_data()
    old_total = sum(old_data)
    print_old_summary(old_data, old_total)

    # Group 2: New data (track 2 variables)
    new_data = get_new_data()
    new_total = sum(new_data)
    print_new_summary(new_data, new_total)
Track only 2 variables at a time
Minimize Global/Class Variables
Globals have enormous scope, span, and live time - avoid them.
❌ Bad:
    total = 0  # Global

    def add_to_total(value):
        global total
        total += value  # Total lives forever, visible everywhere

    def get_total():
        global total
        return total
✅ Good:
    class Counter:
        def __init__(self):
            self._total = 0  # Private, encapsulated

        def add(self, value):
            self._total += value  # Scoped to class

        def get_total(self):
            return self._total
Even better (eliminate state if possible):
    def calculate_total(values):
        return sum(values)  # No state, no scope issues
Measuring Improvement
Calculate Span
Count lines between successive references:
    value = 10          # Reference 1
    line_a()            # 1 line between
    line_b()            # 2 lines between
    result = value * 2  # Reference 2
    # Span: 2 lines
Target: Average span < 5 lines
Calculate Live Time
Count lines from first to last reference (inclusive):
    count = 0   # Line 5 - first reference
    # ... code ...
    count += 1  # Line 42 - last reference
    # Live time: 42 - 5 + 1 = 38 lines
Target: Average live time < 10 lines
Global variables: live for the entire program run, so their live time is effectively unbounded (another reason to avoid them)
Common Mistakes
❌ All variables at top (C-style):
    def process():
        # Declare everything at top
        i = 0
        j = 0
        total = 0
        result = []
        temp = None

        # Use i 20 lines later
        for i in range(10):
            ...
✅ Declare where used:
    def process():
        # Use variables close to declaration
        for i in range(10):  # i introduced by the loop that uses it
            ...

        total = 0  # Declared right before use
        for item in items:
            total += item
❌ Wide scope when narrow would work:
    def calculate():
        result = 0  # Function scope
        if condition_a:
            result = calculate_a()  # Could be block-scoped
            print(result)
        # result accessible here but not used
        if condition_b:
            result = calculate_b()  # Reusing same variable
            print(result)
✅ Narrow scope:
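    def calculate():
        if condition_a:
            result = calculate_a()  # Assigned and used only inside this block
            print(result)
        # result is never referenced here
        # (block-scoped languages enforce this; Python relies on discipline)
        if condition_b:
            result = calculate_b()  # Fresh assignment, no stale value carried over
            print(result)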
❌ Long live time:
    index = 0  # Line 2
    # ... 49 lines of unrelated code ...
    while index < count:  # Line 52 - finally used
        index += 1        # Line 53 - last reference
    # Live time: 53 - 2 + 1 = 52 lines
✅ Short live time:
    # ... 50 lines of other code ...
    index = 0             # Line 52 - right before use
    while index < count:  # Line 53
        index += 1        # Line 54 - last reference
    # Live time: 54 - 52 + 1 = 3 lines
Practical Guidelines
1. Initialize at Declaration
Languages like C++, Java, Python, JavaScript support this:
    # ✅ Declare and initialize together
    user_count = 0
    active_users = get_active_users()
    total_revenue = calculate_revenue(orders)
2. Loop Variables in Loop Declaration
    # ✅ Loop variable introduced by the loop that uses it
    for user in users:
        process(user)  # user is used only here

    for item in items:
        handle(item)   # item is used only here
3. Initialize Before Loop, Not at Function Start
❌ Bad:
    def process_records():
        index = 0  # Line 2

        # 30 lines of other work...

        # Finally use index
        while index < record_count:  # Line 32
            index += 1
✅ Good:
    def process_records():
        # 30 lines of other work...

        # Initialize right before loop
        index = 0
        while index < record_count:
            index += 1
Why: If you later wrap this code in an outer loop, the initialization is already in the right place to run again on each pass.
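The effect of adding an outer loop can be seen in a small runnable sketch. Both functions below are hypothetical illustrations: the first initializes `index` at the top of the function, the second right before the loop it controls.

```python
def counts_init_at_top(batches):
    index = 0                      # initialized once, far from the loop
    counts = []
    for batch in batches:          # outer loop added later
        while index < len(batch):  # index keeps its value from the previous batch
            index += 1
        counts.append(index)
    return counts


def counts_init_before_loop(batches):
    counts = []
    for batch in batches:
        index = 0                  # re-initialized on every pass of the outer loop
        while index < len(batch):
            index += 1
        counts.append(index)
    return counts


batches = [["a", "b"], ["c"], []]
print(counts_init_at_top(batches))       # [2, 2, 2]  (stale index)
print(counts_init_before_loop(batches))  # [2, 1, 0]
```

The buggy version raises no error; it just reports the wrong counts, which is what makes misplaced initialization hard to spot.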
4. Extract Related Statements Into Routines
Long routines create large scope. Break into smaller routines:
    # ✅ Each routine has small scope
    def process_old_data():
        old_data = get_old_data()  # Scoped to this routine only
        old_total = sum(old_data)
        return old_total

    def process_new_data():
        new_data = get_new_data()  # Fresh variable, no collision
        new_total = sum(new_data)
        return new_total
Variables automatically die when routine exits.
5. Prefer Most Restricted Visibility
Hierarchy (most restricted to least):
1. Block/loop local (if language supports)
2. Function local
3. Private instance variable
4. Protected instance variable
5. Public instance variable
6. Module-level
7. Global
Start at #1, move down only if necessary.
Convenience vs Intellectual Manageability
Two philosophies:
Convenience Philosophy
"Make variables global so they're convenient to access anywhere. Don't fool around with parameter lists."
Problem: Easy to write, hard to read/maintain. Any routine can modify any variable. Must understand entire program to modify one part.
Intellectual Manageability Philosophy
"Keep variables as local as possible. Hide information. Minimize what you must think about at once."
Benefit: Harder to write (must think about scope), easier to read/maintain. Can understand one routine without knowing all others.
Code Complete's recommendation: Favor intellectual manageability. Code is read 10x more than written.
Example Transformation
❌ Before (wide scope, long live time):
    def summarize_data():
        # All variables at top with function scope
        old_data = None
        num_old = 0
        total_old = 0
        new_data = None
        num_new = 0
        total_new = 0

        old_data = get_old_data()  # Line 8
        num_old = len(old_data)
        total_old = sum(old_data)
        print_summary(old_data, total_old, num_old)  # Line 11
        save_summary(total_old, num_old)

        new_data = get_new_data()  # Line 14
        num_new = len(new_data)
        total_new = sum(new_data)
        print_summary(new_data, total_new, num_new)  # Line 17
        save_summary(total_new, num_new)
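Must track 6 variables throughout entire function. Because every variable is first assigned in the block at the top, its live time runs from there to its last use: roughly 10 lines for old_data, and longer for new_data.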
✅ After (narrow scope, short live time):
    def summarize_data():
        # Group 1: Old data (variables live only 4 lines)
        old_data = get_old_data()
        num_old = len(old_data)
        total_old = sum(old_data)
        print_summary(old_data, total_old, num_old)
        save_summary(total_old, num_old)
        # old_data, num_old, total_old mentally "die" here

        # Group 2: New data (fresh variables, 4 lines)
        new_data = get_new_data()
        num_new = len(new_data)
        total_new = sum(new_data)
        print_summary(new_data, total_new, num_new)
        save_summary(total_new, num_new)
Track 3 variables at a time. Each variable lives about 4 lines, from its assignment to its last use, so the mental load is reduced.
✅ Even better (extract to routines):
    def summarize_data():
        process_old_data()  # Variables scoped inside
        process_new_data()  # Variables scoped inside

    def process_old_data():
        # Variables live only in this routine (5 lines)
        old_data = get_old_data()
        num_old = len(old_data)
        total_old = sum(old_data)
        print_summary(old_data, total_old, num_old)
        save_summary(total_old, num_old)
        # Variables die when routine exits
Track 3 variables maximum. Variables automatically cleaned up.
Quick Reference
| Situation | Technique | Example |
|---|---|---|
| Loop variable | Declare in loop | `for i in range(count):` |
| Temporary calculation | Inline or immediate use | Assign right before the statement that uses it |
| Used in one block | Declare in that block | if-block variable stays in if-block |
| Used across function | Function-local only if necessary | Don't make it class/global |
| Shared across methods | Private instance variable | Not public unless necessary |
| Truly global | Access routine instead | Wrap in getter/setter |
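The last row of the table (wrap truly global data in an access routine) can be sketched in Python as a module-private value with a getter and a setter. The name `_max_retries` and both functions are hypothetical, introduced only for this example:

```python
# Shared value kept module-private by convention (leading underscore).
_max_retries = 3


def get_max_retries():
    """Read access goes through one routine."""
    return _max_retries


def set_max_retries(value):
    """Single place to validate every change to the shared value."""
    global _max_retries
    if not isinstance(value, int) or value < 0:
        raise ValueError("max_retries must be a non-negative integer")
    _max_retries = value


set_max_retries(5)
print(get_max_retries())  # 5
```

The data is still global in lifetime, but every write passes through one routine, so validation, logging, or a later change of representation has a single home.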
Common Patterns
Pattern 1: Loop Counters
    # ✅ Counter introduced by the loop itself
    for i in range(len(items)):
        process(items[i])
    # i is not referenced after the loop - good
    # (Python keeps i bound here; block-scoped languages remove it)

    # Can reuse i in another loop without collision
    for i in range(len(others)):
        process(others[i])
Pattern 2: Calculation Results
    # ✅ Calculate right before use
    def generate_report():
        # Other work...

        # Calculate only when needed, use immediately
        total_revenue = sum(order.total for order in orders)
        print(f"Total Revenue: ${total_revenue}")

        # Different calculation later
        active_user_count = len([u for u in users if u.is_active])
        print(f"Active Users: {active_user_count}")
Pattern 3: Temporary Values
    # ✅ Temporary lives only 2-3 lines
    def swap_values(arr, i, j):
        temp = arr[i]  # Temporary variable
        arr[i] = arr[j]
        arr[j] = temp
        # temp used and done (live time: 3 lines)
Pattern 4: Iteration State
    # ✅ State variables near the loop they control
    def process_until_done():
        # Other work...

        # Declare state right before loop
        done = False
        attempts = 0
        while not done and attempts < max_attempts:
            done = try_process()
            attempts += 1
Measuring Your Code
Calculate average span and live time for a function:
    def example():
        a = 0      # Line 2, ref 1
        b = 0      # Line 3
        c = 0      # Line 4
        a = b + c  # Line 5, ref 2 of a
        d = a * 2  # Line 6, ref 3 of a

    # Spans of a: 2 lines (between lines 2 and 5) + 0 lines (between lines 5 and 6), average 1
    # Live time of a: 6 - 2 + 1 = 5 lines
Good metrics:
- Average span: < 5 lines
- Average live time: < 10 lines
If higher: Consider localizing more aggressively.
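Counting by hand gets tedious on real functions. The following is a rough sketch of automating the measurement with Python's standard `ast` module. It deliberately simplifies: it counts source lines rather than statements, merges same-named variables from different functions, and ignores function parameters and attributes. The function name `variable_metrics` is an assumption for this example.

```python
import ast
from collections import defaultdict


def variable_metrics(source):
    """Return {name: (live_time, average_span)} from the lines of Name nodes."""
    lines_by_name = defaultdict(set)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            lines_by_name[node.id].add(node.lineno)

    metrics = {}
    for name, line_set in lines_by_name.items():
        lines = sorted(line_set)
        # Span: lines strictly between successive references.
        gaps = [b - a - 1 for a, b in zip(lines, lines[1:])]
        live = lines[-1] - lines[0] + 1
        avg_span = sum(gaps) / len(gaps) if gaps else 0.0
        metrics[name] = (live, avg_span)
    return metrics


SAMPLE = """\
def example():
    a = 0
    b = 0
    c = 0
    a = b + c
    d = a * 2
"""

print(variable_metrics(SAMPLE)["a"])  # (5, 1.0)
```

Here `a` is referenced on lines 2, 5, and 6, giving a live time of 5 lines and spans of 2 and 0 lines. Treat the numbers as a prompt for review rather than a precise score.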
Benefits
1. Reduces Window of Vulnerability
Shorter live time = fewer lines where variable could be incorrectly modified:
    # Live time = 50 lines
    value = 0  # Line 1
    # ... 48 lines where value could be accidentally changed ...
    return value  # Line 50

vs.

    # Live time = 2 lines
    value = calculate()  # Line 49
    return value         # Line 50 - less can go wrong
2. Easier to Understand
Seeing declaration and usage together aids comprehension:
    # ✅ See both on one screen
    count = len(items)
    print(f"Processing {count} items")

    # ❌ Must scroll to see declaration
    # ... Line 1: count = 0
    # ... 50 lines later...
    # ... Line 51: print(f"Processing {count} items")  # What's count?
3. Reduces Initialization Errors
Variables initialized close to use are less likely to have stale values:
    # ✅ Initialized fresh each loop iteration
    for batch in batches:
        count = 0  # Reset for each batch
        for item in batch:
            count += 1
vs.
    # ❌ Might forget to reset
    count = 0  # Top of function
    for batch in batches:
        # Forgot to reset count - accumulates across batches!
        for item in batch:
            count += 1
4. Easier Refactoring
Short live time makes extracting to separate routine easier:
    # Related statements with short-lived variables
    # are easy to extract into their own routine
    old_data = get_old_data()  # Lines 10-13
    old_total = sum(old_data)
    print_summary(old_data, old_total)
    # → Extract to process_old_data() routine
Quick Checklist
For each variable, ask:
- Is this declared in smallest possible scope?
- Could it be loop-scoped instead of function-scoped?
- Could it be function-local instead of class-level?
- Is it initialized close to first use?
- Are all references to it close together?
- Could I extract this section into a routine to reduce scope further?
- If class/global variable: could it be passed as parameter instead?
If any answer is "yes, could be smaller" → localize it.
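The last checklist question (pass as a parameter instead of using a class or global variable) is often the cheapest fix. A minimal sketch with hypothetical names (`tax_rate`, `price_with_tax_global`, `price_with_tax`):

```python
# Before: the function silently depends on module-level state.
tax_rate = 0.2


def price_with_tax_global(price):
    return price * (1 + tax_rate)


# After: the dependency is an explicit parameter, local to each call.
def price_with_tax(price, rate):
    return price * (1 + rate)


print(price_with_tax(100, 0.25))  # 125.0
```

With the parameter version, a reader sees everything the function depends on in its signature, and a test can vary the rate without touching shared state.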
Real-World Impact
From Code Complete:
- Research shows shorter live times correlate with fewer errors
- Proximity aids comprehension
- Local scope prevents unintended side effects
- Baseline test: agent declared all variables at top (live time ~15-25 lines)
- Better practice: average live time < 10 lines
Key insight: The more you can hide, the less you must keep in mind. The less in mind, the fewer errors.
Integration with Other Skills
For initialization: See patterns in skills/designing-before-coding for thinking about data initialization early in design
For naming: See skills/naming-variables - short-lived local variables can have shorter names; longer-lived variables need more descriptive names