Claude-skill-registry gpui-performance
Performance optimization techniques for GPUI including rendering optimization, layout performance, memory management, and profiling strategies. Use when user needs to optimize GPUI application performance or debug performance issues.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/gpui-performance" ~/.claude/skills/majiayu000-claude-skill-registry-gpui-performance && rm -rf "$T"
manifest: skills/data/gpui-performance/SKILL.md
GPUI Performance Optimization
Metadata
This skill provides comprehensive guidance on optimizing GPUI applications for rendering performance, memory efficiency, and overall runtime speed.
Instructions
Rendering Optimization
Understanding the Render Cycle
State Change → cx.notify() → Render → Layout → Paint → Display
Key Points:
- Only call cx.notify() when state actually changes
- Minimize work in the render() method
- Cache expensive computations
- Reduce element count and nesting
Avoiding Unnecessary Renders
    // BAD: Renders on every frame
    impl MyComponent {
        fn start_animation(&mut self, cx: &mut ViewContext<Self>) {
            cx.spawn(|this, mut cx| async move {
                loop {
                    cx.update(|_, cx| cx.notify()).ok(); // Forces rerender!
                    Timer::after(Duration::from_millis(16)).await;
                }
            }).detach();
        }
    }

    // GOOD: Only render when state changes
    impl MyComponent {
        fn update_value(&mut self, new_value: i32, cx: &mut ViewContext<Self>) {
            if self.value != new_value {
                self.value = new_value;
                cx.notify(); // Only notify on actual change
            }
        }
    }
Optimize Subscription Updates
    // BAD: Always rerenders on model change
    let _subscription = cx.observe(&model, |_, _, cx| {
        cx.notify(); // Rerenders even if nothing relevant changed
    });

    // GOOD: Selective updates
    let _subscription = cx.observe(&model, |this, model, cx| {
        let data = model.read(cx);

        // Only rerender if relevant field changed
        if data.relevant_field != this.cached_field {
            this.cached_field = data.relevant_field.clone();
            cx.notify();
        }
    });
Memoization Pattern
    use std::cell::RefCell;
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    struct MemoizedComponent {
        model: Model<Data>,
        cached_result: RefCell<Option<(u64, String)>>, // (hash, result)
    }

    impl MemoizedComponent {
        fn expensive_computation(&self, cx: &ViewContext<Self>) -> String {
            let data = self.model.read(cx);

            // Calculate hash of input
            let mut hasher = DefaultHasher::new();
            data.relevant_fields.hash(&mut hasher);
            let hash = hasher.finish();

            // Return cached if unchanged
            if let Some((cached_hash, cached_result)) = &*self.cached_result.borrow() {
                if *cached_hash == hash {
                    return cached_result.clone();
                }
            }

            // Compute and cache
            let result = perform_expensive_computation(&data);
            *self.cached_result.borrow_mut() = Some((hash, result.clone()));
            result
        }
    }
Layout Performance
Minimize Layout Complexity
    // BAD: Deep nesting
    div()
        .flex()
        .child(
            div()
                .flex()
                .child(
                    div()
                        .flex()
                        .child(
                            div().child("Content")
                        )
                )
        )

    // GOOD: Flat structure
    div()
        .flex()
        .flex_col()
        .gap_4()
        .child("Header")
        .child("Content")
        .child("Footer")
Use Fixed Sizing When Possible
    // BETTER: Fixed sizes (no layout calculation)
    div()
        .w(px(200.))
        .h(px(100.))
        .child("Fixed size")

    // SLOWER: Dynamic sizing (requires layout calculation)
    div()
        .w_full()
        .h_full()
        .child("Dynamic size")
Avoid Layout Thrashing
    // BAD: Reading layout during render
    impl Render for BadComponent {
        fn render(&mut self, cx: &mut ViewContext<Self>) -> impl IntoElement {
            let width = cx.window_bounds().get_bounds().size.width;
            // Using width immediately causes layout thrashing
            div().w(width)
        }
    }

    // GOOD: Cache layout-dependent values
    struct GoodComponent {
        cached_width: Pixels,
    }

    impl GoodComponent {
        fn on_window_resize(&mut self, cx: &mut ViewContext<Self>) {
            let width = cx.window_bounds().get_bounds().size.width;
            if self.cached_width != width {
                self.cached_width = width;
                cx.notify();
            }
        }
    }
Virtual Scrolling for Long Lists
    struct VirtualList {
        items: Vec<String>,
        scroll_offset: f32,
        viewport_height: f32,
        item_height: f32,
    }

    impl Render for VirtualList {
        fn render(&mut self, cx: &mut ViewContext<Self>) -> impl IntoElement {
            // Calculate visible range; clamp both ends so a stale scroll
            // offset cannot index past the end of `items`
            let start_index = ((self.scroll_offset / self.item_height).floor() as usize)
                .min(self.items.len());
            let visible_count = (self.viewport_height / self.item_height).ceil() as usize;
            let end_index = (start_index + visible_count).min(self.items.len());

            // Only render visible items
            div()
                .h(px(self.viewport_height))
                .overflow_y_scroll()
                .on_scroll(cx.listener(|this, event, cx| {
                    this.scroll_offset = event.scroll_offset.y;
                    cx.notify();
                }))
                .child(
                    // Spacer sized to the full list so the scrollbar stays proportional
                    div()
                        .h(px(self.items.len() as f32 * self.item_height))
                        .child(
                            div()
                                .absolute()
                                .top(px(start_index as f32 * self.item_height))
                                .children(
                                    self.items[start_index..end_index]
                                        .iter()
                                        .map(|item| {
                                            div()
                                                .h(px(self.item_height))
                                                .child(item.as_str())
                                        })
                                )
                        )
                )
        }
    }
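The visible-range arithmetic can be pulled out of render() and tested without any UI. The following is a minimal std-only sketch, not a GPUI API: the function name `visible_range` and the `overscan` parameter (extra rows rendered above and below the viewport to hide pop-in during fast scrolling) are assumptions introduced for illustration.

```rust
use std::ops::Range;

/// Returns the index range of rows that intersect the viewport,
/// padded by `overscan` rows on each side.
/// Hypothetical helper: the name and parameters are illustrative only.
fn visible_range(
    scroll_offset: f32,
    viewport_height: f32,
    item_height: f32,
    item_count: usize,
    overscan: usize,
) -> Range<usize> {
    if item_count == 0 || item_height <= 0.0 {
        return 0..0;
    }
    // Negative offsets (for example from overscroll) clamp to the first row.
    let first = (scroll_offset.max(0.0) / item_height).floor() as usize;
    // One extra row covers the partially visible row at the bottom edge.
    let visible = (viewport_height.max(0.0) / item_height).ceil() as usize + 1;

    // Both ends are clamped to the item count so slicing never panics.
    let start = first.saturating_sub(overscan).min(item_count);
    let end = first
        .saturating_add(visible)
        .saturating_add(overscan)
        .min(item_count);
    start..end
}
```

A list view could then slice its items with the returned range; an offset far beyond the end of the list yields an empty range instead of a panic.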
Memory Management
Preventing Memory Leaks
    // BAD: Subscription not stored
    impl BadView {
        fn new(model: Model<Data>, cx: &mut ViewContext<Self>) -> Self {
            // The returned Subscription is dropped at once. GPUI cancels a
            // subscription when its handle is dropped, so this observer never runs.
            cx.observe(&model, |_, _, cx| cx.notify());
            Self { model }
        }
    }

    // CORRECT: Store subscription
    struct GoodView {
        model: Model<Data>,
        _subscription: Subscription, // Lives as long as the view, cleaned up on Drop
    }

    impl GoodView {
        fn new(model: Model<Data>, cx: &mut ViewContext<Self>) -> Self {
            let _subscription = cx.observe(&model, |_, _, cx| cx.notify());
            Self { model, _subscription }
        }
    }
Avoid Circular References
    // BAD: Circular reference
    struct CircularRef {
        self_view: Option<View<Self>>, // Circular!
    }

    // GOOD: Use weak references or redesign
    struct NoCycle {
        other_view: View<OtherView>, // No cycle
    }
Bounded Collections
    use std::collections::VecDeque;

    const MAX_HISTORY: usize = 100;

    struct BoundedHistory {
        items: VecDeque<Item>,
    }

    impl BoundedHistory {
        fn add_item(&mut self, item: Item) {
            self.items.push_back(item);

            // Maintain size limit
            while self.items.len() > MAX_HISTORY {
                self.items.pop_front();
            }
        }
    }
Reuse Allocations
    use std::fmt::Write;

    struct BufferedComponent {
        buffer: String, // Reused across operations
    }

    impl BufferedComponent {
        fn format_data(&mut self, data: &[Item]) -> &str {
            self.buffer.clear(); // Keeps capacity, so the allocation is reused

            for item in data {
                writeln!(&mut self.buffer, "{}", item.name).ok();
            }

            &self.buffer
        }
    }
Profiling Strategies
CPU Profiling with cargo-flamegraph
    # Install
    cargo install flamegraph

    # Profile application
    cargo flamegraph --bin your-app

    # With specific features
    cargo flamegraph --bin your-app --features profiling

    # Writes flamegraph.svg (CPU time distribution) to the current directory
Memory Profiling
    # valgrind (Linux)
    valgrind --tool=massif --massif-out-file=massif.out ./target/release/your-app
    ms_print massif.out

    # heaptrack (Linux)
    heaptrack ./target/release/your-app
    heaptrack_gui heaptrack.your-app.*.gz

    # Instruments (macOS)
    instruments -t "Allocations" ./target/release/your-app
Custom Performance Monitoring
    use std::collections::VecDeque;
    use std::time::{Duration, Instant};

    struct PerformanceMonitor {
        frame_times: VecDeque<Duration>,
        max_samples: usize,
    }

    impl PerformanceMonitor {
        fn new() -> Self {
            Self {
                frame_times: VecDeque::with_capacity(100),
                max_samples: 100,
            }
        }

        fn record_frame(&mut self, duration: Duration) {
            self.frame_times.push_back(duration);
            if self.frame_times.len() > self.max_samples {
                self.frame_times.pop_front();
            }

            // Warn if frame is slow (> 16ms for 60fps)
            if duration.as_millis() > 16 {
                eprintln!("⚠️ Slow frame: {}ms", duration.as_millis());
            }
        }

        fn average_fps(&self) -> f64 {
            if self.frame_times.is_empty() {
                return 0.0;
            }
            let total: Duration = self.frame_times.iter().sum();
            // Work in fractional seconds: whole milliseconds would truncate
            // and divide by zero for sub-millisecond averages
            let avg_secs = total.as_secs_f64() / self.frame_times.len() as f64;
            if avg_secs > 0.0 { 1.0 / avg_secs } else { 0.0 }
        }

        fn percentile(&self, p: f64) -> Duration {
            // No samples yet: report zero instead of indexing an empty Vec
            if self.frame_times.is_empty() {
                return Duration::ZERO;
            }
            let mut sorted: Vec<_> = self.frame_times.iter().copied().collect();
            sorted.sort();
            let index = (sorted.len() as f64 * p) as usize;
            sorted[index.min(sorted.len() - 1)]
        }
    }

    // Usage in component
    impl MyView {
        fn measure_render<F>(&mut self, f: F, cx: &mut ViewContext<Self>)
        where
            F: FnOnce(&mut Self, &mut ViewContext<Self>),
        {
            let start = Instant::now();
            f(self, cx);
            let elapsed = start.elapsed();

            self.perf_monitor.record_frame(elapsed);
            self.frame_count += 1;

            // Log stats periodically
            if self.frame_count % 60 == 0 {
                println!(
                    "Avg FPS: {:.1}, p95: {}ms, p99: {}ms",
                    self.perf_monitor.average_fps(),
                    self.perf_monitor.percentile(0.95).as_millis(),
                    self.perf_monitor.percentile(0.99).as_millis(),
                );
            }
        }
    }
Benchmark with Criterion
    // benches/component_bench.rs
    use criterion::{black_box, criterion_group, criterion_main, Criterion, BenchmarkId};

    fn render_benchmark(c: &mut Criterion) {
        let mut group = c.benchmark_group("rendering");

        for size in [10, 100, 1000].iter() {
            group.bench_with_input(
                BenchmarkId::from_parameter(size),
                size,
                |b, &size| {
                    b.iter(|| {
                        App::test(|cx| {
                            let items = vec![Item::default(); size];
                            let view = cx.new_view(|cx| {
                                ListView::new(items, cx)
                            });

                            view.update(cx, |view, cx| {
                                black_box(view.render(cx));
                            });
                        });
                    });
                }
            );
        }

        group.finish();
    }

    criterion_group!(benches, render_benchmark);
    criterion_main!(benches);
Batching Updates
    // BAD: Multiple individual updates
    for item in items {
        self.model.update(cx, |model, cx| {
            model.add_item(item); // Triggers rerender each time!
            cx.notify();
        });
    }

    // GOOD: Batch into single update
    self.model.update(cx, |model, cx| {
        for item in items {
            model.add_item(item);
        }
        cx.notify(); // Single rerender
    });
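The cost difference between the two patterns can be shown without a UI framework. The sketch below uses a hypothetical `CountingModel` that counts notifications instead of scheduling renders; it is a stand-in for illustration, not a GPUI type, and assumes each notification would otherwise trigger one render.

```rust
/// Minimal stand-in for an observable model. It counts notifications
/// instead of scheduling renders. Hypothetical type, not part of GPUI.
#[derive(Default)]
struct CountingModel {
    items: Vec<u32>,
    notifications: usize,
}

impl CountingModel {
    fn notify(&mut self) {
        self.notifications += 1;
    }

    /// Unbatched: one notification per inserted item.
    fn add_each(&mut self, items: &[u32]) {
        for &item in items {
            self.items.push(item);
            self.notify();
        }
    }

    /// Batched: one notification per call, none for an empty batch.
    fn add_batch(&mut self, items: &[u32]) {
        if items.is_empty() {
            return;
        }
        self.items.extend_from_slice(items);
        self.notify();
    }
}
```

Adding 100 items through `add_each` records 100 notifications, while `add_batch` records one; skipping the notification for an empty batch also avoids a render when nothing changed.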
Async Rendering Optimization
    struct AsyncView {
        loading_state: Model<LoadingState>,
    }

    impl AsyncView {
        fn load_data(&mut self, cx: &mut ViewContext<Self>) {
            let loading_state = self.loading_state.clone();

            // Show loading immediately
            self.loading_state.update(cx, |state, cx| {
                *state = LoadingState::Loading;
                cx.notify();
            });

            // Load asynchronously
            cx.spawn(|_, mut cx| async move {
                // Fetch data
                let data = fetch_data().await?;

                // Update state once
                cx.update_model(&loading_state, |state, cx| {
                    *state = LoadingState::Loaded(data);
                    cx.notify();
                })?;

                Ok::<_, anyhow::Error>(())
            }).detach();
        }
    }
Caching Strategies
Result Caching
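    use std::cell::RefCell;
    use std::collections::HashMap;

    // `CachedElement` is an application-defined wrapper that is not shown here;
    // its `element` field must be cheap to clone for this pattern to pay off.
    struct CachedRenderer {
        cache: RefCell<HashMap<String, CachedElement>>,
    }

    impl CachedRenderer {
        fn render_cached(
            &self,
            key: String,
            render_fn: impl FnOnce() -> AnyElement,
        ) -> AnyElement {
            let mut cache = self.cache.borrow_mut();

            // Build the element only on a cache miss
            cache.entry(key)
                .or_insert_with(|| CachedElement::new(render_fn()))
                .element
                .clone()
        }

        fn invalidate(&self, key: &str) {
            self.cache.borrow_mut().remove(key);
        }
    }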
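The same compute-on-miss and invalidate-by-key pattern can be exercised with plain values. This is a minimal std-only sketch under stated assumptions: `ResultCache`, `get_or_compute`, and the `misses` counter are hypothetical names introduced here, and the cached value is any `Clone` type rather than a GPUI element.

```rust
use std::collections::HashMap;

/// Keyed cache that computes a value at most once per key until that key
/// is invalidated. Hypothetical helper for illustration only.
struct ResultCache<V> {
    entries: HashMap<String, V>,
    /// Number of times `compute` actually ran.
    misses: usize,
}

impl<V: Clone> ResultCache<V> {
    fn new() -> Self {
        Self { entries: HashMap::new(), misses: 0 }
    }

    /// Returns the cached value for `key`, running `compute` only on a miss.
    fn get_or_compute(&mut self, key: &str, compute: impl FnOnce() -> V) -> V {
        if let Some(value) = self.entries.get(key) {
            return value.clone();
        }
        self.misses += 1;
        let value = compute();
        self.entries.insert(key.to_owned(), value.clone());
        value
    }

    /// Drops the entry for `key`; returns whether an entry existed.
    fn invalidate(&mut self, key: &str) -> bool {
        self.entries.remove(key).is_some()
    }
}
```

Tracking misses makes it easy to confirm in a test that a second lookup with the same key does not recompute, and that invalidation forces exactly one recomputation.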
Resources
Performance Targets
Rendering:
- Target: 60 FPS (16.67ms per frame)
- Render + Layout: ~10ms
- Paint: ~6ms
- Warning: Any frame > 16ms
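The 16.67ms figure is simply one second divided by the refresh rate. As a rough sketch of that arithmetic, the helpers below (hypothetical names `frame_budget` and `over_budget_ratio`, not part of any library) derive the budget for a target rate and report what share of recorded frames exceeded it.

```rust
use std::time::Duration;

/// Time available per frame at the given refresh rate.
/// A rate of 0 is treated as 1 FPS to avoid dividing by zero.
fn frame_budget(fps: u32) -> Duration {
    Duration::from_secs_f64(1.0 / f64::from(fps.max(1)))
}

/// Fraction of frames (0.0 to 1.0) that took longer than `budget`.
fn over_budget_ratio(frames: &[Duration], budget: Duration) -> f64 {
    if frames.is_empty() {
        return 0.0;
    }
    let slow = frames.iter().filter(|frame| **frame > budget).count();
    slow as f64 / frames.len() as f64
}
```

A ratio is often more useful than a single slow-frame warning, because one long frame during startup matters less than a steady share of frames missing the budget.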
Memory:
- Monitor heap growth
- Warning: Steady increase (leak)
- Target: Stable after initialization
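One lightweight way to watch heap growth from inside the process, when an external profiler is not convenient, is to wrap the system allocator and keep a running byte count. The sketch below uses only the standard library; `CountingAllocator` and `live_bytes` are hypothetical names, and the approach is an assumption of this example rather than something GPUI provides.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Bytes currently allocated through the global allocator.
static LIVE_BYTES: AtomicUsize = AtomicUsize::new(0);

/// Wraps the system allocator and tracks live bytes.
/// Hypothetical helper for illustration only.
struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            LIVE_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        LIVE_BYTES.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Snapshot of the live heap size in bytes.
fn live_bytes() -> usize {
    LIVE_BYTES.load(Ordering::Relaxed)
}
```

Sampling `live_bytes()` at a fixed interval after initialization and logging the trend gives a quick signal: a value that keeps climbing while the application is idle suggests a leak worth investigating with heaptrack or Instruments.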
Startup:
- Window display: < 100ms
- Fully interactive: < 500ms
Profiling Tools
CPU Profiling:
- cargo-flamegraph: Visualize CPU time
- perf (Linux): System-level profiling
- Instruments (macOS): Apple's profiler
Memory Profiling:
- valgrind/massif: Memory usage tracking
- heaptrack: Heap allocation tracking
- Instruments: Memory allocations
Benchmarking:
- criterion: Statistical benchmarking
- cargo bench: Built-in benchmarks
- hyperfine: Command-line tool benchmarking
Best Practices
- Measure First: Profile before optimizing
- Minimize Renders: Only call cx.notify() when necessary
- Cache Results: Memoize expensive computations
- Batch Updates: Group state changes
- Virtual Scrolling: For long lists
- Flat Layouts: Avoid deep nesting
- Fixed Sizing: When possible
- Monitor Memory: Watch for leaks
- Async Loading: Don't block UI
- Test Performance: Include benchmarks
Common Bottlenecks
- Subscription in render (memory leak)
- Expensive computation in render
- Deep component nesting
- Unnecessary rerenders
- Layout thrashing
- Large lists without virtualization
- Memory leaks from circular refs
- Unbounded collections