Summary
After the new single-pass compiler is functional (ENG-9142, ENG-9143, ENG-9144), profile it against the current compiler to identify and fix any performance regressions, then optimize low-hanging fruit in the hot paths.
Prior Art: Existing CodSpeed Benchmark Suite
There is already a pytest-codspeed benchmark suite in tests/benchmarks/ that covers parts of the compilation pipeline. Understanding what it already tests — and what it doesn't — is essential for scoping this work.
What exists today
tests/benchmarks/test_compilation.py benchmarks three things:
test_compile_page — times _compile_page(evaluated_page), which is the final step of rendering an already-evaluated component tree to JS (calling _get_all_imports, _get_all_dynamic_imports, _get_all_custom_code, _get_all_hooks, and component.render())
test_compile_stateful — times _compile_stateful_components([evaluated_page]), which runs the StatefulComponent memoization and shared component extraction
test_get_all_imports — times evaluated_page._get_all_imports() in isolation
tests/benchmarks/test_evaluate.py benchmarks:
test_evaluate_page — times calling the page function itself (component tree construction)
tests/benchmarks/fixtures.py provides two representative page fixtures:
_complicated_page — a sidebar-heavy layout with accordion navigation, ~50 link items across categories, using frozen dataclasses and map() for component generation. Exercises deeply nested component trees with many children.
_stateful_page — a page with rx.cond, rx.match, rx.foreach (including nested foreach), state var references, and event handlers. Exercises the stateful component path.
Both fixtures are parametrized so each benchmark runs against both page types.
What the existing suite does NOT cover
The existing benchmarks focus on individual compilation functions in isolation — they don't measure the full end-to-end pipeline as orchestrated by App._compile(). Specifically, these are not benchmarked today:
- Page evaluation + style application: compile_unevaluated_page() calls into_component() then _add_style_recursive(). The test_evaluate_page benchmark only covers the first part.
- Full pipeline orchestration: the overhead of App._compile() itself (iterating pages, executor setup, progress tracking, file writing, etc.).
- Multi-page compilation: how compile time scales with 5, 10, 20+ pages (especially with shared components across pages).
- Import merging/collapsing: merge_imports() and collapse_imports() in reflex/utils/imports.py are called on every page's accumulated imports but aren't benchmarked independently.
- Plugin dispatch overhead (new): the async generator machinery and CompilerHooks._dispatch() cost per component, which is new to the single-pass architecture.
Goal: extend the suite, don't replace it
The existing CodSpeed benchmarks should be preserved and adapted to benchmark the equivalent new-compiler functions. This gives us direct before/after comparisons on the same fixtures. New benchmarks should be added to cover the gaps listed above.
Tasks
1. Adapt existing benchmarks for the new compiler
The current benchmarks call _compile_page() and _compile_stateful_components() directly. After the new compiler lands:
test_compile_page — Add a parallel benchmark that compiles the same evaluated_page through the new plugin pipeline (i.e., run the full CompileContext.compile() for a single page). Keep the old benchmark temporarily for A/B comparison.
test_compile_stateful — This may no longer apply if StatefulComponent is replaced (ENG-9145). Replace with a benchmark for the new MemoizeStatefulPlugin if applicable.
test_get_all_imports — Add a parallel benchmark that runs ConsolidateImportsPlugin on the same page to compare single-pass import collection vs the recursive _get_all_imports() walk.
2. Add new benchmarks for gaps in coverage
Add these to tests/benchmarks/:
test_compile_pipeline — End-to-end: construct a CompileContext with both fixture pages plus additional pages sharing the side_bar() component, and time compile_ctx.compile(). This measures the full orchestrated pipeline including plugin dispatch.
test_plugin_dispatch_overhead — Micro-benchmark: time CompilerHooks._dispatch("compile_component", comp) for a single component with the default plugin set. This isolates the async generator dispatch cost.
test_style_application — Time _add_style_recursive() (or its plugin replacement ApplyStylePlugin) on the _complicated_page fixture. This is currently unmeasured.
test_import_merging — Time merge_imports() and collapse_imports() on a realistic set of import dicts collected from the _complicated_page fixture.
test_multi_page_scaling — Time compilation of 1, 5, 10, 20 pages (reusing the same page fixtures) to characterize scaling behavior.
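The multi-page scaling benchmark could be shaped roughly as follows. This is a sketch under assumptions: compile_pages, make_pages and run_scaling_benchmark are hypothetical stand-ins rather than Reflex APIs, and benchmark is assumed to be the callable fixture from pytest-codspeed, invoked as benchmark(fn, *args).

```python
# Page counts taken from the task description above.
PAGE_COUNTS = [1, 5, 10, 20]


def make_pages(n_pages, fixtures):
    """Build n_pages page functions by cycling through the fixture pages."""
    return [fixtures[i % len(fixtures)] for i in range(n_pages)]


def compile_pages(pages):
    """Hypothetical stand-in for the real compile entry point.

    Here it only calls each page function; the real benchmark would run
    the new compiler over the pages instead.
    """
    return [page() for page in pages]


def run_scaling_benchmark(benchmark, n_pages, fixtures):
    """Time compilation of n_pages pages with the supplied benchmark callable."""
    pages = make_pages(n_pages, fixtures)  # setup stays outside the timed call
    return benchmark(compile_pages, pages)
```

Under pytest this would become a test_multi_page_scaling(benchmark, n_pages) function parametrized over PAGE_COUNTS, with the real compile entry point swapped in for compile_pages and the fixture pages from tests/benchmarks/fixtures.py passed as fixtures.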
3. Profile the new compilation pipeline
Use cProfile / py-spy / scalene to identify hot paths beyond what CodSpeed measures. Known areas to watch:
- Plugin dispatch overhead: CompilerHooks._dispatch() creates generators for each plugin per component. For a tree with 1000 components and 8 plugins, that's 8000 generator creations. Measure whether this is material.
- Async generator overhead: Each compile_component hook is an async generator with yield / asend. Even 10 μs of overhead per component comes to 10 ms for a 1000-component tree, so it adds up at scale.
- ContextVar.get() calls: Plugins call PageContext.get() and CompileContext.get() frequently. ContextVar lookups are fast but not free.
- Import merging: merge_imports() and collapse_imports() in reflex/utils/imports.py are called frequently. Profile to see if they're a bottleneck.
- Component rendering: component.render() is likely the most expensive single operation per component.
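To get a first feel for the generator numbers above before profiling the real pipeline, a stand-alone measurement along these lines may help. It is only a sketch: noop_hook and dispatch are hypothetical stand-ins that assume each hook yields once and is resumed with asend, which may not match the actual CompilerHooks._dispatch() protocol.

```python
import asyncio
import time


async def noop_hook(comp):
    """Hypothetical no-op plugin hook: yields once, then finishes."""
    _ = yield comp


async def dispatch(hooks, comp):
    """Stand-in dispatcher: start every hook, then resume each one."""
    gens = [hook(comp) for hook in hooks]  # one async generator per plugin
    for gen in gens:
        await gen.asend(None)  # run each hook up to its first yield
    for gen in reversed(gens):
        try:
            await gen.asend(comp)  # resume; the hook body then returns
        except StopAsyncIteration:
            pass
    return len(gens)


async def measure(n_components=1000, n_plugins=8):
    """Return (generators created, average seconds per component)."""
    hooks = [noop_hook] * n_plugins
    created = 0
    start = time.perf_counter()
    for comp in range(n_components):
        created += await dispatch(hooks, comp)
    elapsed = time.perf_counter() - start
    return created, elapsed / n_components


created, per_component = asyncio.run(measure())
print(f"{created} generators, {per_component * 1e6:.2f} us per component")
```

The printed per-component figure is a lower bound on dispatch cost, since the hooks do no work. If it is already near the 10 μs mark on the target machine, the skip and batch ideas in the next section become more attractive.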
4. Optimize low-hanging fruit
Based on profiling, expected optimizations include:
- Skip no-op plugins: If a plugin's compile_component is the default (base protocol method), don't dispatch to it. The demo code already checks for this.
- Batch plugin dispatch: Instead of creating individual generators per plugin, consider a combined dispatch that reduces generator overhead.
- Cache component renders: If a component is immutable (ENG-9146), cache its render() output.
- Reduce dict/set operations: The consolidation plugins accumulate into dicts/sets on every component. Consider batched approaches or pre-allocated structures.
- Minimize isinstance checks: Several plugins check isinstance(comp, Component); consider a pre-computed flag or tag.
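The "skip no-op plugins" idea can be illustrated with a small sketch. The class names here (Plugin, StylePlugin, PassivePlugin) and the active_plugins helper are hypothetical and only assume that plugins inherit a default compile_component from a common base; the demo code's actual check may look different.

```python
class Plugin:
    """Hypothetical base class whose compile_component is a no-op."""

    def compile_component(self, comp):
        return comp


class StylePlugin(Plugin):
    """Overrides the hook, so it must be dispatched to."""

    def compile_component(self, comp):
        return {**comp, "styled": True}


class PassivePlugin(Plugin):
    """Inherits the default hook, so dispatching to it would be wasted work."""


def active_plugins(plugins):
    """Keep only plugins whose class overrides the base compile_component."""
    return [
        plugin
        for plugin in plugins
        if type(plugin).compile_component is not Plugin.compile_component
    ]


plugins = [StylePlugin(), PassivePlugin()]
to_dispatch = active_plugins(plugins)  # computed once, not per component
```

Doing the filtering once when the plugin list is set up, rather than per component, keeps the check itself out of the hot path.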
5. Compare against baseline
Run the CodSpeed suite (both old and new benchmarks) with the old compiler and the new compiler, and document:
- Per-benchmark time comparison
- Compilation time comparison on realistic apps
- Memory usage comparison
- Per-page time distribution
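For the memory usage comparison, one possible approach is the standard library's tracemalloc module. The sketch below assumes the old and new compilers can each be driven through a single callable; old_compile and new_compile are hypothetical stand-ins, not Reflex functions.

```python
import tracemalloc


def peak_memory(fn, *args):
    """Return (result, peak_bytes) for one call of fn, traced with tracemalloc."""
    tracemalloc.start()
    try:
        result = fn(*args)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def old_compile(n_components):
    """Hypothetical stand-in for the current compiler."""
    return [f"component-{i}" for i in range(n_components)]


def new_compile(n_components):
    """Hypothetical stand-in for the single-pass compiler."""
    return [f"component-{i}" for i in range(n_components)]


_, old_peak = peak_memory(old_compile, 1000)
_, new_peak = peak_memory(new_compile, 1000)
print(f"old peak: {old_peak} B, new peak: {new_peak} B")
```

tracemalloc only sees allocations made through Python's allocator and slows execution noticeably, so it is better suited to a separate memory run than to the timed CodSpeed benchmarks.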
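Acceptance Criteria
- … tests/benchmarks/test_compilation.py have parallel versions for the new compiler pipeline
- … tests/benchmarks/fixtures.py (extend if needed, don't replace)
- … test_compile_page and test_get_all_imports benchmarks vs the old compiler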
Key Files
tests/benchmarks/test_compilation.py — existing CodSpeed benchmarks to extend
tests/benchmarks/test_evaluate.py — existing page evaluation benchmark
tests/benchmarks/fixtures.py — _complicated_page, _stateful_page, SideBarState, BenchmarkState fixtures
tests/benchmarks/conftest.py — fixture wiring
reflex/compiler/compiler.py — compilation functions (old and new)
reflex/compiler/plugins.py (new) — plugin implementations
reflex/components/component.py — render(), _get_imports(), etc.
reflex/utils/imports.py — merge_imports(), collapse_imports()
Notes
- CodSpeed runs in CI and provides automatic regression detection on PRs. The new benchmarks will inherit this behavior, giving us ongoing protection against compile-time regressions.
- The benchmark suite should eventually be extended with a "real app" fixture (e.g., a subset of the reflex-web docs site), but the existing synthetic fixtures are a good starting point.
- Performance of the compiler disproportionately affects developer experience (hot reload speed), so even small improvements matter.
- Keep in mind that the compiler is almost entirely CPU-bound Python. The biggest wins will come from doing less work (fewer traversals, more caching) rather than from parallelism.