Single-pass compiler: Implement selective page recompilation and caching #6216

@masenf

Summary

After the single-pass compiler works correctly (ENG-9144) and Component immutability is in place (ENG-9146), implement caching at the page and component level so that unchanged pages are not recompiled during hot reload.

Depends on ENG-9144, ENG-9146, ENG-9147.

Background

Today, every hot reload recompiles every page from scratch. The App._compile() method has a basic staleness check (_should_compile()) that can skip the entire frontend compile when nothing changed, but there's no granularity — it's all-or-nothing.

The current code already has hints of this ambition:

  • compiler_utils.write_file() checks if file content is identical before writing (avoids unnecessary FS writes)
  • Dev mode skips StatefulComponent shared extraction and page purging

But none of these skip the expensive part: evaluating page functions, walking component trees, and rendering to JS.

Why id()-based caching is insufficient

A naive approach would be to cache based on id(component) — if the same object appears again, reuse the cached output. However, in practice most Reflex apps use helper functions that recreate component trees on every call:

def navbar():
    return rx.hstack(
        rx.link("Home", href="/"),
        rx.link("About", href="/about"),
    )

def index():
    return rx.vstack(navbar(), rx.text("Welcome"))  # navbar() returns a NEW tree every time

def about():
    return rx.vstack(navbar(), rx.text("About"))    # another NEW tree, different id()

Every invocation of navbar() produces fresh Component instances with new id() values, even though the resulting tree is structurally identical. Since this pattern is extremely common (shared layouts, sidebars, headers, footers via helper functions), id()-based caching would have near-zero hit rates for the components that matter most.
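
The miss rate described above can be reproduced with a toy model. This is only a sketch: `Node` is a hypothetical stand-in for a Reflex Component, and the cache is a plain dict keyed on `id()`.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False keeps the default identity-based equality
class Node:
    """Hypothetical stand-in for a Reflex Component."""
    tag: str
    children: tuple = ()


def navbar() -> Node:
    # Builds a brand-new tree on every call, like the helper above.
    return Node("hstack", (Node("link"), Node("link")))


id_cache: dict[int, str] = {}
hits = 0
trees = [navbar(), navbar()]  # keep both alive so their ids are distinct
for tree in trees:
    if id(tree) in id_cache:
        hits += 1
    else:
        id_cache[id(tree)] = f"<{tree.tag}/>"

print(hits)  # 0: two distinct live objects never share an id()
```

Both trees describe the same markup, yet the cache ends up with two entries and no hits.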

Proposed Approach: Structural content-based caching

1. Structural hash for components

Implement a content-based hash on Component that captures the structural identity of a subtree — its type, props, and children — without requiring the expensive render() step:

from functools import cached_property

class Component:
    @cached_property
    def _structural_hash(self) -> int:
        """Hash based on component type, props, and children's structural hashes.

        Two component trees that would produce identical render() output
        will have the same structural hash.
        """
        return hash((
            type(self),
            self._get_props_tuple(),      # frozen/hashable representation of props
            tuple(child._structural_hash for child in self.children),
        ))

With immutable components (ENG-9146), this hash can be computed once per instance and stored via cached_property. It should be much cheaper than render() because it builds no JS strings: it only hashes the component type, the props tuple, and the children's already-cached hashes.
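
To make the idea concrete, here is a minimal runnable sketch. It does not use the real Component class: `Node`, its `props` tuple of (name, value) pairs, and the public `structural_hash` name are all assumptions made for illustration.

```python
from dataclasses import dataclass
from functools import cached_property


@dataclass(frozen=True)
class Node:
    """Hypothetical immutable stand-in for a Reflex Component."""
    tag: str
    props: tuple = ()      # (name, value) pairs, all hashable
    children: tuple = ()

    @cached_property
    def structural_hash(self) -> int:
        # cached_property stores its result in the instance __dict__, which
        # works on a frozen dataclass as long as it does not use slots.
        return hash((
            self.tag,
            self.props,
            tuple(child.structural_hash for child in self.children),
        ))


def navbar() -> Node:
    return Node("hstack", children=(
        Node("link", (("href", "/"),)),
        Node("link", (("href", "/about"),)),
    ))


a, b = navbar(), navbar()
print(a is b)                                  # False
print(a.structural_hash == b.structural_hash)  # True
```

One caveat worth keeping in mind: CPython salts str hashes per process by default, so values produced by the built-in hash() are only comparable within a single process.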

2. Component-level render cache (within a compilation run)

Cache render() output keyed on structural hash. When the same navbar() tree appears on multiple pages, it's rendered once:

class RenderCachePlugin(CompilerPlugin):
    """Cache render() output for structurally identical components."""
    
    _render_cache: dict[int, str] = {}  # structural_hash -> rendered JS
    
    async def compile_component(self, comp):
        comp, children = yield
        if not isinstance(comp, Component):
            yield
            return
        
        h = comp._structural_hash
        if h in self._render_cache:
            # Reuse cached render output
            ...
        else:
            rendered = comp.render()
            self._render_cache[h] = rendered
            ...
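
As a rough illustration of the saving, the sketch below memoizes a toy render function by structural hash. It is not the plugin API: `Node`, `render`, and the counters are hypothetical, and the frozen dataclass's generated `__hash__` plays the role of `_structural_hash`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a Reflex Component."""
    tag: str
    children: tuple = ()


render_cache: dict[int, str] = {}
stats = {"renders": 0}


def render(node: Node) -> str:
    """Render a subtree, reusing output for structurally equal subtrees."""
    key = hash(node)  # derived from tag and children by the frozen dataclass
    if key in render_cache:
        return render_cache[key]
    stats["renders"] += 1
    inner = "".join(render(child) for child in node.children)
    out = f"<{node.tag}>{inner}</{node.tag}>"
    render_cache[key] = out
    return out


def navbar() -> Node:
    return Node("nav", (Node("a"), Node("a")))


index = Node("main", (navbar(), Node("h1")))
about = Node("main", (navbar(), Node("h2")))
index_js = render(index)
about_js = render(about)
print(stats["renders"])  # 6 instead of 10 node renders without the cache
```

The second `a` link and the whole second `nav` subtree are served from the cache, which is where the four saved renders come from.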

3. Component-level metadata cache (imports, hooks, custom code)

Similarly, the per-component _get_imports(), _get_hooks(), _get_custom_code() results can be cached by structural hash. If two rx.button("Click", on_click=...) instances are structurally identical, their imports are identical too:

# In ConsolidateImportsPlugin:
h = comp._structural_hash
if h in self._imports_cache:
    page_ctx.imports.append(self._imports_cache[h])
else:
    imports = comp._get_imports()
    self._imports_cache[h] = imports
    page_ctx.imports.append(imports)
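
A self-contained version of the same pattern is sketched below. `Node`, its `library` field, and `get_imports` are hypothetical, and the package name is only an example. The point it tries to show is that a cached value is shared by every page that hits it, so an immutable container is the safer choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a Reflex Component."""
    tag: str
    library: str  # hypothetical: the JS package this component imports from


imports_cache: dict[int, frozenset] = {}
lookups = {"hit": 0, "miss": 0}


def get_imports(node: Node) -> frozenset:
    """Return (library, tag) import pairs, computed once per structure."""
    key = hash(node)
    if key in imports_cache:
        lookups["hit"] += 1
    else:
        lookups["miss"] += 1
        # frozenset: the cached value is shared between pages, so callers
        # must not be able to mutate it.
        imports_cache[key] = frozenset({(node.library, node.tag)})
    return imports_cache[key]


page_imports: set = set()
for node in [
    Node("Button", "@radix-ui/themes"),
    Node("Button", "@radix-ui/themes"),
    Node("Text", "@radix-ui/themes"),
]:
    page_imports |= get_imports(node)

print(lookups)  # {'hit': 1, 'miss': 2}
```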

4. Page-level cache across hot reloads

For cross-reload caching, compare the structural hash of each page's root component to the previously compiled version:

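import dataclasses

@dataclasses.dataclass
class CachedPage:
    root_structural_hash: int       # structural hash of the root component
    compiled_output: str            # the generated JS
    page_context: PageContext       # accumulated compilation data
    output_path: str

@dataclasses.dataclass
class PageCache:
    # CachedPage is defined above so this annotation resolves at class creation.
    _cache: dict[str, CachedPage] = dataclasses.field(default_factory=dict)

    # Mapping-style helpers used by the compilation loop below.
    def get(self, route: str) -> CachedPage | None:
        return self._cache.get(route)

    def __setitem__(self, route: str, page: CachedPage) -> None:
        self._cache[route] = page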

During compilation:

for route, page_fn in pages.items():
    component = page_fn()  # always re-evaluate the page function
    h = component._structural_hash
    
    if (cached := page_cache.get(route)) and h == cached.root_structural_hash:
        # Structurally identical — reuse cached output
        compile_results.append((cached.output_path, cached.compiled_output))
        continue
    
    # Structure changed — recompile this page
    page_ctx = await compile_page(component)
    page_cache[route] = CachedPage(root_structural_hash=h, ...)

This correctly handles the navbar() helper case: even though navbar() returns new objects, the structural hash of the full page tree will be the same if nothing actually changed.
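
The loop above can be exercised end to end with a toy model. Everything here is a stand-in: `Node` replaces Component, the "compile" step is a string format, and `recompiled` exists only to make the skipped work visible.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a Reflex Component."""
    tag: str
    text: str = ""
    children: tuple = ()


@dataclass
class CachedPage:
    root_hash: int
    output: str


@dataclass
class PageCache:
    pages: dict = field(default_factory=dict)
    recompiled: list = field(default_factory=list)

    def compile_all(self, page_fns: dict) -> dict:
        results = {}
        for route, fn in page_fns.items():
            root = fn()  # always re-evaluate the page function
            h = hash(root)
            cached = self.pages.get(route)
            if cached is None or cached.root_hash != h:
                # Stand-in for the real (expensive) compile step.
                cached = CachedPage(h, f"// {route}: {root!r}")
                self.pages[route] = cached
                self.recompiled.append(route)
            results[route] = cached.output
        return results


title = {"about": "About"}
pages = {
    "/": lambda: Node("vstack", children=(Node("text", "Welcome"),)),
    "/about": lambda: Node("vstack", children=(Node("text", title["about"]),)),
}
cache = PageCache()
cache.compile_all(pages)      # first run compiles both pages
title["about"] = "About us"   # simulate an edit to one page
cache.compile_all(pages)      # second run recompiles only /about
print(cache.recompiled)       # ['/', '/about', '/about']
```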

5. Handling state-dependent components

Components that reference State vars need special attention. Two components can be structurally different if they reference different state vars, even if they look similar. The structural hash must incorporate Var references:

# In _get_props_tuple():
# Include Var references so that rx.text(State.name) and rx.text(State.email)
# have different structural hashes

This should happen naturally if Var objects have proper __hash__ implementations, since they'll be part of the props.
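
What "proper __hash__" means here can be shown with a value-based toy type. `FakeVar` and `Text` are hypothetical and say nothing about how the real Var class is implemented. The sketch only shows the property the structural hash relies on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeVar:
    """Hypothetical value-based stand-in for a Reflex Var."""
    js_expr: str    # e.g. "state.name"
    type_name: str


@dataclass(frozen=True)
class Text:
    """Hypothetical component holding a single Var prop."""
    content: FakeVar


name_a = FakeVar("state.name", "str")
name_b = FakeVar("state.name", "str")  # separately constructed, same value
email = FakeVar("state.email", "str")

print(hash(Text(name_a)) == hash(Text(name_b)))  # True
print(Text(name_a) == Text(email))               # False
```

If the real Var hashed by identity instead, the first comparison would fail and every state-bound component would miss the cache.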

Acceptance Criteria

  • Component._structural_hash is implemented as a cached property on immutable components
  • Structural hash correctly differentiates components with different types, props, children, or Var references
  • Structural hash correctly identifies structurally identical trees created by separate function calls
  • Component-level render cache avoids re-rendering structurally identical components within a compilation run
  • Per-component metadata caches (imports, hooks, custom code) keyed on structural hash
  • Page-level cache skips recompilation for structurally unchanged pages during hot reload
  • Cache hit/miss rates are logged at debug level
  • Cache is properly invalidated when component structure actually changes
  • Benchmark: measure hot reload time on a 20+ page app with shared layout helpers when only 1 page changes (expecting significant speedup)
  • Benchmark: measure cache hit rate for shared components (navbar, sidebar, etc.) across pages
  • No stale output bugs — changing a component always produces updated output
  • Cache is cleared on reflex run (fresh start) but preserved across hot reloads
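
For the hit/miss logging criterion, a small counter object may be enough. The sketch below assumes one such object per cache, with a made-up logger name and message format.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("structural_cache")  # hypothetical logger name


@dataclass
class CacheStats:
    """Hit/miss counters for one cache (hypothetical helper)."""
    name: str
    hits: int = 0
    misses: int = 0

    def record(self, hit: bool) -> None:
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    def log(self) -> None:
        logger.debug(
            "%s cache: %d hits, %d misses (%.0f%% hit rate)",
            self.name, self.hits, self.misses, 100 * self.hit_rate,
        )


stats = CacheStats("render")
for hit in (False, True, True, False):
    stats.record(hit)
stats.log()  # emitted only when the logger is enabled for DEBUG
```

The same counters could later feed the benchmark criteria, since hit rate per cache is exactly what those ask to measure.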

Edge Cases to Handle

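  • Hash collisions: Structural hash collisions are possible. Python's built-in hash() returns a 64-bit value on typical 64-bit builds and is not designed to be collision-resistant. Either use the hash as a fast-path check and fall back to a deeper comparison on a hit, or derive the key from a wider digest and accept the resulting negligible collision probability.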
  • Components with side effects in create(): Some components may do non-deterministic things during creation. The structural hash won't detect this — document it as a known limitation.
  • Var identity vs equality: Ensure that State.name produces the same hash regardless of when/where it's accessed (it should, since Var objects are value-based).
  • Style changes: App-level style changes (App.style) affect all components. The structural hash of individual components won't change, but the ApplyStylePlugin output will. The page-level cache key should also incorporate the app style hash.
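
The "fast path plus deeper comparison" option for hash collisions could look roughly like this. `VerifiedCache` and `Node` are hypothetical, and the explicit `key` argument exists only so the example can force two different trees onto the same key.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a Reflex Component."""
    tag: str
    children: tuple = ()


class VerifiedCache:
    """Cache keyed by hash that confirms each hit with an equality check."""

    def __init__(self) -> None:
        self._entries: dict = {}

    def get(self, node: Node, key: int) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] == node:  # deep structural compare
            return entry[1]
        return None  # a miss, or a colliding key for a different tree

    def put(self, node: Node, key: int, output: str) -> None:
        self._entries[key] = (node, output)


cache = VerifiedCache()
a, b = Node("nav"), Node("footer")
cache.put(a, key=1, output="<nav/>")
print(cache.get(Node("nav"), key=1))  # <nav/>
print(cache.get(b, key=1))            # None
```

The trade-off is that each entry keeps a reference to the tree it was built from, so memory use grows with the cached structures rather than just their hashes.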

Key Files

  • reflex/components/component.py: Component class, where _structural_hash would live
  • reflex/compiler/plugins.py (new): plugin implementations where caching hooks would live
  • reflex/compiler/compiler.py: compilation functions, output path management
  • reflex/compiler/utils.py: write_file() already has content-comparison logic
  • reflex/app.py: _should_compile(), hot reload integration
  • reflex/vars/base.py: Var.__hash__ implementation (important for structural hash correctness)

Notes

  • The structural hash approach is inspired by how React's virtual DOM diffing works — comparing tree structure rather than object identity.
  • This pairs well with the OTel instrumentation item from the roadmap (1g) — cache hit/miss rates and "pages skipped" counts would be excellent metrics to export.
  • The page function still needs to be called each time (to pick up any code changes), but the resulting tree can be cheaply compared to the cached version via structural hash before committing to a full recompile.

Labels: enhancement
