RegexGenerator source model lacks value equality, preventing effective incremental caching #125409

@eiriktsarpalis

Summary

The RegexGenerator's incremental source model lacks proper value equality, meaning Roslyn's incremental pipeline cannot effectively cache results. The ObjectImmutableArraySequenceEqualityComparer applied to the source model calls .Equals() on each element, but several key types in the model use reference equality, causing the source output callback to fire on every compilation change regardless of whether the regex patterns have actually changed.

Non-equatable types in the source model

The source model is an ImmutableArray<object> containing:

  • RegexMethod instances
  • (RegexMethod, string, CompilationData) tuples (limited-support results)
  • (RegexMethod, string, Dictionary<string, string[]>, CompilationData) tuples (full codegen results)

The following types lack value equality:

RegexTree (sealed class, no Equals override)

RegexMethod is a sealed record whose positional parameter RegexTree Tree is a sealed class with no Equals/IEquatable<T> implementation, so the record's generated equality compares the Tree member by reference. RegexTree contains:

  • RegexNode Root — sealed class, no equality
  • RegexFindOptimizations FindOptimizations — sealed class, no equality
  • Hashtable? fields — no value equality

Since RegexTree is reconstructed fresh on every compilation, two identical regex patterns always produce non-equal RegexTree references, causing the incremental cache to miss every time.
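The effect can be reproduced in isolation. The following is a minimal sketch using hypothetical stand-in types (FakeTree and FakeMethod are invented for illustration and are not the real generator types): a record's generated Equals compares each member with its default comparer, which for a class without an Equals override means reference comparison.

```csharp
using System;

// Hypothetical stand-in for a tree type that does not override Equals.
public sealed class FakeTree
{
    public FakeTree(string pattern) => Pattern = pattern;
    public string Pattern { get; }
    // No Equals/GetHashCode override, so object.Equals compares references.
}

// Hypothetical stand-in for a record that holds such a tree.
public sealed record FakeMethod(string MethodName, FakeTree Tree);

public static class RecordEqualityDemo
{
    // Both records share one tree instance: the generated Equals returns true.
    public static bool SameTreeInstance()
    {
        var tree = new FakeTree("a+b");
        return new FakeMethod("M", tree).Equals(new FakeMethod("M", tree));
    }

    // Each record gets a freshly built tree for the same pattern,
    // mirroring a tree that is rebuilt on every compilation: returns false.
    public static bool RebuiltTree()
    {
        var first = new FakeMethod("M", new FakeTree("a+b"));
        var second = new FakeMethod("M", new FakeTree("a+b"));
        return first.Equals(second);
    }
}
```

RebuiltTree is the case that matters for the incremental pipeline: the two records describe the same pattern, yet they never compare as equal.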

AnalysisResults (sealed class, no Equals override)

RegexMethod's positional parameter AnalysisResults Analysis is a sealed class with no equality implementation. It contains HashSet<RegexNode> fields and a RegexTree reference — none of which support value equality.

Dictionary<string, string[]> (BCL type, reference equality)

The full codegen result tuple includes a Dictionary<string, string[]> for required helpers. Dictionary<TKey, TValue> does not implement value equality — two dictionaries with identical entries compare as unequal.
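A short sketch of the difference, assuming an illustrative helper name ("HelperA") and a hypothetical StructuralEquals method that is not part of the generator. Note that the string[] values also compare by reference, so a structural comparison has to walk both levels.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryEqualityDemo
{
    // Builds a fresh dictionary with the same (illustrative) content each time.
    public static Dictionary<string, string[]> MakeHelpers() =>
        new Dictionary<string, string[]>
        {
            ["HelperA"] = new[] { "line 1", "line 2" },
        };

    // Dictionary<TKey, TValue> inherits object.Equals: reference comparison.
    public static bool DefaultEquals() => MakeHelpers().Equals(MakeHelpers());

    // Hypothetical structural comparison: same keys, and each value array
    // matches element by element.
    public static bool StructuralEquals(
        Dictionary<string, string[]> x, Dictionary<string, string[]> y)
    {
        if (x.Count != y.Count) return false;
        foreach (KeyValuePair<string, string[]> entry in x)
        {
            if (!y.TryGetValue(entry.Key, out var other) ||
                !entry.Value.SequenceEqual(other))
            {
                return false;
            }
        }
        return true;
    }
}
```

DefaultEquals returns false for two dictionaries with identical entries, while StructuralEquals(MakeHelpers(), MakeHelpers()) returns true.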

Impact

Despite applying ObjectImmutableArraySequenceEqualityComparer, the source generation callback fires on every compilation change because the above types always compare as unequal. In the common case, the WithComparer call therefore has no practical effect.

Possible solutions

  1. Implement IEquatable<T> on RegexTree, AnalysisResults, and their dependencies (RegexNode, RegexFindOptimizations). This is a significant undertaking given the depth of the object graph.
  2. Convert the source model to use deeply equatable types (e.g., ImmutableEquatableArray for the helpers dictionary, and custom equatable wrappers for the tree/analysis). The reverted commit 850a8ea explored this direction partially.
  3. Compute a hash/fingerprint of the regex tree and use that for equality instead of deep structural comparison.
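As a rough illustration of option 3, here is a minimal sketch of a wrapper that carries a non-equatable payload but compares only by a precomputed fingerprint. Everything in it is hypothetical: Fingerprinted<T> and Compute are invented names, and the choice of inputs (pattern, options, culture name) is an unverified assumption about what determines the generated output. A plain string key is used here instead of a hash to keep the sketch simple.

```csharp
#nullable enable
using System;

// Hypothetical wrapper: equality is defined by the fingerprint alone,
// never by the wrapped value.
public sealed class Fingerprinted<T> : IEquatable<Fingerprinted<T>>
{
    public Fingerprinted(T value, string fingerprint)
    {
        Value = value;
        Fingerprint = fingerprint;
    }

    public T Value { get; }
    public string Fingerprint { get; }

    public bool Equals(Fingerprinted<T>? other) =>
        other is not null &&
        string.Equals(Fingerprint, other.Fingerprint, StringComparison.Ordinal);

    public override bool Equals(object? obj) => Equals(obj as Fingerprinted<T>);

    public override int GetHashCode() =>
        StringComparer.Ordinal.GetHashCode(Fingerprint);
}

public static class FingerprintDemo
{
    // Assumption (unverified): these inputs fully determine the output.
    // The pattern goes last so that a '|' inside it cannot be confused
    // with a separator.
    public static string Compute(string pattern, int options, string cultureName) =>
        $"{options}|{cultureName}|{pattern}";

    // Two distinct payload instances with the same fingerprint compare equal.
    public static bool RebuiltPayloadsCompareEqual()
    {
        var first = new Fingerprinted<object>(new object(), Compute("a+b", 0, ""));
        var second = new Fingerprinted<object>(new object(), Compute("a+b", 0, ""));
        return first.Equals(second);
    }

    // A different pattern yields a different fingerprint, so the wrappers differ.
    public static bool DifferentPatternsCompareEqual()
    {
        var first = new Fingerprinted<object>(new object(), Compute("a+b", 0, ""));
        var second = new Fingerprinted<object>(new object(), Compute("a+c", 0, ""));
        return first.Equals(second);
    }
}
```

This approach is only sound if the fingerprint captures every input that affects the generated code. Otherwise a cache hit would silently reuse stale output.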
