Conversation

@alex60217101990

Array Lazy Hash Computation Optimization

Why are the changes in this PR needed?

The current Array implementation maintains a separate hashs []int slice that stores a precomputed hash for each element. This leads to:

  • Excessive memory consumption: 8 bytes × N elements for every array
  • Unnecessary computations: Hashes are computed even if never used
  • Extra allocations: During array creation, copying, and slicing operations

Many arrays are created for temporary use and never participate in hash-based operations (e.g., map lookups or set operations).
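
For orientation, here is a minimal sketch of the layout change in Go. The hashs and hashValid names come from the PR description; everything else (the package, the Term stub, and the remaining field names) is illustrative and may not match OPA's actual internals.

```go
package arraysketch

// Term stands in for OPA's ast.Term; only a Hash method matters for this sketch.
type Term struct{ value int }

// Hash is a placeholder per-element hash.
func (t *Term) Hash() int { return t.value }

// arrayBefore mirrors the old layout: one precomputed hash stored per element.
type arrayBefore struct {
	elems []*Term
	hashs []int // 8 bytes per element, filled eagerly at construction
	hash  int
}

// arrayAfter mirrors the new layout: a single cached hash plus a validity flag.
type arrayAfter struct {
	elems     []*Term
	hash      int  // meaningful only while hashValid is true
	hashValid bool // set on first Hash() call, cleared by mutation
}
```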

What are the changes in this PR?

Refactor the Array structure to use lazy hash computation:

Key Changes:

  1. Removed hashs []int field from the Array struct

    • Memory savings: 8*N bytes per array (where N is the number of elements)
  2. Added hashValid bool flag to track hash computation state

    • Flag indicates whether the hash has been computed
  3. Implemented lazy evaluation in the Hash() method

    • Hash is computed only on first access
    • Subsequent calls return the cached value (see the sketch after this list)
  4. Incremental hash updates in Array.Append()

    • If hash is already computed, it's updated incrementally: hash += newElement.Hash()
    • If hash was not computed, computation is deferred
  5. Updated all related methods:

    • NewArray() - no longer computes hashes at creation time
    • Copy() - copies the hashValid flag
    • Sorted() - preserves computed hash (sorting doesn't change hash)
    • Slice() - creates a slice with invalid hash
    • rehash() - simplified to just invalidate the cache
    • set() - invalidates hash instead of recomputing
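
A minimal sketch of that flow, building on the illustrative arrayAfter type from the snippet above; the method bodies are approximations of the approach described here, not the PR's actual implementation. As the review comment further down points out, unsynchronized cached state like this is prone to data races under concurrent access.

```go
// Hash computes the aggregate hash on first access and caches the result.
func (arr *arrayAfter) Hash() int {
	if !arr.hashValid {
		h := 0
		for _, e := range arr.elems {
			h += e.Hash()
		}
		arr.hash = h
		arr.hashValid = true
	}
	return arr.hash
}

// Append adds an element; if the hash is already cached it is updated
// incrementally, otherwise computation stays deferred until Hash() is called.
func (arr *arrayAfter) Append(t *Term) {
	arr.elems = append(arr.elems, t)
	if arr.hashValid {
		arr.hash += t.Hash()
	}
}

// rehash only invalidates the cache; the next Hash() call recomputes it.
func (arr *arrayAfter) rehash() {
	arr.hashValid = false
}

// set replaces an element and invalidates the cached hash instead of
// recomputing it eagerly.
func (arr *arrayAfter) set(i int, t *Term) {
	arr.elems[i] = t
	arr.hashValid = false
}
```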

Benchmark Results

Key Improvements:

Array Creation (ArrayCreation)

  • 10 elements: -68% time, -67% memory
  • 100 elements: -82% time, -95% memory
  • 1000 elements: -85% time, -99% memory
  • 10000 elements: -93% time, -99% memory

Append Operations (ArrayAppend)

  • 10 elements: -65% time, -52% memory, -40% allocs
  • 100 elements: -71% time, -59% memory, -40% allocs
  • 1000 elements: -77% time, -62% memory, -40% allocs

Array Copy (ArrayCopy)

  • 10 elements: -13% time, -21% memory
  • 100 elements: -9% time, -21% memory
  • 1000 elements: -4% time, -20% memory
  • 10000 elements: -10% time, -20% memory

Slice Operations (ArraySlice)

  • 100 elements: -69% time, -25% memory
  • 1000 elements: -96% time, -25% memory
  • 10000 elements: -99.5% time, -25% memory

Set Operations (ArraySet)

  • 10 elements: -91% time
  • 100 elements: -97% time
  • 1000 elements: -99.7% time

Operations Without Hash Access (ArrayNoHashAccess)

This benchmark demonstrates the real benefit of lazy evaluation when the hash is never accessed (see the sketch after these results):

  • 10 elements: -68% time, -67% memory, -50% allocs
  • 100 elements: -82% time, -95% memory, -50% allocs
  • 1000 elements: -86% time, -99% memory, -50% allocs
  • 10000 elements: -94% time, -99% memory, -50% allocs
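
A hypothetical benchmark of this shape (the name, sizes, and setup are assumptions, not the PR's actual benchmark code) builds arrays without ever calling Hash(), so with lazy evaluation no element hashing work is done at all:

```go
package arraysketch

import "testing"

// BenchmarkArrayNoHashAccess builds arrays element by element and never
// calls Hash(); with lazy evaluation the hash is never computed.
func BenchmarkArrayNoHashAccess(b *testing.B) {
	terms := make([]*Term, 1000)
	for i := range terms {
		terms[i] = &Term{value: i}
	}
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		arr := &arrayAfter{}
		for _, t := range terms {
			arr.Append(t)
		}
	}
}
```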

Overall Geometric Mean:

  • Execution time: -59%
  • Memory usage: -59%
  • Number of allocations: -19%

Full benchstat Results

Detailed results are available in:

Refactor Array structure to use lazy hash evaluation instead of maintaining
a separate hashs slice, reducing memory overhead by 8*N bytes per array.

Key changes:
- Remove hashs []int slice from Array struct
- Add hashValid bool flag to track hash computation state
- Implement lazy hash computation in Hash() method
- Optimize Append() with incremental hash updates when hash is already computed
- Update Copy(), Sorted(), Slice() to work with new hash model
- Simplify rehash() to just invalidate the hash cache

This optimization reduces memory allocations while maintaining performance
through intelligent caching and incremental updates.

Signed-off-by: alex60217101990 <alex6021710@gmail.com>

Add comprehensive benchmarks to measure the impact of lazy hash computation
optimization for Array operations.

Signed-off-by: alex60217101990 <alex6021710@gmail.com>
@alex60217101990 force-pushed the ast/lazy-eval-array-hash branch from 4141a17 to cb4a46b on December 23, 2025 08:49
@netlify

netlify bot commented Dec 23, 2025

Deploy Preview for openpolicyagent ready!

🔨 Latest commit: 4141a17
🔍 Latest deploy log: https://app.netlify.com/projects/openpolicyagent/deploys/694a5730fb498e0008a6d775
😎 Deploy Preview: https://deploy-preview-8155--openpolicyagent.netlify.app

@netlify

netlify bot commented Dec 23, 2025

Deploy Preview for openpolicyagent ready!

🔨 Latest commit: 0fa2376
🔍 Latest deploy log: https://app.netlify.com/projects/openpolicyagent/deploys/695d5fc4b7324a00082c7aa4
😎 Deploy Preview: https://deploy-preview-8155--openpolicyagent.netlify.app

@srenatus
Contributor

srenatus commented Jan 5, 2026

That race detected in tests is probably genuine. Can you have a look? Also, let's discuss improvements like this before diving in; I could have made you aware of how prone this approach is to data races 💭
