feat!: add lazy BTreeMap iterator for improved performance (breaking API change) #375

Merged: maksymar merged 37 commits into main from maksym/entry-iter on Jul 17, 2025

@maksymar maksymar commented Jul 15, 2025

This PR changes the BTreeMap iterator's Item type from (K, V) to LazyEntry to enable lazy evaluation of keys and values.

This is a BREAKING API CHANGE, as it deviates from the canonical (K, V) pattern of std::collections::BTreeMap.

The key motivation is performance: previously, each next() call eagerly deserialized and cloned the key and the value, even if the consumer discarded them. With LazyEntry, key and value access is deferred until explicitly requested, which avoids unnecessary allocations and clones when only part of the entry is needed or iteration is short-circuited.
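To make the eager-versus-lazy difference concrete, here is a minimal, self-contained sketch. It is not the crate's implementation: `decode`, the byte-slice-backed `LazyEntry`, `eager_iter`, `lazy_iter`, and the decode counter are all hypothetical names invented for illustration, and the caching via `OnceCell` is an assumption about how such an entry could be built.

```rust
use std::cell::{Cell, OnceCell};

/// Hypothetical stand-in for an expensive deserialization step.
/// It bumps a counter so the caller can see how often decoding happens.
fn decode(raw: &[u8], decodes: &Cell<usize>) -> Vec<u8> {
    decodes.set(decodes.get() + 1);
    raw.to_vec()
}

/// Hypothetical lazy entry: borrows the raw bytes and decodes only on request,
/// caching the result so repeated access does not decode again.
struct LazyEntry<'a> {
    raw_key: &'a [u8],
    raw_value: &'a [u8],
    key_cache: OnceCell<Vec<u8>>,
    value_cache: OnceCell<Vec<u8>>,
    decodes: &'a Cell<usize>,
}

impl<'a> LazyEntry<'a> {
    fn key(&self) -> &Vec<u8> {
        self.key_cache.get_or_init(|| decode(self.raw_key, self.decodes))
    }

    fn value(&self) -> &Vec<u8> {
        self.value_cache.get_or_init(|| decode(self.raw_value, self.decodes))
    }

    /// Materializes both halves, reusing anything already decoded.
    fn into_pair(self) -> (Vec<u8>, Vec<u8>) {
        self.key();
        self.value();
        (
            self.key_cache.into_inner().unwrap(),
            self.value_cache.into_inner().unwrap(),
        )
    }
}

/// Eager iteration: every step decodes both key and value.
fn eager_iter<'a>(
    raw: &'a [(Vec<u8>, Vec<u8>)],
    decodes: &'a Cell<usize>,
) -> impl Iterator<Item = (Vec<u8>, Vec<u8>)> + 'a {
    raw.iter().map(move |(k, v)| (decode(k, decodes), decode(v, decodes)))
}

/// Lazy iteration: each step only wraps the raw bytes.
fn lazy_iter<'a>(
    raw: &'a [(Vec<u8>, Vec<u8>)],
    decodes: &'a Cell<usize>,
) -> impl Iterator<Item = LazyEntry<'a>> + 'a {
    raw.iter().map(move |(k, v)| LazyEntry {
        raw_key: k.as_slice(),
        raw_value: v.as_slice(),
        key_cache: OnceCell::new(),
        value_cache: OnceCell::new(),
        decodes,
    })
}

/// Returns (eager decode count, lazy decode count) for a keys-only consumer.
fn count_decodes_keys_only() -> (usize, usize) {
    let raw = vec![
        (vec![1u8], vec![0u8; 1024]),
        (vec![2u8], vec![0u8; 1024]),
        (vec![3u8], vec![0u8; 1024]),
    ];
    let eager = Cell::new(0);
    let _keys: Vec<Vec<u8>> = eager_iter(&raw, &eager).map(|(k, _)| k).collect();
    let lazy = Cell::new(0);
    let _keys: Vec<Vec<u8>> = lazy_iter(&raw, &lazy).map(|e| e.key().clone()).collect();
    (eager.get(), lazy.get())
}
```

In this sketch a keys-only consumer triggers two decodes per entry with the eager iterator and one with the lazy one; the large values are never touched in the lazy case.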

Code change summary:

  • Introduced LazyEntry with on-demand access to key/value.
  • Replaced Box<Node<K>> with Rc<Node<K>> to allow shared ownership between iterator and entries.
  • Updated all internal iterator consumers to explicitly call .into_pair() where eager materialization is needed.
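The move from Box to Rc in the list above can be illustrated with a small ownership sketch. This is only an illustration of the shared-ownership pattern under simplifying assumptions: `Node`, `Entry`, and `Cursor` here are hypothetical toy types, not the crate's `Node<K>`, `LazyEntry`, or `Iter`.

```rust
use std::rc::Rc;

/// Hypothetical, heavily simplified node: just a sorted list of keys.
struct Node {
    keys: Vec<u64>,
}

/// An entry that keeps its node alive through a shared `Rc` handle.
struct Entry {
    node: Rc<Node>,
    index: usize,
}

impl Entry {
    /// Reads the key from the shared node only when asked.
    fn key(&self) -> u64 {
        self.node.keys[self.index]
    }
}

/// A cursor over one node. A `Box<Node>` has exactly one owner, so the cursor
/// could not hand out entries that co-own the node; with `Rc<Node>` each entry
/// holds a cheap reference-counted handle instead.
struct Cursor {
    node: Rc<Node>,
    index: usize,
}

impl Iterator for Cursor {
    type Item = Entry;

    fn next(&mut self) -> Option<Entry> {
        if self.index >= self.node.keys.len() {
            return None;
        }
        let entry = Entry {
            node: Rc::clone(&self.node),
            index: self.index,
        };
        self.index += 1;
        Some(entry)
    }
}

/// Collects entries, drops the cursor, and then reads the keys.
/// Returns the keys and the number of live handles to the node.
fn entries_outlive_cursor() -> (Vec<u64>, usize) {
    let node = Rc::new(Node { keys: vec![10, 20, 30] });
    let cursor = Cursor { node: Rc::clone(&node), index: 0 };
    let entries: Vec<Entry> = cursor.collect(); // the cursor is consumed here
    let keys: Vec<u64> = entries.iter().map(Entry::key).collect();
    // One handle in `node` plus one per collected entry.
    (keys, Rc::strong_count(&node))
}
```

The entries stay valid after the cursor is gone because each one holds its own handle to the node; the cost is a reference-count increment per entry rather than a deep copy.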

Performance change summary:

  • BTreeSet: 68 improvements, no regressions. Median instruction count dropped by 6.61%, best case by 17.24%
  • BTreeMap: 1 small regression (+2.09%), 10 improvements, with up to 98.33% fewer instructions in range-heavy cases

@maksymar maksymar requested a review from Copilot July 15, 2025 14:16


github-actions Bot commented Jul 15, 2025

canbench 🏋 (dir: ./benchmarks/vec) 3f0e65b 2025-07-17 19:02:09 UTC

./benchmarks/vec/canbench_results.yml is up to date
📦 canbench_results_vec.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv


github-actions Bot commented Jul 15, 2025

canbench 🏋 (dir: ./benchmarks/memory_manager) 3f0e65b 2025-07-17 19:02:28 UTC

./benchmarks/memory_manager/canbench_results.yml is up to date
📦 canbench_results_memory-manager.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 3 | regressed 0 | improved 0 | new 0 | unchanged 3]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 3 | regressed 0 | improved 0 | new 0 | unchanged 3]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 3 | regressed 0 | improved 0 | new 0 | unchanged 3]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv


github-actions Bot commented Jul 15, 2025

canbench 🏋 (dir: ./benchmarks/nns) 3f0e65b 2025-07-17 19:02:21 UTC

./benchmarks/nns/canbench_results.yml is up to date
📦 canbench_results_nns.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max +18.93M | p75 +159.18K | median +107 | p25 0 | min 0]
    change %: [max +0.32% | p75 +0.14% | median 0.01% | p25 0.00% | min 0.00%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv


github-actions Bot commented Jul 16, 2025

canbench 🏋 (dir: ./benchmarks/io_chunks) 3f0e65b 2025-07-17 19:03:00 UTC

./benchmarks/io_chunks/canbench_results.yml is up to date
📦 canbench_results_io_chunks.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 18 | regressed 0 | improved 0 | new 0 | unchanged 18]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 18 | regressed 0 | improved 0 | new 0 | unchanged 18]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 18 | regressed 0 | improved 0 | new 0 | unchanged 18]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv


github-actions Bot commented Jul 16, 2025

canbench 🏋 (dir: ./benchmarks/btreeset) 3f0e65b 2025-07-17 19:02:25 UTC

./benchmarks/btreeset/canbench_results.yml is up to date
📦 canbench_results_btreeset.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   Improvements detected 🟢
    counts:   [total 100 | regressed 0 | improved 68 | new 0 | unchanged 32]
    change:   [max 0 | p75 -2.30K | median -368.66K | p25 -1.21M | min -12.91M]
    change %: [max 0.00% | p75 -1.27% | median -6.61% | p25 -11.33% | min -17.24%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 100 | regressed 0 | improved 0 | new 0 | unchanged 100]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 100 | regressed 0 | improved 0 | new 0 | unchanged 100]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------

Only significant changes:
| status | name                                    | calls |     ins |  ins Δ% | HI |  HI Δ% | SMI |  SMI Δ% |
|--------|-----------------------------------------|-------|---------|---------|----|--------|-----|---------|
|   -    | btreeset_is_subset_blob_32              |       |  44.80K |  -2.02% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_range_blob_64                  |       |  24.44M |  -2.67% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_iter_blob_64                   |       |  39.45M |  -2.95% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_subset_blob_64              |       |  55.21K |  -4.76% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_subset_blob_128             |       |  85.06K |  -4.81% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_range_blob_32                  |       |  13.29M |  -4.82% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_iter_blob_32                   |       |  20.94M |  -5.38% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_subset_blob_256             |       | 124.18K |  -5.60% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_intersection_blob_1024         |       | 101.68M |  -6.08% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_union_blob_1024                |       | 101.69M |  -6.08% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_symmetric_difference_blob_1024 |       | 101.67M |  -6.10% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_intersection_blob_512          |       |  53.20M |  -6.21% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_union_blob_512                 |       |  53.21M |  -6.22% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_symmetric_difference_blob_512  |       |  53.19M |  -6.25% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_subset_blob_512             |       | 202.26K |  -6.26% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_intersection_blob_256          |       |  28.98M |  -6.43% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_union_blob_256                 |       |  28.98M |  -6.44% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_symmetric_difference_blob_256  |       |  28.96M |  -6.50% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_subset_blob_1024            |       | 358.42K |  -6.71% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_range_blob_16                  |       |   9.51M |  -6.74% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_intersection_blob_128          |       |  16.71M |  -6.74% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_union_blob_128                 |       |  16.71M |  -6.76% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_symmetric_difference_blob_128  |       |  16.69M |  -6.87% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_range_blob_8                   |       |   9.14M |  -6.92% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_intersection_blob_32           |       |   4.32M |  -7.17% |  0 |  0.00% |   0 |   0.00% |
|  ...   | ... 18 rows omitted ...                 |       |         |         |    |        |     |         |
|   -    | btreeset_is_disjoint_blob_8             |       |   1.85M | -11.55% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_superset_blob_8             |       |   2.73M | -11.56% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_superset_blob_16            |       |   2.92M | -11.63% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_symmetric_difference_blob_8    |       |   2.78M | -11.72% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_disjoint_blob_16            |       |   1.91M | -12.30% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_union_u64                      |       |   1.97M | -13.24% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_intersection_u64               |       |   1.96M | -13.36% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_superset_u64                |       |   1.96M | -13.56% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_disjoint_u64                |       |   1.34M | -13.68% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_superset_blob_128           |       |  13.35M | -13.82% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_superset_blob_256           |       |  23.00M | -13.87% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_superset_blob_512           |       |  41.56M | -14.04% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_superset_blob_1024          |       |  78.74M | -14.09% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_disjoint_u32                |       |   1.32M | -14.44% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_superset_u32                |       |   1.92M | -14.74% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_intersection_u32               |       |   1.92M | -14.81% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_symmetric_difference_u64       |       |   1.94M | -14.88% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_disjoint_blob_128           |       |   7.90M | -15.29% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_disjoint_blob_256           |       |  13.16M | -15.69% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_superset_blob_64            |       |   7.27M | -15.82% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_disjoint_blob_512           |       |  23.37M | -16.15% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_union_u32                      |       |   1.90M | -16.24% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_symmetric_difference_u32       |       |   1.90M | -16.31% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_disjoint_blob_1024          |       |  43.77M | -16.42% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreeset_is_disjoint_blob_64            |       |   4.38M | -17.24% |  0 |  0.00% |   0 |   0.00% |

ins = instructions, HI = heap_increase, SMI = stable_memory_increase, Δ% = percent change

---------------------------------------------------
CSV results saved to canbench_results.csv


github-actions Bot commented Jul 16, 2025

canbench 🏋 (dir: ./benchmarks/btreemap) 3f0e65b 2025-07-17 19:04:12 UTC

./benchmarks/btreemap/canbench_results.yml is up to date
📦 canbench_results_btreemap.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   Regressions and improvements 🔴🟢
    counts:   [total 303 | regressed 1 | improved 10 | new 0 | unchanged 292]
    change:   [max +1.55M | p75 +4.12K | median +3.82K | p25 0 | min -1.09B]
    change %: [max +2.09% | p75 0.00% | median 0.00% | p25 0.00% | min -98.33%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 303 | regressed 0 | improved 0 | new 0 | unchanged 303]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 303 | regressed 0 | improved 0 | new 0 | unchanged 303]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------

Only significant changes:
| status | name                                 | calls |     ins |  ins Δ% | HI |  HI Δ% | SMI |  SMI Δ% |
|--------|--------------------------------------|-------|---------|---------|----|--------|-----|---------|
|   +    | btreemap_v2_scan_keys_rev_1k_0b      |       | 984.22K |  +2.09% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_scan_iter_rev_1k_0b      |       | 978.17K | -20.48% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_scan_iter_1k_0b          |       | 975.99K | -20.68% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_range_value_sum_1k_10kib |       |  20.67M | -63.59% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_range_value_sum_20_10mib |       | 398.31M | -63.91% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_range_key_sum_1k_10kib   |       |   2.57M | -95.47% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_scan_iter_1k_10kib       |       |   2.49M | -95.61% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_scan_iter_rev_1k_10kib   |       |   2.48M | -95.63% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_range_key_sum_20_10mib   |       |  18.47M | -98.33% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_scan_iter_rev_20_10mib   |       |  18.47M | -98.33% |  0 |  0.00% |   0 |   0.00% |
|   -    | btreemap_v2_scan_iter_20_10mib       |       |  18.47M | -98.33% |  0 |  0.00% |   0 |   0.00% |

ins = instructions, HI = heap_increase, SMI = stable_memory_increase, Δ% = percent change
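As a sanity check on how the table relates to the summary, the baseline instruction count can be estimated from a row's new count and its percent change. The sketch below assumes Δ% is defined as (new - old) / old, which is not stated in the report, and uses the rounded figures from the btreemap_v2_scan_iter_20_10mib row; the function names are hypothetical.

```rust
/// Recovers the baseline count from a new count and a signed percent change,
/// assuming delta_percent = (new - old) / old * 100.
fn baseline_from_change(new_count: f64, delta_percent: f64) -> f64 {
    new_count / (1.0 + delta_percent / 100.0)
}

/// Returns (estimated baseline, absolute change) for the
/// btreemap_v2_scan_iter_20_10mib row: 18.47M instructions at -98.33%.
fn scan_iter_20_10mib_estimate() -> (f64, f64) {
    let new_count = 18.47e6;
    let baseline = baseline_from_change(new_count, -98.33);
    (baseline, new_count - baseline)
}
```

Under that assumption the baseline comes out at roughly 1.1 billion instructions and the absolute change at roughly -1.09 billion, which agrees with the `min -1.09B` entry in the summary. Both inputs are rounded, so the result is only approximate.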

---------------------------------------------------
CSV results saved to canbench_results.csv

@maksymar maksymar requested a review from Copilot July 17, 2025 06:54

@maksymar maksymar changed the title from "perf: lazy entry iter" to "feat!: add lazy evaluation to BTreeMap iterator via LazyEntry" Jul 17, 2025
@maksymar maksymar requested a review from Copilot July 17, 2025 07:40
@maksymar maksymar changed the title from "feat!: add lazy evaluation to BTreeMap iterator via LazyEntry" to "feat!: add lazy BTreeMap iterator for improved performance" Jul 17, 2025
@maksymar maksymar changed the title from "feat!: add lazy BTreeMap iterator for improved performance" to "feat!: add lazy BTreeMap iterator for improved performance (breaking change)" Jul 17, 2025

Copilot AI left a comment

Pull Request Overview

This PR introduces lazy deserialization for BTreeMap iteration by wrapping entries in a LazyEntry that defers key/value cloning until accessed, at the cost of a breaking change to the iterator’s Item type.

  • Introduce LazyEntry and switch Iter::Item from (K, V) to LazyEntry<'_, K, V, M>
  • Replace Box<Node<K>> with Rc<Node<K>> to share node ownership between iterators and entries
  • Update all iterator consumers (tests, benchmarks, and internal APIs) to call .into_pair(), .key(), or .value() as needed

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Summary per file:

| File | Description |
|------|-------------|
| tests/api_conformance.rs | Updated stable.iter() and stable.range() calls to use into_pair() |
| src/btreeset.rs | Adapted next() to call entry.key().clone() on LazyEntry |
| src/btreemap/proptests.rs | Updated property tests to map LazyEntry into pairs with into_pair() |
| src/btreemap/iter.rs | Swapped Box<Node<K>> for Rc<Node<K>>, added LazyEntry and updated iterator logic |
| src/btreemap.rs | Revised docs and public API (added collect_e helper, updated examples) |
| benchmarks/nns/src/nns_vote_cascading.rs | Changed .map() closures to use entry.key() / entry.value() |
| benchmarks/nns/canbench_results.yml | Updated benchmark instruction counts |
| benchmarks/btreeset/canbench_results.yml | Updated benchmark instruction counts |
| benchmarks/btreemap/src/main.rs | Updated benchmark closures to use entry.key() / entry.value() |
| benchmarks/btreemap/canbench_results.yml | Updated benchmark instruction counts |
Comments suppressed due to low confidence (3)

src/btreemap/iter.rs:450

  • [nitpick] LazyEntry is a public type but lacks documentation comments explaining its purpose and usage. Please add doc comments to describe the struct and its methods (key, value, into_pair).
pub struct LazyEntry<'a, K, V, M>

src/btreemap.rs:1164

  • [nitpick] Consider re-exporting LazyEntry in the btreemap module (e.g. pub use iter::LazyEntry;) so consumers of map.iter() can easily import and discover the entry type without reaching into internal modules.
    pub fn iter(&self) -> Iter<'_, K, V, M> {

src/btreemap.rs:1374

  • [nitpick] The helper function name collect_e is not very descriptive—consider renaming it to something like collect_pairs or collect_entries to clarify that it transforms LazyEntry into (K, V) pairs.
    fn collect_e<'a, K, V, M>(it: impl Iterator<Item = LazyEntry<'a, K, V, M>>) -> Vec<(K, V)>
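For readers wondering what such a helper does, here is a hedged, self-contained sketch of a pair-collecting function. The `LazyEntry` below is a hypothetical two-reference stand-in (the real type also carries the memory parameter `M`), and `collect_pairs` is the reviewer's suggested name, not necessarily the name that was merged.

```rust
/// Hypothetical stand-in for the crate's `LazyEntry`: it simply borrows a key
/// and a value that already exist in memory.
struct LazyEntry<'a, K, V> {
    key: &'a K,
    value: &'a V,
}

impl<'a, K: Clone, V: Clone> LazyEntry<'a, K, V> {
    /// Eagerly materializes the entry as an owned `(K, V)` pair.
    fn into_pair(self) -> (K, V) {
        (self.key.clone(), self.value.clone())
    }
}

/// Turns an iterator of lazy entries back into the `(K, V)` pairs that the
/// iterator yielded before the change.
fn collect_pairs<'a, K, V>(it: impl Iterator<Item = LazyEntry<'a, K, V>>) -> Vec<(K, V)>
where
    K: Clone + 'a,
    V: Clone + 'a,
{
    it.map(LazyEntry::into_pair).collect()
}

/// Small usage example over an in-memory list of pairs.
fn collect_pairs_demo() -> Vec<(u32, String)> {
    let data = vec![(1u32, "one".to_string()), (2u32, "two".to_string())];
    let pairs = collect_pairs(data.iter().map(|(k, v)| LazyEntry { key: k, value: v }));
    pairs
}
```

Callers that relied on the old `(K, V)` item type can recover it this way, at the cost of giving up the laziness for that call site.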

Comment thread src/btreemap.rs
@maksymar maksymar marked this pull request as ready for review July 17, 2025 07:53
@maksymar maksymar requested a review from a team as a code owner July 17, 2025 07:53
@maksymar maksymar requested a review from frankdavid July 17, 2025 07:55
@maksymar maksymar changed the title from "feat!: add lazy BTreeMap iterator for improved performance (breaking change)" to "feat!: add lazy BTreeMap iterator for improved performance (breaking API change)" Jul 17, 2025

@berestovskyy berestovskyy left a comment

Looks good, thanks!

Comment thread src/btreemap.rs Outdated
Comment thread src/btreemap.rs Outdated
Comment thread src/btreemap/iter.rs
@maksymar maksymar requested a review from frankdavid July 17, 2025 18:55
@maksymar maksymar merged commit e12f43f into main Jul 17, 2025
16 checks passed
@maksymar maksymar deleted the maksym/entry-iter branch July 17, 2025 19:36

4 participants