perf: batch allocate overflow pages in BTreeMap #357

Merged: maksymar merged 14 commits into main from maksym/dbg-flame on Jul 1, 2025
@maksymar maksymar commented Jun 30, 2025

This PR improves the performance of writing large keys/values that span several pages into BTreeMap.

It now preallocates overflow pages in advance, which helps when a single large chunk of bytes, say 10 MB, is written.
On the write_chunks_btreemap_1 benchmark, the instruction count changes by -1.37%.
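
As an illustration only, batch preallocation might look roughly like the sketch below. This is not the code from this PR: `ToyAllocator` and `preallocate` are hypothetical names, pages are modeled as plain `u64` addresses, and the number of pages needed is assumed to be known up front.

```rust
/// Hypothetical stand-in for a page allocator; the allocator used by the
/// real BTreeMap has a different interface.
struct ToyAllocator {
    next_page: u64,
}

impl ToyAllocator {
    /// Hands out a fresh page address.
    fn allocate(&mut self) -> u64 {
        let page = self.next_page;
        self.next_page += 1;
        page
    }
}

/// Makes sure `pages` holds at least `needed` entries, allocating all the
/// missing pages in one pass before any bytes are written.
fn preallocate(pages: &mut Vec<u64>, needed: usize, allocator: &mut ToyAllocator) {
    if needed > pages.len() {
        // Grow the vector once instead of once per page.
        pages.reserve(needed - pages.len());
        while pages.len() < needed {
            pages.push(allocator.allocate());
        }
    }
}
```

With one page already held, calling `preallocate(&mut pages, 4, &mut allocator)` adds the three missing pages at once, and a later call asking for fewer pages than are already held does nothing. The write loop can then index into `pages` without checking for allocation on every step.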

github-actions Bot commented Jun 30, 2025

canbench 🏋 (dir: ./benchmarks/btreemap) 3440e24 2025-06-30 19:31:44 UTC

./benchmarks/btreemap/canbench_results.yml is up to date
📦 canbench_results_btreemap.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 285 | regressed 0 | improved 0 | new 0 | unchanged 285]
    change:   [max +47.96M | p75 +10.97K | median 0 | p25 0 | min -27.15M]
    change %: [max +1.72% | p75 0.00% | median 0.00% | p25 0.00% | min -0.57%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 285 | regressed 0 | improved 0 | new 0 | unchanged 285]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 285 | regressed 0 | improved 0 | new 0 | unchanged 285]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv

github-actions Bot commented Jun 30, 2025

canbench 🏋 (dir: ./benchmarks/btreeset) 3440e24 2025-06-30 19:30:11 UTC

./benchmarks/btreeset/canbench_results.yml is up to date
📦 canbench_results_btreeset.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 100 | regressed 0 | improved 0 | new 0 | unchanged 100]
    change:   [max +10.93M | p75 0 | median 0 | p25 0 | min 0]
    change %: [max +0.79% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 100 | regressed 0 | improved 0 | new 0 | unchanged 100]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 100 | regressed 0 | improved 0 | new 0 | unchanged 100]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv

github-actions Bot commented Jun 30, 2025

canbench 🏋 (dir: ./benchmarks/compare) 3440e24 2025-06-30 19:30:53 UTC

./benchmarks/compare/canbench_results.yml is up to date
📦 canbench_results_compare.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 18 | regressed 0 | improved 0 | new 0 | unchanged 18]
    change:   [max +120.50M | p75 0 | median 0 | p25 0 | min -16.64M]
    change %: [max +0.14% | p75 0.00% | median 0.00% | p25 0.00% | min -1.37%]

  heap_increase:
    status:   Regressions and improvements 🔴🟢
    counts:   [total 18 | regressed 1 | improved 1 | new 0 | unchanged 16]
    change:   [max +1.60K | p75 0 | median 0 | p25 0 | min -21]
    change %: [max +inf% | p75 0.00% | median 0.00% | p25 0.00% | min -61.76%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 18 | regressed 0 | improved 0 | new 0 | unchanged 18]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------

Only significant changes:
| status | name                    | calls |     ins |  ins Δ% |    HI |   HI Δ% |   SMI |  SMI Δ% |
|--------|-------------------------|-------|---------|---------|-------|---------|-------|---------|
|   +    | read_chunks_btreemap_1  |       | 148.72M |   0.00% | 1.60K |   +inf% |     0 |   0.00% |
|   -    | write_chunks_btreemap_1 |       | 357.21M |  -1.37% |    13 | -61.76% | 1.54K |   0.00% |

ins = instructions, HI = heap_increase, SMI = stable_memory_increase, Δ% = percent change

---------------------------------------------------
CSV results saved to canbench_results.csv
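
The `HI Δ%` values in the table above can be reproduced with a plain relative-change formula. The sketch below is an assumption about how such a percentage could be computed, not canbench's actual code; `percent_change` is a hypothetical helper. The baseline of 34 for write_chunks_btreemap_1 is inferred from the reported value 13 and the summary's minimum change of -21, and the baseline of 0 for read_chunks_btreemap_1 is inferred from the `+inf%` entry.

```rust
/// Relative change from `old` to `new`, in percent.
/// With floating-point division, a zero baseline and a positive new value
/// yield +infinity, which is one plausible reading of the `+inf%` entry.
fn percent_change(old: f64, new: f64) -> f64 {
    if old == new {
        // Also covers 0 -> 0, which would otherwise produce NaN from 0/0.
        return 0.0;
    }
    (new - old) / old * 100.0
}
```

Under these assumptions, `percent_change(34.0, 13.0)` rounds to -61.76, matching the write_chunks_btreemap_1 row, and `percent_change(0.0, 1600.0)` is positive infinity, matching the read_chunks_btreemap_1 row.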

github-actions Bot commented Jun 30, 2025

canbench 🏋 (dir: ./benchmarks/memory_manager) 3440e24 2025-06-30 19:30:02 UTC

./benchmarks/memory_manager/canbench_results.yml is up to date
📦 canbench_results_memory-manager.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 3 | regressed 0 | improved 0 | new 0 | unchanged 3]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 3 | regressed 0 | improved 0 | new 0 | unchanged 3]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 3 | regressed 0 | improved 0 | new 0 | unchanged 3]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv

github-actions Bot commented Jun 30, 2025

canbench 🏋 (dir: ./benchmarks/vec) 3440e24 2025-06-30 19:29:53 UTC

./benchmarks/vec/canbench_results.yml is up to date
📦 canbench_results_vec.csv available in artifacts

---------------------------------------------------

Summary:
  instructions:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max +20.00K | p75 +20.00K | median 0 | p25 0 | min 0]
    change %: [max +0.62% | p75 +0.55% | median 0.00% | p25 0.00% | min 0.00%]

  heap_increase:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

  stable_memory_increase:
    status:   No significant changes 👍
    counts:   [total 16 | regressed 0 | improved 0 | new 0 | unchanged 16]
    change:   [max 0 | p75 0 | median 0 | p25 0 | min 0]
    change %: [max 0.00% | p75 0.00% | median 0.00% | p25 0.00% | min 0.00%]

---------------------------------------------------
CSV results saved to canbench_results.csv

@maksymar maksymar requested a review from Copilot June 30, 2025 19:59

Copilot AI left a comment

Pull Request Overview

This PR improves the performance of writing large keys/values into a BTreeMap by preallocating overflow pages, which helps avoid repeated page allocations. Key changes include:

  • Renaming and using a unified variable (position) in both read and write loops for clarity.
  • Adding a preallocation step for overflow pages in the NodeWriter.
  • Updating benchmark result YAMLs to reflect performance improvements.
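
To make the first point concrete, here is a minimal in-memory sketch of a write loop driven by a single `position` cursor. It is an illustration under simplifying assumptions, not the code in `src/btreemap/node/io.rs`: `write_paged` is a hypothetical function, pages are plain byte vectors of length `page_size` with no headers, and all pages are assumed to be allocated before the call.

```rust
/// Copies `data` into fixed-size pages starting at absolute offset `start`.
/// A single `position` cursor determines both the target page and the
/// offset inside that page.
/// Panics if `pages` does not cover the range `start..start + data.len()`.
fn write_paged(pages: &mut [Vec<u8>], page_size: usize, start: usize, data: &[u8]) {
    assert!(page_size > 0, "page size must be non-zero");
    let end = start + data.len();
    let mut position = start;
    while position < end {
        let page_idx = position / page_size;
        let offset_in_page = position % page_size;
        // Write as much as fits into the current page.
        let len = (page_size - offset_in_page).min(end - position);
        let src = &data[position - start..position - start + len];
        pages[page_idx][offset_in_page..offset_in_page + len].copy_from_slice(src);
        position += len;
    }
}
```

For example, writing seven bytes at offset 2 into three 4-byte pages fills the last two bytes of the first page, all of the second page, and the first byte of the third. Because the same cursor drives both the page index and the in-page offset, there is no separate byte counter to keep in sync.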

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File                                     | Description                                                                |
|------------------------------------------|----------------------------------------------------------------------------|
| src/btreemap/node/io.rs                  | Refactored read/write loops and added batch allocation of overflow pages. |
| benchmarks/vec/canbench_results.yml      | Updated benchmark metrics for vec operations.                              |
| benchmarks/compare/canbench_results.yml  | Updated benchmark metrics for compare operations.                          |
| benchmarks/btreeset/canbench_results.yml | Updated benchmark metrics for btreeset operations.                         |
| benchmarks/btreemap/canbench_results.yml | Updated benchmark metrics for btreemap operations.                         |

Comments suppressed due to low confidence (2)

src/btreemap/node/io.rs:243

  • Consider adding a clarifying comment about how 'end_offset' is determined and how compute_num_overflow_pages_needed handles edge cases, to help future maintainers understand the overflow preallocation logic.
            compute_num_overflow_pages_needed(end_offset, self.page_size.get() as u64) as usize;

src/btreemap/node/io.rs:266

  • [nitpick] Verify that the transition from using 'bytes_written' to 'position' consistently preserves the intended semantics when processing segments of varying lengths.
            match page_idx {
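
The edge cases raised in the first suppressed comment can be illustrated with a simplified page-count calculation. This sketch does not reproduce `compute_num_overflow_pages_needed`; `overflow_pages_needed` is a hypothetical function, and it assumes that the initial page and every overflow page each hold exactly `page_size` payload bytes, ignoring any page headers the real layout may have.

```rust
/// Number of overflow pages needed to hold `end_offset` bytes in total.
/// Simplified model (an assumption): the initial page and every overflow
/// page each hold `page_size` payload bytes, with no headers.
fn overflow_pages_needed(end_offset: u64, page_size: u64) -> u64 {
    assert!(page_size > 0, "page size must be non-zero");
    // Bytes that do not fit into the initial page; zero if everything fits.
    let spill = end_offset.saturating_sub(page_size);
    // Ceiling division, written so that `spill + page_size - 1` cannot overflow.
    spill / page_size + u64::from(spill % page_size != 0)
}
```

In this model, data that ends exactly at a page boundary needs no extra page, while a single byte past the boundary needs one more. Those boundary values are the cases a clarifying comment in the real code would most usefully spell out.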

@maksymar maksymar changed the title from "perf: TBD" to "perf: batch allocate overflow pages" Jun 30, 2025
@maksymar maksymar changed the title from "perf: batch allocate overflow pages" to "perf: batch allocate overflow pages in BTreeMap" Jun 30, 2025
@maksymar maksymar marked this pull request as ready for review June 30, 2025 20:02
@maksymar maksymar requested a review from a team as a code owner June 30, 2025 20:02
Comment thread src/btreemap/node/io.rs
@maksymar maksymar merged commit 914a061 into main Jul 1, 2025
20 checks passed
@maksymar maksymar deleted the maksym/dbg-flame branch July 1, 2025 08:04