
Conversation


@Weijun-H Weijun-H commented Jan 2, 2026

Which issue does this PR close?

  • Closes #NNN.

Rationale for this change

The previous benchmarks ran in only 2–7 microseconds, which is too fast to deterministically measure the performance improvement.
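A minimal sketch of why scaling the input helps: with enough rows, each iteration runs for hundreds of microseconds instead of single-digit microseconds, so criterion can resolve changes above its noise floor. The helper name `make_json_rows` and the single-`Int32`-column shape are illustrative assumptions, not code from this PR.

```rust
use std::fmt::Write;

/// Generate `n` newline-delimited JSON rows with a single Int32 column `c0`.
/// Illustrative helper; the benchmark's actual data generation may differ.
fn make_json_rows(n: usize) -> String {
    let mut out = String::with_capacity(n * 16);
    for i in 0..n {
        writeln!(out, "{{\"c0\": {}}}", i as i32).unwrap();
    }
    out
}

fn main() {
    // 256K rows (2^18): large enough that one parse takes hundreds of
    // microseconds rather than 2-7, as measured in the numbers below.
    let input = make_json_rows(1 << 18);
    println!("{} rows, {} bytes", input.lines().count(), input.len());
}
```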

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions bot added the `arrow` (Changes to the arrow crate) label Jan 2, 2026

@Jefffrey Jefffrey left a comment


After these changes on an M4 mac

arrow-rs (pr_9088)$ cargo bench -p arrow-json
    Finished `bench` profile [optimized] target(s) in 0.05s
     Running benches/serde.rs (/Users/jeffrey/.cargo_target_cache/release/deps/serde-a1ab5d1498b8bdfe)
small_i32               time:   [323.29 µs 325.91 µs 328.65 µs]
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild

Benchmarking large_i32: Collecting 100 samples in estimated 6.5412 s (20k iterations)^C

arrow-rs (pr_9088)$ cargo bench -p arrow-json --bench serde
    Finished `bench` profile [optimized] target(s) in 0.06s
     Running benches/serde.rs (/Users/jeffrey/.cargo_target_cache/release/deps/serde-a1ab5d1498b8bdfe)
small_i32               time:   [314.67 µs 315.97 µs 317.47 µs]
                        change: [−2.6760% −1.8315% −0.9768%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  6 (6.00%) high mild
  4 (4.00%) high severe

large_i32               time:   [315.96 µs 317.31 µs 318.82 µs]
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

small_i64               time:   [483.97 µs 485.09 µs 486.20 µs]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

medium_i64              time:   [485.07 µs 487.39 µs 491.23 µs]
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

large_i64               time:   [486.93 µs 491.01 µs 498.46 µs]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

small_f32               time:   [575.10 µs 578.78 µs 585.47 µs]
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  1 (1.00%) high severe

large_f32               time:   [572.65 µs 573.71 µs 574.81 µs]
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) low mild

c.bench_function(name, |b| {
    b.iter(|| {
-       let builder = ReaderBuilder::new(schema.clone()).with_batch_size(64);
+       let builder = ReaderBuilder::new(schema.clone()).with_batch_size(batch_size);

I think a batch size of 256K (2**18) is also too big -- can we use 4K or 8K instead? I think that would be more realistic?

Parsing 256K rows does make sense to me
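For scale (my own arithmetic, not a measurement from this PR): keeping the 256K-row input while shrinking the batch size only changes how many RecordBatches one benchmark iteration produces.

```rust
fn main() {
    let rows: usize = 1 << 18; // 256K rows, as discussed above
    for batch_size in [4096usize, 8192] {
        // Number of RecordBatches the reader would emit per iteration.
        let batches = rows.div_ceil(batch_size);
        println!("batch_size={batch_size} -> {batches} batches");
    }
}
```

So a 4K batch size gives 64 batches per iteration and 8K gives 32, which exercises the per-batch overhead more realistically than a single 256K batch.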

