[FR] PredictNumItersNeeded() 1.4 correction factor #1848

@ruben-laso

Problem description
The function PredictNumItersNeeded() applies a correction factor of 1.4 when estimating the number of iterations to run.
As a result, the time spent running the experiment exceeds the time specified by --benchmark_min_time by ~40%.
Of course, --benchmark_min_time denotes the minimum amount of time to run the benchmark, but an overrun of 40% seems excessive.
This is particularly relevant on supercomputers, where CPU time is expensive.

  • Why is this estimation done, instead of stopping the iterations when the accumulated "iteration time" exceeds the target time?
  • Is there a reason for selecting 1.4 as a correction factor?
  • In cases where the execution times are not stable, could this prediction be wrong by a large margin?

Suggested solution
I suggest either removing the correction factor or making it configurable (with a default value of 1.0).

Example
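As shown in the following output (executed with --benchmark_min_time=1s), the real execution time (Time × Iterations) is ~1.4s: $7039\,\mathrm{ns} \times 198481 = 1397107759\,\mathrm{ns} \approx 1.40\,\mathrm{s}$, $8775\,\mathrm{ns} \times 160984 = 1412634600\,\mathrm{ns} \approx 1.41\,\mathrm{s}$, ...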

------------------------------------------------------------------------------------------------------------------------
Benchmark                                                              Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------
GNU-TBB/std::adjacent_difference/double/1024/manual_time            7039 ns         7022 ns       198481 bytes_per_second=6.50968Gi/s
GNU-TBB/std::adjacent_find/double/1024/manual_time                  8775 ns         8702 ns       160984 bytes_per_second=2.61075Gi/s
GNU-TBB/std::all_of/double/1024/manual_time                         7704 ns         7529 ns       199585 bytes_per_second=2.97387Gi/s
GNU-TBB/std::any_of/double/1024/manual_time                         4707 ns         4625 ns       301878 bytes_per_second=4.86754Gi/s

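When executing the same code with --benchmark_min_time=0.71s ($1/1.4 \simeq 0.71$), the execution times are much closer to 1s: $7681\,\mathrm{ns} \times 134530 = 1033324930\,\mathrm{ns} \approx 1.03\,\mathrm{s}$, $9265\,\mathrm{ns} \times 103410 = 958093650\,\mathrm{ns} \approx 0.96\,\mathrm{s}$, ...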

------------------------------------------------------------------------------------------------------------------------
Benchmark                                                              Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------------------------------------------------
GNU-TBB/std::adjacent_difference/double/1024/manual_time            7681 ns         7615 ns       134530 bytes_per_second=5.96565Gi/s
GNU-TBB/std::adjacent_find/double/1024/manual_time                  9265 ns         9175 ns       103410 bytes_per_second=2.47288Gi/s
GNU-TBB/std::all_of/double/1024/manual_time                         7996 ns         7572 ns       110333 bytes_per_second=2.86519Gi/s
GNU-TBB/std::any_of/double/1024/manual_time                         4656 ns         4680 ns       192865 bytes_per_second=4.92025Gi/s
