Header-only, high-precision timer library for effortless, on-the-fly execution time measurement of code blocks, using only standard C++20 and x86 intrinsics.
- Cycle-level precision measurements ( define LOWER_PRECISION_MIN_OVERHEAD before including the library to trade a little precision for lower overhead; the difference is tiny ).
- Independent of underlying CPU clock speed ( because the measurements are in cycles ).
- Option to also display the results in nanoseconds ( if the user provides the actual clock speed of the core the code runs on ).
- Effortless usage ( just add: HPT::Timer T("Code Block Name"); at the start of any code block ).
- Ultra light ( negligible time overhead, about 70 - 74 cycles on average on a modern CPU, and negligible space overhead ).
- Standalone and header only ( only a compiler with C++20 support is needed, with no extra dependencies ).
- Full statistics ( totals, min, median, max, average, standard deviation, 90th and 99th percentiles ).
- Option to stop and restart the timer inside the code block, and also reset the stats.
- Easily turn off the measurements by simply defining the TURN_OFF_MEASUREMENTS constant before including the library.
- Can handle an enormous number of measurements / calls ( it stores them in frequency tables ).
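To illustrate why a frequency table scales to very large call counts, here is a minimal sketch. It is not the library's actual implementation: the `FrequencyTable` class and its member names are hypothetical, and the nearest-rank percentile rule is an assumption. The idea is that memory grows with the number of distinct cycle counts, not with the number of calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical sketch, not HPT's real data structure:
// maps each observed cycle count to the number of times it occurred.
class FrequencyTable {
public:
    void record(std::uint64_t cycles) {
        ++counts_[cycles];
        ++total_;
    }

    std::uint64_t size() const noexcept { return total_; }                 // measurements recorded
    std::size_t distinctValues() const noexcept { return counts_.size(); } // entries actually stored

    // Nearest-rank percentile for p in [1, 100]; needs at least one measurement.
    std::uint64_t percentile(unsigned p) const {
        assert(total_ > 0 && p >= 1 && p <= 100);
        const std::uint64_t rank = (total_ * p + 99) / 100;  // ceil(total * p / 100), 1-based
        std::uint64_t seen = 0;
        for (const auto& [value, count] : counts_) {          // std::map iterates in ascending key order
            seen += count;
            if (seen >= rank) return value;
        }
        return counts_.rbegin()->first;                       // not reached when rank <= total
    }

private:
    std::map<std::uint64_t, std::uint64_t> counts_;
    std::uint64_t total_ = 0;
};
```

Recording the same cycle count a million times adds a single entry, so min, max, median, and percentiles can all be read from a table that stays small.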
This library is best suited for accurate and consistent measurement of production code with real-world data. It is NOT a benchmarking library ( although it can be used as one ). The process is simple:
1. Measure a specific code block in production with real-world data, using the proper setup ( see Notes ).
2. Make a code change in that block.
3. Do step 1 again.
4. See if that code change actually improved the stats you care about.
5. If so, keep that code change in production ( with a similar setup and with this or similar data ).
Just copy the single header file ( HPT.hpp ) into your include folder, then include it in your code.
explicit Timer(const std::string& name = "Generic") noexcept;
void start(void) noexcept;
void stopAndRecord(void);
void setName(const std::string& name);
const std::string& getName(void) const noexcept;
static void ClearMeasurements(const std::string& name = "Generic");
static void ClearAllMeasurements(void);
static void setStatHighlight(const Stats& stat);
static void resetStatHighlight(const Stats& stat);
static void ClearStatHighlights(void);
static void PrintResults(const size_t cpuSpeedInMGHz = 0, const size_t zeroCodeCycles = 0, std::ostream& os = std::cout, const bool printNotes = true);

// #define TURN_OFF_MEASUREMENTS   // uncomment to turn off all measurements
#include <cstddef>   // size_t
#include <cstdlib>   // std::exit, EXIT_SUCCESS
#include "HPT.hpp"

void FunctionToMeasure(size_t N)
{
    HPT::Timer T("FunctionToMeasure");   // measures from here to the end of the block
    // any code here...
}

int main(void)
{
    for ( size_t index{}; index < 1'000'000; ++index )
        FunctionToMeasure(index);

    HPT::PrintResults();
    std::exit(EXIT_SUCCESS);
}
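For readers curious how a one-line scoped timer like the one above can work, the following is a rough sketch of the general RAII technique. It is an assumption about the approach, not HPT's source: `readTicks` and `ScopedTicks` are hypothetical names, and the `std::chrono` fallback exists only so the sketch also compiles on non-x86 targets.

```cpp
#include <chrono>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
  #if defined(_MSC_VER)
    #include <intrin.h>
  #else
    #include <x86intrin.h>
  #endif
  // Reads the CPU time stamp counter via the x86 intrinsic.
  inline std::uint64_t readTicks() noexcept { return __rdtsc(); }
#else
  // Fallback for non-x86 targets: steady clock ticks instead of TSC ticks.
  inline std::uint64_t readTicks() noexcept {
      return static_cast<std::uint64_t>(
          std::chrono::steady_clock::now().time_since_epoch().count());
  }
#endif

// Hypothetical RAII timer: samples the counter on construction and adds the
// elapsed ticks to `sink` when the enclosing block is left.
class ScopedTicks {
public:
    explicit ScopedTicks(std::uint64_t& sink) noexcept
        : sink_(sink), start_(readTicks()) {}
    ~ScopedTicks() { sink_ += readTicks() - start_; }

    ScopedTicks(const ScopedTicks&) = delete;
    ScopedTicks& operator=(const ScopedTicks&) = delete;

private:
    std::uint64_t& sink_;
    std::uint64_t start_;
};
```

Because the measurement ends in the destructor, every exit path from the block (normal return, early return, exception) is covered without any extra code at the call site.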
To also get the measurements in nanoseconds, pass a positive 'cpuSpeedInMGHz' argument to 'PrintResults' (this assumes the CPU has invariant TSC support).
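The arithmetic behind the nanosecond figures can be sketched as follows. `cyclesToNanos` is a hypothetical helper written for illustration, not a function of the library; it relies only on the fact that a 1 MHz clock ticks once per microsecond, so nanoseconds = cycles * 1000 / MHz.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a cycle count to nanoseconds for a counter
// ticking at `mhz` MHz. The result is only meaningful if the counter really
// ticks at that rate for the whole measurement.
inline double cyclesToNanos(std::uint64_t cycles, std::uint64_t mhz) {
    assert(mhz > 0);
    return static_cast<double>(cycles) * 1000.0 / static_cast<double>(mhz);
}
```

For example, 72 cycles at 3600 MHz corresponds to 20 ns.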
To see the current clock speed, in MHz, of every CPU core, run this on Linux:
watch -n.1 "grep \"^[c]pu MHz\" /proc/cpuinfo"
For more accurate results:
- Do not perform nested measurements.
- Measure only from one thread.
- Provide the 'zeroCodeCycles' (measurement of an empty code block) argument in 'PrintResults'.
- Measure as few code blocks as possible at the same time.
- Keep the measured block names as short as possible, so they fit in the std::string small string optimization (SSO) buffer and avoid heap allocations.
- Disable hyperthreading.
- Disable turbo boost and force the 'performance' governor.
- Make sure the measuring thread has the highest priority and is pinned to an isolated CPU core throughout the measuring period.
- Keep running your benchmarks on the same core for consistency.
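As one possible way to follow the pinning advice above, here is a Linux-only sketch that pins the calling thread to a single core with `sched_setaffinity`. The helper name `pinCurrentThreadToCore` is hypothetical; core isolation and thread priority are system configuration and are not covered here.

```cpp
#include <sched.h>   // sched_setaffinity, cpu_set_t, CPU_* macros (Linux, glibc)

// Hypothetical helper: restricts the calling thread to `core`.
// Returns true on success, false if the core index is out of range
// or the kernel rejects the request.
inline bool pinCurrentThreadToCore(int core) {
    if (core < 0 || core >= CPU_SETSIZE) return false;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return sched_setaffinity(0, sizeof(set), &set) == 0;  // pid 0 means the calling thread
}
```

Calling this once at the start of the measuring thread, with the same core index on every run, keeps successive measurements comparable.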
