Description
Windows Build Number
10.0.19045.0
Processor Architecture
AMD64
Memory
2x8 GB DDR4
Storage Type, free / capacity
NVMe SSD, 138 GB / 512 GB
Relevant apps installed
N/A.
Traces collected via Feedback Hub
N/A.
Issue description
Hello,
I have not found this issue reported anywhere else. I am developing an application that requires sub-millisecond sleep precision when calling functions such as Sleep(1). To achieve this, I raise the clock interrupt frequency by calling NtSetTimerResolution. I decided to benchmark how Sleep(1) precision scales with the Timer Resolution, using QueryPerformanceCounter, to determine a good balance between power efficiency and precision; however, I encountered unexpected results.
#include <iomanip>
#include <iostream>
#include <windows.h>
#include <winternl.h>
#include <tlhelp32.h>
#include <vector>

extern "C" NTSYSAPI NTSTATUS NTAPI NtSetTimerResolution(ULONG DesiredResolution, BOOLEAN SetResolution, PULONG CurrentResolution);

int main() {
    // benchmark 0.5ms - 1ms Timer Resolution
    double begin = 0.5;
    double stop = 1;
    double increment = 0.002;
    int samples = 20;
    ULONG minimum_resolution, current_resolution;
    LARGE_INTEGER start, end, freq;
    QueryPerformanceFrequency(&freq);
    std::cout << "RequestedResolutionMs,DeltaMs\n";
    for (double resolution = begin; resolution <= stop; resolution += increment) {
        // resolution is in ms; NtSetTimerResolution expects 100 ns units
        NtSetTimerResolution((ULONG)(resolution * 10000), TRUE, &current_resolution);
        // get an average result for 20 Sleep(1) benchmarks for each resolution
        std::vector<double> sleep_delays;
        for (int i = 0; i < samples; i++) {
            // benchmark Sleep(1)
            QueryPerformanceCounter(&start);
            Sleep(1);
            QueryPerformanceCounter(&end);
            double delta_s = (double)(end.QuadPart - start.QuadPart) / freq.QuadPart;
            double delta_ms = delta_s * 1000;
            double delta_from_sleep = delta_ms - 1;
            sleep_delays.push_back(delta_from_sleep);
        }
        size_t size = sleep_delays.size();
        double sum = 0.0;
        for (double delay : sleep_delays) {
            sum += delay;
        }
        double average = sum / size;
        std::cout << resolution << "," << average << "\n";
    }
}

The program above outputs the Sleep(1) delta from 1 ms at different clock interrupt intervals in CSV format. As shown in the graph below, there is a directly proportional relationship between the clock interrupt frequency and Sleep(1) precision, as I would expect, but the sleep precision at a 0.5 ms Timer Resolution is worse than at a slightly lower resolution such as 0.506 ms. In fact, a 0.5 ms resolution provides the same precision as ~0.745 ms, which is a lower resolution. How does this make sense? I asked a group of people online to repeat my benchmark, and the behaviour was reproducible on several machines (30+).
Why is this the case? Is there anything that can be done to resolve this phenomenon? Several developers query the maximum resolution using NtQueryTimerResolution and then set the Timer Resolution according to that value (0.5 ms); they are missing out on precision because of this. It is almost as if a slight offset results in higher precision (0.5 ms + 0.006 ms in my case). The behaviour outlined in this issue is of course not expected, in the sense that a higher resolution results in the same sleep precision as a lower resolution.
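For anyone who wants to experiment with the offset idea, here is a minimal sketch (assuming the same ntdll.lib linker setup as the benchmark above); the +0.006 ms offset is just the value from my benchmarks, not a documented constant:

#include <windows.h>
#include <winternl.h>
#include <iostream>

extern "C" NTSYSAPI NTSTATUS NTAPI NtQueryTimerResolution(PULONG MinimumResolution, PULONG MaximumResolution, PULONG CurrentResolution);
extern "C" NTSYSAPI NTSTATUS NTAPI NtSetTimerResolution(ULONG DesiredResolution, BOOLEAN SetResolution, PULONG CurrentResolution);

int main() {
    ULONG minimum, maximum, current;
    // MinimumResolution is the coarsest supported interval, MaximumResolution the finest,
    // both in 100 ns units (5000 == 0.5 ms on my machine)
    NtQueryTimerResolution(&minimum, &maximum, &current);

    // instead of requesting the finest (maximum) resolution directly, back off by a small
    // empirical offset of 60 x 100 ns == 0.006 ms, per the benchmark results above
    ULONG requested = maximum + 60;
    NtSetTimerResolution(requested, TRUE, &current);

    std::cout << "Requested " << requested / 10000.0 << " ms, granted "
              << current / 10000.0 << " ms\n";
}

I would still prefer to understand why requesting exactly 0.5 ms behaves this way rather than relying on an offset like this.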
Steps to reproduce
- Compile the program and configure ntdll.lib as a linker dependency (example commands after this list)
- Close any external programs that may be requesting a resolution higher than 1ms
- Run the program and either copy + paste the results into results.txt or redirect stdout
- Visit https://chart-studio.plotly.com/create and click Import -> Upload -> results.txt
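For reference, assuming the MSVC toolchain and that the benchmark source is saved as sleep_bench.cpp (a placeholder file name), steps 1 and 3 can look like this:

cl /EHsc /O2 sleep_bench.cpp ntdll.lib
sleep_bench.exe > results.txt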
Expected Behavior
A 0.5 ms Timer Resolution results in high Sleep(1) precision and outperforms lower Timer Resolutions such as 0.506 ms.
Actual Behavior
A 0.5 ms Timer Resolution results in low Sleep(1) precision and underperforms compared to a lower Timer Resolution such as 0.506 ms.
