Description
Windows Build Number
10.0.19044.0
Processor Architecture
AMD64
Memory
16GB
Storage Type, free / capacity
SSD 512GB
Relevant apps installed
Windows
Traces collected via Feedback Hub
N/A
Issue description
I tested PrefetchVirtualMemory as a way to minimize page faults when reading a file through a file mapping. I measured the performance (CPU ticks, using the __rdtsc() intrinsic) and the actual page fault count (using GetProcessMemoryInfo) when processing files, with and without a call to PrefetchVirtualMemory.
To my surprise, PrefetchVirtualMemory does not work as advertised: it does not do its only job, which is to fetch the memory from disk as efficiently as possible to minimize page faults.
I used the following process for testing, loading a range of file sizes (I'll share my results for a ~500MB file):
I opened a file handle, created a memory map and a view of the whole file, then I processed the whole file making sure to touch every page of the allocated view.
Then I repeated the process, and called PrefetchVirtualMemory on the view before processing the file.
I got exactly the same overall page fault count in both cases, and to add insult to injury, performance was slightly worse with PrefetchVirtualMemory, which makes the call useless, since reducing page faults is its only job.
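The "touch every page" step described above does not need to read every byte to trigger the faults. Here is a minimal sketch of reading one byte per page, assuming the caller supplies the page size (on Windows it would typically come from GetSystemInfo). The helper names page_count and touch_every_page are hypothetical and are not part of the reproduction code in this report.

```c
#include <assert.h>
#include <stddef.h>

/* Number of pages needed to cover `size` bytes (rounds up). */
static size_t page_count(size_t size, size_t page_size) {
    assert(page_size > 0);
    return size / page_size + (size % page_size != 0);
}

/* Read one byte from every page of `view` and return how many pages were read.
   The running sum is written through a volatile pointer so the optimizer
   cannot drop the reads. */
static size_t touch_every_page(const char *view, size_t size, size_t page_size,
                               volatile unsigned char *sink) {
    assert(page_size > 0);
    size_t touched = 0;
    unsigned char sum = 0;
    for (size_t offset = 0; offset < size; offset += page_size) {
        sum += (unsigned char)view[offset];
        ++touched;
    }
    *sink = sum;
    return touched;
}
```

The return value of touch_every_page gives a baseline to compare against the measured PageFaultCount delta: if every page of the view faults once, the two numbers should be roughly similar.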
Steps to reproduce
Run the following code on Windows:
#include <windows.h>
#include <psapi.h> // for GetProcessMemoryInfo
#include <stdio.h>

char process_file(const char *file_name, bool use_prefetch_virtual_memory) {
    HANDLE file_handle = CreateFileA(file_name, GENERIC_READ, FILE_SHARE_READ,
                                     0, OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, 0);
    if(file_handle == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "Could not open the file '%s'\n", file_name);
        return 0;
    }
    LARGE_INTEGER file_size;
    if(!GetFileSizeEx(file_handle, &file_size)) {
        CloseHandle(file_handle);
        fprintf(stderr, "Could not get the size of the file '%s'\n", file_name);
        return 0;
    }
    size_t size = (size_t)file_size.QuadPart;

    // Map the whole file read-only.
    HANDLE mapping_handle = CreateFileMapping(file_handle, NULL, PAGE_READONLY | SEC_COMMIT, 0, 0, NULL);
    if(!mapping_handle) {
        CloseHandle(file_handle);
        fprintf(stderr, "Could not create a file mapping for '%s'\n", file_name);
        return 0;
    }
    const char *view = (const char *)MapViewOfFile(mapping_handle, FILE_MAP_READ, 0, 0, 0);
    if(!view) {
        CloseHandle(mapping_handle);
        CloseHandle(file_handle);
        fprintf(stderr, "Could not map a view of '%s'\n", file_name);
        return 0;
    }

    // Snapshot the page fault counter before the prefetch, after it, and after processing.
    PROCESS_MEMORY_COUNTERS counter0, counter1, counter2;
    GetProcessMemoryInfo(GetCurrentProcess(), &counter0, sizeof(counter0));
    WIN32_MEMORY_RANGE_ENTRY memory_range_entry[] = {{(void *)view, size}};
    if(use_prefetch_virtual_memory && !PrefetchVirtualMemory(GetCurrentProcess(), 1, memory_range_entry, 0)) {
        fprintf(stderr, "PrefetchVirtualMemory failed (error %lu)\n", GetLastError());
    }
    GetProcessMemoryInfo(GetCurrentProcess(), &counter1, sizeof(counter1));

    char result = 0; // I'm using "result" here only so that the optimizer won't skip this loop
    for(size_t i = 0; i < size; ++i) { // size_t index: an int would overflow on files over 2GB
        result += view[i];
    }
    GetProcessMemoryInfo(GetCurrentProcess(), &counter2, sizeof(counter2));

    DWORD prefetch_fault_count = counter1.PageFaultCount - counter0.PageFaultCount;
    DWORD manual_fault_count = counter2.PageFaultCount - counter1.PageFaultCount;
    printf("Using PrefetchVirtualMemory: %s\n", use_prefetch_virtual_memory ? "yes" : "no");
    printf("Page faults (during prefetch): %lu\n", prefetch_fault_count);
    printf("Page faults (during file processing): %lu\n", manual_fault_count);
    printf("Page faults (overall): %lu\n", prefetch_fault_count + manual_fault_count);

    UnmapViewOfFile(view);
    CloseHandle(mapping_handle);
    CloseHandle(file_handle);
    return result;
}
Expected Behavior
Using PrefetchVirtualMemory: no
Page faults (during prefetch): 0
Page faults (during file processing): 161027
Page faults (overall): 161027
Using PrefetchVirtualMemory: yes
Page faults (during prefetch): 314
Page faults (during file processing): **MUCH LOWER** than 160713
Page faults (overall): **MUCH LOWER** than 161027
Actual Behavior
Using PrefetchVirtualMemory: no
Page faults (during prefetch): 0
Page faults (during file processing): 161027
Page faults (overall): 161027
Using PrefetchVirtualMemory: yes
Page faults (during prefetch): 314
Page faults (during file processing): 160713
Page faults (overall): 161027