Linux: handle offline CPUs when getting frequencies#650
Linux: handle offline CPUs when getting frequencies#650mwahlroos wants to merge 3 commits intohtop-dev:masterfrom
Conversation
Fixes: htop-dev#648 If some CPUs have been turned offline, those CPUs will not appear in the active CPU count parsed from /proc/stat. Offline CPUs that are still present may still count in terms of sequential CPU IDs in both /proc/cpuinfo and the CPU entries in /sys/devices/system/cpu/. Reading the frequencies for the N active CPUs from CPUFreq information in /sys/devices/system/cpu/cpu[0..N-1] may thus give incorrect results if some of the CPUs [0..N-1] are actually offline. If e.g. CPUs 1 and 3 have been turned offline on a system with four CPU cores, the code would attempt to read the frequencies for the remaining two online CPUs from /sys/devices/system/cpu/cpu{0,1}, while the correct sysfs CPU entries from which to read the frequencies would be {0,2}. The code path that reads the frequencies from /proc/cpuinfo instead of CPUfreq suffers from a similar problem. This fixes the reporting of frequencies in case of offline CPUs being mixed within the range of online CPUs.
BenBE
left a comment
There was a problem hiding this comment.
Please follow the style guide regarding indentation.
| xSnprintf(pathBuffer, sizeof(pathBuffer), "/sys/devices/system/cpu/cpu%u/online", cpuId); | ||
|
|
||
| FILE* file = fopen(pathBuffer, "r"); |
There was a problem hiding this comment.
Please use the wrapper function for openat or readfile to reduce the number of FS level operations.
There was a problem hiding this comment.
Also avoid the FILE* APIs as they have quite a bit of overhead.
| static int getConfiguredCPUCount(void) { | ||
| return (int) sysconf(_SC_NPROCESSORS_CONF); | ||
| } |
There was a problem hiding this comment.
This should be available globally already IIRC.
There was a problem hiding this comment.
I tried looking for that but couldn't find it.
ProcessList has cpuCount, which is updated on Linux by LinuxProcessList_updateCPUcount, but that number only includes the number of active CPUs, and using it as the upper limit for the CPU id leads to precisely the problem that I'm trying to solve.
Any pointers to where this would be available globally?
|
|
||
| static int scanCPUFreqencyFromSysCPUFreq(LinuxProcessList* this) { | ||
| unsigned int cpus = this->super.cpuCount; | ||
| int configuredCPUs = getConfiguredCPUCount(); |
There was a problem hiding this comment.
Try to cache that value and avoid requesting it on every run; occasional refresh is no problem though.
| if (onlineCPUIndex == 0) | ||
| clock_gettime(CLOCK_MONOTONIC, &start); | ||
|
|
||
| FILE* file = fopen(pathBuffer, "r"); |
There was a problem hiding this comment.
Avoid fopen in favour of openat; opendir the path /sys/devices/system/cpu and only scan from there. May combine this with the above and opendir cpu%d additionally.
| @@ -1859,14 +1897,14 @@ static int scanCPUFreqencyFromSysCPUFreq(LinuxProcessList* this) { | |||
| if (fscanf(file, "%lu", &frequency) == 1) { | |||
|
@mwahlroos Feel free to rebase/squash commits on your branch as you see fit. The final PR should preferably only contain a (minimal) set of linear changes to implement the change without all the intermediate steps from typos and bug fixes. |
Currently htop does not support offline CPUs and hot-swapping, e.g. via
echo 0 > /sys/devices/system/cpu/cpu2/online
Split the current single cpuCount variable into activeCPUs and
existingCPUs.
Supersedes: htop-dev#650
Currently htop does not support offline CPUs and hot-swapping, e.g. via
echo 0 > /sys/devices/system/cpu/cpu2/online
Split the current single cpuCount variable into activeCPUs and
existingCPUs.
Supersedes: htop-dev#650
Related: htop-dev#580
Currently htop does not support offline CPUs and hot-swapping, e.g. via
echo 0 > /sys/devices/system/cpu/cpu2/online
Split the current single cpuCount variable into activeCPUs and
existingCPUs.
Supersedes: htop-dev#650
Related: htop-dev#580
Currently htop does not support offline CPUs and hot-swapping, e.g. via
echo 0 > /sys/devices/system/cpu/cpu2/online
Split the current single cpuCount variable into activeCPUs and
existingCPUs.
Supersedes: htop-dev#650
Related: htop-dev#580
Currently htop does not support offline CPUs and hot-swapping, e.g. via
echo 0 > /sys/devices/system/cpu/cpu2/online
Split the current single cpuCount variable into activeCPUs and
existingCPUs.
Supersedes: htop-dev#650
Related: htop-dev#580
Currently htop does not support offline CPUs and hot-swapping, e.g. via
echo 0 > /sys/devices/system/cpu/cpu2/online
Split the current single cpuCount variable into activeCPUs and
existingCPUs.
Supersedes: htop-dev#650
Related: htop-dev#580
Currently htop does not support offline CPUs and hot-swapping, e.g. via
echo 0 > /sys/devices/system/cpu/cpu2/online
Split the current single cpuCount variable into activeCPUs and
existingCPUs.
Supersedes: htop-dev#650
Related: htop-dev#580
|
@mwahlroos can this be closed as #656 has been merged? |
|
Closing this PR as #656 does this better. |
Fixes: #648
When getting frequencies from CPUFreq, the CPU nodes in sysfs are iterated from 0 to M-1 where M is the number of configured CPUs in the system and which should include offline ones as well, thus allowing for iterating through the full number of possible CPUs. The ones that are offline according to sysfs are ignored, and frequencies are fetched for up to N others where N is the number for which CPUData entries have been allocated.
When getting frequencies from
/proc/cpuinfo, the CPU nodes and their frequencies are just taken in the order in which they appear. Offline CPUs do not seem to appear incpuinfo, so they're automatically skipped. The CPU IDs fromcpuinfoare ignored, apart from checking the output line format.Possible issues that I can think of:
_SC_NPROCESSORS_CONFused for getting the total number of CPUs here (in the CPUFreq code path) isn't standard, but it seems to have been mentioned on the Linux man pagesysconf(3)since circa 2011./sys/devices/system/cpu/onlinefor a list of online CPUs instead, which would require a bit of work to do properly but would allow for a single read./proc/cpuinfocode path in its form suggested here assumes that offline CPUs do not appear and that the online nodes appear in the order of their CPU IDs. The author of the previous code didn't seem to want to make that assumption, and I'm not sure if it's valid.Any feedback is welcome!