Skip to content

CPU utilization computation fixes#2196

Merged
jkotas merged 2 commits into
dotnet:masterfrom
jkotas:cpu-utilization
Jan 25, 2020
Merged

CPU utilization computation fixes#2196
jkotas merged 2 commits into
dotnet:masterfrom
jkotas:cpu-utilization

Conversation

@jkotas
Copy link
Copy Markdown
Member

@jkotas jkotas commented Jan 25, 2020

  • On Unix, move scalling for total number of processors from PAL to managed side, so that it can use container limit aware ProcessorCount
  • Delete asserts for CPU utilization being between 0 and 100. These asserts can fail due to races or rounding errors.
  • Converted a few classes to structs

Fixes #2195

- On Unix, move scalling for total number of processors from PAL to managed side, so that it can use container limit aware ProcessorCount
- Delete asserts for CPU utilization being between 0 and 100. These asserts can fail due to races or rounding errors.
- Converted a few classes to structs

Fixes #2195
* compute the current process's CPU utilization instead.
* compute the current process's CPU utilization instead. The CPU utilization
* returned is sum of utilization accross all processors, e.g. this function will
* return 200 when two cores are running at 100%.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by removing the multiplication by number of processors, but then this saying it's larger for more processors. I'm sure you're right, but what's my misunderstanding?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

System.Native shim does not have easy access to the process count adjusted for container limits. The processor count on managed size is adjusted and thus it is the right number to use to compute how much spare cycles may extra threads take advantage of.

In other words, this change should get us equivalent of this piece of logic from CoreCLR PAL.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's essentially removing division, not multiplication. It's not obvious from reading the unified diff because GitHub hides the cpuUtilization = (int32_t)(cpuBusyTime * 100 / cpuTotalTime); line. The multiplication in cpuTotalTime is affecting the divisor.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah, I missed that, thanks.

long cpuTotalTime = ((long)userTime - _processCpuInfo.userTime) + ((long)kernelTime - _processCpuInfo.kernelTime);
long cpuBusyTime = cpuTotalTime - ((long)idleTime - _processCpuInfo.idleTime);
long cpuTotalTime = ((long)userTime - _userTime) + ((long)kernelTime - _kernelTime);
long cpuBusyTime = cpuTotalTime - ((long)idleTime - _idleTime);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any tests that would fall if we got something wrong here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, this is only used for Mono on Windows that gets a very light testing. If we got this wrong, this would manifest itself as a performance bug in thread pool scaling.

Hopefully, we will find time to switch to the managed threadpool for CoreCLR too. Validating details like this would be the bulk of the work for that.

@jkotas jkotas merged commit 15f5465 into dotnet:master Jan 25, 2020
@jkotas jkotas deleted the cpu-utilization branch January 26, 2020 04:35
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SystemNative_GetCpuUtilization assert failing under load

4 participants