Read /proc/net files with a single read syscall. #361
Merged
shirou merged 2 commits into shirou:master on May 4, 2017
The /proc/net files are not guaranteed to be consistent; they are only consistent at the row level. This is probably one of the reasons why consecutive read calls might return duplicate entries: the kernel is changing the file as it is being read. In certain situations this can lead to loop-like behavior, where the same net entry is returned repeatedly while the file is being read as new connections are added to the kernel TCP table, i.e. there can be a lot of duplicates. This commit tries to reduce the duplicates by fetching the contents of the net files with a single read syscall.
Owner
|
So cool! Thank you. Could you add some comments about why not to use |
Owner
|
Thank you for your contribution! |
apoorv007 added a commit to archsaber/gopsutil that referenced this pull request Jun 24, 2017
|
ioutil.ReadFile is still slow on CentOS 6 when /proc/net/tcp is bigger than 30M. But that is not Go's problem; using cat is slow too. |
Collaborator
This discussion on Stack Overflow goes into more detail on how the /proc files work in terms of consistency.
In our use case, there were certain situations where the netstat telegraf input plugin spiked, eating up a lot of memory. On further inspection it appeared that this was due to a large number of entries in the /proc/net/tcp file. These entries, however, did not correspond to a higher number of TCP connections, which remained within the expected range. This leads me to believe that in certain situations loop-like conditions are created and a lot of duplicate rows appear in /proc/net/tcp.
Consider the following graphs:
The memory in the graph is the memory used by the netstat telegraf plugin. You can see how spikes in the /proc/net/tcp length correspond to spikes in netstat's memory usage. Additionally, in those spikes, there are no real spikes in TCP connection counts. The length of /proc/net/tcp was gathered separately using cat /proc/net/tcp. cat itself issues many read calls to the file, so it has the same behavior as the netstat plugin.
After the patch the graphs look like this:
There are no more spikes in memory.
Slightly more CPU is consumed, but this is offset by the reduction in duplicates and the stabilization of memory usage. For smaller use cases I don't think CPU usage will be an issue; for larger ones, single reads yield much better performance overall.
In our use case we are observing a 56% improvement in the time required to gather the netstat data: