Skip to content

libct/cg/fs: getPageUsageByNUMA: rewrite/optimize, fix panic, add more tests#2755

Merged
AkihiroSuda merged 3 commits into
opencontainers:masterfrom
kolyshkin:numa-stat
Feb 1, 2021
Merged

libct/cg/fs: getPageUsageByNUMA: rewrite/optimize, fix panic, add more tests#2755
AkihiroSuda merged 3 commits into
opencontainers:masterfrom
kolyshkin:numa-stat

Conversation

@kolyshkin
Copy link
Copy Markdown
Contributor

@kolyshkin kolyshkin commented Jan 20, 2021

This is a rewrite of getPageUsageByNUMA, fixing a number of issues.

  1. Fix a panic for lines that do not contain = (first commit, by @Ace-Tang).

  2. Be less strict to unknown contents, i.e. skip it. This makes the
    function more future-proof. Before this commit, if a line like
    "a=b" is encountered, the function returns an error, which is
    propagated all the way up to and returned by (CgroupManager).GetStats.

  3. Be more strict to contents it recognizes, i.e. return an error.
    In case the first field in the line is recognized (e.g. "total=123",
    the rest of the line should be in format "N= ...".

  4. Optimize. Before this commit, addNUMAStatsByType was called for every
    item in the line, which is excessive and might even be slow in case
    there are many NUMA nodes. It is enough to look up the field once.
    This probably helps to reduce the load on garbage collector, too.

  5. Remove a bunch of global numaNode* and numaStat* constants. Those
    were used by only one function, and it does not make sense to have
    them defined globally. Some were moved to the function, some were
    eliminated entirely.

  6. Improve readability and added code comments.

Finally, add some test cases for good and bad contents.

NOTE this PR includes #2751.

Closes: #2751

strings.SplitN not always return N fields if not staify, sometimes
cgroup interface add some custom fields make parse memory.numa_stat
fails, it will case panic

Signed-off-by: acetang <aceapril@126.com>
@kolyshkin
Copy link
Copy Markdown
Contributor Author

CI failure is unrelated (flaky nsexec debug logging, see #2756). Restarted.

dqminh
dqminh previously approved these changes Jan 20, 2021
Comment thread libcontainer/cgroups/fs/memory.go Outdated
// first item was already validated,
// so be strict to the rest
return stats, fmt.Errorf("malformed line %q in %s",
scanner.Text(), filename)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's captured the line before and reuse here, scanner.Text() effectively convert []byte to string, which afaik is a little costly, even though it doesn't matter that much in practice because this is error case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I thought about it and figured out optimizing the error case makes little sense.

Anyway, fixed, and I also made a few tiny improvements (better comments, use break instead of continue nextline). Please re-lgtm :)

Rewrite getPageUsageByNUMA

1. Be less strict to unknown contents, i.e. skip it. This makes the
   function more future-proof. Before this commit, if a line like
   "a=b" is encountered, the function returns an error, which is
   propagated all the way up to and returned by (CgroupManager).GetStats.

2. Be more strict to contents it recognizes, i.e. return an error.
   In case the first field in the line is recognized (e.g. "total=123",
   the rest of the line should be in format "N<id>=<value> ...".

3. Optimize. Before this commit, addNUMAStatsByType was called for every
   item in the line, which is excessive and might even be slow in case
   there are many NUMA nodes. It is enough to look up the field once.

4. Remove a bunch of global numaNode* and numaStat* constants. Those
   were used by only one function, and it does not make sense to have
   them defined globally. Some were moved to the function, some were
   eliminated entirely.

5. Improve readability and added code comments.

Finally, add some test cases for good and bad contents.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
It was already explained why we ignore the error, so let's ignore this
deliberately.

This fixes

> name.go:22:7: Error return value of `join` is not checked (errcheck)

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin kolyshkin changed the title libct/cg/fs: getPageUsageByNUMA: rewrite/optimize libct/cg/fs: getPageUsageByNUMA: rewrite/optimize, fix panic, add more tests Jan 21, 2021
@kolyshkin kolyshkin requested a review from dqminh January 21, 2021 18:26
@AkihiroSuda AkihiroSuda merged commit 2046f26 into opencontainers:master Feb 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants