agent: add support for online memory and cpu separately #331
jodh-intel merged 1 commit into kata-containers:master from cedriccchen:master
|
This PR is related to kata-containers/runtime#624 |
grpc.go (outdated)
 func (a *agentGRPC) onlineCPUMem(req *pb.OnlineCPUMemRequest) error {
-	if req.NbCpus <= 0 {
+	if req.NbCpus < 0 {
|
Hi @clarecch - thanks for raising! It might be worth changing the commit message a little. Previously -- as you have said -- CPU onlining was always attempted (even when zero cpus were specified). But also, it looks like because of that initial test, previously memory was only ever onlined when CPUs were also onlined? With this change, we online memory "always", not only when the number of cpus is zero I think? It might even be clearer to change the code to online the memory first - we could then keep the original lgtm |
|
CI is not happy, @clarecch can you fix it? |
|
@devimc I will fix it as soon as possible. |
|
@jodh-intel @devimc I think there're three solutions:
|
|
@linzichang I like option 3
|
|
@clarecch what do you think?
|
Hi @linzichang - I agree and I think for clarity it makes sense to have separate However, presumably we created the Aside: on a general point, how about renaming |
|
@devimc I like number 3 too, but it seems to need a lot of modification |
|
Yep - I think option 3 is the "cleanest", but let's get some more input from others.... |
Codecov Report
@@            Coverage Diff             @@
##           master     #331      +/-   ##
==========================================
- Coverage   44.92%   44.68%   -0.24%
==========================================
  Files          15       15
  Lines        2393     2354      -39
==========================================
- Hits         1075     1052      -23
+ Misses       1176     1167       -9
+ Partials      142      135       -7
|
@devimc @jodh-intel @linzichang @clarecch We put onlinecpu and onlinememory into one call since they are mostly called together, and we save one grpc call when they actually are. Also we did want to only use a small set of grpc APIs to keep it simple. And if we split/rename the onlinecpumem API, it will break grpc backward compatibility. So I would suggest we go with option 2 and name the new field |
|
@bergwolf If we want to avoid breaking gRPC backward compatibility, I think option 2 is the only choice. I see two possible solutions:
|
|
@clarecch, I don't think we need an extra field. The new grpc onlinecpumem handler can:
And all cases are covered. WDYT? |
|
@bergwolf Yes, this solution is quite good and covers most cases, except one: NbCpus=0 and CpuOnly=false. In this case we don't know whether NbCpus=0 is an input error or the caller really wants to hotplug memory only. Does that matter? ^^
|
@clarecch I think we can rely on So to answer your question, |
|
Adding a
diff --git a/protocols/grpc/agent.proto b/protocols/grpc/agent.proto
index b930d94..865fe0f 100644
--- a/protocols/grpc/agent.proto
+++ b/protocols/grpc/agent.proto
@@ -330,11 +330,13 @@ message ListRoutesRequest {
 message OnlineCPUMemRequest {
     // Wait specifies if the caller waits for the agent to online all resources.
     // If true the agent returns once all resources have been connected, otherwise all
     // resources are connected asynchronously and the agent returns immediately.
     bool wait = 1;
     // NbCpus specifies the number of CPUs that were added and the agent has to online.
     uint32 nb_cpus = 2;
+
+    bool cpu_only = 3;
 }
... as long as we document in that file every possible scenario (as we have done on this issue) to maximise understanding. Related: #150.
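One way to write down every scenario mentioned above is a small decision function. The following is a minimal sketch, not the agent's actual code: `onlineRequest`, `plan`, `decide`, and the error text are hypothetical names, and the only rule taken directly from this review is the rejection of `NbCpus == 0 && CpuOnly`. The other rows assume that CPUs are onlined when `NbCpus > 0` and that memory is onlined unless `CpuOnly` is set.

```go
package main

import (
	"errors"
	"fmt"
)

// onlineRequest is a hypothetical stand-in for the two fields of
// pb.OnlineCPUMemRequest discussed in this thread.
type onlineRequest struct {
	NbCpus  uint32
	CpuOnly bool
}

// plan records which resources a request would bring online.
type plan struct {
	CPU    bool
	Memory bool
}

// errNothingToOnline is an illustrative error, not the agent's real message.
var errNothingToOnline = errors.New("cpu_only is set but nb_cpus is 0")

// decide maps a request to the resources it should online.
// Only the error case is taken from the review; the rest is assumed.
func decide(req onlineRequest) (plan, error) {
	if req.NbCpus == 0 && req.CpuOnly {
		return plan{}, errNothingToOnline
	}
	return plan{CPU: req.NbCpus > 0, Memory: !req.CpuOnly}, nil
}

func main() {
	cases := []onlineRequest{
		{NbCpus: 0, CpuOnly: false}, // memory only
		{NbCpus: 0, CpuOnly: true},  // rejected: nothing to online
		{NbCpus: 2, CpuOnly: false}, // CPUs and memory
		{NbCpus: 2, CpuOnly: true},  // CPUs only
	}
	for _, c := range cases {
		p, err := decide(c)
		fmt.Printf("nb_cpus=%d cpu_only=%t -> cpu=%t memory=%t err=%v\n",
			c.NbCpus, c.CpuOnly, p.CPU, p.Memory, err)
	}
}
```

Under these assumptions, a caller that predates the new field leaves `cpu_only` at its default of `false`, so a request with `nb_cpus > 0` still onlines both CPUs and memory.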
|
I have no idea why the ubuntu-ci failed. Could someone help?
|
 func (a *agentGRPC) onlineCPUMem(req *pb.OnlineCPUMemRequest) error {
-	if req.NbCpus <= 0 {
+	if req.NbCpus == 0 && req.CpuOnly {
What about NbCpus < 0? We should fail in that case.
NbCpus is uint32, so I think there is no need to consider NbCpus < 0.
heh, good point! I thought it was int32 as we had <= there ;)
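The point about the unsigned type can be checked in isolation. This is a minimal sketch with hypothetical helper names (`oldGuard`, `newGuard`, `wrap`): for a `uint32`, `<= 0` and `== 0` accept exactly the same values, and a negative signed value converted to `uint32` wraps to a large positive number instead of staying negative.

```go
package main

import "fmt"

// oldGuard mimics the original `<= 0` check applied to an unsigned count.
func oldGuard(n uint32) bool { return n <= 0 }

// newGuard is the equivalent check written for an unsigned type.
func newGuard(n uint32) bool { return n == 0 }

// wrap converts a signed value to uint32, as would happen if a caller
// computed a negative count before filling in the request.
func wrap(n int32) uint32 { return uint32(n) }

func main() {
	for _, n := range []uint32{0, 1, 4294967295} {
		fmt.Printf("n=%d old=%t new=%t\n", n, oldGuard(n), newGuard(n))
	}
	fmt.Println(wrap(-1)) // prints 4294967295
}
```

A consequence is that a negative count cannot be detected on the agent side by comparing against zero; it would have to be caught by the caller before the conversion.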
|
CI failed with |
|
@bergwolf I have no idea what happened when CI ran k8s-memory.bats and failed. Could you please help find out the reason, if convenient?
|
The PR kata-containers/runtime#643 also fails on this test, but it only bumps the Go version. Kind of weird. I think there is something wrong with CI.
|
CI green now. It needs one more approval on the protocol change though. |
|
Hold on. @clarecch Did you generate the pb.go file with the proto-gen-go tool?
|
good catch @caoruidong! @clarecch please run |
In func OnlineCPUMem, cpu is always onlined, which is not expected. So we add "CpuOnly" field to support separating memory and cpu online.

Fixes #332

Signed-off-by: Clare Chen <clare.chenhui@huawei.com>
Signed-off-by: Zichang Lin <linzichang@huawei.com>
|
@devimc @caoruidong @bergwolf Commit is updated, and CI is green. PTAL! |
|
Thanks @clarecch ! Still need one more approval on the protocol change though. ping @kata-containers/agent |
@@ -193,21 +193,25 @@ func updateContainerCpuset(cgroupPath string, newCpuset string, cookies cookie)
 }

 func (a *agentGRPC) onlineCPUMem(req *pb.OnlineCPUMemRequest) error {
@linzichang @bergwolf
For sure this patch works, but why don't we simply define two different functions for CPU and memory?
This way, there is no need for specific flags; we would simply call the appropriate function:
onlineCPU
onlineMemory
We did discuss this a bit above (#331 (comment)) and I agree it would be a lot clearer (but potentially slower as multiple calls might be required).
Yes I can see that!
@bergwolf I understand the concern about multiple calls, but I think the cost is negligible compared to the whole stack.
As for the API compatibility breakage, we had better break it now to get it right, as the project is still not completely stable.
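The two-function alternative suggested in this thread can be sketched as follows. This is only an illustration of the shape of such an API, not code from the agent: the `onliner` interface, the `recorder` fake, and `onlineBoth` are hypothetical names, and the real handlers would act on the guest rather than append to a slice.

```go
package main

import (
	"errors"
	"fmt"
)

// onliner is a hypothetical interface with one method per resource,
// in place of a single call steered by a flag.
type onliner interface {
	OnlineCPU(nbCpus uint32) error
	OnlineMemory() error
}

// recorder is a fake implementation that only logs the calls it receives.
type recorder struct {
	calls []string
}

// OnlineCPU rejects an empty request and otherwise records the CPU count.
func (r *recorder) OnlineCPU(nbCpus uint32) error {
	if nbCpus == 0 {
		return errors.New("no CPUs to online")
	}
	r.calls = append(r.calls, fmt.Sprintf("cpu:%d", nbCpus))
	return nil
}

// OnlineMemory records a memory online request.
func (r *recorder) OnlineMemory() error {
	r.calls = append(r.calls, "memory")
	return nil
}

// onlineBoth shows the cost raised in the thread: a caller that wants both
// resources makes two calls instead of one.
func onlineBoth(o onliner, nbCpus uint32) error {
	if err := o.OnlineCPU(nbCpus); err != nil {
		return err
	}
	return o.OnlineMemory()
}

func main() {
	r := &recorder{}
	if err := onlineBoth(r, 2); err != nil {
		fmt.Println("error:", err)
	}
	fmt.Println(r.calls)
}
```

With this shape no flag is needed to express "memory only" or "CPU only", at the price of an extra round trip when both are wanted, which is the trade-off debated above.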