Skip to content
This repository was archived by the owner on Apr 20, 2026. It is now read-only.

COMPUTE-5948: Add Lyft patches on top of 1.23.17#39

Merged
abhinavdahiya merged 15 commits intorelease-1.23.17-lyft.1from
custom-patch
Jun 26, 2023
Merged

COMPUTE-5948: Add Lyft patches on top of 1.23.17#39
abhinavdahiya merged 15 commits intorelease-1.23.17-lyft.1from
custom-patch

Conversation

@abhinavdahiya
Copy link
Copy Markdown

@abhinavdahiya abhinavdahiya commented Jun 15, 2023

  • 49b8055 - (origin/release-1.22.9-lyft.5, release-1.22.9-lyft.5) containers should always be allowed to start when no sidecar is present (containers should always be allowed to start when no sidecar is present #40) (13 minutes ago)
  • e698aaa - (origin/release-1.22.9-lyft.4, release-1.22.9-lyft.4) container ordering: Don't run application container when container status is not available (container ordering: Don't run application container when container status is not available #37) (6 weeks ago)
  • 04127ec - (origin/release-1.22.9-lyft.3, release-1.22.9-lyft.3) Merge pull request sidecar ordering: handle case where there is no known container state #35 from lyft/fix_container_wait (6 weeks ago)
    |
    | * 2dccc3d - compare to zero value (8 weeks ago)
    | * f7be970 - handle case where there is no known container state (8 weeks ago)
    |/
  • 2706c97 - (origin/release-1.22.9-lyft.2, release-1.22.9-lyft.2) Add gp3 EBS to legacy cloud provider (8 weeks ago)
  • 08c099f - (origin/release-1.22.9-lyft.1) Merge pull request Release 1.22 camille cherrypick #34 from lyft/release-1.22-camille-cherrypick (12 months ago)
  • 8ab788f - disable klog for cadvisor.GetDirFsInfo cache miss (2 years, 9 months ago)
  • 0d89a1d - remove unnecessary line (3 years, 9 months ago)
  • 506458e - do not consider exited containers when calculating nanocores (3 years, 9 months ago)
  • 3872011 - pkg/kubelet: fix uint64 overflow when elapsed UsageCoreNanoSeconds exceeds 18446744073 (3 years, 10 months ago)
  • 04c8515 - pkg/kubelet: fix 1.14 compat for container restore error text (3 years, 10 months ago)
  • fb73a00 - pkg/kubelet: try restoring the container spec if its nil (3 years, 11 months ago)
  • d0e11fe - Allow metrics to be retrieved from CRI with CRI-O (4 years, 4 months ago)
  • dd4723b - remove outdated import libraries and change the outdated functions (12 months ago)
  • 5b5d5af - sidecar: glog -> klog (3 years, 11 months ago)
  • 44b64e7 - sidecar: kubelet: don't bother killing pods when non-sidecars are done (4 years ago)
  • 09c64ac - sidecar: container ordered start/shutdown support (4 years, 4 months ago)

using cherry-pick

git cherry-pick release-1.22.9..release-1.22.9-lyft.5

tomwans and others added 12 commits June 14, 2023 19:51
This change turns off the ability to completely kill pods when the
non-sidecars are done. This is useful for cronjobs, where the
non-sidecars finish work and exit, this code previously would clean up
the pod and its resources.

This feature was pulled in from kubernetes#75099.

This is a feature that sounds nice in practice, but its not what we
need. It seems to be a bit buggy since the Pod sandbox can
potentially be deleted and recreated during the liftime of the
Pod. That ain't good.
CRI-O properly implements the CRI interface, and therefore it is
capable of returning the container stats if being asked for.

There is no reason to keep CRI-O as a special use case that has to
be run with the legacy mode making kubelet using cadvisor on each
container.

This patch removes the hardcoded assumptions that CRI-O has cannot
handle to return containers stats through CRI.

Fixes kubernetes#73750

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
We're not guaranteed that the pod passed in has the ContainerSpec
we're looking for. With this, we check if the pod has the container
spec, and if it doesn't, we try to recover it one more time.
@abhinavdahiya abhinavdahiya changed the title Add Lyft patches on top of 1.23.17 COMPUTE-5948: Add Lyft patches on top of 1.23.17 Jun 22, 2023
@abhinavdahiya
Copy link
Copy Markdown
Author

abhinavdahiya commented Jun 22, 2023

d0e11fe - Allow metrics to be retrieved from CRI with CRI-O (4 years, 4 months ago)

this is still needed as upstream behavior has not changed https://github.com/kubernetes/kubernetes/blob/release-1.23/pkg/kubelet/cadvisor/util.go#L76

8ab788f - disable klog for cadvisor.GetDirFsInfo cache miss (2 years, 9 months ago)

this is still needed as the upstream issue is still open kubernetes#94825

2706c97 - (origin/release-1.22.9-lyft.2, release-1.22.9-lyft.2) Add gp3 EBS to legacy cloud provider (8 weeks ago)

this is still needed as we will migrate to EBS CSI driver after updating the kubelet to 1.23, so in 1.24 this can be removed.

@abhinavdahiya abhinavdahiya changed the base branch from release-1.23.17 to release-1.23.17-lyft.1 June 22, 2023 19:35
abhinavdahiya and others added 3 commits June 26, 2023 14:28
Based on the api documentation
https://pkg.go.dev/k8s.io/api/core/v1#ContainerState
```
ContainerState holds a possible state of container. Only one of its members may be specified.
If none of them is specified, the default one is ContainerStateWaiting.
```

There is a case where no explicit state is known for a container and it must be
treated as waiting state.

So we must check for nil value (since the State is a value and pointer we have to check
that all possible fields are nil , :( ) and treat it as Waiting state.

Also fixes the `Restart running non-sidecars despite sidecar becoming not ready` test.
A pod if needs to be restarted (i.e it had existed and has run at least one) the state
of the container should be set to last running state.

compare to zero value
…atus is not available (#37)

1. When all container statuses are nil, we should default to application
containers wait untill statuses are available

2. When application container is nil, we should default to application
containers waiting state
…nt (#40)

based on
https://lyft.slack.com/archives/C017X524VC7/p1685639606706689?thread_ts=1685481685.730389&cid=C017X524VC7

when there are no sidecars we should allow all containers to start even
when there is lack of container status.
This fixes a regression introduced in
#37 where we would block starting
containers until we get container status reported. this triggers KillPod
action in 1.16 as no progress can be made.

---------

Co-authored-by: Tom Wanielista <tomwans@users.noreply.github.com>
@abhinavdahiya abhinavdahiya force-pushed the custom-patch branch 2 times, most recently from acb04ad to 0481e2b Compare June 26, 2023 18:31
@abhinavdahiya abhinavdahiya merged commit 200be9a into release-1.23.17-lyft.1 Jun 26, 2023
@abhinavdahiya abhinavdahiya deleted the custom-patch branch June 26, 2023 18:32
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants