fix: preemptible to on-demand fallback on ICE errors #64

Open
GEverding wants to merge 3 commits into zoom:main from GEverding:fix-preemptive-ondemand-out-of-capacity

@GEverding GEverding commented Mar 3, 2026

Summary

Fixes three compounding bugs that prevented Karpenter from falling back to on-demand capacity when preemptible instance launches fail with "Out of host capacity" errors (InsufficientCapacityError, abbreviated ICE below). Each bug independently defeated the unavailable offerings cache, so all three had to be fixed together for the fallback to work.

Bug 1: Wrong capacity type written to cache

MarkUnavailableForLaunchInstanceErr was called with a hardcoded CapacityTypeOnDemand instead of the actual capacity type. Preemptible ICE errors marked on-demand as unavailable — the exact opposite of correct behavior.

Fix: Pass the actual capacityType variable from the launch path.
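
As a rough illustration of Bug 1, the sketch below models the cache as a plain map and records an ICE against the capacity type that was actually attempted. All names here (`offeringKey`, `unavailableCache`, `recordICE`) are hypothetical stand-ins rather than the provider's real types, and the real cache presumably also expires entries, which is omitted.

```go
package main

// Capacity type labels as they appear in the PR description.
const (
	capacityTypeOnDemand    = "on-demand"
	capacityTypePreemptible = "preemptible"
)

// offeringKey identifies one offering in the simplified cache below.
type offeringKey struct {
	shape, zone, capacityType string
}

// unavailableCache is a hypothetical, TTL-free stand-in for the
// unavailable offerings cache.
type unavailableCache map[offeringKey]bool

func (c unavailableCache) markUnavailable(shape, zone, capacityType string) {
	c[offeringKey{shape, zone, capacityType}] = true
}

func (c unavailableCache) isUnavailable(shape, zone, capacityType string) bool {
	return c[offeringKey{shape, zone, capacityType}]
}

// recordICE marks the capacity type that was actually attempted.
// The buggy behavior described above corresponds to always passing
// capacityTypeOnDemand here, regardless of what was launched.
func recordICE(c unavailableCache, shape, zone, attempted string) {
	c.markUnavailable(shape, zone, attempted)
}
```

With this shape, a preemptible ICE leaves the on-demand entry untouched, so a later scheduling round can still select it.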

Bug 2: Cache key mismatch between write and read

The ICE handler wrote the cache key using instanceType.Name — the composite key like VM.Standard.E4.Flex-8-16384 (shape-cpu-mem). But CreateOfferings checked IsUnavailable using the raw OCI shape name VM.Standard.E4.Flex. The keys never matched, making the cache a no-op.

Fix: Use instanceType.Requirements.Get(LabelInstanceShapeName).Any() to get the raw shape name when writing the cache key.
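
The mismatch in Bug 2 can be reproduced with two strings and a map. This is a minimal sketch under assumed names: `shapeInstanceType`, `compositeName`, and `cacheKeyForWrite` are hypothetical, and the `Shape` field stands in for the value read from the `LabelInstanceShapeName` requirement.

```go
package main

import "fmt"

// shapeInstanceType is a hypothetical stand-in for an instance type.
type shapeInstanceType struct {
	Name  string // composite name, e.g. "VM.Standard.E4.Flex-8-16384"
	Shape string // raw OCI shape name, e.g. "VM.Standard.E4.Flex"
}

// compositeName mimics the shape-cpu-mem naming described above.
func compositeName(shape string, cpus, memoryMiB int) string {
	return fmt.Sprintf("%s-%d-%d", shape, cpus, memoryMiB)
}

// cacheKeyForWrite returns the key the ICE handler should write.
// Using the raw shape keeps it identical to the key the read path checks;
// returning it.Name instead would reproduce the key mismatch.
func cacheKeyForWrite(it shapeInstanceType) string {
	return it.Shape
}
```

A lookup by the raw shape name only hits when the write path uses the same raw shape; an entry written under the composite name is never found by that lookup.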

Bug 3: Capacity type for launch derived from NodeClaim requirements, not selected offering

When a NodePool allows both on-demand and preemptible, the NodeClaim's spec.requirements contains capacity-type: In [on-demand, preemptible]. The Create method checked nodeReqs.Has("preemptible") — always true — so it always launched preemptible, even when the unavailable cache had correctly filtered preemptible offerings out and pickBestInstanceType had sorted on-demand as the cheapest available option.

Fix: pickBestInstanceType now returns the capacity type of the cheapest available offering for the selected zone, and Create uses that instead of reading from the NodeClaim requirements.
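
A simplified sketch of the selection rule in Bug 3 follows. It is not the actual `pickBestInstanceType` implementation: the `offering` type, the `cheapestCapacityType` helper, and the prices used with it are assumptions made for illustration, and the `Available` flag stands in for the result of the unavailable cache check.

```go
package main

import "errors"

// offering is a hypothetical, reduced view of an instance type offering.
type offering struct {
	Zone         string
	CapacityType string
	Price        float64
	Available    bool // false when the unavailable cache has filtered it out
}

var errNoOffering = errors.New("no available offering in zone")

// cheapestCapacityType returns the capacity type of the cheapest available
// offering in the given zone. The launch path would use this value instead
// of inspecting the NodeClaim requirements, which list every allowed type.
func cheapestCapacityType(offerings []offering, zone string) (string, error) {
	var best offering
	found := false
	for _, o := range offerings {
		if o.Zone != zone || !o.Available {
			continue
		}
		if !found || o.Price < best.Price {
			best, found = o, true
		}
	}
	if !found {
		return "", errNoOffering
	}
	return best.CapacityType, nil
}
```

When the preemptible offering is marked unavailable, the only remaining candidate in the zone is on-demand, so the launch follows the cache state rather than the requirement set.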

Verification

  • All three fixes deployed to a production cluster in US-ASHBURN-AD-1
  • mix-large NodePool (single shape VM.Standard3.Flex): 8/8 nodeclaims launched as capacity-type: on-demand after preemptible ICE — confirmed fallback working
  • Unit tests added covering both the fallback path (preemptible unavailable → on-demand launch) and the happy path (preemptible available → preemptible launch)

When a preemptible instance launch fails with an InsufficientCapacityError
(Out of host capacity or service limit exceeded), the unavailable offerings
cache was incorrectly marking the on-demand offering as unavailable due to a
hardcoded corev1.CapacityTypeOnDemand parameter.

This caused a pathological feedback loop: preemptible launches would fail,
on-demand would be marked unavailable, and subsequent scheduling rounds would
continue selecting preemptible (since it appeared 'available' in the cache)
while on-demand was never attempted. The result was infinite ICE churn with
nodeclaims being created, failing, and deleted every ~10 seconds — burning
OCI API quota without ever falling back to on-demand capacity.

The fix passes the actual capacityType variable (determined from the
NodeClaim's requirements at line 114) instead of the hardcoded on-demand
constant. This ensures that when preemptible ICEs, the preemptible offering
is marked unavailable, allowing Karpenter's next scheduling round to select
on-demand as a fallback.

The unavailable offerings cache had a key mismatch between write and read
paths. On ICE errors, the cache was written with the composite instance
type name (e.g. 'VM.Standard.E3.Flex-16-32768' which includes cpu and
memory suffixes), but the offering creation path reads using the raw OCI
shape name (e.g. 'VM.Standard.E3.Flex').

This meant cache lookups never matched cache writes, so offerings were
never actually marked as unavailable during scheduling. Combined with the
previous fix (53bd7a4) that corrected the capacity type parameter, this
ensures that when a preemptible launch ICEs:

1. The correct capacity type (preemptible) is marked unavailable
2. Using a cache key that matches what the scheduler checks
3. Allowing the next scheduling round to fall back to on-demand

Without both fixes, Karpenter would infinitely retry preemptible launches
against exhausted OCI capacity without ever attempting on-demand.
…rements

When a NodePool allows both on-demand and preemptible, the NodeClaim's
requirements contain both capacity types. Previously, Create() checked
nodeReqs.Has("preemptible") which was always true, so it always
launched preemptible — even when preemptible was marked unavailable in
the offerings cache after an ICE error.

pickBestInstanceType now returns the capacity type of the cheapest
available offering for the selected zone, which correctly reflects the
unavailable cache state. This is the third of three bugs that prevented
preemptible-to-on-demand fallback from working.

Adds tests for both the fallback path and the happy path.
@GEverding changed the title from "fix: mark correct capacity type as unavailable on ICE error" to "fix: preemptible to on-demand fallback on ICE errors" on Mar 3, 2026