feat(pxe): default to client IP when DHCP option 43.70 is unset #1464

Open
chet wants to merge 1 commit into NVIDIA:main from chet:carbide-pxe-mac-param-fallback
Conversation

@chet
Contributor

@chet chet commented May 7, 2026

Description

We already use the client IP for whoami and cloud-init (user-data and meta-data) calls; this does the same for boot.

This introduces a new chain where, if 43.70 is NOT set, we just pass through a URL to carbide-pxe that excludes it. carbide-pxe then pulls out the client IP (prioritizing X-Forwarded-For if it exists) and calls back to find_by_ip to get the corresponding machine_interface_id (the same one that would have been populated by 43.70).

The flow is:

  • iPXE boots, runs embed.ipxe.
  • :carbide checks isset ${43.70}.
    • If yes, chain with ?uuid=${43.70} (existing path).
    • If no, chain without it.
  • carbide-pxe either accepts uuid as a query param or plucks the client's ip_address, then plumbs it through to carbide-api via PxeInstructionRequest.
  • carbide-api either:
    • Sees uuid and uses that as interface_id.
    • Sees ip_address and uses that to call find_by_ip to resolve the interface_id.
  • Identical flow from here on
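
The uuid-or-IP decision in the flow above could be sketched roughly as follows. This is only an illustration, not the code in this PR: `BootIdentity`, `boot_identity`, and `resolve_interface_id` are hypothetical names, the real `PxeInstructionRequest` fields are not shown here, and a plain `HashMap` stands in for the `find_by_ip` lookup.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// What carbide-pxe would forward to carbide-api.
/// Hypothetical shape; the real PxeInstructionRequest is not shown in this PR text.
#[derive(Debug, Clone, PartialEq)]
pub enum BootIdentity {
    /// `?uuid=${43.70}` was present on the chain URL.
    InterfaceId(String),
    /// No uuid was supplied, so fall back to the client IP.
    IpAddress(IpAddr),
}

/// Pick the identity to forward: a non-empty uuid wins, otherwise use the client IP.
pub fn boot_identity(uuid: Option<&str>, client_ip: IpAddr) -> BootIdentity {
    match uuid.map(str::trim).filter(|u| !u.is_empty()) {
        Some(u) => BootIdentity::InterfaceId(u.to_string()),
        None => BootIdentity::IpAddress(client_ip),
    }
}

/// Resolve the identity to an interface id.
/// `interfaces_by_ip` is an in-memory stand-in (assumption) for find_by_ip.
pub fn resolve_interface_id(
    identity: &BootIdentity,
    interfaces_by_ip: &HashMap<IpAddr, String>,
) -> Option<String> {
    match identity {
        BootIdentity::InterfaceId(id) => Some(id.clone()),
        BootIdentity::IpAddress(ip) => interfaces_by_ip.get(ip).cloned(),
    }
}
```

Either branch ends with the same interface id, which is why everything after resolution can stay identical.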

Tests added!

Signed-off-by: Chet Nichols III <chetn@nvidia.com>

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

@chet chet requested a review from a team as a code owner May 7, 2026 01:06
@chet chet changed the title from "feat(pxe): always provide &mac=${mac} + and use as baseline when 43.70 is unset" to "feat(pxe): always provide mac=${mac} + and use as baseline when 43.70 is unset" May 7, 2026
@Matthias247
Contributor

I thought we moved to using the source IP for identifying what to do a long time ago, and actually don't require these parameters anymore. Can you double check?

@chet
Contributor Author

chet commented May 7, 2026

> I thought we moved to using the source IP for identifying what to do a long time ago, and actually don't require these parameters anymore. Can you double check?

@Matthias247 Yeah so whoami and cloud-init (/user-data and /meta-data) are source IP, but not boot. That might further the case for just switching entirely to $mac, and then things are driven off client IP and client MAC?

Or we just introduce this as step 1, and maybe drop 43.70 separately as step 2? Or something entirely different?

@Matthias247
Contributor

> > I thought we moved to using the source IP for identifying what to do a long time ago, and actually don't require these parameters anymore. Can you double check?
>
> @Matthias247 Yeah so whoami and cloud-init (/user-data and /meta-data) are source IP, but not boot. That might further the case for just switching entirely to $mac, and then things are driven off client IP and client MAC?
>
> Or we just introduce this as step 1, and maybe drop 43.70 separately as step 2? Or something entirely different?

Why can't everything use source IP?

@chet
Contributor Author

chet commented May 7, 2026

> > > I thought we moved to using the source IP for identifying what to do a long time ago, and actually don't require these parameters anymore. Can you double check?
> >
> > @Matthias247 Yeah so whoami and cloud-init (/user-data and /meta-data) are source IP, but not boot. That might further the case for just switching entirely to $mac, and then things are driven off client IP and client MAC?
> > Or we just introduce this as step 1, and maybe drop 43.70 separately as step 2? Or something entirely different?
>
> Why can't everything use source IP?

@Matthias247 Sooo we could definitely use source IP, BUT, the problem (well not problem, but added complexity) comes in wrt IPv4/IPv6.

So ${ip} in iPXE is the IPv4 variable, and ${ip6} gives us the IPv6 variable. We could definitely do an isset with some branching to follow a v4 or v6 path (or just blindly pass ip=${ip}&ip6=${ip6}, and then let serde_urlencoded give us Some/None, and then we could look up an interface ID based on IP address).

But ${mac} lets us not have to worry about the IP it came in on, and just focus on the interface. The IP-based approach isn't hard, just extra steps.
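
For illustration, the "blindly pass `ip=${ip}&ip6=${ip6}` and get Some/None" idea might look something like the sketch below. It hand-parses the query string with the standard library only so that it stays self-contained; the comment above suggests `serde_urlencoded` would do this job in practice. `IpParams`, `parse_ip_params`, and `lookup_address` are hypothetical names, and preferring IPv4 over IPv6 is an arbitrary choice made for the example.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Parsed `ip` / `ip6` query parameters; either may be absent.
/// Hypothetical type for this sketch.
#[derive(Debug, Default, PartialEq)]
pub struct IpParams {
    pub ip: Option<Ipv4Addr>,
    pub ip6: Option<Ipv6Addr>,
}

/// Parse a raw query string such as `ip=10.0.0.5&ip6=`.
/// Empty or unparseable values become None. No percent-decoding is done here.
pub fn parse_ip_params(query: &str) -> IpParams {
    let mut out = IpParams::default();
    for pair in query.split('&') {
        // A key without '=' is treated as having an empty value.
        let (key, value) = pair.split_once('=').unwrap_or((pair, ""));
        match key {
            "ip" => out.ip = value.parse().ok(),
            "ip6" => out.ip6 = value.parse().ok(),
            _ => {} // ignore unrelated parameters
        }
    }
    out
}

/// Choose the address to look up an interface by.
/// Preferring IPv4 is an assumption of this sketch, not a project rule.
pub fn lookup_address(params: &IpParams) -> Option<IpAddr> {
    params.ip.map(IpAddr::V4).or(params.ip6.map(IpAddr::V6))
}
```

This is the "extra steps" part: two optional fields and a preference rule, versus a single `${mac}` value.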

@ajf
Collaborator

ajf commented May 7, 2026

For external / operator-owned DHCP infrastructure, their DHCP server probably isn't going to populate DHCP option 43.70 (the encapsulated machine_interface_id) the way carbide-dhcp does, which means our embed.ipxe script which has a templated chain ${next-server}:8080/api/v0/pxe/boot?uuid=${43.70} line just no-ops (and the machine sits there forever).

The external DHCP / provisioning infrastructure won't use our iPXE kernel either so this change probably doesn't make any difference to them.

@Matthias247
Contributor

@chet I don't understand how the iPXE variables are involved in that. I think the flow is:

  1. ManagedHost makes an HTTP request to the PXE server.
  2. We look up the source IP on the TCP connection that makes the request.
  3. Based on that, we can look up the machine interface and the associated machine that makes the request.
  4. We serve the request, just as we would if someone had given us the interface-id instead of doing 2) + 3). While serving, we might populate iPXE variables, but that follows the earlier steps.

@chet
Contributor Author

chet commented May 7, 2026

> @chet I don't understand how the iPXE variables are involved in that. I think the flow is:
>
>   1. ManagedHost makes an HTTP request to the PXE server.
>   2. We look up the source IP on the TCP connection that makes the request.
>   3. Based on that, we can look up the machine interface and the associated machine that makes the request.
>   4. We serve the request, just as we would if someone had given us the interface-id instead of doing 2) + 3). While serving, we might populate iPXE variables, but that follows the earlier steps.

@Matthias247 If there's something sitting in between the client -> PXE server (like a proxy), we'd get the client IP of the proxy and not the actual host.

@Matthias247
Contributor

@chet Ah yes. In that case the requirement for the proxy can be to include the original IP (e.g. via X-Forwarded-For) - I think we already support that in the respective path.

@chet
Contributor Author

chet commented May 7, 2026

> @chet Ah yes. In that case the requirement for the proxy can be to include the original IP (e.g. via X-Forwarded-For) - I think we already support that in the respective path.

@Matthias247 Yeah we definitely could -- it just seemed nice to use ${mac}, and then we don't even need to make a requirement for how proxies should operate. Granted most will probably set XFF anyway, and that's probably the key callout.

So then, options are:

  • ${mac} route: we do find_by_mac_address to find the interface and use it (but we need to set ${mac} in the chain URL, operator needs to set nothing).
  • Source IP: we do a find_by_ip to find the interface and use it (for source IP, if XFF is set, we use that IP, otherwise we use the client IP -- nothing needs to be added to the chain URL -- if an operator proxy isn't setting XFF, they just need to set it).

If you want it to be source IP across the board, I'll update this. Lemme know!

@Matthias247
Contributor

I prefer the client IP. I would have said it's less prone to spoofing, but with supporting x-forwarded-for that's actually not true either. But we could at least hide x-forwarded-for support behind a config flag.
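
A minimal sketch of gating X-Forwarded-For behind a flag, as suggested above, might look like this. The function name `effective_client_ip` and the `trust_forwarded_for` flag are hypothetical, and taking the left-most header entry as the original client is the usual convention rather than anything confirmed for carbide-pxe.

```rust
use std::net::IpAddr;

/// Pick the IP used to identify the requesting host.
///
/// `trust_forwarded_for` models the suggested config flag (hypothetical name).
/// When it is false, the header is ignored and the TCP peer address is used.
pub fn effective_client_ip(
    forwarded_for: Option<&str>,
    peer_ip: IpAddr,
    trust_forwarded_for: bool,
) -> IpAddr {
    if !trust_forwarded_for {
        return peer_ip;
    }
    forwarded_for
        // X-Forwarded-For is a comma-separated list; by convention the
        // left-most entry is the original client.
        .and_then(|header| header.split(',').next())
        .map(str::trim)
        // A missing or malformed entry falls back to the TCP peer address.
        .and_then(|first| first.parse::<IpAddr>().ok())
        .unwrap_or(peer_ip)
}
```

With the flag off, a spoofed header has no effect; with it on, the deployment is trusting whatever sits in front of the PXE server to set the header honestly.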

@chet chet force-pushed the carbide-pxe-mac-param-fallback branch from 963a95c to a728560 Compare May 7, 2026 23:24
@chet chet requested a review from Coco-Ben as a code owner May 7, 2026 23:24
@chet chet changed the title from "feat(pxe): always provide mac=${mac} + and use as baseline when 43.70 is unset" to "feat(pxe): default to client IP when DHCP option 43.70 is unset" May 7, 2026
Comment thread docs/manuals/metrics/core_metrics.md Outdated
Comment thread docs/manuals/metrics/core_metrics.md Outdated
@chet chet force-pushed the carbide-pxe-mac-param-fallback branch from a728560 to 14f12b6 Compare May 7, 2026 23:34
@chet
Contributor Author

chet commented May 7, 2026

> I prefer the client IP. I would have said it's less prone to spoofing, but with supporting x-forwarded-for that's actually not true either. But we could at least hide x-forwarded-for support behind a config flag.

Okay @Matthias247, updated to client IP. Honestly I could probably drop the 43.70 check entirely, but we could also do a phased approach and next remove it from carbide-dhcp, and then unwind it from here after that if we wanted. Lemme know.

@chet chet force-pushed the carbide-pxe-mac-param-fallback branch from 14f12b6 to ab4b6e5 Compare May 7, 2026 23:43
3 participants