Skip to content

Support for omicron internet gateway model#588

Merged
rcgoodfellow merged 7 commits intomasterfrom
igw-src
Oct 6, 2024
Merged

Support for omicron internet gateway model#588
rcgoodfellow merged 7 commits intomasterfrom
igw-src

Conversation

@rcgoodfellow
Copy link
Contributor

The Omicron internet gateway model associates IP pools, and subsequently addresses from those IP pools, with internet gateways. When packets are directed at an internet gateway through a VPC route, they should take on a source address from the IP pool that is tied to the target internet gateway. This trivially happens when there is only one internet gateway and an instance only has an address from one IP pool. But when there are multiple IP pools in play, particularly with floating IPs from different pools, a decision needs to be made about which one to use. That question is answered by the IP-pool/internet-gateway association in combination with VPC routes that direct packets at a particular internet gateway.

This PR provides OPTE machinery for realizing Omicron internet gateway model.

Leaving in draft as I put together an e2e prototype.

@rcgoodfellow rcgoodfellow added this to the 11 milestone Aug 29, 2024
@twinfees twinfees added customer For any bug reports or feature requests tied to customer requests and removed customer For any bug reports or feature requests tied to customer requests labels Sep 19, 2024
@rcgoodfellow rcgoodfellow marked this pull request as ready for review September 30, 2024 05:06
Copy link
Collaborator

@FelixMcFelix FelixMcFelix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together Ry! I had read Internet Gateways as though they were a system-router-only concept for choosing SNAT, but being able to use them within custom routers to select source IPs is powerful.

However, I think that trying to use the router table to perform source IP selection is fraught for the reasons outlined in one of the inline comments. I think we might be able to preserve the priority/ECMP semantics we have today, without needing to throw sled-only routes in the mix:

  1. Each InternetGateway route target stores the gateway's UUID into the packet's ActionMeta. We have one route in router per InternetGateway.
  2. This PR relies on mapping each EIP to an Internet Gateway. We can instead add the parent gateway as a match predicate for each NAT/SNAT rule.
    • I suppose FIP sets would need to be split into multiple rules based on parent gateway.
  3. Add low-priority route mapping InternetGateway(None) -> Allow (VPC-internal traffic).
  4. Change default action to Deny.

But then we need to update those mappings at runtime. This would work well if external IP management was handled by an RPW, but it isn't yet. I don't know at first glance how to balance this -- you could carry the (external IP, parent gateway) mappings alongside the route updates? Let me know what you think.

router::add_entry(
&g1.port,
cidr.clone(),
cidr,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this particular line of code: can we have a new integration test which uses the expected table state for a port with another (non-default) internet gateway and asserts the packet is correctly transformed?

This commit reworks how Internet Gateways are used to narrow down
external IP selection. Instead of delegating this logic to the router
layer, internet gateway routes tag matching packets with the gateway's
ID.

This approach avoids several issues:
* Priority of selected IPs (floating>ephemeral>snat) does not require
  careful add/remove of router rule entries to enforce.
* Outbound flows will load-balance across available floating IPs within
  each gateway as required.
* We have freedom in future to bind, e.g., additional SNAT entries per
  connected gateway if needed.

Whenever external IPs are conigured, the control plane can now provide a
`External IP`->`IGW Uuid` map, allowing OPTE to determine which source IP
addresses can be chosen. I've set this for now such that an empty
mapping defaults to the exisitng behaviour (i.e., choose by [riority
from all external IPs).

I've added an extra integration test to demonstrate this behavour, which
should put us in a good spot to have Omicron update these mappings in
sled-agent. In the limit we want this to be performed by an EIP-specific
RPW, but I think the most expedient thing is to have the route RPW
insert this information since it's calculating those mappings today anyhow.
Some defensive coding against Things The Control Plane Might Want To Do.
Likely necessary given we're not constraining it s.t. an IP can only be
admitted by one IGW per VPC.
Copy link
Collaborator

@FelixMcFelix FelixMcFelix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for putting all of this together.

I think I'm happy with how this works now, and after driving it through Omicron I'm fairly confident this integrates well with the external IP policy we have today. Apologies again for the earlier fire-drill on that front!

@rcgoodfellow rcgoodfellow merged commit f3002b3 into master Oct 6, 2024
@rcgoodfellow rcgoodfellow deleted the igw-src branch October 6, 2024 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants