Support for omicron internet gateway model#588
Conversation
FelixMcFelix
left a comment
There was a problem hiding this comment.
Thanks for putting this together Ry! I had read Internet Gateways as though they were a system-router-only concept for choosing SNAT, but being able to use them within custom routers to select source IPs is powerful.
However, I think that trying to use the router table to perform source IP selection is fraught for the reasons outlined in one of the inline comments. I think we might be able to preserve the priority/ECMP semantics we have today, without needing to throw sled-only routes in the mix:
- Each InternetGateway route target stores the gateway's UUID into the packet's
ActionMeta. We have one route inrouterper InternetGateway. - This PR relies on mapping each EIP to an Internet Gateway. We can instead add the parent gateway as a match predicate for each NAT/SNAT rule.
- I suppose FIP sets would need to be split into multiple rules based on parent gateway.
- Add low-priority route mapping
InternetGateway(None)->Allow(VPC-internal traffic). - Change default action to
Deny.
But then we need to update those mappings at runtime. This would work well if external IP management was handled by an RPW, but it isn't yet. I don't know at first glance how to balance this -- you could carry the (external IP, parent gateway) mappings alongside the route updates? Let me know what you think.
| router::add_entry( | ||
| &g1.port, | ||
| cidr.clone(), | ||
| cidr, |
There was a problem hiding this comment.
Unrelated to this particular line of code: can we have a new integration test which uses the expected table state for a port with another (non-default) internet gateway and asserts the packet is correctly transformed?
This commit reworks how Internet Gateways are used to narrow down external IP selection. Instead of delegating this logic to the router layer, internet gateway routes tag matching packets with the gateway's ID. This approach avoids several issues: * Priority of selected IPs (floating>ephemeral>snat) does not require careful add/remove of router rule entries to enforce. * Outbound flows will load-balance across available floating IPs within each gateway as required. * We have freedom in future to bind, e.g., additional SNAT entries per connected gateway if needed. Whenever external IPs are conigured, the control plane can now provide a `External IP`->`IGW Uuid` map, allowing OPTE to determine which source IP addresses can be chosen. I've set this for now such that an empty mapping defaults to the exisitng behaviour (i.e., choose by [riority from all external IPs). I've added an extra integration test to demonstrate this behavour, which should put us in a good spot to have Omicron update these mappings in sled-agent. In the limit we want this to be performed by an EIP-specific RPW, but I think the most expedient thing is to have the route RPW insert this information since it's calculating those mappings today anyhow.
Some defensive coding against Things The Control Plane Might Want To Do. Likely necessary given we're not constraining it s.t. an IP can only be admitted by one IGW per VPC.
FelixMcFelix
left a comment
There was a problem hiding this comment.
Thanks again for putting all of this together.
I think I'm happy with how this works now, and after driving it through Omicron I'm fairly confident this integrates well with the external IP policy we have today. Apologies again for the earlier fire-drill on that front!
The Omicron internet gateway model associates IP pools, and subsequently addresses from those IP pools, with internet gateways. When packets are directed at an internet gateway through a VPC route, they should take on a source address from the IP pool that is tied to the target internet gateway. This trivially happens when there is only one internet gateway and an instance only has an address from one IP pool. But when there are multiple IP pools in play, particularly with floating IPs from different pools, a decision needs to be made about which one to use. That question is answered by the IP-pool/internet-gateway association in combination with VPC routes that direct packets at a particular internet gateway.
This PR provides OPTE machinery for realizing Omicron internet gateway model.
Leaving in draft as I put together an e2e prototype.