f5 vxlan integration with sdn#11181
Conversation
|
[test] |
1 similar comment
|
[test] |
f93a55e to
3ad39a5
Compare
|
FYI in the policy section of the code the use of the Legacy flag will only work in TMOS version 12.1 or newer. Ideally have a version check to only add the flag if TMOS >= 12.1 Python example of version check here: https://github.com/f5devcentral/f5-icontrol-codeshare-python/blob/master/kubernetes-example/k8s2f5.py#L328 |
|
The auth code looks to only support basicauth. This will limit authentication to local accounts and not any remote (radius/tacacs), etc... https://github.com/rajatchopra/origin/blob/3ad39a5c60f344d8fa6cb673f021effe9e139c4b/pkg/router/f5/f5.go#L413 Python example: https://github.com/F5Networks/f5-icontrol-rest-python/blob/5c43b0f6424a5f495c50fd2a318e172d5c3d7e37/icontrol/authtoken.py |
|
Doing a quick scan of the code I'm not seeing an explicit save of the running config after it is generated. F5Networks/f5-common-python#342 iControl REST sits in front of tmsh "When you use the Traffic Management Shell (tmsh) to configure the system, you must explicitly issue a save command to store the configuration data that you have generated. Otherwise, the newly-generated configuration data is not actually stored on the system. For more information about tmsh, see the Traffic Management Shell (tmsh) Reference Guide." |
|
You can save the running config by issuing a POST /mgmt/tm/sys/config with the following json payload. |
pkg/router/f5/f5.go
Outdated
| func checkIPAndGetMac(ipStr string) (string, error) { | ||
| ip := net.ParseIP(ipStr) | ||
| if ip == nil { | ||
| errStr := fmt.Sprintf("'%s' is not a valid IP address for a vtep", ipStr) |
There was a problem hiding this comment.
This implies that the vtep has other restrictions. Perhaps:
'vtep IP '%s' is not a valid IP address'
knobunc
left a comment
There was a problem hiding this comment.
The code structure LGTM... I have no idea if what it is doing with the F5 is right though :-)
Are there any tests we can add? Perhaps at least to exercise the vtep MAC generation function?
| return nil | ||
| } | ||
|
|
||
| func checkIPAndGetMac(ipStr string) (string, error) { |
There was a problem hiding this comment.
Should this have vtep in it? Or is the package scope enough to indicate that.
There was a problem hiding this comment.
Its just a util function. The package scope is enough I guess.
There was a problem hiding this comment.
Consider changing this function name to be more descriptive (e.g. convertIpToVtepMac).
There was a problem hiding this comment.
Also, consider adding a unit test.
52b068c to
0ac78a7
Compare
|
Repo problems re-[test] |
|
@smarterclayton PTAL. The tests are still being worked on, but can you please look over the code to see if it is good? |
marun
left a comment
There was a problem hiding this comment.
I don't see much in the way of test changes that would target the new functionality. Is that an oversight or is qa intended to validate this feature post-merge?
| The path to the F5 BIG\-IP SSH private key file | ||
|
|
||
| .PP | ||
| \fB\-\-f5\-setup\-osdn\-vxlan\fP=false |
| nodeEventQueue := oscache.NewEventQueue(cache.MetaNamespaceKeyFunc) | ||
| cache.NewReflector(&nodeLW{ | ||
| client: factory.NClient, | ||
| field: factory.Fields, |
There was a problem hiding this comment.
Does it makes sense to apply the same filtering to nodes as is being applied to routes?
There was a problem hiding this comment.
Dunno - would it work if you only have a subset of nodes added to the F5 config and the endpoints for a service are on nodes that are not part of the set?
There was a problem hiding this comment.
I see no harm in keeping the fields/filtering. Possible use case with FDB being populated for only a portion of the cluster (say a datacenter).
There was a problem hiding this comment.
I have no issue with supporting filtering on the nodes. I was asking whether it made sense for the same filtering parameters to be applied to both nodes and routes. The current approach of reusing the same fields and labels to filter both would preclude filtering one but not the other.
There was a problem hiding this comment.
It does not make sense to just apply field labeling to nodes - we should never apply the same field selector to two types.
There was a problem hiding this comment.
We should not be watching nodes unless the underlying router is using it. This loop should be optional and the factory must explicitly choose to use it.
| return lw.client.Endpoints(lw.namespace).Watch(opts) | ||
| } | ||
|
|
||
| // nodeLW is a list watcher for routes. |
pkg/router/f5/f5.go
Outdated
| UsePmtu: "enabled", | ||
| } | ||
| err = f5.post(url, tunnelPayload, nil) | ||
| if err != nil && err.(F5Error).httpStatusCode != 409 { |
There was a problem hiding this comment.
Consider defining 409 as a constant whose name is self-documenting (e.g. STATUS_ALREADY_EXISTS) and avoid having to comment as to why 409 is special.
pkg/router/f5/f5.go
Outdated
| } | ||
| // create the net self IP | ||
| err = f5.post(selfUrl, netSelfPayload, nil) | ||
| if err != nil && err.(F5Error).httpStatusCode != 409 { |
| return nil | ||
| } | ||
|
|
||
| func checkIPAndGetMac(ipStr string) (string, error) { |
There was a problem hiding this comment.
Consider changing this function name to be more descriptive (e.g. convertIpToVtepMac).
pkg/router/f5/plugin.go
Outdated
| glog.Warningf("Error in obtaining IP address of newly added node %s - %v", node.Name, err) | ||
| return nil | ||
| } | ||
| p.F5Client.AddVtep(ip) |
There was a problem hiding this comment.
Should this call be returning the error that is potentially returned by AddVtep()?
pkg/router/f5/plugin.go
Outdated
| glog.Warningf("Error in obtaining IP address of deleted node %s - %v", node.Name, err) | ||
| return nil | ||
| } | ||
| p.F5Client.RemoveVtep(ip) |
There was a problem hiding this comment.
Should this call be returning the error that is potentially returned by RemoveVtep?
| } | ||
| p.F5Client.RemoveVtep(ip) | ||
| case watch.Modified: | ||
| // ignore the modified event. Change in IP address of the node is not supported. |
There was a problem hiding this comment.
What's the implication of not supporting node address changes? Is it just going to be documented? Should there be logging to diagnose issues if/when address changes occur?
There was a problem hiding this comment.
I had the same question re: changes and asked Rajat on this. He said on the openshift-sdn side we don't support IP address changes for a node.
| return nil | ||
| } | ||
|
|
||
| func checkIPAndGetMac(ipStr string) (string, error) { |
There was a problem hiding this comment.
Also, consider adding a unit test.
pkg/cmd/infra/router/f5.go
Outdated
| flag.StringVar(&o.PartitionPath, "f5-partition-path", util.Env("ROUTER_EXTERNAL_HOST_PARTITION_PATH", f5plugin.F5DefaultPartitionPath), "The F5 BIG-IP partition path to use") | ||
| flag.StringVar(&o.InternalAddress, "f5-internal-address", util.Env("ROUTER_EXTERNAL_HOST_INTERNAL_ADDRESS", ""), "The F5 BIG-IP internal interface's IP address") | ||
| flag.StringVar(&o.VxlanGateway, "f5-vxlan-gateway-cidr", util.Env("ROUTER_EXTERNAL_HOST_VXLAN_GW_CIDR", ""), "The F5 BIG-IP gateway-ip-address/cidr-mask for setting up the VxLAN") | ||
| flag.BoolVar(&o.SetupOSDNVxLAN, "f5-setup-osdn-vxlan", false, "Flag to enable VxLAN setup for integration openshif-sdn with the F5 BIG-IP") |
There was a problem hiding this comment.
so this needs to be set with a command line arg option? It maybe better to implicitly set it if f5-internal-address and f5-vxlan-gateway-cidr is set - basically one without the other(s) doesn't make sense does it?
pkg/cmd/infra/router/f5.go
Outdated
| return errors.New("F5 HTTP and HTTPS vservers cannot both be blank") | ||
| } | ||
|
|
||
| if o.SetupOSDNVxLAN && (len(o.VxlanGateway) == 0 || len(o.InternalAddress) == 0) { |
There was a problem hiding this comment.
See above comment re: o.SetupOSDNVxLAN.
pkg/router/controller/controller.go
Outdated
| } | ||
| go utilwait.Forever(c.HandleRoute, 0) | ||
| go utilwait.Forever(c.HandleEndpoints, 0) | ||
| c.HandleNode() |
There was a problem hiding this comment.
Is this needed? The one from the goroutine ahead should be called soon anyway.
| return &RouterControllerFactory{ | ||
| KClient: kc, | ||
| OSClient: oc, | ||
| NClient: kc, |
There was a problem hiding this comment.
Yeah, if its same as kc - we should cleanup to just use 1 ?Client. Maybe cleanup in follow on work. I know changing it would touch a lot more code in this PR.
There was a problem hiding this comment.
A comment/todo here or in the type definition would be useful.
| func (p *UniqueHost) HandleNode(eventType watch.EventType, node *kapi.Node) error { | ||
| return p.plugin.HandleNode(eventType, node) | ||
| } | ||
|
|
There was a problem hiding this comment.
FYI, depending on which PR merges first - #11201 - will need to add HandleNode to the host_admitter plugin.
| } | ||
| p.F5Client.RemoveVtep(ip) | ||
| case watch.Modified: | ||
| // ignore the modified event. Change in IP address of the node is not supported. |
There was a problem hiding this comment.
I had the same question re: changes and asked Rajat on this. He said on the openshift-sdn side we don't support IP address changes for a node.
| // other possible values are all-match and first-match. | ||
| Strategy string `json:"strategy"` | ||
|
|
||
| Legacy bool `json:"legacy"` |
There was a problem hiding this comment.
I was curious on this one - thanks for explaining it. So if we ever decide to lock down to a specific version of the F5 api, then we would have to remove this - maybe adding a comment here would be helpful. Thx
| Name string `json:"name"` | ||
|
|
||
| // Partition name for the policy | ||
| TmPartition string `json:"tmPartition"` |
There was a problem hiding this comment.
Felt a little weird that this one is called TmPartition but every other F5 api call payload uses Partition. Thanks for explaining that one.
There was a problem hiding this comment.
Update the godoc explaining what TmPartition means or be more explicit
pkg/router/f5/f5.go
Outdated
| LocalAddress: f5.internalAddress, | ||
| Mode: "bidirectional", | ||
| Mtu: "0", | ||
| Profile: path.Join(f5.partitionPath, "vxlan-ose"), |
| nodeEventQueue := oscache.NewEventQueue(cache.MetaNamespaceKeyFunc) | ||
| cache.NewReflector(&nodeLW{ | ||
| client: factory.NClient, | ||
| field: factory.Fields, |
There was a problem hiding this comment.
Dunno - would it work if you only have a subset of nodes added to the F5 config and the endpoints for a service are on nodes that are not part of the set?
|
LGTM |
|
networking job flaked with #11452 re-[test] |
pkg/router/controller/controller.go
Outdated
| } | ||
| go utilwait.Forever(c.HandleRoute, 0) | ||
| go utilwait.Forever(c.HandleEndpoints, 0) | ||
| go utilwait.Forever(c.HandleNode, 0) |
There was a problem hiding this comment.
Why is this always on if it's only used in F5?
| type RouterControllerFactory struct { | ||
| KClient kclient.EndpointsNamespacer | ||
| OSClient osclient.RoutesNamespacer | ||
| NClient kclient.NodesInterface |
There was a problem hiding this comment.
NamespaceClient, abbreviations were a special case but shouldn't be propogared.
| verbs: | ||
| - list | ||
| - watch | ||
| - apiGroups: |
There was a problem hiding this comment.
This is a pretty big security expansion - @deads2k
There was a problem hiding this comment.
@smarterclayton didn't you want this role bindable by project-admins? This will prevent that from being possible.
As for our system router, it doesn't worry overly much, but for per-project routers, this makes it impossible to create those.
There was a problem hiding this comment.
Its also required. If we expect the F5 to participate in the SDN it needs to know about the nodes its talking. We don't have OpenShift-sdn doing that. If this is a big problem it would mean splitting the f5 into its own role, I guess?
There was a problem hiding this comment.
Its also required. If we expect the F5 to participate in the SDN it needs to know about the nodes its talking. We don't have OpenShift-sdn doing that. If this is a big problem it would mean splitting the f5 into its own role, I guess?
Or adding it as the delta required.
|
Removing from merge queue until my comments are addressed. |
41d5983 to
1d695d3
Compare
|
@smarterclayton I have addressed all your feedback. Primarily two:
Marking it [merge]. Please feel free to revert the commit if anything is loose. TODO: |
|
[merge] you stoopid flake. Merge! |
|
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/10513/) (Image: devenv-rhel7_5234) |
|
I'm an idiot, it wasn't a flake! |
|
Evaluated for origin merge up to 96354f7 |
|
Evaluated for origin test up to 96354f7 |
|
back in at #5. wait and see! |
|
continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/10513/) (Base Commit: 35b539d) |
|
As David noted this causes a security vulnerability because it's escalating the privileges granted to system:router. I'm going to push a change that removes the node permission granted here and to fix the issue you can open another one (instead of reverting here). |
|
Opened a PR that removes the permission, we can iterate on the next solution at a lower pace. |
|
It's not a flake. The tests don't compile.
|
|
We could also create system:local-router or system:personal-router. On Mon, Oct 24, 2016 at 9:14 AM, David Eads notifications@github.com
|
|
You mean why I said |
|
Integration tests aren't compiling On Oct 23, 2016, at 4:45 PM, OpenShift Bot notifications@github.com wrote: continuous-integration/openshift-jenkins/merge FAILURE ( — |
|
The least disruptive thing is to create a new role with the additional So create system:router-.... (david i need a name suggestion) that contains On Mon, Oct 24, 2016 at 10:11 AM, Clayton Coleman ccoleman@redhat.com
|
|
GitHub is delivering emails I sent on Friday. Thanks github. |
|
|
This PR implements the VxLAN integration of F5 with a vxlan based overlay. Three pieces of work primarily: