From 067a94efd3dca211cb42335736ea1c6f44e3713a Mon Sep 17 00:00:00 2001
From: Peng Liu
Date: Tue, 27 Aug 2019 15:14:39 +0800
Subject: [PATCH 1/3] [WIP]Add doc for the SR-IOV Network Operator

https://jira.coreos.com/browse/SDN-416
---
 modules/nw-multinetwork-sriov.adoc | 491 +++++++++++++++++++----------
 1 file changed, 316 insertions(+), 175 deletions(-)

diff --git a/modules/nw-multinetwork-sriov.adoc b/modules/nw-multinetwork-sriov.adoc
index fcbd566a6599..3a8cd7458204 100644
--- a/modules/nw-multinetwork-sriov.adoc
+++ b/modules/nw-multinetwork-sriov.adoc
@@ -29,6 +29,19 @@ scheduled to worker nodes that have sufficient resources available.
 * The SR-IOV CNI plug-in plumbs VF interfaces allocated from the SR-IOV device
 plug-in directly into a Pod.
 
+You can use the {product-title} console to install SR-IOV, by deploying,
+the SR-IOV Network Operator. The SR-IOV Network Operator
+creates and manages the components of the SR-IOV stack end-to-end. The operator
+provides following features:
+
+* Discover SR-IOV network device in cluster.
+* Initialize the supported SR-IOV NIC models on nodes.
+* Provision SR-IOV network device plugin on nodes.
+* Provision SR-IOV CNI plugin executable on nodes.
+* Manage configuration of SR-IOV device plugin.
+* Generate NetworkAttachmentDefinition custom resources for the SR-IOV CNI
+plugin.
+
 == Supported Devices
 
 The following Network Interface Card (NIC) models are supported in
@@ -40,243 +53,371 @@ and device ID 0x1015
 * Mellanox MT27800 Family [ConnectX-5] 100G card with vendor ID 0x15b3
 and device ID 0x1017
 
+== Install SR-IOV Network Operator
+
 [NOTE]
 ====
-For Mellanox cards, ensure that SR-IOV is enabled in the firmware before
-provisioning VFs on the host.
+User can choose to install the SR-IOV Network Operator by the CLI or
+by the web console. Only one instance of the SR-IOV Network Operator is allowed
+to be deployed in a cluster
+Following the directions below to install by the CLI.
 ====
 
-== Creating SR-IOV plug-ins and daemonsets
+.Procedure
 
+. Create Namespaces for the SR-IOV Network Operator
++
 [NOTE]
 ====
-The creation of SR-IOV VFs is not handled by the SR-IOV device plug-in and
-SR-IOV CNI.
-To provision SR-IOV VF on hosts, you must configure it manually.
+You can also create the Namespaces in the web console using the *Administration*
+ -> *Namespaces* page.
 ====
 
-To use the SR-IOV network device plug-in and SR-IOV CNI plug-in, run both
-plug-ins in daemon mode on each node in your cluster.
-
-. Create a YAML file for the `openshift-sriov` namespace with the following
-contents:
+.. Create a Namespace for the SR-IOV Network Operator (for example,
+`sriov-namespace.yaml`):
 +
-[source,yaml]
 ----
 apiVersion: v1
 kind: Namespace
 metadata:
-  name: openshift-sriov
+  name: sriov-network-operator
   labels:
-    name: openshift-sriov
-    openshift.io/run-level: "0"
-  annotations:
-    openshift.io/node-selector: ""
-    openshift.io/description: "Openshift SR-IOV network components"
+    openshift.io/run-level: "1"
 ----
 
-. Run the following command to create the `openshift-sriov` namespace:
+.. Run the following command to create the namespace:
++
+----
+$ oc create -f .yaml
+----
++
+For example:
 +
 ----
-$ oc create -f openshift-sriov.yaml
+$ oc create -f sriov-namespace.yaml
 ----
 
-. Create a YAML file for the `sriov-device-plugin` service account with the
-following contents:
+. Install the SR-IOV Network Operator in the namespace by creating the following
+objects:
+.. Create a OperatorGroup for the SR-IOV Network Operator (for example,
+`sriov-operatorgroup.yaml`)
 +
-[source,yaml]
 ----
-apiVersion: v1
-kind: ServiceAccount
+apiVersion: operators.coreos.com/v1
+kind: OperatorGroup
 metadata:
-  name: sriov-device-plugin
-  namespace: openshift-sriov
+  name: sriov-network-operators
+  namespace: sriov-network-operator
+spec:
+  targetNamespaces:
+  - sriov-network-operator
 ----
 
-. Run the following command to create the `sriov-device-plugin` service account:
+.. Run the following command to create the OperatorGroup:
 +
 ----
-$ oc create -f sriov-device-plugin.yaml
+$ oc create -f .yaml
+----
++
+For example:
++
+----
+$ oc create -f sriov-og.yaml
 ----
 
-. Create a YAML file for the `sriov-cni` service account with the following
-contents:
+.. Use the following command to get the `channel` value required for the next
+step.
 +
+----
+$ oc get packagemanifest sriov-network-operator -n openshift-marketplace -o jsonpath='{.status.channels[].name}'
+
+preview
+----
+
+.. Create a Subscription object YAML file (for example, `sriov-sub.yaml`) to
+subscribe a Namespace to an Operator.
++
+.Example Subscription
 [source,yaml]
 ----
-apiVersion: v1
-kind: ServiceAccount
+apiVersion: operators.coreos.com/v1alpha1
+kind: Subscription
 metadata:
-  name: sriov-cni
-  namespace: openshift-sriov
+  name: sriov-network-operator-subsription
+  namespace: sriov-network-operator
+spec:
+  channel: preview <2>
+  name: sriov-network-operator
+  source: redhat-operators <1>
+  sourceNamespace: openshift-marketplace
 ----
+<1> You must specify the `redhat-operators` for the `source`.
+<2> Specify the `.status.channels[].name` value from the previous step.
 
-. Run the following command to create the `sriov-cni` service account:
+.. Create the Subscription object:
 +
 ----
-$ oc create -f sriov-cni.yaml
+$ oc create -f .yaml
+----
++
+For example:
++
+----
+$ oc create -f sriov-sub.yaml
 ----
 
-. Create a YAML file for the `sriov-device-plugin` DaemonSet with the following
-contents:
+.. Change to the `sriov-network-operator` project:
 +
+----
+$ oc project sriov-network-operator
+
+Now using project "sriov-network-operator"
+----
+
 [NOTE]
 ====
-The SR-IOV network device plug-in daemon, when launched, will discover all the
-configured SR-IOV VFs (of supported NIC models) on each node and advertise
-discovered resources. The number of available SR-IOV VF resources that are
-capable of being allocated can be reviewed by describing a node with the
-[command]`oc describe node ` command. The resource name for the
-SR-IOV VF resources is `openshift.io/sriov`. When no SR-IOV VFs are available on
-the node, a value of zero is displayed.
+The SR-IOV Network Operator can also be install through the web console
+following the directions below.
+
+Before that, you have to create the `Namespace` and `OperatorGroup` as mentioned
+above.
 ====
+
+.Procedure
+
+. Install the SR-IOV Network Operator using the {product-title} web console:
+
+.. In the {product-title} web console, click *Catalog* -> *OperatorHub*.
+
+.. Choose *SR-IOV Network Operator* from the list of available Operators, and
+click *Install*.
+
+.. On the *Create Operator Subscription* page, under *A specific namespace on
+the cluster* select *sriov-network-operator*. Then, click *Subscribe*.
+
+. Verify the operator installations:
+
+.. Switch to the *Catalog* → *Installed Operators* page.
+
+.. Ensure that *SR-IOV Network Operator* is listed in the
+*riov-network-operator* project with a *Status* of *InstallSucceeded*.
+
 +
-[source,yaml,subs="attributes"]
-----
-kind: DaemonSet
-apiVersion: apps/v1
-metadata:
-  name: sriov-device-plugin
-  namespace: openshift-sriov
-  annotations:
-    kubernetes.io/description: |
-      This daemon set launches the SR-IOV network device plugin on each node.
-spec:
-  selector:
-    matchLabels:
-      app: sriov-device-plugin
-  updateStrategy:
-    type: RollingUpdate
-  template:
-    metadata:
-      labels:
-        app: sriov-device-plugin
-        component: network
-        type: infra
-        openshift.io/component: network
-    spec:
-      hostNetwork: true
-      nodeSelector:
-        kubernetes.io/os: linux
-      tolerations:
-      - operator: Exists
-      serviceAccountName: sriov-device-plugin
-      containers:
-      - name: sriov-device-plugin
-        image: quay.io/openshift/{image-prefix}-sriov-network-device-plugin:v4.0.0
-        args:
-        - --log-level=10
-        securityContext:
-          privileged: true
-        volumeMounts:
-        - name: devicesock
-          mountPath: /var/lib/kubelet/
-          readOnly: false
-        - name: net
-          mountPath: /sys/class/net
-          readOnly: true
-      volumes:
-        - name: devicesock
-          hostPath:
-            path: /var/lib/kubelet/
-        - name: net
-          hostPath:
-            path: /sys/class/net
-----
-
-. Run the following command to create the `sriov-device-plugin` DaemonSet:
+[NOTE]
+====
+During installation an operator might display a *Failed* status. If the operator
+then installs with an *InstallSucceeded* message, you can safely ignore
+the *Failed* message.
+====
+
 +
-----
-oc create -f sriov-device-plugin.yaml
-----
+If the operator does not appear as installed, to troubleshoot further:
 
-. Create a YAML file for the `sriov-cni` DaemonSet with the following contents:
 +
-[source,yaml,subs="attributes"]
+* Switch to the *Catalog* → *Operator Management* page and inspect
+the *Operator Subscriptions* and *Install Plans* tabs for any failure or errors
+under *Status*.
+* Switch to the *Workloads* → *Pods* page and check the logs in any Pods in the
+`sriov-network-operator` projects that are reporting issues.
+
+== Discover SR-IOV network devices
+
+After the SR-IOV network Operator has been install successfully. The operator
+will try to discover all the SR-IOV capable network devices on worker nodes.
+User can find those information from the `SriovNetworkNodeState` Custom
+Resources, which are generated and updated by the operator automatically.
+
+One CR is created for each worker node, and shares the same name as the node. In
+the interface list, you can the information of the network devices.
+
+You should never have to modify this CRD, nor the CRs.
+
+The following is an example of a typical Custom Resource for
+`SriovNetworkNodeState`.
+
+[source,yaml]
 ----
-kind: DaemonSet
-apiVersion: apps/v1
+apiVersion: sriovnetwork.openshift.io/v1
+kind: SriovNetworkNodeState
 metadata:
-  name: sriov-cni
-  namespace: openshift-sriov
-  annotations:
-    kubernetes.io/description: |
-      This daemon set launches the SR-IOV CNI plugin on SR-IOV capable worker nodes.
+  creationTimestamp: "2019-08-27T06:01:36Z"
+  generation: 1
+  name: node-25
+  namespace: sriov-network-operator
+  ownerReferences:
+  - apiVersion: sriovnetwork.openshift.io/v1
+    blockOwnerDeletion: true
+    controller: true
+    kind: SriovNetworkNodePolicy
+    name: default
+    uid: 20c3c0f9-c890-11e9-91f7-0028ccd628ee
+  resourceVersion: "2639363"
+  uid: 2103a9fe-c890-11e9-91f7-0028ccd628ee
 spec:
-  selector:
-    matchLabels:
-      app: sriov-cni
-  updateStrategy:
-    type: RollingUpdate
-  template:
-    metadata:
-      labels:
-        app: sriov-cni
-        component: network
-        type: infra
-        openshift.io/component: network
-    spec:
-      nodeSelector:
-        kubernetes.io/os: linux
-      tolerations:
-      - operator: Exists
-      serviceAccountName: sriov-cni
-      containers:
-      - name: sriov-cni
-        image: quay.io/openshift/{image-prefix}-sriov-cni:v4.0.0
-        securityContext:
-          privileged: true
-        volumeMounts:
-        - name: cnibin
-          mountPath: /host/opt/cni/bin
-      volumes:
-        - name: cnibin
-          hostPath:
-            path: /var/lib/cni/bin
-----
-
-. Run the following command to create the `sriov-cni` DaemonSet:
-+
-----
-$ oc create -f sriov-cni.yaml
+  dpConfigVersion: d41d8cd98f00b204e9800998ecf8427e
+status:
+  interfaces:
+  - deviceID: "1017"
+    driver: mlx5_core
+    mtu: 1500
+    name: ens785f0
+    pciAddress: "0000:18:00.0"
+    totalvfs: 8
+    vendor: 15b3
+  - deviceID: "1017"
+    driver: mlx5_core
+    mtu: 1500
+    name: ens785f1
+    pciAddress: "0000:18:00.1"
+    totalvfs: 8
+    vendor: 15b3
+  - deviceID: 158b
+    driver: i40e
+    mtu: 1500
+    name: ens817f0
+    pciAddress: 0000:81:00.0
+    totalvfs: 64
+    vendor: "8086"
+  - deviceID: 158b
+    driver: i40e
+    mtu: 1500
+    name: ens817f1
+    pciAddress: 0000:81:00.1
+    totalvfs: 64
+    vendor: "8086"
+  - deviceID: 158b
+    driver: i40e
+    mtu: 1500
+    name: ens803f0
+    pciAddress: 0000:86:00.0
+    totalvfs: 64
+    vendor: "8086"
+  syncStatus: Succeeded
 ----
 
-== Configuring additional interfaces using SR-IOV
+== Configuring SR-IOV network devices
+
+The SR-IOV Network Operator introduces `SriovNetworkNodePolicy` Custom Resource
+Definition (CRD) to define the SR-IOV network device and the configuration of
+SR-IOV device plugin
+
+You should never have to modify this CRD. To make changes to your deployment,
+create and modify a specific Custom Resource (CR). Instructions for creating or
+modifying a CR are provided in this documentation as appropriate.
+
+[NOTE]
+=====
+To make the configuration change take effect, creating or modifying the Custom
+Resource of `SriovNetworkNodePolicy` may trigger operator to drain the nodes,
+and in some cases reboot the nodes.
+
+The whole process may take several minutes. All configuration changes shall
+have been applied until all the pods in `sriov-network-operator` namespace are
+in `Running` status.
+=====
+
+The following is an example of a typical Custom Resource for
+`SriovNetworkNodePolicy`.
 
-. Create a YAML file for the Custom Resource (CR) with SR-IOV configuration. The
-`name` field in the following CR has the value `sriov-conf`.
-+
 [source,yaml]
 ----
-apiVersion: "k8s.cni.cncf.io/v1"
-kind: NetworkAttachmentDefinition
+apiVersion: sriovnetwork.openshift.io/v1
+kind: SriovNetworkNodePolicy
 metadata:
-  name: sriov-conf
-  annotations:
-    k8s.v1.cni.cncf.io/resourceName: openshift.io/sriov <1>
+  name: policy-example <1>
+  namespace: sriov-network-operator <2>
 spec:
-  config: '{
-    "type": "sriov", <2>
-    "name": "sriov-conf",
-    "ipam": {
-      "type": "host-local",
-      "subnet": "10.56.217.0/24",
-      "routes": [{
-        "dst": "0.0.0.0/0"
-      }],
-      "gateway": "10.56.217.1"
-    }
-  }'
+  resourceName: sriov <3>
+  nodeSelector:
+    feature.node.kubernetes.io/network-sriov.capable: "true" <4>
+  priority: 99 <5>
+  mtu: 9000 <6>
+  numVfs: 16 <7>
+  nicSelector:
+    vendor: "15b3" <8>
+    deviceID: "" <9>
+    pfName: ["eno3", "eno4"] <10>
+    rootDevices: ["0000:02:00.0", "0000:02:00.1"] <11>
+  deviceType: netdevice <12>
+  isRdma: false <13>
 ----
-+
-<1> `k8s.v1.cni.cncf.io/resourceName` annotation is set to `openshift.io/sriov`.
-<2> `type` is set to `sriov`.
+<1> The name of the CR.
+<2> The namespace of the CR, this must be in the same namespace of operator.
+<3> The resource name of SR-IOV device plugin. Prefix `openshift.io/` will be
+add when it's referred in pod annotation. It's allowed to create multiple CRs of `SriovNetworkNodePolicy` for one resource name.
+<4> The node selector to select which node to be configured. User can choose to
+label the nodes manually or with tools like Kubernetes Node Feature Discovery.
+Only SR-IOV network devices on selected nodes will be configured. And the SR-IOV
+CNI plugin and device plugin will be only deployed on selected nodes.
+<5> The priority of the policy, the larger number gets lower priority.
+Range from 0 to 99.
+<6> The MTU of the virtual functions. Range from 1 to 9000. Leave it blank if
+you don't need to change the MTU.
+<7> The number of the virtual functions for each SR-IOV network device.
+<8> The vendor hex code of SR-IoV device. Allowed value "8086", "15b3".
+<9> The device hex code of SR-IoV device. Allowed value "158b", "1015", "1017".
+<10> The names of SR-IoV physical function.
+<11> The PCI addresses of SR-IoV physical function.
+<12> The driver type of the virtual functions. Allowed value "netdevice",
+"vfio-pci". Defaults to "netdevice".
+<13> The RDMA mode. Defaults to false.
 
-. Run the following command to create the `sriov-conf` CR:
-+
+== Configuring SR-IOV networks
+
+The SR-IOV Network Operator also introduces a CRD `SriovNetwork` for creating
+the Custom Resource of `NetworkAttachmentDefinition` of SR-IOV CNI plugin. When
+you create a CR of `SriovNetwork`, the operator will created a CR of
+`NetworkAttachmentDefinition` according.
+
+You should never have to modify this CRD. Instructions for creating or
+modifying a CR are provided in this documentation as appropriate.
+
+[NOTE]
+=====
+You shall not modify or delete a Custom `SriovNetwork`, when it has been used by
+any running Pods.
+=====
+
+The following is an example of a typical Custom Resource for
+`SriovNetworkNodePolicy`.
+
+[source,yaml]
 ----
-$ oc create -f sriov-conf.yaml
+apiVersion: sriovnetwork.openshift.io/v1
+kind: SriovNetwork
+metadata:
+  name: sriov-conf <1>
+  namespace: sriov-network-operator <2>
+spec:
+  networkNamespace: default <3>
+  ipam: | <4>
+    {
+      "type": "host-local",
+      "subnet": "10.56.217.0/24",
+      "rangeStart": "10.56.217.171",
+      "rangeEnd": "10.56.217.181",
+      "routes": [{
+        "dst": "10.0.0.0/8"
+      }],
+      "gateway": "10.56.217.1"
+    }
+  vlan: 0 <5>
+  resourceName: sriov <6>
+  spoofChk: true
+  trust: false
 ----
+<1> The name of the CR. The generated `NetworkAttachmentDefinition` CR will use
+the same name.
+<2> The namespace of the CR, this must be in the same namespace of operator.
+<3> The namespace where the `NetworkAttachmentDefinition` CR will be created.
+<4> The IPAM configuration for SR-IOV CNI plugin.
+<5> The VLAN ID for SR-IOV CNI plugin. Range from 0 to 4095, default to 0.
+<6> The SRIOV Network device plugin endpoint resource name. It shall matches
+the `resourceName` defined in the `SriovNetworkNodePolicy` CR.
+<7> The Virtual Function spoof check. Boolean, default to false.
+<8> The Virtual Function trust mode. Boolean, default to false.
+
+== Configuring additional interfaces using SR-IOV
 
 . Create a YAML file for a Pod which references the name of the
 `NetworkAttachmentDefinition` and requests one `openshift.io/sriov` resource:

From 99ae42ffb51c89d82fe8c801aaa56b3565445aa5 Mon Sep 17 00:00:00 2001
From: Peng Liu
Date: Thu, 29 Aug 2019 11:31:07 +0800
Subject: [PATCH 2/3] Update based on the comments

---
 modules/nw-multinetwork-sriov.adoc | 100 +++++++++++++++--------------
 1 file changed, 53 insertions(+), 47 deletions(-)

diff --git a/modules/nw-multinetwork-sriov.adoc b/modules/nw-multinetwork-sriov.adoc
index 3a8cd7458204..14accab6fb99 100644
--- a/modules/nw-multinetwork-sriov.adoc
+++ b/modules/nw-multinetwork-sriov.adoc
@@ -34,14 +34,21 @@ the SR-IOV Network Operator. The SR-IOV Network Operator
 creates and manages the components of the SR-IOV stack end-to-end. The operator
 provides following features:
 
-* Discover SR-IOV network device in cluster.
+* Discover the SR-IOV network device in cluster.
 * Initialize the supported SR-IOV NIC models on nodes.
-* Provision SR-IOV network device plugin on nodes.
-* Provision SR-IOV CNI plugin executable on nodes.
-* Manage configuration of SR-IOV device plugin.
+* Provision the SR-IOV network device plugin on nodes.
+* Provision the SR-IOV CNI plugin executable on nodes.
+* Provision the SR-IOV admission controller in cluster.
+* Manage configuration of SR-IOV network device plugin.
 * Generate NetworkAttachmentDefinition custom resources for the SR-IOV CNI
 plugin.
 
+[NOTE]
+====
+The SR-IOV admission controller is enabled by default, and cannot be disabled by
+user.
+====
+
 == Supported Devices
 
 The following Network Interface Card (NIC) models are supported in
@@ -73,14 +80,14 @@ You can also create the Namespaces in the web console using the *Administration*
  -> *Namespaces* page.
 ====
 
-.. Create a Namespace for the SR-IOV Network Operator (for example,
-`sriov-namespace.yaml`):
+.. Create Namespace `openshift-sriov-network-operator` for the SR-IOV Network
+Operator (for example, `sriov-namespace.yaml`):
 +
 ----
 apiVersion: v1
 kind: Namespace
 metadata:
-  name: sriov-network-operator
+  name: openshift-sriov-network-operator
   labels:
     openshift.io/run-level: "1"
 ----
@@ -107,10 +114,10 @@ apiVersion: operators.coreos.com/v1
 kind: OperatorGroup
 metadata:
   name: sriov-network-operators
-  namespace: sriov-network-operator
+  namespace: openshift-sriov-network-operator
 spec:
   targetNamespaces:
-  - sriov-network-operator
+  - openshift-sriov-network-operator
 ----
 
 .. Run the following command to create the OperatorGroup:
@@ -144,7 +151,7 @@ apiVersion: operators.coreos.com/v1alpha1
 kind: Subscription
 metadata:
   name: sriov-network-operator-subsription
-  namespace: sriov-network-operator
+  namespace: openshift-sriov-network-operator
 spec:
   channel: preview <2>
   name: sriov-network-operator
@@ -180,7 +187,7 @@ The SR-IOV Network Operator can also be install through the web console
 following the directions below.
 
 Before that, you have to create the `Namespace` and `OperatorGroup` as mentioned
-above.
+in above section.
 ====
 
 .Procedure
@@ -228,7 +235,7 @@ User can find those information from the `SriovNetworkNodeState` Custom
 Resources, which are generated and updated by the operator automatically.
 
 One CR is created for each worker node, and shares the same name as the node. In
-the interface list, you can the information of the network devices.
+the interface list, you can find the information of the network devices.
 
 You should never have to modify this CRD, nor the CRs.
 
@@ -243,7 +250,7 @@ metadata:
   creationTimestamp: "2019-08-27T06:01:36Z"
   generation: 1
   name: node-25
-  namespace: sriov-network-operator
+  namespace: openshift-sriov-network-operator
   ownerReferences:
   - apiVersion: sriovnetwork.openshift.io/v1
     blockOwnerDeletion: true
@@ -312,8 +319,8 @@ Resource of `SriovNetworkNodePolicy` may trigger operator to drain the nodes,
 and in some cases reboot the nodes.
 
 The whole process may take several minutes. All configuration changes shall
-have been applied until all the pods in `sriov-network-operator` namespace are
-in `Running` status.
+have been applied until all the pods in `openshift-sriov-network-operator`
+namespace are in `Running` status.
 =====
 
 The following is an example of a typical Custom Resource for
@@ -325,7 +332,7 @@ apiVersion: sriovnetwork.openshift.io/v1
 kind: SriovNetworkNodePolicy
 metadata:
   name: policy-example <1>
-  namespace: sriov-network-operator <2>
+  namespace: openshift-sriov-network-operator <2>
 spec:
   resourceName: sriov <3>
   nodeSelector:
@@ -333,7 +340,7 @@ spec:
   priority: 99 <5>
   mtu: 9000 <6>
   numVfs: 16 <7>
-  nicSelector:
+  nicSelector: <8>
     vendor: "15b3" <8>
     deviceID: "" <9>
     pfName: ["eno3", "eno4"] <10>
@@ -344,7 +351,8 @@ spec:
 <1> The name of the CR.
 <2> The namespace of the CR, this must be in the same namespace of operator.
 <3> The resource name of SR-IOV device plugin. Prefix `openshift.io/` will be
-add when it's referred in pod annotation. It's allowed to create multiple CRs of `SriovNetworkNodePolicy` for one resource name.
+added when it's referred in Pod spec. It's allowed to create multiple CRs
+of `SriovNetworkNodePolicy` for one resource name.
 <4> The node selector to select which node to be configured. User can choose to
 label the nodes manually or with tools like Kubernetes Node Feature Discovery.
 Only SR-IOV network devices on selected nodes will be configured. And the SR-IOV
@@ -353,33 +361,40 @@ CNI plugin and device plugin will be only deployed on selected nodes.
 Range from 0 to 99.
 <6> The MTU of the virtual functions. Range from 1 to 9000. Leave it blank if
 you don't need to change the MTU.
-<7> The number of the virtual functions for each SR-IOV network device.
-<8> The vendor hex code of SR-IoV device. Allowed value "8086", "15b3".
-<9> The device hex code of SR-IoV device. Allowed value "158b", "1015", "1017".
-<10> The names of SR-IoV physical function.
-<11> The PCI addresses of SR-IoV physical function.
-<12> The driver type of the virtual functions. Allowed value "netdevice",
+<7> The number of the virtual functions is to be created for each SR-IOV
+physical network device.
+<8> The NIC selector selects the NIC to be configured. You don't have to specify
+all the fields. However it is recommended to select the NICs as specifically as
+possible, in order to minimize the possibility of overlapping cross policies.
+And if you specify both `pfName` and `rootDevices` at the same time, which is
+recommended, please make sure they point to the identical devices.
+<9> The vendor hex code of SR-IoV device. Allowed value "8086", "15b3".
+<10> The device hex code of SR-IoV device. Allowed value "158b", "1015", "1017".
+<11> The names of SR-IoV physical function.
+<12> The PCI addresses of SR-IoV physical function.
+<13> The driver type of the virtual functions. Allowed value "netdevice",
 "vfio-pci". Defaults to "netdevice".
-<13> The RDMA mode. Defaults to false.
+<14> The RDMA mode. Defaults to false. In this release, only RoCE mode are
+supported, and only for Mellanox NICs.
 
 == Configuring SR-IOV networks
 
 The SR-IOV Network Operator also introduces a CRD `SriovNetwork` for creating
 the Custom Resource of `NetworkAttachmentDefinition` of SR-IOV CNI plugin. When
 you create a CR of `SriovNetwork`, the operator will created a CR of
-`NetworkAttachmentDefinition` according.
+`NetworkAttachmentDefinition` accordingly.
 
 You should never have to modify this CRD. Instructions for creating or
 modifying a CR are provided in this documentation as appropriate.
 
 [NOTE]
 =====
-You shall not modify or delete a Custom `SriovNetwork`, when it has been used by
-any running Pods.
+You shall not modify or delete a Custom Resource of `SriovNetwork`, when it has
+been used by any running Pods.
 =====
 
 The following is an example of a typical Custom Resource for
-`SriovNetworkNodePolicy`.
+`SriovNetwork`.
 
 [source,yaml]
 ----
@@ -387,24 +402,15 @@ apiVersion: sriovnetwork.openshift.io/v1
 kind: SriovNetwork
 metadata:
   name: sriov-conf <1>
-  namespace: sriov-network-operator <2>
+  namespace: openshift-sriov-network-operator <2>
 spec:
   networkNamespace: default <3>
   ipam: | <4>
     {
-      "type": "host-local",
-      "subnet": "10.56.217.0/24",
-      "rangeStart": "10.56.217.171",
-      "rangeEnd": "10.56.217.181",
-      "routes": [{
-        "dst": "10.0.0.0/8"
-      }],
-      "gateway": "10.56.217.1"
+      "type": "dhcp"
     }
   vlan: 0 <5>
   resourceName: sriov <6>
-  spoofChk: true
-  trust: false
 ----
 <1> The name of the CR. The generated `NetworkAttachmentDefinition` CR will use
 the same name.
@@ -414,14 +420,19 @@ the same name.
 <5> The VLAN ID for SR-IOV CNI plugin. Range from 0 to 4095, default to 0.
 <6> The SRIOV Network device plugin endpoint resource name. It shall matches
 the `resourceName` defined in the `SriovNetworkNodePolicy` CR.
-<7> The Virtual Function spoof check. Boolean, default to false.
-<8> The Virtual Function trust mode. Boolean, default to false.
 
 == Configuring additional interfaces using SR-IOV
 
 . Create a YAML file for a Pod which references the name of the
 `NetworkAttachmentDefinition` and requests one `openshift.io/sriov` resource:
 +
+[NOTE]
+=====
+The SR-IoV admission controller will inject the `resource` field automatically
+if a NetworkAttachmentDefinition CR of SRIOV CNI plugin is referred in the Pod
+annotation.
+=====
++
 [source,yaml]
 ----
 apiVersion: v1
@@ -435,11 +446,6 @@ spec:
   - name: sriovsamplepod
     command: ["/bin/bash", "-c", "sleep 2000000000000"]
     image: centos/tools
-    resources:
-      requests:
-        openshift.io/sriov: '1'
-      limits:
-        openshift.io/sriov: '1'
 ----
 
 . Run the following command to create the `sriovsamplepod` Pod:

From f9f2340be69407311552d18e518b9cf284b6703b Mon Sep 17 00:00:00 2001
From: Peng Liu
Date: Mon, 9 Sep 2019 15:07:40 +0800
Subject: [PATCH 3/3] Add RDMA and DPDK example, change namespace to sriov-network-operator

---
 modules/nw-multinetwork-sriov.adoc | 125 ++++++++++++++++++++++-------
 1 file changed, 95 insertions(+), 30 deletions(-)

diff --git a/modules/nw-multinetwork-sriov.adoc b/modules/nw-multinetwork-sriov.adoc
index 14accab6fb99..459a973d9f84 100644
--- a/modules/nw-multinetwork-sriov.adoc
+++ b/modules/nw-multinetwork-sriov.adoc
@@ -16,19 +16,6 @@ endif::openshift-origin[]
 {product-title} nodes, which enables you to attach SR-IOV virtual function (VF)
 interfaces to Pods in addition to other network interfaces.
 
-Two components are required to provide this capability: the SR-IOV network
-device plug-in and the SR-IOV CNI plug-in.
-
-* The SR-IOV network device plug-in is a Kubernetes device plug-in for
-discovering, advertising, and allocating SR-IOV network virtual function (VF)
-resources. Device plug-ins are used in Kubernetes to enable the use of limited
-resources, typically in physical devices. Device plug-ins give the Kubernetes
-scheduler awareness of which resources are exhausted, allowing Pods to be
-scheduled to worker nodes that have sufficient resources available.
-
-* The SR-IOV CNI plug-in plumbs VF interfaces allocated from the SR-IOV device
-plug-in directly into a Pod.
-
 You can use the {product-title} console to install SR-IOV, by deploying,
 the SR-IOV Network Operator. The SR-IOV Network Operator
 creates and manages the components of the SR-IOV stack end-to-end. The operator
@@ -38,11 +25,27 @@ provides following features:
 * Initialize the supported SR-IOV NIC models on nodes.
 * Provision the SR-IOV network device plugin on nodes.
 * Provision the SR-IOV CNI plugin executable on nodes.
-* Provision the SR-IOV admission controller in cluster.
+* Provision the Network Resources Injector in cluster.
 * Manage configuration of SR-IOV network device plugin.
 * Generate NetworkAttachmentDefinition custom resources for the SR-IOV CNI
 plugin.
 
+Here's the function of each abovementioned SR-IOV components .
+
+* The SR-IOV network device plug-in is a Kubernetes device plug-in for
+discovering, advertising, and allocating SR-IOV network virtual function (VF)
+resources. Device plug-ins are used in Kubernetes to enable the use of limited
+resources, typically in physical devices. Device plug-ins give the Kubernetes
+scheduler awareness of which resources are exhausted, allowing pods to be
+scheduled to worker nodes that have sufficient resources available.
+
+* The SR-IOV CNI plug-in plumbs VF interfaces allocated from the SR-IOV device
+plug-in directly into a pod.
+
+* The Network Resources Injector is a Kubernetes Dynamic Admission Controller
+Webhook, which provides functionality of patching Kubernetes pod specifications
+with requests and limits of custom network resources such as SR-IOV VFs.
+
 [NOTE]
 ====
 The SR-IOV admission controller is enabled by default, and cannot be disabled by
@@ -80,14 +83,14 @@ You can also create the Namespaces in the web console using the *Administration*
  -> *Namespaces* page.
 ====
 
-.. Create Namespace `openshift-sriov-network-operator` for the SR-IOV Network
+.. Create Namespace `sriov-network-operator` for the SR-IOV Network
 Operator (for example, `sriov-namespace.yaml`):
 +
 ----
 apiVersion: v1
 kind: Namespace
 metadata:
-  name: openshift-sriov-network-operator
+  name: sriov-network-operator
   labels:
     openshift.io/run-level: "1"
 ----
@@ -114,10 +117,10 @@ apiVersion: operators.coreos.com/v1
 kind: OperatorGroup
 metadata:
   name: sriov-network-operators
-  namespace: openshift-sriov-network-operator
+  namespace: sriov-network-operator
 spec:
   targetNamespaces:
-  - openshift-sriov-network-operator
+  - sriov-network-operator
 ----
 
 .. Run the following command to create the OperatorGroup:
@@ -151,7 +154,7 @@ apiVersion: operators.coreos.com/v1alpha1
 kind: Subscription
 metadata:
   name: sriov-network-operator-subsription
-  namespace: openshift-sriov-network-operator
+  namespace: sriov-network-operator
 spec:
   channel: preview <2>
   name: sriov-network-operator
@@ -250,7 +253,7 @@ metadata:
   creationTimestamp: "2019-08-27T06:01:36Z"
   generation: 1
   name: node-25
-  namespace: openshift-sriov-network-operator
+  namespace: sriov-network-operator
   ownerReferences:
   - apiVersion: sriovnetwork.openshift.io/v1
     blockOwnerDeletion: true
@@ -319,8 +322,8 @@ Resource of `SriovNetworkNodePolicy` may trigger operator to drain the nodes,
 and in some cases reboot the nodes.
 
 The whole process may take several minutes. All configuration changes shall
-have been applied until all the pods in `openshift-sriov-network-operator`
-namespace are in `Running` status.
+have been applied until all the pods in `sriov-network-operator` namespace are
+in `Running` status.
 =====
 
 The following is an example of a typical Custom Resource for
@@ -332,7 +335,7 @@ apiVersion: sriovnetwork.openshift.io/v1
 kind: SriovNetworkNodePolicy
 metadata:
   name: policy-example <1>
-  namespace: openshift-sriov-network-operator <2>
+  namespace: sriov-network-operator <2>
 spec:
   resourceName: sriov <3>
   nodeSelector:
@@ -351,8 +354,8 @@ spec:
 <1> The name of the CR.
 <2> The namespace of the CR, this must be in the same namespace of operator.
 <3> The resource name of SR-IOV device plugin. Prefix `openshift.io/` will be
-added when it's referred in Pod spec. It's allowed to create multiple CRs
-of `SriovNetworkNodePolicy` for one resource name.
+added when it's referred in pod spec. It's allowed to create multiple CRs of
+`SriovNetworkNodePolicy` for one resource name.
 <4> The node selector to select which node to be configured. User can choose to
 label the nodes manually or with tools like Kubernetes Node Feature Discovery.
 Only SR-IOV network devices on selected nodes will be configured. And the SR-IOV
@@ -375,7 +378,14 @@ recommended, please make sure they point to the identical devices.
 <13> The driver type of the virtual functions. Allowed value "netdevice",
 "vfio-pci". Defaults to "netdevice".
 <14> The RDMA mode. Defaults to false. In this release, only RoCE mode are
-supported, and only for Mellanox NICs.
+supported, and only for Mellanox NICs.
+
++
+[NOTE]
+=====
+When `RDMA` flag is configured to true, it doesn't prevent user from using RDMA
+enabled VF as a normal network device. A device can be used in either mode.
+=====
 
 == Configuring SR-IOV networks
 
@@ -390,7 +400,7 @@ modifying a CR are provided in this documentation as appropriate.
 [NOTE]
 =====
 You shall not modify or delete a Custom Resource of `SriovNetwork`, when it has
-been used by any running Pods.
+been used by any running pods.
 =====
 
 The following is an example of a typical Custom Resource for
@@ -402,7 +412,7 @@ apiVersion: sriovnetwork.openshift.io/v1
 kind: SriovNetwork
 metadata:
   name: sriov-conf <1>
-  namespace: openshift-sriov-network-operator <2>
+  namespace: sriov-network-operator <2>
 spec:
   networkNamespace: default <3>
   ipam: | <4>
@@ -423,13 +433,13 @@ the `resourceName` defined in the `SriovNetworkNodePolicy` CR.
 
 == Configuring additional interfaces using SR-IOV
 
-. Create a YAML file for a Pod which references the name of the
+. Create a YAML file for a pod which references the name of the
 `NetworkAttachmentDefinition` and requests one `openshift.io/sriov` resource:
 +
 [NOTE]
 =====
 The SR-IoV admission controller will inject the `resource` field automatically
-if a NetworkAttachmentDefinition CR of SRIOV CNI plugin is referred in the Pod
+if a NetworkAttachmentDefinition CR of SRIOV CNI plugin is referred in the pod
 annotation.
 =====
 +
@@ -459,3 +469,58 @@ $ oc create -f sriovsamplepod.yaml
 ----
 $ oc exec sriovsamplepod -- ip a
 ----
+
+User can also run RDMA or DPDK application in a pod with SR-IOV VF attached. The
+following is an example of a pod with a VF in RDMA mode:
+
+[source,yaml]
+----
+apiVersion: v1
+kind: Pod
+metadata:
+  name: rdma-app
+  annotations:
+    k8s.v1.cni.cncf.io/networks: sriov-rdma-mlnx
+spec:
+  containers:
+  - name: testpmd
+    image:
+    imagePullPolicy: Never
+    securityContext:
+      capabilities:
+        add: ["IPC_LOCK"]
+    command: ["sleep", "infinity"]
+----
+
+This is an example of a pod with VF in DPDK mode:
+
+[source,yaml]
+----
+apiVersion: v1
+kind: Pod
+metadata:
+  name: dpdk-app
+  annotations:
+    k8s.v1.cni.cncf.io/networks: sriov-dpdk-net
+spec:
+  containers:
+  - name: testpmd
+    image:
+    securityContext:
+      capabilities:
+        add: ["IPC_LOCK"]
+    volumeMounts:
+    - mountPath: /dev/hugepages
+      name: hugepage
+    resources:
+      requests:
+        memory: 1Gi
+        hugepages-1Gi: 4Gi
+      limits:
+        hugepages-1Gi: 4Gi
+    command: ["sleep", "infinity"]
+  volumes:
+  - name: hugepage
+    emptyDir:
+      medium: HugePages
+----
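
Review note: the callouts for `SriovNetworkNodePolicy` in this series state several value constraints (priority from 0 to 99, MTU from 1 to 9000, the allowed vendor and device hex codes, the allowed `deviceType` values, and RDMA only on Mellanox NICs). A minimal sketch of a pre-flight check over those documented constraints might look like the following. The function name `check_policy_spec` and the error messages are hypothetical and are not part of the SR-IOV Network Operator; the sketch assumes the `spec` section has already been loaded into a plain Python dict, and it treats an empty or missing field as "not specified", as the `deviceID: ""` example does.

```python
# Hypothetical helper for illustration only; the operator performs its own
# validation and this sketch does not reproduce it.
ALLOWED_VENDORS = {"8086", "15b3"}
ALLOWED_DEVICE_IDS = {"158b", "1015", "1017"}
ALLOWED_DEVICE_TYPES = {"netdevice", "vfio-pci"}


def check_policy_spec(spec):
    """Return a list of problems found in a SriovNetworkNodePolicy spec dict."""
    problems = []

    # Callout: priority ranges from 0 to 99.
    priority = spec.get("priority")
    if priority is not None and not 0 <= priority <= 99:
        problems.append("priority must be between 0 and 99")

    # Callout: MTU ranges from 1 to 9000 and may be left blank.
    mtu = spec.get("mtu")
    if mtu is not None and not 1 <= mtu <= 9000:
        problems.append("mtu must be between 1 and 9000")

    nic = spec.get("nicSelector") or {}
    vendor = nic.get("vendor")
    if vendor and vendor not in ALLOWED_VENDORS:
        problems.append("vendor must be one of 8086, 15b3")

    device_id = nic.get("deviceID")
    if device_id and device_id not in ALLOWED_DEVICE_IDS:
        problems.append("deviceID must be one of 158b, 1015, 1017")

    # Callout: deviceType defaults to netdevice.
    device_type = spec.get("deviceType", "netdevice")
    if device_type not in ALLOWED_DEVICE_TYPES:
        problems.append("deviceType must be netdevice or vfio-pci")

    # Callout: RDMA is documented only for Mellanox NICs (vendor 15b3).
    if spec.get("isRdma", False) and vendor and vendor != "15b3":
        problems.append("isRdma is only documented for vendor 15b3")

    return problems


# The spec from the policy-example CR in this patch series.
example_spec = {
    "resourceName": "sriov",
    "priority": 99,
    "mtu": 9000,
    "numVfs": 16,
    "nicSelector": {"vendor": "15b3", "deviceID": "", "pfName": ["eno3", "eno4"]},
    "deviceType": "netdevice",
    "isRdma": False,
}
```

Calling `check_policy_spec(example_spec)` returns an empty list for the documented example. A check like this only mirrors the prose in the callouts; it does not replace whatever validation the operator applies to the CR.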