FIPS support design #2544

@cyli

#2535 starts to add fernet encryption to support a FIPS-compliant raft encryption algorithm. #2246 has already added the PKCS8 key encryption format, which is FIPS-compliant. Keys will be stored in the new format, but swarm stays backwards compatible with older versions unless FIPS mode is enabled via an environment variable, in which case it will enforce that all keys MUST use the new encryption format.

However, while working on this, I realized that a mixed cluster of FIPS and non-FIPS nodes may not make sense if what you want is FIPS compliance. It may therefore make sense to specify an entire cluster as requiring FIPS compliance, and this is the design document for how that will work. The assumption is that the docker engine can be run in FIPS mode or not, based on an env variable or some other toggle. We want to prevent someone from accidentally restarting a cluster node in non-FIPS mode if the cluster requires FIPS compliance.

The proposal is to:

  1. Add a boolean to the Cluster object, FIPS, that specifies whether the cluster is FIPS enabled or not. This is set when the cluster is first created and can't be changed afterward. Going from non-FIPS to FIPS definitely does not make sense, because some old raft data that is not FIPS compliant may still be lying around. For simplicity, I suggest we don't allow migration from FIPS to non-FIPS either, although loosening compliance requirements is easier than tightening them.

  2. Add a boolean to the NodeDescription object, FIPS, that specifies whether a given node is FIPS enabled or not. Agents will have to self-report their FIPSness, since we have no way of enforcing that they are FIPS compliant. But the dispatcher will terminate connections from an agent that is not FIPS compliant.

  3. When a manager starts up, as soon as it loads the cluster object, it checks whether the cluster requires FIPS and, if so, ensures that it is itself running in FIPS mode. If it is not, the manager will refuse to complete starting up.

  4. The swarm token version is bumped to indicate the FIPSness of the cluster. If a node is not running in FIPS mode and it is told to join a FIPS cluster, it will refuse to join. So the swarm token will now look like `SWMTKN-2-<0/1 FIPS>-<root digest>-<secret>`.

  5. Add a field to the TLS certificate (maybe an invalid DNS SAN? Maybe overload one of the subject fields?) to indicate FIPSness. When a node starts up, it matches this field against its configured FIPSness, and errors if there's a mismatch.

  6. Add an extra check in all TLS connections for FIPSness.
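
To make the token change concrete, here is a rough sketch of how a node could parse the proposed version-2 token and decide whether to join. The names `joinToken`, `parseJoinToken` and `checkJoin` are made up for illustration and are not swarmkit code. It also assumes the digest and secret never contain `-`, and that a FIPS node joining a non-FIPS cluster is allowed, which the proposal does not spell out.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// joinToken is a hypothetical parsed form of the proposed token:
// SWMTKN-2-<0/1 FIPS>-<root digest>-<secret>.
type joinToken struct {
	FIPS       bool
	RootDigest string
	Secret     string
}

var errFIPSRequired = errors.New("cluster requires FIPS, but this node is not running in FIPS mode")

// parseJoinToken splits a version-2 token into its fields.
// Assumption: neither the digest nor the secret contains a '-'.
func parseJoinToken(token string) (joinToken, error) {
	parts := strings.Split(token, "-")
	if len(parts) != 5 || parts[0] != "SWMTKN" || parts[1] != "2" {
		return joinToken{}, errors.New("not a version-2 join token")
	}
	var fips bool
	switch parts[2] {
	case "0":
		fips = false
	case "1":
		fips = true
	default:
		return joinToken{}, fmt.Errorf("invalid FIPS flag %q", parts[2])
	}
	return joinToken{FIPS: fips, RootDigest: parts[3], Secret: parts[4]}, nil
}

// checkJoin refuses the join when the cluster is FIPS and the node is not.
// A FIPS node joining a non-FIPS cluster is allowed here (an assumption).
func checkJoin(tok joinToken, nodeFIPS bool) error {
	if tok.FIPS && !nodeFIPS {
		return errFIPSRequired
	}
	return nil
}

func main() {
	// Placeholder digest and secret, not real token material.
	tok, err := parseJoinToken("SWMTKN-2-1-rootdigest-secret")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("non-FIPS node:", checkJoin(tok, false))
	fmt.Println("FIPS node:", checkJoin(tok, true))
}
```

The same comparison (cluster requires FIPS, local node is not in FIPS mode) is what the manager startup check in item 3 would perform against the loaded cluster object.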

The massive branch that does all of this is at https://github.com/cyli/swarmkit/tree/fernet-encryption-inprocess.

  1. I also plan on changing the existing FIPS code to not be based on an env var, but to take a config value that is propagated through the necessary components of node. This makes it easier to test mixed clusters. This is a large-ish change, though: [fips] Remove the GOFIPS env var check #2562
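
As a rough illustration of why a config value is easier to test than an env var: with a per-node setting, two nodes with different FIPS settings can live in the same process, which a process-wide environment variable can't express. `NodeConfig`, `Node`, `NewNode` and `RequiresPKCS8` below are hypothetical names, not the actual swarmkit types.

```go
package main

import "fmt"

// NodeConfig is a hypothetical per-node configuration. The FIPS field
// stands in for what was previously a process-wide env var lookup.
type NodeConfig struct {
	Name string
	FIPS bool
}

// Node is a hypothetical stand-in for a swarm node that receives its
// FIPS setting explicitly instead of reading the environment.
type Node struct {
	cfg NodeConfig
}

// NewNode builds a node from its config; components created by the node
// would be handed cfg.FIPS the same way.
func NewNode(cfg NodeConfig) *Node {
	return &Node{cfg: cfg}
}

// RequiresPKCS8 reports whether this node must reject keys that are not
// in the new PKCS8 format.
func (n *Node) RequiresPKCS8() bool {
	return n.cfg.FIPS
}

func main() {
	// A mixed "cluster" inside a single test process.
	nodes := []*Node{
		NewNode(NodeConfig{Name: "manager-1", FIPS: true}),
		NewNode(NodeConfig{Name: "worker-1", FIPS: false}),
	}
	for _, n := range nodes {
		fmt.Printf("%s requires PKCS8: %v\n", n.cfg.Name, n.RequiresPKCS8())
	}
}
```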

Question: FIPS seems to be one possible axis of compliance. Are we going to want to enforce compliance along some other axis, and prevent nodes from being able to join the cluster? If so, would it make sense to have a compliance object, which includes FIPS, as opposed to just a bool?
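
For discussion, a compliance object could look roughly like the sketch below. This is only one possible shape: the `Compliance` type and its `Unmet` method are hypothetical, and FIPS is the only axis because it is the only one proposed so far.

```go
package main

import "fmt"

// Compliance is a hypothetical replacement for a bare FIPS bool.
// Each field is one axis of compliance; only FIPS exists today.
type Compliance struct {
	FIPS bool
	// Further axes would be added here as new fields.
}

// Unmet returns the names of the axes that the cluster (the receiver)
// requires but the node does not self-report. An empty result means the
// node may join.
func (required Compliance) Unmet(node Compliance) []string {
	var unmet []string
	if required.FIPS && !node.FIPS {
		unmet = append(unmet, "FIPS")
	}
	return unmet
}

func main() {
	cluster := Compliance{FIPS: true}
	node := Compliance{FIPS: false}
	if unmet := cluster.Unmet(node); len(unmet) > 0 {
		fmt.Println("refusing to join, unmet requirements:", unmet)
		return
	}
	fmt.Println("node satisfies cluster requirements")
}
```

The trade-off is that a struct keeps the join and startup checks in one place as axes are added, at the cost of a slightly larger API surface than a single bool.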

Thoughts? @docker/core-swarmkit-maintainers @docker/security-team @stevvooe

Additional TODOs:

  • If a node key is not PKCS8 in FIPS mode, fail to start up - this should never have happened
  • If the user provides us with a root CA key for root rotation that is not PKCS8, convert it to PKCS8 in the control API
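
A minimal sketch of both TODOs, using only the Go standard library. The function names are hypothetical, the check looks only at the PEM label (`PRIVATE KEY` or `ENCRYPTED PRIVATE KEY`) rather than the DER contents, and the conversion assumes an unencrypted EC key in the legacy `EC PRIVATE KEY` format. The real code paths would also have to deal with encrypted keys.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
)

// checkPKCS8 returns nil if the first PEM block carries a PKCS8 label.
// It inspects only the label, not the DER contents.
func checkPKCS8(keyPEM []byte) error {
	block, _ := pem.Decode(keyPEM)
	if block == nil {
		return errors.New("no PEM block found")
	}
	switch block.Type {
	case "PRIVATE KEY", "ENCRYPTED PRIVATE KEY":
		return nil
	}
	return fmt.Errorf("key has PEM type %q, not PKCS8", block.Type)
}

// ecToPKCS8 re-encodes an unencrypted "EC PRIVATE KEY" block as PKCS8.
func ecToPKCS8(keyPEM []byte) ([]byte, error) {
	block, _ := pem.Decode(keyPEM)
	if block == nil {
		return nil, errors.New("no PEM block found")
	}
	if block.Type != "EC PRIVATE KEY" {
		return nil, fmt.Errorf("unsupported PEM type %q", block.Type)
	}
	key, err := x509.ParseECPrivateKey(block.Bytes)
	if err != nil {
		return nil, err
	}
	der, err := x509.MarshalPKCS8PrivateKey(key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "PRIVATE KEY", Bytes: der}), nil
}

// newLegacyECKey generates a throwaway P-256 key in the legacy PEM format,
// purely so the example has something to convert.
func newLegacyECKey() ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	der, err := x509.MarshalECPrivateKey(key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: der}), nil
}

func main() {
	legacy, err := newLegacyECKey()
	if err != nil {
		panic(err)
	}
	fmt.Println("legacy key check:", checkPKCS8(legacy))

	converted, err := ecToPKCS8(legacy)
	if err != nil {
		panic(err)
	}
	fmt.Println("converted key check:", checkPKCS8(converted))
}
```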
