ca: more informational error messages and debug logs #1912
…ot certificate so it's more obvious whether a swarm token is mistyped (for instance, copy-and-pasted wrong). Signed-off-by: cyli <ying.li@docker.com>
| if _, err := X509Cert.Verify(opts); err != nil { | ||
| fmtString := "Jan 02 15:04:05 2006 MST" | ||
| log.G(ctx).WithError(err).Debugf("invalid certificate issued by %s (not before %s, not after %s)", | ||
| X509Cert.Issuer.CommonName, X509Cert.NotBefore.Format(fmtString), X509Cert.NotAfter.Format(fmtString)) |
I think it would be great to compare NotBefore / NotAfter to the current wall clock time, and include the current time in the error message if it looks like a factor. It may not be obvious when a clock is out of sync.
Actually, this suggestion is a bit silly for the log message, since it will include a timestamp automatically. But I still like the idea of including this information in the user-facing error message.
| // Check to see if this certificate was signed by our CA, and isn't expired | ||
| if _, err := X509Cert.Verify(opts); err != nil { | ||
| fmtString := "Jan 02 15:04:05 2006 MST" | ||
| log.G(ctx).WithError(err).Debugf("invalid certificate issued by %s (not before %s, not after %s)", |
I'd think this should be an error or warning log rather than a debug log. This information should be made very visible.
Ideally, we should include the most useful information in the error that gets bubbled up to the user, so checking the logs is a last resort. Including NotBefore / NotAfter and the current system time in the user-facing error when time seems like a factor may be a good idea here.
Have removed the log message, since it gets logged elsewhere, and am just returning the timestamps in the error.
0030297 to 5ac3fc2
| now := time.Now().UTC() | ||
| if now.Before(cert.NotBefore) { | ||
| return errors.Wrap(err, fmt.Sprintf("certificate not valid before %s, and it is currently %s", | ||
| cert.NotBefore.UTC().Format(time.RFC1123), now.Format(time.RFC1123))) |
You can use errors.Wrapf to simplify this slightly.
Ah right, thanks, not sure why I forgot about that function. :)
| cert.NotBefore.UTC().Format(time.RFC1123), now.Format(time.RFC1123))) | ||
| } | ||
| return errors.Wrap(err, fmt.Sprintf("certificate expires at %s, and it is currently %s", | ||
| cert.NotAfter.UTC().Format(time.RFC1123), now.Format(time.RFC1123))) |
You can use errors.Wrapf to simplify this slightly.
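As a rough illustration of the simplification being suggested, here is a minimal sketch of wrapping an expiry error with the violated bound and the current time in a single formatting call. It uses only the standard library: `fmt.Errorf` with `%w` stands in for `errors.Wrapf` from `github.com/pkg/errors`, and the names `wrapExpiry`, `exampleCert`, and `messageAt` are hypothetical helpers invented for this example, not code from the PR.

```go
package main

import (
	"crypto/x509"
	"errors"
	"fmt"
	"time"
)

// wrapExpiry is a hypothetical helper. It annotates an expiry error with the
// validity bound that was violated and the time at which the check ran.
// fmt.Errorf with %w is a standard-library stand-in for errors.Wrapf; both
// yield a message of the form "<context>: <original error>".
func wrapExpiry(err error, cert *x509.Certificate, now time.Time) error {
	now = now.UTC()
	if now.Before(cert.NotBefore) {
		return fmt.Errorf("certificate not valid before %s, and it is currently %s: %w",
			cert.NotBefore.UTC().Format(time.RFC1123), now.Format(time.RFC1123), err)
	}
	return fmt.Errorf("certificate expires at %s, and it is currently %s: %w",
		cert.NotAfter.UTC().Format(time.RFC1123), now.Format(time.RFC1123), err)
}

// exampleCert returns a certificate value with only the validity window set
// (an arbitrary 24-hour window chosen for illustration).
func exampleCert() *x509.Certificate {
	start := time.Date(2017, time.January, 1, 0, 0, 0, 0, time.UTC)
	return &x509.Certificate{NotBefore: start, NotAfter: start.Add(24 * time.Hour)}
}

// messageAt builds the wrapped error message as seen by a clock that is
// hoursFromStart hours after the start of the validity window.
func messageAt(hoursFromStart int) string {
	cert := exampleCert()
	now := cert.NotBefore.Add(time.Duration(hoursFromStart) * time.Hour)
	base := errors.New("x509: certificate has expired or is not yet valid")
	return wrapExpiry(base, cert, now).Error()
}

func main() {
	fmt.Println(messageAt(-1)) // clock is behind the validity window
	fmt.Println(messageAt(48)) // clock is past the validity window
}
```

Passing `now` in as a parameter, rather than calling `time.Now()` inside the helper, is a choice made here only to keep the sketch deterministic.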
… with an expiry error. Signed-off-by: cyli <ying.li@docker.com>
5ac3fc2 to
aefc3e9
Compare
| // whether a certificate is not yet valid or expired, we also need to perform the expiry checks ourselves. | ||
| func verifyCertificate(cert *x509.Certificate, opts x509.VerifyOptions) error { | ||
| _, err := cert.Verify(opts) | ||
| if invalidErr, ok := err.(x509.CertificateInvalidError); ok && invalidErr.Reason == x509.Expired { |
Just want to confirm that x509.Expired is returned when NotBefore is violated as well as the NotAfter case.
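One way to check this is a small experiment against Go's `crypto/x509`: verify a certificate with a clock set before its `NotBefore` and again with a clock set after its `NotAfter`, and inspect the `Reason` on the returned `x509.CertificateInvalidError`. The sketch below does that with a throwaway self-signed certificate; `validFrom`, `selfSignedCert`, and `expiredAt` are hypothetical names for this example, and the 10-hour validity window is arbitrary.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// validFrom is an arbitrary fixed start for the example validity window.
var validFrom = time.Date(2017, time.January, 1, 0, 0, 0, 0, time.UTC)

// selfSignedCert creates a throwaway self-signed CA certificate that is
// valid for 10 hours starting at validFrom.
func selfSignedCert() (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "example-root"},
		NotBefore:             validFrom,
		NotAfter:              validFrom.Add(10 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// expiredAt reports whether Verify fails with Reason == x509.Expired when the
// verification clock is hoursFromStart hours after validFrom.
func expiredAt(hoursFromStart int) bool {
	cert, err := selfSignedCert()
	if err != nil {
		panic(err)
	}
	roots := x509.NewCertPool()
	roots.AddCert(cert)
	_, err = cert.Verify(x509.VerifyOptions{
		Roots:       roots,
		CurrentTime: validFrom.Add(time.Duration(hoursFromStart) * time.Hour),
		KeyUsages:   []x509.ExtKeyUsage{x509.ExtKeyUsageAny},
	})
	invalidErr, ok := err.(x509.CertificateInvalidError)
	return ok && invalidErr.Reason == x509.Expired
}

func main() {
	fmt.Println("before NotBefore:", expiredAt(-1))
	fmt.Println("inside window:   ", expiredAt(5))
	fmt.Println("after NotAfter:  ", expiredAt(20))
}
```

Because the same `Reason` covers both directions, the error alone cannot say whether a certificate is not yet valid or already expired, which is why the expiry bounds still need to be compared against the clock separately.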
LGTM
| split := strings.Split(token, "-") | ||
| if len(split) != 4 || split[0] != "SWMTKN" || split[1] != "1" { | ||
| if len(split) != 4 || split[0] != "SWMTKN" || split[1] != "1" || len(split[2]) != base36DigestLen || len(split[3]) != maxGeneratedSecretLength { | ||
| return "", errors.New("invalid join token") |
Perhaps not for this PR, but should we mention what it's expected to look like?
"token should start with SWMTKN, followed by a 36 character alphanumeric code"
I wonder whether that may be too much information to give in an error message, because the swarm token is SWMTKN-<version>-<other data>, and the rest of the token may differ depending on the version.
Although at this point, there is only 1 version, so maybe we can deal with messaging other versions later :)
Yes, I wasn't sure either; nor whether that should be done in swarmkit or "prettied" in docker
I think we should not expose the inner workings of the token.
This checks the format of the swarm token so that there would be a more obvious error if the token were mis-typed (for instance, copy-pasting the join token but leaving off the last character).
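To make the effect of the stricter check concrete, here is a minimal standalone sketch of the same idea: split the token on `-` and reject it unless every field has the expected value or length. The constants `digestLen` and `secretLen` are placeholders for `base36DigestLen` and `maxGeneratedSecretLength`, and their values here are assumed for illustration; `checkTokenFormat` and `exampleToken` are hypothetical names, not the functions in the PR.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const (
	digestLen = 50 // assumed placeholder for base36DigestLen
	secretLen = 25 // assumed placeholder for maxGeneratedSecretLength
)

// checkTokenFormat rejects tokens that do not have four dash-separated
// fields with the expected prefix, version, and field lengths.
func checkTokenFormat(token string) error {
	split := strings.Split(token, "-")
	if len(split) != 4 || split[0] != "SWMTKN" || split[1] != "1" ||
		len(split[2]) != digestLen || len(split[3]) != secretLen {
		return errors.New("invalid join token")
	}
	return nil
}

// exampleToken builds a syntactically well-formed token from filler
// characters; it is not a real join token.
func exampleToken() string {
	return "SWMTKN-1-" + strings.Repeat("a", digestLen) + "-" + strings.Repeat("b", secretLen)
}

func main() {
	full := exampleToken()
	fmt.Println(checkTokenFormat(full))
	// Simulate a copy-and-paste that dropped the last character.
	fmt.Println(checkTokenFormat(full[:len(full)-1]))
}
```

With only the prefix and version checked, the truncated token would pass parsing and fail later with a less obvious error; the length checks catch it up front.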
This also prints out more debugging information if a node certificate fails to validate (such as the issuer and the "not before" and "not after" dates), making certificate expiry errors easier to debug.
cc @thaJeztah