Vault-driven Consul TTL checks #1349
Conversation
Any way we can auto detect this rather than force all of our Consul users -- most of whom won't be on 0.6.4 -- to have to change their configuration files?
Also, I think we need to make registration opt-in. Aside from potentially causing problems for people upgrading, we don't want to cause issues for people who have already configured Consul health checks advertising Vault.
re: first point: there isn't any built-in way to do this, no. We could fingerprint the Consul servers and agent and change our behavior based on what they support, but that functionality doesn't exist out of the box anywhere at the moment.
re: point #2, the worst that would happen is a double registration.
Let's touch base re: this today to hash out next steps on these points (I wrestled with them before hitting enter).
Re: #1, revised the API call because we're never passing more than 1K of output. There is no longer a dependency on Consul 0.6.4.
Hook asynchronous notifications into Core to update Vault's Consul status based on its active/standby and sealed/unsealed state.
Note: Godeps.json not updated
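The notification hookup can be pictured like this — a minimal sketch with made-up names (`stateChange`, `tagFor`, `checkStatusFor` are illustrative, not the PR's actual types): Core pushes state changes onto a channel, and a single consumer maps each one to a Consul service tag and a TTL-check status.

```go
package main

import "fmt"

// stateChange is a hypothetical notification emitted by Core.
type stateChange struct {
	active bool
	sealed bool
}

// tagFor maps the active/standby state to the Consul service tag.
func tagFor(s stateChange) string {
	if s.active {
		return "active"
	}
	return "standby"
}

// checkStatusFor maps the sealed state to the vault-sealed-check status.
func checkStatusFor(s stateChange) string {
	if s.sealed {
		return "critical"
	}
	return "passing"
}

func main() {
	updates := make(chan stateChange, 4)
	updates <- stateChange{active: true, sealed: false}
	updates <- stateChange{active: false, sealed: true}
	close(updates)

	// A single consumer keeps registration updates serialized,
	// so concurrent state changes cannot race each other.
	for s := range updates {
		fmt.Printf("tag=%s check=%s\n", tagFor(s), checkStatusFor(s))
	}
}
```

Funneling every change through one updater is what makes the Consul writes event driven without needing locks around the registration calls.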
Vault will now register itself with Consul. The active node can be found using `active.vault.service.consul`. All standby vaults are available via `standby.vault.service.consul`. All unsealed vaults are considered healthy and available via `vault.service.consul`. Change in status and registration is event driven and should happen at the speed of a write to Consul (~network RTT + ~1x fsync(2)).

Healthy/active:

```
curl -X GET 'http://127.0.0.1:8500/v1/health/service/vault?pretty' && echo;
[
  {
    "Node": {
      "Node": "vm1",
      "Address": "127.0.0.1",
      "TaggedAddresses": {
        "wan": "127.0.0.1"
      },
      "CreateIndex": 3,
      "ModifyIndex": 20
    },
    "Service": {
      "ID": "vault:127.0.0.1:8200",
      "Service": "vault",
      "Tags": [
        "active"
      ],
      "Address": "127.0.0.1",
      "Port": 8200,
      "EnableTagOverride": false,
      "CreateIndex": 17,
      "ModifyIndex": 20
    },
    "Checks": [
      {
        "Node": "vm1",
        "CheckID": "serfHealth",
        "Name": "Serf Health Status",
        "Status": "passing",
        "Notes": "",
        "Output": "Agent alive and reachable",
        "ServiceID": "",
        "ServiceName": "",
        "CreateIndex": 3,
        "ModifyIndex": 3
      },
      {
        "Node": "vm1",
        "CheckID": "vault-sealed-check",
        "Name": "Vault Sealed Status",
        "Status": "passing",
        "Notes": "Vault service is healthy when Vault is in an unsealed status and can become an active Vault server",
        "Output": "",
        "ServiceID": "vault:127.0.0.1:8200",
        "ServiceName": "vault",
        "CreateIndex": 19,
        "ModifyIndex": 19
      }
    ]
  }
]
```

Healthy/standby:

```
[snip]
    "Service": {
      "ID": "vault:127.0.0.2:8200",
      "Service": "vault",
      "Tags": [
        "standby"
      ],
      "Address": "127.0.0.2",
      "Port": 8200,
      "EnableTagOverride": false,
      "CreateIndex": 17,
      "ModifyIndex": 20
    },
    "Checks": [
      {
        "Node": "vm2",
        "CheckID": "serfHealth",
        "Name": "Serf Health Status",
        "Status": "passing",
        "Notes": "",
        "Output": "Agent alive and reachable",
        "ServiceID": "",
        "ServiceName": "",
        "CreateIndex": 3,
        "ModifyIndex": 3
      },
      {
        "Node": "vm2",
        "CheckID": "vault-sealed-check",
        "Name": "Vault Sealed Status",
        "Status": "passing",
        "Notes": "Vault service is healthy when Vault is in an unsealed status and can become an active Vault server",
        "Output": "",
        "ServiceID": "vault:127.0.0.2:8200",
        "ServiceName": "vault",
        "CreateIndex": 19,
        "ModifyIndex": 19
      }
    ]
  }
]
```

Sealed:

```
    "Checks": [
      {
        "Node": "vm2",
        "CheckID": "serfHealth",
        "Name": "Serf Health Status",
        "Status": "passing",
        "Notes": "",
        "Output": "Agent alive and reachable",
        "ServiceID": "",
        "ServiceName": "",
        "CreateIndex": 3,
        "ModifyIndex": 3
      },
      {
        "Node": "vm2",
        "CheckID": "vault-sealed-check",
        "Name": "Vault Sealed Status",
        "Status": "critical",
        "Notes": "Vault service is healthy when Vault is in an unsealed status and can become an active Vault server",
        "Output": "Vault Sealed",
        "ServiceID": "vault:127.0.0.2:8200",
        "ServiceName": "vault",
        "CreateIndex": 19,
        "ModifyIndex": 38
      }
    ]
```
Useful if the HOME envvar is not set because `vault` was launched in a clean environment (e.g. `env -i vault ...`).
Brought to you by: Dept of 2nd thoughts before pushing enter on `git push`
The rest of the tests here use spaces, not tabs
Hide all Consul checks behind the standard `CONSUL_HTTP_ADDR` environment variable instead of `CONSUL_ADDR`, which is non-standard.
Consul service registration for Vault requires Consul 0.6.4.
If the local Consul agent is not available while attempting to step down from active or up to active, retry once a second. Allow for concurrent changes to the state with a single registration updater. Fix standby initialization.
Consul is never going to pass in more than 1K of output. This mitigates the pre-0.6.4 concern.