prometheus · mxinden · Mar 5, 2019 · Feb 21, 2019 · brian-brazil · Feb 21, 2019
diff --git a/doc/design/secure-cluster-traffic.md b/doc/design/secure-cluster-traffic.md
@@ -0,0 +1,117 @@
+# Secure Alertmanager cluster traffic
+
+Type: Design document
+
+Date: 2019-02-21
+
+Author: Max Inden <IndenML@gmail.com>
+
+
+## Status Quo
+
+Alertmanager supports [high
+availability](https://github.com/prometheus/alertmanager/blob/master/README.md#high-availability)
+by interconnecting multiple Alertmanager instances to form an Alertmanager
+cluster. Instances of a cluster communicate on top of a gossip protocol managed
+via HashiCorp's [_Memberlist_](https://github.com/hashicorp/memberlist) library.
+_Memberlist_ uses two channels to communicate: TCP for reliable and UDP for
+best-effort communication.
+
+Alertmanager instances use the gossip layer to:
+
+- Keep track of membership
+- Replicate silence creation, update and deletion
+- Replicate the notification log
+
+As of today, communication between Alertmanager instances in a cluster is
+sent in clear text.
+
+
+## Goal
+
+Instances in a cluster should communicate with each other in a secure fashion.
+Alertmanager should guarantee confidentiality, integrity and client authenticity
+for each message touching the wire. While this would improve the security of
+single datacenter deployments, one could see this as a necessity for
+wide-area-network deployments.
+
+
+## Non-Goal
+
+Even though solutions might also be applicable to the API endpoints exposed by
+Alertmanager, it is not the goal of this design document to secure the API
+endpoints.
+
+
+## Proposed Solution - TLS Memberlist
+
+_Memberlist_ enables users to implement their own [transport
+layer](https://godoc.org/github.com/hashicorp/memberlist#Transport) without the
+need to fork the library itself. That transport layer needs to support
+reliable as well as best-effort communication. Instead of using TCP and UDP like
+the default transport layer of _Memberlist_, the proposal is to use only TCP
+for both reliable and best-effort communication. On top of that TCP
+layer, one can use mutual TLS to secure all communication. A proof-of-concept
+implementation can be found here:
+https://github.com/mxinden/memberlist-tls-transport.
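+
+As a rough sketch of what mutual TLS means for such a transport, the following
+Go snippet builds a `tls.Config` that both presents a certificate and requires
+one from the peer. It uses only the Go standard library. The helper name
+`newMutualTLSConfig` and its parameters are hypothetical and are not taken from
+the proof-of-concept repository.
+
+```go
+package main
+
+import (
+	"crypto/tls"
+	"crypto/x509"
+	"fmt"
+)
+
+// newMutualTLSConfig is a hypothetical helper, not part of Memberlist or
+// Alertmanager. It returns a config that presents cert to peers and only
+// accepts peers whose certificate is signed by a CA in clusterCA.
+func newMutualTLSConfig(cert tls.Certificate, clusterCA *x509.CertPool) *tls.Config {
+	return &tls.Config{
+		// Certificate this instance presents to its peers.
+		Certificates: []tls.Certificate{cert},
+		// CA pool used to verify the peer when this instance dials out.
+		RootCAs: clusterCA,
+		// CA pool used to verify peers that connect to this instance.
+		ClientCAs: clusterCA,
+		// Reject connecting peers that do not present a valid certificate.
+		ClientAuth: tls.RequireAndVerifyClientCert,
+		MinVersion: tls.VersionTLS12,
+	}
+}
+
+func main() {
+	// Empty certificate and pool, only to show the resulting settings.
+	cfg := newMutualTLSConfig(tls.Certificate{}, x509.NewCertPool())
+	fmt.Println(cfg.ClientAuth == tls.RequireAndVerifyClientCert)
+}
+```
+
+In a real deployment the certificate and CA pool would be loaded from files,
+and a config like this could be handed to `tls.Listen` and `tls.Dial` inside
+the custom transport.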
+
+The data gossiped between instances does not have a low-latency requirement that
+TCP could not fulfill. The same applies to Alertmanager's relatively low data
+throughput requirements.
+
+TCP connections could be kept alive beyond a single message to reduce latency as
+well as handshake overhead. While this is feasible in a 3-instance
+Alertmanager cluster, the discussed custom implementation would need to limit
+the number of open connections for clusters with many instances, since a fully
+connected cluster needs n*(n-1)/2 connections (3 for 3 instances, 45 for 10).
+
+As of today, Alertmanager already forces _Memberlist_ to use the reliable TCP
+channel instead of the best-effort UDP channel to gossip large notification logs
+and silences between instances. The reason is that those packets would otherwise
+exceed the [MTU](https://en.wikipedia.org/wiki/Maximum_transmission_unit) of
+most UDP setups. Splitting packets is not supported by _Memberlist_ and was not
+considered worth the effort to be implemented in Alertmanager either. For more
+info see this [GitHub
+issue](https://github.com/prometheus/alertmanager/issues/1412).
+
+With the last [Prometheus developer
+summit](https://docs.google.com/document/d/1-C5PycocOZEVIPrmM1hn8fBelShqtqiAmFptoG4yK70/edit)
+in mind, the Prometheus project's preferred security mechanism seems to be mutual
+TLS. Having Alertmanager use the same mechanism would ease deployment with the
+rest of the Prometheus stack.
+
+As a beneficial side effect, Alertmanager would only need a single open port
+(TCP traffic) instead of two open ports (TCP and UDP traffic) for cluster
+communication. This does not affect the API endpoint, which remains a separate
+TCP port.
+
+
+## Alternative Solutions
+
+### Symmetric Memberlist
+
+_Memberlist_ supports [symmetric key
+encryption](https://godoc.org/github.com/hashicorp/memberlist#Keyring) via
+AES-128, AES-192 or AES-256 ciphers. One can specify multiple keys for rolling
+updates. Securing the cluster traffic via symmetric encryption would just
+involve small configuration changes in the Alertmanager code base.
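+
+To illustrate how the key length selects the cipher, here is a minimal sketch
+using only the Go standard library. The helper `aesVariant` is hypothetical. In
+_Memberlist_ the key itself would be supplied through the `SecretKey` field of
+its configuration; how Alertmanager would expose that setting is left open here.
+
+```go
+package main
+
+import (
+	"crypto/aes"
+	"crypto/rand"
+	"fmt"
+)
+
+// aesVariant is a hypothetical helper: it reports which AES variant a shared
+// cluster key of the given length selects, or an error for an invalid length.
+func aesVariant(key []byte) (string, error) {
+	// aes.NewCipher only accepts keys of 16, 24 or 32 bytes.
+	if _, err := aes.NewCipher(key); err != nil {
+		return "", err
+	}
+	return fmt.Sprintf("AES-%d", len(key)*8), nil
+}
+
+func main() {
+	// Generate a random 32-byte key, as an operator might do once per cluster.
+	key := make([]byte, 32)
+	if _, err := rand.Read(key); err != nil {
+		panic(err)
+	}
+	variant, err := aesVariant(key)
+	if err != nil {
+		panic(err)
+	}
+	fmt.Println(variant)
+}
+```
+
+Every instance needs the same key, so distributing and rotating it becomes an
+operational concern that mutual TLS handles through certificates instead.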
+
+
+### Replace Memberlist
+
+Coordinating membership might not be required by the Alertmanager cluster
+component. Instead, this could be bound to static configuration or e.g. DNS
+service discovery. On the other hand, gossiping silences and notifications is
+ideally done in an eventually consistent gossip fashion, given that Alertmanager
+is supposed to scale beyond a 3-instance cluster and beyond local-area-network
+deployments. With these requirements in mind, replacing _Memberlist_ with an
+entirely self-built communication layer would be a major undertaking.
+
+
+### TLS Memberlist with DTLS
+
+Instead of redirecting all best-effort traffic via the reliable channel as
+proposed above, one could also secure the best-effort channel itself using UDP
+and [DTLS](https://en.wikipedia.org/wiki/Datagram_Transport_Layer_Security) in
+addition to securing the reliable traffic via TCP and TLS. DTLS is not supported
+by the Go standard library.