Merged
117 changes: 117 additions & 0 deletions doc/design/secure-cluster-traffic.md
@@ -0,0 +1,117 @@
# Secure Alertmanager cluster traffic

Type: Design document

Date: 2019-02-21

Author: Max Inden <IndenML@gmail.com>


## Status Quo

Alertmanager supports [high
availability](https://github.com/prometheus/alertmanager/blob/master/README.md#high-availability)
by interconnecting multiple Alertmanager instances to form an Alertmanager
cluster. Instances of a cluster communicate on top of a gossip protocol managed
via HashiCorp's [_Memberlist_](https://github.com/hashicorp/memberlist) library.
_Memberlist_ uses two channels to communicate: TCP for reliable and UDP for
best-effort communication.

Alertmanager instances use the gossip layer to:

- Keep track of membership
- Replicate silence creation, update and deletion
- Replicate the notification log

As of today, communication between Alertmanager instances in a cluster is
sent in clear text.


## Goal

Instances in a cluster should communicate with each other in a secure fashion.
Alertmanager should guarantee confidentiality, integrity and client authenticity
for each message touching the wire. While this would improve the security of
single-datacenter deployments, one could see it as a necessity for
wide-area-network deployments.


## Non-Goal

Even though solutions might also be applicable to the API endpoints exposed by
Alertmanager, it is not the goal of this design document to secure the API
endpoints.


## Proposed Solution - TLS Memberlist

_Memberlist_ enables users to implement their own [transport
layer](https://godoc.org/github.com/hashicorp/memberlist#Transport) without
forking the library itself. That transport layer needs to support both reliable
and best-effort communication. Instead of using TCP and UDP like the default
transport layer of _Memberlist_, the suggestion is to use only TCP for both
kinds of communication. On top of that TCP layer, one can use mutual TLS to
secure all communication. A proof-of-concept implementation can be found here:
https://github.com/mxinden/memberlist-tls-transport.

The data gossiped between instances does not have a low-latency requirement that
TCP could not fulfill. The same applies to the relatively low data throughput
requirements of Alertmanager.

TCP connections could be kept alive beyond a single message to reduce latency as
well as handshake overhead. While this is feasible in a 3-instance Alertmanager
cluster, the discussed custom implementation would need to limit the number of
open connections for clusters with many instances (#connections = n*(n-1)/2).

Contributor

Is one per other AM really a problem?

Member Author

Memberlist wants to have one connection as a reliable connection all to itself. We therefore need at least two, one reliable TCP connection and one pseudo-best-effort connection, unless we want to go down the road of multiplexing a single TCP connection.

@brian-brazil what maximum cluster size would you expect in the future?

Contributor

In principle someone might run two per datacenter, and tens of datacenters isn't that unusual. Say 100?

Member Author

Alright. I will make sure to include that in the performance testing (in case we decide on this route).

Contributor

The "full sync" TCP request happens relatively infrequently, and send reliable is only used for especially large gossip messages (which is probably also relatively infrequent; it happens <<1% of the time at SC). Practically speaking, each instance would only maintain one connection to each of the other instances.
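
To get a feel for how the connection count grows with cluster size, here is a
small worked example of the n*(n-1)/2 formula. It assumes exactly one
connection per pair of instances; the function name `fullMeshConnections` is
made up for this sketch.

```go
package main

import "fmt"

// fullMeshConnections returns the number of connections in a cluster of n
// instances when every pair of instances shares exactly one connection.
func fullMeshConnections(n int) int {
	if n < 2 {
		return 0
	}
	return n * (n - 1) / 2
}

func main() {
	for _, n := range []int{3, 10, 100} {
		fmt.Printf("%d instances: %d connections cluster-wide, %d per instance\n",
			n, fullMeshConnections(n), n-1)
	}
}
```

A 3-instance cluster needs 3 connections in total, while a 100-instance
cluster needs 4950, with each instance holding 99 of them.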

As of today, Alertmanager already forces _Memberlist_ to use the reliable TCP
connection instead of the best-effort UDP connection to gossip large
notification logs and silences between instances. The reason is that those
packets would otherwise exceed the
[MTU](https://en.wikipedia.org/wiki/Maximum_transmission_unit) of most UDP
setups. Splitting packets is not supported by _Memberlist_ and was not
considered worth the effort to implement in Alertmanager either. For more
info see this [GitHub
issue](https://github.com/prometheus/alertmanager/issues/1412).
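
The size-based choice between the two channels can be sketched roughly as
follows. The 1400-byte threshold, the constant name and the function name are
all assumptions chosen for illustration; they are not values taken from
Alertmanager or _Memberlist_.

```go
package main

import "fmt"

// assumedMaxPacketSize is a hypothetical upper bound for a payload that still
// fits into a single UDP packet on a typical network. It is an assumption for
// this sketch, not a constant from Alertmanager or Memberlist.
const assumedMaxPacketSize = 1400

// useReliableChannel reports whether payload is too large for the best-effort
// channel and should therefore be sent over the reliable TCP channel.
func useReliableChannel(payload []byte) bool {
	return len(payload) > assumedMaxPacketSize
}

func main() {
	small := make([]byte, 200)     // e.g. a single silence update
	large := make([]byte, 64*1024) // e.g. a large notification log
	fmt.Println("small payload via TCP:", useReliableChannel(small))
	fmt.Println("large payload via TCP:", useReliableChannel(large))
}
```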

With the last [Prometheus developer
summit](https://docs.google.com/document/d/1-C5PycocOZEVIPrmM1hn8fBelShqtqiAmFptoG4yK70/edit)
in mind, the Prometheus project's preferred security mechanism seems to be mutual
TLS. Having Alertmanager use the same mechanism would ease deployment with the
rest of the Prometheus stack.

As a beneficial side effect, Alertmanager would only need a single open port
(TCP traffic) instead of two open ports (TCP and UDP traffic) for cluster
communication. This does not affect the API endpoint, which remains a separate
TCP port.


## Alternative Solutions

### Symmetric Memberlist

_Memberlist_ supports [symmetric key
encryption](https://godoc.org/github.com/hashicorp/memberlist#Keyring) via
AES-128, AES-192 or AES-256 ciphers. One can specify multiple keys for rolling
updates. Securing the cluster traffic via symmetric encryption would just
involve small configuration changes in the Alertmanager code base.
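
As a rough illustration of the key material such a setup would need, the sketch
below generates a random key of a length matching one of the three AES
variants (16, 24 or 32 bytes). It uses only the Go standard library; the
function name `generateKey` and the base64 output format are assumptions, not
an existing Alertmanager or amtool feature.

```go
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

// generateKey returns size cryptographically random bytes. Only 16, 24 and 32
// are accepted, matching AES-128, AES-192 and AES-256 respectively.
// Hypothetical helper for illustration only.
func generateKey(size int) ([]byte, error) {
	switch size {
	case 16, 24, 32:
	default:
		return nil, fmt.Errorf("invalid key size %d: want 16, 24 or 32 bytes", size)
	}
	key := make([]byte, size)
	if _, err := rand.Read(key); err != nil {
		return nil, err
	}
	return key, nil
}

func main() {
	key, err := generateKey(32)
	if err != nil {
		panic(err)
	}
	// Print the key in a form that is easy to copy into a configuration file.
	fmt.Println(base64.StdEncoding.EncodeToString(key))
}
```

The same key would have to be distributed to every instance of the cluster,
which is the main operational difference from the per-instance certificates of
the proposed solution.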
Contributor

If both methods require generating a key, what is the downside of this method vs. the proposed method?

Member

I think that this would be a valid approach -- but we would need to

  • add amtool genkey command
  • specify in the doc that we use that library and that this form of encryption could change in the future

Contributor

And it'd be a different way of doing auth than we're going to use elsewhere.

Member

Could we contribute our approach upstream?

Member Author

> If both methods require generating a key, what is the downside of this method vs. the proposed method?

@stuartnelson3 sorry for not covering that properly in the document:

  • asymmetric vs symmetric: TLS gives users more possible trust structures, e.g. different certificate hierarchies, enabling users to exclude a specific (bad) Alertmanager instance.
  • default vs one-off: Symmetric crypto is easier to set up in itself, but it is probably not the default security option for most users, hence a one-off solution. I would expect most operators to already have a public key infrastructure for TLS in place (please correct me if I am wrong).
  • replay attacks: Given that Memberlist's symmetric crypto operates on an unordered channel (UDP), I don't see how it can prevent replay attacks. TLS runs on top of TCP, which would discard out-of-order messages of a replay attack.
  • consistency with Prometheus: As @brian-brazil said, the suggested method would keep Alertmanager consistent with the rest of the stack.

What are your thoughts @stuartnelson3?

Member Author

> Could we contribute our approach upstream?

Which one?

  • Symmetric: It is already part of Memberlist's core.

  • TLSTransport: The TLSTransport logic implementing the Transport interface could be suggested as an alternative to Memberlist's NetTransport. As TLSTransport does not alter any Memberlist code, I would say this is not critical.

Does that answer the question @roidelapluie?

Contributor

My initial thought was wondering about the cost of developing and maintaining our own transport (and being consistent within the prometheus org) vs. using the keyring (and being inconsistent).

The points you list here seem like enough to warrant creating our own Transport.

Member Author

> replay attacks: Given that Memberlist's symmetric crypto operates on an unordered channel (UDP), I don't see how it can prevent replay attacks. TLS runs on top of TCP, which would discard out-of-order messages of a replay attack.

Re-reading the DTLS RFC, it does prevent replay attacks via an epoch and sequence number. I am sorry for the confusion.



### Replace Memberlist

Coordinating membership might not be required by the Alertmanager cluster
component. Instead this could be bound to static configuration or e.g. DNS
service discovery. On the other hand, gossiping silences and notifications is
ideally done in an eventually consistent gossip fashion, given that Alertmanager
is supposed to scale beyond a 3-instance cluster and beyond local-area-network
deployments. With these requirements in mind, replacing _Memberlist_ with an
entirely self-built communication layer would be a major undertaking.


### TLS Memberlist with DTLS

Instead of redirecting all best-effort traffic via the reliable channel as
proposed above, one could also secure the best-effort channel itself using UDP
and [DTLS](https://en.wikipedia.org/wiki/Datagram_Transport_Layer_Security) in
addition to securing the reliable traffic via TCP and TLS. DTLS is not supported
by the Go standard library.