Catch unknown matcher types in gossipped silences #2479
beorn7 wants to merge 1 commit into release-0.21
I believe that gossiping between AM instances with different versions of the silence protobuf could create a situation where an unknown matcher type isn't recognized, thereby possibly inverting the behavior of the silence. This makes the matching more robust by erroring out when facing an unknown matcher type.

This has come to my attention during the recent work on silences with negative matching. To make cluster upgrades more robust, this fix is needed, and therefore I'm proposing it for a bugfix release of v0.21.

Signed-off-by: beorn7 <beorn@grafana.com>
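To illustrate the kind of check described above, here is a minimal, self-contained sketch. It is not the actual Alertmanager code: the `MatcherType`, `Matcher`, and `Matches` names are hypothetical stand-ins for whatever the silence protobuf and matching code really define. The point is only that the `default` branch returns an error instead of silently falling through to one of the known behaviors.

```go
package main

import (
	"fmt"
	"regexp"
)

// MatcherType stands in for an enum decoded from a gossiped protobuf message.
// Hypothetical: the real enum and its values may differ.
type MatcherType int32

const (
	MatchEqual  MatcherType = 0
	MatchRegexp MatcherType = 1
)

// Matcher is a hypothetical, simplified matcher for one label.
type Matcher struct {
	Type    MatcherType
	Name    string
	Pattern string
}

// Matches reports whether the label set satisfies the matcher.
// An unknown matcher type yields an error rather than a guess.
func (m Matcher) Matches(labels map[string]string) (bool, error) {
	value := labels[m.Name]
	switch m.Type {
	case MatchEqual:
		return value == m.Pattern, nil
	case MatchRegexp:
		// Anchor the pattern so it must match the whole label value.
		re, err := regexp.Compile("^(?:" + m.Pattern + ")$")
		if err != nil {
			return false, err
		}
		return re.MatchString(value), nil
	default:
		// A newer peer may gossip a type this version does not know.
		return false, fmt.Errorf("unknown matcher type %d", m.Type)
	}
}

func main() {
	labels := map[string]string{"job": "api"}

	known := Matcher{Type: MatchEqual, Name: "job", Pattern: "api"}
	fmt.Println(known.Matches(labels))

	// Type 2 is not defined above; imagine a "not equal" type from a newer version.
	unknown := Matcher{Type: MatcherType(2), Name: "job", Pattern: "api"}
	fmt.Println(unknown.Matches(labels))
}
```

With a check like this, a caller can refuse to apply a silence it does not fully understand instead of treating a negative matcher as a positive one.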
Thanks everyone for your review. I have played with this, the unmodified AM, and the new silences with negative matchers. There is good and bad news:

The good news is that even without this change, the AM isn't just inverting the meaning of a silence. A silence with a negative matcher will make it into the storage via gossiping, but it will not silence anything.

The bad news is that with or without this change, the UI just breaks because new silences with negative matching cannot be unmarshaled from protobuf, so the API returns a 500, which breaks the display of silences as a whole.

We could certainly tweak v0.21 to treat the new silences more gracefully, but I think the important part is that the new silences don't silently silence anything (no pun intended) that they are not supposed to silence. Even if we created a more graceful degradation, the solution would still be to update all instances in the cluster. So arguably, breaking less gracefully is even better because it is easier to notice.

Therefore, I'll close this and won't cut a v0.21.1. (Conveniently, the protobuf CVE-2021-3121 has just turned out to not affect Alertmanager.)

If you disagree with my verdict, please follow up.
This has been discussed in #2479. Even if the conclusion there was that we don't need this in a bugfix release, it's still better to have this kind of robustness. So this introduces the same check into the main branch. Signed-off-by: beorn7 <beorn@grafana.com>