Fix node discovery to ignore unknown DruidServices #12157
jihoonson merged 8 commits into apache:master
public class DiscoveryDruidNode
{
  private static final Logger LOG = new Logger(DiscoveryDruidNode.class);
  private static final TypeReference<Map<String, Object>> RAW_DRUID_SERVICE_TYPE =
I don't see any reference to this static variable. Am I missing something?
Good catch! It is a leftover that I forgot to delete after I cleaned up my PR. I will delete it.
  )
  {
    Map<String, DruidService> services = new HashMap<>();
    if (rawServices != null && !rawServices.isEmpty()) {
nit: can use org.apache.druid.utils.CollectionUtils.isNullOrEmpty to check the rawServices
Oh, that method cannot be used as Map is not a Collection.
Oh, right. Maybe we can add some overloaded methods to this util class in the future.
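As a rough illustration of the suggestion above, a Map overload could sit next to the Collection check. This is only a sketch: the class name NullOrEmpty is hypothetical and is not the actual org.apache.druid.utils.CollectionUtils, whose real signatures are not shown in this thread.

```java
import java.util.Collection;
import java.util.Map;

// Hypothetical helper for illustration only; not Druid's CollectionUtils.
public final class NullOrEmpty
{
  private NullOrEmpty()
  {
  }

  // Check for Collection types (List, Set, ...).
  public static boolean isNullOrEmpty(Collection<?> collection)
  {
    return collection == null || collection.isEmpty();
  }

  // Separate overload for Map, which does not implement Collection.
  public static boolean isNullOrEmpty(Map<?, ?> map)
  {
    return map == null || map.isEmpty();
  }
}
```

With two overloads, a bare null literal is ambiguous at the call site, so a caller would pass a typed variable or cast it, for example isNullOrEmpty((Map<String, Object>) null).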
asdf2014 left a comment
👍 Overall LGTM. It seems that after the previous PR was merged, the forbiddenapis plugin check passes now, but CI reports some other errors that don't seem to be related to the forbiddenapis plugin check, as follows:
- Max number of retries[240] exceeded for Task[waiting for SQL metadata refresh]
- Background operation retry gave up ConnectionLossException: KeeperErrorCode = ConnectionLoss
- AssertionError: lists don't have the same size expected [2] but found [0]
- RuntimeException: org.apache.druid.java.util.common.ISE: one or more queries failed
@clintropolis, @cryptoe, @FrankChen021, @asdf2014 thank you for the review. The change in this PR revealed a bug in
@jihoonson I see. The I don't have a better suggestion considering the backward compatibility. I'm wondering if there's any way that we can check or avoid such conflict.
Yeah, I think it makes sense to use a special name as the subtype key, such as "_type" or "@type", in the future. Unfortunately, we cannot easily modify the existing ones, since that would break rolling upgrades, for the same reason I could not just get rid of the "type" from DataNodeService.
@clintropolis thank you for the review. @cryptoe @FrankChen021 @asdf2014 do you have more comments?
@FrankChen021 thank you!
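The behavior this PR aims for, skipping a DruidService whose type the receiving node does not know instead of failing the whole node, can be sketched roughly as below. This is a simplified, standard-library-only illustration and not the actual Druid implementation: the Service interface, KnownService class, KNOWN_TYPES registry, and the "label" field are all hypothetical stand-ins, and only the "type" key and the null/empty check on rawServices come from the discussion and diff in this PR.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Simplified sketch; every name here is hypothetical, not Druid's real API.
public final class ServiceDeserializationSketch
{
  // Stand-in for DruidService.
  public interface Service
  {
    String getLabel();
  }

  // Stand-in for a service type that this node's version already knows.
  public static final class KnownService implements Service
  {
    private final String label;

    public KnownService(String label)
    {
      this.label = label;
    }

    @Override
    public String getLabel()
    {
      return label;
    }
  }

  // Registry of the subtypes known to this (older) node, keyed by "type".
  private static final Map<String, Function<Map<String, Object>, Service>> KNOWN_TYPES = new HashMap<>();

  static {
    KNOWN_TYPES.put("knownService", raw -> new KnownService(String.valueOf(raw.get("label"))));
  }

  // Converts raw JSON-like maps into services, ignoring unknown types.
  public static Map<String, Service> toServices(Map<String, Map<String, Object>> rawServices)
  {
    Map<String, Service> services = new HashMap<>();
    if (rawServices != null && !rawServices.isEmpty()) {
      for (Map.Entry<String, Map<String, Object>> entry : rawServices.entrySet()) {
        Map<String, Object> raw = entry.getValue();
        Object type = raw == null ? null : raw.get("type");
        Function<Map<String, Object>, Service> factory =
            type instanceof String ? KNOWN_TYPES.get(type) : null;
        if (factory == null) {
          // Unknown service type (e.g. announced by a newer node): skip it
          // rather than failing deserialization of the whole node.
          continue;
        }
        services.put(entry.getKey(), factory.apply(raw));
      }
    }
    return services;
  }
}
```

In this sketch, a v1 node that receives both a "knownService" entry and an unrecognized "myDruidService" entry would keep the former and drop the latter, so discovery of the announcing node can still proceed during a rolling upgrade.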
Description
This PR fixes a bug where DiscoveryDruidNode cannot be created when deserializing it from JSON that contains an unknown DruidService. This bug can be seen when you do a rolling update. Suppose you have a custom MyDruidService that is added in a new version, v2. While you are upgrading your cluster from v1 to v2, the node that is upgraded first announces MyDruidService, and the other nodes that watch it will not understand what MyDruidService is until they are upgraded to v2. In this case, the bug can currently cause node discovery to fail. This PR fixes the bug by ignoring unknown DruidServices during deserialization of DiscoveryDruidNode.
Key changed/added classes in this PR
DiscoveryDruidNode
This PR has: