K8s MM-less Hadoop ingestion fails on K8sDiscoveryModule #16717

@trompa

Affected Version

30.0.0

Description

Dev environment on k8s, MM-less cluster.
The Coordinator also runs as the Overlord.

indexer settings:

        druid.indexer.runner.capacity: {{ .Values.nodes.coordinators.runners  }}
        druid.indexer.runner.namespace: {{ .Release.Namespace }}
        druid.indexer.runner.type: k8s
        druid.indexer.task.encapsulatedTask: true
        druid.indexer.storage.type: metadata

S3 and Kafka ingestion work fine: they spin up a k8s job and ingest properly.

I am now testing Hadoop ingestion with EMR 7.0.0,
with Java 17 both on the pods (default Druid image) and on the EMR nodes.

After setting some jobProperties:

        "yarn.resourcemanager.address": "ip-X-X-X-X.eu-west-1.compute.internal:8032",
        "yarn.resourcemanager.hostname": "ip-X-X-X-X.eu-west-1.compute.internal",
        "yarn.nodemanager.hostname": "ip-X-X-X-X.eu-west-1.compute.internal",
        "yarn.nodemanager.amrmproxy.address": "ip-X-X-X-X.eu-west-1.compute.internal:8049",
        "fs.defaultFS": "hdfs://ip-X-X-X-X.eu-west-1.compute.internal:8020",
        "mapreduce.job.classloader":"true",
        "mapreduce.job.classloader.system.classes":"-javax.validation.,-javax.el.,java.,javax.,org.apache.commons.logging.,org.apache.log4j.,org.apache.hadoop.",
        "fs.s3.aws.credentials.provider": "com.amazonaws.auth.WebIdentityTokenCredentialsProvider",
        "fs.s3a.aws.credentials.provider": "com.amazonaws.auth.WebIdentityTokenCredentialsProvider",
        "mapreduce.framework.name": "yarn",
        "yarn.application.classpath": "$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*,/usr/lib/hadoop-lzo/lib/*,/usr/share/aws/emr/emrfs/conf,/usr/share/aws/emr/emrfs/lib/*,/usr/share/aws/emr/emrfs/auxlib/*,/usr/share/aws/emr/lib/*,/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar,/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar,/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar,/usr/share/aws/emr/cloudwatch-sink/lib/*,/usr/lib/hudi/cli/lib/*",
        "yarn.app.mapreduce.am.command-opts": "--add-opens=java.base/java.lang=ALL-UNNAMED",
        "hadoop.rpc.protection":"privacy"

With these settings I managed to:

  • scan the input data on S3
  • send the job to EMR
  • get past the javax.security.sasl.SaslException: DIGEST-MD5: No common protection layer between client and server error

But now I am stuck on:

2024-07-10T12:34:54,783 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Running job: job_1720614656962_0001
2024-07-10T12:35:03,871 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_1720614656962_0001 running in uber mode : false
2024-07-10T12:35:03,872 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 0% reduce 0%
2024-07-10T12:35:18,003 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1720614656962_0001_m_000007_0, Status : FAILED
Error: com.google.inject.CreationException: Unable to create injector, see the following errors:

1) Problem parsing object at prefix[druid.discovery.k8s]: Cannot construct instance of `org.apache.druid.k8s.discovery.K8sDiscoveryConfig`, problem: null/empty clusterIdentifier
 at [Source: UNKNOWN; line: -1, column: -1].
  at org.apache.druid.guice.JsonConfigProvider.bind(JsonConfigProvider.java:151) (via modules: com.google.inject.util.Modules$OverrideModule -> org.apache.druid.k8s.discovery.K8sDiscoveryModule)
  at org.apache.druid.guice.JsonConfigProvider.bind(JsonConfigProvider.java:151) (via modules: com.google.inject.util.Modules$OverrideModule -> org.apache.druid.k8s.discovery.K8sDiscoveryModule)
  while locating org.apache.druid.k8s.discovery.K8sDiscoveryConfig
    for the 1st parameter of org.apache.druid.k8s.discovery.PodInfo.<init>(PodInfo.java:35)
  at org.apache.druid.k8s.discovery.PodInfo.class(PodInfo.java:35)
  while locating org.apache.druid.k8s.discovery.PodInfo
    for field at org.apache.druid.k8s.discovery.K8sDiscoveryModule$DruidLeaderSelectorProvider.podInfo(K8sDiscoveryModule.java:103)
  at org.apache.druid.k8s.discovery.K8sDiscoveryModule.configure(K8sDiscoveryModule.java:89) (via modules: com.google.inject.util.Modules$OverrideModule -> org.apache.druid.k8s.discovery.K8sDiscoveryModule)
Caused by: java.lang.IllegalArgumentException: Cannot construct instance of `org.apache.druid.k8s.discovery.K8sDiscoveryConfig`, problem: null/empty clusterIdentifier
 at [Source: UNKNOWN; line: -1, column: -1]
	at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:4314)
	at com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:4245)
	at org.apache.druid.guice.JsonConfigurator.configurate(JsonConfigurator.java:131)
	at org.apache.druid.guice.JsonConfigProvider.get(JsonConfigProvider.java:241)
	at com.google.inject.internal.ProviderInternalFactory.provision(ProviderInternalFactory.java:81)
	at com.google.inject.internal.InternalFactoryToInitializableAdapter.provision(InternalFactoryToInitializableAdapter.java:53)
	at com.google.inject.internal.ProviderInternalFactory.circularGet(ProviderInternalFactory.java:61)
	at com.google.inject.internal.InternalFactoryToInitializableAdapter.get(InternalFactoryToInitializableAdapter.java:45)
	at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
	at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
	at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
	at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:194)
	at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)
	at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:38)
	at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:62)
	at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:110)
	at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:90)
	at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:268)
	at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
	at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
	at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
	at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:194)
	at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)
	at com.google.inject.internal.SingleFieldInjector.inject(SingleFieldInjector.java:54)
	at com.google.inject.internal.MembersInjectorImpl.injectMembers(MembersInjectorImpl.java:132)
	at com.google.inject.internal.MembersInjectorImpl$1.call(MembersInjectorImpl.java:93)
	at com.google.inject.internal.MembersInjectorImpl$1.call(MembersInjectorImpl.java:80)
	at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1085)
	at com.google.inject.internal.MembersInjectorImpl.injectAndNotify(MembersInjectorImpl.java:80)
	at com.google.inject.internal.Initializer$InjectableReference.get(Initializer.java:223)
	at com.google.inject.internal.Initializer.injectAll(Initializer.java:132)
	at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:174)
	at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:110)
	at com.google.inject.Guice.createInjector(Guice.java:99)
	at com.google.inject.Guice.createInjector(Guice.java:73)
	at com.google.inject.Guice.createInjector(Guice.java:62)
	at org.apache.druid.initialization.ExtensionInjectorBuilder.build(ExtensionInjectorBuilder.java:49)
	at org.apache.druid.initialization.ServerInjectorBuilder.build(ServerInjectorBuilder.java:118)
	at org.apache.druid.initialization.ServerInjectorBuilder.makeServerInjector(ServerInjectorBuilder.java:73)
	at org.apache.druid.initialization.Initialization.makeInjectorWithModules(Initialization.java:63)
	at org.apache.druid.indexer.HadoopDruidIndexerConfig.<clinit>(HadoopDruidIndexerConfig.java:109)
	at org.apache.druid.indexer.DetermineHashedPartitionsJob$DetermineHashedPartitionsPartitioner.setConf(DetermineHashedPartitionsJob.java:481)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:80)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:141)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:725)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
	at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
Caused by: com.fasterxml.jackson.databind.exc.ValueInstantiationException: Cannot construct instance of `org.apache.druid.k8s.discovery.K8sDiscoveryConfig`, problem: null/empty clusterIdentifier
 at [Source: UNKNOWN; line: -1, column: -1]
	at com.fasterxml.jackson.databind.exc.ValueInstantiationException.from(ValueInstantiationException.java:47)
	at com.fasterxml.jackson.databind.DeserializationContext.instantiationException(DeserializationContext.java:1907)
	at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapAsJsonMappingException(StdValueInstantiator.java:587)
	at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.rewrapCtorProblem(StdValueInstantiator.java:610)
	at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:293)
	at com.fasterxml.jackson.databind.deser.ValueInstantiator.createFromObjectWith(ValueInstantiator.java:288)
	at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:202)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:521)
	at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1405)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:363)
	at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:196)
	at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:4309)
	... 51 more
Caused by: java.lang.IllegalArgumentException: null/empty clusterIdentifier
	at com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
	at org.apache.druid.k8s.discovery.K8sDiscoveryConfig.<init>(K8sDiscoveryConfig.java:75)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
	at com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:124)
	at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:291)
	... 58 more

From what I understand, the job on YARN should not need this config, but the map task still tries to inject it.
I cannot find a way to inject fake values or to skip this module.

I know MM-less and K8s support are still experimental, so this might be something not yet really tested. Is there any way to fix this?
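
To make my reading of the trace concrete, here is a simplified model of the failure, not actual Druid code. The class and function names (`K8sDiscoveryConfigSketch`, `bind_config`, `try_bind`) are hypothetical; only the property prefix `druid.discovery.k8s`, the field name `clusterIdentifier`, and the error message are taken from the log above. The assumption is that the properties visible inside the YARN map task do not contain the cluster identifier, so the config object cannot be built when the injector is created.

```python
PREFIX = "druid.discovery.k8s"  # prefix reported in the CreationException


class K8sDiscoveryConfigSketch:
    """Hypothetical stand-in for the config class; models only the clusterIdentifier check."""

    def __init__(self, cluster_identifier):
        # Mirrors the precondition seen in the trace: a null or empty value is rejected.
        if not cluster_identifier:
            raise ValueError("null/empty clusterIdentifier")
        self.cluster_identifier = cluster_identifier


def bind_config(properties):
    """Build the config from a flat property map, as a config provider would."""
    return K8sDiscoveryConfigSketch(properties.get(PREFIX + ".clusterIdentifier"))


def try_bind(properties):
    """Return the identifier on success, or a short failure description."""
    try:
        return bind_config(properties).cluster_identifier
    except ValueError as err:
        return "failed: " + str(err)


# Properties as I assume the YARN map task sees them: no k8s discovery settings.
print(try_bind({}))
# Properties as they would look if the identifier were present.
print(try_bind({PREFIX + ".clusterIdentifier": "dev-cluster"}))
```

If this model is right, the task fails as soon as the module is loaded, whether or not discovery is ever used on the YARN side.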
