Affected Version
30.0.0
Description
Dev environment on Kubernetes, MiddleManager-less (mm-less) cluster, with the Coordinator acting as Overlord.
Indexer settings:
druid.indexer.runner.capacity: {{ .Values.nodes.coordinators.runners }}
druid.indexer.runner.namespace: {{ .Release.Namespace }}
druid.indexer.runner.type: k8s
druid.indexer.task.encapsulatedTask: true
druid.indexer.storage.type: metadata
S3 and Kafka ingestion work fine: they spin up a Kubernetes Job and ingest properly.
I am now testing Hadoop ingestion against EMR 7.0.0, with Java 17 both on the pods (default Druid image) and on the EMR nodes.
After setting some jobProperties:
"yarn.resourcemanager.address": "ip-X-X-X-X.eu-west-1.compute.internal:8032",
"yarn.resourcemanager.hostname": "ip-X-X-X-X.eu-west-1.compute.internal",
"yarn.nodemanager.hostname": "ip-X-X-X-X.eu-west-1.compute.internal",
"yarn.nodemanager.amrmproxy.address": "ip-X-X-X-X.eu-west-1.compute.internal:8049",
"fs.defaultFS": "hdfs://ip-X-X-X-X.eu-west-1.compute.internal:8020",
"mapreduce.job.classloader":"true",
"mapreduce.job.classloader.system.classes":"-javax.validation.,-javax.el.,java.,javax.,org.apache.commons.logging.,org.apache.log4j.,org.apache.hadoop.",
"fs.s3.aws.credentials.provider": "com.amazonaws.auth.WebIdentityTokenCredentialsProvider",
"fs.s3a.aws.credentials.provider": "com.amazonaws.auth.WebIdentityTokenCredentialsProvider",
"mapreduce.framework.name": "yarn",
"yarn.application.classpath": "$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,$HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,$HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*,/usr/lib/hadoop-lzo/lib/*,/usr/share/aws/emr/emrfs/conf,/usr/share/aws/emr/emrfs/lib/*,/usr/share/aws/emr/emrfs/auxlib/*,/usr/share/aws/emr/lib/*,/usr/share/aws/emr/ddb/lib/emr-ddb-hadoop.jar,/usr/share/aws/emr/goodies/lib/emr-hadoop-goodies.jar,/usr/share/aws/emr/kinesis/lib/emr-kinesis-hadoop.jar,/usr/share/aws/emr/cloudwatch-sink/lib/*,/usr/lib/hudi/cli/lib/*",
"yarn.app.mapreduce.am.command-opts": "--add-opens=java.base/java.lang=ALL-UNNAMED",
"hadoop.rpc.protection":"privacy"
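For context, these properties go under `tuningConfig.jobProperties` in the Hadoop ingestion task spec. A minimal sketch of where they sit (hostnames are anonymized placeholders, and only two of the properties above are shown):

```json
{
  "type": "index_hadoop",
  "spec": {
    "tuningConfig": {
      "type": "hadoop",
      "jobProperties": {
        "mapreduce.framework.name": "yarn",
        "fs.defaultFS": "hdfs://ip-X-X-X-X.eu-west-1.compute.internal:8020"
      }
    }
  }
}
```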
I managed to get it to:
- scan the input data on S3
- send the job to EMR
- fix the javax.security.sasl.SaslException: DIGEST-MD5: No common protection layer between client and server error

But now I am stuck on:
2024-07-10T12:34:54,783 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Running job: job_1720614656962_0001
2024-07-10T12:35:03,871 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_1720614656962_0001 running in uber mode : false
2024-07-10T12:35:03,872 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 0% reduce 0%
2024-07-10T12:35:18,003 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1720614656962_0001_m_000007_0, Status : FAILED
Error: com.google.inject.CreationException: Unable to create injector, see the following errors:
1) Problem parsing object at prefix[druid.discovery.k8s]: Cannot construct instance of `org.apache.druid.k8s.discovery.K8sDiscoveryConfig`, problem: null/empty clusterIdentifier
at [Source: UNKNOWN; line: -1, column: -1].
at org.apache.druid.guice.JsonConfigProvider.bind(JsonConfigProvider.java:151) (via modules: com.google.inject.util.Modules$OverrideModule -> org.apache.druid.k8s.discovery.K8sDiscoveryModule)
at org.apache.druid.guice.JsonConfigProvider.bind(JsonConfigProvider.java:151) (via modules: com.google.inject.util.Modules$OverrideModule -> org.apache.druid.k8s.discovery.K8sDiscoveryModule)
while locating org.apache.druid.k8s.discovery.K8sDiscoveryConfig
for the 1st parameter of org.apache.druid.k8s.discovery.PodInfo.<init>(PodInfo.java:35)
at org.apache.druid.k8s.discovery.PodInfo.class(PodInfo.java:35)
while locating org.apache.druid.k8s.discovery.PodInfo
for field at org.apache.druid.k8s.discovery.K8sDiscoveryModule$DruidLeaderSelectorProvider.podInfo(K8sDiscoveryModule.java:103)
at org.apache.druid.k8s.discovery.K8sDiscoveryModule.configure(K8sDiscoveryModule.java:89) (via modules: com.google.inject.util.Modules$OverrideModule -> org.apache.druid.k8s.discovery.K8sDiscoveryModule)
Caused by: java.lang.IllegalArgumentException: Cannot construct instance of `org.apache.druid.k8s.discovery.K8sDiscoveryConfig`, problem: null/empty clusterIdentifier
at [Source: UNKNOWN; line: -1, column: -1]
at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:4314)
at com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:4245)
at org.apache.druid.guice.JsonConfigurator.configurate(JsonConfigurator.java:131)
at org.apache.druid.guice.JsonConfigProvider.get(JsonConfigProvider.java:241)
at com.google.inject.internal.ProviderInternalFactory.provision(ProviderInternalFactory.java:81)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.provision(InternalFactoryToInitializableAdapter.java:53)
at com.google.inject.internal.ProviderInternalFactory.circularGet(ProviderInternalFactory.java:61)
at com.google.inject.internal.InternalFactoryToInitializableAdapter.get(InternalFactoryToInitializableAdapter.java:45)
at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:194)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)
at com.google.inject.internal.SingleParameterInjector.inject(SingleParameterInjector.java:38)
at com.google.inject.internal.SingleParameterInjector.getAll(SingleParameterInjector.java:62)
at com.google.inject.internal.ConstructorInjector.provision(ConstructorInjector.java:110)
at com.google.inject.internal.ConstructorInjector.construct(ConstructorInjector.java:90)
at com.google.inject.internal.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:268)
at com.google.inject.internal.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:46)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1092)
at com.google.inject.internal.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:40)
at com.google.inject.internal.SingletonScope$1.get(SingletonScope.java:194)
at com.google.inject.internal.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:41)
at com.google.inject.internal.SingleFieldInjector.inject(SingleFieldInjector.java:54)
at com.google.inject.internal.MembersInjectorImpl.injectMembers(MembersInjectorImpl.java:132)
at com.google.inject.internal.MembersInjectorImpl$1.call(MembersInjectorImpl.java:93)
at com.google.inject.internal.MembersInjectorImpl$1.call(MembersInjectorImpl.java:80)
at com.google.inject.internal.InjectorImpl.callInContext(InjectorImpl.java:1085)
at com.google.inject.internal.MembersInjectorImpl.injectAndNotify(MembersInjectorImpl.java:80)
at com.google.inject.internal.Initializer$InjectableReference.get(Initializer.java:223)
at com.google.inject.internal.Initializer.injectAll(Initializer.java:132)
at com.google.inject.internal.InternalInjectorCreator.injectDynamically(InternalInjectorCreator.java:174)
at com.google.inject.internal.InternalInjectorCreator.build(InternalInjectorCreator.java:110)
at com.google.inject.Guice.createInjector(Guice.java:99)
at com.google.inject.Guice.createInjector(Guice.java:73)
at com.google.inject.Guice.createInjector(Guice.java:62)
at org.apache.druid.initialization.ExtensionInjectorBuilder.build(ExtensionInjectorBuilder.java:49)
at org.apache.druid.initialization.ServerInjectorBuilder.build(ServerInjectorBuilder.java:118)
at org.apache.druid.initialization.ServerInjectorBuilder.makeServerInjector(ServerInjectorBuilder.java:73)
at org.apache.druid.initialization.Initialization.makeInjectorWithModules(Initialization.java:63)
at org.apache.druid.indexer.HadoopDruidIndexerConfig.<clinit>(HadoopDruidIndexerConfig.java:109)
at org.apache.druid.indexer.DetermineHashedPartitionsJob$DetermineHashedPartitionsPartitioner.setConf(DetermineHashedPartitionsJob.java:481)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:80)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:141)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:725)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
Caused by: com.fasterxml.jackson.databind.exc.ValueInstantiationException: Cannot construct instance of `org.apache.druid.k8s.discovery.K8sDiscoveryConfig`, problem: null/empty clusterIdentifier
at [Source: UNKNOWN; line: -1, column: -1]
at com.fasterxml.jackson.databind.exc.ValueInstantiationException.from(ValueInstantiationException.java:47)
at com.fasterxml.jackson.databind.DeserializationContext.instantiationException(DeserializationContext.java:1907)
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapAsJsonMappingException(StdValueInstantiator.java:587)
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.rewrapCtorProblem(StdValueInstantiator.java:610)
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:293)
at com.fasterxml.jackson.databind.deser.ValueInstantiator.createFromObjectWith(ValueInstantiator.java:288)
at com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:202)
at com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:521)
at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1405)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:363)
at com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:196)
at com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:4309)
... 51 more
Caused by: java.lang.IllegalArgumentException: null/empty clusterIdentifier
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
at org.apache.druid.k8s.discovery.K8sDiscoveryConfig.<init>(K8sDiscoveryConfig.java:75)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:499)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:480)
at com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:124)
at com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:291)
... 58 more
From what I understand, the job running on YARN should not need this config, but it still tries to inject it.
I cannot find a way to inject fake values or to skip this.
I know mm-less ingestion and K8s discovery are still experimental, so this might be a path that has not really been tested yet. Is there any way to fix this?
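For reference, the property the injector is failing on is `druid.discovery.k8s.clusterIdentifier`, which the `druid-kubernetes-extensions` discovery module requires to be non-empty. A possible (untested) workaround sketch would be to set it explicitly in the common runtime properties, so that the config serialized into the Hadoop task is at least parseable; the value below is a placeholder:

```properties
# Hypothetical workaround sketch (untested): give the K8s discovery module a
# non-empty clusterIdentifier so K8sDiscoveryConfig can be constructed on the
# YARN side. "druid-dev" is a placeholder value.
druid.discovery.k8s.clusterIdentifier=druid-dev
```

Whether the YARN-side task actually picks this up (rather than only the pod-side runtime properties) is exactly the open question here.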