druid-orc-extensions hadoop-common dependency is broken or maybe the extension isn't properly documented
Affected Version
0.13.0-incubating
Description
Using these modules:
druid.extensions.loadList=["mysql-metadata-storage", "druid-kafka-indexing-service", "druid-orc-extensions", "druid-hdfs-storage"]
du extensions/*
20032 extensions/druid-cassandra-storage
46928 extensions/druid-hdfs-storage
4240 extensions/druid-kafka-indexing-service
168 extensions/druid-lookups-cached-global
56640 extensions/druid-orc-extensions
136 extensions/druid-s3-extensions
1968 extensions/mysql-metadata-storage
Posting this task:
{
  "type": "index_parallel",
  "spec": {
    "dataSchema": {
      "dataSource": "my_orc_test",
      "metricsSpec": [
        {
          "type": "count",
          "name": "count"
        }
      ],
      "granularitySpec": {
        "segmentGranularity": "DAY",
        "queryGranularity": "second",
        "intervals": [ "2018-07-10/2018-07-11" ]
      },
      "parser": {
        "type": "orc",
        "parseSpec": {
          "format": "timeAndDims",
          "timestampSpec": {
            "column": "time",
            "format": "auto"
          },
          "dimensionsSpec": {
            "dimensions": [
              "tag"
            ],
            "dimensionExclusions": [],
            "spatialDimensions": []
          }
        },
        "typeString": "struct<time:string,tag:string>",
        "mapFieldNameFormat": "<PARENT>_<CHILD>"
      }
    },
    "ioConfig": {
      "type": "index_parallel",
      "firehose": {
        "type": "local",
        "baseDir": "./",
        "filter": "*.orc"
      }
    }
  }
}
The spawned subtask fails with this error:
2019-04-10T22:16:10,413 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Uncaught Throwable while running task[AbstractTask{id='index_sub_my_orc_test_2019-04-10T22:16:02.661Z', groupId='index_parallel_my_orc_test_2019-04-10T22:15:55.541Z', taskResource=TaskResource{availabilityGroup='index_sub_my_orc_test_2019-04-10T22:16:02.661Z', requiredCapacity=1}, dataSource='my_orc_test', context={}}]
java.lang.NoClassDefFoundError: org/apache/hadoop/io/Writable
However, the log shows that the jar providing this class was added earlier:
2019-04-10T22:16:04,921 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/Users/egorryashin/a-druid-0.13-i/extensions/druid-hdfs-storage/hadoop-common-2.8.3.jar] for extension[druid-hdfs-storage]
The task fails both with druid-hdfs-storage loaded and without it.
I spotted this while investigating #6925.
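To check which extension directories actually ship the missing class, a small script along these lines could be used. This is only a diagnostic sketch: the function name jars_containing and the directory layout (one sub-directory of jars per extension) are assumptions for illustration. It also rests on the assumption that each extension is loaded through its own classloader, so a jar added for druid-hdfs-storage would not necessarily be visible to druid-orc-extensions.

```python
import zipfile
from pathlib import Path

# Class named in the NoClassDefFoundError, written as a jar entry path.
CLASS_ENTRY = "org/apache/hadoop/io/Writable.class"


def jars_containing(extensions_dir, class_entry=CLASS_ENTRY):
    """Map each extension directory name to the jars that contain class_entry.

    Assumes a layout of <extensions_dir>/<extension-name>/*.jar.
    Extensions without a matching jar are omitted from the result.
    """
    hits = {}
    for jar in sorted(Path(extensions_dir).glob("*/*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if class_entry in zf.namelist():
                    hits.setdefault(jar.parent.name, []).append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid zip archives.
            continue
    return hits
```

Calling jars_containing("extensions") from the Druid install directory returns a dictionary keyed by extension name. If druid-orc-extensions is missing from the result, none of the jars in that directory contain the class.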