Unable to ingest nested JSON data when using flattenSpec with the JSONPath length() function.
Description
Ingestion fails with the following stack trace:
2021-05-22T01:56:09,614 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.common.task.IndexTask - Encountered exception in BUILD_SEGMENTS.
java.lang.ClassCastException: java.lang.Integer cannot be cast to com.fasterxml.jackson.databind.JsonNode
at org.apache.druid.java.util.common.parsers.JSONFlattenerMaker.lambda$makeJsonPathExtractor$2(JSONFlattenerMaker.java:89) ~[druid-core-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.java.util.common.parsers.ObjectFlatteners$1$1.get(ObjectFlatteners.java:116) ~[druid-core-0.21.0-iap3.jar:0.21.0-iap3]
at java.util.Collections$UnmodifiableMap.get(Collections.java:1456) ~[?:1.8.0_262]
at org.apache.druid.data.input.MapBasedRow.getRaw(MapBasedRow.java:87) ~[druid-core-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.segment.incremental.IncrementalIndex.toIncrementalIndexRow(IncrementalIndex.java:544) ~[druid-processing-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.segment.incremental.IncrementalIndex.add(IncrementalIndex.java:480) ~[druid-processing-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.segment.realtime.plumber.Sink.add(Sink.java:179) ~[druid-server-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.add(AppenderatorImpl.java:261) ~[druid-server-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver.append(BaseAppenderatorDriver.java:409) ~[druid-server-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.segment.realtime.appenderator.BatchAppenderatorDriver.add(BatchAppenderatorDriver.java:114) ~[druid-server-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.indexing.common.task.InputSourceProcessor.process(InputSourceProcessor.java:106) ~[druid-indexing-service-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:878) ~[druid-indexing-service-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.indexing.common.task.IndexTask.runTask(IndexTask.java:494) [druid-indexing-service-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:152) [druid-indexing-service-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask.runSequential(ParallelIndexSupervisorTask.java:964) [druid-indexing-service-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask.runTask(ParallelIndexSupervisorTask.java:445) [druid-indexing-service-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:152) [druid-indexing-service-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:451) [druid-indexing-service-0.21.0-iap3.jar:0.21.0-iap3]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:423) [druid-indexing-service-0.21.0-iap3.jar:0.21.0-iap3]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_262]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_262]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_262]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_262]
Input:
{
  ...
  "flattenSpec": {
    "fields": [
      {
        "type": "path",
        "name": "count",
        "expr": "$.team.players.length()"
      }
    ]
  }
  ...
}
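The exception message and the top frame (JSONFlattenerMaker.lambda$makeJsonPathExtractor$2) suggest that the value produced by the JSONPath expression is cast to JsonNode without a type check, while length() produces a plain Integer. The following is only a sketch of that suspected failure mode, not the Druid source: CastSketch, Node, unwrapUnchecked and unwrapChecked are hypothetical names, and Node is a stand-in for a JSON tree node, not Jackson's JsonNode.

```java
final class CastSketch {
    // Hypothetical stand-in for a JSON tree node type (not Jackson's JsonNode).
    static final class Node {
        final Object value;

        Node(Object value) {
            this.value = value;
        }
    }

    // Assumes every path result is a Node. A function such as length()
    // hands back an Integer instead, so this cast throws ClassCastException.
    static Object unwrapUnchecked(Object pathResult) {
        return ((Node) pathResult).value;
    }

    // Checks the runtime type first and passes non-Node results through unchanged.
    static Object unwrapChecked(Object pathResult) {
        if (pathResult instanceof Node) {
            return ((Node) pathResult).value;
        }
        return pathResult;
    }

    public static void main(String[] args) {
        Object lengthResult = Integer.valueOf(3); // what a length() call might return

        System.out.println(unwrapChecked(lengthResult));
        try {
            unwrapUnchecked(lengthResult);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed: " + e.getMessage());
        }
    }
}
```

If the real extractor behaves like unwrapUnchecked, any JSONPath function that returns a scalar rather than a node would presumably hit the same exception.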
Replacing the JSONPath field in flattenSpec with the following jackson-jq expression avoids the problem:
{
  ...
  "flattenSpec": {
    "fields": [
      {
        "type": "jq",
        "name": "count",
        "expr": ".team.players | length"
      }
    ]
  }
  ...
}
We want to use JSONPath rather than jq because it also applies to non-JSON files.
Affected Version
Imply version 2021.01-2 LTS