Affected Version
0.17.0/master
Description
Azure deep storage does not work when the datasource name contains non-ASCII characters. For example, the segment push fails if the datasource name contains " Россия 한국 中国!?".
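The stack trace in this report shows the exception coming from `java.net.URI.<init>`, called from `AzureDataSegmentPusher.uploadDataSegment`. The following is a minimal standalone sketch of that JDK behavior only. It is not Druid's code and not a proposed fix; the class `UriPathRepro`, its helper methods, and the shortened `ds ...` path are made up for illustration. It suggests that the single-argument `URI(String)` constructor rejects the space character (index 57 in the log appears to be the space right before "Россия"), that non-ASCII letters on their own are accepted, and that the multi-argument constructor quotes illegal characters instead of throwing.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical repro class; none of these names exist in Druid.
final class UriPathRepro {

    // Returns the index at which new URI(String) rejects the input, or -1 if it parses.
    static int failureIndex(String s) {
        try {
            new URI(s);
            return -1;
        } catch (URISyntaxException e) {
            return e.getIndex();
        }
    }

    // Builds a relative URI from a raw path with the four-argument constructor
    // (scheme, host, path, fragment), which percent-quotes illegal characters.
    static URI quoted(String path) throws URISyntaxException {
        return new URI(null, null, path, null);
    }

    public static void main(String[] args) throws URISyntaxException {
        // Shortened stand-in for the blob path seen in the log.
        String path = "ds Россия 한국 中国!?/index.zip";

        // The single-argument constructor fails at the first space.
        System.out.println("failure index: " + failureIndex(path));

        // Non-ASCII letters without spaces are accepted by the same constructor.
        System.out.println("no spaces: " + failureIndex("dsРоссия한국中国/index.zip"));

        // The multi-argument constructor quotes the space and '?' rather than throwing.
        URI uri = quoted(path);
        System.out.println("raw path: " + uri.getRawPath());
        System.out.println("decoded path equals input: " + uri.getPath().equals(path));
    }
}
```

Whether percent-quoting the blob path is the right remedy for the Azure pusher is a separate question; the sketch only isolates where the exception originates.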
Exception stack
2020-03-13T04:01:58,626 INFO [[single_phase_sub_task_wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?_elgboiac_2020-03-13T04:01:40.410Z]-appenderator-merge] org.apache.druid.storage.azure.AzureDataSegmentPusher - Uploading [/tmp/persistent/task/single_phase_sub_task_wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?_elgboiac_2020-03-13T04:01:40.410Z/work/persist/wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2020-03-13T04:00:29.411Z_4/merged] to Azure.
2020-03-13T04:01:58,630 INFO [[single_phase_sub_task_wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?_elgboiac_2020-03-13T04:01:40.410Z]-appenderator-merge] org.apache.druid.storage.azure.AzureDataSegmentPusher - DataSegment: [wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?/20130831T000000.000Z_20130901T000000.000Z/2020-03-13T04_00_29.411Z/4]
2020-03-13T04:01:58,747 INFO [[single_phase_sub_task_wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?_elgboiac_2020-03-13T04:01:40.410Z]-appenderator-merge] org.apache.druid.storage.azure.AzureDataSegmentPusher - Deleting zipped index File[/tmp/index4275469617247018689.zip]
2020-03-13T04:01:58,748 WARN [[single_phase_sub_task_wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?_elgboiac_2020-03-13T04:01:40.410Z]-appenderator-merge] org.apache.druid.java.util.common.RetryUtils - Retrying (1 of 4) in 1,347ms.
java.lang.RuntimeException: java.net.URISyntaxException: Illegal character in path at index 57: wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?/20130831T000000.000Z_20130901T000000.000Z/2020-03-13T04_00_29.411Z/4/index.zip
at org.apache.druid.storage.azure.AzureDataSegmentPusher.push(AzureDataSegmentPusher.java:143) ~[?:?]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$mergeAndPush$4(AppenderatorImpl.java:791) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:87) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:115) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:105) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:787) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$push$1(AppenderatorImpl.java:657) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) [guava-16.0.1.jar:?]
at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861) [guava-16.0.1.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
Caused by: java.net.URISyntaxException: Illegal character in path at index 57: wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?/20130831T000000.000Z_20130901T000000.000Z/2020-03-13T04_00_29.411Z/4/index.zip
at java.net.URI$Parser.fail(URI.java:2848) ~[?:1.8.0_232]
at java.net.URI$Parser.checkChars(URI.java:3021) ~[?:1.8.0_232]
at java.net.URI$Parser.parseHierarchical(URI.java:3105) ~[?:1.8.0_232]
at java.net.URI$Parser.parse(URI.java:3063) ~[?:1.8.0_232]
at java.net.URI.<init>(URI.java:588) ~[?:1.8.0_232]
at org.apache.druid.storage.azure.AzureDataSegmentPusher.uploadDataSegment(AzureDataSegmentPusher.java:188) ~[?:?]
at org.apache.druid.storage.azure.AzureDataSegmentPusher.lambda$push$0(AzureDataSegmentPusher.java:138) ~[?:?]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:87) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:115) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:105) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.storage.azure.AzureUtils.retryAzureOperation(AzureUtils.java:101) ~[?:?]
at org.apache.druid.storage.azure.AzureDataSegmentPusher.push(AzureDataSegmentPusher.java:137) ~[?:?]
... 11 more
2020-03-13T04:02:13,499 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Exception while running task[AbstractTask{id='single_phase_sub_task_wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?_elgboiac_2020-03-13T04:01:40.410Z', groupId='index_parallel_wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?_cfmbhbem_2020-03-13T04:00:29.405Z', taskResource=TaskResource{availabilityGroup='single_phase_sub_task_wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?_elgboiac_2020-03-13T04:01:40.410Z', requiredCapacity=1}, dataSource='wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?', context={forceTimeChunkLock=true}}]
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.net.URISyntaxException: Illegal character in path at index 57: wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?/20130831T000000.000Z_20130901T000000.000Z/2020-03-13T04_00_29.411Z/4/index.zip
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:428) ~[druid-indexing-service-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:217) ~[druid-indexing-service-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:123) ~[druid-indexing-service-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:421) [druid-indexing-service-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:393) [druid-indexing-service-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_232]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: java.net.URISyntaxException: Illegal character in path at index 57: wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?/20130831T000000.000Z_20130901T000000.000Z/2020-03-13T04_00_29.411Z/4/index.zip
at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-16.0.1.jar:?]
at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-16.0.1.jar:?]
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-16.0.1.jar:?]
at org.apache.druid.segment.realtime.appenderator.BatchAppenderatorDriver.pushAndClear(BatchAppenderatorDriver.java:150) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.segment.realtime.appenderator.BatchAppenderatorDriver.pushAllAndClear(BatchAppenderatorDriver.java:131) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:418) ~[druid-indexing-service-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
... 8 more
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.net.URISyntaxException: Illegal character in path at index 57: wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?/20130831T000000.000Z_20130901T000000.000Z/2020-03-13T04_00_29.411Z/4/index.zip
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:822) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$push$1(AppenderatorImpl.java:657) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) ~[guava-16.0.1.jar:?]
at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861) ~[guava-16.0.1.jar:?]
... 3 more
Caused by: java.lang.RuntimeException: java.net.URISyntaxException: Illegal character in path at index 57: wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?/20130831T000000.000Z_20130901T000000.000Z/2020-03-13T04_00_29.411Z/4/index.zip
at org.apache.druid.storage.azure.AzureDataSegmentPusher.push(AzureDataSegmentPusher.java:143) ~[?:?]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$mergeAndPush$4(AppenderatorImpl.java:791) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:87) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:115) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:105) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:787) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$push$1(AppenderatorImpl.java:657) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) ~[guava-16.0.1.jar:?]
at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861) ~[guava-16.0.1.jar:?]
... 3 more
Caused by: java.net.URISyntaxException: Illegal character in path at index 57: wikipedia_index_test_f586e6e7-286a-4e0e-9197-d738b656b595 Россия 한국 中国!?/20130831T000000.000Z_20130901T000000.000Z/2020-03-13T04_00_29.411Z/4/index.zip
at java.net.URI$Parser.fail(URI.java:2848) ~[?:1.8.0_232]
at java.net.URI$Parser.checkChars(URI.java:3021) ~[?:1.8.0_232]
at java.net.URI$Parser.parseHierarchical(URI.java:3105) ~[?:1.8.0_232]
at java.net.URI$Parser.parse(URI.java:3063) ~[?:1.8.0_232]
at java.net.URI.<init>(URI.java:588) ~[?:1.8.0_232]
at org.apache.druid.storage.azure.AzureDataSegmentPusher.uploadDataSegment(AzureDataSegmentPusher.java:188) ~[?:?]
at org.apache.druid.storage.azure.AzureDataSegmentPusher.lambda$push$0(AzureDataSegmentPusher.java:138) ~[?:?]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:87) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:115) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:105) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.storage.azure.AzureUtils.retryAzureOperation(AzureUtils.java:101) ~[?:?]
at org.apache.druid.storage.azure.AzureDataSegmentPusher.push(AzureDataSegmentPusher.java:137) ~[?:?]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$mergeAndPush$4(AppenderatorImpl.java:791) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:87) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:115) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:105) ~[druid-core-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:787) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$push$1(AppenderatorImpl.java:657) ~[druid-server-0.18.0-SNAPSHOT.jar:0.18.0-SNAPSHOT]
at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) ~[guava-16.0.1.jar:?]
at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861) ~[guava-16.0.1.jar:?]
... 3 more