Kafka Indexing Service - Issue when data-source name is in Cyrillic #6718

@ciukstar

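Data is not ingested through the Kafka Indexing Service when the data source name is in Cyrillic. The log below shows the task client repeatedly failing its task ID check: the ID reported by the worker has the Cyrillic characters replaced by spaces, so the client retries every 5 seconds.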

Dec 10 03:58:37 ub druid-overload.sh[1450]: 2018-12-10T00:58:37,143 INFO [KafkaIndexTaskClient-Средняя заработная плата-0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://ub:8101
Dec 10 03:58:37 ub druid-overload.sh[1450]: 2018-12-10T00:58:37,151 WARN [KafkaIndexTaskClient-Средняя заработная плата-0] io.druid.indexing.kafka.KafkaIndexTaskClient - Expected worker to have taskId [index_kafka_Средняя заработная плата_419f54082b9c6c0_ckoppdmd] but has taskId [index_kafka_                        _419f54082b9c6c0_ckoppdmd], will retry in [5]s
Dec 10 03:58:42 ub druid-overload.sh[1450]: 2018-12-10T00:58:42,156 INFO [KafkaIndexTaskClient-Средняя заработная плата-0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://ub:8101
Dec 10 03:58:42 ub druid-overload.sh[1450]: 2018-12-10T00:58:42,177 WARN [KafkaIndexTaskClient-Средняя заработная плата-0] io.druid.indexing.kafka.KafkaIndexTaskClient - Expected worker to have taskId [index_kafka_Средняя заработная плата_419f54082b9c6c0_ckoppdmd] but has taskId [index_kafka_                        _419f54082b9c6c0_ckoppdmd], will retry in [5]s
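
In the warnings above, the task ID reported by the worker keeps its ASCII parts but has a space wherever the expected ID has a Cyrillic character. One possible explanation, which this issue does not confirm, is that the ID passes through a channel limited to ISO-8859-1 (HTTP header values are a common example) and that out-of-range characters are replaced on the way. The Python sketch below only illustrates that hypothesis: `sanitize_to_latin1` is a hypothetical helper, not Druid code, and percent-encoding is shown as one generic way to carry such an ID over an ASCII-only channel.

```python
from urllib.parse import quote, unquote


def sanitize_to_latin1(value: str, replacement: str = " ") -> str:
    """Hypothetical helper: replace every character outside ISO-8859-1.

    This mimics what a Latin-1-only transport might do; it is an
    assumption for illustration, not code taken from Druid.
    """
    return "".join(ch if ord(ch) <= 0xFF else replacement for ch in value)


# Expected task ID, copied from the log in this issue.
task_id = "index_kafka_Средняя заработная плата_419f54082b9c6c0_ckoppdmd"

# Every Cyrillic letter becomes a space; ASCII characters are untouched.
mangled = sanitize_to_latin1(task_id)
print(repr(mangled))

# Percent-encoding produces a pure-ASCII form that round-trips losslessly.
encoded = quote(task_id, safe="")
decoded = unquote(encoded)
print(encoded.isascii(), decoded == task_id)
```

The mangled value has the same shape as the task ID the worker reports in the log, which is why the comparison against the expected ID can never succeed, however many times it is retried.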

Here is the supervisor spec:

{
    "type": "kafka",
    "dataSchema": {
        "dataSource": "Средняя заработная плата",
        "parser": {
            "type": "string",
            "parseSpec": {
                "format": "json",
                "timestampSpec": {
                    "column": "Период",
                    "format": "auto"
                },
                "dimensionsSpec": {
                    "dimensions": [
                        "Страна",
                        "Федеральный округ",
                        "Субъект",
                        "Отрасль",
                        "Группа учреждений",
                        "Район",
                        "Организация",
                        "Должность",
                        "Категория должностей"
                    ]
                }
            }
        },
        "metricsSpec": [
            {
                "type": "doubleSum",
                "name": "Средняя численность списочного со",
                "fieldName": "Средняя численность списочного со"
            },
            {
                "type": "doubleSum",
                "name": "Окладная часть",
                "fieldName": "Окладная часть"
            },
            {
                "type": "doubleSum",
                "name": "Компенсац характера",
                "fieldName": "Компенсац характера"
            },
            {
                "type": "doubleSum",
                "name": "Прочие",
                "fieldName": "Прочие"
            },
            {
                "type": "doubleSum",
                "name": "Бюджет",
                "fieldName": "Бюджет"
            },
            {
                "type": "doubleSum",
                "name": "Приносящая доход деятельность",
                "fieldName": "Приносящая доход деятельность"
            },
            {
                "type": "doubleSum",
                "name": "Зарплата общ",
                "fieldName": "Зарплата общ"
            },
            {
                "type": "doubleSum",
                "name": "Стимулир характера",
                "fieldName": "Стимулир характера"
            },
            {
                "type": "doubleSum",
                "name": "Бюджет окладная часть",
                "fieldName": "Бюджет окладная часть"
            },
            {
                "type": "doubleSum",
                "name": "Бюджет компенсац характера",
                "fieldName": "Бюджет компенсац характера"
            },
            {
                "type": "doubleSum",
                "name": "Бюджет стимулир характера",
                "fieldName": "Бюджет стимулир характера"
            },
            {
                "type": "doubleSum",
                "name": "Бюджет прочие",
                "fieldName": "Бюджет прочие"
            }
        ],
        "granularitySpec": {
            "type": "uniform",
            "segmentGranularity": "MONTH",
            "queryGranularity": "NONE",
            "rollup": true
        },
        "transformSpec": {
            "transforms": [
                {
                    "type": "expression",
                    "name": "Страна",
                    "expression": "concat('Российская Федерация')"
                },
                {
                    "type": "expression",
                    "name": "Федеральный округ",
                    "expression": "concat('Северо-Западный федеральный округ')"
                },
                {
                    "type": "expression",
                    "name": "Субъект",
                    "expression": "concat('Санкт-Петербург')"
                }
            ]
        }
    },
    "tuningConfig": {
        "type": "kafka",
        "reportParseExceptions": true
    },
    "ioConfig": {
        "topic": "mean-wages",
        "replicas": 2,
        "taskDuration": "PT5M",
        "completionTimeout": "PT10M",
        "consumerProperties": {
            "bootstrap.servers": "localhost:9092"
        }
    }
}
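
For anyone trying to reproduce this, the following is a minimal sketch of building the submission request for a spec with a Cyrillic `dataSource`, with the body explicitly encoded as UTF-8. The Overlord address `localhost:8090` is an assumption (the issue does not state it), `build_supervisor_request` is a hypothetical helper, and the spec is trimmed to the one field that matters here rather than being the full spec above.

```python
import json
import urllib.request

# Assumption: Overlord host and port; the issue does not state them.
OVERLORD_URL = "http://localhost:8090/druid/indexer/v1/supervisor"


def build_supervisor_request(spec: dict, url: str = OVERLORD_URL) -> urllib.request.Request:
    """Hypothetical helper: build a POST request carrying the spec as UTF-8 JSON."""
    # ensure_ascii=False keeps the Cyrillic text readable instead of \uXXXX escapes.
    body = json.dumps(spec, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Trimmed spec for illustration; a real submission needs the full spec.
spec = {"type": "kafka", "dataSchema": {"dataSource": "Средняя заработная плата"}}
request = build_supervisor_request(spec)

# To actually submit against a running Overlord:
# urllib.request.urlopen(request)
```

Declaring the charset explicitly helps rule out the submission step as the place where the name gets damaged, so attention can stay on the task ID check shown in the log.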

Environment:

$ java -version
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-0ubuntu0.18.04.1-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
$ uname -a
Linux ub 4.15.0-42-generic #45-Ubuntu SMP Thu Nov 15 19:32:57 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ localectl status
   System Locale: LANG=en_US.UTF-8
                  LANGUAGE=en_US
       VC Keymap: n/a
      X11 Layout: us,ru
       X11 Model: pc105
     X11 Variant: ,
     X11 Options: grp:toggle,grp_led:scroll

The same error occurs with this locale:

$ localectl status
   System Locale: LANG=ru_RU.UTF-8
       VC Keymap: n/a
      X11 Layout: us
       X11 Model: pc105
$ lsb_release -a
LSB Version:	core-9.20170808ubuntu1-noarch:security-9.20170808ubuntu1-noarch
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.1 LTS
Release:	18.04
Codename:	bionic
