Kafka Indexing Service task pauses forever until timeout if there are no events during taskDuration? #5656

@Oooocean

Description

We want to use the Kafka indexing service for streaming events, but I've run into a problem even in a very simple case.

  1. Install Druid on one machine (using imply-2.5.8.tar.gz).
  2. Install Kafka on the same machine.
  3. Create a topic and submit this supervisor spec:
    {
      "type": "kafka",
      "dataSchema": {
        "dataSource": "metrics-ocean4",
        "parser": {
          "type": "string",
          "parseSpec": {
            "format": "json",
            "timestampSpec": {
              "column": "timestamp",
              "format": "auto"
            },
            "dimensionsSpec": {
              "dimensions": [],
              "dimensionExclusions": [
                "timestamp",
                "value"
              ]
            }
          }
        },
        "metricsSpec": [
          { "name": "count", "type": "count" },
          { "name": "value_sum", "fieldName": "value", "type": "doubleSum" },
          { "name": "value_min", "fieldName": "value", "type": "doubleMin" },
          { "name": "value_max", "fieldName": "value", "type": "doubleMax" }
        ],
        "granularitySpec": {
          "type": "uniform",
          "segmentGranularity": "MINUTE",
          "queryGranularity": "NONE"
        }
      },
      "tuningConfig": {
        "type": "kafka",
        "maxRowsPerSegment": 5000000,
        "intermediatePersistPeriod": "PT5M",
        "basePersistDirectory": "/home/qspace/data/mmgamedruiddata/tmp"
      },
      "ioConfig": {
        "topic": "oceantopic4",
        "consumerProperties": {
          "bootstrap.servers": "10.242.20.145:9092"
        },
        "taskCount": 1,
        "replicas": 1,
        "taskDuration": "PT10M",
        "completionTimeout": "PT10M"
      }
    }
    Note that I set taskDuration to PT10M because I wanted a quick test. But my tasks mostly fail while in publishing status.
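For reference, this is roughly how I submit the spec to the overlord's supervisor endpoint (a minimal sketch; the overlord host/port and the spec file name are placeholders for my setup):

```python
import json
import urllib.request

def load_spec(path):
    """Load the supervisor spec and sanity-check its type before submitting."""
    with open(path) as f:
        spec = json.load(f)
    assert spec["type"] == "kafka"
    return spec

def submit(spec, overlord="http://localhost:8090"):
    """POST the spec to the overlord's supervisor endpoint."""
    req = urllib.request.Request(
        overlord + "/druid/indexer/v1/supervisor",
        data=json.dumps(spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)

if __name__ == "__main__":
    submit(load_spec("kafka-supervisor.json"))
```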

The failed task log appears to stop at
"2018-04-17T11:25:33,184 INFO [task-runner-0-priority-0] io.druid.indexing.kafka.KafkaIndexTask - Pausing ingestion until resumed" and never resumes.

I set the overlord log level to debug. When the index task (on the MiddleManager) is paused, the overlord sends the end offsets to the index task so it can publish segments.
If there aren't any events, the offsets don't change, so the overlord never sends the "offset/end" request?
If the index task never receives a resume request, it stays paused until completionTimeout.
Am I missing something? Could anyone help? Thanks in advance!
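As a workaround while testing, I keep the topic from being empty by sending a periodic heartbeat event shaped like the spec's timestampSpec/metricsSpec fields, so the offsets keep advancing. A minimal sketch (the kafka-python package, topic, broker address, and period are assumptions from my setup):

```python
import json
import time

def make_heartbeat(value=0.0):
    """Build a minimal JSON event with the spec's "timestamp" and "value" fields."""
    return json.dumps({
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "value": value,
    })

def run_heartbeat(topic="oceantopic4", bootstrap="10.242.20.145:9092", period=60):
    """Send one heartbeat event per period so the topic offsets keep moving."""
    from kafka import KafkaProducer  # third-party: pip install kafka-python
    producer = KafkaProducer(bootstrap_servers=bootstrap)
    while True:
        producer.send(topic, make_heartbeat().encode("utf-8"))
        producer.flush()
        time.sleep(period)
```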

Successful index task overlord log: ok_overlord
Failed index task overlord log: fail_overlord
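To compare the two runs, I also poll the supervisor status endpoint on the overlord, which reports the task state and latest offsets (a minimal sketch; the overlord host/port is an assumption, and the supervisor id defaults to the dataSource name):

```python
import json
import urllib.request

def status_url(supervisor_id, overlord="http://localhost:8090"):
    """Build the URL of the overlord's supervisor status endpoint."""
    return "{}/druid/indexer/v1/supervisor/{}/status".format(overlord, supervisor_id)

def supervisor_status(supervisor_id, overlord="http://localhost:8090"):
    """Fetch the supervisor status report as a dict."""
    with urllib.request.urlopen(status_url(supervisor_id, overlord)) as resp:
        return json.load(resp)
```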
