Cell contains [Ljava.lang.String;@ instead of data when constructing array via transformSpec during ingestion #8733

@fstolba

Affected Version

0.16.0

Description

The cluster runs as single-server-large. I'm trying to combine multiple fields into an array during native batch ingestion.

The task spec looks like this:

{
  "type": "index_parallel",
  "spec": {
    "ioConfig" : {
      "type": "index_parallel",
      "firehose" : {
        "type"    : "ingestSegment",
        "dataSource"   : "source-datasource",
        "interval": "2019-10-11T00:00/2019-10-11T02:00"
      },
      "appendToExisting" : false
    },
    "tuningConfig" : {
      "type" : "index_parallel",
      "maxNumSubTasks": 6
    },
    "dataSchema": {
      "dataSource": "destination-datasource",
      "parser": {
        "type": "string",
        "parseSpec": {
          "format": "json",
          "timestampSpec": {
            "column": "STAMP_UPDATED",
            "format": "auto"
          },
          "dimensionsSpec": {
            "dimensions": [
              "NAME",
              "LOCATION",
              "TAG_IN",
              "TAG_OUT",
              "TAG_IN_NAME",
              "TAG_OUT_NAME",
              "TAG"
            ]
          }
        }
      },
      "metricsSpec": [
        { "type": "longSum", "name": "VALUE", "fieldName": "VALUE" }
      ],
      "transformSpec": {
        "transforms": [
          { "type": "expression", "name": "TAG_IN_NAME", "expression": "lookup(TAG_IN, 'tags')" },
          { "type": "expression", "name": "TAG_OUT_NAME", "expression": "lookup(TAG_OUT, 'tags')" },
          { "type": "expression", "name": "TAG", "expression": "array(TAG_IN_NAME, TAG_OUT_NAME)" }
        ]
      },
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "day",
        "queryGranularity": "fifteen_minute",
        "rollup": true
      }
    }
  }
}

Expected behaviour

The resulting records should contain an array composed of the values of the specified fields.

NAME | LOCATION | TAG_IN | TAG_OUT | TAG_IN_NAME | TAG_OUT_NAME | TAG
---- | -------- | ------ | ------- | ----------- | ------------ | ---
Foo  | Bar      | 345    | 456     | Super       | Duper        | ["Super", "Duper"]

Actual behaviour

The resulting cells don't contain an array of the actual values, but rather values like [Ljava.lang.String;@768e1c0, where the part after the @ changes. This also happens with Kafka indexing tasks, not only with native batch ingestion.

NAME | LOCATION | TAG_IN | TAG_OUT | TAG_IN_NAME | TAG_OUT_NAME | TAG
---- | -------- | ------ | ------- | ----------- | ------------ | ---
Foo  | Bar      | 345    | 456     | Super       | Duper        | [Ljava.lang.String;@768e1c0
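
For context, the value in the TAG column has the same shape as what plain Java produces when a `String[]` is converted to a string through the default `Object.toString()`: a type tag (`[Ljava.lang.String;`) followed by `@` and the identity hash code in hex. The sketch below is standalone Java, not Druid code, and the class and method names are made up for illustration. It only shows that this output format is reproducible outside Druid. That the array is stringified this way somewhere in the ingestion path is an assumption, not something confirmed here.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of Druid.
public class ArrayToStringDemo {

    // Default rendering: arrays do not override Object.toString(),
    // so the result is the type tag plus the identity hash code in hex.
    public static String defaultRendering(String[] values) {
        return values.toString();
    }

    // Element-wise rendering of the same array.
    public static String readableRendering(String[] values) {
        return Arrays.toString(values);
    }

    public static void main(String[] args) {
        String[] tag = {"Super", "Duper"};
        // Prints something like [Ljava.lang.String;@<hex>; the hex part differs per object.
        System.out.println(defaultRendering(tag));
        // Prints [Super, Duper]
        System.out.println(readableRendering(tag));
    }
}
```

Since the hex suffix is derived from the object's identity hash, it would be expected to differ from one array instance to the next, which is consistent with the part after the @ changing.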
