@selsong selsong commented Aug 6, 2025

Description

Implement the timechart command in PPL, which combines time-span binning with an aggregation function. Supported parameters are span, limit, useother, a single aggregation function, and an optional by field. Pivot formatting is not included.

Time Binning

  • The default time bin is currently span=1m, because the bin command is still being implemented and has not been merged yet. Once bin is merged, the default will become bins=100, and a second PR will extend timechart to support all bin options, including span, bins, minspan, and aligntime.
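
As an illustration of what a fixed span means for bucketing, here is a minimal sketch that floors a timestamp to the start of its bucket. It assumes buckets are aligned to the Unix epoch in UTC, which is how the fixed_interval date_histogram in the physical plans below is expected to behave; the class and method names are hypothetical and are not part of this PR.

```java
import java.time.Duration;
import java.time.Instant;

public class SpanBucketSketch {
  /** Floors a timestamp to the start of its fixed-width bucket, counted from the Unix epoch. */
  public static Instant bucketStart(Instant ts, Duration span) {
    long spanMillis = span.toMillis();
    // floorDiv keeps the result correct for timestamps before 1970 as well.
    long floored = Math.floorDiv(ts.toEpochMilli(), spanMillis) * spanMillis;
    return Instant.ofEpochMilli(floored);
  }

  public static void main(String[] args) {
    Instant ts = Instant.parse("2024-07-01T00:01:37Z");
    System.out.println(bucketStart(ts, Duration.ofMinutes(1))); // 2024-07-01T00:01:00Z
    System.out.println(bucketStart(ts, Duration.ofHours(1))); // 2024-07-01T00:00:00Z
  }
}
```

With span=1m every event in the same minute lands in one row, which is why the examples below show one row per minute.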

Limitations (to be resolved in a follow-up PR):

  • Pivot is not supported in timechart.
  • Only a single aggregation function is supported per timechart command.
  • The bins parameter and other bin options are not supported since the bin command is not implemented yet.

Related Issues

Resolves #3965

1. Single aggregation function

Query:

source=events | timechart span=1m avg(cpu_usage)

Result:

{
  "schema": [
    {
      "name": "@timestamp",
      "type": "timestamp"
    },
    {
      "name": "avg(cpu_usage)",
      "type": "float"
    }
  ],
  "datarows": [
    [
      "2024-07-01 00:00:00",
      45.2
    ],
    [
      "2024-07-01 00:01:00",
      38.7
    ],
    [
      "2024-07-01 00:02:00",
      55.3
    ],
    [
      "2024-07-01 00:03:00",
      42.1
    ],
    [
      "2024-07-01 00:04:00",
      41.8
    ]
  ],
  "total": 5,
  "size": 5
}

Logical Plan:

"LogicalSystemLimit(sort0=[$0], dir0=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])\n  LogicalSort(sort0=[$0], dir0=[ASC])\n    LogicalAggregate(group=[{1}], agg#0=[AVG($0)])\n      LogicalProject(cpu_usage=[$2], $f2=[SPAN($1, 1, 'm')])\n        CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n"

Physical Plan:

"EnumerableLimit(fetch=[10000])\n  CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[AGGREGATION->rel#377:LogicalAggregate.NONE.[](input=RelSubset#376,group={1},agg#0=AVG($0)), SORT->[0]], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"$f2\":{\"date_histogram\":{\"field\":\"@timestamp\",\"missing_bucket\":true,\"missing_order\":\"last\",\"order\":\"asc\",\"fixed_interval\":\"1m\"}}}]},\"aggregations\":{\"$f1\":{\"avg\":{\"field\":\"cpu_usage\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n"

2. Aggregation with by field

Query:

source=events | timechart span=1m count() by host

Result:

{
  "schema": [
    {
      "name": "@timestamp",
      "type": "timestamp"
    },
    {
      "name": "host",
      "type": "string"
    },
    {
      "name": "count",
      "type": "bigint"
    }
  ],
  "datarows": [
    [
      "2024-07-01 00:00:00",
      "web-01",
      1
    ],
    [
      "2024-07-01 00:01:00",
      "web-02",
      1
    ],
    [
      "2024-07-01 00:02:00",
      "web-01",
      1
    ],
    [
      "2024-07-01 00:03:00",
      "db-01",
      1
    ],
    [
      "2024-07-01 00:04:00",
      "web-02",
      1
    ]
  ],
  "total": 5,
  "size": 5
}

Logical Plan:

"LogicalSystemLimit(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])\n  LogicalSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])\n    LogicalAggregate(group=[{0, 1}], count=[SUM($2)])\n      LogicalProject(@timestamp=[$1], host=[CASE(IS NOT NULL($3), $0, 'OTHER')], count=[$2])\n        LogicalJoin(condition=[=($0, $3)], joinType=[left])\n          LogicalAggregate(group=[{0, 1}], agg#0=[COUNT()])\n            LogicalProject(host=[$0], $f2=[SPAN($1, 1, 'm')])\n              CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n          LogicalSort(sort0=[$1], dir0=[DESC], fetch=[10])\n            LogicalAggregate(group=[{0}], grand_total=[SUM($2)])\n              LogicalAggregate(group=[{0, 1}], agg#0=[COUNT()])\n                LogicalProject(host=[$0], $f2=[SPAN($1, 1, 'm')])\n                  CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n"

Physical Plan:

"EnumerableLimit(fetch=[10000])\n  EnumerableSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])\n    EnumerableAggregate(group=[{0, 1}], count=[$SUM0($2)])\n      EnumerableCalc(expr#0..4=[{inputs}], expr#5=[IS NOT NULL($t3)], expr#6=['OTHER'], expr#7=[CASE($t5, $t0, $t6)], @timestamp=[$t1], host=[$t7], count=[$t2])\n        EnumerableMergeJoin(condition=[=($0, $3)], joinType=[left])\n          CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[AGGREGATION->rel#795:LogicalAggregate.NONE.[](input=RelSubset#794,group={0, 1},agg#0=COUNT()), SORT->[0]], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"host\":{\"terms\":{\"field\":\"host.keyword\",\"missing_bucket\":true,\"missing_order\":\"last\",\"order\":\"asc\"}}},{\"$f2\":{\"date_histogram\":{\"field\":\"@timestamp\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\",\"fixed_interval\":\"1m\"}}}]},\"aggregations\":{\"$f2_0\":{\"value_count\":{\"field\":\"_index\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n          EnumerableSort(sort0=[$0], dir0=[ASC])\n            EnumerableLimit(fetch=[10])\n              EnumerableSort(sort0=[$1], dir0=[DESC])\n                CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[AGGREGATION->rel#837:LogicalAggregate.NONE.[](input=RelSubset#794,group={0},grand_total=COUNT())], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"host\":{\"terms\":{\"field\":\"host.keyword\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\"}}}]},\"aggregations\":{\"grand_total\":{\"value_count\":{\"field\":\"_index\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n"

3. span parameter

Query:

source=events | timechart span=1s count() by region

Result:

{
  "schema": [
    {
      "name": "@timestamp",
      "type": "timestamp"
    },
    {
      "name": "region",
      "type": "string"
    },
    {
      "name": "count",
      "type": "bigint"
    }
  ],
  "datarows": [
    [
      "2024-07-01 00:00:00",
      "us-east",
      1
    ],
    [
      "2024-07-01 00:01:00",
      "us-west",
      1
    ],
    [
      "2024-07-01 00:02:00",
      "us-east",
      1
    ],
    [
      "2024-07-01 00:03:00",
      "eu-west",
      1
    ],
    [
      "2024-07-01 00:04:00",
      "us-west",
      1
    ]
  ],
  "total": 5,
  "size": 5
}

Logical Plan:

 "LogicalSystemLimit(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])\n  LogicalSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])\n    LogicalAggregate(group=[{0, 1}], count=[SUM($2)])\n      LogicalProject(@timestamp=[$1], region=[CASE(IS NOT NULL($3), $0, 'OTHER')], count=[$2])\n        LogicalJoin(condition=[=($0, $3)], joinType=[left])\n          LogicalAggregate(group=[{0, 1}], agg#0=[COUNT()])\n            LogicalProject(region=[$3], $f2=[SPAN($1, 1, 's')])\n              CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n          LogicalSort(sort0=[$1], dir0=[DESC], fetch=[10])\n            LogicalAggregate(group=[{0}], grand_total=[SUM($2)])\n              LogicalAggregate(group=[{0, 1}], agg#0=[COUNT()])\n                LogicalProject(region=[$3], $f2=[SPAN($1, 1, 's')])\n                  CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n"

Physical Plan:

"EnumerableLimit(fetch=[10000])\n  EnumerableSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])\n    EnumerableAggregate(group=[{0, 1}], count=[$SUM0($2)])\n      EnumerableCalc(expr#0..4=[{inputs}], expr#5=[IS NOT NULL($t3)], expr#6=['OTHER'], expr#7=[CASE($t5, $t0, $t6)], @timestamp=[$t1], region=[$t7], count=[$t2])\n        EnumerableMergeJoin(condition=[=($0, $3)], joinType=[left])\n          CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[AGGREGATION->rel#1413:LogicalAggregate.NONE.[](input=RelSubset#1412,group={0, 1},agg#0=COUNT()), SORT->[0]], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"region\":{\"terms\":{\"field\":\"region.keyword\",\"missing_bucket\":true,\"missing_order\":\"last\",\"order\":\"asc\"}}},{\"$f2\":{\"date_histogram\":{\"field\":\"@timestamp\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\",\"fixed_interval\":\"1s\"}}}]},\"aggregations\":{\"$f2_0\":{\"value_count\":{\"field\":\"_index\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n          EnumerableSort(sort0=[$0], dir0=[ASC])\n            EnumerableLimit(fetch=[10])\n              EnumerableSort(sort0=[$1], dir0=[DESC])\n                CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[AGGREGATION->rel#1455:LogicalAggregate.NONE.[](input=RelSubset#1412,group={0},grand_total=COUNT())], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"region\":{\"terms\":{\"field\":\"region.keyword\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\"}}}]},\"aggregations\":{\"grand_total\":{\"value_count\":{\"field\":\"_index\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n"

4. Default limit = 10 and useother = true

When the by field has more than 10 distinct values, timechart automatically groups the remaining values into an OTHER category.

Query:

source=events_many_hosts | timechart span=1h avg(cpu_usage) by host

Result:

{
  "schema": [
    {
      "name": "@timestamp",
      "type": "timestamp"
    },
    {
      "name": "host",
      "type": "string"
    },
    {
      "name": "avg(cpu_usage)",
      "type": "double"
    }
  ],
  "datarows": [
    [
      "2024-07-01 00:00:00",
      "OTHER",
      35.900001525878906
    ],
    [
      "2024-07-01 00:00:00",
      "web-01",
      45.20000076293945
    ],
    [
      "2024-07-01 00:00:00",
      "web-02",
      38.70000076293945
    ],
    [
      "2024-07-01 00:00:00",
      "web-03",
      55.29999923706055
    ],
    [
      "2024-07-01 00:00:00",
      "web-04",
      42.099998474121094
    ],
    [
      "2024-07-01 00:00:00",
      "web-05",
      41.79999923706055
    ],
    [
      "2024-07-01 00:00:00",
      "web-06",
      39.400001525878906
    ],
    [
      "2024-07-01 00:00:00",
      "web-07",
      48.599998474121094
    ],
    [
      "2024-07-01 00:00:00",
      "web-08",
      44.20000076293945
    ],
    [
      "2024-07-01 00:00:00",
      "web-09",
      67.80000305175781
    ],
    [
      "2024-07-01 00:00:00",
      "web-11",
      43.099998474121094
    ]
  ],
  "total": 11,
  "size": 11
}

Logical Plan:

"LogicalSystemLimit(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])\n  LogicalSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])\n    LogicalAggregate(group=[{0, 1}], avg(cpu_usage)=[SUM($2)])\n      LogicalProject(@timestamp=[$1], host=[CASE(IS NOT NULL($3), $0, 'OTHER')], avg(cpu_usage)=[$2])\n        LogicalJoin(condition=[=($0, $3)], joinType=[left])\n          LogicalAggregate(group=[{0, 2}], agg#0=[AVG($1)])\n            LogicalProject(host=[$1], cpu_usage=[$0], $f3=[SPAN($2, 1, 'h')])\n              CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n          LogicalSort(sort0=[$1], dir0=[DESC], fetch=[10])\n            LogicalAggregate(group=[{0}], grand_total=[SUM($2)])\n              LogicalAggregate(group=[{0, 2}], agg#0=[AVG($1)])\n                LogicalProject(host=[$1], cpu_usage=[$0], $f3=[SPAN($2, 1, 'h')])\n                  CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n",

Physical Plan:

 "EnumerableLimit(fetch=[10000])\n  EnumerableSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])\n    EnumerableAggregate(group=[{0, 1}], avg(cpu_usage)=[SUM($2)])\n      EnumerableCalc(expr#0..4=[{inputs}], expr#5=[IS NOT NULL($t3)], expr#6=['OTHER'], expr#7=[CASE($t5, $t0, $t6)], @timestamp=[$t1], host=[$t7], avg(cpu_usage)=[$t2])\n        EnumerableMergeJoin(condition=[=($0, $3)], joinType=[left])\n          CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[AGGREGATION->rel#1926:LogicalAggregate.NONE.[](input=RelSubset#1925,group={0, 2},agg#0=AVG($1)), SORT->[0]], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"host\":{\"terms\":{\"field\":\"host.keyword\",\"missing_bucket\":true,\"missing_order\":\"last\",\"order\":\"asc\"}}},{\"$f3\":{\"date_histogram\":{\"field\":\"@timestamp\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\",\"fixed_interval\":\"1h\"}}}]},\"aggregations\":{\"$f2\":{\"avg\":{\"field\":\"cpu_usage\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n          EnumerableSort(sort0=[$0], dir0=[ASC])\n            EnumerableLimit(fetch=[10])\n              EnumerableSort(sort0=[$1], dir0=[DESC])\n                EnumerableAggregate(group=[{0}], grand_total=[SUM($2)])\n                  CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[AGGREGATION->rel#1926:LogicalAggregate.NONE.[](input=RelSubset#1925,group={0, 2},agg#0=AVG($1)), SORT->[0]], 
OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"host\":{\"terms\":{\"field\":\"host.keyword\",\"missing_bucket\":true,\"missing_order\":\"last\",\"order\":\"asc\"}}},{\"$f3\":{\"date_histogram\":{\"field\":\"@timestamp\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\",\"fixed_interval\":\"1h\"}}}]},\"aggregations\":{\"$f2\":{\"avg\":{\"field\":\"cpu_usage\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n"

5. limit parameter

Query:

source=events_many_hosts | timechart span=1m limit=3 avg(cpu_usage) by host

Hosts are ranked by the sum of their aggregation results across all time buckets. Only the top 3 hosts are shown, and the hosts with the lowest sums are grouped into the OTHER category. Rows are displayed in their original order.

Result:

{
  "schema": [
    {
      "name": "@timestamp",
      "type": "timestamp"
    },
    {
      "name": "host",
      "type": "string"
    },
    {
      "name": "avg(cpu_usage)",
      "type": "double"
    }
  ],
  "datarows": [
    [
      "2024-07-01 00:00:00",
      "OTHER",
      330.4000015258789
    ],
    [
      "2024-07-01 00:00:00",
      "web-03",
      55.29999923706055
    ],
    [
      "2024-07-01 00:00:00",
      "web-07",
      48.599998474121094
    ],
    [
      "2024-07-01 00:00:00",
      "web-09",
      67.80000305175781
    ]
  ],
  "total": 4,
  "size": 4
}
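
The limit and useother behavior above can be sketched in memory as follows. This is a simplified model of what the logical plan appears to do (rank by-field values by grand total, keep the top N, relabel the rest as OTHER, then SUM per bucket), not the code in this PR. The class and method names are hypothetical, the values are rounded from the results in this description, and the name web-10 for the eleventh host is an assumption (it is the host folded into OTHER in example 4). The limit=0 case is not modeled.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class TopNOtherSketch {
  /** One pre-aggregated row: time bucket, by-field value, aggregated metric. */
  public static final class Row {
    public final String bucket;
    public final String key;
    public final double value;

    public Row(String bucket, String key, double value) {
      this.bucket = bucket;
      this.key = key;
      this.value = value;
    }
  }

  /** Keeps the `limit` keys with the largest totals; folds the rest into OTHER or drops them. */
  public static Map<String, Double> applyLimit(List<Row> rows, int limit, boolean useOther) {
    // Step 1: grand total per by-field value across all time buckets.
    Map<String, Double> totals = new LinkedHashMap<>();
    for (Row r : rows) {
      totals.merge(r.key, r.value, Double::sum);
    }
    // Step 2: rank by total, descending, and remember the top `limit` keys.
    List<Map.Entry<String, Double>> ranked = new ArrayList<>(totals.entrySet());
    ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed());
    Set<String> top = new HashSet<>();
    for (int i = 0; i < Math.min(limit, ranked.size()); i++) {
      top.add(ranked.get(i).getKey());
    }
    // Step 3: relabel non-top keys as OTHER and re-aggregate with SUM per (bucket, label).
    Map<String, Double> out = new TreeMap<>();
    for (Row r : rows) {
      boolean keep = top.contains(r.key);
      if (!keep && !useOther) {
        continue; // useother=false: rows outside the top N are omitted
      }
      out.merge(r.bucket + " | " + (keep ? r.key : "OTHER"), r.value, Double::sum);
    }
    return out;
  }

  /** Sample data: one bucket, eleven hosts, values rounded from this description. */
  public static List<Row> sampleRows() {
    String[] hosts = {"web-01", "web-02", "web-03", "web-04", "web-05", "web-06",
        "web-07", "web-08", "web-09", "web-10", "web-11"};
    double[] avgCpu = {45.2, 38.7, 55.3, 42.1, 41.8, 39.4, 48.6, 44.2, 67.8, 35.9, 43.1};
    List<Row> rows = new ArrayList<>();
    for (int i = 0; i < hosts.length; i++) {
      rows.add(new Row("2024-07-01 00:00:00", hosts[i], avgCpu[i]));
    }
    return rows;
  }

  public static void main(String[] args) {
    // Expect four rows: OTHER plus web-03, web-07, and web-09.
    applyLimit(sampleRows(), 3, true).forEach((k, v) -> System.out.println(k + " -> " + v));
  }
}
```

Note that the final step sums the per-bucket values, so for avg the OTHER row is the sum of the folded averages (about 330.4 here, matching the result above) rather than an average over the folded hosts.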

Logical Plan:

 "LogicalSystemLimit(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])\n  LogicalSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])\n    LogicalAggregate(group=[{0, 1}], avg(cpu_usage)=[SUM($2)])\n      LogicalProject(@timestamp=[$1], host=[CASE(IS NOT NULL($3), $0, 'OTHER')], avg(cpu_usage)=[$2])\n        LogicalJoin(condition=[=($0, $3)], joinType=[left])\n          LogicalAggregate(group=[{0, 2}], agg#0=[AVG($1)])\n            LogicalProject(host=[$1], cpu_usage=[$0], $f3=[SPAN($2, 1, 'm')])\n              CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n          LogicalSort(sort0=[$1], dir0=[DESC], fetch=[3])\n            LogicalAggregate(group=[{0}], grand_total=[SUM($2)])\n              LogicalAggregate(group=[{0, 2}], agg#0=[AVG($1)])\n                LogicalProject(host=[$1], cpu_usage=[$0], $f3=[SPAN($2, 1, 'm')])\n                  CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n",

Physical Plan:

"EnumerableLimit(fetch=[10000])\n  EnumerableSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])\n    EnumerableAggregate(group=[{0, 1}], avg(cpu_usage)=[SUM($2)])\n      EnumerableCalc(expr#0..4=[{inputs}], expr#5=[IS NOT NULL($t3)], expr#6=['OTHER'], expr#7=[CASE($t5, $t0, $t6)], @timestamp=[$t1], host=[$t7], avg(cpu_usage)=[$t2])\n        EnumerableMergeJoin(condition=[=($0, $3)], joinType=[left])\n          CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[AGGREGATION->rel#1663:LogicalAggregate.NONE.[](input=RelSubset#1662,group={0, 2},agg#0=AVG($1)), SORT->[0]], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"host\":{\"terms\":{\"field\":\"host.keyword\",\"missing_bucket\":true,\"missing_order\":\"last\",\"order\":\"asc\"}}},{\"$f3\":{\"date_histogram\":{\"field\":\"@timestamp\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\",\"fixed_interval\":\"1m\"}}}]},\"aggregations\":{\"$f2\":{\"avg\":{\"field\":\"cpu_usage\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n          EnumerableSort(sort0=[$0], dir0=[ASC])\n            EnumerableLimit(fetch=[3])\n              EnumerableSort(sort0=[$1], dir0=[DESC])\n                EnumerableAggregate(group=[{0}], grand_total=[SUM($2)])\n                  CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[AGGREGATION->rel#1663:LogicalAggregate.NONE.[](input=RelSubset#1662,group={0, 2},agg#0=AVG($1)), SORT->[0]], 
OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"host\":{\"terms\":{\"field\":\"host.keyword\",\"missing_bucket\":true,\"missing_order\":\"last\",\"order\":\"asc\"}}},{\"$f3\":{\"date_histogram\":{\"field\":\"@timestamp\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\",\"fixed_interval\":\"1m\"}}}]},\"aggregations\":{\"$f2\":{\"avg\":{\"field\":\"cpu_usage\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n"

6. useother parameter

Query:

source=events_many_hosts | timechart span=1m useother=f avg(cpu_usage) by host

The useother parameter accepts t, f, true, or false. Because no limit is specified, the default limit of 10 applies, but the OTHER category is omitted because useother is set to false.

Result:

{
  "schema": [
    {
      "name": "@timestamp",
      "type": "timestamp"
    },
    {
      "name": "host",
      "type": "string"
    },
    {
      "name": "avg(cpu_usage)",
      "type": "double"
    }
  ],
  "datarows": [
    [
      "2024-07-01 00:00:00",
      "web-01",
      45.20000076293945
    ],
    [
      "2024-07-01 00:00:00",
      "web-02",
      38.70000076293945
    ],
    [
      "2024-07-01 00:00:00",
      "web-03",
      55.29999923706055
    ],
    [
      "2024-07-01 00:00:00",
      "web-04",
      42.099998474121094
    ],
    [
      "2024-07-01 00:00:00",
      "web-05",
      41.79999923706055
    ],
    [
      "2024-07-01 00:00:00",
      "web-06",
      39.400001525878906
    ],
    [
      "2024-07-01 00:00:00",
      "web-07",
      48.599998474121094
    ],
    [
      "2024-07-01 00:00:00",
      "web-08",
      44.20000076293945
    ],
    [
      "2024-07-01 00:00:00",
      "web-09",
      67.80000305175781
    ],
    [
      "2024-07-01 00:00:00",
      "web-11",
      43.099998474121094
    ]
  ],
  "total": 10,
  "size": 10
}
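
For illustration, the accepted useother values could be mapped to a boolean as in the sketch below. The class and method names are hypothetical, and the case-insensitive matching is an assumption; this description only lists the lowercase forms t, f, true, and false.

```java
import java.util.Locale;

public class UseOtherFlagSketch {
  /** Maps t/true to true and f/false to false; rejects anything else. */
  public static boolean parseUseOther(String raw) {
    // Assumption: matching ignores case and surrounding whitespace.
    switch (raw.trim().toLowerCase(Locale.ROOT)) {
      case "t":
      case "true":
        return true;
      case "f":
      case "false":
        return false;
      default:
        throw new IllegalArgumentException(
            "useother expects t, f, true, or false but got: " + raw);
    }
  }

  public static void main(String[] args) {
    System.out.println(parseUseOther("f")); // false
    System.out.println(parseUseOther("true")); // true
  }
}
```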

Logical Plan:

 "LogicalSystemLimit(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])\n  LogicalSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])\n    LogicalFilter(condition=[<>($1, 'OTHER')])\n      LogicalAggregate(group=[{0, 1}], avg(cpu_usage)=[SUM($2)])\n        LogicalProject(@timestamp=[$1], host=[CASE(IS NOT NULL($3), $0, 'OTHER')], avg(cpu_usage)=[$2])\n          LogicalJoin(condition=[=($0, $3)], joinType=[left])\n            LogicalAggregate(group=[{0, 2}], agg#0=[AVG($1)])\n              LogicalProject(host=[$1], cpu_usage=[$0], $f3=[SPAN($2, 1, 'm')])\n                CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n            LogicalSort(sort0=[$1], dir0=[DESC], fetch=[10])\n              LogicalAggregate(group=[{0}], grand_total=[SUM($2)])\n                LogicalAggregate(group=[{0, 2}], agg#0=[AVG($1)])\n                  LogicalProject(host=[$1], cpu_usage=[$0], $f3=[SPAN($2, 1, 'm')])\n                    CalciteLogicalIndexScan(table=[[OpenSearch, events]])\n",

Physical Plan:

"EnumerableLimit(fetch=[10000])\n  EnumerableSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])\n    EnumerableAggregate(group=[{0, 1}], avg(cpu_usage)=[SUM($2)])\n      EnumerableCalc(expr#0..4=[{inputs}], expr#5=[IS NOT NULL($t0)], expr#6=['OTHER'], expr#7=[CASE($t5, $t2, $t6)], @timestamp=[$t3], host=[$t7], avg(cpu_usage)=[$t4])\n        EnumerableMergeJoin(condition=[=($0, $2)], joinType=[inner])\n          EnumerableSort(sort0=[$0], dir0=[ASC])\n            EnumerableCalc(expr#0..1=[{inputs}], expr#2=[IS NOT NULL($t0)], proj#0..1=[{exprs}], $condition=[$t2])\n              EnumerableLimit(fetch=[10])\n                EnumerableSort(sort0=[$1], dir0=[DESC])\n                  EnumerableAggregate(group=[{0}], grand_total=[SUM($2)])\n                    CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[AGGREGATION->rel#584:LogicalAggregate.NONE.[](input=RelSubset#583,group={0, 2},agg#0=AVG($1)), SORT->[0]], OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"host\":{\"terms\":{\"field\":\"host.keyword\",\"missing_bucket\":true,\"missing_order\":\"last\",\"order\":\"asc\"}}},{\"$f3\":{\"date_histogram\":{\"field\":\"@timestamp\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\",\"fixed_interval\":\"1m\"}}}]},\"aggregations\":{\"$f2\":{\"avg\":{\"field\":\"cpu_usage\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n          CalciteEnumerableIndexScan(table=[[OpenSearch, events]], PushDownContext=[[FILTER-><>($1, 'OTHER'), AGGREGATION->rel#798:LogicalAggregate.NONE.[](input=RelSubset#797,group={0, 2},agg#0=AVG($1)), SORT->[0]], 
OpenSearchRequestBuilder(sourceBuilder={\"from\":0,\"size\":0,\"timeout\":\"1m\",\"query\":{\"bool\":{\"must\":[{\"exists\":{\"field\":\"host\",\"boost\":1.0}}],\"must_not\":[{\"term\":{\"host.keyword\":{\"value\":\"OTHER\",\"boost\":1.0}}}],\"adjust_pure_negative\":true,\"boost\":1.0}},\"sort\":[{\"_doc\":{\"order\":\"asc\"}}],\"aggregations\":{\"composite_buckets\":{\"composite\":{\"size\":1000,\"sources\":[{\"host\":{\"terms\":{\"field\":\"host.keyword\",\"missing_bucket\":true,\"missing_order\":\"last\",\"order\":\"asc\"}}},{\"$f3\":{\"date_histogram\":{\"field\":\"@timestamp\",\"missing_bucket\":true,\"missing_order\":\"first\",\"order\":\"asc\",\"fixed_interval\":\"1m\"}}}]},\"aggregations\":{\"$f2\":{\"avg\":{\"field\":\"cpu_usage\"}}}}}}, requestedTotalSize=2147483647, pageSize=null, startFrom=0)])\n"

Performance with Big5 Testing

source=big5 | head <size> | timechart span=1h count()

  • head 10: Avg ≈ 38 ms, P90 ≈ 43 ms
    (span=1h, 1m, 1s, avg metrics.size, count by cloud.region)
  • head 100: Avg ≈ 59 ms, P90 ≈ 69 ms
    (span=1h, 1m, 1s and count by cloud.region)
  • head 1000: Avg ≈ 74 ms, P90 ≈ 97 ms
    (includes spans 1s, 1m, 1h count queries)
  • head 10000: Avg ≈ 310 ms

source=big5 | head <size> | timechart span=1h count() by cloud.region

  • head 10: 64 ms
  • head 100: 75 ms
  • head 1000: 113 ms
  • head 10000: Avg ≈ 628 ms, P90 ≈ 630 ms

For both queries:

  • Sizes above head 10000 trip the circuit breaker. Runs with 20K to 100K were tested; each took roughly 650 to 820 ms before failing with:
Data too large, data for [_id] would be [433994215/413.8mb], which is larger than the limit of [429496729/409.5mb]]]

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Selina Song added 10 commits August 11, 2025 14:39
Signed-off-by: Selina Song <selsong@amazon.com>
@selsong force-pushed the feat/timechart_cmd branch from aabad00 to 4544e80 August 11, 2025 21:44
Selina Song added 4 commits August 11, 2025 15:52
@RyanL1997 added the PPL (Piped processing language), enhancement (New feature or request), and calcite (calcite migration related) labels Aug 13, 2025
Selina Song added 4 commits August 13, 2025 11:09
@selsong selsong marked this pull request as ready for review August 13, 2025 23:05
Selina Song added 5 commits September 3, 2025 09:28

// Handle no by field case
if (node.getByField() == null) {
  String valueFunctionName = getValueFunctionName(node.getAggregateFunction());

/** Get original text in query. */
private String getTextInQuery(ParserRuleContext ctx) {
  Token start = ctx.getStart();
  Token stop = ctx.getStop();
  return query.substring(start.getStartIndex(), stop.getStopIndex() + 1);
}

For the stats command, the name is generated directly from the query text when building the AST. Maybe we should do something similar in visitTimechartCommand.
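
To make the suggestion concrete, here is a dependency-free sketch of the same slicing idea: the column name is taken verbatim from the query text between two inclusive character offsets. The offsets are computed with indexOf as a stand-in for the parser token's start and stop indices, and the class and method names are hypothetical.

```java
public class QueryTextSketch {
  /** Returns the slice of the query between two inclusive character offsets. */
  public static String textInQuery(String query, int startIndex, int stopIndex) {
    // The stop offset is inclusive, so add 1 for substring's exclusive end.
    return query.substring(startIndex, stopIndex + 1);
  }

  public static void main(String[] args) {
    String query = "source=events | timechart span=1m avg(cpu_usage)";
    String agg = "avg(cpu_usage)";
    int start = query.indexOf(agg); // stand-in for the start token's start index
    int stop = start + agg.length() - 1; // stand-in for the stop token's stop index
    System.out.println(textInQuery(query, start, stop)); // avg(cpu_usage)
  }
}
```

Taking the name from the original text keeps the output column identical to what the user typed, including spacing and case.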


@Test
public void testExplainWithTimechart() throws IOException {
  var result = explainQueryToString("source=events | timechart span=1m avg(cpu_usage) by host");

// Use zero-filling for count aggregations, standard result for others

Please add an explain test case for count aggregation, since it actually produces a completely different plan.
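
For context on why count takes a different path, zero-filling can be sketched as below: every (bucket, by-field value) combination gets a row, with 0 where no events were counted. This is only an illustration of the idea named in the quoted code comment; the class name, method name, and cell-key format are hypothetical and do not reflect the PR's implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ZeroFillSketch {
  /** Expands sparse counts to one entry per (bucket, key) pair, using 0 for missing pairs. */
  public static Map<String, Long> zeroFill(
      List<String> buckets, List<String> keys, Map<String, Long> sparse) {
    Map<String, Long> dense = new LinkedHashMap<>();
    for (String bucket : buckets) {
      for (String key : keys) {
        String cell = bucket + " | " + key;
        // A missing cell means no events matched, so the count is 0 rather than null.
        dense.put(cell, sparse.getOrDefault(cell, 0L));
      }
    }
    return dense;
  }

  public static void main(String[] args) {
    Map<String, Long> sparse = Map.of("00:00 | web-01", 1L, "00:01 | web-02", 1L);
    zeroFill(List.of("00:00", "00:01"), List.of("web-01", "web-02"), sparse)
        .forEach((cell, count) -> System.out.println(cell + " -> " + count));
  }
}
```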

Selina Song added 3 commits September 3, 2025 13:20
// First rename the timestamp field (2nd to last) to @timestamp
List<String> fieldNames = context.relBuilder.peek().getRowType().getFieldNames();
List<String> renamedFields = new ArrayList<>(fieldNames);
renamedFields.set(fieldNames.size() - 2, "@timestamp");
@qianheng-aws qianheng-aws Sep 3, 2025

The group-by keys appear flipped because of the registry mechanism for aggregate keys in Calcite.

https://github.com/apache/calcite/blob/41e1ab8b8c18b186730b30cd2cbebffe807d86f3/core/src/main/java/org/apache/calcite/tools/RelBuilder.java#L2509-L2519

It's not actually a flip: new expressions (here, span(...)) are appended after the original expressions. So although we constructed the group-by list as [spanExpr, byField], we ended up with [byField, spanExpr].
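
The effect can be reproduced with a toy model. The sketch below is not Calcite code: it assumes only that a group key referring to an existing input column reuses that column's position, that a new expression is appended after the existing columns, and that the final group keys are emitted in column order. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class GroupKeyOrderSketch {
  /** Returns the group keys in the order a position-based key registry would emit them. */
  public static List<String> registerGroupKeys(List<String> inputColumns, List<String> groupExprs) {
    List<String> columns = new ArrayList<>(inputColumns);
    TreeSet<Integer> groupSet = new TreeSet<>(); // positions, iterated in ascending order
    for (String expr : groupExprs) {
      int idx = columns.indexOf(expr);
      if (idx < 0) {
        // Not an existing column: append it as a new projected expression.
        columns.add(expr);
        idx = columns.size() - 1;
      }
      groupSet.add(idx);
    }
    List<String> ordered = new ArrayList<>();
    for (int idx : groupSet) {
      ordered.add(columns.get(idx));
    }
    return ordered;
  }

  public static void main(String[] args) {
    List<String> input = List.of("host", "@timestamp", "cpu_usage");
    // Requested order is [spanExpr, byField] ...
    List<String> requested = List.of("SPAN(@timestamp, 1, 'm')", "host");
    // ... but the emitted order is [byField, spanExpr].
    System.out.println(registerGroupKeys(input, requested)); // [host, SPAN(@timestamp, 1, 'm')]
  }
}
```

This is why the code under review locates the timestamp column by position (second to last) instead of assuming it comes first.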

Selina Song and others added 5 commits September 3, 2025 15:49
@penghuo penghuo left a comment

Thanks @selsong!
Let's track performance issue separately.

@qianheng-aws qianheng-aws left a comment

The functionality of this PR LGTM. Thanks for your contribution! @selsong

@qianheng-aws qianheng-aws merged commit e2678a1 into opensearch-project:main Sep 4, 2025
23 checks passed
@opensearch-trigger-bot
The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-3993-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e2678a13fc9a6acc7bfced322e3cc2156a8dec91
# Push it to GitHub
git push --set-upstream origin backport/backport-3993-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-3993-to-2.19-dev.

selsong added a commit to selsong/sql that referenced this pull request Sep 4, 2025
* WIP: Support timechart grammar / AST

Signed-off-by: Selina Song <selsong@amazon.com>

* WIP: Support span=unit in timechart

Signed-off-by: Selina Song <selsong@amazon.com>

* Return correct column format after span=unit

Signed-off-by: Selina Song <selsong@amazon.com>

* sort by @timestamp, group by aggregate function

Signed-off-by: Selina Song <selsong@amazon.com>

* WIP: pivot table by field

Signed-off-by: Selina Song <selsong@amazon.com>

* Correct pivot format for by field

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix aggregation composite bucket limit

Signed-off-by: Selina Song <selsong@amazon.com>

* Add timechart.rst doc and PPLQueryAnonymizer test

Signed-off-by: Selina Song <selsong@amazon.com>

* Add ExplainIT for timechart

Signed-off-by: Selina Song <selsong@amazon.com>

* Restore reverse ExplainIT

Signed-off-by: Selina Song <selsong@amazon.com>

* Update explainIT timechart

Signed-off-by: Selina Song <selsong@amazon.com>

* spotlessApply formatting

Signed-off-by: Selina Song <selsong@amazon.com>

* format IT

Signed-off-by: Selina Song <selsong@amazon.com>

* Add limit parameter

Signed-off-by: Selina Song <selsong@amazon.com>

* Add limit=0 means no limit, show all values, no other

Signed-off-by: Selina Song <selsong@amazon.com>

* Add useother parameter

Signed-off-by: Selina Song <selsong@amazon.com>

* clean up format, fix column order

Signed-off-by: Selina Song <selsong@amazon.com>

* add test for formatter

Signed-off-by: Selina Song <selsong@amazon.com>

* Increase test coverage for formatter

Signed-off-by: Selina Song <selsong@amazon.com>

* Rename bin option expression, modify constructor

Signed-off-by: Selina Song <selsong@amazon.com>

* Make test data smaller, update IT, rst

Signed-off-by: Selina Song <selsong@amazon.com>

* add explain output to IT

Signed-off-by: Selina Song <selsong@amazon.com>

* Add limitation to rst, format, comment

Signed-off-by: Selina Song <selsong@amazon.com>

* Add test coverage formatter

Signed-off-by: Selina Song <selsong@amazon.com>

* Refactor formatter for clarity

Signed-off-by: Selina Song <selsong@amazon.com>

* fix NPE

Signed-off-by: Selina Song <selsong@amazon.com>

* add import

Signed-off-by: Selina Song <selsong@amazon.com>

* update mapping to match smaller dataset

Signed-off-by: Selina Song <selsong@amazon.com>

* update explainIT

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix explainIT, improve code structure

Signed-off-by: Selina Song <selsong@amazon.com>

* spotless

Signed-off-by: Selina Song <selsong@amazon.com>

* Support diff position parameters, count default 0 not null, update doc rst

Signed-off-by: Selina Song <selsong@amazon.com>

* fix count aggr type int

Signed-off-by: Selina Song <selsong@amazon.com>

* rename @timestamp column

Signed-off-by: Selina Song <selsong@amazon.com>

* move parameter extraction to PPLService

Signed-off-by: Selina Song <selsong@amazon.com>

* clean up PPLService

Signed-off-by: Selina Song <selsong@amazon.com>

* fix count type test, doc rst format

Signed-off-by: Selina Song <selsong@amazon.com>

* add test coverage

Signed-off-by: Selina Song <selsong@amazon.com>

* WIP: SQL query with limit useother

Signed-off-by: Selina Song <selsong@amazon.com>

* revert QueryService

Signed-off-by: Selina Song <selsong@amazon.com>

* add limit useother to SQL query

Signed-off-by: Selina Song <selsong@amazon.com>

* SQL query working, WIP column rename

Signed-off-by: Selina Song <selsong@amazon.com>

* use loadIndex in IT

Signed-off-by: Selina Song <selsong@amazon.com>

* Rename fields to match, update doc

Signed-off-by: Selina Song <selsong@amazon.com>

* revert QueryResult

Signed-off-by: Selina Song <selsong@amazon.com>

* revert gradle build

Signed-off-by: Selina Song <selsong@amazon.com>

* fix format

Signed-off-by: Selina Song <selsong@amazon.com>

* Add count fill zero, update toSQL tests, doc

Signed-off-by: Selina Song <selsong@amazon.com>

* fix nits

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix rename aggregation only no by field

Signed-off-by: Selina Song <selsong@amazon.com>

* Update SQL test

Signed-off-by: Selina Song <selsong@amazon.com>

* revert OS Exec Engine edits

Signed-off-by: Selina Song <selsong@amazon.com>

* correct OS Exec Engine revert to 5ec9603

Signed-off-by: Selina Song <selsong@amazon.com>

* restore OS Exec Engine

Signed-off-by: Selina Song <selsong@amazon.com>

* update QueryAnonymizer Test to reflect default

Signed-off-by: Selina Song <selsong@amazon.com>

* WIP: Add null=1, Other fill zero wip

Signed-off-by: Selina Song <selsong@amazon.com>

* Add doctest and update ExplainIT

Signed-off-by: Selina Song <selsong@amazon.com>

* Replace detectFieldIndices function

Signed-off-by: Selina Song <selsong@amazon.com>

* update explainIT

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix Other in zero fill case

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix OTHER in zero fill case

Signed-off-by: Selina Song <selsong@amazon.com>

* spotless format

Signed-off-by: Selina Song <selsong@amazon.com>

* update doc with null example

Signed-off-by: Selina Song <selsong@amazon.com>

* null not included in limit calc

Signed-off-by: Selina Song <selsong@amazon.com>

* update SQL test

Signed-off-by: Selina Song <selsong@amazon.com>

* update SQL test format

Signed-off-by: Selina Song <selsong@amazon.com>

* Update ExplainIT with count

Signed-off-by: Selina Song <selsong@amazon.com>

* remove unused code

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix SQL tests nullable

Signed-off-by: Selina Song <selsong@amazon.com>

* spotless format

Signed-off-by: Selina Song <selsong@amazon.com>

---------

Signed-off-by: Selina Song <selsong@amazon.com>
Signed-off-by: Selina Song <selinasong6@gmail.com>
Co-authored-by: Selina Song <selsong@amazon.com>
(cherry picked from commit e2678a1)
qianheng-aws pushed a commit that referenced this pull request Sep 5, 2025
…4232)

* Support timechart command with Calcite (#3993)

* WIP: Support timechart grammar / AST

Signed-off-by: Selina Song <selsong@amazon.com>

* WIP: Support span=unit in timechart

Signed-off-by: Selina Song <selsong@amazon.com>

* Return correct column format after span=unit

Signed-off-by: Selina Song <selsong@amazon.com>

* sort by @timestamp, group by aggregate function

Signed-off-by: Selina Song <selsong@amazon.com>

* WIP: pivot table by field

Signed-off-by: Selina Song <selsong@amazon.com>

* Correct pivot format for by field

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix aggregation composite bucket limit

Signed-off-by: Selina Song <selsong@amazon.com>

* Add timechart.rst doc and PPLQueryAnonymizer test

Signed-off-by: Selina Song <selsong@amazon.com>

* Add ExplainIT for timechart

Signed-off-by: Selina Song <selsong@amazon.com>

* Restore reverse ExplainIT

Signed-off-by: Selina Song <selsong@amazon.com>

* Update explainIT timechart

Signed-off-by: Selina Song <selsong@amazon.com>

* spotlessApply formatting

Signed-off-by: Selina Song <selsong@amazon.com>

* format IT

Signed-off-by: Selina Song <selsong@amazon.com>

* Add limit parameter

Signed-off-by: Selina Song <selsong@amazon.com>

* Add limit=0 means no limit, show all values, no other

Signed-off-by: Selina Song <selsong@amazon.com>

* Add useother parameter

Signed-off-by: Selina Song <selsong@amazon.com>

* clean up format, fix column order

Signed-off-by: Selina Song <selsong@amazon.com>

* add test for formatter

Signed-off-by: Selina Song <selsong@amazon.com>

* Increase test coverage for formatter

Signed-off-by: Selina Song <selsong@amazon.com>

* Rename bin option expression, modify constructor

Signed-off-by: Selina Song <selsong@amazon.com>

* Make test data smaller, update IT, rst

Signed-off-by: Selina Song <selsong@amazon.com>

* add explain output to IT

Signed-off-by: Selina Song <selsong@amazon.com>

* Add limitation to rst, format, comment

Signed-off-by: Selina Song <selsong@amazon.com>

* Add test coverage formatter

Signed-off-by: Selina Song <selsong@amazon.com>

* Refactor formatter for clarity

Signed-off-by: Selina Song <selsong@amazon.com>

* fix NPE

Signed-off-by: Selina Song <selsong@amazon.com>

* add import

Signed-off-by: Selina Song <selsong@amazon.com>

* update mapping to match smaller dataset

Signed-off-by: Selina Song <selsong@amazon.com>

* update explainIT

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix explainIT, improve code structure

Signed-off-by: Selina Song <selsong@amazon.com>

* spotless

Signed-off-by: Selina Song <selsong@amazon.com>

* Support diff position parameters, count default 0 not null, update doc rst

Signed-off-by: Selina Song <selsong@amazon.com>

* fix count aggr type int

Signed-off-by: Selina Song <selsong@amazon.com>

* rename @timestamp column

Signed-off-by: Selina Song <selsong@amazon.com>

* move parameter extraction to PPLService

Signed-off-by: Selina Song <selsong@amazon.com>

* clean up PPLService

Signed-off-by: Selina Song <selsong@amazon.com>

* fix count type test, doc rst format

Signed-off-by: Selina Song <selsong@amazon.com>

* add test coverage

Signed-off-by: Selina Song <selsong@amazon.com>

* WIP: SQL query with limit useother

Signed-off-by: Selina Song <selsong@amazon.com>

* revert QueryService

Signed-off-by: Selina Song <selsong@amazon.com>

* add limit useother to SQL query

Signed-off-by: Selina Song <selsong@amazon.com>

* SQL query working, WIP column rename

Signed-off-by: Selina Song <selsong@amazon.com>

* use loadIndex in IT

Signed-off-by: Selina Song <selsong@amazon.com>

* Rename fields to match, update doc

Signed-off-by: Selina Song <selsong@amazon.com>

* revert QueryResult

Signed-off-by: Selina Song <selsong@amazon.com>

* revert gradle build

Signed-off-by: Selina Song <selsong@amazon.com>

* fix format

Signed-off-by: Selina Song <selsong@amazon.com>

* Add count fill zero, update toSQL tests, doc

Signed-off-by: Selina Song <selsong@amazon.com>

* fix nits

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix rename aggregation only no by field

Signed-off-by: Selina Song <selsong@amazon.com>

* Update SQL test

Signed-off-by: Selina Song <selsong@amazon.com>

* revert OS Exec Engine edits

Signed-off-by: Selina Song <selsong@amazon.com>

* correct OS Exec Engine revert to 5ec9603

Signed-off-by: Selina Song <selsong@amazon.com>

* restore OS Exec Engine

Signed-off-by: Selina Song <selsong@amazon.com>

* update QueryAnonymizer Test to reflect default

Signed-off-by: Selina Song <selsong@amazon.com>

* WIP: Add null=1, Other fill zero wip

Signed-off-by: Selina Song <selsong@amazon.com>

* Add doctest and update ExplainIT

Signed-off-by: Selina Song <selsong@amazon.com>

* Replace detectFieldIndices function

Signed-off-by: Selina Song <selsong@amazon.com>

* update explainIT

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix Other in zero fill case

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix OTHER in zero fill case

Signed-off-by: Selina Song <selsong@amazon.com>

* spotless format

Signed-off-by: Selina Song <selsong@amazon.com>

* update doc with null example

Signed-off-by: Selina Song <selsong@amazon.com>

* null not included in limit calc

Signed-off-by: Selina Song <selsong@amazon.com>

* update SQL test

Signed-off-by: Selina Song <selsong@amazon.com>

* update SQL test format

Signed-off-by: Selina Song <selsong@amazon.com>

* Update ExplainIT with count

Signed-off-by: Selina Song <selsong@amazon.com>

* remove unused code

Signed-off-by: Selina Song <selsong@amazon.com>

* Fix SQL tests nullable

Signed-off-by: Selina Song <selsong@amazon.com>

* spotless format

Signed-off-by: Selina Song <selsong@amazon.com>

---------

Signed-off-by: Selina Song <selsong@amazon.com>
Signed-off-by: Selina Song <selinasong6@gmail.com>
Co-authored-by: Selina Song <selsong@amazon.com>
(cherry picked from commit e2678a1)

* restore parser

Signed-off-by: Selina Song <selsong@amazon.com>

---------

Signed-off-by: Selina Song <selsong@amazon.com>
Signed-off-by: Selina Song <selinasong6@gmail.com>
Co-authored-by: Selina Song <selsong@amazon.com>
@LantaoJin LantaoJin added the backport-manually Filed a PR to backport manually. label Sep 9, 2025

Labels

* backport 2.19-dev
* backport-failed
* backport-manually: Filed a PR to backport manually.
* calcite: calcite migration related
* enhancement: New feature or request
* PPL: Piped processing language

Development

Successfully merging this pull request may close these issues.

[RFC] Support timechart command in PPL

9 participants