Fix row schema of nested queries with ORDER BY clause to get proper row type during planning validation #13173

Closed
somu-imply wants to merge 10 commits into apache:master from somu-imply:plan_error

Conversation

@somu-imply (Contributor) commented Oct 3, 2022

Queries like

SELECT __time,channel FROM wikipedia
WHERE
(
channel in
(
SELECT  channel FROM wikipedia
))
ORDER BY __time DESC

were failing during the partial-query-generation phase. The change in this PR just makes DruidOuterQueryRel eagerly 'validate' the subquery by explaining it, so that the exception is thrown early and the 'is valid' check catches it and returns false, allowing the planner to proceed to other plans.

#12418 tried to address an issue with using an alias in an outer query with an ORDER BY clause. The root cause was that the source relation's row type was mismatched between runs of the query-explaining step: in the first run it was set to the name of the column, while in the next run it was set to the alias name. To mitigate the difference, we changed the data source of the subquery from a TableDataSource to a QueryDataSource, which introduced the defect shown here.

To solve this, we move back to the old implementation and add a check on the row type, with an additional toDruidQueryForExplain call that ensures the correct row types are set. This additional call does not noticeably impact performance, as planning time remained almost the same; a run of our planning benchmarks indicates that they remain in the same ballpark.
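As context for reviewers, here is a minimal sketch of the eager-validation idea described above, assuming Druid's existing DruidRel hooks (toDruidQueryForExplaining and CannotBuildQueryException); the exact wiring in DruidOuterQueryRel is illustrative:

// Sketch only: building the query as if for EXPLAIN forces row-type
// resolution on the subquery up front, surfacing schema mismatches early
// instead of deep inside planning.
public boolean isValidDruidQuery()
{
  try {
    toDruidQueryForExplaining();
    return true;
  }
  catch (CannotBuildQueryException e) {
    // Returning false lets the planner discard this plan and try others.
    return false;
  }
}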

Before this change

Benchmark             (query)  (rowsPerSegment)  (storageType)  (vectorize)  Mode  Cnt    Score   Error  Units
SqlBenchmark.planSql        0           5000000           mmap        force  avgt   10    1.425 ± 1.138  ms/op
SqlBenchmark.planSql        1           5000000           mmap        force  avgt   10    1.094 ± 0.070  ms/op
SqlBenchmark.planSql        2           5000000           mmap        force  avgt   10    1.220 ± 0.300  ms/op
SqlBenchmark.planSql        3           5000000           mmap        force  avgt   10    1.364 ± 0.150  ms/op
SqlBenchmark.planSql        4           5000000           mmap        force  avgt   10    2.393 ± 0.941  ms/op
SqlBenchmark.planSql        5           5000000           mmap        force  avgt   10    1.919 ± 0.198  ms/op
SqlBenchmark.planSql        6           5000000           mmap        force  avgt   10    1.884 ± 0.141  ms/op
SqlBenchmark.planSql        7           5000000           mmap        force  avgt   10    2.197 ± 0.369  ms/op
SqlBenchmark.planSql        8           5000000           mmap        force  avgt   10    3.206 ± 0.356  ms/op
SqlBenchmark.planSql        9           5000000           mmap        force  avgt   10    2.439 ± 0.245  ms/op
SqlBenchmark.planSql       10           5000000           mmap        force  avgt   10    0.981 ± 0.017  ms/op
SqlBenchmark.planSql       11           5000000           mmap        force  avgt   10    1.030 ± 0.049  ms/op
SqlBenchmark.planSql       12           5000000           mmap        force  avgt   10    0.769 ± 0.016  ms/op
SqlBenchmark.planSql       13           5000000           mmap        force  avgt   10    1.185 ± 0.113  ms/op
SqlBenchmark.planSql       14           5000000           mmap        force  avgt   10    1.240 ± 0.053  ms/op
SqlBenchmark.planSql       15           5000000           mmap        force  avgt   10    0.791 ± 0.016  ms/op
SqlBenchmark.planSql       16           5000000           mmap        force  avgt   10    0.963 ± 0.038  ms/op
SqlBenchmark.planSql       17           5000000           mmap        force  avgt   10    1.096 ± 0.008  ms/op
SqlBenchmark.planSql       18           5000000           mmap        force  avgt   10    1.289 ± 0.058  ms/op
SqlBenchmark.planSql       19           5000000           mmap        force  avgt   10  138.326 ± 5.060  ms/op
SqlBenchmark.planSql       20           5000000           mmap        force  avgt   10    0.932 ± 0.029  ms/op

After the change

Benchmark             (query)  (rowsPerSegment)  (storageType)  (vectorize)  Mode  Cnt    Score   Error  Units
SqlBenchmark.planSql        0           5000000           mmap        force  avgt   10    0.727 ± 0.060  ms/op
SqlBenchmark.planSql        1           5000000           mmap        force  avgt   10    0.714 ± 0.023  ms/op
SqlBenchmark.planSql        2           5000000           mmap        force  avgt   10    0.830 ± 0.098  ms/op
SqlBenchmark.planSql        3           5000000           mmap        force  avgt   10    1.007 ± 0.110  ms/op
SqlBenchmark.planSql        4           5000000           mmap        force  avgt   10    1.413 ± 0.219  ms/op
SqlBenchmark.planSql        5           5000000           mmap        force  avgt   10    1.356 ± 0.088  ms/op
SqlBenchmark.planSql        6           5000000           mmap        force  avgt   10    1.804 ± 0.217  ms/op
SqlBenchmark.planSql        7           5000000           mmap        force  avgt   10    1.697 ± 0.251  ms/op
SqlBenchmark.planSql        8           5000000           mmap        force  avgt   10    2.368 ± 0.486  ms/op
SqlBenchmark.planSql        9           5000000           mmap        force  avgt   10    2.430 ± 0.229  ms/op
SqlBenchmark.planSql       10           5000000           mmap        force  avgt   10    1.244 ± 0.404  ms/op
SqlBenchmark.planSql       11           5000000           mmap        force  avgt   10    0.947 ± 0.133  ms/op
SqlBenchmark.planSql       12           5000000           mmap        force  avgt   10    0.746 ± 0.064  ms/op
SqlBenchmark.planSql       13           5000000           mmap        force  avgt   10    1.485 ± 0.176  ms/op
SqlBenchmark.planSql       14           5000000           mmap        force  avgt   10    1.608 ± 0.059  ms/op
SqlBenchmark.planSql       15           5000000           mmap        force  avgt   10    0.965 ± 0.032  ms/op
SqlBenchmark.planSql       16           5000000           mmap        force  avgt   10    1.137 ± 0.072  ms/op
SqlBenchmark.planSql       17           5000000           mmap        force  avgt   10    1.346 ± 0.039  ms/op
SqlBenchmark.planSql       18           5000000           mmap        force  avgt   10    1.539 ± 0.062  ms/op
SqlBenchmark.planSql       19           5000000           mmap        force  avgt   10  154.463 ± 5.760  ms/op
SqlBenchmark.planSql       20           5000000           mmap        force  avgt   10    1.041 ± 0.024  ms/op

Separate test cases have been updated for this failing case and also for #13163.

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@somu-imply somu-imply changed the title Updating rules to handle queries correctly Updating planner to handle queries correctly Oct 3, 2022
+ ")";
final String legacyExplanation =
"DruidOuterQueryRel(query=[{\"queryType\":\"groupBy\",\"dataSource\":{\"type\":\"query\",\"query\":{\"queryType\":\"scan\",\"dataSource\":{\"type\":\"table\",\"name\":\"__subquery__\"},\"intervals\":{\"type\":\"intervals\",\"intervals\":[\"-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z\"]},\"resultFormat\":\"list\",\"granularity\":{\"type\":\"all\"}}},\"intervals\":{\"type\":\"intervals\",\"intervals\":[\"-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z\"]},\"granularity\":{\"type\":\"all\"},\"dimensions\":[],\"aggregations\":[{\"type\":\"count\",\"name\":\"a0\"}],\"limitSpec\":{\"type\":\"NoopLimitSpec\"},\"context\":{\"defaultTimeout\":300000,\"maxScatterGatherBytes\":9223372036854775807,\"sqlCurrentTimestamp\":\"2000-01-01T00:00:00Z\",\"sqlQueryId\":\"dummy\",\"vectorize\":\"false\",\"vectorizeVirtualColumns\":\"false\"}}], signature=[{a0:LONG}])\n"
"DruidOuterQueryRel(query=[{\"queryType\":\"timeseries\",\"dataSource\":{\"type\":\"table\",\"name\":\"__subquery__\"},\"intervals\":{\"type\":\"intervals\",\"intervals\":[\"-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z\"]},\"granularity\":{\"type\":\"all\"},\"aggregations\":[{\"type\":\"count\",\"name\":\"a0\"}],\"context\":{\"defaultTimeout\":300000,\"maxScatterGatherBytes\":9223372036854775807,\"sqlCurrentTimestamp\":\"2000-01-01T00:00:00Z\",\"sqlQueryId\":\"dummy\",\"vectorize\":\"false\",\"vectorizeVirtualColumns\":\"false\"}}], signature=[{a0:LONG}])\n"
@somu-imply (Contributor Author):
This comes from the change in the subquery to move to a TableDataSource from a QueryDataSource, and Calcite planning it differently.

Contributor:
This is certainly simpler. But, is it correct? Do we have a test that runs this same query, but we validate the results? The query in question is:

SELECT COUNT(*)
FROM (
  SELECT DISTINCT dim2
  FROM druid.foo
  WHERE SUBSTRING(dim2, 1, 1) IN (
    SELECT SUBSTRING(dim1, 1, 1) FROM druid.foo WHERE dim1 IS NOT NULL
  )
)

The DISTINCT requires grouping. For the new plan to be valid, we'd have to know that the Timeseries query can do the grouping. Yet Timeseries, I believe, only knows how to group on time. So the new plan may not actually produce correct results. (That said, this stuff is complex, and perhaps there is something I'm missing.)

@somu-imply (Contributor Author) commented Oct 6, 2022:
I validated the result for both plans; they returned the same count. Also, the outer query is now a timeseries, but one of the inner queries still does a group by (line 113). So the grouping and counting should be intact, IMO.

@somu-imply somu-imply marked this pull request as ready for review October 3, 2022 22:42
@paul-rogers (Contributor) left a comment:
@somu-imply, Calcite is hard, thanks for digging in to track down the issue. I wonder, however, if we've actually found the root cause as some of the changes seem a bit off.

It would be super-helpful if you could add a bit more text to the description. First, what result do you see today? Why is that wrong? What is the result we expect? Why are we not getting that result? That then sets us up to explain why your fix corrects the issue and now delivers the correct results.

);
return null;
}
if (!dataSource.isConcrete()) {
Contributor:
This is an unfortunate limitation. There is another PR to fix this limitation. I wonder, why was the check removed in this PR?

@somu-imply (Contributor Author) commented Oct 6, 2022:
This was introduced during #12418 when we moved to using a QueryDataSource. A TableDataSource always returns true for isConcrete(), so that part of the code can be avoided.
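For illustration, a small sketch of that behavior using Druid's DataSource API (someInnerQuery is a hypothetical inner query, not from this PR):

DataSource table = new TableDataSource("foo");
table.isConcrete();   // always true: a table can be queried directly

DataSource nested = new QueryDataSource(someInnerQuery); // hypothetical
nested.isConcrete();  // false: a nested query must be planned as a subquery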

// fallback for cases other than group by inner query with order by with alias on __time column
sourceRowSignature = RowSignatures.fromRelDataType(
sourceRel.getRowType().getFieldNames(),
sourceRel.getRowType()
Contributor:
Why does the sourceRel return the wrong row type? A naive reading of this says that we ask the source rel for its row type. Each rel can have only one "output" row type. Why do we get the wrong one if the source rel is a DruidRel? Is there a bug earlier in the food chain? Said another way, is the getRowType() method wrong for DruidRel?

@somu-imply (Contributor Author) commented Oct 6, 2022:
This was a case encountered during aliasing. If a query was specified with an alias, say select __time as t, dim1 as d from foo, then for the inner query the field name was changing from __time to t; for that reason the fallback was used. #12418 should have some more details.

Contributor:
If aliasing is used, Calcite should ensure that the signature of the inner SELECT provides the alias names. If not, we broke something, since this is pretty basic SQL functionality.

If, on the other hand, we want the column name __time even if the user said __time AS t, then we're relying on non-standard SQL behavior. We can unpack the alias to work back to the base column if we need to know that this is the __time column. However, we should not need to: the outer query should not, in general, special-case __time. Our implementation might, given that native queries are focused on the event use case.

Suppose we have to know that t is really __time. The proper thing to do is to check whether the column is an alias. If so, get the base column as above. It may also be possible to attach traits to the column. Columns can have traits such as their cardinality, etc. We might be able to add a trait saying the column kind (__time, dimension, measure, or whatever).
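As a rough illustration of the alias-unpacking idea (a hedged sketch in Calcite terms; the helper name and null-return convention are made up for this example, not existing Druid code):

import org.apache.calcite.rel.core.Project;
import org.apache.calcite.rex.RexInputRef;
import org.apache.calcite.rex.RexNode;

// If the aliased output of a Project is a plain input reference, map it back
// to the underlying input field name; otherwise it is a computed expression.
static String baseColumnName(Project project, int outputIndex)
{
  RexNode expr = project.getProjects().get(outputIndex);
  if (expr instanceof RexInputRef) {
    int inputIndex = ((RexInputRef) expr).getIndex();
    // e.g. "__time AS t": output "t" maps back to input field "__time"
    return project.getInput().getRowType().getFieldNames().get(inputIndex);
  }
  return null; // not a simple alias of a single input column
}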

Contributor:
Another concern. If we return as the row signature the signature of the source, we may miss computed columns:

SELECT * FROM (
  SELECT __time AS t, myCol AS c, a + b AS d FROM foo
)

Here, the signature of the inner query is (t TIMESTAMP, c VARCHAR, d BIGINT), say. If we get the source signature (based on foo), we would omit the computed column d, which is likely to cause cascading issues.

Let's be sure to test this case to validate the workaround.
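To make the contrast concrete, here is a sketch of the two derivations using Druid's RowSignatures helper (innerQueryRel and sourceTableRel are hypothetical rels for the query above):

RowSignature fromInner = RowSignatures.fromRelDataType(
    innerQueryRel.getRowType().getFieldNames(),
    innerQueryRel.getRowType()
); // (t, c, d): the computed column d is preserved

RowSignature fromSource = RowSignatures.fromRelDataType(
    sourceTableRel.getRowType().getFieldNames(),
    sourceTableRel.getRowType()
); // (__time, myCol, a, b): the computed column d is lost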

@somu-imply (Contributor Author):
Added the test.

call.transformTo(outerQueryRel);
try {
call.transformTo(outerQueryRel);
}
Contributor:
Would be super-helpful to include comments on code such as this. Calcite is hard enough to understand already: every hint helps.

This occurs in the sort rule: Sort <-- DruidRel --> DruidRel(with sort). Before, if the replacement failed, we'd just throw the exception. Here, we're catching an exception and ignoring the rewrite.

This seems a bit fishy. First, that sort has to be pushed into the query, right? We have no way to express a sort as an "operator" without it being part of a native query.

Second, waiting for it to fail, then ignoring the error, seems...odd. Should we not check whatever condition it is we want to avoid, and skip the rewrite in that case? But that raises the above question: if we don't rewrite away that sort here, when will we do it?

Point is: Calcite is really complex. Sometimes the fix is not where you first find the problem; it can be multiple steps back, where we introduced a fault.

@gianm (Contributor) commented Oct 7, 2022:
You're right to be suspicious. This try/catch is papering over some other bug. We should never be doing a call.transformTo that isn't going to work: our rules and our calls to isValidDruidQuery() are meant to be validating what is OK before we attempt to do the transformation. So, instead of doing the try/catch here, it's better to fix the bug that caused us to get to this line in the first place.
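A minimal sketch of the validate-before-transform pattern Gian describes (isValidDruidQuery() is Druid's existing check; the rule shape and the withSort composition are illustrative, not the actual DruidSortRule code):

@Override
public void onMatch(final RelOptRuleCall call)
{
  final Sort sort = call.rel(0);
  final DruidRel<?> druidRel = call.rel(1);
  final DruidRel<?> sorted = druidRel.withPartialQuery(
      druidRel.getPartialDruidQuery().withSort(sort) // hypothetical composition
  );
  if (sorted.isValidDruidQuery()) {
    call.transformTo(sorted); // only transform when we know it will plan
  }
  // otherwise: emit nothing, and the planner keeps exploring other plans
}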

@somu-imply (Contributor Author):
Thanks for your comments. I'll investigate further.

@somu-imply (Contributor Author):
Removed the try/catch since this is being fixed upstream.

+ ")";
final String legacyExplanation =
"DruidOuterQueryRel(query=[{\"queryType\":\"groupBy\",\"dataSource\":{\"type\":\"query\",\"query\":{\"queryType\":\"scan\",\"dataSource\":{\"type\":\"table\",\"name\":\"__subquery__\"},\"intervals\":{\"type\":\"intervals\",\"intervals\":[\"-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z\"]},\"resultFormat\":\"list\",\"granularity\":{\"type\":\"all\"}}},\"intervals\":{\"type\":\"intervals\",\"intervals\":[\"-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z\"]},\"granularity\":{\"type\":\"all\"},\"dimensions\":[],\"aggregations\":[{\"type\":\"count\",\"name\":\"a0\"}],\"limitSpec\":{\"type\":\"NoopLimitSpec\"},\"context\":{\"defaultTimeout\":300000,\"maxScatterGatherBytes\":9223372036854775807,\"sqlCurrentTimestamp\":\"2000-01-01T00:00:00Z\",\"sqlQueryId\":\"dummy\",\"vectorize\":\"false\",\"vectorizeVirtualColumns\":\"false\"}}], signature=[{a0:LONG}])\n"
"DruidOuterQueryRel(query=[{\"queryType\":\"timeseries\",\"dataSource\":{\"type\":\"table\",\"name\":\"__subquery__\"},\"intervals\":{\"type\":\"intervals\",\"intervals\":[\"-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z\"]},\"granularity\":{\"type\":\"all\"},\"aggregations\":[{\"type\":\"count\",\"name\":\"a0\"}],\"context\":{\"defaultTimeout\":300000,\"maxScatterGatherBytes\":9223372036854775807,\"sqlCurrentTimestamp\":\"2000-01-01T00:00:00Z\",\"sqlQueryId\":\"dummy\",\"vectorize\":\"false\",\"vectorizeVirtualColumns\":\"false\"}}], signature=[{a0:LONG}])\n"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is certainly simpler. But, is it correct? Do we have a test that runs this same query, but we validate the results? The query in question is:

SELECT COUNT(*)
FROM (
  SELECT DISTINCT dim2
  FROM druid.foo
  WHERE SUBSTRING(dim2, 1, 1) IN (
    SELECT SUBSTRING(dim1, 1, 1) FROM druid.foo WHERE dim1 IS NOT NULL
  )
)

The DISTINCT requires grouping. For the new plan to be valid, we'd have to know that the Timeseries query can do the grouping. yet, Timeseries, I believe, only knows to group on time. So, the new plan may not actually produce the correct results. (That said, this stuff is complex, and perhaps there is something I'm missing.)

public void testOrderByAlongWithInternalScanQuery()
{
testQuery(
"select __time as t, m1 from druid.foo where (m1 in (select distinct m1 from druid.foo)) order by 1 limit 1",
Contributor:
Thanks for the new test. The PR description provides a different query. However, I see no test for that specific query. Does such a test already exist? If so, then this change should have required a change to the expected results. Else, should we add a test for the actual case that this PR claims to fix?

@somu-imply (Contributor Author):
I'll add a test mimicking the actual query.

@paul-rogers (Contributor):

One other thought: the title of this PR is a bit generic. This PR handles a very specific use case. Part of the challenge of reviewing is the need to reverse-engineer the use case from the code. Would be super-helpful for the title to say, "Fix row schema for nested queries with ..." or whatever the specific issue is.

@somu-imply changed the title from "Updating planner to handle queries correctly" to "Fix row schema of nested queries to get proper row type during planning validation" Oct 6, 2022
@abhishekagarwal87 (Contributor):

It will likely fix #13163

@abhishekagarwal87 (Contributor):

@somu-imply - I looked at that benchmark. None of those queries seem to have any complex subquery. It would be better to hand-craft a couple of nested group-by queries that are going to hit the code path you changed. The nested group-by can have a large IN filter (~1000 values). That way we will know how much time is getting added to planning.
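For example, a benchmark query along those lines could be hand-built like this (a sketch only; the query text is illustrative, not an actual SqlBenchmark entry):

// Nested group-by with a large inline IN filter (~1000 values).
final StringBuilder sql = new StringBuilder(
    "SELECT dimSequential, COUNT(*) FROM foo WHERE dimSequential IN (");
for (int i = 0; i < 1000; i++) {
  if (i > 0) {
    sql.append(", ");
  }
  sql.append("'").append(i).append("'");
}
sql.append(") GROUP BY 1");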

@somu-imply (Contributor Author):

Will do so, @abhishekagarwal87. Based on comments from Paul and Gian, I'll look into fixing the DruidRel so that the try/catch can be avoided.

@paul-rogers (Contributor):

@somu-imply, thanks much for the detailed explanation! It strikes me that the problem is in how we validate queries. We really should not have to do an artificial EXPLAIN to get the right results. The way it should work is:

  • Calcite validates the query. This will check all types, etc.
  • Convert the query into a logical plan (the Rel nodes)
  • Resolve the type (i.e. row schema) for each Rel
  • Convert the Rel nodes into a native query.

As a result, every step should have a correct row type (schema). If not, all heck breaks loose, the same as if we randomly changed some numbers in an algorithm and expected correct answers.

If we have to jimmy the code to get the right answers, it means something is wrong somewhere in that list above. Unfortunately, Calcite is horribly complex. (Not Calcite's fault, all SQL planners are horribly complex.) So, it is a real PITA to find where things go wrong.

As a result, we may have to do a hack just to keep things moving. But, the need for the hack suggests there are deeper issues, and we should track those down at some point, else they'll just pop up again elsewhere.
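For reference, the steps above sketched with raw Calcite planner calls (these are real Calcite Planner APIs, but Druid's own planner wraps them, so the wiring here is illustrative):

Planner planner = Frameworks.getPlanner(config); // config: a hypothetical FrameworkConfig
SqlNode parsed = planner.parse(sql);             // parse the SQL text
SqlNode validated = planner.validate(parsed);    // Calcite validates types
RelRoot root = planner.rel(validated);           // logical plan (the Rel nodes)
RelDataType rowType = root.rel.getRowType();     // each Rel resolves its row type
// Druid-specific rules then convert the Rel tree into a native query.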

@somu-imply (Contributor Author) commented Oct 12, 2022:

@abhishekagarwal87 I have updated the SqlBenchmarks with 2 additional cases which hit the new code in DruidOuterQueryRel. These cases have inner queries with large IN filters (~1000 values) based on

("dimSequential", ValueType.STRING, false, 1, null, 0, 1000),

I tested the last working build (from before 2022.04) and the current build with these changes.

Before

Benchmark             (query)  (rowsPerSegment)  (vectorize)  Mode  Cnt  Score   Error  Units
SqlBenchmark.planSql       21           5000000        false  avgt    5  2.819 ± 0.142  ms/op
SqlBenchmark.planSql       22           5000000        false  avgt    5  2.920 ± 0.306  ms/op

After

Benchmark             (query)  (rowsPerSegment)  (storageType)  (vectorize)  Mode  Cnt  Score   Error  Units
SqlBenchmark.planSql       21           5000000           mmap        false  avgt    5  3.261 ± 0.299  ms/op
SqlBenchmark.planSql       22           5000000           mmap        false  avgt    5  3.215 ± 0.236  ms/op

There does not seem to be much increase in planning time. I also added another nested query with a group by, which had a similar planning time:

SqlBenchmark.planSql       23           5000000           mmap        force  avgt    5  3.627 ± 0.910  ms/op

//23: Group by and Order by with alias with large in filter nested query
"SELECT __time as t, dimSequential from foo "
+ " where dimSequential in (select dimSequential from foo where "
+ " dimSequential in (select dimSequential from foo)) "
Contributor:
A large foo table here does not affect planning time. But a filter with inline data like dimSequential in ('1', '2', '3' .. '100') is what we know does increase the planning time.

@somu-imply (Contributor Author):
Thanks @abhishekagarwal87, I have updated the benchmark tests by adding 2 queries with ~1000 values in their IN filters.

Before (using 0.22.0 as baseline)

Benchmark             (query)  (rowsPerSegment)  (vectorize)  Mode  Cnt    Score    Error  Units
SqlBenchmark.planSql       24           5000000        false  avgt    5  189.154 ± 36.029  ms/op
SqlBenchmark.planSql       25           5000000        false  avgt    5  135.821 ± 22.867  ms/op

After (this PR's build)

Benchmark             (query)  (rowsPerSegment)  (storageType)  (vectorize)  Mode  Cnt    Score    Error  Units
SqlBenchmark.planSql       24           5000000           mmap        false  avgt    5  229.036 ± 49.337  ms/op
SqlBenchmark.planSql       25           5000000           mmap        false  avgt    5  145.159 ± 28.671  ms/op

There has been an increase of ~40 ms for the query with ~1000 IN filter values, whereas for the other nested query the increase has been ~10 ms. Please advise whether this much change is acceptable.

@somu-imply (Contributor Author):

@paul-rogers I have removed the dubious try/catch, which was trying to patch things, since the row type should be set correctly before the code reaches there. But doing additional work in DruidOuterQueryRel leads to an increase in planning time. Travis passed on the last run, and my last commit only updated benchmarks, so I expect Travis to pass again.

@somu-imply changed the title from "Fix row schema of nested queries to get proper row type during planning validation" to "Fix row schema of nested queries with ORDER BY clause to get proper row type during planning validation" Oct 12, 2022
@somu-imply (Contributor Author) commented Mar 22, 2023:

Closing this in favor of #13965, which fixes the planning errors on the most up-to-date code. All the tests added here are also part of that PR.

@somu-imply somu-imply closed this Mar 22, 2023
Development

Successfully merging this pull request may close these issues:

Join queries with order by clause doesn't work