Skip to content

MSQ window functions: Reject MVDs during window processing#17036

Merged
cryptoe merged 3 commits intoapache:masterfrom
Akshat-Jain:msq-wf-reject-mvds
Sep 23, 2024
Merged

MSQ window functions: Reject MVDs during window processing#17036
cryptoe merged 3 commits intoapache:masterfrom
Akshat-Jain:msq-wf-reject-mvds

Conversation

@Akshat-Jain
Copy link
Copy Markdown
Contributor

Description

This PR aims to reject MVDs when used as partition by columns during window function processing for MSQ engine.

Prior to this change, a MSQ query like the following was failing with ClassCastException:

select cityName, countryName, array_to_mv(array[1,length(cityName)]),
row_number() over (partition by  array_to_mv(array[1,length(cityName)]) order by countryName, cityName)
from wikipedia
where countryName in ('Austria', 'Republic of Korea') and cityName is not null
order by 1, 2, 3

This is a follow-up on #17002, where a similar thing was done for sql-native engine.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@github-actions github-actions Bot added Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Sep 12, 2024
@Akshat-Jain
Copy link
Copy Markdown
Contributor Author

Based on offline discussion, we can wait for GlueingPartitioningOperator to be used, which would make it return a proper error message. So I'll close this PR.

@Akshat-Jain Akshat-Jain reopened this Sep 19, 2024
@Akshat-Jain
Copy link
Copy Markdown
Contributor Author

We had some queries like the following giving incorrect results, because the MVDs (List object) were getting converted to string:

image

As per offline discussion with @sreemanamala, have opened this PR and have added another check in RearrangedRowsAndColumns.

This is a temporary measure and would be removed once we have the GlueingPartitioningOperator approach as mentioned previously.

@abhishekagarwal87 abhishekagarwal87 modified the milestones: 30.0.1, 31.0.0 Sep 19, 2024
Copy link
Copy Markdown
Member

@kgyrtkirk kgyrtkirk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how frequently people will run into this - but the updates which supposed to be made to this part will make this check as well go away.
Let's have it for now

{
return accessor.getObject(pointers[start + rowNum]);
Object value = accessor.getObject(pointers[start + rowNum]);
if (ColumnType.STRING.equals(getType()) && value instanceof List) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we require this check for druid 31, we can live with this "hackery". Once #17038 gets merged, please remove this check.
cc @Akshat-Jain

@cryptoe cryptoe merged commit 40414cf into apache:master Sep 23, 2024
Akshat-Jain added a commit to Akshat-Jain/druid that referenced this pull request Sep 23, 2024
)

* MSQ window functions: Reject MVDs during window processing

* MSQ window functions: Reject MVDs during window processing

* Remove parameterization from MSQWindowTest
abhishekagarwal87 pushed a commit that referenced this pull request Sep 25, 2024
…17127)

* MSQ window functions: Reject MVDs during window processing

* MSQ window functions: Reject MVDs during window processing

* Remove parameterization from MSQWindowTest
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants