fix(drilling): drill by pagination works with MSSQL data source#27442

Closed
sfirke wants to merge 11 commits into apache:master from sfirke:mssql-drill-offset

@sfirke
Member

@sfirke sfirke commented Mar 8, 2024

SUMMARY

Fixes #24072, in which Microsoft SQL Server requires an ORDER BY column for pagination to succeed. As a side effect, the drilled data is now always sorted by the first column in ascending order; previously it retained the dataset's sort order.

Better fixes outside of my ability might be:

  • Implement this on a per-database level
  • Sort by row number or the equivalent to functionally do nothing and preserve the underlying data order
  • Improve user controls for how drilled data displays, allowing the user to control which columns are shown and what the sort order is

But I think the harm introduced (re-sorting the drilled data) is outweighed by fixing this feature for MSSQL.
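
To illustrate the constraint described above, here is a minimal sketch of the kind of paginated query SQL Server accepts. `build_mssql_page_query` is a hypothetical helper written for this example (it is not part of Superset), and the table and column names are placeholders. T-SQL only allows `OFFSET ... FETCH` after an `ORDER BY` clause, so the sketch falls back to the first column in ascending order when no ordering is given, mirroring the approach in this PR.

```python
from typing import Optional, Sequence


def build_mssql_page_query(
    table: str,
    columns: Sequence[str],
    page: int,
    page_size: int,
    order_by: Optional[str] = None,
) -> str:
    """Build a T-SQL query for one zero-indexed page of rows.

    Hypothetical helper for illustration only: identifiers are assumed
    to be trusted and are neither quoted nor escaped.
    """
    if not columns:
        raise ValueError("at least one column is required")
    # OFFSET ... FETCH is only valid after ORDER BY in T-SQL, so default
    # to the first column, ascending, when the caller gives no ordering.
    order_clause = order_by if order_by is not None else f"{columns[0]} ASC"
    offset = page * page_size
    return (
        f"SELECT {', '.join(columns)} FROM {table} "
        f"ORDER BY {order_clause} "
        f"OFFSET {offset} ROWS FETCH NEXT {page_size} ROWS ONLY"
    )


# Second page (page index 1) of 50 rows, sorted by the first column.
print(build_mssql_page_query("birth_names", ["name", "ds"], page=1, page_size=50))
```

Passing `order_by="(SELECT NULL)"` uses a common T-SQL idiom that satisfies the syntax requirement without imposing a sort, which is roughly the "functionally do nothing" option listed above. The trade-off is that row order is then not guaranteed to be stable from one page to the next.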

TESTING INSTRUCTIONS

  • CI
  • Sam deployed and manually tested the docker image

BEFORE (4.0.0rc1)

Drilling.in.4.0.0.take.2-20240313_160744-Meeting.Recording.mp4

AFTER

Post-Patch.MSSQL.Drilling-20240313_160519-Meeting.Recording.mp4

ADDITIONAL INFORMATION

Description by Korbit AI

What change is being made?

Update the drill-to-detail functionality to ensure drill by pagination works with MSSQL data sources, including adjustments to test expectations and query ordering.

Why are these changes being made?

These changes fix an issue where the drill feature did not work correctly with MSSQL due to ordering issues, by integrating proper ordering logic in the query and aligning test data with the new expected behavior. This approach ensures compatibility with additional data sources such as MSSQL, while maintaining existing functionality.

@sfirke sfirke changed the title from "fix(drilling): drill by pagination works with MSSQL data source" to "DON'T MERGE - fix(drilling): drill by pagination works with MSSQL data source" Mar 8, 2024
@codecov

codecov Bot commented Mar 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.69%. Comparing base (ad9024b) to head (2e580d6).
Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #27442      +/-   ##
==========================================
+ Coverage   67.39%   69.69%   +2.30%     
==========================================
  Files        1909     1909              
  Lines       74744    74744              
  Branches     8327     8327              
==========================================
+ Hits        50371    52095    +1724     
+ Misses      22323    20599    -1724     
  Partials     2050     2050              
Flag Coverage Δ
javascript 57.22% <ø> (ø)
mysql 78.03% <100.00%> (ø)
postgres 78.14% <100.00%> (ø)
presto 53.69% <50.00%> (?)
python 83.06% <100.00%> (+4.77%) ⬆️
sqlite 77.57% <100.00%> (ø)
unit 56.68% <50.00%> (?)

@john-bodley
Member

@sfirke once this is ready for review would you mind adding the relevant reviewers?

@michael-s-molina
Member

michael-s-molina commented Mar 12, 2024

@sfirke the CI error you're getting

superset/common/query_actions.py:156: error: List item 0 has incompatible type "Tuple[Union[AdhocColumn, str], bool]"; expected "Tuple[Union[AdhocMetric, str], bool]" [list-item]

is because you're passing a Column here

query_obj.orderby = [(query_obj.columns[0], True)]

instead of a Metric which is what OrderBy expects.

Column = Union[AdhocColumn, str]
Metric = Union[AdhocMetric, str]
OrderBy = tuple[Metric, bool]
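
To make the mismatch concrete, here is a minimal, self-contained sketch. The `TypedDict` classes are simplified stand-ins for Superset's real `AdhocColumn` and `AdhocMetric`, and both the widened `OrderBy` alias and the `first_column_orderby` helper are hypothetical: widening the alias is only one possible way to satisfy the type checker, not necessarily what this PR ended up doing.

```python
from typing import TypedDict, Union


class AdhocColumn(TypedDict, total=False):
    """Simplified stand-in for Superset's AdhocColumn (assumed fields)."""
    label: str
    sqlExpression: str


class AdhocMetric(TypedDict, total=False):
    """Simplified stand-in for Superset's AdhocMetric (assumed fields)."""
    label: str
    expressionType: str


Column = Union[AdhocColumn, str]
Metric = Union[AdhocMetric, str]

# Hypothetical widened alias: accept either a metric or a column as the
# sort key, so that (column, ascending) pairs type-check.
OrderBy = tuple[Union[Metric, Column], bool]


def first_column_orderby(columns: list[Column]) -> list[OrderBy]:
    """Order by the first column, ascending; no ordering if there are no columns."""
    return [(columns[0], True)] if columns else []


print(first_column_orderby(["name", "ds"]))
```

With the original `OrderBy = tuple[Metric, bool]`, a type checker rejects the returned pair because a `Column` is not a `Metric`, even though plain strings pass at runtime.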

Comment thread superset/common/query_actions.py Outdated
@sfirke sfirke changed the title from "DON'T MERGE - fix(drilling): drill by pagination works with MSSQL data source" to "fix(drilling): drill by pagination works with MSSQL data source" Mar 13, 2024
@sfirke sfirke changed the title from "fix(drilling): drill by pagination works with MSSQL data source" to "fix(drilling): drill by pagination works with MSSQL data source (still testing)" Mar 13, 2024
@sfirke sfirke marked this pull request as ready for review March 13, 2024 16:37
@sfirke sfirke changed the title from "fix(drilling): drill by pagination works with MSSQL data source (still testing)" to "fix(drilling): drill by pagination works with MSSQL data source" Mar 13, 2024

@sfirke
Member Author

sfirke commented Mar 15, 2024

I'm stuck, it's unclear to me why this test is failing:

 =================================== FAILURES ===================================
___________ TestPostChartDataApi.test_chart_data_applied_time_extras ___________

self = <tests.integration_tests.charts.data.api_tests.TestPostChartDataApi testMethod=test_chart_data_applied_time_extras>

    @pytest.mark.usefixtures("load_birth_names_dashboard_with_slices")
    def test_chart_data_applied_time_extras(self):
        """
        Chart data API: Test chart data query with applied time extras
        """
        self.query_context_payload["queries"][0]["applied_time_extras"] = {
            "__time_range": "100 years ago : now",
            "__time_origin": "now",
        }
        rv = self.post_assert_metric(CHART_DATA_URI, self.query_context_payload, "data")
>       self.assertEqual(rv.status_code, 200)
E       AssertionError: 400 != 200

And why is test-postgres-presto passing but test-postgres-hive not? Update: Is that test just broken on master branch right now? I see it failed on #27520 😭

@github-actions github-actions Bot added the api, doc, plugins, and github_actions labels Mar 21, 2024
@sfirke sfirke closed this Mar 21, 2024
@sfirke sfirke reopened this Mar 21, 2024
@github-actions github-actions Bot removed the api, doc, plugins, and github_actions labels Mar 21, 2024
@sfirke sfirke force-pushed the mssql-drill-offset branch from c3f714a to c638709 Compare March 21, 2024 14:16
@rusackas
Member

Looks like this is pretty stuck. Maybe it needs a rebase? I'll convert it to draft for now, but mark it as ready if/when you think it might be.

@rusackas
Member

/korbit-review

@rusackas rusackas marked this pull request as draft March 20, 2025 01:24

@korbit-ai korbit-ai Bot left a comment

I've completed my review and didn't find any issues.

Suppressed issues based on your team's Korbit activity

This issue: line 182: The code assumes query_obj.columns will always have at least one element, which could lead to an IndexError if the columns list is empty.
Is similar to: Missing empty array check for queriesResponse
Because: Similar issues were not addressed in the past

When you react to issues (for example, an upvote or downvote) or you fix them, Korbit will tune future reviews based on these signals.

Files scanned
File Path Reviewed
superset/superset_typing.py
superset/common/query_actions.py

Explore our documentation to understand the languages and file types we support and the files we ignore.

Need a new review? Comment /korbit-review on this PR and I'll review your latest changes.

Korbit Guide: Usage and Customization

Interacting with Korbit

  • You can manually ask Korbit to review your PR using the /korbit-review command in a comment at the root of your PR.
  • You can ask Korbit to generate a new PR description using the /korbit-generate-pr-description command in any comment on your PR.
  • Too many Korbit comments? I can resolve all my comment threads if you use the /korbit-resolve command in any comment on your PR.
  • On any given comment that Korbit raises on your pull request, you can have a discussion with Korbit by replying to the comment.
  • Help train Korbit to improve your reviews by giving a 👍 or 👎 on the comments Korbit posts.

Customizing Korbit

  • Check out our docs on how you can make Korbit work best for you and your team.
  • Customize Korbit for your organization through the Korbit Console.

Current Korbit Configuration

General Settings
Setting Value
Review Schedule Automatic excluding drafts
Max Issue Count 10
Automatic PR Descriptions
Issue Categories
Category Enabled
Documentation
Logging
Error Handling
Readability
Design
Performance
Security
Functionality

Feedback and Support

Note

Korbit Pro is free for open source projects 🎉

Looking to add Korbit to your team? Get started with a free 2 week trial here

@sfirke
Member Author

sfirke commented Mar 20, 2025 via email

@korbit-ai

korbit-ai Bot commented Mar 20, 2025

@sfirke I am looking at your pull request. The description will be updated shortly. In the meantime, please do not edit the description until I have finished writing mine.


@korbit-ai korbit-ai Bot left a comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Functionality Unsafe Column List Access ▹ view
Files scanned
File Path Reviewed
superset/superset_typing.py
superset/common/query_actions.py

        else:
            qry_obj_cols.append(o.column_name)
    query_obj.columns = qry_obj_cols
    query_obj.orderby = [(query_obj.columns[0], True)]

Unsafe Column List Access (category: Functionality)

What is the issue?

The code assumes query_obj.columns is non-empty when setting the orderby clause, which could lead to an IndexError if columns list is empty.

Why this matters

If no columns are present in the datasource or query object, accessing index 0 will crash the application with an IndexError.

Suggested change ∙ Feature Preview

Add a guard clause to check for columns before setting the orderby:

if query_obj.columns:
    query_obj.orderby = [(query_obj.columns[0], True)]
else:
    query_obj.orderby = []

@rusackas
Member

@sfirke is this still WIP?

@sfirke
Member Author

sfirke commented Jul 25, 2025

@rusackas yes, alas. I will still get to this sometime, but not imminently.

@sfirke
Member Author

sfirke commented Aug 6, 2025

@rusackas I have superseded this pull request with a new version, #34583, that is passing CI. It needs a review but maybe we can get this into 6.0 and 5.0.1!

Development

Successfully merging this pull request may close these issues.

The second page of drill-to-detail with MSSQL data source errors "requires an order_by when using an OFFSET or a non-simple LIMIT clause"

6 participants