fix(drilling): drill by pagination works with MSSQL data source via db_engine_spec#34583
fix(drilling): drill by pagination works with MSSQL data source via db_engine_spec#34583sfirke wants to merge 11 commits into
Conversation
There was a problem hiding this comment.
Review by Korbit AI
Korbit automatically attempts to detect when you fix issues in new commits.
| Category | Issue | Status |
|---|---|---|
| Missing OFFSET verification before query adjustment ▹ view | ✅ Fix detected | |
| Inefficient SQLAlchemy Query Inspection ▹ view | ✅ Fix detected | |
| Non-Deterministic ORDER BY ▹ view | ✅ Fix detected |
Files scanned
| File Path | Reviewed |
|---|---|
| superset/db_engine_specs/mssql.py | ✅ |
| superset/models/core.py | ✅ |
Explore our documentation to understand the languages and file types we support and the files we ignore.
Check out our docs on how you can make Korbit work best for you and your team.
| # Try to access the internal query structure safely | ||
| has_offset = getattr(qry, "_offset", None) is not None | ||
| has_order_by = getattr(qry, "_order_by", None) is not None |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
| # Add a default ORDER BY clause using a string literal | ||
| # that works with window functions | ||
| # This preserves data order without permutation | ||
| qry = qry.order_by(cast(literal("1"), String)) |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
| if hasattr(self.db_engine_spec, "adjust_query_for_offset"): | ||
| qry = self.db_engine_spec.adjust_query_for_offset(qry) |
This comment was marked as resolved.
This comment was marked as resolved.
Sorry, something went wrong.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #34583 +/- ##
===========================================
+ Coverage 0 72.88% +72.88%
===========================================
Files 0 572 +572
Lines 0 41415 +41415
Branches 0 4365 +4365
===========================================
+ Hits 0 30186 +30186
- Misses 0 10074 +10074
- Partials 0 1155 +1155
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull Request Overview
Fixes issue #24072 where drill-by pagination fails with MSSQL data sources by adding ORDER BY clauses when OFFSET is used without ORDER BY. MSSQL requires ORDER BY clauses with OFFSET operations, which was causing pagination to fail in drill-through functionality.
- Adds
adjust_query_for_offsetmethod to MSSQL engine spec that automatically adds ORDER BY when needed - Modifies
compile_sqla_queryin Database model to call the adjustment method before query compilation - Implements comprehensive test coverage for both the new MSSQL functionality and the database compilation integration
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| superset/db_engine_specs/mssql.py | Adds adjust_query_for_offset method to handle OFFSET without ORDER BY by adding appropriate ordering |
| superset/models/core.py | Modifies compile_sqla_query to call engine spec's adjust_query_for_offset when query has OFFSET |
| tests/unit_tests/db_engine_specs/test_mssql.py | Comprehensive tests for the new MSSQL adjust_query_for_offset method covering various scenarios |
| tests/unit_tests/models/core_test.py | Tests for the database compilation integration to ensure the adjustment method is called appropriately |
| qry = qry.order_by(first_column) | ||
| else: | ||
| # Fallback to literal if no columns are available | ||
| qry = qry.order_by(cast(literal("1"), String)) |
There was a problem hiding this comment.
Using a cast literal for ordering is unusual and may cause performance issues. Consider using a more standard approach like text('1') or ordering by a constant expression that doesn't require casting.
| qry = qry.order_by(cast(literal("1"), String)) | |
| qry = qry.order_by(literal(1)) |
| if qry.selected_columns: | ||
| first_column = list(qry.selected_columns)[0] |
There was a problem hiding this comment.
The condition qry.selected_columns may not work as expected for all SQLAlchemy Select objects. This attribute might not exist or behave consistently across different SQLAlchemy versions. Consider using qry.columns or a more robust method to check for selected columns.
| if qry.selected_columns: | |
| first_column = list(qry.selected_columns)[0] | |
| # Try to get the selected columns in a version-independent way | |
| selected_columns = ( | |
| getattr(qry, "selected_columns", None) | |
| or getattr(qry, "exported_columns", None) | |
| or getattr(qry, "_raw_columns", None) | |
| ) | |
| if selected_columns: | |
| first_column = list(selected_columns)[0] |
| if qry.selected_columns: | ||
| first_column = list(qry.selected_columns)[0] | ||
| qry = qry.order_by(first_column) | ||
| else: |
There was a problem hiding this comment.
This was AI's idea. If someone feels confident that columns will always be available in this context - seems likely to me - we could remove the if/else and this backup casting of 1.
|
Closed in favor of #34724 which just merged |
SUMMARY
This is a new attempt at fixing #24072 . It takes a different approach than my first attempt, #27442. This time it makes the ordering change in the Microsoft SQL Server (mssql) spec rather than for all engines.
I checked that this solves the problem in my environment. I would like input on whether my change to
compile_sqla_queryis correct.TESTING INSTRUCTIONS
Deploy to an environment with SQL Server data warehouse, try drilling and flipping through the pages. It works for me, if needed I can produce a video.
ADDITIONAL INFORMATION