chore(native-filters): Fetch only the required dataset fields by john-bodley · Pull Request #23303 · apache/superset

john-bodley · 2023-03-07T21:22:30Z

SUMMARY

This PR partially addresses recommendation #1 in #14383 by requesting, via the FAB API, only the required fields (columns) from the /api/v1/dataset/{pk} RESTful GET endpoint.

Fetching the entirety of a dataset (especially a very large one) can be rather expensive. #22413 made significant performance improvements (reducing the response time from > 300 seconds to ~ 5 seconds); however, #23113 was later introduced to address some performance tradeoffs (in other contexts) by disabling eager loading, which increased the response time from ~ 5 seconds to ~ 30 seconds. The bulk of this increased cost came from the repeated lazy loading of the back-referenced table, i.e., self.table, for every dataset column (even though this is always the same table; possibly a SQLAlchemy quirk) for a subset of the TableColumn fields, including the advanced data type.

This PR updates, in the context of the native filters modal, the /api/v1/dataset/{pk} requests to fetch only the subset of columns required in the response. This helps to ensure that we're:

- Not building overly complex SQLAlchemy queries on the backend.
- Reducing the response payload.

For the dataset in question, this change reduced the response time from ~ 30 seconds to ~ 3 seconds, which significantly improves the UX when interfacing with the modal.
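
As a rough illustration of the field projection described in the summary, the sketch below builds a dataset endpoint with a Rison-style `q` parameter. The diff itself calls `rison.encode`; `encodeFieldSelection` here is a simplified, hypothetical stand-in that only handles identifier-like strings so the example stays self-contained. The field list is an illustrative subset taken from the review comments, not necessarily the full list in the PR, and `buildDatasetEndpoint` and the dataset id `42` are made up for the example.

```typescript
// Simplified stand-in for rison.encode (assumption: NOT the real library).
// It only supports an object whose values are arrays of identifier-like
// strings, which is the shape used for the `columns` projection.
function encodeFieldSelection(selection: { [key: string]: string[] }): string {
  const safeId = /^[A-Za-z_][A-Za-z0-9_.]*$/;
  const parts = Object.keys(selection).map(key => {
    const values = selection[key];
    [key].concat(values).forEach(value => {
      if (!safeId.test(value)) {
        throw new Error('Unsupported value for this simplified encoder: ' + value);
      }
    });
    // Rison writes arrays as !(a,b,c) and objects as (key:value).
    return key + ':!(' + values.join(',') + ')';
  });
  return '(' + parts.join(',') + ')';
}

// Illustrative subset of fields mentioned in the review thread.
const requiredFields: string[] = [
  'columns.is_dttm',
  'columns.type_generic',
  'columns.verbose_name',
  'database.id',
  'database.database_name',
];

// Hypothetical helper: builds the GET endpoint with the projection applied.
function buildDatasetEndpoint(datasetId: number, fields: string[]): string {
  return '/api/v1/dataset/' + datasetId + '?q=' + encodeFieldSelection({ columns: fields });
}

const endpoint = buildDatasetEndpoint(42, requiredFields);
```

With the real `rison` library the encoded query should look much the same for inputs like these, since dotted names are valid unquoted Rison identifiers.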

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

CI.

ADDITIONAL INFORMATION

Has associated issue: [SIP-64] Migrate filter_box to Dashboard Native Filter Component #14383
Required feature flags:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
- Migration is atomic, supports rollback & is backwards-compatible
- Confirm DB migration upgrade and downgrade tested
- Runtime estimates and downtime expectations provided
Introduces new feature or API
Removes existing feature or API

codecov · 2023-03-07T21:30:21Z

Codecov Report

Merging #23303 (2fd632d) into master (da3791a) will increase coverage by 9.66%.
The diff coverage is 78.61%.

❗ Current head 2fd632d differs from pull request most recent head a1c5cf6. Consider uploading reports for the commit a1c5cf6 to get more accurate results

@@            Coverage Diff             @@
##           master   #23303      +/-   ##
==========================================
+ Coverage   56.27%   65.93%   +9.66%     
==========================================
  Files        1907     1907              
  Lines       73495    73590      +95     
  Branches     7977     7982       +5     
==========================================
+ Hits        41356    48524    +7168     
+ Misses      30091    23018    -7073     
  Partials     2048     2048

Flag	Coverage Δ
javascript	`53.77% <78.46%> (-0.03%)`	⬇️
unit	`?`

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
...chart-echarts/src/Timeseries/Area/controlPanel.tsx	`40.00% <ø> (ø)`
...charts/src/Timeseries/Regular/Bar/controlPanel.tsx	`35.71% <ø> (ø)`
...harts/src/Timeseries/Regular/Line/controlPanel.tsx	`33.33% <ø> (ø)`
...ts/src/Timeseries/Regular/Scatter/controlPanel.tsx	`40.00% <ø> (ø)`
...src/Timeseries/Regular/SmoothLine/controlPanel.tsx	`40.00% <ø> (ø)`
...chart-echarts/src/Timeseries/Step/controlPanel.tsx	`33.33% <ø> (ø)`
...s/plugin-chart-echarts/src/Timeseries/constants.ts	`100.00% <ø> (ø)`
...gin-chart-echarts/src/Timeseries/transformProps.ts	`57.14% <0.00%> (-1.81%)`	⬇️
...tersConfigModal/FiltersConfigForm/ColumnSelect.tsx	`77.14% <ø> (ø)`
...onfigModal/FiltersConfigForm/FiltersConfigForm.tsx	`55.17% <ø> (-4.49%)`	⬇️
... and 20 more

... and 341 files with indirect coverage changes

john-bodley · 2023-03-08T18:41:40Z

Closing (for now) in favor of #23314.

michael-s-molina

Thank you for the PR @john-bodley. I left some comments.

michael-s-molina · 2023-03-16T13:39:07Z

    if (datasetId != null) {
      cachedSupersetGet({
-        endpoint: `/api/v1/dataset/${datasetId}`,
+        endpoint: `/api/v1/dataset/${datasetId}?q=${rison.encode({


The columns state is later used to get currentColumn which is then passed to the filterValues function. If you check filterValues references, you will see that is_dttm and type_generic attributes are also used.

michael-s-molina · 2023-03-16T14:38:51Z

      cachedSupersetGet({
-        endpoint: `/api/v1/dataset/${datasetId}`,
+        endpoint: `/api/v1/dataset/${datasetId}?q=${rison.encode({
+          columns: [


I found other attributes that need to be queried when analyzing the following files:

superset/superset-frontend/src/explore/components/controls/FilterControl/AdhocFilterControl/index.jsx

Line 138 in da3791a

const {

database.id, datasource_name, schema, is_sqllab_view

superset/superset-frontend/src/explore/components/controls/FilterControl/AdhocFilterEditPopoverSimpleTabContent/index.tsx

Line 121 in da3791a

const isColumnBoolean =

columns.type

superset/superset-frontend/packages/superset-ui-chart-controls/src/utils/getTemporalColumns.ts

Line 43 in da3791a

if (isDataset(datasource)) {

columns.is_dttm, columns.name, main_dttm_col

superset/superset-frontend/packages/superset-ui-chart-controls/src/types.ts

Line 482 in 0c454c6

return !!datasource && 'results' in datasource && 'sql' in datasource;

results sql

@michael-s-molina it seems like results isn't an attribute returned per here.

@john-bodley You're right. Given that results is not listed in show_select_columns, I'm assuming that it's populated elsewhere and your projections won't affect this behavior. Tagging @zhaoyongjie and @villebro in case they know something different.

Agreed, I think this is needed in the Explore control panel only, hence shouldn't affect dashboards/native filters.

michael-s-molina · 2023-03-17T12:27:21Z

@villebro @zhaoyongjie Can you double check if there are no missing attributes or invalid projection names?

villebro

Tested to work as expected and LGTM after @michael-s-molina's database_name comment has been addressed.

villebro · 2023-03-17T12:42:04Z

+            'columns.is_dttm',
+            'columns.type_generic',


I'm not 100 % sure on this (more like 90 %), but I feel like is_dttm is no longer necessary, and type_generic === GenericDataType.TEMPORAL should always be true for any col that has is_dttm === true, and is the preferred way of checking for temporal columns in the frontend. So if we want to reduce payload size, this is kind of a low hanging fruit.

@villebro thanks for the review. If your hypothesis is correct then it seems like a follow-up PR would be to deprecate the is_dttm column everywhere.
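
One way to probe the hypothesis above before dropping `is_dttm` might be to compare the two signals over a dataset's columns. The sketch below is only illustrative: `GenericDataType` is a local stand-in for the frontend enum of the same name (the numeric values are assumptions), and `ColumnLike`, `isTemporal`, and `findTemporalMismatches` are hypothetical names introduced for the example.

```typescript
// Local stand-in for the frontend's GenericDataType enum.
// The numeric values are assumptions made for this sketch.
enum GenericDataType {
  NUMERIC = 0,
  STRING = 1,
  TEMPORAL = 2,
  BOOLEAN = 3,
}

// Hypothetical minimal column shape; only the fields discussed in the thread.
interface ColumnLike {
  name: string;
  is_dttm?: boolean;
  type_generic?: GenericDataType;
}

// The check the reviewer describes as preferred in the frontend.
function isTemporal(column: ColumnLike): boolean {
  return column.type_generic === GenericDataType.TEMPORAL;
}

// Returns the names of columns where is_dttm and type_generic disagree.
function findTemporalMismatches(columns: ColumnLike[]): string[] {
  return columns
    .filter(column => Boolean(column.is_dttm) !== isTemporal(column))
    .map(column => column.name);
}

// Made-up sample data for illustration only.
const sampleColumns: ColumnLike[] = [
  { name: 'created_at', is_dttm: true, type_generic: GenericDataType.TEMPORAL },
  { name: 'country', is_dttm: false, type_generic: GenericDataType.STRING },
  { name: 'legacy_ts', is_dttm: true, type_generic: GenericDataType.STRING },
];

const mismatches = findTemporalMismatches(sampleColumns);
```

If a check like this came back empty across real datasets, that would support removing `columns.is_dttm` from the projection; any non-empty result would point at columns that still depend on the older flag.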

villebro · 2023-03-17T12:45:07Z

      cachedSupersetGet({
-        endpoint: `/api/v1/dataset/${datasetId}`,
+        endpoint: `/api/v1/dataset/${datasetId}?q=${rison.encode({
+          columns: [


Agreed, I think this is needed in the Explore control panel only, hence shouldn't affect dashboards/native filters.

john-bodley · 2023-03-17T19:00:08Z

            'columns.verbose_name',
            'database.id',
-            'database_name',
+            'database.database_name',


It's somewhat weird how FAB doesn't barf when you give it invalid columns.
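
Since invalid names appear to be accepted silently, a small client-side guard could catch typos such as `database_name` before the request is sent. This is a hypothetical sketch, not something in the PR: `knownFields` is a hand-written allowlist containing only names from this thread (the authoritative list would be the backend's `show_select_columns`), and `findUnknownFields` is an invented helper.

```typescript
// Hypothetical allowlist limited to names that appear in this thread;
// the backend's show_select_columns would be the source of truth.
const knownFields: string[] = [
  'columns.is_dttm',
  'columns.type_generic',
  'columns.verbose_name',
  'database.id',
  'database.database_name',
];

// Returns every requested field that is not in the allowlist.
function findUnknownFields(requested: string[], known: string[]): string[] {
  return requested.filter(field => known.indexOf(field) === -1);
}

// The pre-fix projection used 'database_name' instead of 'database.database_name'.
const unknownFields = findUnknownFields(['database.id', 'database_name'], knownFields);
```

Here `unknownFields` contains only `database_name`, the invalid projection name that was corrected in the diff above.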

michael-s-molina

LGTM

…#23303) Co-authored-by: Michael S. Molina <70410625+michael-s-molina@users.noreply.github.com> (cherry picked from commit ffc0a81)

pull-request-size Bot added the size/S label Mar 7, 2023

john-bodley force-pushed the john-bodley--native-filters-dataset-columns branch 2 times, most recently from 31a91aa to 1b53e5d Compare March 7, 2023 23:50

john-bodley mentioned this pull request Mar 8, 2023

fix(native-filters): Caching scope #23314

Merged

9 tasks

john-bodley closed this Mar 8, 2023

john-bodley deleted the john-bodley--native-filters-dataset-columns branch March 8, 2023 18:41

john-bodley restored the john-bodley--native-filters-dataset-columns branch March 14, 2023 22:04

john-bodley reopened this Mar 14, 2023

john-bodley force-pushed the john-bodley--native-filters-dataset-columns branch 5 times, most recently from 419f484 to cce1165 Compare March 15, 2023 17:12

pull-request-size Bot added size/M and removed size/S labels Mar 15, 2023

john-bodley changed the title ~~chore(native-filters): Fetch only the dataset columns~~ chore(native-filters): Fetch only the required dataset columns Mar 15, 2023

john-bodley changed the title ~~chore(native-filters): Fetch only the required dataset columns~~ chore(native-filters): Fetch only the required dataset fields Mar 15, 2023

john-bodley force-pushed the john-bodley--native-filters-dataset-columns branch 3 times, most recently from b602228 to 01ffa3b Compare March 15, 2023 19:00

pull-request-size Bot added size/S and removed size/M labels Mar 15, 2023

john-bodley force-pushed the john-bodley--native-filters-dataset-columns branch from 01ffa3b to 8a64476 Compare March 15, 2023 19:34

john-bodley marked this pull request as ready for review March 15, 2023 20:33

john-bodley requested review from geido, ktmud, michael-s-molina and villebro March 15, 2023 20:34

john-bodley requested a review from rusackas March 16, 2023 01:11

michael-s-molina reviewed Mar 16, 2023

chore(native-filters): Fetch only the dataset columns

61dd061

john-bodley force-pushed the john-bodley--native-filters-dataset-columns branch from 8a64476 to 61dd061 Compare March 16, 2023 20:51

pull-request-size Bot added size/M and removed size/S labels Mar 16, 2023

michael-s-molina reviewed Mar 17, 2023

Comment thread ...ashboard/components/nativeFilters/FiltersConfigModal/FiltersConfigForm/FiltersConfigForm.tsx Outdated

villebro reviewed Mar 17, 2023

Update superset-frontend/src/dashboard/components/nativeFilters/Filte…

a1c5cf6

…rsConfigModal/FiltersConfigForm/FiltersConfigForm.tsx Co-authored-by: Michael S. Molina <70410625+michael-s-molina@users.noreply.github.com>

john-bodley commented Mar 17, 2023

john-bodley requested review from michael-s-molina and villebro March 20, 2023 06:08

michael-s-molina approved these changes Mar 20, 2023

john-bodley merged commit ffc0a81 into master Mar 20, 2023

john-bodley deleted the john-bodley--native-filters-dataset-columns branch March 20, 2023 18:43

mistercrunch added the 🏷️ bot and 🚢 3.0.0 labels Mar 13, 2024

qfcwell pushed a commit to qfcwell/superset that referenced this pull request May 12, 2026

chore(native-filters): Fetch only the required dataset fields (apache…

82ba6f7

…#23303) Co-authored-by: Michael S. Molina <70410625+michael-s-molina@users.noreply.github.com>
