fix: adds the ability to disallow SQL functions per engine #28639

Merged
dpgaspar merged 5 commits into apache:master from preset-io:feat/disallow-sql-functions on May 29, 2024

dpgaspar (Member) commented May 22, 2024

SUMMARY

Adds a new configuration key, DISALLOWED_SQL_FUNCTIONS, that defines the SQL functions disallowed for each engine. These functions are blocked in SQL Lab and Charts.
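
As an illustration only, an override of this key in a local configuration file might look like the sketch below. The type annotation matches the one shown in the superset/config.py review thread; the engine names and function names are assumptions chosen for the example, not the defaults shipped by this PR.

```python
# Sketch of a config override. The engine keys and function names are
# illustrative assumptions, not the defaults introduced by this PR.
DISALLOWED_SQL_FUNCTIONS: dict[str, set[str]] = {
    # Keys are engine names; values are sets of function names to block.
    "postgresql": {"version", "pg_read_file"},
    "mysql": {"version"},
}
```

Engines that do not appear as a key would presumably have no functions blocked.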

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

codecov Bot commented May 22, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 2 lines in your changes missing coverage. Please review.

Project coverage is 83.43%. Comparing base (76d897e) to head (396a8e0).
Report is 1094 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| superset/db_engine_specs/base.py | 75.00% | 1 Missing ⚠️ |
| superset/exceptions.py | 66.66% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #28639       +/-   ##
===========================================
+ Coverage   60.48%   83.43%   +22.94%     
===========================================
  Files        1931      523     -1408     
  Lines       76236    37605    -38631     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    31377    -14737     
+ Misses      28017     6228    -21789     
+ Partials     2105        0     -2105     

| Flag | Coverage Δ | |
|---|---|---|
| hive | 48.99% <28.57%> (-0.18%) | ⬇️ |
| javascript | ? | |
| postgres | 77.22% <71.42%> (?) | |
| presto | 53.56% <71.42%> (-0.24%) | ⬇️ |
| python | 83.43% <92.85%> (+19.95%) | ⬆️ |
| sqlite | 76.68% <71.42%> (?) | |
| unit | 58.95% <92.85%> (+1.33%) | ⬆️ |

Flags with carried forward coverage won't be shown.

@pull-request-size pull-request-size Bot added size/L and removed size/M labels May 22, 2024
@dpgaspar dpgaspar marked this pull request as ready for review May 22, 2024 15:19
@dosubot dosubot Bot added the data:databases and sqllab labels May 22, 2024
 try:
-    cls.execute(cursor, sql, query.database)
+    with app.app_context():
+        cls.execute(cursor, sql, query.database)
Member Author

Member

Oh, interesting.

Comment thread superset/sql_parse.py
    :param function_list: The list of functions to search for
    :param engine: The engine to use for parsing the SQL statement
    """
    return ParsedQuery(sql, engine=engine).check_functions_exist(function_list)
Member

We (probably me) will have to convert this to use sqlglot and the SQLStatement class (#26786), but I'm happy to do it; it seems simple enough.

Member Author

Happy to do it myself. Can I just skip sqlparse and implement something using the same pattern as extract_tables_from_statement?
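
To make the idea under discussion concrete, here is a minimal standard-library sketch of checking a statement against a list of function names. It is not the PR's ParsedQuery.check_functions_exist and uses neither sqlparse nor sqlglot; the helper name find_disallowed_functions is hypothetical, and the regular expressions are a rough approximation that ignores comments and quoted identifiers.

```python
import re

# An identifier immediately followed by "(", e.g. "version(" or "VERSION (".
_CALL_RE = re.compile(r"\b([A-Za-z_][A-Za-z0-9_]*)\s*\(")
# Single-quoted string literals, including doubled-quote escapes.
_STRING_RE = re.compile(r"'(?:[^']|'')*'")


def find_disallowed_functions(sql: str, disallowed: set[str]) -> set[str]:
    """Return the disallowed function names that appear as calls in ``sql``.

    Hypothetical helper for illustration; matching is case-insensitive.
    """
    # Blank out string literals so text inside quotes is not scanned.
    stripped = _STRING_RE.sub("''", sql)
    called = {match.group(1).lower() for match in _CALL_RE.finditer(stripped)}
    return called & {name.lower() for name in disallowed}
```

A real implementation would walk the parsed statement, as both commenters suggest, because a regular expression cannot reliably tell function calls apart from other syntax across dialects.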

Comment thread superset/config.py
# A set of disallowed SQL functions per engine. This is used to restrict the use of
# unsafe SQL functions in SQL Lab and Charts. The keys of the dictionary are the engine
# names, and the values are sets of disallowed functions.
DISALLOWED_SQL_FUNCTIONS: dict[str, set[str]] = {
Member

I'm unsure where the best place for this deny list is, i.e., here in the configuration or within the extra JSON payload of the database.

Additionally, should this be engine (dialect) specific or database specific? If it's the latter, then maybe the extra JSON payload field is preferable.

Member Author

A JSON payload at the database level is more dynamic and would avoid having to change the config to add or remove disallowed functions. On the other hand, the user who actually registers the database could intend to "abuse" these functions.
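
The trade-off in this thread can be sketched as follows. This is a hypothetical design, not what the PR implements: the config-level deny list stays authoritative, and a per-database entry in the extra JSON payload can only add to it. The helper name resolve_disallowed and the JSON key "disallowed_sql_functions" are assumptions made up for this example.

```python
import json

# Config-level deny list keyed by engine name (illustrative values).
DISALLOWED_SQL_FUNCTIONS: dict[str, set[str]] = {"postgresql": {"version"}}


def resolve_disallowed(engine: str, extra_json: str = "{}") -> set[str]:
    """Union the engine-level deny list with an optional per-database one."""
    engine_level = DISALLOWED_SQL_FUNCTIONS.get(engine, set())
    # "disallowed_sql_functions" is a hypothetical key, not an existing
    # field of the database extra payload.
    extra = json.loads(extra_json or "{}")
    db_level = set(extra.get("disallowed_sql_functions", []))
    # A union means whoever registers the database cannot lift a
    # restriction that the administrator set in the configuration.
    return engine_level | db_level
```

Because the result is a union, the concern raised above (a registering user wanting to use the blocked functions) does not apply to the config-level entries.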

@dpgaspar dpgaspar merged commit 5dfbab5 into apache:master May 29, 2024
@dpgaspar dpgaspar deleted the feat/disallow-sql-functions branch May 29, 2024 09:51
@michael-s-molina michael-s-molina added the v4.0 label Jun 26, 2024
@mistercrunch mistercrunch added the 🍒 4.0.2 and 🏷️ bot labels Jul 24, 2024
vinothkumar66 pushed a commit to vinothkumar66/superset that referenced this pull request Nov 11, 2024
@mistercrunch mistercrunch added the 🚢 4.1.0 label Nov 27, 2024
ratuldawar11 pushed a commit to grofers/superset that referenced this pull request Apr 14, 2026
* fix(permalink): adding anchor to dashboard permalink generation (apache#28744)

* fix: filters not updating with force update when caching is enabled (apache#29291)

(cherry picked from commit 527f1d2)

* fix(sqllab): invalid empty state on switch tab (apache#29278)

* fix(metastore-cache): prune before add (apache#29301)

(cherry picked from commit 172ddb4)

* fix: Remove recursive repr call (apache#29314)

(cherry picked from commit 9444c6b)

* fix: Cannot delete empty column inside a tab using the dashboard editor (apache#29346)

(cherry picked from commit ee52277)

* fix(explore): restored hidden field values has discarded (apache#29349)

(cherry picked from commit 160cece)

* chore: Rename Totals to Summary in table chart (apache#29360)

* fix(revert 27883): Excess padding in horizontal Bar charts (apache#29345)

(cherry picked from commit 708afb7)

* fix(explore): don't respect y-axis formatting (apache#29367)

* fix: adds the ability to disallow SQL functions per engine (apache#28639)

* chore: Adds 4.0.2 RC2 data to CHANGELOG.md

* fixes

* frontend fixes

* fix: cache api

---------

Co-authored-by: Jack <41238731+fisjac@users.noreply.github.com>
Co-authored-by: ka-weihe <k@weihe.dk>
Co-authored-by: JUST.in DO IT <justin.park@airbnb.com>
Co-authored-by: Ville Brofeldt <33317356+villebro@users.noreply.github.com>
Co-authored-by: Jessie R <j@scjr.me>
Co-authored-by: Michael S. Molina <70410625+michael-s-molina@users.noreply.github.com>
Co-authored-by: Daniel Vaz Gaspar <danielvazgaspar@gmail.com>
Co-authored-by: Michael S. Molina <michael.s.molina@gmail.com>
qfcwell pushed a commit to qfcwell/superset that referenced this pull request May 12, 2026
Labels

  • 🏷️ bot: A label used by `supersetbot` to keep track of which PRs were auto-tagged with release labels
  • data:databases: Related to database configurations and connections
  • size/L
  • sqllab: Namespace | Anything related to the SQL Lab
  • v4.0: Label added by the release manager to track PRs to be included in the 4.0 branch
  • 🍒 4.0.2: Cherry-picked to 4.0.2
  • 🚢 4.1.0: First shipped in 4.1.0

6 participants