feat: rename TABLE_NAMES_CACHE_CONFIG to DATA_CACHE_CONFIG #11509
Codecov Report
@@ Coverage Diff @@
## master #11509 +/- ##
==========================================
+ Coverage 62.86% 67.12% +4.26%
==========================================
Files 889 889
Lines 43055 43062 +7
Branches 4017 4017
==========================================
+ Hits 27065 28905 +1840
+ Misses 15811 14060 -1751
+ Partials 179 97 -82
Import the cache_manager instead of cache to make it clear which cache we are using.
A small refactor to make the memoize function more flexible.
Moved over from superset.utils.decorators
I didn't add backward compatibility because the data objects put into this cache will become much larger. If a Superset admin has configured an in-memory cache for TABLE_NAMES_CACHE_CONFIG, they might run into problems. It's therefore better to let things break so that admins become aware of this change.
For most users, the migration is as simple as renaming the config value.
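For illustration, the rename in a superset_config.py might look like the sketch below. The backend type, timeout, key prefix, and Redis URL are placeholder values, not recommendations, and the accepted CACHE_TYPE names depend on the installed Flask-Caching version.

```python
# superset_config.py (sketch; every value here is a placeholder)

# Before this PR the dict was assigned to the old name:
# TABLE_NAMES_CACHE_CONFIG = {"CACHE_TYPE": "redis", ...}

# After this PR the same dict goes under the new name. Because this cache
# now also stores query results, an in-memory backend is likely too small.
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "redis",                    # Flask-Caching backend name
    "CACHE_DEFAULT_TIMEOUT": 60 * 60 * 24,    # seconds (one day)
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_REDIS_URL": "redis://localhost:6379/1",
}
```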
I will be on PTO next week, so I won't have a chance to give a full review. One suggestion: it would be nice to add a unit test that documents the expected behavior and prevents regressions. Tests can impersonate the user to hit various endpoints.
I added two test assertions in cache_tests.py. Do you think we need more?
Make sure the global default is applied as documented.
@john-bodley @etr2460 So previously the test Not sure why this cache would fail
Here's a test run that fails when the only change comparing to
count = 2 seems to assume that some tests run before this one. If you initialize a fresh database and run this test alone, it will fail.
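One way to make such an assertion independent of test order is to check the change in the count rather than its absolute value. The sketch below uses an in-memory SQLite table; the table and all function names are hypothetical and are not taken from the Superset test suite.

```python
import sqlite3


def count_rows(conn: sqlite3.Connection) -> int:
    """Return the current number of rows in the hypothetical events table."""
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]


def record_event(conn: sqlite3.Connection, name: str) -> None:
    """Insert one row, standing in for whatever the code under test writes."""
    conn.execute("INSERT INTO events (name) VALUES (?)", (name,))


def rows_added_by_scenario(conn: sqlite3.Connection) -> int:
    """Run the scenario and report how many rows it added."""
    before = count_rows(conn)  # baseline, whatever earlier tests left behind
    record_event(conn, "first")
    record_event(conn, "second")
    return count_rows(conn) - before


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT)")
record_event(conn, "left over from an earlier test")

# The delta is 2 whether or not earlier tests wrote rows.
assert rows_added_by_scenario(conn) == 2
```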
graceguo-supercat left a comment
please resolve conflicts. otherwise LGTM.
villebro left a comment
LGTM. However, I didn't quite see how the "timeout search path" happens. Is that something happening outside this PR in local configs, or implicitly via Flask-Caching?
I think it is referring to this logic here: https://github.com/apache/incubator-superset/blob/600a6fa92a0bbe5bfd93371db5ced6af556c3697/superset/viz.py#L423-L433
Ah sorry, I misinterpreted the comment, never mind.
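As a rough illustration of what a "timeout search path" could mean, the sketch below tries progressively more general settings and returns the first one that is set. This is an assumption about the general pattern, not a copy of the linked viz.py logic; the function name and the order of candidates in the usage line are hypothetical.

```python
from typing import Optional


def resolve_cache_timeout(*candidates: Optional[int]) -> Optional[int]:
    """Return the first timeout that is set, from most to least specific.

    The check is `is not None` rather than truthiness, so an explicit
    timeout of 0 is treated as a set value and is not skipped.
    """
    for value in candidates:
        if value is not None:
            return value
    return None


# Hypothetical order: chart, datasource, database, data cache default, global default.
timeout = resolve_cache_timeout(None, None, 300, 86400, 3600)
```

Here the chart and datasource have no timeout set, so the database-level value of 300 is used.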
villebro left a comment
This is a good layer of optional additional security, thanks for implementing.
The corresponding cache will now also cache the query results.
SUMMARY
💥 Breaking Change
Rename config value TABLE_NAMES_CACHE_CONFIG to DATA_CACHE_CONFIG, and save datasource query results to this cache, too.
Motivation
Companies like Airbnb have different security levels for Superset itself and the datasources Superset connects to: anyone can log in to the same Superset, but not everyone has access to all data. Superset has SecurityManager to check whether a user has access to a certain datasource, but all query results are still stored in the same cache, regardless of users' access levels. If a malicious party obtained access to the cache server via a certain Superset user in a high-risk region, they would get access to all cached data.
By separating cache storage for Superset metadata from the actual data users consume, we can better isolate sensitive data by configuring a different cache backend for high-risk regions while keeping the cache for all the Superset metadata in sync.
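A minimal sketch of such a split, assuming the existing CACHE_CONFIG setting holds Superset metadata and DATA_CACHE_CONFIG holds query results. The helper function, the SUPERSET_REGION environment variable, and the host names are hypothetical and only illustrate the idea of a per-region data cache.

```python
import os

# Shared cache for Superset metadata, the same in every region.
CACHE_CONFIG = {
    "CACHE_TYPE": "redis",
    "CACHE_KEY_PREFIX": "superset_meta_",
    "CACHE_REDIS_URL": "redis://shared-cache.example.internal:6379/0",
}


def data_cache_config(region: str) -> dict:
    """Build a data cache config that points at a region-specific backend.

    Hypothetical helper: each region gets its own cache server, so access
    to one region's cache does not expose another region's query results.
    """
    return {
        "CACHE_TYPE": "redis",
        "CACHE_KEY_PREFIX": f"superset_data_{region}_",
        "CACHE_REDIS_URL": f"redis://data-cache-{region}.example.internal:6379/0",
    }


# SUPERSET_REGION is an assumed deployment variable, not a Superset setting.
DATA_CACHE_CONFIG = data_cache_config(os.environ.get("SUPERSET_REGION", "default"))
```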
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A
TEST PLAN
CI
ADDITIONAL INFORMATION