Add support for period character in table names #7453
Codecov Report
@@ Coverage Diff @@
## master #7453 +/- ##
==========================================
+ Coverage 65.17% 65.23% +0.05%
==========================================
Files 433 433
Lines 21428 21433 +5
Branches 2360 2358 -2
==========================================
+ Hits 13966 13981 +15
+ Misses 7342 7332 -10
Partials 120 120
|
@cgivre please take a look at this PR. If this works out, it should make merging your Drill PR slightly easier.
|
@villebro I'll take a look this weekend. This will pretty much solve the issues with the Drill integration.
The reason db wasn't type annotated is because it caused a circular import (models.core already has a reference to db_engine_specs). This should probably be refactored, but is outside the scope of this PR.
|
@villebro I tried this out and it worked really well even without the Drill PR! I think pretty much the only thing that Drill needs now for the integration is the time grains! Thanks for your help with this.
|
Thanks for verifying that this works, @cgivre. @mistercrunch @john-bodley @betodealmeida I would really appreciate help reviewing this, as I think this is one of the last hurdles to being able to support engines that rely on the presence of non-standard characters in schema/table names.
|
Kind reminder to committers that this is pending review. It would be great to get this reviewed and merged so the Drill integration can be finalized (it is blocked by this PR).
|
@villebro my main comment, which is somewhat related to #7490, is this: given that the cluster/schema/table name construct can be quite complicated, and historically we've often flattened these names into a single string (and then split it or used regular expressions to extract the components), should we move towards using a class (possibly a
|
Good point @john-bodley, having a dedicated class with proper parsing/formatting functionality would probably be a good idea. Do you feel this should be addressed as part of this PR, or should we start a new PR for that?
|
I think a separate PR is fine as it’s probably a large change. |
* Move schema name handling in table names from frontend to backend
* Rename all_schema_names to get_all_schema_names
* Fix js errors
* Fix additional js linting errors
* Refactor datasource getters and fix linting errors
* Update js unit tests
* Add python unit test for get_table_names method
* Add python unit test for get_table_names method
* Fix js linting error
CATEGORY
SUMMARY
In SQL Lab, table names are currently assumed to follow one of two conventions: `schema.table` or `table`. This is handled in the frontend by assuming that a period in the table name always implies a separator between schema and table name. Since there is no standardized way for SQLAlchemy inspectors to return table names (some return `schema.table`, others only `table`), they can be in either format. Since some databases (at least Apache Drill and Postgres) support periods in schema and table names, and their respective inspectors don't return schema-prefixed table names, this causes problems when querying tables, as `TableSelector` strips away everything before the period character. This PR moves this logic from the frontend to the backend, and makes it possible to configure this behavior per engine.
This proposal changes table name handling in the following way:
* `try_remove_schema_from_table_name` is added to `db_engine_specs` (defaults to `True`). When `True`, `get_table_names()` and `get_view_names()` check whether a table name starts with the schema name followed by a period and, if so, remove the schema name from the table name. Example: `schema.table` becomes `table`, while `table` remains unchanged.
* Table names have the format `{'schema': 'schema_name', 'table': 'table_name'}` when handed to the frontend. Previously they were either of the format `table` or `schema.table`. This removes any ambiguity in the frontend.
* When no schema is chosen, table names are displayed as `schema.table`.
* When a schema is selected, table names are displayed as `table` only in the dropdown.
Filtering also supports this, i.e. when no schema is chosen, the filter substring is compared against the table name in the form `schema.table`, making it possible to include the schema name in the filter string.
SCREENSHOTS
When no schema is chosen, table names are displayed as `schema.table`:
![image](https://user-images.githubusercontent.com/33317356/57192090-3ec34100-6f36-11e9-9db3-10c3e92c11b2.png)
If a schema is selected, only the table name is shown (in this case the table name is `test.table`, not `table` in schema `test`):
![image](https://user-images.githubusercontent.com/33317356/57192101-5dc1d300-6f36-11e9-9cb7-e3d8fa717f4c.png)
TEST PLAN
Tested locally on Postgres and SQLite. JS unit tests were updated to correspond to the new data structures, and Python unit tests were added to test table name fetching.
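For readers who want a feel for the behavior under test, here is a minimal Python sketch of the table name handling described in the summary. It is not the code from this PR: the helper names (`strip_schema_prefix`, `to_frontend_payload`, `display_label`, `matches_filter`) and the `try_remove_schema` parameter are hypothetical stand-ins, with the flag loosely mirroring `try_remove_schema_from_table_name`.

```python
from typing import Dict, List, Optional


def strip_schema_prefix(schema: str, table: str) -> str:
    """Hypothetical helper: drop a leading "<schema>." from a table name."""
    prefix = schema + "."
    if table.startswith(prefix):
        return table[len(prefix):]
    return table


def to_frontend_payload(
    schema: str, raw_names: List[str], try_remove_schema: bool = True
) -> List[Dict[str, str]]:
    """Hypothetical helper: build unambiguous schema/table pairs.

    `try_remove_schema` stands in for the per-engine setting; engines whose
    inspectors never prefix table names would turn it off.
    """
    payload = []
    for name in raw_names:
        table = strip_schema_prefix(schema, name) if try_remove_schema else name
        payload.append({"schema": schema, "table": table})
    return payload


def display_label(entry: Dict[str, str], selected_schema: Optional[str] = None) -> str:
    """Show only the table when a schema is selected, else schema.table."""
    if selected_schema:
        return entry["table"]
    return entry["schema"] + "." + entry["table"]


def matches_filter(
    entry: Dict[str, str], substring: str, selected_schema: Optional[str] = None
) -> bool:
    """Compare the filter substring against the label the user would see."""
    return substring in display_label(entry, selected_schema)
```

With the flag off, a table literally named `test.table` in schema `test` keeps its full name, which is the case shown in the second screenshot; with the flag on, the same input would be shortened to `table`, which is why the behavior needs to be configurable per engine.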
ADDITIONAL INFORMATION
REVIEWERS
@cgivre