-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-34127][SQL] Support table valued command #31548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #135109 has finished for PR 31548 at commit
|
|
Test build #135117 has finished for PR 31548 at commit
|
|
Test build #135124 has finished for PR 31548 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #135132 has finished for PR 31548 at commit
|
|
cc @wangyum |
|
cc @cloud-fan |
| case ShowTableProperties(_, _, _) => true | ||
| case ShowPartitions(_, _, _) => true | ||
| case ShowColumns(_, _, _) => true | ||
| // TODO case ShowViews(_, _, _) => true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we file a JIRA and make it id'ed todo? e.g) TODO(SPARK-XXXX): blah blah
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌
| /** | ||
| * A table-valued command, e.g. | ||
| * {{{ | ||
| * select tableName from command("show tables"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot add a new built-in function for this purpose instead? The implementation will be simpler than the current syntax-based approach, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, this PR references the table-valued-function. So, I call the syntax is table-valued-command.
Second, Users want use filter clause, group by, ... on table-valued-command. In fact, This PR adopt the syntax-based approach is much simpler than built-in function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This implementation can make these command easy to use:
- Save result to table.
- Query the specified column.
- Filter / Group by / Order by the specified column.
This is the syntax of Teradata:
SELECT tbl.DatabaseName,
tbl.TableName,
SUM(spc.CurrentPerm)/1024.00 as TableSize
FROM DBC.TablesV tbl
JOIN DBC.TableSize spc
ON tbl.DatabaseName = spc.DatabaseName
AND tbl.TableName = spc.TableName
WHERE tbl.DatabaseName NOT IN ('All', 'Crashdumps', 'DBC', 'dbcmngr',
'Default', 'External_AP', 'EXTUSER', 'LockLogShredder', 'PUBLIC',
'Sys_Calendar', 'SysAdmin', 'SYSBAR', 'SYSJDBC', 'SYSLIB',
'SystemFe', 'SYSUDTLIB', 'SYSUIF', 'TD_SERVER_DB', 'TDStats',
'TD_SYSGPL', 'TD_SYSXML', 'TDMaps', 'TDPUSER', 'TDQCD',
'tdwm', 'SQLJ', 'TD_SYSFNLIB', 'SYSSPATIAL')
AND TableKind = 'T'
GROUP BY tbl.DatabaseName,
tbl.TableName
ORDER BY TableSize DESC;|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #135262 has finished for PR 31548 at commit
|
|
Test build #135265 has finished for PR 31548 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #135422 has finished for PR 31548 at commit
|
|
cc @cloud-fan |
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #136929 has finished for PR 31548 at commit
|
|
The use case is totally valid, but we may need more discussion about how to design the API to implement this use case. I have a few thoughts:
Personally I prefer option 2 as it's more predictable. We need extra effort for option 3 to define the behaviors of If we want to add information schema in the future, they are read-only views and it's helpful if option 2 is done and we can already create views with SHOW TABLES, etc. already. |
Hmm.. |
Ah good point. One way is to eagerly execute the command when creating the view, so Another idea is to allow SHOW TABLES etc. as subqueries, e.g. |
Making a non lazy view sounds weird to me. We then create another kind of special view just for this kind of commands. The subquery approach looks more promising to me. |
|
Kubernetes integration test unable to build dist. exiting with code: 1 |
|
Test build #137006 has finished for PR 31548 at commit
|
|
Test build #137585 has finished for PR 31548 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status failure |
|
Test build #137988 has finished for PR 31548 at commit
|
|
ping @cloud-fan |
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
What changes were proposed in this pull request?
Some command used to display some metadata, such as:
SHOW TABLES,SHOW TBLPROPERTIESand so no.If the output rows much than screen height, the output very unfriendly to developers.
So we should have a way to filter the output like the behavior of
SELECT ... FROM ... WHERE ....We could reference the implement of table valued function.
Why are the changes needed?
This PR provides a better way to display DDL when output rows much than screen height.
Does this PR introduce any user-facing change?
'No'. Just a new syntax.
How was this patch tested?
Jenkins test.