Skip to content

[Feature] Support SHOW DATA SKEW stmt#6219

Merged
morningman merged 3 commits intoapache:masterfrom
morningman:show_skew
Aug 5, 2021
Merged

[Feature] Support SHOW DATA SKEW stmt#6219
morningman merged 3 commits intoapache:masterfrom
morningman:show_skew

Conversation

@morningman
Copy link
Contributor

@morningman morningman commented Jul 13, 2021

Proposed changes

SHOW DATA SKEW FROM tbl PARTITION(p1)

to view the data distribution of a specified partition

mysql> admin show data skew from tbl1 partition(tbl1);
+-----------+-------------+-------+---------+
| BucketIdx | AvgDataSize | Graph | Percent |
+-----------+-------------+-------+---------+
| 0         | 0           |       | 100.00% |
+-----------+-------------+-------+---------+
1 row in set (0.01 sec)

Also modify the result of admin show replica distribution, add replica size distribution

mysql> admin show replica distribution from tbl1 partition(tbl1);
+-----------+------------+-------------+----------+------------+-----------+-------------+
| BackendId | ReplicaNum | ReplicaSize | NumGraph | NumPercent | SizeGraph | SizePercent |
+-----------+------------+-------------+----------+------------+-----------+-------------+
| 10002     | 1          | 0           | >        | 100.00%    |           | 100.00%     |
+-----------+------------+-------------+----------+------------+-----------+-------------+

Types of changes

What types of changes does your code introduce to Doris?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)
  • Code refactor (Modify the code structure, format the code, etc...)
  • Optimization. Including functional usability improvements and performance improvements.
  • Dependency. Such as changes related to third-party components.
  • Other.

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have created an issue on (Fix #ISSUE) and described the bug/feature there in detail
  • Compiling and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • If these changes need document changes, I have updated the document
  • Any dependent changes have been merged

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@morningman morningman added kind/feature Categorizes issue or PR as related to a new feature. area/easy-to-use api-review Categorizes an issue or PR as actively needing an API review. labels Jul 13, 2021
说明:

1. 必须指定且仅指定一个分区。对于非分区表,分区名称同表名。
2. 结果将展示指定分区下,各个分桶的数据量,以及每个分桶数据量在总数据量中的占比。
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

注意格式


语法:

ADMIN SHOW DATA SKEW FROM [db_name.]tbl_name [PARTITION (p1)];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

如果 partition 是必填项,那么不用加 []

KW_REPAIR, KW_REPEATABLE, KW_REPOSITORY, KW_REPOSITORIES, KW_REPLACE, KW_REPLACE_IF_NOT_NULL, KW_REPLICA, KW_RESOURCE, KW_RESOURCES, KW_RESTORE, KW_RETURNS, KW_RESUME, KW_REVOKE,
KW_RIGHT, KW_ROLE, KW_ROLES, KW_ROLLBACK, KW_ROLLUP, KW_ROUTINE, KW_ROW, KW_ROWS,
KW_S3, KW_SCHEMA, KW_SCHEMAS, KW_SECOND, KW_SELECT, KW_SEMI, KW_SERIALIZABLE, KW_SESSION, KW_SET, KW_SETS, KW_SET_VAR, KW_SHOW, KW_SIGNED,
KW_SKEW,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pay attention to the format

dbName = ClusterNamespace.getFullName(getClusterName(), tblRef.getName().getDb());
}

tblRef.getName().setDb(dbName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

analyze tblRef before check partition

@EmmyMiao87 EmmyMiao87 self-assigned this Jul 14, 2021
SHOW DATA SKEW FROM tbl PARTITION(p1)

to view the data distribution of a specified partition
Copy link
Contributor

@EmmyMiao87 EmmyMiao87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 5, 2021
@github-actions
Copy link
Contributor

github-actions bot commented Aug 5, 2021

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

github-actions bot commented Aug 5, 2021

PR approved by anyone and no changes requested.

@morningman morningman merged commit 2823e4d into apache:master Aug 5, 2021
@morningman morningman mentioned this pull request Oct 10, 2021
EmmyMiao87 pushed a commit to EmmyMiao87/incubator-doris that referenced this pull request Nov 10, 2021
SHOW DATA SKEW FROM tbl PARTITION(p1)

to view the data distribution of a specified partition

```
mysql> admin show data skew from tbl1 partition(tbl1);
+-----------+-------------+-------+---------+
| BucketIdx | AvgDataSize | Graph | Percent |
+-----------+-------------+-------+---------+
| 0         | 0           |       | 100.00% |
+-----------+-------------+-------+---------+
1 row in set (0.01 sec)
```

Also modify the result of `admin show replica distribution`, add replica size distribution

```
mysql> admin show replica distribution from tbl1 partition(tbl1);
+-----------+------------+-------------+----------+------------+-----------+-------------+
| BackendId | ReplicaNum | ReplicaSize | NumGraph | NumPercent | SizeGraph | SizePercent |
+-----------+------------+-------------+----------+------------+-----------+-------------+
| 10002     | 1          | 0           | >        | 100.00%    |           | 100.00%     |
+-----------+------------+-------------+----------+------------+-----------+-------------+
```

Change-Id: I74e392cec4a72bbb8a4b5ef9e2b7302a062d6b88
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-review Categorizes an issue or PR as actively needing an API review. approved Indicates a PR has been approved by one committer. area/easy-to-use kind/feature Categorizes issue or PR as related to a new feature. reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants