@yangzhg yangzhg commented Mar 21, 2022

Proposed changes

  1. Add a config, string_type_soft_limit, that sets a soft limit on the maximum length of the String type.
  2. Disallow the String type in key columns, partition columns, and
    distribution columns.
  3. Remove the String type alias BLOB, reserving it for future use.

@yangzhg yangzhg added the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Mar 21, 2022
@morningman morningman added dev/1.0.0-deprecated should be merged into dev-1.0.0 branch dev/1.0.1-deprecated should be merged into dev-1.0.1 branch labels Mar 22, 2022

@morningman morningman left a comment

The schema change logic also needs to be modified.

@yangzhg yangzhg force-pushed the fix_string branch 4 times, most recently from 6505552 to e7b8239 on March 22, 2022 07:09
  case TYPE_STRING: {
      StringValue* str_val = (StringValue*)slot;
-     if (str_val->len > OLAP_STRING_MAX_LENGTH) {
+     if (str_val->len > ACTRUAL_STRING_TYPE_MAX_LENGTH) {

Suggested change
- if (str_val->len > ACTRUAL_STRING_TYPE_MAX_LENGTH) {
+ if (str_val->len > ACTUAL_STRING_TYPE_MAX_LENGTH) {

// bloom filter fpp
static const double BLOOM_FILTER_DEFAULT_FPP = 0.05;

#define ACTRUAL_STRING_TYPE_MAX_LENGTH \

string_type_length_soft_limit_bytes can be modified at runtime.
If you define a macro here, we can no longer modify it.

The config framework supports a pre-check; see compaction_task_num_per_disk for an example.
We can use the pre-check to verify that the config value does not exceed OLAP_STRING_MAX_LENGTH,
and then use config::string_type_length_soft_limit_bytes directly.
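
The pre-check idea in this comment can be sketched roughly as follows. This is a standalone illustration, not the Doris config framework: CheckedInt64Config, soft_limit_config, exceeds_soft_limit, kHardMaxStringLength, and the default of 1048576 bytes are all hypothetical names and values assumed for the example.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Assumed hard upper bound, standing in for OLAP_STRING_MAX_LENGTH.
// The concrete value is an assumption made for this sketch.
constexpr int64_t kHardMaxStringLength = 2147483647;

// Hypothetical runtime-mutable config entry guarded by a pre-check.
// Not thread-safe; a real config system would need synchronization.
class CheckedInt64Config {
public:
    CheckedInt64Config(int64_t initial, std::function<bool(int64_t)> pre_check)
            : value_(initial), pre_check_(std::move(pre_check)) {}

    // Applies the new value only if the pre-check accepts it.
    bool set(int64_t new_value) {
        if (!pre_check_(new_value)) {
            return false;
        }
        value_ = new_value;
        return true;
    }

    int64_t get() const { return value_; }

private:
    int64_t value_;
    std::function<bool(int64_t)> pre_check_;
};

// Hypothetical soft limit. The pre-check rejects non-positive values and
// values above the hard cap, so readers can use the config value directly.
CheckedInt64Config& soft_limit_config() {
    static CheckedInt64Config config(
            1048576, [](int64_t v) { return v > 0 && v <= kHardMaxStringLength; });
    return config;
}

// Length check that reads the config at call time instead of a fixed constant.
bool exceeds_soft_limit(int64_t len) {
    return len > soft_limit_config().get();
}
```

Because the pre-check runs on every update, callers such as exceeds_soft_limit never need to clamp the value against the hard cap themselves.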

  TypeDescriptor ret;
  ret.type = TYPE_STRING;
- ret.len = MAX_STRING_LENGTH;
+ ret.len = ACTRUAL_STRING_TYPE_MAX_LENGTH;

Is this compatible with String columns created earlier whose len is already larger than ACTRUAL_STRING_TYPE_MAX_LENGTH?

      throw new AnalysisException("Create olap table should contain distribution desc");
  }
- distributionDesc.analyze(columnSet);
+ distributionDesc.analyze(columnSet, columnDefs);

There is another place in CreateTableStmt that needs to be modified.
When a user creates a DUPLICATE KEY table without specifying the key columns, Doris automatically
selects some of the columns as key columns.
If it selects a String column as a key column, an error is returned to the user,
which is not what the user expects.
We should:

  1. Stop selecting key columns when a String column is encountered.
  2. Return an error message with a suggestion when the first column is a String column.
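
The two rules above can be sketched as follows. This is a simplified C++ illustration rather than the Java code in CreateTableStmt; ColumnType, ColumnDef, pick_default_key_columns, the max_keys parameter, and the error wording are all hypothetical and assumed for the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, reduced set of column types for the sketch.
enum class ColumnType { INT, BIGINT, VARCHAR, STRING };

struct ColumnDef {
    std::string name;
    ColumnType type;
};

// Picks default key columns in declaration order, up to max_keys.
// Rule 2: if the first column is a String column, fail with a suggestion.
// Rule 1: otherwise, stop selecting as soon as a String column is met.
std::vector<std::string> pick_default_key_columns(const std::vector<ColumnDef>& columns,
                                                  std::size_t max_keys) {
    if (!columns.empty() && columns.front().type == ColumnType::STRING) {
        throw std::invalid_argument(
                "String type cannot be used as a key column; specify the key columns "
                "explicitly or place a non-String column first");
    }
    std::vector<std::string> keys;
    for (const ColumnDef& col : columns) {
        if (keys.size() >= max_keys || col.type == ColumnType::STRING) {
            break;
        }
        keys.push_back(col.name);
    }
    return keys;
}
```

With columns (id INT, body STRING, ts BIGINT) and max_keys set to 3, the sketch selects only id, because selection stops at the first String column instead of failing.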

@yangzhg yangzhg force-pushed the fix_string branch 2 times, most recently from 9dc19f9 to 88691eb on March 23, 2022 12:22
1. add a config string_type_soft_limit to soft limit max length of string type
2. disable using String type in Key column, partition column and
   distribution column
3. remove String type alias BLOB for future use

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 24, 2022
@github-actions
PR approved by at least one committer and no changes requested.

@github-actions
PR approved by anyone and no changes requested.

@yangzhg yangzhg merged commit cfb57be into apache:master Mar 25, 2022
@morningman morningman removed the dev/1.0.0-deprecated should be merged into dev-1.0.0 branch label Mar 28, 2022
morningman pushed a commit that referenced this pull request Mar 28, 2022
1. add a config string_type_soft_limit to soft limit max length of string type
2. disable using String type in Key column, partition column and
   distribution column
3. remove String type alias BLOB for future use
@morningman morningman added dev/merged-1.0.1-deprecated PR has been merged into dev-1.0.1 and removed dev/1.0.1-deprecated should be merged into dev-1.0.1 branch labels Mar 28, 2022
@morningman morningman mentioned this pull request Jun 2, 2022

Labels

approved Indicates a PR has been approved by one committer. dev/merged-1.0.1-deprecated PR has been merged into dev-1.0.1 kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API reviewed

2 participants