@924060929 924060929 commented Sep 16, 2021

Proposed changes

In stream load workloads with large data volumes, the load is too slow. The cause appears to be that the skiplist uses % 4 to generate the random height, so I changed it to & 3 to achieve higher performance.
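For readers unfamiliar with the pattern, here is a minimal sketch of the two variants. It is modelled on LevelDB-style skiplists and is not copied from the Doris source: the names `kMaxHeight`, `kBranching`, `random_height_mod`, `random_height_mask`, and `variants_agree` are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical constants in the style of a LevelDB-like skiplist;
// the real Doris values are not shown in this PR.
constexpr int kMaxHeight = 12;
constexpr uint32_t kBranching = 4;

// Variant 1: grow the height with probability 1/kBranching per level, using modulo.
template <typename Rng>
int random_height_mod(Rng& rng) {
    int height = 1;
    while (height < kMaxHeight && (static_cast<uint32_t>(rng()) % kBranching) == 0) {
        ++height;
    }
    return height;
}

// Variant 2: same test expressed as a bit mask.
// This is only equivalent because kBranching is a power of two.
template <typename Rng>
int random_height_mask(Rng& rng) {
    static_assert((kBranching & (kBranching - 1)) == 0, "mask trick needs a power of two");
    int height = 1;
    while (height < kMaxHeight && (static_cast<uint32_t>(rng()) & (kBranching - 1)) == 0) {
        ++height;
    }
    return height;
}

// Returns true if both variants yield identical heights for the same seed.
bool variants_agree(uint32_t seed, int samples) {
    assert(samples >= 0);
    std::mt19937 a(seed);
    std::mt19937 b(seed);
    for (int i = 0; i < samples; ++i) {
        if (random_height_mod(a) != random_height_mask(b)) {
            return false;
        }
    }
    return true;
}
```

On unsigned operands the two forms compute the same value, and optimizing compilers typically lower `% 4` to a mask on their own, so any measurable gain would likely depend on the operand being signed or on other factors.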

Test Doris cluster: 1 FE, 1 BE
Test table:

CREATE TABLE `test_table` (
  `id` int(11) NULL COMMENT "",
  `uid` varchar(255) NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`id`, `uid`)
DISTRIBUTED BY HASH(`id`) BUCKETS 10
PROPERTIES (
  "replication_num" = "1",
  "in_memory" = "false",
  "storage_format" = "V2"
);

Then load 10 million rows like these:

0,74c5ecf4-ef1d-4247-bae8-04ecdda291c5
1,87f02aaf-a5bc-428c-8879-06fd40d68409
2,056dbc2e-4ac8-4d7f-ab7b-3aefd6fa1086
3,d22bfb72-986e-46f2-b4cd-8493a681d640
4,0fe30303-958b-4f10-8641-08a98881e668
5,63de2f71-0751-4005-8d1e-5bf4275d6ce2
6,f35a042b-0343-4a28-9ec9-2eb000a5883b

The original stream load result is:

{
    "TxnId": 365,
    "Label": "20210916163323",
    "Status": "Success",
    "Message": "OK",
    "NumberTotalRows": 10000000,
    "NumberLoadedRows": 10000000,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 448888890,
    "LoadTimeMs": 59205,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 2,
    "ReadDataTimeMs": 28885,
    "WriteDataTimeMs": 59183,
    "CommitAndPublishTimeMs": 19
}

After the optimization:

{
    "TxnId": 372,
    "Label": "20210916170250",
    "Status": "Success",
    "Message": "OK",
    "NumberTotalRows": 10000000,
    "NumberLoadedRows": 10000000,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 448888890,
    "LoadTimeMs": 19940,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 1,
    "ReadDataTimeMs": 5500,
    "WriteDataTimeMs": 19917,
    "CommitAndPublishTimeMs": 20
}
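To compare the two stream load responses above, the following sketch derives throughput and the load-time ratio from the reported `NumberLoadedRows`, `LoadBytes`, and `LoadTimeMs` fields. The struct and function names are made up for illustration; only the numbers come from the responses.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the two stream load responses above.
struct LoadResult {
    int64_t loaded_rows;   // NumberLoadedRows
    int64_t load_bytes;    // LoadBytes
    int64_t load_time_ms;  // LoadTimeMs
};

constexpr LoadResult kBefore{10000000, 448888890, 59205};
constexpr LoadResult kAfter{10000000, 448888890, 19940};

// Rows loaded per second of total load time.
double rows_per_second(const LoadResult& r) {
    assert(r.load_time_ms > 0);
    return static_cast<double>(r.loaded_rows) * 1000.0 / static_cast<double>(r.load_time_ms);
}

// Megabytes (10^6 bytes) loaded per second of total load time.
double megabytes_per_second(const LoadResult& r) {
    assert(r.load_time_ms > 0);
    return static_cast<double>(r.load_bytes) / 1e6 * 1000.0 / static_cast<double>(r.load_time_ms);
}

// Ratio of the two load times; values above 1 mean the second run was faster.
double speedup(const LoadResult& before, const LoadResult& after) {
    assert(after.load_time_ms > 0);
    return static_cast<double>(before.load_time_ms) / static_cast<double>(after.load_time_ms);
}
```

With these inputs the load time drops by a factor of roughly 2.97, from about 0.17 million to about 0.50 million rows per second.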

Types of changes

What types of changes does your code introduce to Doris?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)
  • Code refactor (Modify the code structure, format the code, etc...)
  • Optimization. Including functional usability improvements and performance improvements.
  • Dependency. Such as changes related to third-party components.
  • Other.

Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.

  • I have created an issue on (Fix #ISSUE) and described the bug/feature there in detail
  • I have added tests that prove my fix is effective or that my feature works
  • Compiling and unit tests pass locally with my changes
  • If these changes need document changes, I have updated the document
  • Any dependent changes have been merged

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@caiconghui caiconghui left a comment

LGTM

@github-actions

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved label (indicates a PR has been approved by one committer) Sep 16, 2021
@github-actions

PR approved by anyone and no changes requested.

@924060929
Contributor Author

Wait ... some key PR is missing

924060929 commented Sep 17, 2021

The new config added in #5093 already optimizes this case. The root cause is that the coordinator sleeps for a long time.

olap_table_sink_send_interval_ms=1

Closing this PR.

@924060929 924060929 closed this Sep 17, 2021
@924060929 924060929 deleted the optimize-stream-load branch September 17, 2021 13:24