Description
Describe the bug
After stream load has been running for some time, the publish version step of transactions becomes slower and slower.
To Reproduce
We have a cluster with 3 FE and 8 BE, version: 0.11, OS: CentOS 7
FE: CPU cores: 16; Mem: 32
BE: CPU cores: 40; Mem: 128; Disk: 3.6T * 12
The problem occurs under heavy stream load:
- More than 20M messages per minute.
- 20 clients send data to Doris.
- Each client sends 250K messages per batch.
Expected behavior
After stream load has been running for some time, the publish version step of transactions becomes very slow.
By adding some logs, I found that Tablet::do_tablet_meta_checkpoint holds the _meta_lock for a long time.
I1210 17:47:23.343849 191992 tablet.cpp:1270] 26239 do_tablet_meta_checkpoint retain _meta_lock cost: 1570 s, check_rowset_meta cost: 1569 s, remove cost: 0 s, all_rs_metas: 39973
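To put the numbers in that log line in perspective, here is a small sketch that parses it and works out the average check_rowset_meta cost per rowset meta. The helper name parse_checkpoint_log is mine, not something in Doris, and the average assumes the cost is spread evenly over all rowset metas, which the log does not confirm.

```python
import re

# The log line quoted above, copied verbatim.
LOG_LINE = (
    "I1210 17:47:23.343849 191992 tablet.cpp:1270] 26239 do_tablet_meta_checkpoint "
    "retain _meta_lock cost: 1570 s, check_rowset_meta cost: 1569 s, "
    "remove cost: 0 s, all_rs_metas: 39973"
)


def parse_checkpoint_log(line):
    """Hypothetical helper: extract the four counters from the added log line."""
    pattern = (
        r"retain _meta_lock cost: (\d+) s, "
        r"check_rowset_meta cost: (\d+) s, "
        r"remove cost: (\d+) s, "
        r"all_rs_metas: (\d+)"
    )
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("line does not look like the checkpoint log")
    lock_s, check_s, remove_s, rs_metas = (int(g) for g in match.groups())
    return {
        "lock_s": lock_s,
        "check_s": check_s,
        "remove_s": remove_s,
        "rs_metas": rs_metas,
    }


stats = parse_checkpoint_log(LOG_LINE)

# Average time spent checking one rowset meta, in milliseconds
# (assumes an even spread, which is only an approximation).
avg_check_ms = 1000.0 * stats["check_s"] / stats["rs_metas"]

# Fraction of the lock hold time that is spent in check_rowset_meta.
check_share = stats["check_s"] / stats["lock_s"]

print(f"avg check_rowset_meta cost per rowset meta: {avg_check_ms:.1f} ms")
print(f"share of _meta_lock hold time spent checking: {check_share:.2%}")
```

So almost all of the time the lock is held goes to check_rowset_meta, at roughly 39 ms per rowset meta on average across 39973 rowset metas.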
Using perf, I also found that RocksDB spends a lot of time in Get:
