[enhancement](Load)allow load data to the other partitions when some partitions are restoring #39595
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
run buildall
TPC-H: Total hot run time: 38257 ms
TPC-DS: Total hot run time: 191544 ms
ClickBench: Total hot run time: 30.4 s
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
…partitions are restoring (#39915)
Cherry-picked from the master branch; the original PR, #39595, has already been merged.
Co-authored-by: shenshoucheng <shenshoucheng@jd.com>
…partitions are restoring (#39595)

If a broker load or stream load task runs against a table that is restoring data, the load task fails with an exception: "Table [xxx] is under restore" or "Table [xxx] is in restore process, can't load into it".

In most cases, however, a restore job affects only some partitions of the table, not all of them, so loads into the other partitions should still succeed. To achieve this, check the partition state first, before checking the OLAP table state.

ps: the restore status for partitions was set in this PR: #8245

## test case for this pr

### restore tbl's partition p202408

$ RESTORE SNAPSHOT db.tbl_p202408_test
FROM repo
ON(
    `tbl` PARTITION (p202408)
)
PROPERTIES(
    "backup_timestamp"="2024-08-22-20-32-37",
    "replication_num" = "1"
);

### check restore job state

$ SHOW RESTORE\G
*************************** 1. row ***************************
JobId: 21741
Label: tbl_p202408_test
Timestamp: 2024-08-22-20-32-37
State: DOWNLOADING
RestoreObjs: {
    "name": "tbl_p202408_test",
    "database": "db",
    "olap_table_list": [
        {
            "name": "tbl",
            "partition_names": ["p202408"]
        }
    ]

### load to partition p202408, fails with an exception

$ curl --location-trusted -u root:"" \
> -H "label:tbl_test_load_19" \
> -H "timeout:300" \
> -H "format: parquet" \
> -T data_for_p202408.parquet \
> -XPUT http://fe_ip:8030/api/db/tbl/_stream_load
{
    "TxnId": 3042,
    "Label": "tbl_test_load_19",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Fail",
    "Message": "[ANALYSIS_ERROR]TStatus: errCode = 2, detailMessage = Table [zt_order_detail_v3], Partition [p202408] is in restore process. Can not load into it.etc.",
    "NumberTotalRows": 682,
    "NumberLoadedRows": 682,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 82025,
    "LoadTimeMs": 48,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 7,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 38,
    "CommitAndPublishTimeMs": 0
}

### load to another partition (data for p202407), succeeds

$ curl --location-trusted -u root:"" \
> -H "timeout:300" \
> -H "format: json" \
> -H "read_json_by_line:true" \
> -T data_for_p202407.json \
> -XPUT http://fe_ip:8030/api/db/tbl/_stream_load
{
    "TxnId": 3043,
    "Label": "2f2dae38-a495-4c22-9492-419ea70b724e",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Success",
    "Message": "OK",
    "NumberTotalRows": 1,
    "NumberLoadedRows": 1,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 1128,
    "LoadTimeMs": 51,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 6,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 30,
    "CommitAndPublishTimeMs": 13
}

Co-authored-by: shenshoucheng <shenshoucheng@jd.com>
If a broker load or stream load task runs against a table that is restoring data, the load task fails with an exception.
Exception info: "Table [xxx] is under restore" or "Table [xxx] is in restore process, can't load into it".
In most cases, however, a restore job affects only some partitions of the table, not all of them, so loads into the other partitions should still succeed.
To achieve this, check the partition state first, before checking the OLAP table state.
ps: the restore status for partitions was set in this PR: #8245
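The partition-first check described above can be sketched roughly as follows. This is a minimal Python model, not the Doris FE code (which is Java): the names `State`, `Partition`, `Table`, `LoadRejected`, and `check_load_allowed` are hypothetical, and the fallback to the table-level check when no partition is marked as restoring is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    NORMAL = "NORMAL"
    RESTORE = "RESTORE"


class LoadRejected(Exception):
    """Raised when a load targets data that is being restored."""


@dataclass
class Partition:
    name: str
    state: State = State.NORMAL


@dataclass
class Table:
    name: str
    state: State = State.NORMAL
    partitions: dict = field(default_factory=dict)  # partition name -> Partition


def check_load_allowed(table, target_partitions):
    """Check partition state first; only then fall back to the table state."""
    any_partition_restoring = any(
        p.state is State.RESTORE for p in table.partitions.values()
    )
    if any_partition_restoring:
        # Partition-level information exists, so reject only the loads
        # that touch a partition which is actually being restored.
        for name in target_partitions:
            if table.partitions[name].state is State.RESTORE:
                raise LoadRejected(
                    f"Table [{table.name}], Partition [{name}] is in restore "
                    "process. Can not load into it."
                )
        return
    # Assumption: with no partition marked, the table-level state decides.
    if table.state is State.RESTORE:
        raise LoadRejected(f"Table [{table.name}] is under restore")


tbl = Table(
    "tbl",
    State.RESTORE,
    {
        "p202407": Partition("p202407"),
        "p202408": Partition("p202408", State.RESTORE),
    },
)
check_load_allowed(tbl, ["p202407"])  # passes: p202407 is not being restored
```

In this model a load into `p202408` raises `LoadRejected`, while a load into `p202407` passes even though the table as a whole is in the `RESTORE` state.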
## test case for this pr

### restore tbl's partition p202408

$ RESTORE SNAPSHOT db.tbl_p202408_test
FROM repo
ON(
    `tbl` PARTITION (p202408)
)
PROPERTIES(
    "backup_timestamp"="2024-08-22-20-32-37",
    "replication_num" = "1"
);

### check restore job state

$ SHOW RESTORE\G
*************************** 1. row ***************************
JobId: 21741
Label: tbl_p202408_test
Timestamp: 2024-08-22-20-32-37
State: DOWNLOADING
RestoreObjs: {
    "name": "tbl_p202408_test",
    "database": "db",
    "olap_table_list": [
        {
            "name": "tbl",
            "partition_names": ["p202408"]
        }
    ]
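To see which partitions a restore job touches, the `RestoreObjs` value can be parsed as JSON. The sketch below is only an illustration: the helper name `restoring_partitions` is hypothetical, and the sample string adds the closing brace that the output shown above does not include.

```python
import json


def restoring_partitions(restore_objs_json):
    """Map each table name to the set of partition names listed for restore."""
    objs = json.loads(restore_objs_json)
    result = {}
    for tbl in objs.get("olap_table_list", []):
        result.setdefault(tbl["name"], set()).update(tbl.get("partition_names", []))
    return result


# Sample taken from the SHOW RESTORE output, with a closing brace added
# (an assumption) so that it parses as a complete JSON object.
RESTORE_OBJS = """
{
    "name": "tbl_p202408_test",
    "database": "db",
    "olap_table_list": [
        {"name": "tbl", "partition_names": ["p202408"]}
    ]
}
"""

print(restoring_partitions(RESTORE_OBJS))  # {'tbl': {'p202408'}}
```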
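### load to partition p202408, fails with an exception

$ curl --location-trusted -u root:"" \
> -H "label:tbl_test_load_19" \
> -H "timeout:300" \
> -H "format: parquet" \
> -T data_for_p202408.parquet \
> -XPUT http://fe_ip:8030/api/db/tbl/_stream_load
{
    "TxnId": 3042,
    "Label": "tbl_test_load_19",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Fail",
    "Message": "[ANALYSIS_ERROR]TStatus: errCode = 2, detailMessage = Table [zt_order_detail_v3], Partition [p202408] is in restore process. Can not load into it.etc.",
    "NumberTotalRows": 682,
    "NumberLoadedRows": 682,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 82025,
    "LoadTimeMs": 48,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 7,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 38,
    "CommitAndPublishTimeMs": 0
}

### load to another partition (data for p202407), succeeds

$ curl --location-trusted -u root:"" \
> -H "timeout:300" \
> -H "format: json" \
> -H "read_json_by_line:true" \
> -T data_for_p202407.json \
> -XPUT http://fe_ip:8030/api/db/tbl/_stream_load
{
    "TxnId": 3043,
    "Label": "2f2dae38-a495-4c22-9492-419ea70b724e",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Success",
    "Message": "OK",
    "NumberTotalRows": 1,
    "NumberLoadedRows": 1,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 1128,
    "LoadTimeMs": 51,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 6,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 30,
    "CommitAndPublishTimeMs": 13
}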
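A client that issues these stream loads may want to tell a restore-related rejection apart from other failures. The following sketch assumes the response is a JSON object with `Status` and `Message` fields, as in this test case, and matches on the phrase "is in restore process" from the error message; the function name and the return labels are hypothetical, and the sample responses are abridged.

```python
import json


def classify_stream_load(response_text):
    """Return 'ok', 'blocked_by_restore', or 'failed' for a stream load response."""
    body = json.loads(response_text)
    if body.get("Status") == "Success":
        return "ok"
    # Phrase taken from the error message in this test case; matching on
    # message text is a heuristic, not a documented contract.
    if "is in restore process" in body.get("Message", ""):
        return "blocked_by_restore"
    return "failed"


# Abridged copies of the two responses from the test case.
FAIL_RESPONSE = (
    '{"TxnId": 3042, "Label": "tbl_test_load_19", "Status": "Fail", '
    '"Message": "[ANALYSIS_ERROR]TStatus: errCode = 2, detailMessage = '
    "Table [zt_order_detail_v3], Partition [p202408] is in restore process. "
    'Can not load into it.etc."}'
)
OK_RESPONSE = '{"TxnId": 3043, "Status": "Success", "Message": "OK"}'

print(classify_stream_load(FAIL_RESPONSE))  # blocked_by_restore
print(classify_stream_load(OK_RESPONSE))    # ok
```

A caller could retry a `blocked_by_restore` load after the restore job finishes, rather than treating it as a data error.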