Skip to content

Conversation

@w41ter
Copy link
Contributor

@w41ter w41ter commented Mar 28, 2025

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

From https://issues.apache.org/jira/browse/THRIFT-5492:

The remainingMessageSize_ is decremented as messages are parsed, for example when parsing a field in the message that is of type binary I see the code calling readStringBody in TBinaryProtocol.tcc:455 which then calls this->trans_->consume(size) which goes to TBufferTransports.h:124 and calls countConsumedMessageBytes which finally decrements remainingMessageSize_.

However, nothing ever resets remainingMessageSize_, so after it is decremented from 10010241024 down to zero it fails with an END_OF_FILE exception.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

From https://issues.apache.org/jira/browse/THRIFT-5492:

The remainingMessageSize_ is decremented as messages are parsed, for
example when parsing a field in the message that is of type binary
I see the code calling readStringBody in TBinaryProtocol.tcc:455 which
then calls this->trans_->consume(size) which goes to TBufferTransports.h:124
and calls countConsumedMessageBytes which finally decrements remainingMessageSize_.

However, nothing ever resets remainingMessageSize_, so after it is
decremented from 100*1024*1024 down to zero it fails with an END_OF_FILE exception.
@Thearas
Copy link
Contributor

Thearas commented Mar 28, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

Copy link
Contributor

@dataroaring dataroaring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Mar 28, 2025
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@w41ter
Copy link
Contributor Author

w41ter commented Mar 28, 2025

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 33950 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f0aca3f074343d49bef28f711fbdc86d1c14bada, data reload: false

------ Round 1 ----------------------------------
q1	25656	5160	5022	5022
q2	2061	290	177	177
q3	10397	1274	684	684
q4	10223	968	507	507
q5	7532	2394	2345	2345
q6	183	165	132	132
q7	903	738	617	617
q8	9319	1242	1102	1102
q9	6720	5058	5087	5058
q10	6804	2302	1906	1906
q11	481	288	267	267
q12	358	360	226	226
q13	17751	3654	3044	3044
q14	220	221	211	211
q15	527	465	479	465
q16	618	615	603	603
q17	560	864	349	349
q18	7519	7273	7149	7149
q19	1217	969	569	569
q20	323	334	196	196
q21	3910	2545	2350	2350
q22	990	1052	971	971
Total cold run time: 114272 ms
Total hot run time: 33950 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5135	5154	5103	5103
q2	228	319	226	226
q3	2102	2643	2255	2255
q4	1402	1774	1369	1369
q5	4389	4455	4427	4427
q6	222	170	126	126
q7	1937	1897	1710	1710
q8	2545	2486	2440	2440
q9	7139	7037	7103	7037
q10	2947	3142	2768	2768
q11	565	507	483	483
q12	682	760	567	567
q13	3449	3836	3226	3226
q14	287	296	272	272
q15	516	479	476	476
q16	648	683	657	657
q17	1128	1518	1414	1414
q18	7702	7391	7373	7373
q19	816	784	878	784
q20	2000	2008	1918	1918
q21	5195	4719	4598	4598
q22	1051	1002	966	966
Total cold run time: 52085 ms
Total hot run time: 50195 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 185737 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f0aca3f074343d49bef28f711fbdc86d1c14bada, data reload: false

query1	995	519	469	469
query2	6540	1921	1898	1898
query3	6802	214	213	213
query4	25871	23795	23118	23118
query5	4627	648	469	469
query6	306	214	193	193
query7	4607	487	268	268
query8	297	254	233	233
query9	8660	2540	2542	2540
query10	479	305	251	251
query11	15353	15001	14979	14979
query12	168	109	102	102
query13	1636	513	406	406
query14	9731	6084	6234	6084
query15	204	186	168	168
query16	7651	641	445	445
query17	1173	694	545	545
query18	1997	396	317	317
query19	189	184	153	153
query20	122	115	114	114
query21	208	117	102	102
query22	4217	4429	4220	4220
query23	33956	32767	32837	32767
query24	8393	2410	2380	2380
query25	532	446	397	397
query26	1230	275	144	144
query27	2912	502	324	324
query28	4529	2419	2382	2382
query29	715	607	432	432
query30	285	215	191	191
query31	921	838	778	778
query32	76	62	65	62
query33	549	375	324	324
query34	774	845	508	508
query35	803	803	775	775
query36	973	989	887	887
query37	114	107	78	78
query38	4302	4117	4166	4117
query39	1467	1383	1398	1383
query40	205	119	105	105
query41	55	52	60	52
query42	121	104	102	102
query43	496	499	474	474
query44	1310	791	803	791
query45	180	172	172	172
query46	835	1049	636	636
query47	1742	1794	1697	1697
query48	365	399	308	308
query49	767	499	431	431
query50	689	735	419	419
query51	4148	4217	4114	4114
query52	111	103	96	96
query53	212	246	177	177
query54	485	496	406	406
query55	88	78	79	78
query56	277	265	291	265
query57	1123	1163	1070	1070
query58	239	248	238	238
query59	2530	2705	2708	2705
query60	276	271	261	261
query61	134	128	122	122
query62	799	751	675	675
query63	219	186	183	183
query64	4283	999	730	730
query65	4310	4206	4251	4206
query66	1077	408	307	307
query67	15850	15605	15387	15387
query68	5170	888	502	502
query69	470	302	258	258
query70	1175	1057	1087	1057
query71	408	299	265	265
query72	5536	4695	4689	4689
query73	657	599	353	353
query74	9021	9003	8826	8826
query75	3232	3264	2704	2704
query76	3326	1178	750	750
query77	464	375	300	300
query78	9803	10079	9270	9270
query79	1607	809	556	556
query80	643	538	444	444
query81	486	260	220	220
query82	197	126	98	98
query83	182	245	161	161
query84	250	91	73	73
query85	737	350	305	305
query86	368	303	280	280
query87	4440	4518	4329	4329
query88	2845	2243	2231	2231
query89	389	310	275	275
query90	1902	208	197	197
query91	140	140	117	117
query92	73	60	59	59
query93	1288	1060	569	569
query94	672	400	301	301
query95	360	265	262	262
query96	484	571	272	272
query97	3132	3216	3123	3123
query98	231	213	212	212
query99	1340	1390	1256	1256
Total cold run time: 268318 ms
Total hot run time: 185737 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 31.18 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f0aca3f074343d49bef28f711fbdc86d1c14bada, data reload: false

query1	0.04	0.03	0.03
query2	0.12	0.10	0.10
query3	0.24	0.19	0.19
query4	1.59	0.18	0.19
query5	0.58	0.59	0.57
query6	1.21	0.71	0.72
query7	0.02	0.02	0.02
query8	0.04	0.04	0.03
query9	0.57	0.52	0.51
query10	0.57	0.58	0.57
query11	0.16	0.11	0.11
query12	0.14	0.11	0.11
query13	0.61	0.60	0.60
query14	2.78	2.70	2.68
query15	0.93	0.84	0.85
query16	0.38	0.38	0.39
query17	1.02	1.02	1.03
query18	0.21	0.19	0.19
query19	1.93	1.82	1.88
query20	0.01	0.01	0.01
query21	15.35	0.88	0.55
query22	0.75	1.23	0.70
query23	14.84	1.39	0.61
query24	6.73	2.30	0.93
query25	0.46	0.13	0.14
query26	0.69	0.16	0.14
query27	0.05	0.05	0.04
query28	9.59	0.87	0.42
query29	12.56	3.92	3.26
query30	0.24	0.09	0.06
query31	2.83	0.59	0.38
query32	3.22	0.54	0.50
query33	3.01	3.10	3.04
query34	15.61	5.05	4.49
query35	4.53	4.49	4.48
query36	0.66	0.48	0.48
query37	0.08	0.06	0.06
query38	0.05	0.04	0.03
query39	0.03	0.02	0.02
query40	0.17	0.13	0.12
query41	0.09	0.03	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 104.75 s
Total hot run time: 31.18 s

@hello-stephen
Copy link
Contributor

BE UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 50.90% (13625/26770)
Line Coverage 40.27% (118298/293764)
Region Coverage 38.95% (60099/154309)
Branch Coverage 33.85% (30220/89264)

@yiguolei yiguolei merged commit d27e76a into apache:master Mar 29, 2025
28 of 29 checks passed
yiguolei pushed a commit that referenced this pull request Mar 29, 2025
…ransport #49649 (#49656)

Cherry-picked from #49649

Co-authored-by: walter <maochuan@selectdb.com>
dataroaring pushed a commit that referenced this pull request Mar 30, 2025
…ransport #49649 (#49655)

Cherry-picked from #49649

Co-authored-by: walter <maochuan@selectdb.com>
@w41ter w41ter deleted the fix_thrift_eof branch March 31, 2025 01:51
@yiguolei yiguolei mentioned this pull request May 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/2.1.10-merged dev/3.0.5-merged reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants