@yujun777 yujun777 commented Aug 26, 2025

What problem does this PR solve?

Consider the following SQL:

create table t(a int,  b varchar(2));
insert into t values(1, 'abcde');

In Doris 2.0, the INSERT fails with the error 'Insert has filtered data in strict mode', because the value 'abcde' is longer than varchar(2).

MySQL and PostgreSQL also reject this statement for the same reason.

In Doris >= 2.1.0, the INSERT does not fail; it silently truncates 'abcde' to 'ab'.

Moreover, stream load fails on such data in every version (2.0, 2.1, and later).

So Doris >= 2.1 arguably should throw an exception when an inserted string is too long. However, 2.1 has been released for more than 1.5 years and we do not want to change its behaviour, so this PR adds a session variable, enable_insert_value_auto_cast. When an INSERT contains a string longer than the target column allows:

  • If enable_insert_value_auto_cast = true (the default), the overlong string is truncated and the insert succeeds.
  • If enable_insert_value_auto_cast = false and enable_insert_strict = true (the default), the insert throws the exception 'Insert has filtered data in strict mode'.
  • If enable_insert_value_auto_cast = false and enable_insert_strict = false, the row with the overlong string is filtered out and the other rows are inserted successfully.
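
The three cases above can be summarised as a small decision function. This is only an illustrative sketch in Java: the class `InsertLengthPolicy`, the enum `Outcome`, and the methods `decide` and `truncate` are hypothetical names, not Doris code; only the two session variable names and the three outcomes come from the description. The sketch also counts length in Java `char`s, which is an assumption made for brevity and may not match how Doris measures varchar length.

```java
public class InsertLengthPolicy {
    /** What happens to a row whose string value exceeds the column length. */
    enum Outcome { TRUNCATED, REJECTED, FILTERED }

    // Hypothetical helper mirroring the three bullets in the description.
    static Outcome decide(boolean enableInsertValueAutoCast, boolean enableInsertStrict) {
        if (enableInsertValueAutoCast) {
            // Default: the value is cut to fit and the insert succeeds.
            return Outcome.TRUNCATED;
        }
        // Auto cast disabled: strict mode decides between failing and filtering.
        return enableInsertStrict ? Outcome.REJECTED : Outcome.FILTERED;
    }

    // Assumption: length is counted in Java chars, purely for illustration.
    static String truncate(String value, int targetLength) {
        if (targetLength < 0 || value.length() <= targetLength) {
            return value; // nothing to cut
        }
        return value.substring(0, targetLength);
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true) + " -> " + truncate("abcde", 2));
        System.out.println(decide(false, true));
        System.out.println(decide(false, false));
    }
}
```

Keeping the decision separate from the truncation makes it easy to see that enable_insert_strict only matters once enable_insert_value_auto_cast is turned off.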

Related PR: #52802

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777 yujun777 marked this pull request as ready for review August 26, 2025 15:28
@yujun777 yujun777 force-pushed the fix-insert-value-strict-mode branch from 3699698 to f20d26e Compare August 27, 2025 03:57
@yujun777 yujun777 changed the title [draft](nereids) revert fix insert into values throw 'Insert has filtered data in strict mode' exception #52802 [feat](nereids) Add session variable enable_insert_value_auto_cast for insert value truncate long string Aug 27, 2025
@yujun777 yujun777 closed this Aug 27, 2025
@yujun777 yujun777 reopened this Aug 27, 2025
-   } else if (needTruncateStringWhenInsert && sourceLength > targetLength && targetLength >= 0) {
+   } else if (needTruncateStringWhenInsert
+           && sourceLength > targetLength && targetLength >= 0
+           && ConnectContext.get() != null
Contributor

Don't call ConnectContext.get() inside the loop.

Contributor Author

> Don't call ConnectContext.get() inside the loop.

Fixed.
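
The review point can be illustrated with a self-contained sketch. `SessionContext` below is a hypothetical stand-in for a thread-local lookup such as `ConnectContext.get()`, and `autoCastEnabled` and `truncateAll` are made-up names; this is not the code from the PR. It only shows the general pattern of doing the thread-local lookup once before the loop instead of once per element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HoistContextLookup {
    /** Hypothetical stand-in for a thread-local session context. */
    static final class SessionContext {
        private static final ThreadLocal<SessionContext> CURRENT = new ThreadLocal<>();
        final boolean autoCastEnabled;

        SessionContext(boolean autoCastEnabled) {
            this.autoCastEnabled = autoCastEnabled;
        }

        static SessionContext get() {
            return CURRENT.get();
        }

        static void set(SessionContext ctx) {
            CURRENT.set(ctx);
        }
    }

    static List<String> truncateAll(List<String> values, int targetLength) {
        // Look the context up once, outside the loop.
        SessionContext ctx = SessionContext.get();
        boolean autoCast = ctx != null && ctx.autoCastEnabled;

        List<String> result = new ArrayList<>(values.size());
        for (String v : values) {
            if (autoCast && targetLength >= 0 && v.length() > targetLength) {
                result.add(v.substring(0, targetLength)); // cut to fit
            } else {
                result.add(v); // leave untouched
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SessionContext.set(new SessionContext(true));
        System.out.println(truncateAll(Arrays.asList("abcde", "x"), 2));
    }
}
```

Besides saving repeated lookups, reading the flag once means every element in the loop is handled under the same setting.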

@yujun777
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 34238 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 25c870d7123162f988195ce1c675226ef5564d44, data reload: false

------ Round 1 ----------------------------------
q1	17650	5403	5140	5140
q2	1998	330	250	250
q3	10209	1287	737	737
q4	10229	1016	522	522
q5	7517	2409	2370	2370
q6	186	174	138	138
q7	937	775	625	625
q8	9355	1394	1120	1120
q9	6968	5179	5166	5166
q10	6956	2397	1974	1974
q11	495	300	281	281
q12	357	357	235	235
q13	17783	3687	3067	3067
q14	238	240	221	221
q15	574	510	489	489
q16	433	434	376	376
q17	600	879	356	356
q18	7724	7216	7027	7027
q19	1350	972	571	571
q20	348	356	229	229
q21	3890	2586	2344	2344
q22	1078	1047	1000	1000
Total cold run time: 106875 ms
Total hot run time: 34238 ms

----- Round 2, with runtime_filter_mode=off -----
q1	5274	5174	5237	5174
q2	250	348	239	239
q3	2184	2684	2333	2333
q4	1398	1822	1359	1359
q5	4243	4502	4580	4502
q6	259	178	138	138
q7	2041	1951	1848	1848
q8	2684	2803	2579	2579
q9	7423	7417	7242	7242
q10	3165	3383	2915	2915
q11	573	532	493	493
q12	698	786	632	632
q13	3721	4003	3353	3353
q14	297	311	305	305
q15	522	478	484	478
q16	464	510	460	460
q17	1195	1666	1392	1392
q18	7903	7823	7723	7723
q19	864	822	894	822
q20	2050	2098	1925	1925
q21	5111	4692	4368	4368
q22	1084	1063	1019	1019
Total cold run time: 53403 ms
Total hot run time: 51299 ms

@doris-robot

TPC-DS: Total hot run time: 187675 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 25c870d7123162f988195ce1c675226ef5564d44, data reload: false

query1	1107	404	417	404
query2	6566	1850	1744	1744
query3	6758	242	228	228
query4	26556	23698	23185	23185
query5	4400	645	524	524
query6	344	241	248	241
query7	4638	507	300	300
query8	310	261	244	244
query9	8629	2956	2951	2951
query10	491	354	299	299
query11	15807	15354	14741	14741
query12	181	124	123	123
query13	1678	593	447	447
query14	9603	5922	5819	5819
query15	230	197	184	184
query16	7686	683	493	493
query17	1258	785	660	660
query18	2065	451	348	348
query19	213	208	185	185
query20	136	132	127	127
query21	229	131	114	114
query22	4077	4308	3960	3960
query23	34190	33668	33689	33668
query24	8307	2403	2408	2403
query25	570	530	448	448
query26	1321	275	173	173
query27	2712	517	352	352
query28	4350	2299	2250	2250
query29	780	607	501	501
query30	289	228	213	213
query31	917	792	741	741
query32	90	84	84	84
query33	600	409	355	355
query34	806	846	523	523
query35	810	821	760	760
query36	975	1074	896	896
query37	131	115	95	95
query38	4124	4079	4073	4073
query39	1472	1412	1602	1412
query40	227	140	134	134
query41	70	70	64	64
query42	127	117	125	117
query43	527	535	487	487
query44	1351	868	874	868
query45	182	181	179	179
query46	887	1021	648	648
query47	1797	1794	1744	1744
query48	407	425	327	327
query49	759	519	426	426
query50	633	689	401	401
query51	4113	4250	4116	4116
query52	119	121	107	107
query53	248	276	200	200
query54	623	624	555	555
query55	99	96	102	96
query56	350	346	326	326
query57	1198	1228	1115	1115
query58	296	286	287	286
query59	2681	2711	2654	2654
query60	368	357	368	357
query61	172	161	167	161
query62	823	749	667	667
query63	235	201	198	198
query64	4427	1146	849	849
query65	4328	4225	4258	4225
query66	1103	437	365	365
query67	15627	15247	15116	15116
query68	8173	944	602	602
query69	569	347	313	313
query70	1248	1141	1186	1141
query71	535	356	322	322
query72	5973	4948	4872	4872
query73	694	574	367	367
query74	8961	9205	8968	8968
query75	3824	3093	2640	2640
query76	3729	1167	745	745
query77	820	411	341	341
query78	9648	9775	8832	8832
query79	2380	839	594	594
query80	643	578	526	526
query81	510	266	229	229
query82	460	148	116	116
query83	265	266	260	260
query84	266	112	94	94
query85	893	483	502	483
query86	395	323	342	323
query87	4267	4334	4219	4219
query88	3687	2258	2235	2235
query89	402	335	298	298
query90	1876	226	223	223
query91	162	166	133	133
query92	97	78	68	68
query93	1788	999	649	649
query94	706	440	332	332
query95	409	342	407	342
query96	489	591	277	277
query97	2661	2704	2569	2569
query98	248	219	214	214
query99	1656	1410	1292	1292
Total cold run time: 277617 ms
Total hot run time: 187675 ms

@doris-robot

ClickBench: Total hot run time: 32.72 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 25c870d7123162f988195ce1c675226ef5564d44, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.06	0.06
query3	0.26	0.08	0.08
query4	1.60	0.12	0.12
query5	0.44	0.42	0.43
query6	1.17	0.64	0.66
query7	0.04	0.03	0.03
query8	0.06	0.04	0.06
query9	0.61	0.54	0.53
query10	0.58	0.58	0.57
query11	0.16	0.11	0.12
query12	0.15	0.12	0.12
query13	0.62	0.63	0.62
query14	0.79	0.85	0.81
query15	0.88	0.85	0.89
query16	0.39	0.43	0.39
query17	1.03	1.03	1.08
query18	0.23	0.21	0.21
query19	1.96	1.90	1.82
query20	0.02	0.01	0.01
query21	15.40	0.95	0.58
query22	0.76	1.14	0.73
query23	14.94	1.41	0.61
query24	6.61	0.87	0.65
query25	0.50	0.14	0.16
query26	0.65	0.16	0.14
query27	0.06	0.06	0.06
query28	9.99	0.93	0.43
query29	12.56	3.89	3.25
query30	3.13	3.05	3.05
query31	2.82	0.59	0.39
query32	3.24	0.56	0.49
query33	3.00	3.11	3.15
query34	16.05	5.49	4.82
query35	4.93	4.91	4.89
query36	0.70	0.51	0.49
query37	0.11	0.08	0.07
query38	0.06	0.05	0.04
query39	0.04	0.03	0.03
query40	0.18	0.15	0.15
query41	0.08	0.03	0.03
query42	0.03	0.04	0.03
query43	0.04	0.04	0.03
Total cold run time: 107.01 s
Total hot run time: 32.72 s

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 33.33% (2/6) 🎉
Increment coverage report
Complete coverage report

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added approved Indicates a PR has been approved by one committer. reviewed labels Aug 27, 2025
@github-actions
Contributor

PR approved by anyone and no changes requested.

@yujun777
Contributor Author

run p0

@yujun777
Contributor Author

run external

@morrySnow morrySnow added the usercase Important user case type label label Aug 28, 2025
@924060929 924060929 merged commit dc57a05 into apache:master Aug 28, 2025
30 of 32 checks passed
github-actions bot pushed a commit that referenced this pull request Aug 28, 2025
…r insert value truncate long string (#55325)

yiguolei pushed a commit that referenced this pull request Aug 29, 2025
…auto_cast for insert value truncate long string #55325 (#55427)

cherry pick from #55325
zhiqiang-hhhh pushed a commit to zhiqiang-hhhh/doris that referenced this pull request Aug 29, 2025
…r insert value truncate long string (apache#55325)

morrySnow pushed a commit that referenced this pull request Sep 4, 2025
…auto_cast for insert value truncate long string #55325 (#55414)

Cherry-picked from #55325

Co-authored-by: yujun <yujun@selectdb.com>
dataroaring pushed a commit that referenced this pull request Sep 5, 2025
…auto_cast for insert value truncate long string #55325 (#55423)

cherry pick from #55325
Labels

approved Indicates a PR has been approved by one committer. dev/2.1.12-merged dev/3.0.9-merged dev/3.1.1-merged reviewed usercase Important user case type label
