@mrhhsg
Member

@mrhhsg mrhhsg commented Nov 8, 2024

What problem does this PR solve?

ColumnComplex::operator[](size_t n) always returns a Field whose type is String.

*** Query id: b73dc1a149a469b-ac1b822f8fe0a8a2 ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1731047590 (unix time) try "date -d @1731047590" if you are using GNU date ***
*** Current BE git commitID: 55e92da7e7 ***
*** SIGSEGV address not mapped to object (@0x58) received by PID 2528792 (TID 2533139 OR 0x7f6add64b700) from PID 88; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /mnt/disk1/doris/be/src/common/signal_handler.h:421
 1# 0x00007F6FEE12BB50 in /lib64/libc.so.6
 2# doris::BitmapValue::BitmapValue(doris::BitmapValue const&) at /mnt/disk1/doris/be/src/util/bitmap_value.h:850
 3# void std::vector<doris::BitmapValue, std::allocator<doris::BitmapValue> >::_M_realloc_insert<doris::BitmapValue const&>(__gnu_cxx::__normal_iterator<doris::BitmapValue*, std::vector<doris::BitmapValue, std::allocator<doris::BitmapValue> > >, doris::BitmapValue const&) in /mnt/disk1/doris/be/output/lib/doris_be
 4# doris::vectorized::ColumnNullable::insert(doris::vectorized::Field const&) at /mnt/disk1/doris/be/src/vec/columns/column_nullable.cpp:334
 5# doris::vectorized::AggregateFunctionMapAggData<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::add(doris::vectorized::Field const&, doris::vectorized::Field const&) in /mnt/disk1/doris/be/output/lib/doris_be
 6# doris::vectorized::AggregateFunctionMapAgg<doris::vectorized::AggregateFunctionMapAggData<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::deserialize_and_merge_from_column(char*, doris::vectorized::IColumn const&, doris::vectorized::Arena*) const at /mnt/disk1/doris/be/src/vec/aggregate_functions/aggregate_function_map.h:287
 7# doris::pipeline::AggSinkLocalState::_merge_without_key(doris::vectorized::Block*) at /mnt/disk1/doris/be/src/pipeline/exec/aggregation_sink_operator.cpp:389
 8# doris::pipeline::AggSinkLocalState::Executor<true, true>::execute(doris::pipeline::AggSinkLocalState*, doris::vectorized::Block*) at /mnt/disk1/doris/be/src/pipeline/exec/aggregation_sink_operator.h:73
 9# doris::pipeline::AggSinkOperatorX::sink(doris::RuntimeState*, doris::vectorized::Block*, bool) at /mnt/disk1/doris/be/src/pipeline/exec/aggregation_sink_operator.cpp:744
10# doris::pipeline::PipelineXTask::execute(bool*) at /mnt/disk1/doris/be/src/pipeline/pipeline_x/pipeline_x_task.cpp:332
11# doris::pipeline::TaskScheduler::_do_work(unsigned long) at /mnt/disk1/doris/be/src/pipeline/task_scheduler.cpp:347
12# doris::ThreadPool::dispatch_thread() in /mnt/disk1/doris/be/output/lib/doris_be
13# doris::Thread::supervise_thread(void*) at /mnt/disk1/doris/be/src/util/thread.cpp:499
14# start_thread in /lib64/libpthread.so.0
15# __clone in /lib64/libc.so.6
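
A possible reading of the trace above: ColumnNullable::insert receives a Field whose type tag says String although the nested column stores BitmapValue, so the BitmapValue copy constructor ends up reading storage that does not hold a bitmap. The sketch below illustrates that failure mode and the fix, with std::variant as a stand-in for Field. ToyBitmap, ToyField, get_buggy, get_fixed and insert_bitmap are hypothetical names for illustration only, not Doris APIs, and the real Field implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT the Doris types.
struct ToyBitmap {
    std::vector<uint64_t> values;
};

// A tagged value: the active alternative plays the role of the Field type tag.
using ToyField = std::variant<std::string, ToyBitmap>;

// Buggy accessor: always returns a string-tagged field, regardless of the
// column's element type (here it stores only the element count as text).
inline ToyField get_buggy(const std::vector<ToyBitmap>& column, size_t n) {
    return ToyField(std::to_string(column[n].values.size()));
}

// Fixed accessor: the returned field carries the column's real element type.
inline ToyField get_fixed(const std::vector<ToyBitmap>& column, size_t n) {
    return ToyField(column[n]);
}

// A consumer that expects a bitmap. std::get_if checks the tag first; a
// consumer that skipped this check would read the storage as the wrong type.
inline bool insert_bitmap(std::vector<ToyBitmap>& dst, const ToyField& field) {
    const ToyBitmap* bitmap = std::get_if<ToyBitmap>(&field);
    if (bitmap == nullptr) {
        return false; // tag mismatch: refuse instead of copying garbage
    }
    dst.push_back(*bitmap);
    return true;
}
```

With get_buggy the consumer sees a string-tagged field and rejects it; with get_fixed the tag matches and the copy succeeds. In this toy model the mismatch is caught by std::get_if, whereas the crash in the trace suggests the real code path had no equivalent check.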

Release note

None

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@mrhhsg
Member Author

mrhhsg commented Nov 8, 2024

run buildall

Contributor

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.90% (9865/26027)
Line Coverage: 29.08% (82235/282767)
Region Coverage: 28.23% (42324/149921)
Branch Coverage: 24.78% (21430/86482)
Coverage Report: http://coverage.selectdb-in.cc/coverage/8957f6e415d2f80af7fb24f0bc6ff549b66f5e5b_8957f6e415d2f80af7fb24f0bc6ff549b66f5e5b/report/index.html

@mrhhsg
Member Author

mrhhsg commented Nov 9, 2024

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.92% (9869/26027)
Line Coverage: 29.09% (82271/282774)
Region Coverage: 28.25% (42354/149928)
Branch Coverage: 24.79% (21441/86488)
Coverage Report: http://coverage.selectdb-in.cc/coverage/4c8e095ddaa8c9e88010235d1f32c23290914c74_4c8e095ddaa8c9e88010235d1f32c23290914c74/report/index.html

yiguolei previously approved these changes Nov 11, 2024
github-actions bot added the "approved" label (indicates a PR has been approved by one committer) on Nov 11, 2024
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

data.resize(row_size);
check_serialize_and_deserialize(column);

check_field_type(column);
Contributor

Don't check only the return type.

  1. Build a bitmap value and insert it into the column.
  2. Get the field value from the column and check the value.
  3. Check the type.
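
The three steps above could be sketched roughly as follows. This is a toy model under stated assumptions, not the test that was added to the PR: ToyBitmapValue, ToyFieldValue, ToyBitmapColumn and check_field_value_and_type are hypothetical names, std::set stands in for a bitmap, and std::variant stands in for Field. Because a variant must be inspected by type before its value can be read, the type check runs before the value comparison here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for illustration; not the Doris classes.
using ToyBitmapValue = std::set<uint64_t>;
using ToyFieldValue = std::variant<std::string, ToyBitmapValue>;

class ToyBitmapColumn {
public:
    void insert_value(const ToyBitmapValue& value) { data_.push_back(value); }

    // Returns the element wrapped in a field whose tag matches the element type.
    ToyFieldValue operator[](size_t n) const { return ToyFieldValue(data_[n]); }

    size_t size() const { return data_.size(); }

private:
    std::vector<ToyBitmapValue> data_;
};

// Follows the three review steps; returns true only if every check passes.
inline bool check_field_value_and_type() {
    // Step 1: build a bitmap value and insert it into the column.
    ToyBitmapValue bitmap{1, 5, 1000000};
    ToyBitmapColumn column;
    column.insert_value(bitmap);

    // Step 2 (first half): read the field back from the column.
    ToyFieldValue field = column[0];

    // Step 3: the field must be tagged as a bitmap, not as a string.
    if (!std::holds_alternative<ToyBitmapValue>(field)) {
        return false;
    }

    // Step 2 (second half): the stored value must equal what was inserted.
    return std::get<ToyBitmapValue>(field) == bitmap;
}
```

Checking the value as well as the tag catches an accessor that reports the right type but copies the wrong element.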

@mrhhsg
Member Author

mrhhsg commented Nov 11, 2024

run buildall

github-actions bot removed the "approved" label (indicates a PR has been approved by one committer) on Nov 11, 2024
@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.91% (9867/26028)
Line Coverage: 29.10% (82284/282785)
Region Coverage: 28.24% (42342/149939)
Branch Coverage: 24.79% (21443/86494)
Coverage Report: http://coverage.selectdb-in.cc/coverage/46fcd90c798b8430593762ff474379161e65a049_46fcd90c798b8430593762ff474379161e65a049/report/index.html

@mrhhsg
Member Author

mrhhsg commented Nov 11, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 52205 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 14bfe69dea07dbd5616941f7946ecd2cc398e5b1, data reload: false

------ Round 1 ----------------------------------
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	17735	7490	7343	7343
q2	2588	1317	1293	1293
q3	10023	1175	1192	1175
q4	10265	881	866	866
q5	7600	3026	2973	2973
q6	238	144	143	143
q7	1046	631	621	621
q8	9391	2372	2380	2372
q9	12405	12314	12460	12314
q10	7095	2411	2461	2411
q11	464	262	258	258
q12	408	211	217	211
q13	17771	3052	3061	3052
q14	243	212	213	212
q15	571	515	515	515
q16	635	581	596	581
q17	988	583	584	583
q18	7383	6763	6772	6763
q19	1341	1025	1070	1025
q20	3059	2855	2904	2855
q21	3970	3319	3270	3270
q22	1426	1378	1369	1369
Total cold run time: 116645 ms
Total hot run time: 52205 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold run (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot run (ms)
q1	7553	7450	7544	7450
q2	369	240	241	240
q3	3174	2986	2999	2986
q4	2080	1829	1827	1827
q5	5667	5615	5675	5615
q6	226	139	137	137
q7	2237	1870	1846	1846
q8	3406	3580	3594	3580
q9	14513	14337	14311	14311
q10	3656	3562	3535	3535
q11	605	500	489	489
q12	846	613	654	613
q13	14147	3230	3282	3230
q14	290	258	263	258
q15	589	550	537	537
q16	671	655	624	624
q17	1852	1645	1606	1606
q18	8225	7914	7577	7577
q19	1700	1600	1571	1571
q20	2170	2029	1944	1944
q21	5538	5265	5244	5244
q22	673	545	554	545
Total cold run time: 80187 ms
Total hot run time: 65765 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.75% (9815/26003)
Line Coverage: 29.04% (82090/282677)
Region Coverage: 28.16% (42203/149895)
Branch Coverage: 24.74% (21390/86468)
Coverage Report: http://coverage.selectdb-in.cc/coverage/14bfe69dea07dbd5616941f7946ecd2cc398e5b1_14bfe69dea07dbd5616941f7946ecd2cc398e5b1/report/index.html

// not a valid jsonb type, insert as string
const auto* str = static_cast<const JsonbStringVal*>(arg);
field.assign_string(str->getBlob(), str->getBlobLen());
field = String(str->getBlob(), str->getBlobLen());
Contributor

Field(String(str->getBlob(), str->getBlobLen()));

ColumnsWithTypeAndName arguments {
{source, json_type, ""},
{type_string->create_column_const(1, Field(jsonpath.data(), jsonpath.size())),
{type_string->create_column_const(1, String(jsonpath.data(), jsonpath.size())),
Contributor

Field(String(jsonpath.data(), jsonpath.size()))

@yiguolei
Contributor

run buildall

@mrhhsg
Member Author

mrhhsg commented Nov 11, 2024

run buildall

@github-actions
Contributor

PR approved by at least one committer and no changes requested.

github-actions bot added the "approved" label (indicates a PR has been approved by one committer) on Nov 12, 2024
@yiguolei yiguolei added the p0_c label Nov 12, 2024
@yiguolei yiguolei merged commit 55dbedf into apache:master Nov 12, 2024
github-actions bot pushed a commit that referenced this pull request Nov 12, 2024
…43515)

py023 pushed a commit to py023/doris that referenced this pull request Nov 13, 2024
…pache#43515)

@mrhhsg mrhhsg deleted the fix_complex branch November 13, 2024 07:11
yiguolei pushed a commit that referenced this pull request Nov 13, 2024
…umnComplex (#43718)

Cherry-picked from #43515

Co-authored-by: Jerry Hu <mrhhsg@gmail.com>

Labels

approved (indicates a PR has been approved by one committer), dev/2.1.8-merged, dev/3.0.3-merged, p0_c, reviewed

4 participants