[improvement](pipeline) task group scan entity #19924 #27039

wangbo · 2023-11-15T09:02:29Z

#19924

…hema (apache#25844)

The FE config `infodb_support_ext_catalog` defaults to false, which means that tables in the `information_schema` database do not return information about external catalogs. If Doris has many external catalogs with lots of databases and tables (for example, when running the p0 regression tests), querying the `information_schema` database takes a long time and may cause an RPC timeout. There is also an unresolved issue where a Thrift RPC timeout may crash the BE in ASAN mode. To avoid this issue (it is not fixed yet), this PR mainly makes the following changes (items 1 and 2 apply when `infodb_support_ext_catalog` is false):

1. Querying external catalog info in the `information_schema` database is not allowed, for example `show database like "external_catalog"` or `show tables like "xxx"`.
2. `select * from information_schema.tbl` does not contain external catalogs' info.
3. For the external p0 regression test pipeline, `infodb_support_ext_catalog` is set to true so that the tests related to external catalogs can run.
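The gating described above can be pictured with a minimal Python sketch. This is not Doris code: `visible_catalogs` and the `INTERNAL_CATALOG` constant are hypothetical names introduced only for illustration, and the sketch assumes the built-in catalog is the one named `internal`.

```python
# Hypothetical illustration of the flag's effect; not the FE implementation.
INTERNAL_CATALOG = "internal"  # assumed name of the built-in catalog


def visible_catalogs(catalogs, infodb_support_ext_catalog=False):
    """Return the catalogs whose metadata information_schema would expose."""
    if infodb_support_ext_catalog:
        # Flag on: external catalogs are included (slower with many catalogs).
        return list(catalogs)
    # Flag off (the default): only the internal catalog is reported.
    return [c for c in catalogs if c == INTERNAL_CATALOG]
```

With the flag left at its default, a call such as `visible_catalogs(["internal", "hive_ext"])` keeps only the internal catalog, which mirrors items 1 and 2.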

…pache#25938) (apache#26222)

We put a bound expr into the unbound group by list by mistake. This leads to binding some expressions twice. Since binding is not idempotent, the exception below is thrown for this sql:

```sql
select k5 / k5 as nu, sum(k1) from test group by nu order by nu nulls first
```

```
Caused by: org.apache.doris.nereids.exceptions.AnalysisException: Input slot(s) not in child's output: k5#5 in plan:
LogicalProject[176] ( distinct=false, projects=[(cast(k5#5 as DECIMALV3(16, 10)) / k5#5) AS `nu`#14, sum(k1)#15], excepts=[] ),
 child output is: [nu#16, sum(k1)#15]
plan tree:
LogicalProject[176] ( distinct=false, projects=[(cast(k5#5 as DECIMALV3(16, 10)) / k5#5) AS `nu`#14, sum(k1)#15], excepts=[] )
+--LogicalAggregate[168] ( groupByExpr=[nu#16], outputExpr=[nu#16, sum(k1#1) AS `sum(k1)`#15], hasRepeat=false )
   +--LogicalProject[156] ( distinct=false, projects=[k1#1, (cast(k5#5 as DECIMALV3(16, 10)) / k5#5) AS `nu`#16], excepts=[] )
      +--LogicalOlapScan ( qualified=default_cluster:regression_test_nereids_syntax_p0.test, indexName=test, selectedIndexId=503229, preAgg=OFF, Aggregate function sum(k1) contains key column k1. )

	at org.apache.doris.nereids.rules.analysis.CheckAfterRewrite.checkAllSlotReferenceFromChildren(CheckAfterRewrite.java:108) ~[classes/:?]
```

…apache#26890) backport apache#26435

Improve the accuracy of sample stats collection. For non-distribution columns, estimate NDV as `n*d / (n - f1 + f1*n/N)`, where `n` is the number of sampled rows, `N` is the total number of rows, `f1` is the number of distinct values that occur exactly once in the sample, and `d` is the total number of distinct values in the sample. For distribution columns, use `ndv(n) * fraction of tablets sampled` for NDV. For very large tablets, use a limit to control the total number of rows scanned (for non-key columns only, because key columns are sorted and a limit would make the sample inaccurate).
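As a worked illustration of the non-distribution-column formula, here is a small Python sketch. It is not the Doris implementation: the function names `sample_stats` and `estimate_ndv` are made up for this example, and the distribution-column and limit handling are left out.

```python
from collections import Counter


def sample_stats(sample):
    """Return (n, d, f1) for a list of sampled values."""
    counts = Counter(sample)
    n = len(sample)                                  # rows in the sample
    d = len(counts)                                  # distinct values in the sample
    f1 = sum(1 for c in counts.values() if c == 1)   # values seen exactly once
    return n, d, f1


def estimate_ndv(n, d, f1, total_rows):
    """Apply n*d / (n - f1 + f1*n/N) from the commit message (N = total_rows)."""
    if n <= 0 or total_rows <= 0:
        return 0.0
    # The denominator is positive whenever n > 0, because f1 <= n.
    return n * d / (n - f1 + f1 * n / total_rows)
```

For the sample `[1, 1, 2, 3, 3, 4]` drawn from 60 rows, `n = 6`, `d = 4`, `f1 = 2`, so the estimate is `24 / 4.2`, roughly 5.71. When the sample covers the whole table (`n == N`), the formula reduces to `d`.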

BePPPower and others added 30 commits October 31, 2023 15:43

[fix](Export) fix timeout property not work for export job (apache#26032

0ff645d

) Refer: apache#25913

[branch-2.0][fix](outfile) Fix parquet writer wrong result issue (apa…

79608f9

…che#26027)

[branch-2.0](regression-test) fix regression test (apache#26144)

aac9002

Since the default column separator for tvf reading csv format has changed, these cases need to be fixed.

[Enhance](regression)add external_docker tag for some external case (a…

fecc605

…pache#26106) add external_docker tag for some external case

[fix](report handler) fix report handler lock leak (apache#25853)

532e287

[enhancement](CSV-reader) enhance err log for csv reading containing …

0e8f46d

…enclose or escape (apache#25816)

[improve](regression test) Add case for if function (apache#25780)

2127dd2

[fix](stats) analyze specific column only if indicate column in analy…

244383b

…ze stmt (apache#25660)

[fix](planner) Fix select table tablet not effective (apache#25378)

1f5962b

Fix tablet selection in a query not taking effect for tables distributed by random. If the tablet ID specified in the query does not exist in a partition, skip scanning that partition.
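The skip rule can be sketched in a few lines of Python. This is an illustration only, not the planner code: `select_tablets` and the dictionary layout of `partitions` are assumptions made for the example.

```python
def select_tablets(partitions, requested_tablet_ids):
    """Map each partition to the tablets that should be scanned.

    partitions: dict of partition name -> iterable of tablet IDs (hypothetical layout).
    requested_tablet_ids: tablet IDs named in the query; empty means no restriction.
    """
    requested = set(requested_tablet_ids)
    plan = {}
    for name, tablet_ids in partitions.items():
        if not requested:
            # No tablet restriction in the query: scan every tablet.
            plan[name] = sorted(tablet_ids)
            continue
        hits = requested & set(tablet_ids)
        if not hits:
            # None of the requested tablets live here: skip this partition.
            continue
        plan[name] = sorted(hits)
    return plan
```

For example, with partitions `{"p1": {10001, 10002}, "p2": {10003}}` and a query that names tablet `10003`, only `p2` is scanned.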

[fix](meta) add sync new image timeout (apache#25768)

dbe002f

Add timeout config for sync

[test](sync)add sync after insert in test case (apache#25911)

e649092

[chore](compaction) Print rowset size when compaction finishes succes…

29515a6

…sfully (apache#25891)

[improvement](function) improve date_trunc function performance when …

4e78a3d

…timeunit is const (apache#25824) PR apache#22602 already added a check function, so only date_trunc(column, const) is supported. The second argument must therefore be a const literal, and there is no need to check the time unit for every row.
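The idea of resolving a constant time unit once instead of per row can be sketched in Python. This is not the BE implementation: `date_trunc_column` and the `_TRUNCATORS` table are hypothetical, and the unit list is only an illustrative subset.

```python
from datetime import datetime

# Hypothetical lookup table: one truncation function per supported unit.
_TRUNCATORS = {
    "year": lambda t: t.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0),
    "month": lambda t: t.replace(day=1, hour=0, minute=0, second=0, microsecond=0),
    "day": lambda t: t.replace(hour=0, minute=0, second=0, microsecond=0),
    "hour": lambda t: t.replace(minute=0, second=0, microsecond=0),
    "minute": lambda t: t.replace(second=0, microsecond=0),
    "second": lambda t: t.replace(microsecond=0),
}


def date_trunc_column(values, unit):
    """Truncate every datetime in `values` to the constant `unit`."""
    # The unit is constant for the whole column, so validate and resolve it
    # once here instead of inside the per-row loop.
    try:
        truncate = _TRUNCATORS[unit.lower()]
    except KeyError:
        raise ValueError(f"unsupported time unit: {unit!r}") from None
    return [truncate(v) for v in values]
```

Because the lookup happens before the loop, the per-row work is a single function call, which is the saving the commit message describes.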

[branch-2.0](cherry-pick) use correct dataset for unique_with_mow_p2 (a…

c229b99

…pache#25653) (apache#26216)

[fix](trigger) fix pipeline bug that does not trigger Doris_Performan…

e72309e

…ce_Clickbench_ClickbenchNew apache#25827 (apache#26219)

[Fix](regression)P0 case fail in inside pipeline on branch-2.0 (apach…

8ffdfb6

…e#26197)

Update sidebars.json (apache#26201)

c87724c

[Fix](predicate pushdown) Common expression not acting on any slot sh…

104721d

…ould not be pushed down (apache#25901) (apache#26215)

[feature](mtmv)disable mtmv for 2.0 (apache#26176)

683b4ef

[Fix](inverted index) reorder ConjunctionQuery deconstruct order apac…

1acb37e

…he#25972 (apache#26211)

[branch-2.0][catalog](paimon)paimon catalog supports complex types ap…

72c925b

…ache#25364 apache#25834 (apache#26190)

[chore](case) branch-2.0, exclude some unstable cases and include som…

d424754

…e cases (apache#26214)

[log](compaction) add more stats for compaction log (apache#24984) (a…

b27e07a

…pache#26217)

[branch-2.0][fix](multi-table) fix unknown source slot descriptor whe…

a35de81

…n load multi table (apache#25762) (apache#26223)

[bugfix](nereids) prune partition bug in pattern ColA <> ColB apache#…

81b5f74

…25769 (apache#26224)

[fix](spill) disable spill of sort and agg for now to avoid disk ove…

213e6b0

…rflow (apache#26209) (apache#26228)

[2.0-branch][Fix](Plan)StreamLoad cannot be parsed correctly when it …

a321f63

…contains complex where conditions apache#23874 (apache#26212)

(pick-2.0-26031)[Fix](MySqlLoad) Fix meaningless thread creation ever…

0242f87

…y time checkpoint mysql load (apache#26031) (apache#26139)

ByteYue and others added 29 commits November 12, 2023 21:55

[chore](fs) Don't print the stack for file system and it's derived cl…

3e543ff

…ass apache#26814 (apache#26838)

[compile](gcc) fix gcc compile error apache#26863

e1574ed

[test](jdbc) pick some jdbc test from branch master (apache#26860)

2f74832

[pipeline](exec) disable shared scan in default and disable shared sc…

d25f02c

…an in limit with where scan (apache#25952) (apache#26815)

[regression](partial update) Add cases when the deleted rows have non…

925ff56

… nullable columns without default value apache#26776 (apache#26848)

[feature](fe) Add coverage tool for FE UT (apache#26203) (apache#26857)

87862f9

[fix](map) the implementation of ColumnMap::replicate was incorrect (a…

4d9a8f8

…pache#26647) (apache#26868)

[fix](broker load) pass loadToSingleTablet to olapTableSink (apache#2…

c23b674

…6680) (apache#26869)

[regression-test](framework) Support running tests multiple times and…

be2dd64

… reporting correctly to TeamCity (apache#26606) (apache#26871)

[refactor](stats) refactor collection logic and opt some config apach…

5ddabea

…e#26163 (apache#26858) picked from apache#26163

[bug](user login)fix PASSWORD_LOCK_TIME setting UNBOUNDED does not ta…

b536915

…ke effect apache#26585 (apache#26859)

[fix](partial update) Fix NPE when the query statement of an update s…

d06c13f

…tatement is a point query in OriginPlanner apache#26881 (apache#26900)

[bug](function) add signature for percentile function (apache#26867) (a…

644f5f2

…pache#26926) Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>

enable pipeline and nereids in test-pipeline (apache#26918)

255c62f

[Fix](Planner) fix varchar does not show real length apache#25171 (ap…

3b5d9e7

…ache#26850)

[improvement](statistics)Multi bucket columns using DUJ1 to collect ndv

9f44fa1

apache#26950 (apache#26976) backport apache#26950

[fix](statistics)Fix external table show column stats type bug apache…

160a515

…#26910 (apache#26921) backport: apache#26910

[minor](stats) rename stats related session variable name apache#26936 (

f5974e2

apache#26928)

[nereids](datetime) fix wrong result type of datetime add with interv…

0d9f486

…al as first arg (apache#26957) (apache#26987)

[fix](Nereids) column pruning under union broken unexpectedly (apache…

be34030

…#26884) (apache#26985)

[fix](catalog) Fix ClickHouse DateTime64 precision parsing (apache#26980

0ca79c2

)

[opt](MergeIO) use equivalent merge size to measure merge effectivene…

2ad0bb4

…ss (apache#26741) (apache#26923) backport apache#26741

add defensive code in runtime predicate to avoid crash due to column …

299d61b

…not in tablet schema apache#26990 (apache#26991)

[fix](stats) fix auto collector always create sample job no matter th…

b107544

…e table size apache#26968 (apache#26972)

[Enhance](regression)enhance docker network by add docker network sub…

3ba57eb

…net (apache#26872)

[fix](case) regression-test/suites/show_p0/test_show_statistic_proc.g…

7284f8e

…roovy (apache#26925) Co-authored-by: stephen <hello-stephen@qq.com>

[fix](auth) fix overwrite logic of user with domain (apache#27003)

9450a59

backport apache#27002

[improvement](pipeline) task group scan entity (apache#19924)

f30b7e9

wangbo closed this Nov 15, 2023

20 participants