Skip to content

Conversation

@eldenmoon
Copy link
Member

@eldenmoon eldenmoon commented Feb 16, 2023

Proposed changes

Dynamic Block consists of two parts, dynamic part of columns and static part of columns
static   dynamic
| ----- | ------- |
the static ones are original _tablet_schame columns
the dynamic ones are auto generated and extended from file scan.
**We should only consisder to use Block info to generte columns when it's a dynamic table load procudure.**
And seperate the static ones and dynamic ones

Issue Number: close #xxx

Problem summary

Describe your changes.

Checklist(Required)

  • Does it affect the original behavior
  • Has unit tests been added
  • Has document been added or modified
  • Does it need to update dependencies
  • Is this PR support rollback (If NO, please explain WHY)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@eldenmoon eldenmoon force-pushed the fix-dynamic-block-segw branch from c78d682 to cec7f17 Compare February 16, 2023 05:34
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

1 similar comment
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@yiguolei
Copy link
Contributor

related with: https://github.com/apache/doris/pull/16795/files
But this pr maybe more elegant.

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

```
Dynamic Block consists of two parts, dynamic part of columns and static part of columns
static   dynamic
| ----- | ------- |
the static ones are original _tablet_schame columns
the dynamic ones are auto generated and extended from file scan.
```
**We should only consisder to use Block info to generte columns when it's a dynamic table load procudure.**
And seperate the static ones and dynamic ones
@eldenmoon eldenmoon force-pushed the fix-dynamic-block-segw branch from 182d38b to ac82a8e Compare February 16, 2023 09:01
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Copy link
Contributor

@yiguolei yiguolei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yiguolei yiguolei merged commit 5dfd6d2 into apache:master Feb 17, 2023
@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Feb 17, 2023
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@luozenglin
Copy link
Contributor

fix #16658

yagagagaga pushed a commit to yagagagaga/doris that referenced this pull request Mar 9, 2023
apache#16816)

* [improve](dynamic table) refine SegmentWriter columns writer generate

```
Dynamic Block consists of two parts, dynamic part of columns and static part of columns
static   dynamic
| ----- | ------- |
the static ones are original _tablet_schame columns
the dynamic ones are auto generated and extended from file scan.
```
**We should only consisder to use Block info to generte columns when it's a dynamic table load procudure.**
And seperate the static ones and dynamic ones

* test
luwei16 pushed a commit to luwei16/Doris that referenced this pull request Apr 7, 2023
…nerate (apache#16816)" (apache#1492)

* [improve](dynamic table) refine SegmentWriter columns writer generate

```
Dynamic Block consists of two parts, dynamic part of columns and static part of columns
static   dynamic
| ----- | ------- |
the static ones are original _tablet_schame columns
the dynamic ones are auto generated and extended from file scan.
```
**We should only consisder to use Block info to generte columns when it's a dynamic table load procudure.**
And seperate the static ones and dynamic ones
luwei16 pushed a commit to luwei16/Doris that referenced this pull request Apr 7, 2023
* [WIP](dynamic-table) support dynamic schema table (apache#16335)

Issue Number: close apache#16351

Dynamic schema table is a special type of table, it's schema change with loading procedure.Now we implemented this feature mainly for semi-structure data such as JSON, since JSON is schema self-described we could extract schema info from the original documents and inference the final type infomation.This speical table could reduce manual schema change operation and easily import semi-structure data and extends it's schema automatically.

* [improve](dynamic-table) change `addColumns` RPC interface fields from `required` to `optional` and and config doc (apache#16632)

* [doc](dynamic-table) Add docs for dynamic-table (apache#16669)

* [improve](dynamic table) refine SegmentWriter columns writer generate (apache#16816)

* [improve](dynamic table) refine SegmentWriter columns writer generate

```
Dynamic Block consists of two parts, dynamic part of columns and static part of columns
static   dynamic
| ----- | ------- |
the static ones are original _tablet_schame columns
the dynamic ones are auto generated and extended from file scan.
```
**We should only consisder to use Block info to generte columns when it's a dynamic table load procudure.**
And seperate the static ones and dynamic ones

* test

* [typo](docs)fix dynamic Table version label (apache#16895)

* [Feature](Dynamic schema table) step1 support schema change expression (apache#17494)

1. introduce a new type `VARIANT` to encapsulate dynamic generated columns for hidding the detail of types and names of newly generated columns
2. introduce a new expression `SchemaChangeExpr` for doing schema change for extensibility

* Fix compile

* [Bug](dynamic-table) Fix column alignment logic and support filtering null values when slot is not null

Before this PR when encountering null values with some columns which is specified as `NOT NULL`, null values will not be filtered,thi behavior does not match with the original load behavior.
Second column alignment logic has bug :

```
template <typename ColumnInserterFn>
void align_variant_by_name_and_type(ColumnObject& dst, const ColumnObject& src, size_t row_cnt,
                                    ColumnInserterFn inserter) {
    CHECK(dst.is_finalized() && src.is_finalized());
    // Use rows() here instead of size(), since size() will check_consistency
    // but we could not check_consistency since num_rows will be upgraded even
    // if src and dst is empty, we just increase the num_rows of dst and fill
    // num_rows of default values when meet new data
    size_t num_rows = dst.rows();
```

---------

Co-authored-by: jiafeng.zhang <zhangjf1@gmail.com>
swjtu-zhanglei pushed a commit to swjtu-zhanglei/incubator-doris that referenced this pull request Jul 25, 2023
…nerate (apache#16816)" (apache#1492)

* [improve](dynamic table) refine SegmentWriter columns writer generate

```
Dynamic Block consists of two parts, dynamic part of columns and static part of columns
static   dynamic
| ----- | ------- |
the static ones are original _tablet_schame columns
the dynamic ones are auto generated and extended from file scan.
```
**We should only consisder to use Block info to generte columns when it's a dynamic table load procudure.**
And seperate the static ones and dynamic ones
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. kind/test reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants