Conversation

@spaces-X spaces-X commented Apr 18, 2022

Proposed changes

Problem Summary:

When partition_id exceeds the integer range, Spark Load fails with a java.lang.NumberFormatException.
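
For illustration only, here is a minimal sketch of this failure mode. It assumes the partition id arrives as a decimal string and was being parsed as an `int`. The class `PartitionIdParsing` and the helper `parsePartitionId` are hypothetical names, not code from this PR.

```java
public class PartitionIdParsing {
    // Hypothetical helper: parse the id as a 64-bit value so that ids
    // above Integer.MAX_VALUE (2147483647) are accepted.
    static long parsePartitionId(String raw) {
        return Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        // One past Integer.MAX_VALUE, so it no longer fits in an int.
        String raw = "2147483648";

        try {
            // Parsing as int rejects the value with NumberFormatException.
            Integer.parseInt(raw);
            System.out.println("int parse unexpectedly succeeded");
        } catch (NumberFormatException e) {
            System.out.println("int parse failed: " + e.getMessage());
        }

        // Parsing as long handles the same input.
        System.out.println("long parse ok: " + parsePartitionId(raw));
    }
}
```

Under that assumption, widening the parsed type from `int` to `long` removes the exception without changing behavior for ids that already fit in an `int`.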

Describe the overview of changes.

Checklist(Required)

  1. Does it affect the original behavior: (No)
  2. Has unit tests been added: (No Need)
  3. Has document been added or modified: (No Need)
  4. Does it need to update dependencies: (No)
  5. Are there any changes that cannot be rolled back: (No)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@wangbo wangbo added the area/spark-load Issues or PRs related to the spark load label Apr 19, 2022
@wangbo wangbo left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 19, 2022
@github-actions

PR approved by at least one committer and no changes requested.

@github-actions

PR approved by anyone and no changes requested.

@morningman morningman merged commit 39c0fec into apache:master Apr 20, 2022
@morningman morningman added the dev/1.0.1-deprecated should be merged into dev-1.0.1 branch label Apr 20, 2022
@morningman morningman added dev/merged-1.0.1-deprecated PR has been merged into dev-1.0.1 and removed dev/1.0.1-deprecated should be merged into dev-1.0.1 branch labels Apr 20, 2022
weizhengte pushed a commit to weizhengte/incubator-doris that referenced this pull request Apr 22, 2022
zhengshiJ pushed a commit to zhengshiJ/incubator-doris that referenced this pull request Apr 27, 2022
starocean999 pushed a commit to starocean999/incubator-doris that referenced this pull request May 19, 2022
englefly pushed a commit to englefly/incubator-doris that referenced this pull request May 23, 2022
liutang123 pushed a commit to liutang123/doris that referenced this pull request Apr 15, 2024
change list
1.add broker plus manifest.yaml
2.[MT] spark load for mt
3.[MT] Adapt MT internal spark/yarn commands and configurations
4.[MT] add custom properties for spark load etl & del tmp hive table
5.[MT] delete spark delete spark repository and archive & improve etl job log
6.[MT] feature(sparkload): support bitmap encode features in spark load
7.[MT] feature(sparkload parquet): disable parquet dictionary
8.feature(sparkload): support bitmap binary data from hive in spark load
9.[MT] feature(sparkload): add tolas-output dependency in SparkDpp
10.fix(spark load): resolve args conflict between skip_null_value  and map_side_join
   refactor(spark load): refactor function name

The details of each commit are listed in this branch:
https://dev.sankuai.com/code/repo-detail/data/palo/commit/list?branch=sparkload-14-update-details
or in branch 13:
https://dev.sankuai.com/code/repo-detail/data/palo/commit/list?branch=refs%2Fheads%2F13

Some Spark Load changes in 0.15 to 1.1:
[MT][FIX][SPARKLOAD] fix bug when partition_id exceeds integer range in spark load (apache#9073)
[MT][FIX][SPARKLOAD] fix `getHashValue` of string type is always zero in spark load (apache#9135)
[MT][TMP][SPARKLOAD] support `custom.global.dict.table` in spark load
[MT][FEATURE][SPARKLOAD] support retry-strategy when getting the spark etl job state times out
[MT][SPARKLOAD] hive table name start with tmp key word and its size should be no longer than 128
[MT][TMP][SPARKLOAD] fix min_value will be negative number when `maxGlobalDictValue`  exceeds integer range (apache#9436)

The detailed content of these commits can be found in this branch:
https://dev.sankuai.com/code/repo-detail/data/palo/commit/list?branch=refs%2Fheads%2F14

[MT][TMP][FIX] fix UT in spark load
[MT][FEATURE] feature(spark-dpp version): add version file for spark-dpp
add spark-dpp commit id as version file when build FE

[MT][SPARKLOAD] some fixes from 15 to 1.1 by wangbo36
1 not connect hive metastore when create hive table
2 avoid cast from string to bitmap expr
3 cast bytebuffer to buffer
4 handle exception in ut
Labels

approved Indicates a PR has been approved by one committer. area/spark-load Issues or PRs related to the spark load dev/merged-1.0.1-deprecated PR has been merged into dev-1.0.1 reviewed
