[feature](csv)Supports reading CSV data using LF and CRLF as line separators. #37687
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
run buildall
clang-tidy made some suggestions
  }

- [[nodiscard]] inline size_t line_delimiter_length() const final { return line_delimiter_len; }
+ [[nodiscard]] inline size_t line_delimiter_length() const {
warning: annotate this function with 'override' or (rarely) 'final' [modernize-use-override]
Suggested change:
- [[nodiscard]] inline size_t line_delimiter_length() const {
+ [[nodiscard]] inline size_t line_delimiter_length() const override {
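For readers unfamiliar with the `modernize-use-override` check, here is a minimal sketch of what the suggestion asks for. The class names `LineReaderBase` and `CrLfAwareReader` are hypothetical stand-ins, since the real class hierarchy is not shown in this thread; only `line_delimiter_length` and `line_delimiter_len` come from the diff.

```cpp
#include <cstddef>

// Hypothetical base class: assumed to declare the function as virtual,
// which is what makes clang-tidy ask for 'override' in the derived class.
struct LineReaderBase {
    virtual ~LineReaderBase() = default;
    [[nodiscard]] virtual std::size_t line_delimiter_length() const = 0;
};

// Hypothetical derived class mirroring the member shown in the diff.
struct CrLfAwareReader : LineReaderBase {
    std::size_t line_delimiter_len = 1;

    // With 'override', the compiler rejects this declaration if it stops
    // matching a virtual function in the base (for example after a rename).
    [[nodiscard]] std::size_t line_delimiter_length() const override {
        return line_delimiter_len;
    }
};
```

Calling `line_delimiter_length()` through a `LineReaderBase` reference dispatches to the derived implementation, which is the behavior the `override` annotation documents and protects.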
Force-pushed from 4a05198 to 6fc9f08.
run buildall
run buildall
TPC-H: Total hot run time: 40112 ms
TPC-DS: Total hot run time: 174019 ms
ClickBench: Total hot run time: 30.32 s
run buildall
TPC-H: Total hot run time: 39956 ms
TPC-DS: Total hot run time: 174024 ms
ClickBench: Total hot run time: 31.51 s
run buildall
TPC-H: Total hot run time: 40221 ms
TPC-DS: Total hot run time: 172871 ms
ClickBench: Total hot run time: 31.28 s
run buildall
TPC-H: Total hot run time: 39651 ms
TPC-DS: Total hot run time: 173674 ms
ClickBench: Total hot run time: 30.17 s
run buildall
TPC-H: Total hot run time: 39877 ms
TPC-DS: Total hot run time: 174213 ms
ClickBench: Total hot run time: 31.23 s
run p0
run p1
morningman
left a comment
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
suxiaogang223
left a comment
LGTM
kaka11chen
left a comment
LGTM
Hastyshell
left a comment
LGTM
## Proposed changes

Supports reading CSV data using LF and CRLF as line separators.

CSV file:

```
1,abc
2,def\r
3,qwe
4,hello\r
```

If you `set keep_carriage_return = false`, you will get:

```mysql
1 abc
2 def
3 qwe
4 hello
```

Here, both `\r\n` and `\n` are used as line delimiters.

If you `set keep_carriage_return = true`, you will get:

```mysql
1 abc
2 def\r
3 qwe
4 hello\r
```

Here, only `\n` is used as a line delimiter.

## warning

Note that `set keep_carriage_return = true` takes effect for TVF, but not for stream load or MySQL load. When you perform a stream load or MySQL load, both CRLF and LF are always used as line delimiters, even if you `set keep_carriage_return = true`.

Issue Number: close #xxx
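The effect of `keep_carriage_return` on line splitting can be illustrated with a small standalone sketch. This is not the Doris reader code: `split_lines` is a hypothetical helper written only to mirror the behavior described above, and it does not model the TVF versus stream load/MySQL load distinction.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not part of Doris): splits `data` into lines on '\n'.
// When keep_carriage_return is false, a '\r' at the end of a line is treated
// as part of the delimiter and dropped, so both "\r\n" and "\n" end a line.
// When it is true, only '\n' is the delimiter and the '\r' stays in the data.
std::vector<std::string> split_lines(std::string_view data, bool keep_carriage_return) {
    std::vector<std::string> lines;
    std::size_t start = 0;
    while (start < data.size()) {
        std::size_t end = data.find('\n', start);
        if (end == std::string_view::npos) {
            end = data.size();  // final line without a trailing newline
        }
        std::string_view line = data.substr(start, end - start);
        if (!keep_carriage_return && !line.empty() && line.back() == '\r') {
            line.remove_suffix(1);  // strip the CR that belongs to a CRLF
        }
        lines.emplace_back(line);
        start = end + 1;  // skip past the '\n'
    }
    return lines;
}
```

Feeding it the sample file from the description, `"1,abc\n2,def\r\n3,qwe\n4,hello\r\n"`, gives four lines in both modes; the second and fourth lines keep their trailing `\r` only when `keep_carriage_return` is `true`.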