@morningman
Contributor

Proposed changes

Issue Number: close #xxx

Problem summary

  1. Introduce Hadoop libhdfs.
  2. On the Linux x86 platform, use Hadoop libhdfs.
  3. On other platforms, use libhdfs3, because we currently have no Hadoop libhdfs binary for them.

This PR is still WIP and is open for review only.
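
For illustration only, the selection rule in the list above could be sketched in shell as follows. The function name `select_hdfs_client` and the labels `hadoop_libhdfs` and `libhdfs3` are hypothetical and do not come from this PR; the PR itself may make the choice at build time rather than at startup.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick an HDFS client library from the OS name and
# CPU architecture. The function name and the printed labels are
# illustrative assumptions, not identifiers from the PR.
select_hdfs_client() {
    # ${1}: kernel name as printed by `uname -s`
    # ${2}: machine architecture as printed by `uname -m`
    if [ "${1}" = "Linux" ] && [ "${2}" = "x86_64" ]; then
        echo "hadoop_libhdfs"
    else
        # Any other platform falls back to libhdfs3.
        echo "libhdfs3"
    fi
}

# Typical call site: query the machine the script is running on.
hdfs_client="$(select_hdfs_client "$(uname -s)" "$(uname -m)")"
echo "HDFS client: ${hdfs_client}"
```

Passing the OS and architecture as arguments, instead of calling `uname` inside the function, keeps the rule easy to check for platforms other than the current one.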

Checklist (Required)

  • Does it affect the original behavior?
  • Have unit tests been added?
  • Has documentation been added or modified?
  • Does it need to update dependencies?
  • Does this PR support rollback? (If NO, please explain WHY)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@morningman morningman marked this pull request as draft March 7, 2023 02:15
@github-actions
Contributor

github-actions bot commented Mar 7, 2023

sh-checker report

To get the full details, please check in the job output.

shellcheck errors

'shellcheck ' returned error 1 finding the following syntactical issues:

----------

In bin/start_be.sh line 100:
export CLASSPATH="${DORIS_HOME}/conf/:$DORIS_CLASSPATH"
                                      ^--------------^ SC2250 (style): Prefer putting braces around variable references even when not strictly required.

Did you mean: 
export CLASSPATH="${DORIS_HOME}/conf/:${DORIS_CLASSPATH}"


In bin/start_be.sh line 262:
        export LD_LIBRARY_PATH=$JAVA_HOME/jre/lib/$jvm_arch/server:$JAVA_HOME/jre/lib/$jvm_arch:$LD_LIBRARY_PATH
                               ^--------^ SC2250 (style): Prefer putting braces around variable references even when not strictly required.
                                                  ^-------^ SC2250 (style): Prefer putting braces around variable references even when not strictly required.
                                                                   ^--------^ SC2250 (style): Prefer putting braces around variable references even when not strictly required.
                                                                                      ^-------^ SC2250 (style): Prefer putting braces around variable references even when not strictly required.
                                                                                                ^--------------^ SC2250 (style): Prefer putting braces around variable references even when not strictly required.

Did you mean: 
        export LD_LIBRARY_PATH=${JAVA_HOME}/jre/lib/${jvm_arch}/server:${JAVA_HOME}/jre/lib/${jvm_arch}:${LD_LIBRARY_PATH}


In bin/start_be.sh line 263:
        export LD_LIBRARY_PATH=$DORIS_HOME/lib/hadoop_hdfs/native:$LD_LIBRARY_PATH
                               ^---------^ SC2250 (style): Prefer putting braces around variable references even when not strictly required.
                                                                  ^--------------^ SC2250 (style): Prefer putting braces around variable references even when not strictly required.

Did you mean: 
        export LD_LIBRARY_PATH=${DORIS_HOME}/lib/hadoop_hdfs/native:${LD_LIBRARY_PATH}


In bin/start_be.sh line 264:
        export LIBHDFS_OPTS="${JAVA_OPTS}"
                             ^----------^ SC2154 (warning): JAVA_OPTS is referenced but not assigned.


In bin/start_be.sh line 269:
echo "CLASSPATH: ${CLASSPATH}\n"
     ^-------------------------^ SC2028 (info): echo may not expand escape sequences. Use printf.


In bin/start_be.sh line 270:
echo "LD_LIBRARY_PATH: ${LD_LIBRARY_PATH}\n"
     ^-- SC2028 (info): echo may not expand escape sequences. Use printf.


In bin/start_be.sh line 271:
echo "LIBHDFS_OPTS: ${LIBHDFS_OPTS}\n"
     ^-- SC2028 (info): echo may not expand escape sequences. Use printf.

For more information:
  https://www.shellcheck.net/wiki/SC2154 -- JAVA_OPTS is referenced but not a...
  https://www.shellcheck.net/wiki/SC2028 -- echo may not expand escape sequen...
  https://www.shellcheck.net/wiki/SC2250 -- Prefer putting braces around vari...
----------

You can address the above issues in one of three ways:
1. Manually correct the issue in the offending shell script;
2. Disable specific issues by adding the comment:
  # shellcheck disable=NNNN
above the line that contains the issue, where NNNN is the error code;
3. Add '-e NNNN' to the SHELLCHECK_OPTS setting in your .yml action file.
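
As a rough sketch of option 1 (manual correction), the flagged lines might be rewritten as below. The fallback values for `DORIS_HOME`, `JAVA_HOME`, and `jvm_arch` are placeholders so the snippet runs on its own; the real `start_be.sh` sets them elsewhere. Using `${JAVA_OPTS:-}` is one common way to handle a variable that may be unset and is an assumption here, not the fix chosen in the PR.

```shell
#!/usr/bin/env bash
# Placeholder defaults so this sketch is self-contained (assumed values).
DORIS_HOME="${DORIS_HOME:-/opt/doris}"
JAVA_HOME="${JAVA_HOME:-/usr/lib/jvm/default}"
jvm_arch="${jvm_arch:-amd64}"

# SC2250: put braces around every variable reference.
export CLASSPATH="${DORIS_HOME}/conf/:${DORIS_CLASSPATH:-}"
export LD_LIBRARY_PATH="${JAVA_HOME}/jre/lib/${jvm_arch}/server:${JAVA_HOME}/jre/lib/${jvm_arch}:${LD_LIBRARY_PATH:-}"
export LD_LIBRARY_PATH="${DORIS_HOME}/lib/hadoop_hdfs/native:${LD_LIBRARY_PATH}"

# SC2154: expand to an empty string when JAVA_OPTS has not been assigned.
export LIBHDFS_OPTS="${JAVA_OPTS:-}"

# SC2028: printf interprets "\n" in its format string, whereas echo may
# print it literally. Two newlines give the value followed by a blank line.
printf 'CLASSPATH: %s\n\n' "${CLASSPATH}"
printf 'LD_LIBRARY_PATH: %s\n\n' "${LD_LIBRARY_PATH}"
printf 'LIBHDFS_OPTS: %s\n\n' "${LIBHDFS_OPTS}"
```

Quoting the `LD_LIBRARY_PATH` assignments is not required by the report, but it keeps paths that contain spaces intact.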



shfmt errors
'shfmt ' found no issues.


#pragma once

#if defined(__x86_64__)
Contributor Author

Should we also check the operating system here?

Status HDFSStorageBackend::list(const std::string& remote_path, bool contain_md5, bool recursion,
std::map<std::string, FileStat>* files) {
CHECK_HDFS_CLIENT(_hdfs_fs);
#if 0
Contributor Author

Commented out temporarily; this class will be refactored.

@github-actions github-actions bot added area/planner Issues or PRs related to the query planner area/vectorization labels Mar 27, 2023
@morningman
Contributor Author

see #18204

@morningman morningman closed this Mar 29, 2023