From 01848dcddbf22ca3f44ee9af62979229ae350df5 Mon Sep 17 00:00:00 2001 From: Angerszhuuuu Date: Mon, 6 Dec 2021 20:18:23 +0800 Subject: [PATCH 01/20] [SPARK-37558][DOC] Improve spark sql cli document --- docs/index.md | 1 + ...ql-distributed-sql-engine-spark-sql-cli.md | 169 ++++++++++++++++++ docs/sql-distributed-sql-engine.md | 12 +- 3 files changed, 174 insertions(+), 8 deletions(-) create mode 100644 docs/sql-distributed-sql-engine-spark-sql-cli.md diff --git a/docs/index.md b/docs/index.md index a853016c19890..99b9f5a2b7802 100644 --- a/docs/index.md +++ b/docs/index.md @@ -111,6 +111,7 @@ options for deployment: * [GraphX](graphx-programming-guide.html): processing graphs * [SparkR](sparkr.html): processing data with Spark in R * [PySpark](api/python/getting_started/index.html): processing data with Spark in Python +* [Spark SQL CLI](sql-distributed-sql-engine-spark-sql-cli.html): process data use SQL with command line interface **API Docs:** diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md new file mode 100644 index 0000000000000..f039aa7e84edd --- /dev/null +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -0,0 +1,169 @@ +--- +layout: global +title: Spark SQL CLI +displayTitle: Spark SQL CLI +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ See the License for the specific language governing permissions and + limitations under the License. +--- + +* Table of contents +{:toc} + + +The Spark SQL CLI is a convenient tool to run the Hive metastore service in local mode and execute +queries input from the command line. Note that the Spark SQL CLI cannot talk to the Thrift JDBC server. + +To start the Spark SQL CLI, run the following in the Spark directory: + + ./bin/spark-sql + +Configuration of Hive is done by placing your `hive-site.xml`, `core-site.xml` and `hdfs-site.xml` files in `conf/`. + +## Spark SQL Command Line Options + +You may run `./bin/spark-sql --help` for a complete list of all available options. + + CLI options: + -d,--define Variable substitution to apply to Hive + commands. e.g. -d A=B or --define A=B + --database Specify the database to use + -e SQL from command line + -f SQL from files + -H,--help Print help information + --hiveconf Use value for given property + --hivevar Variable substitution to apply to Hive + commands. e.g. --hivevar A=B + -i Initialization SQL file + -S,--silent Silent mode in interactive shell + -v,--verbose Verbose mode (echo executed SQL to the + console) + +## The hiverc File + +The Spark SQL CLI when invoked without the `-i` option will attempt to load `$HIVE_HOME/bin/.hiverc` and `$HOME/.hiverc` as initialization files. + +## Spark SQL CLI Interactive Shell Commands + +When `$SPARK__HOME/bin/spark-sql` is run without either the `-e` or `-f` option, it enters interactive shell mode. +Use `;` (semicolon) to terminate commands, but user can escape `;` by `\\;`. Comments in scripts can be specified using the `--` prefix. + + + + + + + + + + + + + + + + + + + + + + + +
| Command | Description |
|---------|-------------|
| `quit` `exit` | Use `quit` or `exit` to leave the interactive shell. |
| `!<command>` | Executes a shell command from the Spark SQL CLI shell. |
| `dfs <dfs command>` | Executes a dfs command from the Hive shell. |
| `<query string>` | Executes a Spark SQL query and prints results to standard output. |
| `source <filepath>` | Executes a script file inside the CLI. |
+ +## Supported comment type + + + + + + + + + + + + + + + +
| Comment | Example |
|---------|---------|
| simple comment | `-- This is a simple comment.`<br/>`SELECT 1;` |
| bracketed comment | `/* This is a bracketed comment. */`<br/>`SELECT 1;` |
| nested bracketed comment | `/* This is a /* nested bracketed comment*/ .*/`<br/>`SELECT 1;` |
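
To make the nested case concrete, here is a minimal Python sketch of how nesting can be tracked with a depth counter. It is only an illustration under stated assumptions: the function name `strip_bracketed_comments` is hypothetical, this is not Spark's parser, and it deliberately ignores string literals and `--` comments.

```python
def strip_bracketed_comments(sql: str) -> str:
    """Remove /* ... */ comments, allowing them to nest (toy model only)."""
    out = []
    depth = 0  # how many bracketed comments are currently open
    i = 0
    while i < len(sql):
        pair = sql[i:i + 2]
        if pair == "/*":
            depth += 1
            i += 2
        elif pair == "*/" and depth > 0:
            depth -= 1
            i += 2
        else:
            if depth == 0:
                out.append(sql[i])  # keep text that is outside any comment
            i += 1
    if depth != 0:
        # Hypothetical error type; it only signals that a comment was left open.
        raise ValueError("unclosed bracketed comment")
    return "".join(out)


nested = "/* This is a /* nested bracketed comment*/ .*/ SELECT 1;"
print(strip_bracketed_comments(nested).strip())  # prints: SELECT 1;
```

The depth counter is what distinguishes nested from plain bracketed comments: the first `*/` only closes the inner comment, so ` .` is still treated as comment text.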
+ +## Examples + +See Variable Substitution for examples of using the hiveconf option. + + +Example of running a query from the command line + + ./bin/spark-sql -e 'SELECT COL FROM TBL' + +Example of setting Hive configuration variables + + ./bin/spark-sql -e 'SELECT COL FROM TBL' --hiveconf hive.exec.scratchdir=/home/my/hive_scratch --hiveconf mapred.reduce.tasks=32 + +Example of dumping data out from a query into a file using silent mode + + ./bin/spark-sql -S -e 'SELECT COL FROM TBL' > result.txt + +Example of running a script non-interactively from local disk + + ./bin/spark-sql -f /path/to/spark-sql-script.sql + +Example of running a script non-interactively from a Hadoop supported filesystem + + ./bin/spark-sql -f hdfs://:/spark-sql-script.sql + ./bin/spark-sql -f s3://mys3bucket/spark-sql-script.sql + +Example of running an initialization script before entering interactive mode + + ./bin/spark-sql -i /path/to/spark-sql-init.sql + +Example of entering interactive mode + + ./bin/spark-sql + spark-sql> SELECT 1; + 1 + spark-sql> -- This is a simple comment. + spark-sql> SELECT 1; + 1 + +Example of entering interactive mode with escape `;` in comment + + ./bin/spark-sql + spark-sql>/* This is a comment contains \\; + > It won't be terminaled by \\; */ + > SELECT 1; + 1 + diff --git a/docs/sql-distributed-sql-engine.md b/docs/sql-distributed-sql-engine.md index 8d47a672985d3..96140a68f901c 100644 --- a/docs/sql-distributed-sql-engine.md +++ b/docs/sql-distributed-sql-engine.md @@ -90,12 +90,8 @@ See more details in [[SPARK-21067]](https://issues.apache.org/jira/browse/SPARK- ## Running the Spark SQL CLI -The Spark SQL CLI is a convenient tool to run the Hive metastore service in local mode and execute -queries input from the command line. Note that the Spark SQL CLI cannot talk to the Thrift JDBC server. 
+To use the Spark SQL command line interface (CLI) from the shell: -To start the Spark SQL CLI, run the following in the Spark directory: - - ./bin/spark-sql - -Configuration of Hive is done by placing your `hive-site.xml`, `core-site.xml` and `hdfs-site.xml` files in `conf/`. -You may run `./bin/spark-sql --help` for a complete list of all available options. + ./bin/hive + +Details please refer to [Spark SQL CLI](sql-distributed-sql-engine-spark-sql-cli.html) \ No newline at end of file From 7a9ecbf7adb205a78e17ca9b7412f154ff09fd28 Mon Sep 17 00:00:00 2001 From: Angerszhuuuu Date: Tue, 7 Dec 2021 10:50:28 +0800 Subject: [PATCH 02/20] update --- docs/sql-distributed-sql-engine-spark-sql-cli.md | 5 +---- docs/sql-distributed-sql-engine.md | 4 ++-- 2 files changed, 3 insertions(+), 6 deletions(-) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index f039aa7e84edd..821719377cc65 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -84,7 +84,7 @@ Use `;` (semicolon) to terminate commands, but user can escape `;` by `\\;`. Com -## Supported comment type +## Supported comment types @@ -122,9 +122,6 @@ Use `;` (semicolon) to terminate commands, but user can escape `;` by `\\;`. Com ## Examples -See Variable Substitution for examples of using the hiveconf option. 
- - Example of running a query from the command line ./bin/spark-sql -e 'SELECT COL FROM TBL' diff --git a/docs/sql-distributed-sql-engine.md b/docs/sql-distributed-sql-engine.md index 96140a68f901c..734723f8c6235 100644 --- a/docs/sql-distributed-sql-engine.md +++ b/docs/sql-distributed-sql-engine.md @@ -92,6 +92,6 @@ See more details in [[SPARK-21067]](https://issues.apache.org/jira/browse/SPARK- To use the Spark SQL command line interface (CLI) from the shell: - ./bin/hive + ./bin/spark-sql -Details please refer to [Spark SQL CLI](sql-distributed-sql-engine-spark-sql-cli.html) \ No newline at end of file +For details, please refer to [Spark SQL CLI](sql-distributed-sql-engine-spark-sql-cli.html) \ No newline at end of file From a0218f8b2fa9b18170c5714cbbc613317907d0ec Mon Sep 17 00:00:00 2001 From: Angerszhuuuu Date: Tue, 7 Dec 2021 11:11:57 +0800 Subject: [PATCH 03/20] Update sql-distributed-sql-engine-spark-sql-cli.md --- docs/sql-distributed-sql-engine-spark-sql-cli.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index 821719377cc65..4792a05cda8df 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -57,7 +57,7 @@ The Spark SQL CLI when invoked without the `-i` option will attempt to load `$HI ## Spark SQL CLI Interactive Shell Commands -When `$SPARK__HOME/bin/spark-sql` is run without either the `-e` or `-f` option, it enters interactive shell mode. +When `./bin/spark-sql` is run without either the `-e` or `-f` option, it enters interactive shell mode. Use `;` (semicolon) to terminate commands, but user can escape `;` by `\\;`. Comments in scripts can be specified using the `--` prefix.
CommentExample
@@ -72,7 +72,7 @@ Use `;` (semicolon) to terminate commands, but user can escape `;` by `\\;`. Com - + @@ -146,7 +146,7 @@ Example of running a script non-interactively from a Hadoop supported filesystem Example of running an initialization script before entering interactive mode ./bin/spark-sql -i /path/to/spark-sql-init.sql - + Example of entering interactive mode ./bin/spark-sql @@ -155,7 +155,7 @@ Example of entering interactive mode spark-sql> -- This is a simple comment. spark-sql> SELECT 1; 1 - + Example of entering interactive mode with escape `;` in comment ./bin/spark-sql @@ -163,4 +163,3 @@ Example of entering interactive mode with escape `;` in comment > It won't be terminaled by \\; */ > SELECT 1; 1 - From 1c59308a3aea9ed58935380af72a15fa8dfa3306 Mon Sep 17 00:00:00 2001 From: Angerszhuuuu Date: Tue, 7 Dec 2021 13:11:38 +0800 Subject: [PATCH 04/20] Update sql-distributed-sql-engine-spark-sql-cli.md --- docs/sql-distributed-sql-engine-spark-sql-cli.md | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index 4792a05cda8df..a34c307f326bd 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -58,7 +58,19 @@ The Spark SQL CLI when invoked without the `-i` option will attempt to load `$HI ## Spark SQL CLI Interactive Shell Commands When `./bin/spark-sql` is run without either the `-e` or `-f` option, it enters interactive shell mode. -Use `;` (semicolon) to terminate commands, but user can escape `;` by `\\;`. Comments in scripts can be specified using the `--` prefix. +Use `;` (semicolon) to terminate commands. Notice: + + 1. CLI use `;` to terminate commands only when it's at the end of line and it's not escaped by `\\;`. + 2. `;` is the only way to terminate commands, if user type `SELECT 1` and press enter, console will just wait for input. + 3. 
If user type multiple commands in one line like `SELECT 1; SELECT 2;`, commands `SELECT 1` and `SELECT 2` will be executed separatly. + 4. If `;` in a simple comment `-- This is a comment;`, this line will just be ignored. If `;` in a bracketed command and not at the end of line + /* This is a comment contains ';'. */ + SELECT 1; + It won't terminate commands. If `;` in a bracketed command and in the end of line, + /* This is a comment contains ; + */ SELECT 1; + It will terminate commands into `/* This is a comment contains ` and `*/ SELECT 1`. +
dfs <dfs command>Executes a dfs command from the Hive shell.Executes a dfs command from the Hive shell.
<query string>
From cea5be4be8abdb5dd24ac44f2f4758018d030439 Mon Sep 17 00:00:00 2001 From: Angerszhuuuu Date: Tue, 7 Dec 2021 13:19:43 +0800 Subject: [PATCH 05/20] Update sql-distributed-sql-engine-spark-sql-cli.md --- ...ql-distributed-sql-engine-spark-sql-cli.md | 90 ++++++++++--------- 1 file changed, 49 insertions(+), 41 deletions(-) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index a34c307f326bd..7e5cfc75f6135 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -55,47 +55,6 @@ You may run `./bin/spark-sql --help` for a complete list of all available option The Spark SQL CLI when invoked without the `-i` option will attempt to load `$HIVE_HOME/bin/.hiverc` and `$HOME/.hiverc` as initialization files. -## Spark SQL CLI Interactive Shell Commands - -When `./bin/spark-sql` is run without either the `-e` or `-f` option, it enters interactive shell mode. -Use `;` (semicolon) to terminate commands. Notice: - - 1. CLI use `;` to terminate commands only when it's at the end of line and it's not escaped by `\\;`. - 2. `;` is the only way to terminate commands, if user type `SELECT 1` and press enter, console will just wait for input. - 3. If user type multiple commands in one line like `SELECT 1; SELECT 2;`, commands `SELECT 1` and `SELECT 2` will be executed separatly. - 4. If `;` in a simple comment `-- This is a comment;`, this line will just be ignored. If `;` in a bracketed command and not at the end of line - /* This is a comment contains ';'. */ - SELECT 1; - It won't terminate commands. If `;` in a bracketed command and in the end of line, - /* This is a comment contains ; - */ SELECT 1; - It will terminate commands into `/* This is a comment contains ` and `*/ SELECT 1`. - - -
CommandDescription
- - - - - - - - - - - - - - - - - - - - - -
| Command | Description |
|---------|-------------|
| `quit` `exit` | Use `quit` or `exit` to leave the interactive shell. |
| `!<command>` | Executes a shell command from the Spark SQL CLI shell. |
| `dfs <dfs command>` | Executes a dfs command from the Hive shell. |
| `<query string>` | Executes a Spark SQL query and prints results to standard output. |
| `source <filepath>` | Executes a script file inside the CLI. |
- ## Supported comment types @@ -132,6 +91,55 @@ Use `;` (semicolon) to terminate commands. Notice:
+## Spark SQL CLI Interactive Shell Commands + +When `./bin/spark-sql` is run without either the `-e` or `-f` option, it enters interactive shell mode. +Use `;` (semicolon) to terminate commands. Notice: +1. CLI use `;` to terminate commands only when it's at the end of line and it's not escaped by `\\;`. +2. `;` is the only way to terminate commands, if user type `SELECT 1` and press enter, console will just wait for input. +3. If user type multiple commands in one line like `SELECT 1; SELECT 2;`, commands `SELECT 1` and `SELECT 2` will be executed separatly. +4. If `;` in a simple comment + ```sql + -- This is a comment; + SELECT 1; + ``` + This comment line will just be ignored. If `;` in a bracketed command and not at the end of line, + ```sql + /* This is a comment contains ';'. */ + SELECT 1; + ``` + it will not terminate commands. If `;` in a bracketed command and in the end of line, + ```sql + /* This is a comment contains ; + */ SELECT 1; + ``` + it will terminate commands into `/* This is a comment contains ` and `*/ SELECT 1`. + + + + + + + + + + + + + + + + + + + + + + + + +
| Command | Description |
|---------|-------------|
| `quit` `exit` | Use `quit` or `exit` to leave the interactive shell. |
| `!<command>` | Executes a shell command from the Spark SQL CLI shell. |
| `dfs <dfs command>` | Executes a dfs command from the Hive shell. |
| `<query string>` | Executes a Spark SQL query and prints results to standard output. |
| `source <filepath>` | Executes a script file inside the CLI. |
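
The line-end rule for `;` described in the numbered notes above can be sketched in a few lines of Python. This is a simplified model, not the CLI's actual implementation: the helper names `ends_command` and `group_commands` are hypothetical, the escape is assumed to be a single backslash directly before the `;`, and the sketch neither splits several statements typed on one line nor inspects quotes or comments.

```python
def ends_command(line: str) -> bool:
    """True if the line ends with a `;` that is not escaped by a backslash."""
    stripped = line.rstrip()
    return stripped.endswith(";") and not stripped.endswith("\\;")


def group_commands(lines):
    """Buffer input lines until one ends with an unescaped `;`.

    Returns (commands, pending): the completed command texts, and any
    lines still waiting for a terminating `;`.
    """
    commands, buffer = [], []
    for line in lines:
        buffer.append(line)
        if ends_command(line):
            commands.append("\n".join(buffer))
            buffer = []
    return commands, buffer


# `SELECT 1` without `;` is not submitted; the shell keeps waiting for input.
print(group_commands(["SELECT 1"]))

# A `;` at the end of a line inside a bracketed comment still terminates,
# so the comment is cut in two and the first piece is left unclosed.
print(group_commands(["/* This is a comment contains ;", "*/ SELECT 1;"]))
```

In this model the second call produces two separate commands, which mirrors the split into `/* This is a comment contains ` and `*/ SELECT 1` described in the notes.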
+ ## Examples Example of running a query from the command line From 36ce29f1c034a31f88317bcebd23d76474272222 Mon Sep 17 00:00:00 2001 From: Angerszhuuuu Date: Wed, 8 Dec 2021 14:47:24 +0800 Subject: [PATCH 06/20] follow comment --- docs/index.md | 2 +- ...ql-distributed-sql-engine-spark-sql-cli.md | 58 +++++++++---------- 2 files changed, 30 insertions(+), 30 deletions(-) diff --git a/docs/index.md b/docs/index.md index 99b9f5a2b7802..c6caf31d5603a 100644 --- a/docs/index.md +++ b/docs/index.md @@ -111,7 +111,7 @@ options for deployment: * [GraphX](graphx-programming-guide.html): processing graphs * [SparkR](sparkr.html): processing data with Spark in R * [PySpark](api/python/getting_started/index.html): processing data with Spark in Python -* [Spark SQL CLI](sql-distributed-sql-engine-spark-sql-cli.html): process data use SQL with command line interface +* [Spark SQL CLI](sql-distributed-sql-engine-spark-sql-cli.html): processing data with SQL on the command line **API Docs:** diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index 7e5cfc75f6135..845174926262b 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -23,7 +23,7 @@ license: | {:toc} -The Spark SQL CLI is a convenient tool to run the Hive metastore service in local mode and execute +The Spark SQL CLI is a convenient tool to run the Hive metastore service in local mode and execute SQL queries input from the command line. Note that the Spark SQL CLI cannot talk to the Thrift JDBC server. To start the Spark SQL CLI, run the following in the Spark directory: @@ -53,7 +53,7 @@ You may run `./bin/spark-sql --help` for a complete list of all available option ## The hiverc File -The Spark SQL CLI when invoked without the `-i` option will attempt to load `$HIVE_HOME/bin/.hiverc` and `$HOME/.hiverc` as initialization files. 
+When invoked without the `-i`, the Spark SQL CLI will attempt to load `$HIVE_HOME/bin/.hiverc` and `$HOME/.hiverc` as initialization files. ## Supported comment types @@ -95,40 +95,40 @@ The Spark SQL CLI when invoked without the `-i` option will attempt to load `$HI When `./bin/spark-sql` is run without either the `-e` or `-f` option, it enters interactive shell mode. Use `;` (semicolon) to terminate commands. Notice: -1. CLI use `;` to terminate commands only when it's at the end of line and it's not escaped by `\\;`. -2. `;` is the only way to terminate commands, if user type `SELECT 1` and press enter, console will just wait for input. -3. If user type multiple commands in one line like `SELECT 1; SELECT 2;`, commands `SELECT 1` and `SELECT 2` will be executed separatly. -4. If `;` in a simple comment +1. The CLI use `;` to terminate commands only when it's at the end of line, and it's not escaped by `\\;`. +2. `;` is the only way to terminate commands. If the user types `SELECT 1` and presses enter, the console will just wait for input. +3. If the user types multiple commands in one line like `SELECT 1; SELECT 2;`, the commands `SELECT 1` and `SELECT 2` will be executed separatly. +4. If `;` appears in a simple comment, as in: ```sql -- This is a comment; SELECT 1; ``` - This comment line will just be ignored. If `;` in a bracketed command and not at the end of line, - ```sql - /* This is a comment contains ';'. */ - SELECT 1; - ``` - it will not terminate commands. If `;` in a bracketed command and in the end of line, - ```sql - /* This is a comment contains ; - */ SELECT 1; - ``` - it will terminate commands into `/* This is a comment contains ` and `*/ SELECT 1`. + then this comment line will be ignored. If `;` appears in a bracketed comment, + ```sql + /* This is a comment contains ';'. */ + SELECT 1; + ``` + then this bracketed comment lines will be ignored. 
If `;` appears in a bracketed comment and at the end of line, + ```sql + /* This is a comment contains ; + */ SELECT 1; + ``` + then the whole command will be terminated into `/* This is a comment contains ` and `*/ SELECT 1`. - - + + - - + + @@ -142,32 +142,32 @@ Use `;` (semicolon) to terminate commands. Notice: ## Examples -Example of running a query from the command line +Example of running a query from the command line: ./bin/spark-sql -e 'SELECT COL FROM TBL' -Example of setting Hive configuration variables +Example of setting Hive configuration variables: ./bin/spark-sql -e 'SELECT COL FROM TBL' --hiveconf hive.exec.scratchdir=/home/my/hive_scratch --hiveconf mapred.reduce.tasks=32 -Example of dumping data out from a query into a file using silent mode +Example of dumping data out from a query into a file using silent mode: ./bin/spark-sql -S -e 'SELECT COL FROM TBL' > result.txt -Example of running a script non-interactively from local disk +Example of running a script non-interactively from local disk: ./bin/spark-sql -f /path/to/spark-sql-script.sql -Example of running a script non-interactively from a Hadoop supported filesystem +Example of running a script non-interactively from a Hadoop supported filesystem: ./bin/spark-sql -f hdfs://:/spark-sql-script.sql ./bin/spark-sql -f s3://mys3bucket/spark-sql-script.sql -Example of running an initialization script before entering interactive mode +Example of running an initialization script before entering interactive mode: ./bin/spark-sql -i /path/to/spark-sql-init.sql -Example of entering interactive mode +Example of entering interactive mode: ./bin/spark-sql spark-sql> SELECT 1; @@ -176,7 +176,7 @@ Example of entering interactive mode spark-sql> SELECT 1; 1 -Example of entering interactive mode with escape `;` in comment +Example of entering interactive mode with escape `;` in comment: ./bin/spark-sql spark-sql>/* This is a comment contains \\; From 21fb3497f774ebda343d8d02ad9177c08cb17ff1 Mon Sep 17 
00:00:00 2001 From: Angerszhuuuu Date: Fri, 10 Dec 2021 14:09:06 +0800 Subject: [PATCH 07/20] Update sql-distributed-sql-engine-spark-sql-cli.md --- docs/sql-distributed-sql-engine-spark-sql-cli.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index 845174926262b..95bb3c6d474b9 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -100,20 +100,22 @@ Use `;` (semicolon) to terminate commands. Notice: 3. If the user types multiple commands in one line like `SELECT 1; SELECT 2;`, the commands `SELECT 1` and `SELECT 2` will be executed separatly. 4. If `;` appears in a simple comment, as in: ```sql - -- This is a comment; - SELECT 1; + -- This is a ';' comment + SELECT ';' as a; ``` - then this comment line will be ignored. If `;` appears in a bracketed comment, + then this comment line will be ignored. `;` in `SELECT ';' as a` will just be treated as a char of string. + If `;` appears in the middle of a bracketed comment, ```sql /* This is a comment contains ';'. */ SELECT 1; ``` - then this bracketed comment lines will be ignored. If `;` appears in a bracketed comment and at the end of line, + then this ';' will not terminate the commands. If `;` appears in a bracketed comment and at the end of line, ```sql /* This is a comment contains ; */ SELECT 1; ``` - then the whole command will be terminated into `/* This is a comment contains ` and `*/ SELECT 1`. + then the whole command will be terminated into `/* This is a comment contains ` and `*/ SELECT 1`, + Spark will submit these two command and throw parser error.
CommandDescription
quit exitUse quit or exit to leave the interactive shell.quit or exitExits the interactive shell.
!<command> Executes a shell command from the Spark SQL CLI shell.
dfs <dfs command>Executes a dfs command from the Hive shell.dfs <HDFS dfs command>Executes a HDFS dfs command from the Spark SQL CLI shell.
<query string>
From 0c772172f38a65a14664061e3d1f68d7fbf6ae03 Mon Sep 17 00:00:00 2001 From: Angerszhuuuu Date: Fri, 10 Dec 2021 16:54:54 +0800 Subject: [PATCH 08/20] Update sql-distributed-sql-engine-spark-sql-cli.md --- docs/sql-distributed-sql-engine-spark-sql-cli.md | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index 95bb3c6d474b9..bf4b03b756e7e 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -150,7 +150,19 @@ Example of running a query from the command line: Example of setting Hive configuration variables: - ./bin/spark-sql -e 'SELECT COL FROM TBL' --hiveconf hive.exec.scratchdir=/home/my/hive_scratch --hiveconf mapred.reduce.tasks=32 + ./bin/spark-sql -e 'SELECT COL FROM TBL' --hiveconf hive.exec.scratchdir=/home/my/hive_scratch + +Example of setting Hive configuration variables: + + ./bin/spark-sql -e 'SELECT ${hiveconf:aaa}' --hiveconf aaa=bbb --hiveconf hive.exec.scratchdir=/home/my/hive_scratch + spark-sql> SELECT ${aaa}; + bbb + +Example of setting Hive variables substitution: + + ./bin/spark-sql --hivevar aaa=bbb --define ccc=ddd + spark-sql> SELECT ${aaa}, ${ccc}; + bbb ddd Example of dumping data out from a query into a file using silent mode: From e27470073c305d76c8ba12bdb1f5150c034c83f6 Mon Sep 17 00:00:00 2001 From: Wenchen Fan Date: Mon, 13 Dec 2021 13:37:28 +0800 Subject: [PATCH 09/20] Update sql-distributed-sql-engine-spark-sql-cli.md --- docs/sql-distributed-sql-engine-spark-sql-cli.md | 15 ++++----------- 1 file changed, 4 insertions(+), 11 deletions(-) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index bf4b03b756e7e..e248fc08eaf92 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -98,24 +98,17 @@ Use `;` (semicolon) to 
terminate commands. Notice: 1. The CLI use `;` to terminate commands only when it's at the end of line, and it's not escaped by `\\;`. 2. `;` is the only way to terminate commands. If the user types `SELECT 1` and presses enter, the console will just wait for input. 3. If the user types multiple commands in one line like `SELECT 1; SELECT 2;`, the commands `SELECT 1` and `SELECT 2` will be executed separatly. -4. If `;` appears in a simple comment, as in: +4. If `;` appears within a SQL statement (not the end of the line), then it has no special meanings: ```sql - -- This is a ';' comment + -- This is a ; comment SELECT ';' as a; ``` - then this comment line will be ignored. `;` in `SELECT ';' as a` will just be treated as a char of string. - If `;` appears in the middle of a bracketed comment, - ```sql - /* This is a comment contains ';'. */ - SELECT 1; - ``` - then this ';' will not terminate the commands. If `;` appears in a bracketed comment and at the end of line, + This is just a comment line followed by a SQL query which returns a string literal. ```sql /* This is a comment contains ; */ SELECT 1; ``` - then the whole command will be terminated into `/* This is a comment contains ` and `*/ SELECT 1`, - Spark will submit these two command and throw parser error. + However, if ';' is the end of the line, it terminates the SQL statement. The example above will be terminated into `/* This is a comment contains ` and `*/ SELECT 1`, Spark will submit these two command and throw parser error (unclosed bracketed comment).
From 6d6b0cf1675de6f8d05dbbaf25aacc0334ef4250 Mon Sep 17 00:00:00 2001 From: AngersZhuuuu Date: Mon, 13 Dec 2021 14:07:24 +0800 Subject: [PATCH 10/20] Update docs/sql-distributed-sql-engine-spark-sql-cli.md Co-authored-by: Wenchen Fan --- docs/sql-distributed-sql-engine-spark-sql-cli.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index e248fc08eaf92..03621e54ccc41 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -145,7 +145,7 @@ Example of setting Hive configuration variables: ./bin/spark-sql -e 'SELECT COL FROM TBL' --hiveconf hive.exec.scratchdir=/home/my/hive_scratch -Example of setting Hive configuration variables: +Example of setting Hive configuration variables and using it in the SQL query: ./bin/spark-sql -e 'SELECT ${hiveconf:aaa}' --hiveconf aaa=bbb --hiveconf hive.exec.scratchdir=/home/my/hive_scratch spark-sql> SELECT ${aaa}; From a06b1662c61639a5b2babc9f37f2e82dd3789607 Mon Sep 17 00:00:00 2001 From: Angerszhuuuu Date: Mon, 13 Dec 2021 15:11:25 +0800 Subject: [PATCH 11/20] Update sql-distributed-sql-engine-spark-sql-cli.md --- docs/sql-distributed-sql-engine-spark-sql-cli.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index 03621e54ccc41..688a3d85d9696 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -164,6 +164,7 @@ Example of dumping data out from a query into a file using silent mode: Example of running a script non-interactively from local disk: ./bin/spark-sql -f /path/to/spark-sql-script.sql + ./bin/spark-sql -f file:///path/to/spark-sql-script.sql Example of running a script non-interactively from a Hadoop supported filesystem: From 8d2fa9a41cd39997948eb0a42524dc94d514b748 Mon Sep 
17 00:00:00 2001 From: Angerszhuuuu Date: Mon, 13 Dec 2021 16:19:30 +0800 Subject: [PATCH 12/20] Update sql-distributed-sql-engine-spark-sql-cli.md --- docs/sql-distributed-sql-engine-spark-sql-cli.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index 688a3d85d9696..b0f60d9fb1640 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -55,6 +55,11 @@ You may run `./bin/spark-sql --help` for a complete list of all available option When invoked without the `-i`, the Spark SQL CLI will attempt to load `$HIVE_HOME/bin/.hiverc` and `$HOME/.hiverc` as initialization files. +## Path interpretation + +Spark SQL CLI support run SQL from initialization script file(`-i`) or normal SQL file(`-f`), If path is not absolute, the path will be handled as local file. +For example: `/path/to/spark-sql-cli.sql` equals to `file:///path/to/spark-sql-cli.sql`. + ## Supported comment types
From 4a7ca98a217106cd06ec37c5cdc5f751c785d47a Mon Sep 17 00:00:00 2001 From: AngersZhuuuu Date: Tue, 14 Dec 2021 10:43:53 +0800 Subject: [PATCH 13/20] Update docs/sql-distributed-sql-engine-spark-sql-cli.md Co-authored-by: Wenchen Fan --- docs/sql-distributed-sql-engine-spark-sql-cli.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index b0f60d9fb1640..f151f52ec826d 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -166,7 +166,7 @@ Example of dumping data out from a query into a file using silent mode: ./bin/spark-sql -S -e 'SELECT COL FROM TBL' > result.txt -Example of running a script non-interactively from local disk: +Example of running a script non-interactively: ./bin/spark-sql -f /path/to/spark-sql-script.sql ./bin/spark-sql -f file:///path/to/spark-sql-script.sql From 45ec6a05c75fb9b7004143b7b049ae56e79d2d65 Mon Sep 17 00:00:00 2001 From: AngersZhuuuu Date: Tue, 14 Dec 2021 10:44:00 +0800 Subject: [PATCH 14/20] Update docs/sql-distributed-sql-engine-spark-sql-cli.md Co-authored-by: Wenchen Fan --- docs/sql-distributed-sql-engine-spark-sql-cli.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md index f151f52ec826d..2253552d39cb6 100644 --- a/docs/sql-distributed-sql-engine-spark-sql-cli.md +++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md @@ -57,7 +57,7 @@ When invoked without the `-i`, the Spark SQL CLI will attempt to load `$HIVE_HOM ## Path interpretation -Spark SQL CLI support run SQL from initialization script file(`-i`) or normal SQL file(`-f`), If path is not absolute, the path will be handled as local file. 
+Spark SQL CLI supports running SQL from initialization script file(`-i`) or normal SQL file(`-f`), If path is not absolute, the path will be handled as local file.
 For example: `/path/to/spark-sql-cli.sql` equals to `file:///path/to/spark-sql-cli.sql`.
 ## Supported comment types

From d9f150c85652a16ca719358dab6af654091e1560 Mon Sep 17 00:00:00 2001
From: Angerszhuuuu
Date: Tue, 14 Dec 2021 10:47:32 +0800
Subject: [PATCH 15/20] Update sql-distributed-sql-engine-spark-sql-cli.md

---
 docs/sql-distributed-sql-engine-spark-sql-cli.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md
index 2253552d39cb6..b0d3c0bdc05fd 100644
--- a/docs/sql-distributed-sql-engine-spark-sql-cli.md
+++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md
@@ -57,7 +57,7 @@ When invoked without the `-i`, the Spark SQL CLI will attempt to load `$HIVE_HOM
 ## Path interpretation
-Spark SQL CLI supports running SQL from initialization script file(`-i`) or normal SQL file(`-f`), If path is not absolute, the path will be handled as local file.
+Spark SQL CLI supports running SQL from initialization script file(`-i`) or normal SQL file(`-f`), If path url don't have a schema component, the path will be handled as local file.
 For example: `/path/to/spark-sql-cli.sql` equals to `file:///path/to/spark-sql-cli.sql`.
 ## Supported comment types

From 453f2604c7351c9208310b4747776b77ef684e5f Mon Sep 17 00:00:00 2001
From: Wenchen Fan
Date: Tue, 14 Dec 2021 23:25:31 +0800
Subject: [PATCH 16/20] Update sql-distributed-sql-engine-spark-sql-cli.md

---
 docs/sql-distributed-sql-engine-spark-sql-cli.md | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md
index b0d3c0bdc05fd..9578c2d2dfb3f 100644
--- a/docs/sql-distributed-sql-engine-spark-sql-cli.md
+++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md
@@ -169,12 +169,6 @@ Example of dumping data out from a query into a file using silent mode:
 Example of running a script non-interactively:
     ./bin/spark-sql -f /path/to/spark-sql-script.sql
-    ./bin/spark-sql -f file:///path/to/spark-sql-script.sql
-
-Example of running a script non-interactively from a Hadoop supported filesystem:
-
-    ./bin/spark-sql -f hdfs://:/spark-sql-script.sql
-    ./bin/spark-sql -f s3://mys3bucket/spark-sql-script.sql
 Example of running an initialization script before entering interactive mode:

From d4f677824fd2185fc60b27a53f5b19e6174398c5 Mon Sep 17 00:00:00 2001
From: AngersZhuuuu
Date: Wed, 15 Dec 2021 14:31:27 +0800
Subject: [PATCH 17/20] Update docs/sql-distributed-sql-engine-spark-sql-cli.md

Co-authored-by: Wenchen Fan
---
 docs/sql-distributed-sql-engine-spark-sql-cli.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md
index 9578c2d2dfb3f..c8e88c490c229 100644
--- a/docs/sql-distributed-sql-engine-spark-sql-cli.md
+++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md
@@ -57,7 +57,7 @@ When invoked without the `-i`, the Spark SQL CLI will attempt to load `$HIVE_HOM
 ## Path interpretation
-Spark SQL CLI supports running SQL from initialization script file(`-i`) or normal SQL file(`-f`), If path url don't have a schema component, the path will be handled as local file.
+Spark SQL CLI supports running SQL from initialization script file(`-i`) or normal SQL file(`-f`), If path url don't have a scheme component, the path will be handled as local file.
 For example: `/path/to/spark-sql-cli.sql` equals to `file:///path/to/spark-sql-cli.sql`.
 ## Supported comment types

From 29f3b457855b76fa858568174b8390b07464adf9 Mon Sep 17 00:00:00 2001
From: Angerszhuuuu
Date: Wed, 15 Dec 2021 14:47:46 +0800
Subject: [PATCH 18/20] Update sql-distributed-sql-engine-spark-sql-cli.md

---
 docs/sql-distributed-sql-engine-spark-sql-cli.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md
index c8e88c490c229..4adce27abb5c1 100644
--- a/docs/sql-distributed-sql-engine-spark-sql-cli.md
+++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md
@@ -58,7 +58,7 @@ When invoked without the `-i`, the Spark SQL CLI will attempt to load `$HIVE_HOM
 ## Path interpretation
 Spark SQL CLI supports running SQL from initialization script file(`-i`) or normal SQL file(`-f`), If path url don't have a scheme component, the path will be handled as local file.
-For example: `/path/to/spark-sql-cli.sql` equals to `file:///path/to/spark-sql-cli.sql`.
+For example: `/path/to/spark-sql-cli.sql` equals to `file:///path/to/spark-sql-cli.sql`. User also can use Hadoop supported filesystems such as `s3://path/to/spark-sql-cli.sql` or `hdfs://nameservice/path/to/spark-sql-cli.sql`.
 ## Supported comment types

From 2b6f1307c5af4b3dc338a22cd7d0c3abf122d926 Mon Sep 17 00:00:00 2001
From: Angerszhuuuu
Date: Wed, 15 Dec 2021 15:10:41 +0800
Subject: [PATCH 19/20] Update sql-distributed-sql-engine-spark-sql-cli.md

---
 docs/sql-distributed-sql-engine-spark-sql-cli.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md
index 4adce27abb5c1..2951c52a37695 100644
--- a/docs/sql-distributed-sql-engine-spark-sql-cli.md
+++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md
@@ -58,7 +58,7 @@ When invoked without the `-i`, the Spark SQL CLI will attempt to load `$HIVE_HOM
 ## Path interpretation
 Spark SQL CLI supports running SQL from initialization script file(`-i`) or normal SQL file(`-f`), If path url don't have a scheme component, the path will be handled as local file.
-For example: `/path/to/spark-sql-cli.sql` equals to `file:///path/to/spark-sql-cli.sql`. User also can use Hadoop supported filesystems such as `s3://path/to/spark-sql-cli.sql` or `hdfs://nameservice/path/to/spark-sql-cli.sql`.
+For example: `/path/to/spark-sql-cli.sql` equals to `file:///path/to/spark-sql-cli.sql`. User also can use Hadoop supported filesystems such as `s3:///path/to/spark-sql-cli.sql` or `hdfs://:/path/to/spark-sql-cli.sql`.
 ## Supported comment types
@@ -113,8 +113,7 @@ Use `;` (semicolon) to terminate commands. Notice:
    /* This is a comment contains ;
    */ SELECT 1;
    ```
-   However, if ';' is the end of the line, it terminates the SQL statement. The example above will be terminated into `/* This is a comment contains ` and `*/ SELECT 1`, Spark will submit these two command and throw parser error (unclosed bracketed comment).
-
+   However, if ';' is the end of the line, it terminates the SQL statement. The example above will be terminated into `/* This is a comment contains ` and `*/ SELECT 1`, Spark will submit these two commands separated and throw parser error (`unclosed bracketed comment` and `extraneous input '*/'`).
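
Note on the path rule added in this patch: the sketch below is only an illustration of the documented behavior, not Spark's implementation. `resolve_script_path` is a hypothetical helper name, and the code assumes POSIX-style paths. A path with no URL scheme is treated as a local file, while a path that already carries a scheme (such as `file://` or `hdfs://`) is passed through unchanged.

```python
import os
from urllib.parse import urlparse


def resolve_script_path(path: str) -> str:
    """Hypothetical helper (not part of Spark): mimic the documented rule.

    Assumes POSIX-style paths; Windows drive letters are not handled.
    """
    if urlparse(path).scheme:
        # A scheme such as file://, hdfs:// or s3:// is already present,
        # so the path is left exactly as the user wrote it.
        return path
    # No scheme component: treat the path as a file on the local file system.
    return "file://" + os.path.abspath(path)


if __name__ == "__main__":
    for candidate in (
        "/path/to/spark-sql-cli.sql",
        "file:///path/to/spark-sql-cli.sql",
        "hdfs://nameservice/path/to/spark-sql-cli.sql",
    ):
        print(candidate, "->", resolve_script_path(candidate))
```

Under these assumptions the first two inputs resolve to the same URL, which matches the "equals to" statement in the patched paragraph.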
From 147d8b0f07419308740b0264f39f6a6d52e6c1e7 Mon Sep 17 00:00:00 2001
From: Angerszhuuuu
Date: Wed, 15 Dec 2021 16:54:10 +0800
Subject: [PATCH 20/20] Update sql-distributed-sql-engine-spark-sql-cli.md

---
 docs/sql-distributed-sql-engine-spark-sql-cli.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/sql-distributed-sql-engine-spark-sql-cli.md b/docs/sql-distributed-sql-engine-spark-sql-cli.md
index 2951c52a37695..f7f366952068d 100644
--- a/docs/sql-distributed-sql-engine-spark-sql-cli.md
+++ b/docs/sql-distributed-sql-engine-spark-sql-cli.md
@@ -23,7 +23,7 @@ license: |
 {:toc}
-The Spark SQL CLI is a convenient tool to run the Hive metastore service in local mode and execute SQL
+The Spark SQL CLI is a convenient interactive command tool to run the Hive metastore service and execute SQL
 queries input from the command line. Note that the Spark SQL CLI cannot talk to the Thrift JDBC server.
 To start the Spark SQL CLI, run the following in the Spark directory:
Command | Description