From 723a7f4ca02ddda4edfa7f2ec7979f9690af8d9c Mon Sep 17 00:00:00 2001
From: David Li
Date: Sat, 5 Mar 2022 10:52:53 -0500
Subject: [PATCH 1/5] ARROW-15721: [C++][FlightRPC] Add Flight/Flight SQL to subprojects

---
 docs/source/format/FlightSql.rst | 159 +++++++++++++++++++++++++++++++
 1 file changed, 159 insertions(+)
 create mode 100644 docs/source/format/FlightSql.rst

diff --git a/docs/source/format/FlightSql.rst b/docs/source/format/FlightSql.rst
new file mode 100644
index 00000000000..96c0f449dfa
--- /dev/null
+++ b/docs/source/format/FlightSql.rst
@@ -0,0 +1,159 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements. See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership. The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License. You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied. See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+.. _flight-sql:
+
+================
+Arrow Flight SQL
+================
+
+Arrow Flight SQL is a protocol for interacting with SQL databases
+using the Arrow in-memory format and the :doc:`Flight RPC
+<./Flight>` framework.
+
+Generally, a database will implement the RPC methods according to the
+specification, but does not need to implement a client-side driver. A
+database client can use the provided Flight SQL client to interact
+with any database that supports the necessary endpoints.
+
+.. warning:: Flight SQL is **experimental** and changes to the
+             protocol may still be made.
+
+RPC Methods
+===========
+
+Flight SQL reuses the predefined RPC methods in Arrow Flight, and
+provides various commands that pair those methods with request/response
+messages defined via Protobuf (see below).
+
+SQL Metadata
+------------
+
+Flight SQL provides a variety of commands to fetch metadata about the
+database server.
+
+All of these commands can be used with the GetFlightInfo and GetSchema
+RPC methods. The Protobuf request message should be packed into a
+google.protobuf.Any message, then serialized and packed as the ``cmd``
+field in a CMD-type FlightDescriptor.
+
+If the command is used with GetFlightInfo, the server will return a
+FlightInfo response. The client should then use the Ticket(s) in the
+FlightInfo with the DoGet RPC method to fetch a Arrow data containing
+the results of the command. In other words, metadata is returned as
+Arrow data, just like query results themselves.
+
+The schema returned by GetSchema or DoGet for a particular command is
+fixed according to the specification.
+
+``CommandGetCatalogs``
+  List the catalogs available in the database. The definition varies
+  by vendor.
+
+``CommandGetCrossReference``
+  List the foreign key columns in a given table that reference
+  columns in a given parent table.
+
+``CommandGetDbSchemas``
+  List the schemas (note: *not* an Arrow schema) available in the
+  database. The definition varies by vendor.
+
+``CommandGetExportedKeys``
+  List foreign key columns that reference the primary key columns of
+  a given table.
+
+``CommandGetImportedKeys``
+  List foreign keys of a given table.
+
+``CommandGetPrimaryKeys``
+  List the primary keys of a given table.
+
+``CommandGetSqlInfo``
+  Fetch metadata about the database server and its supported SQL
+  features.
+
+``CommandGetTables``
+  List tables in the database.
+
+``CommandGetTableTypes``
+  List table types in the database. The list of types varies by
+  vendor.
+
+Query Execution
+---------------
+
+Flight SQL also provides commands to execute SQL queries and manage
+prepared statements.
+
+Many of these commands are also used with GetFlightInfo/GetSchema and
+work identically to the metadata methods above. Some of these commands
+can be used with the DoPut RPC method, but the command should still be
+encoded in the request FlightDescriptor in the same way.
+
+Commands beginning with "Action" are instead used with the DoAction
+RPC method, in which case the command should be packed into a
+google.protobuf.Any message, then serialized and packed into the
+``body`` of a Flight Action. Also, the ``type`` should be set to the
+command name (i.e. for ``ActionClosePreparedStatementRequest``, the
+``type`` should be ``ClosePreparedStatement``).
+
+``ActionClosePreparedStatementRequest``
+  Close a previously created prepared statement.
+
+``ActionCreatePreparedStatementRequest``
+  Create a new prepared statement for a SQL query.
+
+``CommandPreparedStatementQuery``
+  Execute a previously created prepared statement and get the results.
+
+  When used with DoPut: binds parameter values to the prepared statement.
+
+  When used with GetFlightInfo: execute the prepared statement.
+
+``CommandPreparedStatementUpdate``
+  Execute a previously created prepared statement that does not
+  return results.
+
+  When used with DoPut: execute the query and return the number of
+  affected rows.
+
+``CommandStatementQuery``
+  Execute an ad-hoc SQL query.
+
+  When used with GetFlightInfo: execute the query (call DoGet to
+  fetch results).
+
+  When used with GetSchema: return the schema of the query results.
+
+``CommandStatementUpdate``
+  Execute an ad-hoc SQL query that does not return results.
+
+  When used with DoPut: execute the query and return the number of
+  affected rows.
+
+External Resources
+==================
+
+- `Introducing Apache Arrow Flight SQL: Accelerating Database Access
+  `_
+
+Protocol Buffer Definitions
+===========================
+
+.. literalinclude:: ../../../format/FlightSql.proto
+   :language: protobuf
+   :linenos:

From e32f971d360ae015849051f940ffde7840177996 Mon Sep 17 00:00:00 2001
From: David Li
Date: Mon, 7 Mar 2022 08:04:48 -0500
Subject: [PATCH 2/5] Apply suggestions from code review

Co-authored-by: James Duong
---
 docs/source/format/FlightSql.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/source/format/FlightSql.rst b/docs/source/format/FlightSql.rst
index 96c0f449dfa..12a4338e9da 100644
--- a/docs/source/format/FlightSql.rst
+++ b/docs/source/format/FlightSql.rst
@@ -43,7 +43,7 @@ messages defined via Protobuf (see below).
 SQL Metadata
 ------------
 
-Flight SQL provides a variety of commands to fetch metadata about the
+Flight SQL provides a variety of commands to fetch catalog metadata about the
 database server.
 
 All of these commands can be used with the GetFlightInfo and GetSchema
@@ -69,7 +69,7 @@ fixed according to the specification.
   columns in a given parent table.
 
 ``CommandGetDbSchemas``
-  List the schemas (note: *not* an Arrow schema) available in the
+  List the schemas (note: a grouping of tables, *not* an Arrow schema) available in the
   database. The definition varies by vendor.
 
 ``CommandGetExportedKeys``

From 314d3e0579e180e9fad0cc9b8baff4cab59f5e39 Mon Sep 17 00:00:00 2001
From: David Li
Date: Mon, 7 Mar 2022 08:06:51 -0500
Subject: [PATCH 3/5] ARROW-15721: [C++][FlightRPC] Tweak wording

---
 docs/source/format/FlightSql.rst | 13 ++++++++-----
 docs/source/index.rst            |  1 +
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/docs/source/format/FlightSql.rst b/docs/source/format/FlightSql.rst
index 12a4338e9da..d7d23bfeff5 100644
--- a/docs/source/format/FlightSql.rst
+++ b/docs/source/format/FlightSql.rst
@@ -28,7 +28,9 @@ using the Arrow in-memory format and the :doc:`Flight RPC
 Generally, a database will implement the RPC methods according to the
 specification, but does not need to implement a client-side driver. A
 database client can use the provided Flight SQL client to interact
-with any database that supports the necessary endpoints.
+with any database that supports the necessary endpoints. Flight SQL
+clients wrap the underlying Flight client to provide methods for the
+new RPC methods described here.
 
 .. warning:: Flight SQL is **experimental** and changes to the
              protocol may still be made.
@@ -45,8 +47,8 @@ messages defined via Protobuf (see below).
 SQL Metadata
 ------------
 
-Flight SQL provides a variety of commands to fetch catalog metadata about the
-database server.
+Flight SQL provides a variety of commands to fetch catalog metadata
+about the database server.
 
 All of these commands can be used with the GetFlightInfo and GetSchema
 RPC methods. The Protobuf request message should be packed into a
@@ -69,8 +71,9 @@ fixed according to the specification.
   columns in a given parent table.
 
 ``CommandGetDbSchemas``
-  List the schemas (note: a grouping of tables, *not* an Arrow schema) available in the
-  database. The definition varies by vendor.
+  List the schemas (note: a grouping of tables, *not* an Arrow
+  schema) available in the database. The definition varies by
+  vendor.
 
 ``CommandGetExportedKeys``
   List foreign key columns that reference the primary key columns of
diff --git a/docs/source/index.rst b/docs/source/index.rst
index b8570f20a12..b886fe9df3b 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -74,6 +74,7 @@ target environment.**
    format/Versioning
    format/Columnar
    format/Flight
+   format/FlightSql
    format/Integration
    format/CDataInterface
    format/CStreamInterface

From 9f35efc84fcd96b8c9fd4f200ead04901a298941 Mon Sep 17 00:00:00 2001
From: David Li
Date: Tue, 8 Mar 2022 08:48:32 -0500
Subject: [PATCH 4/5] Apply suggestions from code review

Co-authored-by: Antoine Pitrou
---
 docs/source/format/FlightSql.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/source/format/FlightSql.rst b/docs/source/format/FlightSql.rst
index d7d23bfeff5..e2e80e19a69 100644
--- a/docs/source/format/FlightSql.rst
+++ b/docs/source/format/FlightSql.rst
@@ -56,10 +56,10 @@ field in a CMD-type FlightDescriptor.
 If the command is used with GetFlightInfo, the server will return a
 FlightInfo response. The client should then use the Ticket(s) in the
 FlightInfo with the DoGet RPC method to fetch a Arrow data containing
-the results of the command. In other words, metadata is returned as
+the results of the command. In other words, SQL metadata is returned as
 Arrow data, just like query results themselves.
 
-The schema returned by GetSchema or DoGet for a particular command is
+The Arrow schema returned by GetSchema or DoGet for a particular command is
 fixed according to the specification.
 
 ``CommandGetCatalogs``
   List the catalogs available in the database. The definition varies
@@ -110,7 +110,7 @@ encoded in the request FlightDescriptor in the same way.
 Commands beginning with "Action" are instead used with the DoAction
 RPC method, in which case the command should be packed into a
 google.protobuf.Any message, then serialized and packed into the
-``body`` of a Flight Action. Also, the ``type`` should be set to the
+``body`` of a Flight Action. Also, the action ``type`` should be set to the
 command name (i.e. for ``ActionClosePreparedStatementRequest``, the
 ``type`` should be ``ClosePreparedStatement``).
 

From 2b7d7e5b673714801919ad4396e068d0e1e25157 Mon Sep 17 00:00:00 2001
From: David Li
Date: Tue, 8 Mar 2022 08:52:44 -0500
Subject: [PATCH 5/5] ARROW-15721: [C++][FlightRPC] Clarify some concepts

---
 docs/source/format/FlightSql.rst | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/docs/source/format/FlightSql.rst b/docs/source/format/FlightSql.rst
index e2e80e19a69..3f062a85c61 100644
--- a/docs/source/format/FlightSql.rst
+++ b/docs/source/format/FlightSql.rst
@@ -59,8 +59,8 @@ FlightInfo with the DoGet RPC method to fetch a Arrow data containing
 the results of the command. In other words, SQL metadata is returned as
 Arrow data, just like query results themselves.
 
-The Arrow schema returned by GetSchema or DoGet for a particular command is
-fixed according to the specification.
+The Arrow schema returned by GetSchema or DoGet for a particular
+command is fixed according to the specification.
 
 ``CommandGetCatalogs``
   List the catalogs available in the database. The definition varies
@@ -110,9 +110,9 @@ encoded in the request FlightDescriptor in the same way.
 Commands beginning with "Action" are instead used with the DoAction
 RPC method, in which case the command should be packed into a
 google.protobuf.Any message, then serialized and packed into the
-``body`` of a Flight Action. Also, the action ``type`` should be set to the
-command name (i.e. for ``ActionClosePreparedStatementRequest``, the
-``type`` should be ``ClosePreparedStatement``).
+``body`` of a Flight Action. Also, the action ``type`` should be set
+to the command name (i.e. for ``ActionClosePreparedStatementRequest``,
+the ``type`` should be ``ClosePreparedStatement``).
 
 ``ActionClosePreparedStatementRequest``
   Close a previously created prepared statement.
@@ -125,14 +125,15 @@ command name (i.e. for ``ActionClosePreparedStatementRequest``, the
 
   When used with DoPut: binds parameter values to the prepared statement.
 
-  When used with GetFlightInfo: execute the prepared statement.
+  When used with GetFlightInfo: execute the prepared statement. The
+  prepared statement can be reused after fetching results.
 
 ``CommandPreparedStatementUpdate``
   Execute a previously created prepared statement that does not
   return results.
 
   When used with DoPut: execute the query and return the number of
-  affected rows.
+  affected rows. The prepared statement can be reused afterwards.
 
 ``CommandStatementQuery``
   Execute an ad-hoc SQL query.
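
Note on the encoding described in the patched document: the text says a command is packed into a google.protobuf.Any message, serialized, and placed in the ``cmd`` field of a CMD-type FlightDescriptor (or the ``body`` of a Flight Action), and that the action ``type`` is the command name without its "Action" prefix and "Request" suffix. The sketch below illustrates those two steps using only the Python standard library. It hand-encodes the Protobuf wire format rather than using a generated client, so every helper name here is hypothetical. It also assumes the default ``type.googleapis.com/`` type URL prefix, the ``arrow.flight.protocol.sql`` package name, and that ``CommandStatementQuery`` carries the SQL text as string field 1. Check these against FlightSql.proto before relying on them.

```python
"""Minimal sketch of wrapping a Flight SQL command for the wire (stdlib only)."""

# Assumption: Any type URLs use the default Protobuf prefix.
TYPE_URL_PREFIX = "type.googleapis.com/"
# Assumption: package name as declared in FlightSql.proto.
PROTO_PACKAGE = "arrow.flight.protocol.sql"


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def length_delimited(field_number: int, payload: bytes) -> bytes:
    """Encode one length-delimited field (wire type 2)."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def pack_any(message_name: str, serialized_message: bytes) -> bytes:
    """Serialize a google.protobuf.Any: type_url is field 1, value is field 2."""
    type_url = f"{TYPE_URL_PREFIX}{PROTO_PACKAGE}.{message_name}"
    out = length_delimited(1, type_url.encode("utf-8"))
    if serialized_message:  # proto3 omits empty bytes fields
        out += length_delimited(2, serialized_message)
    return out


def statement_query(sql: str) -> bytes:
    """Serialize a CommandStatementQuery (assumed: SQL text is string field 1)."""
    return length_delimited(1, sql.encode("utf-8")) if sql else b""


def action_type(request_name: str) -> str:
    """Derive the Flight Action type from an Action...Request message name."""
    prefix, suffix = "Action", "Request"
    if not (request_name.startswith(prefix) and request_name.endswith(suffix)):
        raise ValueError(f"not an Action request message: {request_name}")
    return request_name[len(prefix):-len(suffix)]


if __name__ == "__main__":
    # These bytes would become FlightDescriptor.cmd for GetFlightInfo/GetSchema.
    cmd = pack_any("CommandStatementQuery", statement_query("SELECT 1"))
    print(cmd.hex())
    print(action_type("ActionClosePreparedStatementRequest"))
```

In practice a Flight SQL client library would build these bytes from the generated Protobuf classes. The hand-rolled encoder is only meant to show what ends up in the descriptor.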