[FlightSQL] Stateless prepared statement with parameter support #37720

@alamb

Describe the enhancement requested

dev@arrow.apache.org mailing list thread: https://lists.apache.org/thread/f0xb61z4yw611rw0v8vf9rht0qtq8opc

Usecase

InfluxDB IOx / 3.0 would like to allow customers to create prepared SQL statements with parameters so they can send parameterized queries and parameter values to the server. Without this feature, they have to do the parameter substitution on the client side, which is subject to possible SQL injection attacks and (if they use a pre-existing library) may not have the same parameter typing as our SQL implementation.

Given the JDBC driver doesn't yet support binding parameters to prepared statements (see #33961), I am not sure how widely used the parameter support is, but I think interest is growing. For example, apache/arrow-rs#4797 adds client-side support to the Rust implementation.

Background: Stateless services

A common design pattern in cloud services is that the request from a client can be handled by one of a number of identical backend servers as shown in the diagram below.

Subsequent requests may be processed by different backend servers. Any state needed to continue a session is sent to the client which passes it back in subsequent requests.

This design can be used to support features such as zero-downtime deployments and automatic workload-based scaling. It also has the nice property that there is no server-side state to clean up (via timeout or other mechanism).

                                                                            ┌────────────────────┐   
                                                                  ┌ ─ ─ ─ ─▶│      Server 1      │   
                                                                            └────────────────────┘   
                                                                  │                                  
                                                                            ┌────────────────────┐   
   ┌────────────────────┐                                         │         │      Server 2      │   
   │     FlightSQL      │                                                   └────────────────────┘   
   │       Client       │─ ─ ─ ─ ─ ▶   ... Network ...    ─ ─ ─ ─ ┘                                  
   │                    │                                                             ...            
   └────────────────────┘                                                                            
                                                                            ┌────────────────────┐   
             ActionCreatePreparedStatementRequest                           │      Server N      │   
                      handled by Server 1                                   └────────────────────┘   
                                                                                                     
                                                                                                     
                                                                                                     
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
                                                                                                     
                                                                            ┌────────────────────┐   
                                                                            │      Server 1      │   
                                                                            └────────────────────┘   
                                                                                                     
                                                                            ┌────────────────────┐   
   ┌────────────────────┐                                                   │      Server 2      │   
   │     FlightSQL      │                                                   └────────────────────┘   
   │       Client       │─ ─ ─ ─ ─ ▶   ... Network ...    ─ ─ ─ ─ ┐                                  
   │                    │                                                             ...            
   └────────────────────┘                                         │                                  
                                                                            ┌────────────────────┐   
                                                                  └ ─ ─ ─ ─▶│      Server N      │   
                ActionPreparedStatementExecute                              └────────────────────┘   
                      handled by Server N                                                            
                                                                                                     

Problem

As currently specified, I don't think we can implement FlightSQL prepared statements with parameters with such a stateless design.

In IOx, the handle returned from ActionCreatePreparedStatementRequest contains the original SQL query text, among other things. Thus the subsequent call to ActionPreparedStatementExecute has access to the SQL query, regardless of which server handles it.
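
As an illustration of how a handle can carry the query text, here is a minimal Python sketch. The JSON-in-base64 encoding and the `encode_handle` / `decode_handle` names are assumptions made for this example only; they are not IOx's actual handle format.

```python
import base64
import json


def encode_handle(sql: str) -> bytes:
    """Pack the query text into an opaque handle (hypothetical format)."""
    payload = json.dumps({"sql": sql}).encode("utf-8")
    return base64.urlsafe_b64encode(payload)


def decode_handle(handle: bytes) -> dict:
    """Recover the state a server needs from the handle alone."""
    return json.loads(base64.urlsafe_b64decode(handle))


# Server 1 handles ActionCreatePreparedStatementRequest and returns the handle.
handle = encode_handle("SELECT * FROM cpu WHERE host = $1")

# Server N later receives the same handle and needs no shared state
# to find the SQL text.
state = decode_handle(handle)
print(state["sql"])
```

Because the client treats the handle as opaque bytes, the server is free to put whatever it needs inside it.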

However, the DoPut call with the CommandPreparedStatementQuery message, which binds the parameters, does not return anything that the client can pass along in subsequent calls to ActionPreparedStatementExecute. Thus there is no way for the server that processes ActionPreparedStatementExecute to know the values of the parameters.

FlightSQL sequence Diagram

Here is the sequence diagram from https://arrow.apache.org/docs/format/FlightSql.html for reference

[Figure: CommandPreparedStatementQuery sequence diagram]

Strawman Proposal

One way to support stateless implementations of prepared statements with bind parameters would be to extend the response returned from calling DoPut with CommandPreparedStatementQuery to include a new CommandPrepareStatementQueryResponse, similar to the existing DoPutUpdateResult in

arrow/format/FlightSql.proto

Lines 1782 to 1792 in 15a8ac3

 * Returned from the RPC call DoPut when a CommandStatementUpdate
 * CommandPreparedStatementUpdate was in the request, containing
 * results from the update.
 */
message DoPutUpdateResult {
  option (experimental) = true;

  // The number of records updated. A return value of -1 represents
  // an unknown updated record count.
  int64 record_count = 1;
}

/**
 * Response returned when `DoPut` is called with `CommandPreparedStatementQuery`.
 */
message DoPutStatementPrepareResult {
  option (experimental) = true;

  // (Potentially updated) opaque handle for the prepared statement on the server.
  // All subsequent requests for this prepared statement must use this new handle, if specified.
  bytes prepared_statement_handle = 1;
}

I think this would be a fairly low-overhead and easy extension. Existing clients that support bind parameters would require an update, but existing servers would not. Given that bind parameters are just starting to be used more, I think the overall ecosystem impact would be low.
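
To make the proposed flow concrete, the sketch below models it in plain Python. Everything in it is hypothetical: the handle encoding (JSON in base64) and the names `server_create`, `server_bind`, `server_execute`, and `client_next_handle` are invented for illustration, and no real Flight RPC calls are made. It shows the bind step returning an updated handle that carries the parameter values, and a client keeping its existing handle when a server returns none.

```python
import base64
import json
from typing import Optional


def _pack(state: dict) -> bytes:
    return base64.urlsafe_b64encode(json.dumps(state).encode("utf-8"))


def _unpack(handle: bytes) -> dict:
    return json.loads(base64.urlsafe_b64decode(handle))


def server_create(sql: str) -> bytes:
    """Models ActionCreatePreparedStatementRequest: the handle holds the SQL."""
    return _pack({"sql": sql, "params": None})


def server_bind(handle: bytes, params: list) -> Optional[bytes]:
    """Models DoPut binding parameters. The returned bytes stand in for the
    handle carried by the proposed DoPutStatementPrepareResult."""
    state = _unpack(handle)
    state["params"] = params
    return _pack(state)


def server_execute(handle: bytes) -> tuple:
    """Models execution on any server: all state comes from the handle."""
    state = _unpack(handle)
    return state["sql"], state["params"]


def client_next_handle(current: bytes, returned: Optional[bytes]) -> bytes:
    """Use the new handle if the server specified one, else keep the old one."""
    return returned if returned else current


handle = server_create("SELECT * FROM cpu WHERE host = $1")      # Server 1
handle = client_next_handle(handle, server_bind(handle, ["a"]))  # Server 2
print(server_execute(handle))                                    # Server N
```

A stateful server that returns no handle leaves the client's handle unchanged in this sketch, which matches the expectation that existing servers would not need to change.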

See also

See this discussion for more context: https://github.com/apache/arrow-rs/pull/4797/files#r1319807938

Component(s)

FlightRPC
