gRPC query extension#14024

Closed
paul-rogers wants to merge 52 commits into apache:master from paul-rogers:grpc-query

@paul-rogers
Contributor

Implementation of a gRPC query endpoint per issue #13469.

Provides a single gRPC-based endpoint for SQL queries. The query request is similar to the existing REST SqlQuery class. The response is gRPC-specific. It provides the result schema along with the results as a binary "blob". Results can be in CSV, JSON array lines or as an array of Protobuf objects. If using Protobuf, the corresponding class must be installed along with the gRPC query extension so it is available to the Broker at runtime.
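
To make the "schema plus binary blob" shape concrete, here is a minimal client-side sketch of decoding a CSV-formatted result blob. It is illustrative only: the class name `CsvBlobDecoder`, the method `decode`, and the sample column names are hypothetical and do not come from this PR or from query.proto. It also assumes the blob is UTF-8 text with one row per line, no header row, and no quoted fields, none of which is confirmed here.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper, not part of the extension: pairs the column names
 * from a response schema with the cells of a CSV result blob.
 */
public class CsvBlobDecoder {

    /**
     * Decodes a UTF-8 CSV blob into one ordered map per row.
     * Assumption: no header row and no quoted or escaped fields.
     */
    public static List<Map<String, String>> decode(byte[] blob, List<String> columns) {
        String text = new String(blob, StandardCharsets.UTF_8);
        List<Map<String, String>> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            if (line.isEmpty()) {
                continue; // skip blank lines, including an entirely empty blob
            }
            String[] cells = line.split(",", -1); // -1 keeps trailing empty cells
            if (cells.length != columns.size()) {
                throw new IllegalArgumentException(
                    "Expected " + columns.size() + " cells but got " + cells.length + ": " + line);
            }
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < cells.length; i++) {
                row.put(columns.get(i), cells[i]);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Sample data made up for illustration.
        byte[] blob = "2023-01-14,wikipedia,42\n2023-01-15,wikipedia,17\n"
            .getBytes(StandardCharsets.UTF_8);
        List<Map<String, String>> rows = decode(blob, Arrays.asList("day", "datasource", "cnt"));
        rows.forEach(System.out::println);
    }
}
```

A real client would take the column names from the schema in the response and would need a proper CSV parser if values can contain commas or quotes.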

The PR includes both unit and integration tests. The PR includes additional files that are also offered in other PRs. The intent is that those other PRs (#13877 and #14009) are merged first, then this one merges with master so that the other files "disappear" from this PR.

This PR also has a number of code-cleanup changes encountered while doing the implementation.

See the README.md file in the PR for details. See the query.proto file for the gRPC protocol and Protobuf messages.

The project consists of three Maven modules:

  • grpc-query: The actual gRPC query endpoint.
  • grpc-query-it: Integration tests for the extension.
  • grpc-shaded: Creates a shaded jar containing gRPC and our gRPC service definition.

The shaded module is needed because gRPC uses a different version of Guava than Druid does, and we get runtime errors if we try to combine the two. The shaded module also contains the service definition because the code generated from it uses Guava as well.

The IT module has to be separate because it has dependencies that occur in the Maven build after the gRPC module. Specifically, it depends on it-cases which must come after distribution, but grpc-query must come before distribution.

Release note

This is a "contrib" extension. We don't seem to document such extensions in Druid itself. The README.md can serve as documentation instead.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

paul-rogers and others added 30 commits January 14, 2023 14:36
Report stats & close statement
Simple server + query test works
The gRPC shaded jar module causes issues when loaded
into an IDE. This PR hides the module from Maven (and
hence IDEs) by building it implicitly in the grpc-query
module. It is a hack, but it works.
add empty response and GrpcResponseHandler tests
Allow ITs outside of the 'cases' directory
Refactoring
Build the grpc-query extension archive
Build a jar for the test Protobuf message
Fixes to the extension path mechanism
* Refactor template into cluster directory
* Add verify, setup scripts
* Forbid snapshots from upstream repos
* Basic grpc-query IT test
Retains support for anonymous (test) usage
Remove the Python basic auth code: it is another PR
@paul-rogers
Contributor Author

The build is failing due to dependency:analyze being unable to find the grpc symbols. This may be due to the shaded grpc jar being unavailable. It may be unavailable because it is not in the saved Maven cache, and this particular Maven invocation did not rebuild it. A similar problem was resolved by moving from the compile to install Maven step.

Per the documentation:

Invokes the execution of the lifecycle phase test-compile prior to executing itself.

This step does a compilation, but does not trigger creation of the shaded jar, which then causes compile errors.

I've tried adding install to the analyze_dependencies_script.sh script.

At this point, I won't be available to continue the fixes. Can someone else please continue chipping away at the various build issues? It seems the shaded jar does not play well with our build scripts. I've made several changes to work around the issue, and more may be needed. Or, perhaps we're doing something wrong with the shaded jar. Thanks!

abhishekagarwal87 pushed a commit that referenced this pull request Feb 2, 2024
Proposal #13469

Original PR #14024

A new method is being added in QueryLifecycle class to authorise a query based on authentication result.
This method is required since we authenticate the query by intercepting it in the grpc extension and pass down the authentication result.
findingrish added a commit to findingrish/druid that referenced this pull request Feb 6, 2024
Proposal apache#13469

Original PR apache#14024

A new method is being added in QueryLifecycle class to authorise a query based on authentication result.
This method is required since we authenticate the query by intercepting it in the grpc extension and pass down the authentication result.
abhishekagarwal87 pushed a commit that referenced this pull request Feb 6, 2024
Proposal #13469

Original PR #14024

A new method is being added in QueryLifecycle class to authorise a query based on authentication result.
This method is required since we authenticate the query by intercepting it in the grpc extension and pass down the authentication result.
@github-actions

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions Bot added the stale label Feb 11, 2024
@findingrish findingrish mentioned this pull request Mar 4, 2024
10 tasks
@github-actions

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions Bot closed this Mar 11, 2024
abhishekagarwal87 pushed a commit that referenced this pull request Sep 19, 2024
Revives #14024 and additionally supports:

  • Native queries
  • gRPC health check endpoint

This PR doesn't have the shaded module for packaging gRPC and Guava libraries since grpc-query module uses the same Guava version as that of Druid.

The response is gRPC-specific. It provides the result schema along with the results as a binary "blob". Results can be in CSV, JSON array lines or as an array of Protobuf objects. If using Protobuf, the corresponding class must be installed along with the gRPC query extension so it is available to the Broker at runtime.
2 participants