Skip to content

Conversation

@lidavidm
Copy link
Member

@lidavidm lidavidm commented Feb 18, 2022

Defines interfaces to add new network transports to Arrow Flight, and rewrite the gRPC implementation in terms of these interfaces. The bulk of the new APIs are in transport.h. Where possible, code has been factored out so that transports have to implement as little as possible.

@lidavidm
Copy link
Member Author

lidavidm commented Feb 18, 2022

TODOs

  • Address new TODOs scattered in source
  • Make client/server share code around writing IPC with app metadata
  • Document the new interfaces
  • Check that Flight benchmark did not regress
  • Check that CUDA support did not regress

@github-actions
Copy link

@lidavidm lidavidm force-pushed the arrow-15282 branch 2 times, most recently from 0cfb0dd to 4b20d53 Compare February 22, 2022 21:32
@lidavidm lidavidm changed the title ARROW-15282: [C++][FlightRPC] WIP: Split data methods from the underlying transport ARROW-15282: [C++][FlightRPC] Split data methods from the underlying transport Feb 22, 2022
@lidavidm lidavidm force-pushed the arrow-15282 branch 6 times, most recently from fe5f27b to d5ea2f8 Compare February 23, 2022 16:45
@lidavidm lidavidm marked this pull request as ready for review February 23, 2022 17:38
@lidavidm
Copy link
Member Author

CC @cyb70289 if you have a chance to take a look (no pressure - this is a pretty large PR)

Copy link
Contributor

@cyb70289 cyb70289 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great ! Some nits in comments.

Will test the PR at my side. Thanks.

Copy link
Contributor

@cyb70289 cyb70289 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

On my test machine, DoPutLargeBatch often costs 10s to finish. It's not related to this PR. I've filed a jira issue: https://issues.apache.org/jira/browse/ARROW-15793

@lidavidm
Copy link
Member Author

Thanks for tracking that down to the original PR. Removing the reset() fixes things. I also noticed FlightClient::Close() wasn't bound to ClientTransportImpl::Close() so that has also been fixed.

@pitrou
Copy link
Member

pitrou commented Mar 1, 2022

The responsibilities on the server side are difficult to understand. It looks like they are shared between 3 or 4 different classes (FlightServerBase, ServerTransportImpl, FlightServiceImpl and perhaps FlightService::Service). I was expecting one transport-agnostic class and one transport implementation, but it looks like there is more...

@pitrou
Copy link
Member

pitrou commented Mar 1, 2022

Also, from a file organization standpoint, I would expect the gRPC-specific parts to be in their own files and named perhaps grpc_{client,server}.{h,cc}?

@lidavidm
Copy link
Member Author

lidavidm commented Mar 1, 2022

I can try to document it better but effectively the layers are:

  • FlightServerBase: application-facing API that provides overridable methods
  • ServerTransportImpl: transport-facing API that provides an appropriate I/O stream implementation for a given method, and sets up the actual underlying server. Calls into FlightServiceImpl/FlightServerBase
  • FlightServiceImpl: implements application-facing APIs in terms of the transport-facing APIs (effectively, code that is supposed to be transport- and application- agnostic, i.e. "Flight itself")

Ideally transports wouldn't deal with FlightServerBase but I don't want to go through and wrap all the non-data methods quite yet. But effectively, a transport only implements ServerTransportImpl. It's just that the gRPC code is still with the rest of the Flight code; would it be easier if I went ahead and split them out?

FlightService::Serivce/FlightGrpcServiceImpl are gRPC specific

@lidavidm
Copy link
Member Author

lidavidm commented Mar 1, 2022

Also, from a file organization standpoint, I would expect the gRPC-specific parts to be in their own files and named perhaps grpc_{client,server}.{h,cc}?

Ok, that should be doable

@lidavidm
Copy link
Member Author

lidavidm commented Mar 1, 2022

I should also add a doc page for transports (and the Flight documentation is generally lacking; I should take the time to clean things up)

lidavidm added 21 commits March 10, 2022 11:29
@cyb70289
Copy link
Contributor

I promise this is my last comment :)
There are some legacy comments about grpc found by git grep gRPC {client,server}.{h,cc}.

@lidavidm
Copy link
Member Author

Thanks, I thought I checked but apparently not!

@cyb70289 cyb70289 closed this in bb65225 Mar 11, 2022
@ursabot
Copy link

ursabot commented Mar 11, 2022

Benchmark runs are scheduled for baseline = dfca6a7 and contender = bb65225. bb65225 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️1.97% ⬆️1.8%] test-mac-arm
[Finished ⬇️0.36% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.77% ⬆️0.17%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@lidavidm lidavidm deleted the arrow-15282 branch March 11, 2022 13:42
@lidavidm
Copy link
Member Author

Thank you both for the reviews!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants