Currently, Druid has a rough time when a long-running query prevents other, shorter queries from being processed. To mitigate the effect of long queries, I'd like to enable Spark to serve as the execution engine for Druid queries. I do not have a concrete (or settled) design for what I would try to implement, but here are the two approaches I am considering:
- Make Spark query historical nodes and merge the results like a broker does.
- Make Spark download segments from S3 and do all the computations by itself.
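To illustrate the first approach, here is a minimal sketch of the broker-style scatter/gather merge that Spark tasks could perform over partial results returned by historical nodes. This is plain Python standing in for the Spark side; every name below is hypothetical, and nothing here is a real Druid or Spark API.

```python
# Hypothetical sketch of approach 1: merge per-historical-node partial
# timeseries results the way a Druid broker does. In a real implementation
# each inner list would be one node's response to the fanned-out query.

from collections import defaultdict

def merge_partial_results(partials):
    """Combine partial aggregates from several nodes, summing counts
    that share the same timestamp (illustrative only)."""
    merged = defaultdict(int)
    for node_rows in partials:
        for row in node_rows:
            merged[row["timestamp"]] += row["count"]
    # Return rows ordered by timestamp, as a broker would.
    return [{"timestamp": ts, "count": c} for ts, c in sorted(merged.items())]

# Two simulated historical-node responses to the same query:
node_a = [{"timestamp": "2016-01-01T00:00Z", "count": 10},
          {"timestamp": "2016-01-01T01:00Z", "count": 5}]
node_b = [{"timestamp": "2016-01-01T00:00Z", "count": 7}]

print(merge_partial_results([node_a, node_b]))
```

In Spark terms, each historical node's response would be a partition and the merge would be a `reduceByKey`-style combine; the second approach would instead skip the historicals entirely and scan segment files pulled from S3.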
Links related to this issue/proposal:
https://groups.google.com/forum/#!searchin/druid-development/spark/druid-development/ULdKYZeven4/NIM5ySTbAQAJ
https://groups.google.com/forum/#!msg/druid-development/Jp9Lv0mGFlg/f9m1U9MBBAAJ
Related PRs: