Merged
2 changes: 0 additions & 2 deletions distribution/pom.xml
@@ -232,8 +232,6 @@
<argument>-c</argument>
<argument>io.druid.extensions.contrib:druid-redis-cache</argument>
<argument>-c</argument>
<argument>io.druid.extensions.contrib:scan-query</argument>
<argument>-c</argument>
<argument>io.druid.extensions.contrib:sqlserver-metadata-storage</argument>
<argument>-c</argument>
<argument>io.druid.extensions.contrib:statsd-emitter</argument>
1 change: 0 additions & 1 deletion docs/content/development/extensions.md
@@ -70,7 +70,6 @@ All of these community extensions can be downloaded using *pull-deps* with the c
|statsd-emitter|StatsD metrics emitter|[link](../development/extensions-contrib/statsd.html)|
|kafka-emitter|Kafka metrics emitter|[link](../development/extensions-contrib/kafka-emitter.html)|
|druid-thrift-extensions|Support thrift ingestion |[link](../development/extensions-contrib/thrift.html)|
|scan-query|Scan query|[link](../development/extensions-contrib/scan-query.html)|

## Promoting Community Extension to Core Extension

@@ -31,8 +31,11 @@ There are several main parts to a scan query:
|columns|A String array of dimensions and metrics to scan. If left empty, all dimensions and metrics are returned.|no|
|batchSize|How many rows are buffered before being returned to the client. Default is `20480`.|no|
|limit|How many rows to return. If not specified, all rows will be returned.|no|
|legacy|Return results consistent with the legacy "scan-query" contrib extension. Defaults to the value set by `druid.query.scan.legacy`, which in turn defaults to false. See [Legacy mode](#legacy-mode) for details.|no|
|context|An additional JSON Object which can be used to specify certain flags.|no|
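
To make the `batchSize` and `limit` properties above more concrete, here is a minimal sketch of batched row delivery. The `BatchingIterator` class is hypothetical and is not Druid's implementation; it only illustrates the idea that rows are handed out in groups of at most `batchSize`, with `limit` capping the total, so that only one batch needs to be held in memory at a time.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not Druid's actual scan engine.
final class BatchingIterator<T> implements Iterator<List<T>> {
    private final Iterator<T> rows;
    private final int batchSize;
    private long remaining; // rows still allowed by the limit

    BatchingIterator(Iterator<T> rows, int batchSize, long limit) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.rows = rows;
        this.batchSize = batchSize;
        this.remaining = limit;
    }

    @Override
    public boolean hasNext() {
        return remaining > 0 && rows.hasNext();
    }

    // Returns the next batch: at most batchSize rows, never exceeding the limit.
    @Override
    public List<T> next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        List<T> batch = new ArrayList<>();
        while (batch.size() < batchSize && remaining > 0 && rows.hasNext()) {
            batch.add(rows.next());
            remaining--;
        }
        return batch;
    }
}
```

With seven input rows, a batch size of 3, and a limit of 5, this sketch yields one batch of three rows followed by one batch of two.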

## Example results

The format of the result when resultFormat equals `list`:

```json
@@ -154,4 +157,19 @@ The format of the result when resultFormat equals to `compactedList`:
The biggest difference between the select query and the scan query is that the scan query does not retain all rows in memory before returning them to the client.
A select query causes memory pressure when it has to return too many rows.
The scan query does not have this issue.
A scan query can return all rows without issuing another pagination query, which is extremely useful when querying a historical or realtime node directly.

## Legacy mode

The Scan query supports a legacy mode designed for protocol compatibility with the former scan-query contrib extension.
In legacy mode you can expect the following behavior changes:

- The __time column is returned as "timestamp" rather than "__time". This will take precedence over any other column
you may have that is named "timestamp".
- The __time column is included in the list of columns even if you do not specifically ask for it.
- Timestamps are returned as ISO8601 time strings rather than integers (milliseconds since 1970-01-01 00:00:00 UTC).

Legacy mode can be triggered either by passing `"legacy" : true` in your query JSON, or by setting
`druid.query.scan.legacy = true` on your Druid nodes. If you were previously using the scan-query contrib extension,
the best way to migrate is to activate legacy mode during a rolling upgrade, then switch it off after the upgrade
is complete.
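
The behavior changes listed above can be sketched as a row transformation. This is a hypothetical illustration, not Druid code: the `LegacyRows.toLegacy` helper and the map-based row representation are assumptions, the input row is assumed to always carry `__time` as epoch milliseconds, and `java.time.Instant` stands in for the ISO8601 formatting, so the exact string form may differ from what Druid emits.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the legacy-mode output rules; not Druid's implementation.
final class LegacyRows {
    static final String TIME_COLUMN = "__time";
    static final String LEGACY_TIME_COLUMN = "timestamp";

    // Converts a row keyed by "__time" (epoch millis) into its legacy form.
    static Map<String, Object> toLegacy(Map<String, Object> row) {
        Map<String, Object> out = new LinkedHashMap<>();
        // The time column is always emitted, renamed to "timestamp",
        // and rendered as an ISO8601 string instead of an integer.
        long millis = ((Number) row.get(TIME_COLUMN)).longValue();
        out.put(LEGACY_TIME_COLUMN, Instant.ofEpochMilli(millis).toString());
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            String name = entry.getKey();
            // The renamed time column takes precedence over any user column
            // that happens to be called "timestamp".
            if (name.equals(TIME_COLUMN) || name.equals(LEGACY_TIME_COLUMN)) {
                continue;
            }
            out.put(name, entry.getValue());
        }
        return out;
    }
}
```

A client written against the former contrib extension would read the `timestamp` key of each row, which is why legacy mode is useful during a rolling upgrade.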
8 changes: 8 additions & 0 deletions docs/content/querying/select-query.md
@@ -2,6 +2,7 @@
layout: doc_page
---
# Select Queries

Select queries return raw Druid rows and support pagination.

```json
@@ -19,6 +20,13 @@ Select queries return raw Druid rows and support pagination.
}
```

<div class="note info">
Consider using the [Scan query](scan-query.html) instead of the Select query if you don't need pagination, and you
don't need the strict time-ascending or time-descending ordering offered by the Select query. The Scan query returns
results without pagination, and offers "looser" ordering than Select, but is significantly more efficient in terms of
both processing time and memory requirements. It is also capable of returning a virtually unlimited number of results.
</div>

There are several main parts to a select query:

|property|description|required?|
4 changes: 3 additions & 1 deletion docs/content/querying/sql.md
@@ -256,7 +256,9 @@ converted to zeroes).

## Query execution

Queries without aggregations will use Druid's [Select](select-query.html) native query type.
Queries without aggregations will use Druid's [Scan](scan-query.html) or [Select](select-query.html) native query types.
Scan is used whenever possible, as it is generally higher performance and more efficient than Select. However, Select
is used in one case: when the query includes an `ORDER BY __time`, since Scan does not have a sorting feature.
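
The selection rule described above can be sketched as a small function. The `NativeQueryChooser` class and its boolean input are hypothetical; Druid's SQL planner is considerably more involved, and this sketch only mirrors the rule as stated in the text.

```java
// Hypothetical sketch of the rule described above; not the actual SQL planner.
final class NativeQueryChooser {
    enum NativeQueryType { SCAN, SELECT }

    // Picks the native query type for a SQL query without aggregations.
    static NativeQueryType forNonAggregatingQuery(boolean ordersByTime) {
        // Scan has no sorting feature, so ORDER BY __time falls back to Select.
        return ordersByTime ? NativeQueryType.SELECT : NativeQueryType.SCAN;
    }
}
```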

Aggregation queries (using GROUP BY, DISTINCT, or any aggregation functions) will use one of Druid's three native
aggregation query types. Two (Timeseries and TopN) are specialized for specific types of aggregations, whereas the other
1 change: 1 addition & 0 deletions docs/content/toc.md
@@ -34,6 +34,7 @@ layout: toc
* [DataSource Metadata](/docs/VERSION/querying/datasourcemetadataquery.html)
* [Search](/docs/VERSION/querying/searchquery.html)
* [Select](/docs/VERSION/querying/select-query.html)
* [Scan](/docs/VERSION/querying/scan-query.html)
* Components
* [Datasources](/docs/VERSION/querying/datasource.html)
* [Filters](/docs/VERSION/querying/filters.html)
63 changes: 0 additions & 63 deletions extensions-contrib/scan-query/pom.xml

This file was deleted.

This file was deleted.

This file was deleted.

1 change: 0 additions & 1 deletion pom.xml
@@ -134,7 +134,6 @@
<module>extensions-contrib/virtual-columns</module>
<module>extensions-contrib/thrift-extensions</module>
<module>extensions-contrib/ambari-metrics-emitter</module>
<module>extensions-contrib/scan-query</module>
<module>extensions-contrib/sqlserver-metadata-storage</module>
<module>extensions-contrib/kafka-emitter</module>
<module>extensions-contrib/redis-cache</module>
3 changes: 3 additions & 0 deletions processing/src/main/java/io/druid/query/Query.java
@@ -27,6 +27,7 @@
import io.druid.query.filter.DimFilter;
import io.druid.query.groupby.GroupByQuery;
import io.druid.query.metadata.metadata.SegmentMetadataQuery;
import io.druid.query.scan.ScanQuery;
import io.druid.query.search.search.SearchQuery;
import io.druid.query.select.SelectQuery;
import io.druid.query.spec.QuerySegmentSpec;
@@ -46,6 +47,7 @@
@JsonSubTypes.Type(name = Query.SEARCH, value = SearchQuery.class),
@JsonSubTypes.Type(name = Query.TIME_BOUNDARY, value = TimeBoundaryQuery.class),
@JsonSubTypes.Type(name = Query.GROUP_BY, value = GroupByQuery.class),
@JsonSubTypes.Type(name = Query.SCAN, value = ScanQuery.class),
@JsonSubTypes.Type(name = Query.SEGMENT_METADATA, value = SegmentMetadataQuery.class),
@JsonSubTypes.Type(name = Query.SELECT, value = SelectQuery.class),
@JsonSubTypes.Type(name = Query.TOPN, value = TopNQuery.class),
@@ -58,6 +60,7 @@ public interface Query<T>
String SEARCH = "search";
String TIME_BOUNDARY = "timeBoundary";
String GROUP_BY = "groupBy";
String SCAN = "scan";
String SEGMENT_METADATA = "segmentMetadata";
String SELECT = "select";
String TOPN = "topN";
@@ -22,6 +22,7 @@
import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import io.druid.guice.annotations.ExtensionPoint;
import io.druid.java.util.common.Cacheable;
import io.druid.query.lookup.LookupExtractionFn;
import io.druid.query.lookup.RegisteredLookupExtractionFn;

@@ -57,16 +58,8 @@
* regular expression with a capture group. When the regular expression matches the value of a dimension,
* the value captured by the group is used for grouping operations instead of the dimension value.
*/
public interface ExtractionFn
public interface ExtractionFn extends Cacheable
{
/**
* Returns a byte[] unique to all concrete implementations of DimExtractionFn. This byte[] is used to
* generate a cache key for the specific query.
*
* @return a byte[] unit to all concrete implements of DimExtractionFn
*/
public byte[] getCacheKey();

/**
* The "extraction" function. This should map an Object into some String value.
* <p>
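
The `ExtractionFn` change in this diff removes `getCacheKey()` from the interface and has it extend `Cacheable` instead. A minimal sketch of that shape follows. The `Cacheable` declaration here is an assumption inferred from the removed method, the `ExtractionFn` shown is a simplified stand-in with a single method, and `SubstringFn` with its type-id byte is purely hypothetical.

```java
import java.nio.ByteBuffer;

// Assumed shape of the supertype, inferred from the method removed from ExtractionFn.
interface Cacheable {
    byte[] getCacheKey();
}

// Simplified stand-in; the real interface declares more than this one method.
interface ExtractionFn extends Cacheable {
    String apply(Object value);
}

// Hypothetical implementation used only to show how a cache key might be built.
final class SubstringFn implements ExtractionFn {
    private static final byte TYPE_ID = 0x7F; // made-up id for this sketch
    private final int index;

    SubstringFn(int index) {
        this.index = index;
    }

    // Maps the value to the part of its string form starting at index.
    @Override
    public String apply(Object value) {
        String s = String.valueOf(value);
        return index < s.length() ? s.substring(index) : "";
    }

    // The key combines a type id with the parameters, so two functions
    // configured differently never share a cache entry.
    @Override
    public byte[] getCacheKey() {
        return ByteBuffer.allocate(1 + Integer.BYTES).put(TYPE_ID).putInt(index).array();
    }
}
```

Because the key is exposed through the supertype, code that only needs a cache key can accept any `Cacheable` rather than depending on `ExtractionFn` specifically.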