Closed
Labels
enhancement (New feature or request)
Description
Describe the bug
If you query a 10 GB file on S3 remote storage, the following request:
SELECT * FROM test LIMIT 1;
will try to read the WHOLE file (10 GB) instead of just the first row (chunk).
To Reproduce
Steps to reproduce the behavior:
- Put a 1 GB CSV file on S3
- Register the s3 contrib object store (the object store itself works fine):
// let mut ctx: Context = Context::new_local(&session_config);
let mut ctx = {
    let runtime = RuntimeEnv::new(RuntimeConfig::default()).unwrap();
    runtime.register_object_store("s3", Arc::new(S3FileSystem::default().await));
    Context::Local(SessionContext::with_config_rt(
        session_config.clone(),
        Arc::new(runtime.clone()),
    ))
};
- CREATE EXTERNAL TABLE test (...) STORED AS CSV WITH HEADER ROW LOCATION 's3://blah/blah.csv';
- SELECT * FROM test LIMIT 1;
The log shows the full byte range being requested:
list file from: s3://blah/blah.csv
sync_chunk_reader: 0-10428263736
sending get object request blah/blah.csv
ArrowError(ExternalError(Custom { kind: TimedOut, error: AWS("Timeout") }))
Expected behavior
It should read only a chunk large enough to satisfy the LIMIT 1 query.
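The expected behavior can be sketched with plain std I/O: a limit-aware reader pulls bytes through a bounded buffer and stops as soon as it has produced the requested rows. `read_first_rows` and the `max_bytes` cap are hypothetical names for illustration only, not DataFusion APIs; the byte cap stands in for a ranged S3 GET (`Range: bytes=0-N`).

```rust
use std::io::{BufRead, BufReader, Read};

// Hypothetical sketch of the expected behavior, NOT the DataFusion
// implementation: cap the bytes requested (as a ranged S3 GET would)
// and stop reading once `limit` data rows have been produced.
fn read_first_rows<R: Read>(source: R, limit: usize, max_bytes: u64) -> Vec<String> {
    // `take(max_bytes)` bounds how much of the object is ever fetched.
    let reader = BufReader::new(source.take(max_bytes));
    reader
        .lines()
        .skip(1)              // skip the CSV header row
        .take(limit)          // lazy: no bytes are pulled past the last needed row
        .filter_map(|line| line.ok())
        .collect()
}

fn main() {
    let csv = "a,b\n1,2\n3,4\n"; // stands in for the CSV object on S3
    // LIMIT 1: only the header and one data row are ever read.
    println!("{:?}", read_first_rows(csv.as_bytes(), 1, 64));
}
```

Because `take(limit)` is lazy, the iterator stops pulling from the underlying reader after the first row, so no further chunks would be requested from the remote store.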
Additional context
The contrib module is fine... It's the engine that requests this epic length (the entire 0-10428263736 byte range).
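One way the engine could avoid requesting the epic range is to clamp the initial read window whenever the plan carries a row limit. This is a hypothetical sketch; `chunk_range`, `avg_row_size_hint`, and the 2x safety factor are illustration-only assumptions, not names from the DataFusion codebase.

```rust
// Hypothetical range planning: when the query carries LIMIT n, ask the
// object store for a small initial window instead of 0..file_size.
// The reader can issue follow-up ranged GETs if rows turn out larger.
fn chunk_range(file_size: u64, limit: Option<usize>, avg_row_size_hint: u64) -> (u64, u64) {
    match limit {
        // No limit: a full scan genuinely needs the whole object.
        None => (0, file_size),
        // LIMIT n: fetch roughly 2x the estimated bytes for n rows,
        // never more than the file itself.
        Some(n) => (0, file_size.min(2 * avg_row_size_hint * n as u64)),
    }
}

fn main() {
    // The 10 GB file from this report, with LIMIT 1 and a 4 KiB row hint:
    // a few KiB are requested instead of the whole object.
    println!("{:?}", chunk_range(10_428_263_736, Some(1), 4096));
}
```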