Auto start testcontainers for datafusion-cli
#16644
Changes from all commits
5021bc5
6e189f0
9f987fd
ed8c85e
6dba785
3b41443
42ca948
4cf2dcc
f454866
fa9d170
ca16255
2198c46
bcddd99
d941ae6
aebb86c
bf7f857
4deb0eb
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -29,47 +29,26 @@ cargo test | |
|
|
||
| ## Running Storage Integration Tests | ||
|
|
||
| By default, storage integration tests are not run. To run them you will need to set `TEST_STORAGE_INTEGRATION=1` and | ||
| then provide the necessary configuration for that object store. | ||
| By default, storage integration tests are not run. These tests use the `testcontainers` crate to start up a local MinIO server using Docker on port 9000. | ||
|
|
||
| For some of the tests, [snapshots](https://datafusion.apache.org/contributor-guide/testing.html#snapshot-testing) are used. | ||
|
|
||
| ### AWS | ||
|
|
||
| To test the S3 integration against [Minio](https://github.com/minio/minio) | ||
|
|
||
| First start up a container with Minio and load test files. | ||
| To run them you will need to set `TEST_STORAGE_INTEGRATION`: | ||
|
|
||
| ```shell | ||
| docker run -d \ | ||
| --name datafusion-test-minio \ | ||
| -p 9000:9000 \ | ||
| -e MINIO_ROOT_USER=TEST-DataFusionLogin \ | ||
| -e MINIO_ROOT_PASSWORD=TEST-DataFusionPassword \ | ||
| -v $(pwd)/../datafusion/core/tests/data:/source \ | ||
| quay.io/minio/minio server /data | ||
|
|
||
| docker exec datafusion-test-minio /bin/sh -c "\ | ||
| mc ready local | ||
| mc alias set localminio http://localhost:9000 TEST-DataFusionLogin TEST-DataFusionPassword && \ | ||
| mc mb localminio/data && \ | ||
| mc cp -r /source/* localminio/data" | ||
| TEST_STORAGE_INTEGRATION=1 cargo test | ||
|
Contributor
When I first ran this command without Docker running, several commands failed. Once I started Docker, it worked great.
Contributor
Note that it didn't show any errors about starting the container in the logs / output.
Contributor
    if ! docker info > /dev/null 2>&1; then
        echo "This script requires docker to be running. Please start docker and try again."
        exit 1
    fi
???
Member
Author
Thanks, updated in 4deb0eb |
||
| ``` | ||
|
|
||
| Setup environment | ||
| For some of the tests, [snapshots](https://datafusion.apache.org/contributor-guide/testing.html#snapshot-testing) are used. | ||
|
|
||
| ```shell | ||
| export TEST_STORAGE_INTEGRATION=1 | ||
| export AWS_ACCESS_KEY_ID=TEST-DataFusionLogin | ||
| export AWS_SECRET_ACCESS_KEY=TEST-DataFusionPassword | ||
| export AWS_ENDPOINT=http://127.0.0.1:9000 | ||
| export AWS_ALLOW_HTTP=true | ||
| ``` | ||
| ### AWS | ||
|
|
||
| Note that `AWS_ENDPOINT` is set without slash at the end. | ||
| S3 integration is tested against [Minio](https://github.com/minio/minio) with [TestContainers](https://github.com/testcontainers/testcontainers-rs) | ||
| This requires Docker to be running on your machine and port 9000 to be free. | ||
|
|
||
| Run tests | ||
| If you see an error mentioning "failed to load IMDS session token" such as | ||
|
|
||
| ```shell | ||
| cargo test | ||
| ``` | ||
| > ---- object_storage::tests::s3_object_store_builder_resolves_region_when_none_provided stdout ---- | ||
| > Error: ObjectStore(Generic { store: "S3", source: "Error getting credentials from provider: an error occurred while loading credentials: failed to load IMDS session token" }) | ||
|
|
||
| You may need to disable fetching S3 credentials from the environment using the `AWS_EC2_METADATA_DISABLED` variable, for example: | ||
|
|
||
| > $ AWS_EC2_METADATA_DISABLED=true TEST_STORAGE_INTEGRATION=1 cargo test | ||
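The README change above requires Docker to be running, and the review thread notes that a missing Docker daemon originally produced no clear error. As a minimal sketch of an up-front check, assuming a `docker` binary on `PATH`, the Rust helpers below mirror the shell snippet suggested in the review. Both function names are hypothetical and are not part of this PR; the PR itself reports the problem by matching on `TestcontainersError::Client`.

```rust
use std::process::{Command, Stdio};

/// Returns true if `program` can be spawned with `args` and exits with status 0.
/// A missing binary and a non-zero exit status both count as failure.
fn command_succeeds(program: &str, args: &[&str]) -> bool {
    Command::new(program)
        .args(args)
        // Discard output, like `> /dev/null 2>&1` in the shell version.
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}

/// Hypothetical helper: equivalent to `docker info > /dev/null 2>&1`.
fn docker_is_running() -> bool {
    command_succeeds("docker", &["info"])
}
```

A test could call `docker_is_running()` first and skip or fail with an explicit message when it returns `false`, rather than relying on a later container start error.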
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,7 +21,12 @@ use rstest::rstest; | |
|
|
||
| use insta::{glob, Settings}; | ||
| use insta_cmd::{assert_cmd_snapshot, get_cargo_bin}; | ||
| use std::path::PathBuf; | ||
| use std::{env, fs}; | ||
| use testcontainers::core::{CmdWaitFor, ExecCommand, Mount}; | ||
| use testcontainers::runners::AsyncRunner; | ||
| use testcontainers::{ContainerAsync, ImageExt, TestcontainersError}; | ||
| use testcontainers_modules::minio; | ||
|
|
||
| fn cli() -> Command { | ||
| Command::new(get_cargo_bin("datafusion-cli")) | ||
|
|
@@ -35,6 +40,83 @@ fn make_settings() -> Settings { | |
| settings | ||
| } | ||
|
|
||
| async fn setup_minio_container() -> ContainerAsync<minio::MinIO> { | ||
| const MINIO_ROOT_USER: &str = "TEST-DataFusionLogin"; | ||
|
Contributor
I think it is very nice that these environment variables now get set up via the test harness rather than having to be set up outside |
||
| const MINIO_ROOT_PASSWORD: &str = "TEST-DataFusionPassword"; | ||
|
|
||
| let data_path = | ||
| PathBuf::from(env!("CARGO_MANIFEST_DIR")).join("../datafusion/core/tests/data"); | ||
|
|
||
| let absolute_data_path = data_path | ||
| .canonicalize() | ||
| .expect("Failed to get absolute path for test data"); | ||
|
|
||
| let container = minio::MinIO::default() | ||
| .with_env_var("MINIO_ROOT_USER", MINIO_ROOT_USER) | ||
| .with_env_var("MINIO_ROOT_PASSWORD", MINIO_ROOT_PASSWORD) | ||
| .with_mount(Mount::bind_mount( | ||
| absolute_data_path.to_str().unwrap(), | ||
| "/source", | ||
| )) | ||
| .start() | ||
| .await; | ||
|
|
||
| match container { | ||
| Ok(container) => { | ||
| // We wait for MinIO to be healthy and prepare test files. We do it via CLI to avoid s3 dependency | ||
| let commands = [ | ||
| ExecCommand::new(["/usr/bin/mc", "ready", "local"]), | ||
| ExecCommand::new([ | ||
| "/usr/bin/mc", | ||
| "alias", | ||
| "set", | ||
| "localminio", | ||
| "http://localhost:9000", | ||
| MINIO_ROOT_USER, | ||
| MINIO_ROOT_PASSWORD, | ||
| ]), | ||
| ExecCommand::new(["/usr/bin/mc", "mb", "localminio/data"]), | ||
| ExecCommand::new([ | ||
| "/usr/bin/mc", | ||
| "cp", | ||
| "-r", | ||
| "/source/", | ||
| "localminio/data/", | ||
| ]), | ||
| ]; | ||
|
|
||
| for command in commands { | ||
| let command = | ||
| command.with_cmd_ready_condition(CmdWaitFor::Exit { code: Some(0) }); | ||
|
|
||
| let cmd_ref = format!("{command:?}"); | ||
|
|
||
| if let Err(e) = container.exec(command).await { | ||
| let stdout = container.stdout_to_vec().await.unwrap_or_default(); | ||
| let stderr = container.stderr_to_vec().await.unwrap_or_default(); | ||
|
|
||
| panic!( | ||
| "Failed to execute command: {}\nError: {}\nStdout: {:?}\nStderr: {:?}", | ||
| cmd_ref, | ||
| e, | ||
| String::from_utf8_lossy(&stdout), | ||
| String::from_utf8_lossy(&stderr) | ||
| ); | ||
| } | ||
| } | ||
|
|
||
| container | ||
| } | ||
|
|
||
| Err(TestcontainersError::Client(e)) => { | ||
| panic!("Failed to start MinIO container. Ensure Docker is running and accessible: {e}"); | ||
| } | ||
| Err(e) => { | ||
| panic!("Failed to start MinIO container: {e}"); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| #[cfg(test)] | ||
| #[ctor::ctor] | ||
| fn init() { | ||
|
|
@@ -165,12 +247,22 @@ async fn test_cli() { | |
| return; | ||
| } | ||
|
|
||
| let container = setup_minio_container().await; | ||
|
|
||
| let settings = make_settings(); | ||
| let _bound = settings.bind_to_scope(); | ||
|
|
||
| let port = container.get_host_port_ipv4(9000).await.unwrap(); | ||
|
|
||
| glob!("sql/integration/*.sql", |path| { | ||
| let input = fs::read_to_string(path).unwrap(); | ||
| assert_cmd_snapshot!(cli().pass_stdin(input)) | ||
| assert_cmd_snapshot!(cli() | ||
| .env_clear() | ||
| .env("AWS_ACCESS_KEY_ID", "TEST-DataFusionLogin") | ||
| .env("AWS_SECRET_ACCESS_KEY", "TEST-DataFusionPassword") | ||
| .env("AWS_ENDPOINT", format!("http://localhost:{port}")) | ||
| .env("AWS_ALLOW_HTTP", "true") | ||
| .pass_stdin(input)) | ||
| }); | ||
| } | ||
|
|
||
|
|
@@ -186,20 +278,17 @@ async fn test_aws_options() { | |
| let settings = make_settings(); | ||
| let _bound = settings.bind_to_scope(); | ||
|
|
||
| let access_key_id = | ||
| env::var("AWS_ACCESS_KEY_ID").expect("AWS_ACCESS_KEY_ID is not set"); | ||
| let secret_access_key = | ||
| env::var("AWS_SECRET_ACCESS_KEY").expect("AWS_SECRET_ACCESS_KEY is not set"); | ||
| let endpoint_url = env::var("AWS_ENDPOINT").expect("AWS_ENDPOINT is not set"); | ||
| let container = setup_minio_container().await; | ||
| let port = container.get_host_port_ipv4(9000).await.unwrap(); | ||
|
|
||
| let input = format!( | ||
| r#"CREATE EXTERNAL TABLE CARS | ||
| STORED AS CSV | ||
| LOCATION 's3://data/cars.csv' | ||
| OPTIONS( | ||
| 'aws.access_key_id' '{access_key_id}', | ||
| 'aws.secret_access_key' '{secret_access_key}', | ||
| 'aws.endpoint' '{endpoint_url}', | ||
| 'aws.access_key_id' 'TEST-DataFusionLogin', | ||
| 'aws.secret_access_key' 'TEST-DataFusionPassword', | ||
| 'aws.endpoint' 'http://localhost:{port}', | ||
| 'aws.allow_http' 'true' | ||
| ); | ||
|
|
||
|
|
||
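Both tests in the diff above derive the endpoint from the host port that the container maps to port 9000, and pass the same fixed credentials. A minimal sketch of collecting those settings in one place is shown below. The function names are hypothetical and the sketch assumes a plain `std::process::Command`; the variable names and values are the ones that appear in the diff.

```rust
use std::process::Command;

/// Hypothetical helper: environment for a MinIO instance reachable on `host_port`.
fn minio_cli_env(host_port: u16) -> Vec<(&'static str, String)> {
    vec![
        ("AWS_ACCESS_KEY_ID", "TEST-DataFusionLogin".to_string()),
        ("AWS_SECRET_ACCESS_KEY", "TEST-DataFusionPassword".to_string()),
        // The endpoint is written without a trailing slash.
        ("AWS_ENDPOINT", format!("http://localhost:{host_port}")),
        ("AWS_ALLOW_HTTP", "true".to_string()),
    ]
}

/// Clears the inherited environment and applies only the MinIO settings,
/// so credentials from the developer's shell cannot leak into the test.
fn with_minio_env(mut cmd: Command, host_port: u16) -> Command {
    cmd.env_clear().envs(minio_cli_env(host_port));
    cmd
}
```

Clearing the environment first is the same choice the diff makes with `.env_clear()`: the test then depends only on the values it sets itself.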
fyi, I am not sure why but locally I am unable to properly run minio with version 0.12, I have this in my Cargo.toml:
can you please post the error?
It just hangs locally (NOT this code, but my own tests). I think it's because of something related to waiting for minio to come up - I'm not doing the mc commands like you are.
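If the hang is indeed related to waiting for MinIO to come up, one way to turn it into a clear failure is to bound the wait. The sketch below polls a TCP address until a connection succeeds or a timeout elapses. It uses only the standard library, the function name is hypothetical, and a successful TCP connect is only a rough proxy for readiness; the PR itself waits with `mc ready local` inside the container.

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical helper: returns true once a TCP connection to `addr` succeeds,
/// or false if `timeout` elapses first.
fn wait_for_tcp(addr: SocketAddr, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        // Each attempt is itself bounded so a silent peer cannot stall the loop.
        if TcpStream::connect_timeout(&addr, Duration::from_millis(200)).is_ok() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        sleep(Duration::from_millis(100));
    }
}
```

A test could call this with the mapped host port and panic with a descriptive message when it returns `false`, instead of blocking indefinitely.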