diff --git a/README.md b/README.md
index 4df7b46..a86c607 100644
--- a/README.md
+++ b/README.md
@@ -56,7 +56,7 @@ While not used directly by `dsub` for the `google-batch` provider, you are likel
Cloud SDK](https://cloud.google.com/sdk/).
If you will be using the `local` provider for faster job development,
-you *will* need to install the Google Cloud SDK, which uses `gsutil` to ensure
+you *will* need to install the Google Cloud SDK, which provides the `gcloud storage` command used to ensure
file operation semantics consistent with the Google `dsub` providers.
1. [Install the Google Cloud SDK](https://cloud.google.com/sdk/)
@@ -182,10 +182,10 @@ The steps for getting started differ slightly as indicated in the steps below:
The dsub logs and output files will be written to a bucket. Create a bucket
using the [storage browser](https://console.cloud.google.com/storage/browser?project=)
- or run the command-line utility [gsutil](https://cloud.google.com/storage/docs/gsutil),
+ or run the command-line utility [gcloud storage](https://cloud.google.com/storage/docs/gcloud-storage),
included in the Cloud SDK.
- gsutil mb gs://my-bucket
+ gcloud storage buckets create gs://my-bucket
Change `my-bucket` to a unique name that follows the
[bucket-naming conventions](https://cloud.google.com/storage/docs/bucket-naming).
@@ -215,7 +215,7 @@ The steps for getting started differ slightly as indicated in the steps below:
1. View the output file.
- gsutil cat gs://my-bucket/output/out.txt
+ gcloud storage cat gs://my-bucket/output/out.txt
## Backend providers
@@ -351,8 +351,8 @@ by:
To upload the files to Google Cloud Storage, you can use the
[storage browser](https://console.cloud.google.com/storage/browser?project=) or
-[gsutil](https://cloud.google.com/storage/docs/gsutil). You can also run on data
-that’s public or shared with your service account, an email address that you
+[gcloud storage](https://cloud.google.com/storage/docs/gcloud-storage). 
You can also run on data +that's public or shared with your service account, an email address that you can find in the [Google Cloud Console](https://console.cloud.google.com). #### Files @@ -728,7 +728,7 @@ of the service account will be `sa-name@project-id.iam.gserviceaccount.com`. 2. Grant IAM access on buckets, etc. to the service account. - gsutil iam ch serviceAccount:sa-name@project-id.iam.gserviceaccount.com:roles/storage.objectAdmin gs://bucket-name + gcloud storage buckets add-iam-policy-binding gs://bucket-name --member=serviceAccount:sa-name@project-id.iam.gserviceaccount.com --role=roles/storage.objectAdmin 3. Update your `dsub` command to include `--service-account` diff --git a/docs/code.md b/docs/code.md index 2e465ec..6c6a3d3 100644 --- a/docs/code.md +++ b/docs/code.md @@ -187,7 +187,7 @@ To run the driver script, first copy `script1.sh` and `script2.sh` to cloud storage: ``` -gsutil cp my-code/script1.sh my-code/script2.sh gs://MY-BUCKET/my-code/ +gcloud storage cp my-code/script1.sh my-code/script2.sh gs://MY-BUCKET/my-code/ ``` Then launch a dsub job: @@ -205,7 +205,7 @@ Extending the previous example, you could copy `script1.sh` and `script2.sh` to cloud storage with: ``` -gsutil rsync -r my-code gs://MY-BUCKET/my-code/ +gcloud storage rsync --recursive my-code gs://MY-BUCKET/my-code/ ``` and then launch a `dsub` job with: diff --git a/docs/providers/README.md b/docs/providers/README.md index c482dc8..1e71cb0 100644 --- a/docs/providers/README.md +++ b/docs/providers/README.md @@ -132,7 +132,7 @@ copying output files. The copying of files is performed in the host environment, not inside the Docker container. This means that for copying to/from Google Cloud Storage, the host environment requires a copy of -[gsutil](https://cloud.google.com/storage/docs/gsutil) to be installed. +[gcloud](https://cloud.google.com/cli) to be installed. 
#### Container runtime environment diff --git a/dsub/lib/param_util.py b/dsub/lib/param_util.py index c5b9dba..ccddf48 100644 --- a/dsub/lib/param_util.py +++ b/dsub/lib/param_util.py @@ -827,23 +827,23 @@ def directory_fmt(directory): Multiple files copy, works as intended in all cases: $ touch a.txt b.txt - $ gsutil cp ./*.txt gs://mybucket/text_dest - $ gsutil ls gs://mybucket/text_dest/ + $ gcloud storage cp ./*.txt gs://mybucket/text_dest + $ gcloud storage ls gs://mybucket/text_dest/ 0 2017-07-19T21:44:36Z gs://mybucket/text_dest/a.txt 0 2017-07-19T21:44:36Z gs://mybucket/text_dest/b.txt TOTAL: 2 objects, 0 bytes (0 B) Single file copy fails to copy into a directory: $ touch 1.bam - $ gsutil cp ./*.bam gs://mybucket/bad_dest - $ gsutil ls gs://mybucket/bad_dest + $ gcloud storage cp ./*.bam gs://mybucket/bad_dest + $ gcloud storage ls gs://mybucket/bad_dest 0 2017-07-19T21:46:16Z gs://mybucket/bad_dest TOTAL: 1 objects, 0 bytes (0 B) Adding a trailing forward slash fixes this: $ touch my.sam - $ gsutil cp ./*.sam gs://mybucket/good_folder - $ gsutil ls gs://mybucket/good_folder + $ gcloud storage cp ./*.sam gs://mybucket/good_folder + $ gcloud storage ls gs://mybucket/good_folder 0 2017-07-19T21:46:16Z gs://mybucket/good_folder/my.sam TOTAL: 1 objects, 0 bytes (0 B) diff --git a/dsub/lib/providers_util.py b/dsub/lib/providers_util.py index 5700ea5..f75b626 100644 --- a/dsub/lib/providers_util.py +++ b/dsub/lib/providers_util.py @@ -20,7 +20,7 @@ from .._dsub_version import DSUB_VERSION _LOCALIZE_COMMAND_MAP = { - job_model.P_GCS: 'gsutil -m rsync -r', + job_model.P_GCS: 'gcloud storage rsync --recursive', job_model.P_LOCAL: 'rsync -r', } diff --git a/dsub/providers/google_utils.py b/dsub/providers/google_utils.py index cea1196..9839159 100644 --- a/dsub/providers/google_utils.py +++ b/dsub/providers/google_utils.py @@ -68,7 +68,7 @@ def make_runtime_dirs_command(script_dir: str, tmp_dir: str, # pylint: enable=g-complex-comprehension -# Action steps that 
interact with GCS need gsutil and Python. +# Action steps that interact with GCS need gcloud storage and Python. # Use the 'slim' variant of the cloud-sdk image as it is much smaller. CLOUD_SDK_IMAGE = 'gcr.io/google.com/cloudsdktool/cloud-sdk:294.0.0-slim' @@ -94,7 +94,7 @@ def make_runtime_dirs_command(script_dir: str, tmp_dir: str, } """) -# Define a bash function for "gsutil cp" to be used by the logging, +# Define a bash function for "gcloud storage cp" to be used by the logging, # localization, and delocalization actions. GSUTIL_CP_FN = textwrap.dedent("""\ function gsutil_cp() { @@ -103,30 +103,30 @@ def make_runtime_dirs_command(script_dir: str, tmp_dir: str, local content_type="${3}" local user_project_name="${4}" - local headers="" + local content_type_flag="" if [[ -n "${content_type}" ]]; then - headers="-h Content-Type:${content_type}" + content_type_flag="--content-type=${content_type}" fi local user_project_flag="" if [[ -n "${user_project_name}" ]]; then - user_project_flag="-u ${user_project_name}" + user_project_flag="--billing-project=${user_project_name}" fi local attempt for ((attempt = 0; attempt < 4; attempt++)); do - log_info "gsutil ${headers} ${user_project_flag} -mq cp \"${src}\" \"${dst}\"" - if gsutil ${headers} ${user_project_flag} -mq cp "${src}" "${dst}"; then + log_info "gcloud storage cp ${content_type_flag} ${user_project_flag} --no-user-output-enabled \"${src}\" \"${dst}\"" + if gcloud storage cp ${content_type_flag} ${user_project_flag} --no-user-output-enabled "${src}" "${dst}"; then return fi if (( attempt < 3 )); then - log_warning "Sleeping 10s before the next attempt of failed gsutil command" - log_warning "gsutil ${headers} ${user_project_flag} -mq cp \"${src}\" \"${dst}\"" + log_warning "Sleeping 10s before the next attempt of failed gcloud storage command" + log_warning "gcloud storage cp ${content_type_flag} ${user_project_flag} --no-user-output-enabled \"${src}\" \"${dst}\"" sleep 10s fi done - log_error "gsutil 
${headers} ${user_project_flag} -mq cp \"${src}\" \"${dst}\"" + log_error "gcloud storage cp ${content_type_flag} ${user_project_flag} --no-user-output-enabled \"${src}\" \"${dst}\"" exit 1 } """) @@ -144,7 +144,7 @@ def make_runtime_dirs_command(script_dir: str, tmp_dir: str, return fi - # Copy the log files to a local temporary location so that our "gsutil cp" is never + # Copy the log files to a local temporary location so that our "gcloud storage cp" is never # executed on a file that is changing. local tmp_path="${tmp}/$(basename ${src})" @@ -154,7 +154,7 @@ def make_runtime_dirs_command(script_dir: str, tmp_dir: str, } """) -# Define a bash function for "gsutil rsync" to be used by the logging, +# Define a bash function for "gcloud storage rsync" to be used by the logging, # localization, and delocalization actions. GSUTIL_RSYNC_FN = textwrap.dedent("""\ function gsutil_rsync() { @@ -164,23 +164,23 @@ def make_runtime_dirs_command(script_dir: str, tmp_dir: str, local user_project_flag="" if [[ -n "${user_project_name}" ]]; then - user_project_flag="-u ${user_project_name}" + user_project_flag="--billing-project=${user_project_name}" fi local attempt for ((attempt = 0; attempt < 4; attempt++)); do - log_info "gsutil ${user_project_flag} -mq rsync -r \"${src}\" \"${dst}\"" - if gsutil ${user_project_flag} -mq rsync -r "${src}" "${dst}"; then + log_info "gcloud storage rsync ${user_project_flag} --recursive --no-user-output-enabled \"${src}\" \"${dst}\"" + if gcloud storage rsync ${user_project_flag} --recursive --no-user-output-enabled "${src}" "${dst}"; then return fi if (( attempt < 3 )); then - log_warning "Sleeping 10s before the next attempt of failed gsutil command" - log_warning "gsutil ${user_project_flag} -mq rsync -r \"${src}\" \"${dst}\"" + log_warning "Sleeping 10s before the next attempt of failed gcloud storage command" + log_warning "gcloud storage rsync ${user_project_flag} --recursive --no-user-output-enabled \"${src}\" \"${dst}\"" sleep 10s fi 
done - log_error "gsutil ${user_project_flag} -mq rsync -r \"${src}\" \"${dst}\"" + log_error "gcloud storage rsync ${user_project_flag} --recursive --no-user-output-enabled \"${src}\" \"${dst}\"" exit 1 } """) diff --git a/dsub/providers/local.py b/dsub/providers/local.py index b4b4814..8f04998 100644 --- a/dsub/providers/local.py +++ b/dsub/providers/local.py @@ -712,9 +712,9 @@ def _delocalize_logging_command(self, logging_path, user_project): elif logging_path.file_provider == job_model.P_GCS: mkdir_cmd = '' if user_project: - cp_cmd = 'gsutil -u {} -mq cp'.format(user_project) + cp_cmd = 'gcloud storage cp --billing-project={} --no-user-output-enabled'.format(user_project) else: - cp_cmd = 'gsutil -mq cp' + cp_cmd = 'gcloud storage cp --no-user-output-enabled' else: assert False @@ -773,7 +773,7 @@ def _localize_inputs_recursive_command(self, task_dir, inputs): return '\n'.join(provider_commands) def _get_input_target_path(self, local_file_path): - """Returns a directory or file path to be the target for "gsutil cp". + """Returns a directory or file path to be the target for "gcloud storage cp". If the filename contains a wildcard, then the target path must be a directory in order to ensure consistency whether the source pattern @@ -784,7 +784,7 @@ def _get_input_target_path(self, local_file_path): local_file_path: A full path terminating in a file or a file wildcard. Returns: - The path to use as the "gsutil cp" target. + The path to use as the "gcloud storage cp" target. 
""" path, filename = os.path.split(local_file_path) @@ -808,17 +808,17 @@ def _localize_inputs_command(self, task_dir, inputs, user_project): if i.file_provider in [job_model.P_LOCAL, job_model.P_GCS]: # The semantics that we expect here are implemented consistently in - # "gsutil cp", and are a bit different than "cp" when it comes to + # "gcloud storage cp", and are a bit different than "cp" when it comes to # wildcard handling, so use it for both local and GCS: # # - `cp path/* dest/` will error if "path" has subdirectories. # - `cp "path/*" "dest/"` will fail (it expects wildcard expansion # to come from shell). if user_project: - command = 'gsutil -u %s -mq cp "%s" "%s"' % ( + command = 'gcloud storage cp --billing-project=%s --no-user-output-enabled "%s" "%s"' % ( user_project, source_file_path, dest_file_path) else: - command = 'gsutil -mq cp "%s" "%s"' % (source_file_path, + command = 'gcloud storage cp --no-user-output-enabled "%s" "%s"' % (source_file_path, dest_file_path) commands.append(command) @@ -865,13 +865,13 @@ def _delocalize_outputs_commands(self, task_dir, outputs, user_project): if o.file_provider == job_model.P_LOCAL: commands.append('mkdir -p "%s"' % dest_path) - # Use gsutil even for local files (explained in _localize_inputs_command). + # Use gcloud storage even for local files (explained in _localize_inputs_command). 
if o.file_provider in [job_model.P_LOCAL, job_model.P_GCS]: if user_project: - command = 'gsutil -u %s -mq cp "%s" "%s"' % (user_project, local_path, + command = 'gcloud storage cp --billing-project=%s --no-user-output-enabled "%s" "%s"' % (user_project, local_path, dest_path) else: - command = 'gsutil -mq cp "%s" "%s"' % (local_path, dest_path) + command = 'gcloud storage cp --no-user-output-enabled "%s" "%s"' % (local_path, dest_path) commands.append(command) return '\n'.join(commands) diff --git a/examples/custom_scripts/README.md b/examples/custom_scripts/README.md index fd1d965..734f057 100644 --- a/examples/custom_scripts/README.md +++ b/examples/custom_scripts/README.md @@ -87,7 +87,7 @@ Because the `--wait` flag was set, `dsub` will block until the job completes. To list the output, use the command: ``` -gsutil ls gs://MY-BUCKET/get_vcf_sample_ids.sh/output +gcloud storage ls gs://MY-BUCKET/get_vcf_sample_ids.sh/output ``` Output should look like: @@ -99,7 +99,7 @@ gs://MY-BUCKET/get_vcf_sample_ids.sh/output/sample_ids.txt To see the first few lines of the sample IDs file, run: ``` -gsutil cat gs://MY-BUCKET/get_vcf_sample_ids.sh/output/sample_ids.txt | head -n 5 +gcloud storage cat gs://MY-BUCKET/get_vcf_sample_ids.sh/output/sample_ids.txt | head -n 5 ``` Output should look like: @@ -166,7 +166,7 @@ Because the `--wait` flag was set, `dsub` will block until the job completes. 
To list the output, use the command: ``` -gsutil ls gs://MY-BUCKET/get_vcf_sample_ids.py/output +gcloud storage ls gs://MY-BUCKET/get_vcf_sample_ids.py/output ``` Output should look like: @@ -178,7 +178,7 @@ gs://MY-BUCKET/get_vcf_sample_ids.py/output/sample_ids.txt To see the first few lines of the sample IDs file, run: ``` -gsutil cat gs://MY-BUCKET/get_vcf_sample_ids.py/output/sample_ids.txt | head -n 5 +gcloud storage cat gs://MY-BUCKET/get_vcf_sample_ids.py/output/sample_ids.txt | head -n 5 ``` Output should look like: @@ -265,7 +265,7 @@ When all tasks for the job have completed, `dsub` will exit. To list the output objects, use the command: ``` -gsutil ls gs://MY-BUCKET/get_vcf_sample_ids/output +gcloud storage ls gs://MY-BUCKET/get_vcf_sample_ids/output ``` Output should look like: diff --git a/examples/custom_scripts/submit_one.sh b/examples/custom_scripts/submit_one.sh index 2886bb9..7fb41b6 100755 --- a/examples/custom_scripts/submit_one.sh +++ b/examples/custom_scripts/submit_one.sh @@ -73,5 +73,5 @@ dsub \ # Check output echo "Check the head of the output file:" -2>&1 gsutil cat "${OUTPUT_FILE}" | head +2>&1 gcloud storage cat "${OUTPUT_FILE}" | head diff --git a/examples/decompress/README.md b/examples/decompress/README.md index 7fa2f81..c319cf7 100644 --- a/examples/decompress/README.md +++ b/examples/decompress/README.md @@ -65,7 +65,7 @@ Because the `--wait` flag was set, `dsub` will block until the job completes. 
To list the output, use the command: ``` -gsutil ls gs://MY-BUCKET/decompress_one/output +gcloud storage ls gs://MY-BUCKET/decompress_one/output ``` Output should look like: @@ -77,7 +77,7 @@ gs://MY-BUCKET/decompress_one/output/ALL.ChrY.Cornell.20130502.SNPs.Genotypes.vc To see the first few lines of the decompressed file, run: ``` -gsutil cat gs://MY-BUCKET/decompress_one/output/*.vcf | head -n 5 +gcloud storage cat gs://MY-BUCKET/decompress_one/output/*.vcf | head -n 5 ``` Output should look like: @@ -153,7 +153,7 @@ when all tasks for the job have completed, `dsub` will exit. To list the output objects, use the command: ``` -gsutil ls gs://MY-BUCKET/decompress_list/output +gcloud storage ls gs://MY-BUCKET/decompress_list/output ``` Output should look like: diff --git a/examples/fastqc/README.md b/examples/fastqc/README.md index ef02fb7..c9ddea0 100644 --- a/examples/fastqc/README.md +++ b/examples/fastqc/README.md @@ -113,7 +113,7 @@ Because the `--wait` flag was set, `dsub` will block until the job completes. To list the output, use the command: ``` -gsutil ls -l gs://MY-BUCKET/fastqc/submit_one/output +gcloud storage ls -l gs://MY-BUCKET/fastqc/submit_one/output ``` Output should look like: @@ -189,7 +189,7 @@ when all tasks for the job have completed, `dsub` will exit. To list the output objects, use the command: ``` -gsutil ls -l gs://MY-BUCKET/fastqc/submit_list/output +gcloud storage ls -l gs://MY-BUCKET/fastqc/submit_list/output ``` Output should look like: diff --git a/examples/samtools/README.md b/examples/samtools/README.md index 5f968ad..eeb7beb 100644 --- a/examples/samtools/README.md +++ b/examples/samtools/README.md @@ -77,7 +77,7 @@ Because the `--wait` flag was set, `dsub` will block until the job completes. 
To list the output, use the command:
```
-gsutil ls -l gs://MY-BUCKET/samtools/submit_one/output
+gcloud storage ls -l gs://MY-BUCKET/samtools/submit_one/output
```
Output should look like:
@@ -155,7 +155,7 @@ when all tasks for the job have completed, `dsub` will exit.
To list the output objects, use the command:
```
-gsutil ls -l gs://MY-BUCKET/samtools/submit_list/output
+gcloud storage ls -l gs://MY-BUCKET/samtools/submit_list/output
```
Output should look like:
diff --git a/examples/split_process/README.md b/examples/split_process/README.md
index ad71d35..32fd1a5 100644
--- a/examples/split_process/README.md
+++ b/examples/split_process/README.md
@@ -44,6 +44,6 @@ rm "${WORKSPACE}/temp/*"
```
WORKSPACE=gs://mybucket/someprefix
./demo_split_process.sh input.txt "${WORKSPACE}"
-gsutil ls "${WORKSPACE}/output/"
-gsutil rm "${WORKSPACE}/temp/*"
+gcloud storage ls "${WORKSPACE}/output/"
+gcloud storage rm "${WORKSPACE}/temp/*"
```
diff --git a/examples/split_process/demo_split_process.sh b/examples/split_process/demo_split_process.sh
index e33a1af..141bbf9 100755
--- a/examples/split_process/demo_split_process.sh
+++ b/examples/split_process/demo_split_process.sh
@@ -11,10 +11,10 @@
# example:
# WORKSPACE=gs://mybucket/someprefix
# ./demo_split_process.sh input.txt "${WORKSPACE}"
-# gsutil ls "${WORKSPACE}/output/"
-# gsutil rm "${WORKSPACE}/temp/*"
+# gcloud storage ls "${WORKSPACE}/output/"
+# gcloud storage rm "${WORKSPACE}/temp/*"
#
-# You need dsub, docker, and gsutil installed.
+# You need dsub, docker, and the gcloud CLI installed.
# Change WORKSPACE to point to a bucket you have write permission to.
# # Since this uses the local provider, you can set WORKSPACE to a local path, diff --git a/test/integration/e2e_accelerator.google-batch.sh b/test/integration/e2e_accelerator.google-batch.sh index 9a9a1c6..4f6ae4b 100755 --- a/test/integration/e2e_accelerator.google-batch.sh +++ b/test/integration/e2e_accelerator.google-batch.sh @@ -87,7 +87,7 @@ echo echo "Checking GPU detection output..." # Check that GPU was detected and accessible -RESULT="$(gsutil cat "${STDOUT_LOG}")" +RESULT="$(gcloud storage cat "${STDOUT_LOG}")" # Validate GPU hardware was detected if ! echo "${RESULT}" | grep -qi "Tesla T4"; then diff --git a/test/integration/e2e_accelerator.google-cls-v2.sh b/test/integration/e2e_accelerator.google-cls-v2.sh index 0960efc..3c13a56 100755 --- a/test/integration/e2e_accelerator.google-cls-v2.sh +++ b/test/integration/e2e_accelerator.google-cls-v2.sh @@ -45,7 +45,7 @@ echo echo "Checking output..." # Check the results -RESULT="$(gsutil cat "${STDOUT_LOG}")" +RESULT="$(gcloud storage cat "${STDOUT_LOG}")" if ! echo "${RESULT}" | grep -qi "GPU Memory"; then 1>&2 echo "GPU Memory not found in the dsub output!" 1>&2 echo "${RESULT}" diff --git a/test/integration/e2e_accelerator_vpc_sc.google-batch.sh b/test/integration/e2e_accelerator_vpc_sc.google-batch.sh index b798992..496c6f7 100755 --- a/test/integration/e2e_accelerator_vpc_sc.google-batch.sh +++ b/test/integration/e2e_accelerator_vpc_sc.google-batch.sh @@ -155,7 +155,7 @@ echo echo "Checking GPU detection output..." # Check that GPU was detected and accessible -RESULT="$(gsutil cat "${STDOUT_LOG}")" +RESULT="$(gcloud storage cat "${STDOUT_LOG}")" # Validate GPU hardware was detected if ! echo "${RESULT}" | grep -qi "Tesla T4"; then diff --git a/test/integration/e2e_after.sh b/test/integration/e2e_after.sh index 9b9c6b7..bb414d6 100755 --- a/test/integration/e2e_after.sh +++ b/test/integration/e2e_after.sh @@ -55,7 +55,7 @@ fi echo echo "Checking output..." 
-readonly RESULT="$(gsutil cat "${TEST_FILE_PATH_2}")"
+readonly RESULT="$(gcloud storage cat "${TEST_FILE_PATH_2}")"
if [[ "${RESULT}" != "hello world" ]]; then
echo "Output file does not match expected"
echo "Expected: hello world"
diff --git a/test/integration/e2e_block_external_network.google-cls-v2.sh b/test/integration/e2e_block_external_network.google-cls-v2.sh
index feb4889..6354176 100755
--- a/test/integration/e2e_block_external_network.google-cls-v2.sh
+++ b/test/integration/e2e_block_external_network.google-cls-v2.sh
@@ -31,9 +31,8 @@ echo "Launching pipeline..."
set +o errexit
-# Run gsutil with Boto:num_retries=0 option. Otherwise, gsutil will retry up to
-# 24 times due to the network error
-# https://stackoverflow.com/questions/44459685/sql-server-agent-job-and-gsutil
+# Note: gcloud storage commands automatically retry on transient network errors.
+# This test validates that the job fails when network access is blocked.
JOB_ID="$(run_dsub \
--image 'gcr.io/google.com/cloudsdktool/cloud-sdk:327.0.0-slim' \
--block-external-network \
@@ -54,9 +53,9 @@ readonly ATTEMPT_1_STDERR_LOG="$(dirname "${LOGGING}")/${TEST_NAME}.1-stderr.log"
readonly ATTEMPT_2_STDERR_LOG="$(dirname "${LOGGING}")/${TEST_NAME}.2-stderr.log"
for STDERR_LOG_FILE in "${ATTEMPT_1_STDERR_LOG}" "${ATTEMPT_2_STDERR_LOG}" ; do
- RESULT="$(gsutil cat "${STDERR_LOG_FILE}")"
+ RESULT="$(gcloud storage cat "${STDERR_LOG_FILE}")"
if ! echo "${RESULT}" | grep -qi "Unable to find the server at storage.googleapis.com"; then
- 1>&2 echo "Network error from gsutil not found in the dsub stderr log!"
+ 1>&2 echo "Network error from gcloud storage not found in the dsub stderr log!"
1>&2 echo "${RESULT}" exit 1 fi diff --git a/test/integration/e2e_cleanup.local.sh b/test/integration/e2e_cleanup.local.sh index 1677cf9..5064299 100755 --- a/test/integration/e2e_cleanup.local.sh +++ b/test/integration/e2e_cleanup.local.sh @@ -26,7 +26,7 @@ readonly SCRIPT_DIR="$(dirname "${0}")" source "${SCRIPT_DIR}/test_setup_e2e.sh" # Stage a test file. -date | gsutil cp - "${INPUTS}/recursive/deep/today.txt" +date | gcloud storage cp - "${INPUTS}/recursive/deep/today.txt" readonly TGT_1="${OUTPUTS}/testfile_1.txt" readonly TGT_2="${OUTPUTS}/testfile_2.txt" @@ -82,7 +82,7 @@ JOB_ID=$(run_dsub \ check_jobid "${JOB_ID}" for out in "${TGT_1}" "${TGT_2}" "${TGT_3}"; do - if ! gsutil ls "${out}" > /dev/null; then + if ! gcloud storage ls "${out}" > /dev/null; then echo "Missing output: ${out}" exit 1 fi diff --git a/test/integration/e2e_command_flag.sh b/test/integration/e2e_command_flag.sh index da66abc..bc03140 100755 --- a/test/integration/e2e_command_flag.sh +++ b/test/integration/e2e_command_flag.sh @@ -57,7 +57,7 @@ VAR5=VAL5 EOF ) -readonly RESULT="$(gsutil cat "${STDOUT_LOG}")" +readonly RESULT="$(gcloud storage cat "${STDOUT_LOG}")" if ! diff <(echo "${RESULT_EXPECTED}") <(echo "${RESULT}"); then echo "Output file does not match expected" exit 1 diff --git a/test/integration/e2e_env_tasks.sh b/test/integration/e2e_env_tasks.sh index 512a287..c1c559e 100755 --- a/test/integration/e2e_env_tasks.sh +++ b/test/integration/e2e_env_tasks.sh @@ -101,7 +101,7 @@ for ((TASK_ID=1; TASK_ID <= NUM_TASKS; TASK_ID++)); do sed -e 's#^ *##' )" - RESULT="$(gsutil cat "${LOGGING}.${TASK_ID}-stdout.log")" + RESULT="$(gcloud storage cat "${LOGGING}.${TASK_ID}-stdout.log")" if ! 
diff <(echo "${RESULT_EXPECTED}") <(echo "${RESULT}"); then echo "Output file does not match expected" exit 1 diff --git a/test/integration/e2e_image.sh b/test/integration/e2e_image.sh index 6f38c84..f085307 100755 --- a/test/integration/e2e_image.sh +++ b/test/integration/e2e_image.sh @@ -58,7 +58,7 @@ for image in ${IMAGE_ARRAY[@]}; do echo "Checking output..." # Check the results - RESULT="$(gsutil cat "${STDOUT_LOG}")" + RESULT="$(gcloud storage cat "${STDOUT_LOG}")" if ! diff <(echo "${RESULT_EXPECTED}") <(echo "${RESULT}"); then echo "Output file does not match expected" exit 1 diff --git a/test/integration/e2e_input_wildcards.sh b/test/integration/e2e_input_wildcards.sh index d7c8d04..4bf56e8 100755 --- a/test/integration/e2e_input_wildcards.sh +++ b/test/integration/e2e_input_wildcards.sh @@ -43,7 +43,7 @@ function exit_handler() { # Only cleanup on success if [[ "${code}" -eq 0 ]]; then rm -rf "${TEST_TMP}" - gsutil -mq rm "${INPUTS}/**" + gcloud storage rm --no-user-output-enabled "${INPUTS}/**" fi return "${code}" @@ -64,7 +64,7 @@ for INPUT_DIR in "${INPUT_BASIC}" "${INPUT_WITH_SPACE}"; do done done -gsutil -m rsync -r "${INPUT_ROOT}" "${INPUTS}/" +gcloud storage rsync --recursive "${INPUT_ROOT}" "${INPUTS}/" echo "Launching pipeline..." @@ -94,7 +94,7 @@ FILE_NAME=file.3.txt EOF ) -readonly RESULT="$(gsutil cat "${STDOUT_LOG}")" +readonly RESULT="$(gcloud storage cat "${STDOUT_LOG}")" if ! 
diff <(echo "${RESULT_EXPECTED}") <(echo "${RESULT}"); then echo "Output file does not match expected" exit 1 diff --git a/test/integration/e2e_io_auto.sh b/test/integration/e2e_io_auto.sh index 35b6f94..813b841 100755 --- a/test/integration/e2e_io_auto.sh +++ b/test/integration/e2e_io_auto.sh @@ -64,7 +64,7 @@ readonly EXPECTED_FS_OUTPUT_ENTRIES=( ) # Get the results- "env" and "find" output is bounded by "BEGIN" and "END" -readonly RESULT=$(gsutil cat "${STDOUT_LOG}") +readonly RESULT=$(gcloud storage cat "${STDOUT_LOG}") readonly ENV=$(echo "${RESULT}" | sed -n '/^BEGIN: env$/,/^END: env$/p') readonly FIND=$(echo "${RESULT}" | sed -n '/^BEGIN: find$/,/^END: find$/p') diff --git a/test/integration/e2e_io_gcs_tasks.sh b/test/integration/e2e_io_gcs_tasks.sh index 43948f6..a3a4515 100755 --- a/test/integration/e2e_io_gcs_tasks.sh +++ b/test/integration/e2e_io_gcs_tasks.sh @@ -44,11 +44,11 @@ io_tasks_setup::write_tasks_file # Copy the script to GCS to test loading the script remotely echo "Copying script to ${DSUB_PARAMS}" -gsutil cp "${SCRIPT_DIR}/script_io_test.sh" "${DSUB_PARAMS}/" +gcloud storage cp "${SCRIPT_DIR}/script_io_test.sh" "${DSUB_PARAMS}/" # Copy the TASKS_FILE to GCS to test loading the tasks file remotely echo "Copying tasks file to ${DSUB_PARAMS}" -gsutil cp "${TASKS_FILE}" "${DSUB_PARAMS}/" +gcloud storage cp "${TASKS_FILE}" "${DSUB_PARAMS}/" echo "Launching pipelines..." @@ -62,4 +62,4 @@ io_tasks_setup::check_output io_tasks_setup::check_dstat "${JOB_ID}" # Clean up what we uploaded after the test is done. -gsutil rm "${DSUB_PARAMS}"/** +gcloud storage rm "${DSUB_PARAMS}"/** diff --git a/test/integration/e2e_io_mount_dir.local.sh b/test/integration/e2e_io_mount_dir.local.sh index 746a2d0..31756b0 100755 --- a/test/integration/e2e_io_mount_dir.local.sh +++ b/test/integration/e2e_io_mount_dir.local.sh @@ -18,7 +18,7 @@ set -o errexit set -o nounset # This test verifies that mounting a local directory works. 
The test will copy
-# input files via gsutil to the local disk.
+# input files via gcloud storage to the local disk.
#
# The actual operation performed here is to download a BAM and compute
# the md5, writing it to .bam.md5.
diff --git a/test/integration/e2e_io_recursive.sh b/test/integration/e2e_io_recursive.sh
index 3bca94c..6ece7f9 100755
--- a/test/integration/e2e_io_recursive.sh
+++ b/test/integration/e2e_io_recursive.sh
@@ -64,7 +64,7 @@ echo "Setting up test inputs"
echo "Setting up pipeline input..."
build_recursive_files "${INPUT_DEEP}" "${INPUT_SHALLOW}"
-gsutil -m rsync -r "${LOCAL_INPUTS}" "${INPUTS}/"
+gcloud storage rsync --recursive "${LOCAL_INPUTS}" "${INPUTS}/"
echo "Launching pipeline..."
@@ -87,7 +87,7 @@ setup_expected_fs_output_entries "${DOCKER_GCS_OUTPUTS}"
setup_expected_remote_output_entries "${OUTPUTS}"
# Verify in the stdout file that the expected directories were written
-readonly RESULT=$(gsutil cat "${STDOUT_LOG}")
+readonly RESULT=$(gcloud storage cat "${STDOUT_LOG}")
readonly FS_FIND_IN=$(echo "${RESULT}" | sed -n '/^BEGIN: find$/,/^END: find$/p' \
| grep --fixed-strings /mnt/data/input/"${DOCKER_GCS_INPUTS}")
@@ -136,7 +136,7 @@ echo "On-disk output file list matches expected"
# Verify in GCS that the DEEP directory is deep and the SHALLOW directory
-# is shallow. Gsutil prints directories with a trailing "/:" marker that is
+# is shallow. gcloud storage prints directories with a trailing "/:" marker that is
# stripped using sed in order to match the output format of the `find` utility.
-readonly GCS_FIND="$(gsutil ls -r "${OUTPUTS}" \
+readonly GCS_FIND="$(gcloud storage ls -r "${OUTPUTS}" \
| grep -v '^ *$' \
| sed -e 's#/:$##')"
diff --git a/test/integration/e2e_logging_content.sh b/test/integration/e2e_logging_content.sh
index 1aabec0..4e51f9d 100755
--- a/test/integration/e2e_logging_content.sh
+++ b/test/integration/e2e_logging_content.sh
@@ -74,7 +74,7 @@ echo "Checking output..."
# Check the results readonly STDOUT_RESULT_EXPECTED="$(echo -n "${STDOUT_MSG%.}")" -readonly STDOUT_RESULT="$(gsutil cat "${STDOUT_LOG}")" +readonly STDOUT_RESULT="$(gcloud storage cat "${STDOUT_LOG}")" if ! diff <(echo "${STDOUT_RESULT_EXPECTED}") <(echo "${STDOUT_RESULT}"); then echo "STDOUT file does not match expected" exit 1 @@ -82,7 +82,7 @@ fi readonly STDERR_RESULT_EXPECTED="$(echo -n "${STDERR_MSG%.}")" -readonly STDERR_RESULT="$(gsutil cat "${STDERR_LOG}")" +readonly STDERR_RESULT="$(gcloud storage cat "${STDERR_LOG}")" if ! diff <(echo "${STDERR_RESULT_EXPECTED}") <(echo "${STDERR_RESULT}"); then echo "STDERR file does not match expected" exit 1 diff --git a/test/integration/e2e_runtime.sh b/test/integration/e2e_runtime.sh index fb276b9..4bd9e30 100755 --- a/test/integration/e2e_runtime.sh +++ b/test/integration/e2e_runtime.sh @@ -64,7 +64,7 @@ TMPDIR: EOF ) -readonly RESULT="$(gsutil cat "${STDOUT_LOG}")" +readonly RESULT="$(gcloud storage cat "${STDOUT_LOG}")" if ! diff <(echo "${RESULT_EXPECTED}") <(echo "${RESULT}"); then echo "Output file does not match expected" exit 1 diff --git a/test/integration/e2e_skip.sh b/test/integration/e2e_skip.sh index 20eb2c0..99d48ed 100755 --- a/test/integration/e2e_skip.sh +++ b/test/integration/e2e_skip.sh @@ -29,9 +29,9 @@ source "${SCRIPT_DIR}/test_setup_e2e.sh" TEST_FILE_PATH_1="${OUTPUTS}/testfile_1.txt" TEST_FILE_PATH_2="${OUTPUTS}/testfile_2.txt" -echo "hello world" | gsutil cp - "${TEST_FILE_PATH_1}" +echo "hello world" | gcloud storage cp - "${TEST_FILE_PATH_1}" -if gsutil ls "${TEST_FILE_PATH_2}" &> /dev/null; then +if gcloud storage ls "${TEST_FILE_PATH_2}" &> /dev/null; then echo "Unexpected: the output file '${TEST_FILE_PATH_1}' already exists." 
exit 1 fi @@ -44,7 +44,7 @@ JOB_ID="$( --skip \ --wait)" -RESULT="$(gsutil cat "${TEST_FILE_PATH_1}")" +RESULT="$(gcloud storage cat "${TEST_FILE_PATH_1}")" if [[ "${RESULT}" != "hello world" ]]; then echo "Output file does not match expected (from step 4)" echo "Expected: hello world" @@ -60,7 +60,7 @@ JOB_ID="$( --skip \ --wait)" -RESULT="$(gsutil cat "${TEST_FILE_PATH_2}")" +RESULT="$(gcloud storage cat "${TEST_FILE_PATH_2}")" if [[ "${RESULT}" != "hello from the job" ]]; then echo "Output file does not match expected (from step 2)" echo "Expected: hello world" diff --git a/test/integration/e2e_skip_tasks.sh b/test/integration/e2e_skip_tasks.sh index 4359c78..0ef9ecb 100755 --- a/test/integration/e2e_skip_tasks.sh +++ b/test/integration/e2e_skip_tasks.sh @@ -30,7 +30,7 @@ TEST_FILE_PATH_1="${OUTPUTS}/testfile_1.txt" TEST_FILE_PATH_2="${OUTPUTS}/testfile_2.txt" TEST_FILE_PATH_3="${OUTPUTS}/testfile_3.txt" -if gsutil ls "${TEST_FILE_PATH_1}" &> /dev/null; then +if gcloud storage ls "${TEST_FILE_PATH_1}" &> /dev/null; then echo "Unexpected: the output file '${TEST_FILE_PATH_1}' already exists." exit 1 fi @@ -51,12 +51,12 @@ JOB_ID="$( --skip \ --wait)" -if ! gsutil ls "${TEST_FILE_PATH_1}" &> /dev/null; then +if ! gcloud storage ls "${TEST_FILE_PATH_1}" &> /dev/null; then echo "Unexpected: the output file '${TEST_FILE_PATH_1}' was not created." 
   exit 1
 fi

-RESULT="$(gsutil cat "${TEST_FILE_PATH_1}")"
+RESULT="$(gcloud storage cat "${TEST_FILE_PATH_1}")"
 if [[ "${RESULT}" != "hello world" ]]; then
   echo "Output file does not match expected (from step 1)"
   echo "Expected: hello world"
@@ -72,7 +72,7 @@ JOB_ID="$(
   --skip \
   --wait)"

-RESULT="$(gsutil cat "${TEST_FILE_PATH_1}")"
+RESULT="$(gcloud storage cat "${TEST_FILE_PATH_1}")"
 if [[ "${RESULT}" != "hello world" ]]; then
   echo "Output file does not match expected (from step 2)"
   echo "Expected: hello world"
@@ -96,14 +96,14 @@ JOB_ID="$(
   --skip \
   --wait)"

-RESULT="$(gsutil cat "${TEST_FILE_PATH_1}")"
+RESULT="$(gcloud storage cat "${TEST_FILE_PATH_1}")"
 if [[ "${RESULT}" != "hello world" ]]; then
   echo "Output file does not match expected (from step 3)"
   echo "Expected: hello world"
   echo "Got: ${RESULT}"
   exit 1
 fi
-RESULT="$(gsutil cat "${TEST_FILE_PATH_2}")"
+RESULT="$(gcloud storage cat "${TEST_FILE_PATH_2}")"
 if [[ "${RESULT}" != "hello again from row 2" ]]; then
   echo "Output file does not match expected (from step 3)"
   echo "Expected: hello again from row 2"
diff --git a/test/integration/io_setup.sh b/test/integration/io_setup.sh
index 6cf7caf..db62459 100644
--- a/test/integration/io_setup.sh
+++ b/test/integration/io_setup.sh
@@ -54,10 +54,10 @@ readonly TEST_LOCAL_MOUNT_PARAMETER="file://${TEST_TMP_PATH}"
 function io_setup::mount_local_path_setup() {
   mkdir -p "${TEST_TMP_PATH}"
   if [[ ! -f "${TEST_TMP_PATH}/${POPULATION_FILE}" ]]; then
-    gsutil cp "${POPULATION_FILE_FULL_PATH}" "${TEST_TMP_PATH}/${POPULATION_FILE}"
+    gcloud storage cp "${POPULATION_FILE_FULL_PATH}" "${TEST_TMP_PATH}/${POPULATION_FILE}"
   fi
   if [[ ! -f "${TEST_TMP_PATH}/${INPUT_BAM_FILE}" ]]; then
-    gsutil cp "${INPUT_BAM_FULL_PATH}" "${TEST_TMP_PATH}/${INPUT_BAM_FILE}"
+    gcloud storage cp "${INPUT_BAM_FULL_PATH}" "${TEST_TMP_PATH}/${INPUT_BAM_FILE}"
   fi
 }
 readonly -f io_setup::mount_local_path_setup
@@ -190,7 +190,7 @@ function io_setup::_check_output() {
   local output_file="${1}"
   local result_expected="${2}"

-  local result=$(gsutil cat "${output_file}")
+  local result=$(gcloud storage cat "${output_file}")
   if ! diff <(echo "${result_expected}") <(echo "${result}"); then
     echo "Output file does not match expected"
     exit 1
diff --git a/test/integration/io_tasks_setup.sh b/test/integration/io_tasks_setup.sh
index 9b93627..f78d4ec 100644
--- a/test/integration/io_tasks_setup.sh
+++ b/test/integration/io_tasks_setup.sh
@@ -70,7 +70,7 @@ function io_tasks_setup::check_output() {
     output_path="$(grep "${input_bam}" "${TASKS_FILE}" | cut -d $'\t' -f 3)"
     output_file="${output_path%/*.md5}/$(basename "${input_bam}").md5"

-    result="$(gsutil cat "${output_file}")"
+    result="$(gcloud storage cat "${output_file}")"

     if ! diff <(echo "${expected}") <(echo "${result}"); then
       echo "Output file does not match expected"
@@ -88,7 +88,7 @@ function io_tasks_setup::check_output() {
   expected="${POPULATION_MD5}"
   for ((i=0; i < tasks_count; i++)); do
     output_file="${OUTPUTS}/TASK_$((i+1)).md5"
-    result="$(gsutil cat "${output_file}")"
+    result="$(gcloud storage cat "${output_file}")"

     if ! diff <(echo "${expected}") <(echo "${result}"); then
       echo "Output file does not match expected"
diff --git a/test/integration/script_block_external_network.sh b/test/integration/script_block_external_network.sh
index a2b20e5..860b2fc 100755
--- a/test/integration/script_block_external_network.sh
+++ b/test/integration/script_block_external_network.sh
@@ -21,8 +21,8 @@ set -o nounset

 RC=0

-if gsutil -o 'Boto:num_retries=0' ls gs://genomics-public-data; then
-  1>&2 echo "\`gsutil ls\` should not have succeeded"
+if gcloud storage ls gs://genomics-public-data; then
+  1>&2 echo "\`gcloud storage ls\` should not have succeeded"
   RC=1
 fi
diff --git a/test/integration/test_setup_e2e.py b/test/integration/test_setup_e2e.py
index 898d878..9866e9d 100644
--- a/test/integration/test_setup_e2e.py
+++ b/test/integration/test_setup_e2e.py
@@ -102,7 +102,7 @@ def _environ():
 print("  Checking if bucket exists")
 if not test_util.gsutil_ls_check("gs://%s" % DSUB_BUCKET):
   print("Bucket does not exist: %s" % DSUB_BUCKET, file=sys.stderr)
-  print("Create the bucket with \"gsutil mb\".", file=sys.stderr)
+  print("Create the bucket with \"gcloud storage buckets create\".", file=sys.stderr)
   sys.exit(1)

 # Set standard LOGGING, INPUTS, and OUTPUTS values
@@ -133,7 +133,7 @@ def _environ():
 if test_util.gsutil_ls_check("%s/**" % TEST_GCS_ROOT):
   print("Test files exist: %s" % TEST_GCS_ROOT, file=sys.stderr)
   print("Remove contents:", file=sys.stderr)
-  print("  gsutil -m rm %s/**" % TEST_GCS_ROOT, file=sys.stderr)
+  print("  gcloud storage rm --recursive %s/**" % TEST_GCS_ROOT, file=sys.stderr)
   sys.exit(1)

 if TASKS_FILE:
diff --git a/test/integration/test_setup_e2e.sh b/test/integration/test_setup_e2e.sh
index c5e7cc1..334b1c2 100755
--- a/test/integration/test_setup_e2e.sh
+++ b/test/integration/test_setup_e2e.sh
@@ -64,9 +64,9 @@ fi
 echo "  Bucket detected as: ${DSUB_BUCKET}"
 echo "  Checking if bucket exists"
-if ! gsutil ls "gs://${DSUB_BUCKET}" 2>/dev/null; then
+if ! gcloud storage ls "gs://${DSUB_BUCKET}" 2>/dev/null; then
   1>&2 echo "Bucket does not exist (or we have no access): ${DSUB_BUCKET}"
-  1>&2 echo "Create the bucket with \"gsutil mb\"."
+  1>&2 echo "Create the bucket with \"gcloud storage buckets create\"."
   1>&2 echo "Current gcloud settings:"
   1>&2 echo "  account: $(gcloud config get-value account 2>/dev/null)"
   1>&2 echo "  project: $(gcloud config get-value project 2>/dev/null)"
@@ -155,10 +155,10 @@ echo "Output path: ${OUTPUTS}"
 readonly DSUB_PARAMS="${TEST_GCS_ROOT}/params"

 echo "  Checking if remote test files already exists"
-if gsutil ls "${TEST_GCS_ROOT}/**" 2>/dev/null; then
+if gcloud storage ls "${TEST_GCS_ROOT}/**" 2>/dev/null; then
   1>&2 echo "Test files exist: ${TEST_GCS_ROOT}"
   1>&2 echo "Remove contents:"
-  1>&2 echo "  gsutil -m rm ${TEST_GCS_ROOT}/**"
+  1>&2 echo "  gcloud storage rm --recursive ${TEST_GCS_ROOT}/**"
   exit 1
 fi
diff --git a/test/integration/test_util.py b/test/integration/test_util.py
index 8100261..b0d32ff 100644
--- a/test/integration/test_util.py
+++ b/test/integration/test_util.py
@@ -30,12 +30,12 @@ def to_string(stdoutbytes):


 def gsutil_ls_check(path):
-  return not subprocess.call('gsutil ls "%s" 2>/dev/null' % path, shell=True)
+  return not subprocess.call('gcloud storage ls "%s" 2>/dev/null' % path, shell=True)


 def gsutil_cat(path):
   return to_string(
-      subprocess.check_output('gsutil cat "%s"' % path, shell=True))
+      subprocess.check_output('gcloud storage cat "%s"' % path, shell=True))


 def diff(str1, str2):
diff --git a/test/integration/unit_skip.test-fails.sh b/test/integration/unit_skip.test-fails.sh
index e80a500..5fd629d 100755
--- a/test/integration/unit_skip.test-fails.sh
+++ b/test/integration/unit_skip.test-fails.sh
@@ -46,8 +46,8 @@ readonly NEWFILE="${OUTPUTS}/newfile"
 readonly OUT_FOLDER_2="${OUTPUTS}/newfolder"

 # Create pre-existing output
-echo "test output" | gsutil cp - "${EXISTING}"
-echo "test output" | gsutil cp - "${EXISTING_2}"
+echo "test output" | gcloud storage cp - "${EXISTING}"
+echo "test output" | gcloud storage cp - "${EXISTING_2}"

 echo "Job 1 ..."