[AN-503] Update to Dataproc 2.2.X #4839
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## develop #4839 +/- ##
===========================================
- Coverage 74.67% 74.67% -0.01%
===========================================
Files 166 166
Lines 14623 14622 -1
Branches 1156 1143 -13
===========================================
- Hits 10920 10919 -1
Misses 3703 3703
…e VM and investigate
…phere/leonardo into AN-503-update-to-dataproc-2.2.x
jenkins/dataproc-custom-images/prepare-custom-leonardo-jupyter-dataproc-image.sh
STEP_TIMINGS=($(date +%s))

## Installs Google Cloud Ops Agent that is now required for Datapoc 2.2.X ###
This is the main change, alongside the Docker Compose update and that sneaky change in external IP assignment behavior.
It is annoying that the new log agent does not come pre-built into the Dataproc image itself, but the install and setup were not too bad in the end.
http/src/main/scala/org/broadinstitute/dsde/workbench/leonardo/util/DataprocInterpreter.scala
aednichols left a comment
I appreciate the detailed comments.
http/src/main/scala/org/broadinstitute/dsde/workbench/leonardo/util/DataprocInterpreter.scala
jenkins/dataproc-custom-images/prepare-custom-leonardo-jupyter-dataproc-image.sh
@Qi77Qi I modified the PR so that Leonardo supports both deploying the AOU 2.2.13 image on Dataproc 2.1.x (what you currently have in production) and deploying the AOU 2.2.16 image on Dataproc 2.2.x. I will do some testing on my BEE, but this should let us release the new Hail/Dataproc version on Terra without impacting RWB (you can switch your pre-prod / prod environments on your own timeline).
Jira ticket: https://broadworkbench.atlassian.net/browse/AN-503
Summary of changes
What
Why
Testing these changes
I pointed my BEE to this PR and was able to successfully launch a Hail image and an AOU image with both a Spark single node and a Spark cluster with 2 nodes.
When opening a Jupyter notebook, I can import and initialize the new version of Hail:
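For anyone repeating this check, here is a minimal sketch of what such a notebook cell might look like. It assumes the standard `hail` Python package is installed on the image and uses its `hl.init()` and `hl.version()` calls; the helper name `hail_version_or_none` is hypothetical and is not part of Leonardo or this PR.

```python
import importlib


def hail_version_or_none(initialize=False):
    """Return the installed Hail version string, or None if Hail is not importable.

    Hypothetical helper for illustration only; not part of Leonardo or this PR.
    """
    try:
        # Import lazily so the cell still runs on images without Hail.
        hl = importlib.import_module("hail")
    except ImportError:
        return None
    if initialize:
        # hl.init() starts Hail on top of Spark; only request this on the cluster.
        hl.init()
    return hl.version()


# Without initialize=True this only reports the version and does not touch Spark.
print(hail_version_or_none())
```

In a notebook on the cluster, calling `hail_version_or_none(initialize=True)` would also start Hail; the version reported depends on the image that was launched.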
I was also able to launch the AOU image that is currently in prod using the legacy Dataproc 2.1 image. So we should be safe to merge this, as it won't impact RWB and they can move over to Dataproc 2.2 when they want:
jenkins retest or jenkins multi-test.