Skip to content

Conversation

@jgainerdewar
Copy link
Contributor

@jgainerdewar jgainerdewar commented Apr 9, 2025

Rebuild all terra-jupyter-* images on gcr.io/deeplearning-platform-release/tf2-cu123.2-17.py310. This upgrades us from ubuntu 20 to ubuntu 22.

Leo changes in DataBiosphere/leonardo#4843

# of jupyter and they're google-managed packages we don't have access to older versions of.
# Retain two plugins: google.cloud.bigquery and sql
RUN sed -i "s/^\( *'beatrix_jupyterlab',\)/#\1/g" /etc/ipython/ipython_kernel_config.py
RUN sed -i "s/^\( *'bq_stats',\)/#\1/g" /etc/ipython/ipython_kernel_config.py
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This worries me a little because I'm not sure whether anyone is using these extensions. They are google-managed Python packages and we can't install versions that are compatible with our older notebook version, so I'm turning them off. Same deal in terra-jupyter-python.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am comfortable making that decision. Once we update to a newer version of notebooks we can add them back if we think we need it.


RUN chown -R $USER:users $JUPYTER_HOME \
# Disable nb_conda for now. Consider re-enable in the future
&& jupyter nbextension disable nb_conda --py --sys-prefix \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line no longer works because the extension can't be found. No need to disable if it isn't there!

"outputs": [],
"source": [
"output = !pip3 show pendulum\n",
"output = !pip3 show emoji\n",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test checks whether we can install a new package - in the new image, pendulum is already installed, so it makes the test fail. Switch to a package that is not already installed.

"output = !pip3 show emoji\n",
"print(output) # Should show not yet installed.\n",
"assert(0 == output.count('Name: pendulum'))"
"assert(0 == len([i for i in output if 'Name: emoji' in i]))"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed to adjust this because the output strings now contain some color-coding junk characters. We need to check whether the strings contain the expected message rather than being equal to it.

@jgainerdewar jgainerdewar marked this pull request as ready for review April 24, 2025 19:36
@lucymcnatt lucymcnatt self-requested a review April 24, 2025 19:43
Comment on lines +45 to +49
# Turn off beatrix-jupyterlab and bq_stats plugins, they don't activate cleanly in our version
# of jupyter and they're google-managed packages we don't have access to older versions of.
# Retain two plugins: google.cloud.bigquery and sql
RUN sed -i "s/^\( *'beatrix_jupyterlab',\)/#\1/g" /etc/ipython/ipython_kernel_config.py
RUN sed -i "s/^\( *'bq_stats',\)/#\1/g" /etc/ipython/ipython_kernel_config.py

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A reasonable compromise, that speaks to the need to delegate package selection to users.

Copy link
Collaborator

@LizBaldo LizBaldo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing, I can feel the pain in every git diff 🙈. I am approving to unblock but I would wait until all images are cached and can be launched on your BEE in leo before merging and triggering the GHA that will generate the version file in the documentation bucket

It is not advised to run build.sh locally, as this will push to the remote docker repo and delete the image locally upon completion.
Apple Silicon chips later then M1 (ex. M3) need:

`FROM --platform=linux/amd64/v2 <image name>`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡

"label": "Legacy GATK (GATK 4.2.4.0, Python 3.7.12, R 4.3.0)",
"version": "2.2.14",
"updated": "2025-04-24",
"image": "us.gcr.io/broad-dsp-gcr-public/terra-jupyter-gatk:2.3.9",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean 2.3.8 for GATK legacy instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, yes I do, good catch!

# of jupyter and they're google-managed packages we don't have access to older versions of.
# Retain two plugins: google.cloud.bigquery and sql
RUN sed -i "s/^\( *'beatrix_jupyterlab',\)/#\1/g" /etc/ipython/ipython_kernel_config.py
RUN sed -i "s/^\( *'bq_stats',\)/#\1/g" /etc/ipython/ipython_kernel_config.py
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am comfortable making that decision. Once we update to a newer version of notebooks we can add them back if we think we need it.

scikit-learn-intelex
xgboost
#older version of nbconvert fail to convert notebooks to html
nbconvert>=7.7.3
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that this did not need to change in order to fix the notebook download issue on Terra 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm that is odd, I poked around a bit and there's an nbconvert_support section of jupyter_contrib_nbextensions, maybe the old version of that was incompatible for some reason

Copy link
Contributor Author

@jgainerdewar jgainerdewar Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is saying we want anything equal to or later than 7.7.3, so I assume there was a new package version published since we built the last images, which fixed the issue.

"import numpy as np\n",
"import pandas as pd\n",
"from pandas_profiling import ProfileReport\n",
"from ydata_profiling import ProfileReport\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did the name of this package change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

@jgainerdewar jgainerdewar merged commit 7d4b885 into master Apr 30, 2025
9 checks passed
@jgainerdewar jgainerdewar deleted the jd_AN-497_ubuntu22 branch April 30, 2025 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants