[ci/docker] Move CI + Docker to Python 3.8 [build_base]#35040
[ci/docker] Move CI + Docker to Python 3.8 [build_base]#35040krfricke merged 94 commits intoray-project:masterfrom
Conversation
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
|
The link error should be fixed by merging in recent changes from the main branch |
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
|
Failing tests are also failing on master |
aslonnie
left a comment
There was a problem hiding this comment.
I do not think I am the owner of any of these changes, and you probably do not need my approval. I am generally confused about why many of the changes are required for python version change. I feel that they at least need more explanations to justify.
for example, if we need to change some of the example code when upgrade from python 3.7 to 3.8. should we worry that the examples do not work on python 3.9, 3.10, which we claim to support and have container images that are released already?
I think most of the changes are not really due to python interpreter upgrade, but mostly due to library version changes of ray's dependencies. because the dependency version resolving for each python version will be different for each library (due to the python_version constraints in requirements.txt files), how do we know if ray works on python 3.9 / 3.10 ?
| ENV PATH "/home/ray/anaconda3/bin:$PATH" | ||
| ARG DEBIAN_FRONTEND=noninteractive | ||
| ARG PYTHON_VERSION=3.7.16 | ||
| ARG PYTHON_VERSION=3.8.16 |
There was a problem hiding this comment.
I think they have 3.8.17 already
| frame = pd.DataFrame() | ||
| for result in results: | ||
| frame = frame.append(result.metrics_dataframe, ignore_index=True) | ||
| frame = pd.concat([frame, result.metrics_dataframe], ignore_index=True) |
There was a problem hiding this comment.
what change caused this change?
There was a problem hiding this comment.
df.append has been deprecated and removed in e.g. pandas 1.5.x, which will be installed in some combinations of our requirements (1.3.5 is the last py3.7 release and still supported the deprecated API)
| modin==0.12.1; python_version < '3.8' | ||
| modin==0.12.1; python_version >= '3.8' |
There was a problem hiding this comment.
these are the same versions? why need the version split?
There was a problem hiding this comment.
See https://github.com/ray-project/ray/pull/35040/files#r1220435414 - we will remove the version pin in a follow-up (which will also upgrade pandas). This is to make sure that the py 3.7 version pin remains in that follow-up.
| # Requires decord which is unavailable for arm64 | ||
| gluoncv==0.10.5.post0; platform_machine != "arm64" | ||
| gpy==1.10.0 | ||
| gpytorch==1.8.1 |
There was a problem hiding this comment.
IIRC newer versions (which are available for py 3.8) would lead some tests to fail
bveeramani
left a comment
There was a problem hiding this comment.
LGTM! If you could add a comment about the Modin dependency issue, that'd be awesome.
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
Signed-off-by: Kai Fricke <kai@anyscale.com>
amogkam
left a comment
There was a problem hiding this comment.
Can we also update the corresponding documentation (installing ray guide) and the dockerhub info that the default python version is now 3.8 and not 3.7
…35040) Implementing ray-project#34863, this PR updates our CI and Docker images to use Python 3.8 as the default Python version. It also updates CUDA version numbers and removes stale CUDA images. The compatibility test becomes and explicit Python 3.7 compatibility test. Currently the tests we are running are (with some filters): - python/ray/tests/horovod - python/ray/tests/lightgbm - python/ray/tests/xgboost - python/ray/tests/ray_lightning - python/ray/tests/ml_py37_compat (keras + torch) - python/ray/air - python/ray/train (AIR tests) - python/ray/data (AIR tests) Signed-off-by: Kai Fricke <kai@anyscale.com> Signed-off-by: e428265 <arvind.chandramouli@lmco.com>
Why are these changes needed?
Implementing #34863, this PR updates our CI and Docker images to use Python 3.8 as the default Python version.
It also updates CUDA version numbers and removes stale CUDA images.
The compatibility test becomes and explicit Python 3.7 compatibility test. Currently the tests we are running are (with some filters):
We may want to add more tests here from other libraries, e.g. rllib (cc @sven1977 @kouroshHakha) and Serve (cc @iycheng @edoakes)
Technical details
Due to a bug in libstdc++ (e.g. tracked here: #33858), the Ray processes sometimes error out. It looks like this occurred rarely in Python 3.7, but after the switch to Python 3.8 we see it more often in some of our tests.
Note that this is an ephemeral failure and not due to specific tests. For instance, we saw
test_trial_runner_3.pyfail regularly, but it was always different tests. It seemed that just by the sheer amount of tests, and the quick succession of starting/stopping Ray clusters, one of these tests was bound to fail.For those tests, we introduced an intermediate solution using pytest reruns, namely the
--reruns 3option. This specifies reruns on the subtest level, as opposed to bazels test-suite level. The reruns option was able to make these flaky tests green. However, this undermines our flaky test tracking, as actually flaky tests will be detected less often. After moving to Ubuntu 22 and a new stdlibc++, we should remove these subtest-reruns.Related issue number
Checks
git commit -s) in this PR.scripts/format.shto lint the changes in this PR.method in Tune, I've added it in
doc/source/tune/api/under thecorresponding
.rstfile.