[SPARK-43348][PYTHON] Support Python 3.8 in PyPy3
#41024
Thank you, @HyukjinKwon.
Thank you, @Yikun.
        from pickle import _Pickler as Pickler  # noqa: F401
    else:
        import pickle  # noqa: F401
        from _pickle import Pickler  # noqa: F401
Starting with PyPy for Python 3.8, `_pickle` is removed.
    - import pickle  # noqa: F401
    - from _pickle import Pickler  # noqa: F401
    + import pickle  # noqa: F401
    + from pickle import Pickler  # noqa: F401
This is the same as the upstream change.
Could you review once more, @HyukjinKwon and @Yikun?
Thank you, @HyukjinKwon!
Merged to master. Thank you all!
…st only with PyPy 3.8
### What changes were proposed in this pull request?
This PR is a followup of #41024 that skips the test only with PyPy 3.8.
### Why are the changes needed?
To narrow the scope of the skipped test.
### Does this PR introduce _any_ user-facing change?
No, test-only.
### How was this patch tested?
CI in this PR should verify the change.
Closes #41085 from HyukjinKwon/SPARK-43354-followup.
Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
This PR has two goals.
1. Make PySpark support Python 3.8+ with PyPy3
2. Upgrade PyPy3 to Python 3.8 in our GitHub Action Infra Image to enable test coverage
Note that there was one failure in the `test_create_dataframe_from_pandas_with_day_time_interval` test case. This PR skips that test case, and SPARK-43354 will re-enable it after further investigation.
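As an illustration of how a skip like this can be limited to PyPy 3.8 (the goal stated in the follow-up commit), here is a minimal sketch using the standard `unittest` and `platform` modules. The names `IS_PYPY38`, `ExampleTests`, and the placeholder test body are hypothetical and are not taken from the PySpark test suite.

```python
import platform
import sys
import unittest

# Hypothetical flag: True only when running on PyPy with Python 3.8.
IS_PYPY38 = (
    platform.python_implementation() == "PyPy"
    and sys.version_info[:2] == (3, 8)
)


class ExampleTests(unittest.TestCase):
    # The test is skipped on PyPy 3.8 and runs on every other interpreter.
    @unittest.skipIf(IS_PYPY38, "Fails on PyPy 3.8; see SPARK-43354")
    def test_day_time_interval_placeholder(self):
        # Placeholder assertion standing in for the real test body.
        self.assertEqual(1 + 1, 2)
```

Keying the condition on both the implementation and the version keeps the test running on CPython and on other PyPy versions.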
### Why are the changes needed?
Previously, PySpark failed in the PyPy3 `Python 3.8` environment.
```
pypy3 version is: Python 3.8.16 (a9dbdca6fc3286b0addd2240f11d97d8e8de187a, Dec 29 2022, 11:45:13)
[PyPy 7.3.11 with GCC 10.2.1 20210130 (Red Hat 10.2.1-11)]
Starting test(pypy3): pyspark.sql.tests.pandas.test_pandas_cogrouped_map (temp output: /__w/spark/spark/python/target/f1cacde7-d369-48cf-a8ea-724c42872020/pypy3__pyspark.sql.tests.pandas.test_pandas_cogrouped_map__rxih6dqu.log)
Traceback (most recent call last):
File "/usr/local/pypy/pypy3.8/lib/pypy3.8/runpy.py", line 188, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/usr/local/pypy/pypy3.8/lib/pypy3.8/runpy.py", line 111, in _get_module_details
__import__(pkg_name)
File "/__w/spark/spark/python/pyspark/__init__.py", line 59, in <module>
from pyspark.rdd import RDD, RDDBarrier
File "/__w/spark/spark/python/pyspark/rdd.py", line 54, in <module>
from pyspark.java_gateway import local_connect_and_auth
File "/__w/spark/spark/python/pyspark/java_gateway.py", line 32, in <module>
from pyspark.serializers import read_int, write_with_length, UTF8Deserializer
File "/__w/spark/spark/python/pyspark/serializers.py", line 69, in <module>
from pyspark import cloudpickle
File "/__w/spark/spark/python/pyspark/cloudpickle/__init__.py", line 1, in <module>
from pyspark.cloudpickle.cloudpickle import * # noqa
File "/__w/spark/spark/python/pyspark/cloudpickle/cloudpickle.py", line 56, in <module>
from .compat import pickle
File "/__w/spark/spark/python/pyspark/cloudpickle/compat.py", line 13, in <module>
from _pickle import Pickler # noqa: F401
ModuleNotFoundError: No module named '_pickle'
```
These changes are needed to support Python 3.8 in PyPy3.
- From PyPy3.8, `_pickle` is removed.
  - cloudpipe/cloudpickle#458
- We need the following cloudpickle change.
  - cloudpipe/cloudpickle#469
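
For readers unfamiliar with the failure mode, the sketch below shows one defensive way to obtain a `Pickler` class when `_pickle` may be absent. It is an illustration only: the helper name `load_pickler` is hypothetical, and this is not the change made in this PR, which imports `Pickler` from `pickle` directly.

```python
import importlib
import pickle


def load_pickler():
    """Return a Pickler class, preferring `_pickle` when it is importable."""
    try:
        # CPython provides the C-accelerated `_pickle` module.
        accel = importlib.import_module("_pickle")
        return accel.Pickler
    except ImportError:
        # ModuleNotFoundError (as in the traceback above) is a subclass of
        # ImportError, so a missing `_pickle` falls back to the public class.
        return pickle.Pickler


Pickler = load_pickler()
```

Importing from the public `pickle` module, as this PR does, avoids the branch entirely because `pickle.Pickler` is available on both CPython and PyPy.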
### Does this PR introduce _any_ user-facing change?
This adds support for Python 3.8 in PyPy3.
### How was this patch tested?
Pass the CIs.
Closes apache#41024 from dongjoon-hyun/SPARK-43348.
Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>