Skip to content

BLD: comply with PEP518#3398

Merged
IAlibay merged 10 commits intoMDAnalysis:developfrom
tylerjereddy:treddy_pyproject_pep518
Feb 12, 2022
Merged

BLD: comply with PEP518#3398
IAlibay merged 10 commits intoMDAnalysis:developfrom
tylerjereddy:treddy_pyproject_pep518

Conversation

@tylerjereddy
Copy link
Member

@tylerjereddy tylerjereddy commented Aug 21, 2021

  • add a pyproject.toml file to comply
    with PEP518, and to allow for wheel builds
    that leverage PEP517

  • in short, in a fresh/naive Python environment
    with no dependencies available, it is not currently
    possible to use i.e., pip install . to install
    MDAnalysis (or to use any of the other emerging tools
    like poetry and flit that are PEP518 dependent)

  • this will change almost nothing about how we
    operate builds in the near-term (still very setup.py
    focused), but it does begin to bring us into alignment
    with the future of the Python packaging ecosystem

  • this solves various "chicken and egg" problems like
    "how does pip know your project needs setuptools
    if it has to read setup.py, which contains elements
    that require setuptools to read them to understand
    what dependencies are needed"

  • I verified this works on an aarch64 ThunderX2 node
    running RHEL with python -m pip install . successfully
    building a wheel for MDAnalysis and installing it; many
    other pieces of information can be added to this file, though
    this build-system section is the most fundamental

  • note that this is an oversimplified draft--it should
    be safe to use, but we'll probably want to dial back
    and pin to an older NumPy instead of using the latest NumPy
    we can get that is greater than our minimum; I'd suggest
    I/we do that in a follow-up since it will require confirming
    which minimum versions of NumPy are needed on which platforms--similar
    to this kind of thing:
    https://github.com/scipy/scipy/blob/master/pyproject.toml#L23

  • I also think we should probably be safe to require a much
    newer minimum version of Cython in this file really, though
    I've replicated our setuptools specification for now

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

@IAlibay
Copy link
Member

IAlibay commented Aug 21, 2021

Hold for 2.1.0?

@tylerjereddy
Copy link
Member Author

Hold for 2.1.0?

Yeah, I'm not trying to squeeze this in, just floating the idea with the team for now. It should basically "do nothing" as far as our current processes are concerned, but I wouldn't add entropy to the release by including it just to be safe.

@tylerjereddy
Copy link
Member Author

I should also mention that pip will basically still run the setup.py file and install all the other stuff specified in there like SciPy and networkx, after it does the initial install of the stuff in the .toml file (I've intentionally not touched setup.py since changes there require deeper thought).

@codecov
Copy link

codecov bot commented Aug 21, 2021

Codecov Report

Merging #3398 (c34cb56) into develop (e4ed4ff) will increase coverage by 0.04%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #3398      +/-   ##
===========================================
+ Coverage    94.06%   94.10%   +0.04%     
===========================================
  Files          176      189      +13     
  Lines        23294    24596    +1302     
  Branches      3305     3305              
===========================================
+ Hits         21911    23146    +1235     
- Misses        1337     1404      +67     
  Partials        46       46              
Impacted Files Coverage Δ
MDAnalysisTests/formats/__init__.py 100.00% <0.00%> (ø)
MDAnalysisTests/visualization/__init__.py 100.00% <0.00%> (ø)
MDAnalysisTests/transformations/__init__.py 100.00% <0.00%> (ø)
MDAnalysisTests/analysis/__init__.py 100.00% <0.00%> (ø)
MDAnalysisTests/auxiliary/__init__.py 100.00% <0.00%> (ø)
MDAnalysisTests/datafiles.py 100.00% <0.00%> (ø)
MDAnalysisTests/topology/__init__.py 100.00% <0.00%> (ø)
MDAnalysisTests/topology/base.py 97.82% <0.00%> (ø)
MDAnalysisTests/lib/__init__.py 100.00% <0.00%> (ø)
MDAnalysisTests/core/__init__.py 100.00% <0.00%> (ø)
... and 3 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e4ed4ff...c34cb56. Read the comment docs.

@orbeckst
Copy link
Member

Should there be tests showing that we can build packages from the current checkout?

Such tests would also serve as an example for how to use the new toml file, especially with some of the newer packaging tools that you mention.

@tylerjereddy
Copy link
Member Author

We could build a wheel in CI, install it, then do the runtime testing with the oldest version of NumPy we support to check the ABI compatibility perhaps. I believe this would currently fail that by using a "too new" version of NumPy to build the wheel proper, so would need to address that I think. May be interesting to confirm that CI catches that.

* add a a `pyproject.toml` file to comply
with PEP518, and to allow for wheel builds
that leverage PEP517

* in short, in a fresh/naive Python environment
with no dependencies available, it is not currently
possible to use i.e., `pip install .` to install
MDAnalysis (or to use any of the other emerging tools
like poetry and flit that are PEP518 dependent)

* this will change almost nothing about how we
operate builds in the near-term (still very `setup.py`
focused), but it does begin to bring us into alignment
with the future of the Python packaging ecosystem

* this solves various "chicken and egg" problems like
"how does pip know your project needs setuptools
if it has to read setup.py, which contains elements
that require setuptools to read them to understand
what dependencies are needed"

* I verified this works on an aarch64 ThunderX2 node
running RHEL with `python -m pip install .` successfully
building a wheel for MDAnalysis and installing it; many
other pieces of information can be added to this file, though
this `build-system` section is the most fundamental

* note that this is an oversimplified draft--it should
be safe to use, but we'll probably want to dial back
and pin to an older NumPy instead of using the latest NumPy
we can get that is greater than our minimum; I'd suggest
I/we do that in a follow-up since it will require confirming
which minimum versions of NumPy are needed on which platforms--similar
to this kind of thing:
https://github.com/scipy/scipy/blob/master/pyproject.toml#L23

* I also think we should probably be safe to require a much
newever minimum version of Cython in this file really, though
I've replicated our setuptools specification for now
@tylerjereddy tylerjereddy force-pushed the treddy_pyproject_pep518 branch from 5dbcd18 to acaa8f5 Compare September 1, 2021 02:37
@tylerjereddy
Copy link
Member Author

Ok, I've pushed in a sample CI test that should fail because the wheel build will currently use the latest NumPy available and result in an ABI compatibility issue against an older NumPy (this is commonly checked in wheel building workflows).

Reviewers should be able to see something like E ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject since newer NumPy versions can add struct members that older runtime NumPy versions do not know about.

The next step I propose is to adjust the pyproject.toml such that it allows this new CI test to pass (but note that this would just scratch the surface, since there are other Python versions/platforms to deal with, so this is just to get the file rolling).

@tylerjereddy
Copy link
Member Author

tylerjereddy commented Sep 1, 2021

Indeed, that exact failure is already visible here: https://dev.azure.com/mdanalysis/cefe1135-cd6a-4689-8710-15bd37ad81e6/_apis/build/builds/2589/logs/39

It may disappear b/c I pushed in a fix already--see original failure from my fork: https://dev.azure.com/tylerjereddy/95673c6f-2c02-4a76-94e1-d58ebf2b2056/_apis/build/builds/1201/logs/46

@tylerjereddy tylerjereddy force-pushed the treddy_pyproject_pep518 branch 2 times, most recently from 5eee32f to e8a9e74 Compare September 1, 2021 03:03
@tylerjereddy
Copy link
Member Author

ok, I pushed in an attempted fix to get the CI wheel test to pass--may require some iteration to get it just right

@tylerjereddy tylerjereddy force-pushed the treddy_pyproject_pep518 branch 2 times, most recently from 943f4b3 to 43ef007 Compare September 1, 2021 03:47
* adjust `pyproject.toml` to use the lowest viable
version of NumPy with Python `3.8` on "regular"
platforms to enhance ABI compatibility of our wheels
@tylerjereddy tylerjereddy force-pushed the treddy_pyproject_pep518 branch from 43ef007 to 78a73d8 Compare September 1, 2021 03:53
@tylerjereddy
Copy link
Member Author

Looks like we currently require numpy>=1.18.0, even though >=1.16.0 is also mentioned in our setup.py error handling--that may need an update.

@IAlibay
Copy link
Member

IAlibay commented Sep 1, 2021

Looks like we currently require numpy>=1.18.0, even though >=1.16.0 is also mentioned in our setup.py error handling--that may need an update.

Oops my bad - I'll do a quick fix for that, is it just line 192?

* adjust Azure CI to also test wheel builds
for Python 3.7, 3.9

* try bumping up the minimum Cython version
used in `pyproject.toml` because of this
observed error with MDAnalysis wheel build
test suite:
```
AttributeError: type object 'gsd.fl.GSDFile' has no attribute
'__reduce_cython__'
```

* add `pyproject.toml` placeholder entries for Python 3.7
and 3.9 as well; let's see how CI does with these..
@tylerjereddy tylerjereddy force-pushed the treddy_pyproject_pep518 branch from 7e81255 to da51541 Compare September 2, 2021 01:49
@tylerjereddy
Copy link
Member Author

Dealing with AttributeError: type object 'gsd.fl.GSDFile' has no attribute '__reduce_cython__' for the wheel build test suite failures looks "fun"; IIRC we've had various issues with gsd support on Windows from time to time.

A quick search online suggests that bumping the version of Cython used to build can help, but those are pre-packaged wheels of gsd used at run/test-time I think. I wonder if we're getting two versions of gsd involved because of the isolated wheel build, and then later the runtime deps installs.

@IAlibay
Copy link
Member

IAlibay commented Sep 2, 2021

Dealing with AttributeError: type object 'gsd.fl.GSDFile' has no attribute '__reduce_cython__' for the wheel build test suite failures looks "fun"; IIRC we've had various issues with gsd support on Windows from time to time.

A quick search online suggests that bumping the version of Cython used to build can help, but those are pre-packaged wheels of gsd used at run/test-time I think. I wonder if we're getting two versions of gsd involved because of the isolated wheel build, and then later the runtime deps installs.

I've not had time to look at the error specifically - would it make sense to propose raising the global minimum version of gsd to 1.9.3? (that way we don't have to differentiate between windows and *nix/macOS?). 1.9.3 is now ~ 2 years old, so it's not completely NEP29-like, but it might not be too much of a problem?

The other (and maybe better) option is to raise the version, but also make GSD optional (instead of required by setup)? This would pave the way towards a lower dependency mdanalysis-base package?

@tylerjereddy
Copy link
Member Author

@tylerjereddy I have unofficial access to an M1 Mac now

@lilyminium Cool. It also occurs to me, since this has been stale for so long, that we could potentially save a bunch of work by using https://github.com/scipy/oldest-supported-numpy for a more concise pyproject.toml, assuming MDAnalysis doesn't deviate too much from that scheme in i.e., our setuptools version restrictions.

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the massive delay on this @tylerjereddy - just a couple of questions.

"Cython>=0.28",
# lowest NumPy we can use for a given Python,
# except for more exotic platforms (linux/Mac ARM flavors)
"numpy==1.18; python_version=='3.7' and (platform_machine!='arm64' or platform_system!='Darwin') and platform_machine!='aarch64'",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm missing something about how this works, is there a reason why this isn't just "numpy>=1.18"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I'm not sure I fully understand your comments about going to a lower numpy version. Given that we're meant to be sticking to NEP29 going forward, probably all we'll want to do is bump things up as necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You always want to build with the lowest possible NumPy version because there is forward but not backward ABI compatibility at runtime.

That's also why we should probably just use: https://github.com/scipy/oldest-supported-numpy (that leaves the decisions about minimum viable NumPy up to the experts) See also, i.e., pandas example:
https://github.com/pandas-dev/pandas/blob/main/pyproject.toml#L8

What's the timeline for getting this stuff ready?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally I'd like to cut a release soon-ish (week or two max). If it doesn't make it then it can always go in 2.2.0. (we're going to try to trim releases so they are closer to every 3 months).

I'm not sure I understand how this works though. If someone calls python setup.py install without any dependencies then will it pull the lowest numpy version and stick to that only?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setup.py install is effectively deprecated (though should still work for quite some time), and shouldn't be affected by pyproject.toml

For building MDAnalysis with a modern approach (i.e., pip install .)--pip will create an isolated build environment with the stuff we specify in this file, and build MDAnalysis in a manner that should be maximally compatible with whatever runtime environment the user has--whether that be new or old NumPy. An obvious application is building a binary wheel that won't barf with older versions of NumPy, within reason.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry took me so much time to get back to this.

pip will create an isolated build environment

🤯 Thanks for explaining this @tylerjereddy, I had no idea that was a thing.

In terms of contents of this PR I think it all makes sense now.


jobs:
- job: Windows
- job: Azure_Tests
Copy link
Member

@IAlibay IAlibay Jan 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a feeling I'd mess it up if I try to help deal with conflicts here - but I'm happy to give it a go if you're short on time for this @tylerjereddy

@IAlibay IAlibay mentioned this pull request Jan 31, 2022
5 tasks
@IAlibay
Copy link
Member

IAlibay commented Feb 12, 2022

🤔 the Cython failures here are non-sensical to me, especially for RTD - Cython is definitely installed by mamba, so the conflict is just weird unless it doesn't like what's in setup.cfg for some reason.

note: I attempted to fix the conflicts on the azure pipelines file but that shouldn't have led to our current issues.

* use oldest-supported-numpy instead of duplicating
the upstream logic in `pyproject.toml`

* `packaging` is now a dependency of `setup.py`, so we
need to add this in `pyproject.toml`
@tylerjereddy
Copy link
Member Author

tylerjereddy commented Feb 12, 2022

@IAlibay

the Cython failures here are non-sensical to me

That's because there's a try/except block in setup.py that requires packaging, and when the packaging import fails during the PEP-compliant pip install, the except block prints a misleading message about Cython.

I've pushed in a revision to change two things:

  • depend on oldest-supported-numpy PyPI package in pyproject.toml, to prevent us from having to duplicate the upstream logic for picking the minimum sufficient NumPy version in a given scenario (OS/architecture/Python implementation, etc.)
  • add packaging as a hard build-time dependency in pyproject.toml, to avoid the Cython error noted above

Other thoughts:

  • perhaps out of scope here, but I'd gently suggest moving packaging import outside of the try/except block that checks for Cython--that will prevent the deceptive catch on ImportError for packaging
  • some projects opt to just vendor the implementation of PEP440 instead of having a hard build-time requirement on packaging (SciPy does this, for example)--for such simple machinery the tradeoff to avoid an extra external dependency can be worth it, but that's probably a separate discussion to have with team..

@tylerjereddy
Copy link
Member Author

argh, our NumPy version requirements are too complex for oldest-supported-numpy.. I'll have to restore the manual stuff, but that means we'll have to carefully iterate/monitor that..

@tylerjereddy tylerjereddy force-pushed the treddy_pyproject_pep518 branch from 1ee4974 to 9a078ec Compare February 12, 2022 21:45
* restore custom NumPy version handling logic to
`pyproject.toml` and also add a case for Python
`3.10`
@tylerjereddy tylerjereddy force-pushed the treddy_pyproject_pep518 branch from 9a078ec to c34cb56 Compare February 12, 2022 21:45
@IAlibay
Copy link
Member

IAlibay commented Feb 12, 2022

Thanks @tylerjereddy, I was somewhat confused as to why we didn't catch this at earlier commit stages (e.g. ba197e1), maybe the pre-2019 azure images had Version baked in as a default dependency?

re: packaging, I'm in support of both your points.

Actually on the point of PEP440, can we move

if sys.version_info[:2] < (3, 7):
print('MDAnalysis requires Python 3.7 or better. Python {0:d}.{1:d} detected'.format(*
sys.version_info[:2]))
print('Please upgrade your version of Python.')
sys.exit(-1)
to just using python_requires?

@tylerjereddy
Copy link
Member Author

Actually on the point of PEP440, can we move ...

I wouldn't recommend it just yet--I think we'd still want the best possible functionality/error feedback for old-school python setup.py install for now. The other reason is that I see i.e., SciPy hasn't gutted this out of setup.py either, probably for the same reason.

One other comment is that the chances that this draft pyproject.toml works for every possible scenario are pretty low at first--for example, I don't have an M1 mac, but my hope is we'll get useful feedback and iterate on it. The logic in that file is deceptively annoying, but likely better than the arbitrary code allowed in setup.py, if you buy in to the PEP ideals.

I think there are some more useful things we can add to pyproject.toml, and some of them we may need to add before uploading to PyPI to get the automatic metadata generation on the PyPI page looking nice, like this stuff for classifiers:

image

I was hoping to avoid doing all of that in this PR though, instead incrementally building it up.

@tylerjereddy
Copy link
Member Author

tylerjereddy commented Feb 12, 2022

I was somewhat confused as to why we didn't catch this at earlier commit stages

I think it is a simpler explanation--this PR is pretty old and the dependency on packaging was added in when you merged in develop, which would include commit d1fd748, that includes this change:

diff --git a/package/setup.py b/package/setup.py
index 8c37a7385..3634a972c 100755
--- a/package/setup.py
+++ b/package/setup.py
@@ -83,10 +83,10 @@ try:
     import Cython
     from Cython.Build import cythonize
     cython_found = True
-    from distutils.version import LooseVersion
+    from packaging.version import Version
 
     required_version = "0.16"
-    if not LooseVersion(Cython.__version__) >= LooseVersion(required_version):
+    if not Version(Cython.__version__) >= Version(required_version):
         # We don't necessarily die here. Maybe we already have
         #  the cythonized '.c' files.
         print("Cython version {0} was found but won't be used: version {1} "

setuptools vendors its own distutils version these days, and setuptools was already in pyproject.toml in the initial draft of this PR, so the isolated environment was "ok."

@IAlibay
Copy link
Member

IAlibay commented Feb 12, 2022

I was hoping to avoid doing all of that in this PR though, instead incrementally building it up.

Yeah that's fair. If you want to open up issues so we remember to do them that would be great.

I am hoping to release 2.1.0 ~ Wednesday, but anything that's easy enough to implement we can push through before then.

@tylerjereddy
Copy link
Member Author

Is test_frames_times a known/stochastic failure?

@IAlibay
Copy link
Member

IAlibay commented Feb 12, 2022

Is test_frames_times a known/stochastic failure?

I think I have seen it fail before, it's rare though.

@tylerjereddy
Copy link
Member Author

/azp run --name Azure_Tests Linux-Python37-64bit-full-wheel

@azure-pipelines
Copy link

No pipelines are associated with this pull request.

@tylerjereddy
Copy link
Member Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@tylerjereddy
Copy link
Member Author

I abstracted what I think we might want to add in our first pass before releasing here: gh-3526, though it is likely not critical (I think the build reqs are the most important part).

@IAlibay IAlibay merged commit 3b7590e into MDAnalysis:develop Feb 12, 2022
@IAlibay
Copy link
Member

IAlibay commented Feb 12, 2022

Thanks @tylerjereddy !

@tylerjereddy tylerjereddy deleted the treddy_pyproject_pep518 branch February 12, 2022 23:47
tylerjereddy added a commit to tylerjereddy/mdanalysis that referenced this pull request Feb 13, 2022
* hoist the import of `packaging` outside of a Cython-focused
`try`/`except` block to avoid a deceptive error message when
`packaging` is missing, per discussion at:
MDAnalysis#3398 (comment)

* we may also want to consider removing the external dependency
completely and vendoring the pertinent code since it is so small, but
maybe not this close to a release; an example:
https://github.com/scipy/scipy/blob/main/scipy/_lib/_pep440.py
IAlibay pushed a commit that referenced this pull request Mar 2, 2022
* hoist the import of `packaging` outside of a Cython-focused
`try`/`except` block to avoid a deceptive error message when
`packaging` is missing, per discussion at:
#3398 (comment)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants