-
Notifications
You must be signed in to change notification settings - Fork 29
Include actual python path (sys.executable) in bootstrap cache key #130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
benjyw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want/need to call os.path.realpath() on that?
|
Ah, good catch. I think My test: if running within a first virtualenv with the python executable a symbolic link, (Mildly interestingly, use of |
| @@ -404,6 +408,7 @@ function run_bootstrap_tools { | |||
| "os_name=$(uname -s)" | |||
| "arch=$(uname -m)" | |||
| "python_path=${python}" | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we omit python_path from the key now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, now I'm wondering if realpath is actually a mistake - possibly we do want different virtualenvs that symlink to the same interpreter to have different cache keys?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the realpath is required to ensure this key works to reliably stop issues like pantsbuild/actions#5. That is: the cached directory includes a symlink to the system, so the key should change if the literal target of the symlink changes, and since virtualenv seems to flatten symlinks (rather than having symlinks to symlinks), the key should too.
As you say, maybe different virtualenvs that happen to link to the same interpreter should end up with different keys, and I think python_path may capture that. That said, I do also wonder if python_path is no longer necessary. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess even the same venv could change over time, if someone gets really fiddly. There is no easy way to truly fingerprint a python interpreter's full state. But yeah, let's leave python_path in as an extra bulwark.
This fixes a (small) bug in #128: the virtualenv may have a symlink to the 'actual' path of the python executable, and the
python_pathvalue used may've itself been a symlink or a shim. This usessys.executableas an estimate of this path, which will hopefully be good enough for the circumstances wherebootstrap-cache-keyis useful.After this patch, running on my system shows (note the difference between
python_pathandpython_executable_path):It looks like
virtualenvdoes more complicated things looking atsys._base_executable(if it exists).I don't think it's worth reproducing that logic, and it's not possible to invoke virtualenv directly at this point (this needs to be able to run before installing any packages). An alternative would be to no longer use
virtualenvand instead switch to the stdlibvenv, which can be run as part of the cache-key process. I suspect the use ofvirtualenvmay've been from before pants required Python version >= 3.3 (when venv was added)? Or maybe pants uses features beyond whatpython -m venv ...can offer?