Pants: Add BUILD metadata to handle git submodule #6258
Merged
This PR has commits extracted from #6202 where I'm working on getting all of our unit tests to run under pants+pytest.
This PR focuses on managing the git submodule we use to test pack features, especially the content version selection that relies on `git worktree`. Pants runs processes like pytest in a sandbox, but the `.git` directory does not get copied into the sandbox. Plus, `.git` is generally ignored, so I had to find a way to work around the pants assumptions that ignore `.git` and copy the submodule bits into the sandbox.

This is the target that copies the `.git` directory for the submodule:

st2/BUILD
Lines 105 to 109 in a25bcfd
This relies on a few pants features, some of which are experimental (which means the interface can change in a future release of pants):

- `shell_command(...)`: This target tells pants to run a command when another target depends on this one. See the docs. For example, this is a test that depends on this target:

  st2/contrib/runners/python_runner/tests/unit/BUILD
  Lines 15 to 18 in a25bcfd

- `environment=...`: Environments is a newer, experimental feature in pants. Typically things run in the `local` environment, but they can also run remotely or in docker depending on the config. See the docs for this feature, and the blog post about it.

- `"in_repo_workspace"`: this is the name of the environment we're using. To define the environment, I had to register it in `pants.toml` and add a `BUILD` target. Here is the snippet from `pants.toml` (note that environments are an experimental/preview feature):

  st2/pants.toml
  Lines 252 to 254 in a25bcfd
The `"in_repo_workspace"` environment uses an even newer experimental feature, the `experimental_workspace_environment(...)` target. See the docs overview and the docs reference. This environment is the key to capturing `.git`, because it allows us to run the `shell_command` in the repo (in the workspace) instead of running in a sandbox. Pants could run a command using another target like `run_shell_command`, but the alternatives either don't allow capturing output files for use in other sandboxed tasks, or they were more cumbersome. I tried to add plenty of documentation to the new `BUILD.environment` file. In the future we might add one or more `docker_environment` targets as well. Here is the target definition:

st2/BUILD.environment
Lines 10 to 14 in a25bcfd
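
For readers who cannot see the embedded snippets, here is a minimal sketch of how the pieces described above might fit together. It is not copied from the st2 files. The description text, the test target name, and the `//:capture_git_modules` address are hypothetical, and the `pants.toml` registration is shown only as a comment.

```python
# BUILD.environment (sketch, not the actual st2 file)
#
# The environment name must also be registered in pants.toml, e.g.:
#   [environments-preview.names]
#   in_repo_workspace = "//:in_repo_workspace"
experimental_workspace_environment(
    name="in_repo_workspace",
    # Hypothetical description text.
    description="Run commands in the repo checkout instead of a sandbox.",
)

# path/to/tests/BUILD (sketch; hypothetical location and names)
python_tests(
    name="tests",
    dependencies=[
        # Hypothetical address of the shell_command target that captures
        # the submodule's git metadata.
        "//:capture_git_modules",
    ],
)
```

Because the test target lists the command as a dependency, pants runs the command first and places its captured outputs into the test's sandbox.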
Returning to the `shell_command`, with `command="cp -r .git/modules {chroot}/.git"`:

`{chroot}` is very important. Though the command does not run in a sandbox (aka "chroot"), the command still has a sandbox that can contain generated files, or, for our purposes, files captured as "outputs" that can be placed, as "inputs", in the sandbox of targets that depend on the command. The `run_shell_command` docs are helpful in understanding this.

`shell_command` defines the files to capture from its sandbox. The command merely copies the files we need (which are very small because the test repo is tiny) into the sandbox so they can be captured:

st2/BUILD
Line 118 in a25bcfd
`execution_dependencies` makes pants copy the listed files into the sandbox, and more importantly, tells pants it has to rerun the command if the files change. `output_dependencies` gets passed onto any targets that depend on the command, so they also transitively depend on the files.

st2/BUILD
Lines 116 to 117 in a25bcfd
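
Putting the fields discussed so far together, the command target could look roughly like the sketch below. Only the `environment`, `command`, and the two dependency field names come from the description above. The target name, the `tools` entry, and the `output_directories` value are assumptions for illustration, not the contents of st2/BUILD.

```python
# BUILD (sketch; not the actual st2/BUILD contents)
shell_command(
    name="capture_git_modules",  # hypothetical target name
    # Run in the repo checkout (workspace) rather than in a sandbox.
    environment="in_repo_workspace",
    # {chroot} is replaced with the path of the command's own sandbox,
    # so the copied files land where pants can capture them.
    command="cp -r .git/modules {chroot}/.git",
    tools=["cp"],  # assumption: declare the binary the command needs
    # Rerun the command when these files change.
    execution_dependencies=[":gitmodules"],
    # Pass these files on to any target that depends on this command.
    output_dependencies=[":gitmodules"],
    # Assumption: capture the copied directory from the sandbox.
    output_directories=[".git"],
)
```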
The `gitmodules` target is defined here. Note that `.git` is a file, not a directory, in the submodule. It points git to the actual location of the git metadata in the st2 repo's `.git` directory. We can't directly capture `.git/modules` like this, because the `.git` directory is ignored.

st2/BUILD
Lines 97 to 103 in a25bcfd

st2/BUILD
Lines 110 to 115 in a25bcfd
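
To illustrate why both the submodule's `.git` file and the copied metadata directory have to be present in the sandbox, here is a small Python sketch of how such a pointer file resolves. It relies on git's `gitdir: <path>` file format. The function name and the directory layout in the demo are made up for illustration and are not part of this PR.

```python
import tempfile
from pathlib import Path


def resolve_gitdir(worktree: Path) -> Path:
    """Return the directory that holds the git metadata for `worktree`.

    In a normal clone `.git` is a directory. In a submodule it is a
    one-line file of the form `gitdir: <path>`, where a relative path is
    resolved against the directory that contains the file.
    """
    dot_git = worktree / ".git"
    if dot_git.is_dir():
        return dot_git.resolve()
    content = dot_git.read_text().strip()
    prefix = "gitdir: "
    if not content.startswith(prefix):
        raise ValueError(f"{dot_git} is not a gitdir pointer file")
    target = Path(content[len(prefix):])
    if not target.is_absolute():
        target = worktree / target
    return target.resolve()


if __name__ == "__main__":
    # Build a throwaway layout that mimics a repo with one submodule.
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp) / "repo"
        (repo / ".git" / "modules" / "sub").mkdir(parents=True)
        (repo / "sub").mkdir()
        (repo / "sub" / ".git").write_text("gitdir: ../.git/modules/sub\n")
        print(resolve_gitdir(repo / "sub"))
```

If only the pointer file were copied into the sandbox, the path it names would not exist there, which is why the command also copies `.git/modules`.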
Finally, the last piece of getting these tests to run with pytest+pants in #6202 was working around a quirk of cloning in GHA. The `actions/checkout` action only fetches 1 commit, which is great for st2.git, but it is not enough for the submodule, which needs the full history and git tags. So, I updated the pants test GHA workflow to work around this.
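
The workflow change itself is not reproduced in this description. As one possible way to get full history and tags for a shallow submodule checkout, the Python sketch below wraps two standard git commands. The helper names and the submodule path are hypothetical, and this is not necessarily what the GHA workflow does.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], str]


def run_git(args: Sequence[str]) -> str:
    """Run a command and return its stdout, raising on a non-zero exit."""
    result = subprocess.run(list(args), check=True, capture_output=True, text=True)
    return result.stdout


def fetch_full_history(submodule_path: str, run: Runner = run_git) -> List[str]:
    """Fetch all tags and, if the clone is shallow, the full history.

    Returns the fetch command that was executed, for logging.
    """
    git = ["git", "-C", submodule_path]
    # Prints "true" for a shallow clone, "false" otherwise.
    shallow = run(git + ["rev-parse", "--is-shallow-repository"]).strip() == "true"
    fetch = git + ["fetch", "--tags"]
    if shallow:
        # --unshallow is rejected on a complete repository, so add it
        # only when the clone really is shallow.
        fetch.append("--unshallow")
    run(fetch)
    return fetch


if __name__ == "__main__":
    # Hypothetical submodule path; adjust to the real checkout location.
    print(fetch_full_history("path/to/submodule"))
```

Passing the runner as a parameter keeps the git calls easy to stub out when testing the helper.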