Maintenance: update snapshot script for git#24
Maintenance: update snapshot script for git#24yadij merged 4 commits intosquid-cache:masterfrom yadij:master
Conversation
| ## | ||
|
|
||
| AC_INIT([Squid Web Proxy],[5.0.0-BZR],[http://bugs.squid-cache.org/],[squid]) | ||
| AC_INIT([Squid Web Proxy],[5.0.0-VCS],[http://bugs.squid-cache.org/],[squid]) |
There was a problem hiding this comment.
This is now master, not the v5.0 branch. Would it be possible to say "master" instead of "5.0.0" here?
There was a problem hiding this comment.
The snapshots from master branch are alpha releases. The alpha version is currently 5.0.0-*. When that changes to v6 there will be explicit commits branching v5 into beta and updating the master to 6.0.0.
The design of snapshot version numeric is to enable clean integration with beta and stable release numberings. So package versioning systems in our downstream distros can handle custom builds and identify builds of v5 alphas from v4 alphas, etc. Many of them have automation performing "a new version is available" notifications to both packagers and administrators based on these numerics.
There was a problem hiding this comment.
I doubt that calling master code v5.0.0 is a good idea. Yes, it solves some problems, but it also creates others.
However, I am not going to make that disagreement a blocking issue or waste your time discussing pros and cons. If you insist on assigning a numeric version to the master code in maintenance scripts and configure.ac, I can live with that as long as we have a proper master branch (which we now do).
mksnapshot.sh
Outdated
There was a problem hiding this comment.
-x is normally done in the caller. I am a fan of -e but if it does not work well for this script, I am not objecting to its removal (I know it is difficult to use in some cases), even though it seems out of this PR scope.
There was a problem hiding this comment.
I am making that -x temporarily until the CI job log can be reviewed on some snapshots made with git. When that is working the above chunk gets reverted to -e for normal use.
There was a problem hiding this comment.
Your answer does not compute for me because when -x is needed (and I am not disputing or even discussing its need) it should go into the caller, not the script itself.
This is not a blocking issue for me.
There was a problem hiding this comment.
Removing -x now, the bits I was worried about going wrong now cannot.
mksnapshot.sh
Outdated
There was a problem hiding this comment.
You seem to be thinking in bzr terms. In git, tags are global -- they do not belong to a branch. In other words, specifying a tag alone is sufficient to identify a code snapshot. Do we need the branch name here? See git describe below for a related discussion.
There was a problem hiding this comment.
I saw comments about git archive working from anywhere. That is how bzr and svn work. They turn out to be wrong for git. git dies with "not a repository" errors for any out-of-tree attempts to archive/export, such as our CI was doing with bzr.
Where bzr was happy with just a branch path and took care of checkout etc. git requires archive to be run from inside a clone repository, with a specific branch checkout.
So to retain the same snapshot result git requires two parameters to three commands where bzr took only one parameter to one command.
Your statement about tags being global is incorrect for the magical HEAD tags. A "git archive HEAD" on checkout master produces a bundle of revision=56174e91c whereas in the same repository clone after "checkout v4.0" that git command produces a bundle of revision=ce53d9579.
That is why the branch is required separate from tag now on the script command line. To do anything beyond the default HEAD tag it cannot be assumed from the branch path like bzr was doing.
There was a problem hiding this comment.
Your statement about tags being global is incorrect for the magical HEAD tags
LOL. My statement also does not apply to the magical unicorns and flying carpets. AFAICT, git HEAD is not a tag. Unlike tags, HEAD belongs to the "heads" namespace:
ls .git/refs/
heads notes remotes tags
I recommend not using HEAD as the script parameter. To take a snapshot of the latest branch commit, the caller should probably pass the branch name (or that last commit if they need to synchronize across multiple scripts/calls).
This problem remains unresolved.
git requires archive to be run from inside a clone repository, with a specific branch checkout.
BTW, let's try to limit each Github review comment to one problem. It is easier to keep track of unresolved problems that way.
There was a problem hiding this comment.
So you are laughing at me for thinking this thing often mentioned in the git documentation as The "tag" to use for certain things and recommended for use in parameters which take tags, actually is a special-case tag? very funny.
I've tried going back to the bzr way of doing it, and found my initial mistake. bzr took the full path of the target branch, git only takes the name and assumes it is part of the current repository. That is sufficient for this scripts needs and is far simpler - so going with that now. Please review latest changes.
There was a problem hiding this comment.
The "X is incorrect because it does not cover magical Y" statement made me laugh. I am sorry if that laugh caused any harm. It was not my intention.
Git, including git documentation, is far from perfect, my current understanding of git is very far from comprehensive, and context matters a lot. I am fairly certain that my recommendation to stick with specific names (such as tags, commit SHAs, and, if really needed, branches) is the right direction for the snapshot-making script, but the path in that direction may be bumpy.
git only takes the name and assumes it is part of the current repository
This is your choice. You can specify the remote (a.k.a. repository) explicitly. In fact, when a single git directory contains multiple remotes, you often have to specify them explicitly. For example:
$ git describe --tags upstream/v3.0
SQUID_3_0_STABLE26-6-g175130e
git log origin/v3.5..upstream/v3.5
mksnapshot.sh
Outdated
There was a problem hiding this comment.
Please remove VCSROOT as unused.
mksnapshot.sh
Outdated
There was a problem hiding this comment.
Please check whether you need git checkout. IIRC, git archive can produce an archive without a checkout.
There was a problem hiding this comment.
It validates that we have a valid repository and the branch actually exists before the archive is attempted
We cannot archive HEAD without being inside a checkout of the wanted branch. I did see people writing about it being possible, but git contradicts them:
- git+ archive --format=tar HEAD
tar -xC /src/squid/v5_outoftree_snapshot_test/squid-HEAD-mksnapshot/
fatal: Not a git repository (or any of the parent directories): .git
And, if done from a temporary folder within a valid repository HEAD has the behaviour written about above. So for now we need to ensure specific branch checkout, though it remains configurable in the script command line.
There was a problem hiding this comment.
It sounds like the problems with archiving are stemming from using HEAD, which we should not use, and which is discussed elsewhere. Yes, using (unqualified) HEAD requires a checkout because there are many HEADs (one per branch).
[git checkout] validates that we have a valid repository and the branch actually exists before the archive is attempted
While true, that statement seems like an argument against doing checkout before archive! We do not actually care whether the branch existed some time before the archive. We care that it existed for the archive command to succeed.
This is not a huge deal, but if we can avoid checkout, we should.
mksnapshot.sh
Outdated
There was a problem hiding this comment.
Please see whether we can switch to using git describe instead. For example, I get the following output on your PR branch:
$ git describe --tags
SQUID_4_0_16-426-g56174e9
which is not usable "as is" but that could be because we do not have the right tag for the master branch (yet). I hope (but do not know for sure) that with the right setup you can replace all hard-coded branch names with git describe.
There was a problem hiding this comment.
Maybe in future, but right now it has several things against it:
-
I am keeping the changes simple for now since there are a lot of scripts to go through. It may happen in a followup, but for now pulling just the revision ID is sufficient. Using the git rev-parse command instead of writing a custom parser under time pressure seems like the best choice right now.
-
I doubt we could avoid the branch name yet, since as your own output demonstrates clearly git thinks the Squid-5 alpha code is part of the Squid-4.0.16 release. That is a result of us not having alpha branching tags in the bzr repo, not the git conversion, so medium-term fixable.
-
The same command for me produces different results, which does not bode well for cross-platform reliability - given the CI infrastructure is building the snapshots in yet another environment than both of us run.
-
The master server scripts do not (yet) guarantee the current checkout is the branch being snapshot. At least one place is running the snapshot script from master branch on out-of-repository copies of the stable branches to ensure they get snapshot appropriate to their released code build. I expect that to disappear later in this process, but for now this script needs to be compatible.
There was a problem hiding this comment.
- Your item 1 can be a valid excuse but I am not convinced that using
git describewould lead to (a lot) more changes or (significantly) increase complexity. The proposedgit rev-parse --short HEADgives your the latest commit SHA on the current branch whilegit describe $branchmay be able to give you the right Squid version number (even for master!) and possibly the same SHA (if needed). - Your item 2 could be a side effect of us lacking the right tag for master. That tag would tell
git describethat we are on a v5 path. - Your item 3 could be a side effect of your PR lacking a dedicated branch (normally, PRs originate from developer branches, not master). It could also be a side effect of how I fetched your PR -- I am still learning how to do that. In either case, it would not be applicable to branches.
- I do not fully understand your item 4 but it seems to be irrelevant because
git describecan be given a branch as a parameter.
Handling this properly requires understanding how git describe works and, possibly, adding a few tags. The long-term benefit may be significant, but this is not a blocking problem. If you do not want to study this any further, I certainly cannot insist that you do (because I do not know whether this investigation is likely to have a positive outcome).
| ## | ||
|
|
||
| AC_INIT([Squid Web Proxy],[5.0.0-BZR],[http://bugs.squid-cache.org/],[squid]) | ||
| AC_INIT([Squid Web Proxy],[5.0.0-VCS],[http://bugs.squid-cache.org/],[squid]) |
There was a problem hiding this comment.
I doubt that calling master code v5.0.0 is a good idea. Yes, it solves some problems, but it also creates others.
However, I am not going to make that disagreement a blocking issue or waste your time discussing pros and cons. If you insist on assigning a numeric version to the master code in maintenance scripts and configure.ac, I can live with that as long as we have a proper master branch (which we now do).
mksnapshot.sh
Outdated
There was a problem hiding this comment.
Your answer does not compute for me because when -x is needed (and I am not disputing or even discussing its need) it should go into the caller, not the script itself.
This is not a blocking issue for me.
mksnapshot.sh
Outdated
There was a problem hiding this comment.
Your statement about tags being global is incorrect for the magical HEAD tags
LOL. My statement also does not apply to the magical unicorns and flying carpets. AFAICT, git HEAD is not a tag. Unlike tags, HEAD belongs to the "heads" namespace:
ls .git/refs/
heads notes remotes tags
I recommend not using HEAD as the script parameter. To take a snapshot of the latest branch commit, the caller should probably pass the branch name (or that last commit if they need to synchronize across multiple scripts/calls).
This problem remains unresolved.
git requires archive to be run from inside a clone repository, with a specific branch checkout.
BTW, let's try to limit each Github review comment to one problem. It is easier to keep track of unresolved problems that way.
mksnapshot.sh
Outdated
There was a problem hiding this comment.
It sounds like the problems with archiving are stemming from using HEAD, which we should not use, and which is discussed elsewhere. Yes, using (unqualified) HEAD requires a checkout because there are many HEADs (one per branch).
[git checkout] validates that we have a valid repository and the branch actually exists before the archive is attempted
While true, that statement seems like an argument against doing checkout before archive! We do not actually care whether the branch existed some time before the archive. We care that it existed for the archive command to succeed.
This is not a huge deal, but if we can avoid checkout, we should.
mksnapshot.sh
Outdated
There was a problem hiding this comment.
- Your item 1 can be a valid excuse but I am not convinced that using
git describewould lead to (a lot) more changes or (significantly) increase complexity. The proposedgit rev-parse --short HEADgives your the latest commit SHA on the current branch whilegit describe $branchmay be able to give you the right Squid version number (even for master!) and possibly the same SHA (if needed). - Your item 2 could be a side effect of us lacking the right tag for master. That tag would tell
git describethat we are on a v5 path. - Your item 3 could be a side effect of your PR lacking a dedicated branch (normally, PRs originate from developer branches, not master). It could also be a side effect of how I fetched your PR -- I am still learning how to do that. In either case, it would not be applicable to branches.
- I do not fully understand your item 4 but it seems to be irrelevant because
git describecan be given a branch as a parameter.
Handling this properly requires understanding how git describe works and, possibly, adding a few tags. The long-term benefit may be significant, but this is not a blocking problem. If you do not want to study this any further, I certainly cannot insist that you do (because I do not know whether this investigation is likely to have a positive outcome).
| echo "STATE..." | ||
| echo "PACKAGE: ${PACKAGE}" | ||
| echo "VERSION: ${VERSION}" | ||
| echo "BRANCH: ${branch}" |
There was a problem hiding this comment.
Adding a dummy comment in hope of triggering a CI check. Please ignore.
Edit: This is not helping. Perhaps we cannot trigger a CI check for PRs that were open before CI integration. We will know when there is a second/new PR. Pushing new code into the old PR may also trigger a CI check.
Reverting to the BZR way of doing things solely on branch name. It turns out that all the git documentation saying to use tags and indicating a tag named HEAD was best to use for latest code snapshots was wrong. So long as we consistently use the branch name in all git commands they are applied to that branchs HEAD revision instead of the working directory's checkout.
rousskov
left a comment
There was a problem hiding this comment.
I am suggesting two changes. Both can be ignored so I am marking the review as "approved". Both can be addressed without reposting for review.
Please either edit your branch to end up with a single commit OR use squash commit.
| if [ $# -lt 1 ]; then | ||
| echo "Usage: $0 [branch]" | ||
| echo "Where [branch] is the path under /bzr/ to the branch to snapshot." | ||
| echo "Where [branch] is the name of the branch to snapshot." |
There was a problem hiding this comment.
I would replace branch with revision (or similar) because one does not need a git branch to create a snapshot:
Where [revision] is the git tree-ish to snapshot (e.g., a branch name)
Creating a snapshot of a tagged version or a commit is a perfectly reasonable thing to do. This minor change does not affect the script code much.
N.B. Tree-ish is documented in "man git". Your git archive expects a tree-ish argument.
| revision=`git rev-parse --short ${branch}` | ||
| suffix="${date}-r${revision}" | ||
| eval `grep "^ *PACKAGE_VERSION=" configure | sed -e 's/-BZR//' | sed -e 's/PACKAGE_//'` | ||
| eval `grep "^ *PACKAGE_VERSION=" configure | sed -e 's/-VCS//' | sed -e 's/PACKAGE_//'` |
There was a problem hiding this comment.
This is one of the places where it would be better to extract the "version" information from git rather than essentially expect all git branches to change configure.ac. I understand that you may not want to change this part of the snapshot "architecture" now, but it would be a perfect opportunity to do so IMHO.
|
It looks like Semaphore CI cannot see a pre-CI PR, despite you pushing new PR code. I have temporary removed CI tests as the PR merge prerequisite so that you can merge this PR when you are ready. |
The changes are now small enough not to worry about correct branch alignment between code and revision ID.
* Update snapshot script after git migration - Remove unused BZRROOT environment variable - Replace tag with branch name
* Maintenance: update snapshot script for git (#24) * Update snapshot script after git migration - Remove unused BZRROOT environment variable - Replace tag with branch name * Update source-maintenance script for git (#26) * replace bzr calls with git equivalent * remove obsolete ROOT and PWD variables (git does not support non-recursive file listing) * add exceptions to ignore more files caught by git than bzr * Maintenance: remove unused .bzrignore (#29) * Remove .bzrignore from git * Prep for 3.5.27 * SourceFormat Enforcement
* Maintenance: update snapshot script for git (squid-cache#24) * Update snapshot script after git migration - Remove unused BZRROOT environment variable - Replace tag with branch name * Update source-maintenance script for git (squid-cache#26) * replace bzr calls with git equivalent * remove obsolete ROOT and PWD variables (git does not support non-recursive file listing) * add exceptions to ignore more files caught by git than bzr * Maintenance: remove unused .bzrignore (squid-cache#29) * Remove .bzrignore from git * Prep for 3.5.27 * SourceFormat Enforcement
* Prep for 3.5.27 (#48) * Maintenance: update snapshot script for git (#24) * Update snapshot script after git migration - Remove unused BZRROOT environment variable - Replace tag with branch name * Update source-maintenance script for git (#26) * replace bzr calls with git equivalent * remove obsolete ROOT and PWD variables (git does not support non-recursive file listing) * add exceptions to ignore more files caught by git than bzr * Maintenance: remove unused .bzrignore (#29) * Remove .bzrignore from git * Prep for 3.5.27 * SourceFormat Enforcement * 3.5.27
No description provided.