Skip to content

Maintenance for git#29

Merged
yadij merged 4 commits intosquid-cache:masterfrom
yadij:maintenance-git
Aug 6, 2017
Merged

Maintenance for git#29
yadij merged 4 commits intosquid-cache:masterfrom
yadij:maintenance-git

Conversation

@yadij
Copy link
Contributor

@yadij yadij commented Jul 22, 2017

Update the repository ignored files list for git, also include a few entries that should have been added long ago.

Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since your PR adds new files, but does not add many files that I think should be added, I have to ask: What are our criteria for .gitignore membership?

I would suggest that any (potentially) generated by build file should be listed. However, my previous attempt to add files based on that criteria was met with resistance from your side.

@yadij
Copy link
Contributor Author

yadij commented Jul 23, 2017 via email

@yadij
Copy link
Contributor Author

yadij commented Jul 23, 2017

Ah, I remember one worry a while back was being able to see what the master server snapshot and maintenance scripts were leaving sitting around their workspaces. Those scripts are now a lot better after two rounds of repo transition cleanup, though I'm not completely trusting them for removals.

@rousskov
Copy link
Contributor

whether the file is part of the build system which distros typically replace or update themselves

Relying on what "distros" do seems like a bad idea for something that affects Squid developers rather than distros, but even if we go with that definition, how do you justify ignoring SPONSORS? Do distros replace or update SPONSORS?!

being able to see what the master server snapshot and maintenance scripts were leaving sitting around their workspaces

If they leave something they should not leave, what happens? What kind of problems am I suffering for?

@yadij
Copy link
Contributor Author

yadij commented Jul 24, 2017

I consider SPONSORS to be sort-of part of the build chain since it is required to exist for tarball creation. But fair call, it is not strictly scripts or code. If we decide on a proper policy I will add/remove as necessary.

I've dug up the conversation I think you were referring to as past attempt to get things ignored. In that patch you were trying to get us to ignore .o, .la etc objects inside the build tree - not just auto-built source code. I stand by my initial reasons against those objects being ignored. If they are causing you pain then you should be building the code the much less painful way, out-of-tree. Which some call "The Right Way(tm)(r)" etc.

To re-quote that discussion point here:
"
What you seem to be missing is that if updating a Makefile.am or
configure.ac (eg. with bzr update) to remove the generated file. Then a
following "make clean" or even distclean will not remove the old files
from the build area. This will keep them hidden and possible affecting
the build for people who may not be aware of the change.

It can also hide makefile mistakes between $srcdir and $builddir or the
matching top_* variables if the developer changing the makefile is
working in-tree.
"

@rousskov
Copy link
Contributor

I consider SPONSORS to be sort-of part of the build chain since it is required to exist for tarball creation. But fair call, it is not strictly scripts or code.

To avoid misunderstanding, please note that I do not think "part of the build chain" or "scripts or code" conditions belong to the gitignore inclusion policy. I am not implying that you have proposed those conditions as policy terms; just saying that we should not use them.

I think we should have a well-defined policy. I hope we agree on that and can use that agreement as a starting point. So far, I know of two policy candidates (summarized from this PR comments):

  1. An untracked file that can be generated in a working directory by any valid Squid build command(s) should be listed in .gitignore. Build commands include ./bootstrap.sh, ./configure, make, test-builds.sh, and source-maintenance.sh (each with any supported parameters). Other files should not be listed in .gitignore.
  2. An untracked file that distros typically replace or update should be listed in .gitignore. Other files should not be listed in .gitignore.

Please invalidate or correct proposal 2 summary above if it does not accurately reflect your proposal.

As you know, I am not a fan of proposal 2 because .gitignore is meant for Project developers (not external distros) use, and because the proposal is not specific enough for me to know whether a given file X should be listed (and so I cannot review it further). You have not addressed that critique yet.

one worry a while back was being able to see what the master server snapshot and maintenance scripts were leaving sitting around their workspaces

If this is an argument against including, say, src/store.o, then why is it important to see whether they leave any .gitignored files?

If they are causing you pain then you should be building the code the much less painful way, out-of-tree.

You are not in a good position to judge the amount of pain out-of-tree builds would cause me. Why should we force one approach to building Squid on everyone in every situation instead of adding (commonly listed) make-generated files to .gitignore?

Which some call "The Right Way(tm)(r)" etc.

And some call in-tree builds a "grand Unix tradition". Such essentially meaningless labels do not make a good argument. IMO, the decision where to build should be left to the developer unless there are some very good reasons to force one way on everybody.

updating a Makefile.am to remove the generated file will not remove the old files from the build area. [listing them in gitignore] will keep them hidden and possible affecting the build for people who may not be aware of the change.

This is true. A broken master (e.g., with an out-of-sync .gitignore) may have bad side effects if we list generated files in .gitignore. However, a perfect master (with an in-sync .gitignore) has bad side effects if we do not list generated files in .gitignore. IMO, with all other factors being equal, we should optimize the latter case rather than trying to work around bugs that create the former case.

FWIW, I would not be against auto-generating (a portion of) .gitignore to minimize out-of-sync cases. With the gitignore inclusion rule I have proposed, it should be fairly straightforward to cover 90+% of cases, and that is all we may need to automate.

Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarize my current position, I see two options:

  1. Agree on a .gitignore policy and apply it.
  2. Remove new entries and just commit the renamed .bzrignore file.

I recommend the first approach, but I do not insist on it. Either way, PR changes are needed.

@yadij
Copy link
Contributor Author

yadij commented Jul 25, 2017

Your proposal #1 is flawed. The auto-generated files you list fall into three distinct groups:

The first group is files generated by the ./bootstrap.sh file. Since that script is never bundled in any released package all changes and auto-generated content from executing that script should be ignored by the VCS. Allowing the VCS to purge them by mistake on any of its regular cleaning operations would be a major problem for packagers. With the exclusion of SPONSORS and libltdl folder all those generated files were already listed in the .bzrignore.
These are the files I refer to as downstream distros replacing. If they wish you use their own auto-tools version they either run ./bootstrap.sh itself or autoreconf to do the same process. The same set of files get replaced both ways.

The second group is the auto-tools generated files, created during the execution of ./configure and make. These are the files where dependency issues have been an ongoing major problem for both in-tree and out-of-tree builds. When things do go badly the VCS commands for recovering from the trouble require them not to be ignored, so the VCS can do its thing. Ignoring these files would increase the trouble tracking down the dependency and build linker issues we keep coming up against, and vastly complicate the cleanup/recovery process.

The third group are the files generated during ./test-builds.sh execution. These should also be explicitly not-ignored so they can be discarded automatically by the VCS tools when cleaning the test workspace for further tests. Doing so independent of the test scripts own cleanup provides both surety that the cleanup really is clean and to highlight any mistakes in the build process auto-cleaning. This is in itself a test of the testing logic.

./source-maintenance does not create any new files. It alters existing ones, in some cases completely re-generating their content. However they should all be tracked by the VCS if they are worth changing, so must not be ignored. The point of that script is to ensure the VCS contents is consistent and complies with our coding guidelines.

@yadij
Copy link
Contributor Author

yadij commented Jul 25, 2017

You are not in a good position to judge the amount of pain out-of-tree builds would cause me. Why should we force one approach to building Squid on everyone in every situation instead of adding (commonly listed) make-generated files to .gitignore?

I feel I am actually, having spent many years building a wide variety of code using both methods and a variety of IDE to do so. Out-of-tree is undoubtedly the less painful and annoying of the two methodologies whatever the absolute level of pain may be (that I cannot tell). Any in-tree pains which might be solved by .gitignore entries are a natural consequence of not choosing out-of-tree, and are resolved fully by changing the choice of build method. No I am not forcing anyone to do anything, we all face the pain our chosen workflow incurs. I just recommend a better workflow whenever I see someone suffering needlessly from an obviously bad choice - a favour which was done for me years ago by wiser heads.

If you absolutely must use in-tree builds for any reason, git can be made to use local ignore patterns without global entries in .gitignore (https://stackoverflow.com/questions/653454/how-do-you-make-git-ignore-files-without-using-gitignore). I propose that as a better workaround for in-tree specific pains rather than forcing unnecessary and long-term extra pains on other build workflows and confusion problems into debugging efforts.

@rousskov
Copy link
Contributor

Alex: You are not in a good position to judge the amount of pain out-of-tree builds would cause me.
Amos: I feel I am actually

I admire your confidence. Fortunately, if you agree that we should not force out-of-tree builds on everybody (even if you think they are better for everybody), then this disagreement is unimportant and can be safely ignored.

git can be made to use local ignore patterns without global entries in .gitignore

There are many ways to configure git indeed. Hopefully, we can agree on a .gitignore policy that accommodates most developers reasonably well, but if we fail, those other ways can, of course, be used. FWIW, I have been using them with bzr. I trust you would not insist that those alternative ways are used by others while you enjoy a policy-free control of the official .gitignore.

all changes and auto-generated content from executing [bootstrap.sh] should be ignored by the VCS

Agreed. This is covered by my proposal as well.

When things do go badly the VCS commands for recovering from the trouble require [make-generated files] not to be ignored

Your implied assumption that .gitignore contains files that git must always ignore is natural but wrong. .gitignore is just a source of information that git can use. How git uses that info depends on the context. For example, I can use git to remove all generated files (if those files are listed in .gitignore) by running "git clean -X". However, if my .gitignore is deficient, then I would have to either rely on some dangerous/destructive commands like "git clean -x" that may delete something valuable OR go outside a pure-git solution (because I refused to give git enough information).

[build-test.sh-generated files] should also be explicitly not-ignored so they can be discarded automatically by the VCS tools when cleaning the test workspace for further tests.

Sounds like the same misunderstanding as the previous group: Listing generated files in .gitignore has the opposite effect -- it allows us to safely discard them using git itself. It is not listing them that creates cleanup (and other, more serious) problems.

./source-maintenance does not create any new files.

It does. It tries to remove the files it creates, but when things go wrong, those files are left behind. One example I saw a few days ago were a few files matching the *.astylebak* pattern, but there might be others. Fortunately, my proposed policy covers them all, even if we cannot immediately predict all the patterns.

In summary, we agree on the first group, including groups 2 and 3 would solve the problems you want to solve (at least the specific ones I could understand/analyze), and your assumption about group 4 being empty is wrong (but my proposed policy would cover that group as well).

@yadij
Copy link
Contributor Author

yadij commented Jul 26, 2017

However, if my .gitignore is deficient, then I would have to either rely on some dangerous/destructive commands like "git clean -x" that may delete something valuable OR go outside a pure-git solution (because I refused to give git enough information).

So you are in complete agreement with me that .gitignore should NOT contain ./configure and make generated files? so that one does not have to resort to documented unsafe commands like -x to delete them.

Listing generated files in .gitignore has the opposite effect -- it allows us to safely discard them using git itself. It is not listing them that creates cleanup (and other, more serious) problems.

Try it yourself. Get a completely empty checkout of todays master. Run:
./bootstrap.sh ; git status ; echo "Makefile.in" > .gitignore ; git status ; git clean -dn

@rousskov
Copy link
Contributor

So you are in complete agreement with me that .gitignore should NOT contain ./configure and make generated files

No, I (still) favor the opposite design. I interepreted your "should be ignored by VCS" as "should be included in .gitingore". If that is not what you meant, then there is no agreement regarding the first group.

that one does not have to resort to documented unsafe commands like -x to delete them

I do not know which commands you are referring to, but git clean -X in my examples is documented to be as safe as any automated removal can get AFAICT.

Run: ./bootstrap.sh ; git status ; echo "Makefile.in" > .gitignore ; git status ; git clean -dn

You lost me here: The above sequence gives me the result I would expect, but I do not know what your point is. I assume that you are responding to the following assertion:

Alex: Listing generated files in .gitignore allows us to safely discard them using git itself. It is not listing them that creates cleanup (and other, more serious) problems.

However, your command sequence is unrelated to the first part of the assertion (because your sequence does not use the safe method for discarding listed files -- git clean -X) and can illustrate the second part (because your sequence can delete valuable files once the --dry-run flag is dropped to allow git to actually do something).

@yadij
Copy link
Contributor Author

yadij commented Jul 31, 2017

Alex: I do not know which commands you are referring to, but git clean -X in my examples is documented to be as safe as any automated removal can get AFAICT.

Only so long as you are happy erasing "removing all untracked files, including build products.". Note that includes all the manually created files one might be working on still.

The way I understand it, the -x / -X options are for testing purposes only. When a completely pristine checkout workspace is required for a rebuild test that needs to check the build system itself. For example that the repository content can completely build from-scratch after a clone into a previously empty directory - without having the costs of re-cloning the whole repo each test.

The more "normal" operation I was taught is to "git add" (or stash) temporary work when its thought to be ready. "git clean" (no options necessary). Do pre-commit rebuild to check the changes actually are complete. Then "git commit" the changes.

If you are relying on -x / -X then neither omissions nor bad listings in .gitignore show up as test failures, though omissions may show up with the -X variant if one is performing the commands manually instead of kicking off a script in a background shell and moving on to other important work while the machines churn over the QA.

Amos: Run: ./bootstrap.sh ; git status ; echo "Makefile.in" > .gitignore ; git status ; git clean -dn

Alex: You lost me here: The above sequence gives me the result I would expect, but I do not know what your point is.

My point was that listing files in .gitignore and using the normal git clean command does not result in those files being deleted - they are simply invisible to the unwary. In response to your statement that listing things in .gitignore is the best way to get things deleted.

@rousskov
Copy link
Contributor

Alex: git clean -X in my examples is documented to be as safe as any automated removal can get
Amos: Only so long as you are happy erasing "removing all untracked files, including build products.".

You may be looking at the wrong command. According to git clean --help, git clean -X does not remove all untracked files. Here are the docs:

-X
    Remove only files ignored by Git. This may be useful to rebuild
    everything from scratch, but keep manually created files.

The more "normal" operation I was taught is to "git add" temporary work when its thought to be ready. "git clean" (no options necessary). Do pre-commit rebuild to check the changes actually are complete. Then "git commit" the changes.

The above approach (after removing the unrelated "stash" command from your paragraph) is a recipe for disaster because it will remove newly authored files that humans often forget to "git add":

$ echo '10 hours of work' > file.h
$ echo '5 hours of work' > file.cc
$ git add file.cc
$ git clean
Removing file.h
$!@#%$)@%$)!

The affects are so bad that (some) modern gits try to prevent such disasters by disabling git clean without arguments by default! On the other hand, git clean -Xis safe:

$ echo '10 hours of work' > file.h
$ echo '5 hours of work' > file.cc
$ git add file.cc
$ git clean -X
$ ls
file.cc  file.h

Needless to say, if one relies on git clean -X a lot, they should alias it.

A better approach is to use "git status" (or, even better, one of its succinct variations) before commits and any potentially destructive commands, but that works well only when .gitignore contains generated files. Which naturally brings us back to the .gitignore rule 1 that I have proposed...

If you are relying on -X then neither omissions nor bad listings in .gitignore show up as test failures

You lost me here. Are you introducing another challenge that I should find a solution for? Should I devise a new "test" that detects .gitignore omissions and bad listings? And how will rule 2 allow for creation of such a test when it is so vague about what a .gitignore omission or bad listing is?! Or do you want some existing "test" to fail when .gitignore contains an omission or bad listing? I fear shifting goalposts, but please be more specific about this new problem that you think rule 1 creates and rule 2 does not, and I will try to address it.

Please note that development work requires a lot of different actions. I mentioned git clean -X only as a simply solution to the specific problems you have listed (to the extent I could extract specific problems from vague descriptions). You have then tried to discredit this command by claiming that it does something that it does not do. As of now, I have not seen any evidence that git clean -X would not solve those specific problems, but I am not saying that this command is a panacea. It is just one of many git commands that benefit from .gitignore being correct...

My point was that listing files in .gitignore and using the normal git clean command does not result in those files being deleted

Very true but I do not see the relevance -- I have not proposed using what you call "normal git clean command" (because it results in removal of files that developers usually do not want to remove). The "normal git clean command" should be avoided by humans (and is disabled in many git installations). See the above example for an illustration. Deleting a generated file is often desirable and never fatal. Deleting a file that you worked on for hours is rarely desirable and usually fatal. This is why a git clean -X alias works well, but git clean and git clean -x should be avoided (by humans).

In response to your statement that listing things in .gitignore is the best way to get things deleted.

Listing in .gitignore does not delete anything (naturally), but it allows git clean -X to safely delete listed files. I do not know of any better alternative (as far as deletion of generated files is concerned), and I do not know why you are discussing fatally flawed git clean and git clean -d that I have not proposed.

Let's focus on rules that define correct .gitignore contents. By now, you have seen several specific examples where my rule 1 helps developers. Ignoring incorrect statements about various git commands, I have not seen any specific examples where following that rule would hurt developers (or where your rule 2 would be better). Thus, I do not understand why you resist adopting rule 1.

@rousskov
Copy link
Contributor

rousskov commented Aug 1, 2017

As of 250af31, your PR simply removes .bzrignore. If that is intentional and you do not want to continue with devising a .gitignore policy (as a part of this PR), then please update PR commit message accordingly, and I will approve this PR.

@yadij
Copy link
Contributor Author

yadij commented Aug 5, 2017

I would kind of like to reach agreement on this, but my enthusiasm for continuing this PR is low. The driver for it was .bzrignore hanging around, and the content of that file was largely inherited from CVS requirements. So I think maybe we should test-drive a bit more without a formal .gitignore and just setup our own local exclusions to suit our own preferences.

Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I certainly cannot object to removal of stale .bzrignore. Please do fix the commit message accordingly when merging this.

@rousskov
Copy link
Contributor

rousskov commented Aug 5, 2017

So I think maybe we should test-drive a bit more without a formal .gitignore and just setup our own local exclusions to suit our own preferences.

I am doing exactly that already but I doubt this "test" can help us resolve our differences unless you start test-driving my .gitignore (which is a superset of yours). I already know that a smaller .gitignore is not usable for me (at least unless I stop using in-source builds).

@yadij
Copy link
Contributor Author

yadij commented Aug 6, 2017

I already test-drove that style of workflow with bzr (it has --detritus and --ignored equivalents of -df and -Xf). The result was several Squid releases that "worked perfectly" for me but no downloader could build the tarball - because of various auto-generated build products not being included.

I'm glad the workflow works for you. It just doesn't always. I want the bits we push out to everybody to be the bits that we can all stand behind as always working, and leave the risky bits to options (wiki docs?) people can use at their choice.

@yadij yadij force-pushed the maintenance-git branch from 250af31 to 9d162ba Compare August 6, 2017 07:24
@yadij yadij force-pushed the maintenance-git branch from 9d162ba to 61fd001 Compare August 6, 2017 09:46
@yadij yadij merged commit d17ea41 into squid-cache:master Aug 6, 2017
@rousskov
Copy link
Contributor

rousskov commented Aug 6, 2017

I already test-drove that style of workflow with bzr

For the record, I do not know what style of workflow you are referring to. IIRC, I have not suggested and you have not described any workflow styles in this discussion. We talked about deleting generated files, which is not workflow (but could be a workflow step).

The result was several Squid releases that ... no downloader could build

With virtually no relevant knowledge about your release scripts/process, I can only speculate that the problems you have experienced are unrelated to .gitignore and/or can be detected automatically by better scripts that should be executed anyway. The problems I was talking about are related to .gitignore and can be detected automatically only by adding expensive "optional" steps that most developers hate (and, hence, often skip).

Finally, if .gitignore X is better for developers while .gitignore Y is better for release managers, then we should make .gitignore X official because there are more developers (and relevant development actions) than release managers (and releases).

Let's come back to this discussion when/if the lack of official .gitignore starts causing problems for automation.

squidadm pushed a commit to squidadm/squid that referenced this pull request Aug 19, 2017
* Rename .bzrignore to .gitignore

* Remove .bzrignore from git

* Add extra auto-generated content to .gitignore

* Remove new files
yadij pushed a commit that referenced this pull request Aug 19, 2017
* Maintenance: update snapshot script for git (#24)

* Update snapshot script after git migration
 - Remove unused BZRROOT environment variable
 - Replace tag with branch name

* Update source-maintenance script for git (#26)

* replace bzr calls with git equivalent
* remove obsolete ROOT and PWD variables (git does not support non-recursive file listing)
* add exceptions to ignore more files caught by git than bzr

* Maintenance: remove unused .bzrignore (#29)

* Remove .bzrignore from git

* Prep for 3.5.27

* SourceFormat Enforcement
squidadm added a commit to squidadm/squid that referenced this pull request Aug 19, 2017
* Maintenance: update snapshot script for git (squid-cache#24)

* Update snapshot script after git migration
 - Remove unused BZRROOT environment variable
 - Replace tag with branch name

* Update source-maintenance script for git (squid-cache#26)

* replace bzr calls with git equivalent
* remove obsolete ROOT and PWD variables (git does not support non-recursive file listing)
* add exceptions to ignore more files caught by git than bzr

* Maintenance: remove unused .bzrignore (squid-cache#29)

* Remove .bzrignore from git

* Prep for 3.5.27

* SourceFormat Enforcement
yadij pushed a commit that referenced this pull request Aug 19, 2017
* Prep for 3.5.27 (#48)

* Maintenance: update snapshot script for git (#24)

* Update snapshot script after git migration
 - Remove unused BZRROOT environment variable
 - Replace tag with branch name

* Update source-maintenance script for git (#26)

* replace bzr calls with git equivalent
* remove obsolete ROOT and PWD variables (git does not support non-recursive file listing)
* add exceptions to ignore more files caught by git than bzr

* Maintenance: remove unused .bzrignore (#29)

* Remove .bzrignore from git

* Prep for 3.5.27

* SourceFormat Enforcement

* 3.5.27
@kinkie kinkie mentioned this pull request Jun 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants