Bug 5021: Spelling errors fixed by running scripts/spell-check.sh by mrumph · Pull Request #578 · squid-cache/squid

mrumph · 2020-03-25T20:29:37Z

No description provided.

Ran "scripts/spell-check.sh" from the Squid tree root directory.

squid-prbot · 2020-03-25T20:29:40Z

Can one of the admins verify this patch?

rousskov · 2020-03-26T02:53:03Z

Thank you for posting this! Please close your old WIP PR #562 as obsoleted by the recently merged PR and this one.

yadij

Some questions inline about whether we can do the related changes better. Mostly notes about things that have been found by the tool but need manual attention - I do not insist on those being fixed here.

yadij · 2020-03-26T06:39:43Z

scripts/formater.pl

    my $pid;
    if($pid=fork()){
-	#do parrent staf
+	#do parent staf


"staf" is also wrong here. It should be "stuff"

The spell-check.sh script will not catch every spelling error.
That is to be expected.
In this case, "staf" is not in our white list so we are not excluding it on purpose.
It must simply that it is not listed as a misspelling in the default dictionary.
So one way to handle this would be to create our own customized dictionary.
But that is beyond the scope of this PR.

yadij · 2020-03-26T06:44:52Z

src/ClientInfo.h

    bool hasQueue(const CommQuotaQueue*) const;  ///< has a given queue
    unsigned int quotaEnqueue(int fd); ///< client starts waiting in queue; create the queue if necessary
-    int quotaPeekFd() const; ///< retuns the next fd reservation
+    int quotaPeekFd() const; ///< returns the next fd reservation


Broken doxygen syntax. This will need a separate manual fix.

What rule is broken here?
We can solve this by adding "retuns" to our white list.

Amos meant that returns should be \returns. This problem is not related to this PR (and, IMO, should not be fixed in this PR).

In this change request, Amos is documenting a case where this PR modifies a line but does not fix all the problems on that line. For regular PRs, this would normally mean that the author should fix all those other problems as well. This is not a regular PR. The "author" here is, essentially, a script. That is why Amos correctly said that this problem "will need a separate manual fix".

I suggest interpreting the word "separate" as "not in this PR" :-).

So the change from "retuns" to "returns" is not in itself an error?
Then no need to add it to the white list.

So the change from "retuns" to "returns" is not in itself an error?
Then no need to add it to the white list.

Correct on both counts.

yadij · 2020-03-26T06:45:28Z

src/ConfigParser.h

    static bool ParseQuotedOrToEol_; ///< The next tokens will be handled as quoted or to_eol token
    static bool RecognizeQuotedPair_; ///< The next tokens may contain quoted-pair (\-escaped) characters
-    static bool PreviewMode_; ///< The next token will not poped from cfg files, will just previewd.
+    static bool PreviewMode_; ///< The next token will not popped from cfg files, will just previewd.


Non-English grammar. Does codespell handle grammar at all or just spelling? if not this will need a manual fix later.

Codespell is only checking for spelling errors.
So "will not poped" should be changed to "will not be popped"?
The spell-check.sh script caught the spelling error but does not have the means to check the grammar error.
We could add "poped" to the white list, but in this case, the spelling fix is an improvement, correct?

the spelling fix is an improvement, correct?

Correct. I am pretty sure the word "later" in the change request implies "not in this PR" :-).

This situation is very similar to #578 (comment)

yadij · 2020-03-26T06:48:30Z

src/clientStream.h

 *    correct.
 *
- \section WhatsInANode Whats in a node
+ \section WhatsInANode What's in a node


The informal form of "What is" breaks the doxygen outputs.

There are a few other places using informal language. Can codespell convert these spellings to formal style? "what is", "did not", "cannot", "there is", "that is" etc instead of; what's didn't can't there's that's

This is important. We should not break Doxygen. @mrumph, after this is addressed, please see if you can generate documentation to make sure nothing got broken (by comparing before and after snapshots of [generated] files using diff -ur or similar).

The best we can do at this point is to add words (whats) to the white list.
If we want "whats" to be automatically translated to "what is", then we would need our own customized dictionary.

Could someone point me to documentation on how to generate documentation for Squid?

I do not know (or remember) the official answer to your question, but you can start with running doxygen squid.dox from the root repository directory. You may need to install "doxygen" (and "graphviz") first, but that is easy to do on Ubuntu.

There are also doxygen-related commands in doc/Programming-Guide/Makefile, but that Makefile does not work out of the box for me (because its phony targets are not marked as such?). YMMV.

If we want "whats" to be automatically translated to "what is", then we would need our own customized dictionary.

If that custom dictionary will replace the standard one (rather than just overwrite or add a few entries), then a custom dictionary is not a good solution IMO (and we should leave the desirable goal of expanding contractions alone for the time being). Otherwise, we should consider it. In other words, we are not going to maintain our own full dictionary, but we certainly can maintain a small set of the official dictionary exceptions/additions.

If a custom dictionary is not a good solution (as discussed above), then whitelisting whats to address this change request (assuming What's breaks Doxygen) is a good solution IMO.

yadij · 2020-03-26T06:58:24Z

src/auth/digest/LDAP/ldap_backend.cc

                    fprintf(stderr, PROGRAM_NAME " WARNING, LDAP search error, trying to recover'%s'\n", ldap_err2string(rc));
                    ldap_msgfree(res);
-                    /* try to connect to the LDAP server agin, maybe my persisten conexion failed. */
+                    /* try to connect to the LDAP server again, maybe my persistent conexion failed. */


"conexion" is non-English - it should be "connection". Here and elsewhere.

This is not in our white list so it is probably not in the default dictionary.

yadij · 2020-03-26T07:00:01Z

src/auth/digest/eDirectory/ldap_backend.cc

        fprintf(stderr, "Usage: " PROGRAM_NAME " -b basedn -f filter [options] ldap_server_name\n\n");
        fprintf(stderr, "\t-A password attribute(REQUIRED)\t\tUser attribute that contains the password\n");
-        fprintf(stderr, "\t-l password realm delimiter(REQUIRED)\tCharater(s) that devides the password attribute\n\t\t\t\t\t\tin realm and password tokens, default ':' realm:password\n");
+        fprintf(stderr, "\t-l password realm delimiter(REQUIRED)\tCharater(s) that divides the password attribute\n\t\t\t\t\t\tin realm and password tokens, default ':' realm:password\n");


"Charater" also need a fix. Maybe manual later due to the \t issues.

Yes. This is related to the \t issue.
Codespell is comparing the word "tcharater" against its default dictionary.
The \t and \n issue should be patched in codespell itself.
Maybe it is already being addressed?

yadij · 2020-03-26T07:34:14Z

OK to test

rousskov · 2020-03-26T13:50:08Z

@mrumph, just to clarify the scope here: This PR is about applying the official spell-check.sh script and nothing else. Consequently,

If the current script breaks Doxygen, we must address that. It is a blocking issue. There should be one more PR fixing this issue, and that (small) PR should go in before this one. I did not know that Doxygen cannot handle single quotes in these contexts, or I would have flagged it earlier. Sorry! I agree with Amos that full versions of those contractions are preferred, but if we cannot have full versions, leaving the corresponding misspellings untouched is also acceptable. That would still require that special PR, of course.
All other issues are not blocking AFAICT. If we can adjust codespell configuration (the whitelist, command-line parameters, etc.) to address some of them, we should do that in the same special PR that is mentioned in item 1. Those that cannot or should not be addressed via codespell configuration, should not be addressed manually in this PR (and probably should not be addressed in the foreseeable future either but that is open for debate). I bet that at least 20% of git-controlled text can (and perhaps should) be rewritten, but that is not what this PR (and this whole effort) is about. We need to stay focused on fixes that can and should be correctly automated with codespell, ignoring all other problems.

rousskov · 2020-03-26T13:54:54Z

03:42:12 Unable to find image 'squidcache/buildfarm:x86_64-centos-7' locally

@kinkie, please see if you can fix the above Jenkins problem. There are more broken Jenkins tests, but I have not checked whether they are broken for the same reason.

mrumph · 2020-03-26T17:20:05Z

So the comments show two types of problems:

Spelling errors that were missed.
Spelling fixes that caused harm.

The first case is to be expected and is outside the scope of this PR and PR #565.
The intent of these PRs is to address spelling errors that can be fixed without manual intervention.
There was a lot of discussion on this in the bug report 5021.
Do we want to consider a customized dictionary?
This can be accessed by -D option on codespell.

For case number 2, I can add some items to the white list.
This is what I see so far: whats

mrumph · 2020-03-26T19:42:37Z

So the only thing that I am seeing to change at this time is to reverse the "whats" to "what's" change.
I could do a commit that reverses this manually and adds "whats" to the white list.
Then run the spell-check.sh again to verify that that change is prevented.
But first do some Doxygen builds to make sure that that is the only build problem.
Is this small change to the white list worth a separate PR?
We may be finding other words that need to be added to the white list as spell-check.sh is applied to other branches or other patches are applied.

rousskov · 2020-03-26T19:59:32Z

Is this small change to the white list worth a separate PR?

I think so, but I am also OK with making it here if that is your strong preference. I know that creating a dedicated PR for a one-line whitelist change does not really make sense on the surface. For the master branch code, it does not really matter -- the end result will be the same. However, there may be a difference in other contexts:

Those applying selective changes to other branches may want to patch their whitelists without getting 100 conflicts due to all other changes in this PR.
Those studying the effects of applying the spell checking script to their branches may want to compare their diff with this PR commit. Having whitelist change there would just add an exception that would have to handle.
Those studying the evolution of the whitelist a few years from now would prefer a dedicated commit with an explanation.
Those applying other automated changes to the Squid repository may abuse this exception as creating a precedent to do nefarious things in their PRs.

None of the above problems are major, all have workarounds, and a separate PR does add quite a bit of an overhead for you. Thus, I do not insist on a separate PR for the whitelist change in this context.

And yes, future code-changing PRs may include whitelist changes. In those sitiations, a separate PR would definitely be an overkill.

Reverted change from "Whats" to "What's" to prevent Doxygen breakage. Added "whats" to the white list. Changes to be committed: modified: scripts/codespell-whitelist.txt modified: src/clientStream.h

yadij

I'm okay with this going in as-is now. The rest of the previous review items are just wishlist towards perfection.

Bug 5021: Spelling errors fixed by running scripts/spell-check.sh

04f2469

Ran "scripts/spell-check.sh" from the Squid tree root directory.

rousskov self-requested a review March 26, 2020 02:53

yadij requested changes Mar 26, 2020

View reviewed changes

yadij added S-waiting-for-author author action is expected (and usually required) S-waiting-for-more-reviewers needs a reviewer and/or a second opinion Squid-5 labels Mar 26, 2020

yadij mentioned this pull request Mar 26, 2020

WIP: Bug 5021 - Add a script to fix spelling errors with codespell #562

Closed

Bug 5021: Address comments in pull request squid-cache#578

a0a32ef

Reverted change from "Whats" to "What's" to prevent Doxygen breakage. Added "whats" to the white list. Changes to be committed: modified: scripts/codespell-whitelist.txt modified: src/clientStream.h

yadij approved these changes Mar 28, 2020

View reviewed changes

rousskov added M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels and removed S-waiting-for-author author action is expected (and usually required) S-waiting-for-more-reviewers needs a reviewer and/or a second opinion labels Mar 29, 2020

rousskov approved these changes Mar 29, 2020

View reviewed changes

squid-anubis pushed a commit that referenced this pull request Mar 29, 2020

Bug 5021: Spelling errors fixed by running scripts/spell-check.sh (#578)

2f8abb6

squid-anubis added the M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels label Mar 29, 2020

squid-anubis closed this Mar 29, 2020

yadij removed the backport-to-v5 label Feb 20, 2023

Conversation

mrumph commented Mar 25, 2020 • edited by rousskov Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

squid-prbot commented Mar 25, 2020

Uh oh!

rousskov commented Mar 26, 2020

Uh oh!

yadij left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yadij commented Mar 26, 2020

Uh oh!

rousskov commented Mar 26, 2020

Uh oh!

rousskov commented Mar 26, 2020

Uh oh!

mrumph commented Mar 26, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mrumph commented Mar 26, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rousskov commented Mar 26, 2020

Uh oh!

yadij left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

mrumph commented Mar 25, 2020 •

edited by rousskov

Loading

mrumph commented Mar 26, 2020 •

edited

Loading

mrumph commented Mar 26, 2020 •

edited

Loading

yadij left a comment •

edited

Loading