Skip to content

Bug 5021: Spelling errors fixed by running scripts/spell-check.sh#578

Closed
mrumph wants to merge 2 commits intosquid-cache:masterfrom
mrumph:bug-5021-spelling-fixes
Closed

Bug 5021: Spelling errors fixed by running scripts/spell-check.sh#578
mrumph wants to merge 2 commits intosquid-cache:masterfrom
mrumph:bug-5021-spelling-fixes

Conversation

@mrumph
Copy link
Contributor

@mrumph mrumph commented Mar 25, 2020

No description provided.

Ran "scripts/spell-check.sh" from the Squid tree root directory.
@squid-prbot
Copy link
Collaborator

Can one of the admins verify this patch?

@rousskov
Copy link
Contributor

Thank you for posting this! Please close your old WIP PR #562 as obsoleted by the recently merged PR and this one.

@rousskov rousskov self-requested a review March 26, 2020 02:53
Copy link
Contributor

@yadij yadij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions inline about whether we can do the related changes better. Mostly notes about things that have been found by the tool but need manual attention - I do not insist on those being fixed here.

my $pid;
if($pid=fork()){
#do parrent staf
#do parent staf
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"staf" is also wrong here. It should be "stuff"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spell-check.sh script will not catch every spelling error.
That is to be expected.
In this case, "staf" is not in our white list so we are not excluding it on purpose.
It must simply that it is not listed as a misspelling in the default dictionary.
So one way to handle this would be to create our own customized dictionary.
But that is beyond the scope of this PR.

bool hasQueue(const CommQuotaQueue*) const; ///< has a given queue
unsigned int quotaEnqueue(int fd); ///< client starts waiting in queue; create the queue if necessary
int quotaPeekFd() const; ///< retuns the next fd reservation
int quotaPeekFd() const; ///< returns the next fd reservation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broken doxygen syntax. This will need a separate manual fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What rule is broken here?
We can solve this by adding "retuns" to our white list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amos meant that returns should be \returns. This problem is not related to this PR (and, IMO, should not be fixed in this PR).

In this change request, Amos is documenting a case where this PR modifies a line but does not fix all the problems on that line. For regular PRs, this would normally mean that the author should fix all those other problems as well. This is not a regular PR. The "author" here is, essentially, a script. That is why Amos correctly said that this problem "will need a separate manual fix".

I suggest interpreting the word "separate" as "not in this PR" :-).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the change from "retuns" to "returns" is not in itself an error?
Then no need to add it to the white list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the change from "retuns" to "returns" is not in itself an error?
Then no need to add it to the white list.

Correct on both counts.

static bool ParseQuotedOrToEol_; ///< The next tokens will be handled as quoted or to_eol token
static bool RecognizeQuotedPair_; ///< The next tokens may contain quoted-pair (\-escaped) characters
static bool PreviewMode_; ///< The next token will not poped from cfg files, will just previewd.
static bool PreviewMode_; ///< The next token will not popped from cfg files, will just previewd.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-English grammar. Does codespell handle grammar at all or just spelling? if not this will need a manual fix later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codespell is only checking for spelling errors.
So "will not poped" should be changed to "will not be popped"?
The spell-check.sh script caught the spelling error but does not have the means to check the grammar error.
We could add "poped" to the white list, but in this case, the spelling fix is an improvement, correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the spelling fix is an improvement, correct?

Correct. I am pretty sure the word "later" in the change request implies "not in this PR" :-).

This situation is very similar to #578 (comment)

* correct.
*
\section WhatsInANode Whats in a node
\section WhatsInANode What's in a node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The informal form of "What is" breaks the doxygen outputs.

There are a few other places using informal language. Can codespell convert these spellings to formal style? "what is", "did not", "cannot", "there is", "that is" etc instead of; what's didn't can't there's that's

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is important. We should not break Doxygen. @mrumph, after this is addressed, please see if you can generate documentation to make sure nothing got broken (by comparing before and after snapshots of [generated] files using diff -ur or similar).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The best we can do at this point is to add words (whats) to the white list.
If we want "whats" to be automatically translated to "what is", then we would need our own customized dictionary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could someone point me to documentation on how to generate documentation for Squid?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not know (or remember) the official answer to your question, but you can start with running doxygen squid.dox from the root repository directory. You may need to install "doxygen" (and "graphviz") first, but that is easy to do on Ubuntu.

There are also doxygen-related commands in doc/Programming-Guide/Makefile, but that Makefile does not work out of the box for me (because its phony targets are not marked as such?). YMMV.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want "whats" to be automatically translated to "what is", then we would need our own customized dictionary.

If that custom dictionary will replace the standard one (rather than just overwrite or add a few entries), then a custom dictionary is not a good solution IMO (and we should leave the desirable goal of expanding contractions alone for the time being). Otherwise, we should consider it. In other words, we are not going to maintain our own full dictionary, but we certainly can maintain a small set of the official dictionary exceptions/additions.

If a custom dictionary is not a good solution (as discussed above), then whitelisting whats to address this change request (assuming What's breaks Doxygen) is a good solution IMO.

fprintf(stderr, PROGRAM_NAME " WARNING, LDAP search error, trying to recover'%s'\n", ldap_err2string(rc));
ldap_msgfree(res);
/* try to connect to the LDAP server agin, maybe my persisten conexion failed. */
/* try to connect to the LDAP server again, maybe my persistent conexion failed. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"conexion" is non-English - it should be "connection". Here and elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not in our white list so it is probably not in the default dictionary.

fprintf(stderr, "Usage: " PROGRAM_NAME " -b basedn -f filter [options] ldap_server_name\n\n");
fprintf(stderr, "\t-A password attribute(REQUIRED)\t\tUser attribute that contains the password\n");
fprintf(stderr, "\t-l password realm delimiter(REQUIRED)\tCharater(s) that devides the password attribute\n\t\t\t\t\t\tin realm and password tokens, default ':' realm:password\n");
fprintf(stderr, "\t-l password realm delimiter(REQUIRED)\tCharater(s) that divides the password attribute\n\t\t\t\t\t\tin realm and password tokens, default ':' realm:password\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Charater" also need a fix. Maybe manual later due to the \t issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is related to the \t issue.
Codespell is comparing the word "tcharater" against its default dictionary.
The \t and \n issue should be patched in codespell itself.
Maybe it is already being addressed?

@yadij yadij added S-waiting-for-author author action is expected (and usually required) S-waiting-for-more-reviewers needs a reviewer and/or a second opinion Squid-5 labels Mar 26, 2020
@yadij
Copy link
Contributor

yadij commented Mar 26, 2020

OK to test

@rousskov
Copy link
Contributor

@mrumph, just to clarify the scope here: This PR is about applying the official spell-check.sh script and nothing else. Consequently,

  1. If the current script breaks Doxygen, we must address that. It is a blocking issue. There should be one more PR fixing this issue, and that (small) PR should go in before this one. I did not know that Doxygen cannot handle single quotes in these contexts, or I would have flagged it earlier. Sorry! I agree with Amos that full versions of those contractions are preferred, but if we cannot have full versions, leaving the corresponding misspellings untouched is also acceptable. That would still require that special PR, of course.

  2. All other issues are not blocking AFAICT. If we can adjust codespell configuration (the whitelist, command-line parameters, etc.) to address some of them, we should do that in the same special PR that is mentioned in item 1. Those that cannot or should not be addressed via codespell configuration, should not be addressed manually in this PR (and probably should not be addressed in the foreseeable future either but that is open for debate). I bet that at least 20% of git-controlled text can (and perhaps should) be rewritten, but that is not what this PR (and this whole effort) is about. We need to stay focused on fixes that can and should be correctly automated with codespell, ignoring all other problems.

@rousskov
Copy link
Contributor

03:42:12 Unable to find image 'squidcache/buildfarm:x86_64-centos-7' locally

@kinkie, please see if you can fix the above Jenkins problem. There are more broken Jenkins tests, but I have not checked whether they are broken for the same reason.

@mrumph
Copy link
Contributor Author

mrumph commented Mar 26, 2020

So the comments show two types of problems:

  1. Spelling errors that were missed.
  2. Spelling fixes that caused harm.

The first case is to be expected and is outside the scope of this PR and PR #565.
The intent of these PRs is to address spelling errors that can be fixed without manual intervention.
There was a lot of discussion on this in the bug report 5021.
Do we want to consider a customized dictionary?
This can be accessed by -D option on codespell.

For case number 2, I can add some items to the white list.
This is what I see so far: whats

@mrumph
Copy link
Contributor Author

mrumph commented Mar 26, 2020

So the only thing that I am seeing to change at this time is to reverse the "whats" to "what's" change.
I could do a commit that reverses this manually and adds "whats" to the white list.
Then run the spell-check.sh again to verify that that change is prevented.
But first do some Doxygen builds to make sure that that is the only build problem.
Is this small change to the white list worth a separate PR?
We may be finding other words that need to be added to the white list as spell-check.sh is applied to other branches or other patches are applied.

@rousskov
Copy link
Contributor

Is this small change to the white list worth a separate PR?

I think so, but I am also OK with making it here if that is your strong preference. I know that creating a dedicated PR for a one-line whitelist change does not really make sense on the surface. For the master branch code, it does not really matter -- the end result will be the same. However, there may be a difference in other contexts:

  1. Those applying selective changes to other branches may want to patch their whitelists without getting 100 conflicts due to all other changes in this PR.
  2. Those studying the effects of applying the spell checking script to their branches may want to compare their diff with this PR commit. Having whitelist change there would just add an exception that would have to handle.
  3. Those studying the evolution of the whitelist a few years from now would prefer a dedicated commit with an explanation.
  4. Those applying other automated changes to the Squid repository may abuse this exception as creating a precedent to do nefarious things in their PRs.

None of the above problems are major, all have workarounds, and a separate PR does add quite a bit of an overhead for you. Thus, I do not insist on a separate PR for the whitelist change in this context.

And yes, future code-changing PRs may include whitelist changes. In those sitiations, a separate PR would definitely be an overkill.

Reverted change from "Whats" to "What's" to prevent Doxygen breakage.
Added "whats" to the white list.

 Changes to be committed:
	modified:   scripts/codespell-whitelist.txt
	modified:   src/clientStream.h
Copy link
Contributor

@yadij yadij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with this going in as-is now. The rest of the previous review items are just wishlist towards perfection.

@rousskov rousskov added M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels and removed S-waiting-for-author author action is expected (and usually required) S-waiting-for-more-reviewers needs a reviewer and/or a second opinion labels Mar 29, 2020
@squid-anubis squid-anubis added the M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels label Mar 29, 2020
@squid-anubis squid-anubis added M-merged https://github.com/measurement-factory/anubis#pull-request-labels and removed M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels labels Mar 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

M-merged https://github.com/measurement-factory/anubis#pull-request-labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants