Don't count form feed character (`\f` or `^L`) as newline, optionally

In Emacs, the [form feed](https://en.wikipedia.org/wiki/Page_break#Form_feed) `\f` or `^L` (Ctrl+L) character, 0xC in ASCII, is not considered to start a new line. Wikipedia explains:

> The form feed character is sometimes used in plain text files of source code as a delimiter for a page break, or as marker for sections of code. Some editors, in particular [emacs](https://en.wikipedia.org/wiki/Emacs) and [vi](https://en.wikipedia.org/wiki/Vi), have built-in commands to page up/down on the form feed character. This convention is predominantly used in [Lisp](https://en.wikipedia.org/wiki/Lisp_(programming_language)) code, and is also seen in [C](https://en.wikipedia.org/wiki/C_(programming_language)) and [Python](https://en.wikipedia.org/wiki/Python_(programming_language)) source code. [GNU](https://en.wikipedia.org/wiki/GNU) Coding Standards require such form feeds in C.[[2]](https://en.wikipedia.org/wiki/Page_break#cite_note-2) Editors like [Vim](https://en.wikipedia.org/wiki/Vim_(text_editor)) and [Emacs](https://en.wikipedia.org/wiki/Emacs) understand such sections and have shortcuts for moving among them.

In Emacs, a file containing just the three characters `\n\f\n` is considered to have _two_ (2) lines: the first is `\n` and the second is `\f\n`. But codespell counts this file as having _three_ (3) lines.

Recipe to reproduce:

```
$ echo -ne "foo\n\f\nte\n" > /tmp/foo.txt
$ codespell /tmp/foo.txt
/tmp/foo.txt:4: te ==> the, be, we, to
```

If I open `/tmp/foo.txt` in Emacs and try to jump to line 4, I end up on the empty line at the end of the file. I do not go to the line containing the typo, which in Emacs is line 3. This gets worse the more `\n\f\n` sequences there are in a file: each one shifts the reported line number one further from the actual typo.
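To illustrate the off-by-one, here is a small Python sketch that numbers the lines of the same content in two ways. I am assuming (I have not checked the codespell source) that the extra line comes from something like `str.splitlines()`, which treats form feed as a line boundary; splitting on `\n` only matches what Emacs does.

```python
text = "foo\n\f\nte\n"

# str.splitlines() treats form feed (\x0c) as a line boundary,
# so the "\f" produces an extra (empty) line.
universal = text.splitlines()

# Splitting on "\n" only keeps "\f" inside its line, as Emacs does.
# The last piece is the empty string after the final "\n", so drop it.
newline_only = text.split("\n")[:-1]

line_universal = universal.index("te") + 1
line_newline_only = newline_only.index("te") + 1

print(line_universal, line_newline_only)
```

The first number corresponds to what codespell reports in the recipe above, the second to the line Emacs jumps to.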

Would it be possible to add an option to treat `\f` in the way that is expected by Emacs? I'm not sure what it should be called, but something like `--form-feed-no-newline` or `--emacs-form-feed` perhaps.
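As a rough sketch of what such an option could do, the helper below splits text either the `str.splitlines()` way or on `\n`, `\r\n`, and `\r` only. The function name `split_lines` and the parameter `form_feed_is_newline` are hypothetical and only for illustration; this is not codespell's actual code.

```python
import re

# Only the conventional newline sequences; form feed is deliberately absent.
_NEWLINE = re.compile(r"\r\n|\r|\n")


def split_lines(text, form_feed_is_newline=True):
    """Split text into lines (hypothetical helper, not codespell's API)."""
    if form_feed_is_newline:
        # Behaviour that also breaks lines on \f, \v and a few other characters.
        return text.splitlines()
    lines = _NEWLINE.split(text)
    # A trailing newline leaves an empty final piece; drop it so the
    # result has the same shape as splitlines().
    if lines and lines[-1] == "":
        lines.pop()
    return lines


print(len(split_lines("\n\f\n")))                              # three lines
print(len(split_lines("\n\f\n", form_feed_is_newline=False)))  # two lines
```

With the option off, line numbers would agree with Emacs for files that use `\n\f\n` as a page separator.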

This would help tremendously when using codespell on Emacs Lisp source code. There are many, many Lisp files which contain the character sequence `\n\f\n`.

Thanks.
