@esainane
Contributor

This expands the set of definite text files and lets Linux users check out files with their native line endings without interfering with Windows users' ability to do the same. It also ensures that files in the index consistently use LF (git can automatically convert them to CRLF on checkout for platforms that want this), ensures text files end with a trailing newline, encodes files as UTF-8 where possible, and strips trailing whitespace from lines that have other content.

  • 4 file extensions are added to the list of definite text files, .h, .glsl, .fs and .json, joining the existing list of .lua, .txt, and .tdf.
  • 53 files were previously checked in as CRLF without conversion and are now converted to LF in the index. Windows users can still check out as CRLF; git handles the conversion automatically.
  • 509 files were previously missing a trailing newline.
  • 6 files were encoded in iso-8859-1 and could be trivially converted to UTF-8.
  • sounds/music/license.txt appeared to have been misencoded at some point - I've replaced what appears from context to originally have been a hyphen with an ASCII hyphen.
  • 757 files previously contained trailing whitespace on lines which had other content.

Not touched:

  • Text files in LuaUI/Fonts. These contain literal bytes between \x00 and \xFF (0 to 255), so their "encoding" is left as binary.
  • Encoding of LuaUI/Widgets/cmd_area_attack_tweak.lua and lups/headers/tablebin.lua, detected as 'unknown-8bit'. The former file has binary bytes as part of the widget description; I suspect the correct solution is adjusting it to use escaped codes, similar to what e.g. Global Build Command uses, but that is probably best discussed in a follow-up PR. The latter file ends with CHILLCODE\x99 in a comment; I have no idea what this is supposed to indicate or if there is an alternative we can use that leaves files friendlier to external tooling.
  • Lines consisting entirely of whitespace. Working out the correct level of indentation and adjusting it is going to be slightly more involved. For example, many lines which consist entirely of whitespace but are not meant to have any indentation at all are still in the index.

Overall, this makes for a much cleaner working tree which plays nicely with external tools.

eol was forcing CRLF on all .lua, .tdf and .txt files, even on Linux
systems.

The standard setting for line-ending conversion is core.autocrlf. On
Linux it defaults to false; on Windows the installer defaults to true,
though this can be changed, for example when git is installed on a
Windows VM whose underlying filesystem is shared with Linux.

Given that .h, .glsl, .fs, and .json files were not listed and the
Windows developers were seemingly able to work with them without
complaint, presumably they already have this configured correctly (it
being the default), and these settings were only serving as an annoyance
for Linux users.

Linux users may want to double check that they're at least set to use
core.autocrlf input, and possibly re-checkout files, run dos2unix, or
otherwise make sure that they have native line endings in their working
tree going forward.
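
As a rough sketch of how a Linux user might check this, assuming a git recent enough to support `git ls-files --eol`: the helper name `worktree_crlf_paths` below is made up for illustration and is not part of this PR. It only filters the listing that git prints; the commands that touch the repository are shown as commented usage.

```shell
# Hypothetical helper: reads `git ls-files --eol` output on stdin and prints
# the paths whose working-tree copy has CRLF line endings.
# Each input line looks like: "i/lf    w/crlf  attr/text eol=lf<TAB>path".
worktree_crlf_paths() {
  awk -F '\t' '$1 ~ /(^| )w\/crlf( |$)/ { print $2 }'
}

# Usage, from the repository root:
#   git config --get core.autocrlf            # show the current value, if set
#   git config --global core.autocrlf input   # convert CRLF to LF on commit only
#   git ls-files --eol | worktree_crlf_paths  # files that are still CRLF on disk
```

Any path this prints is a candidate for a fresh checkout or dos2unix.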

This ensures that no CRLF line endings are checked into the index.

This process can be automated by the following:

git add --renormalize .
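
One way to confirm the result afterwards, sketched under the assumption that `git ls-files --eol` is available: count the entries whose index copy is reported as CRLF or mixed. The function name `count_index_crlf` is hypothetical.

```shell
# Hypothetical check: reads `git ls-files --eol` output on stdin and counts
# the entries whose index copy has CRLF or mixed line endings.
count_index_crlf() {
  grep -Ec '^i/(crlf|mixed)' || true   # grep -c exits 1 when the count is 0
}

# Usage, from the repository root:
#   git ls-files --eol | count_index_crlf
```

A count of 0 would suggest that nothing git treats as text is still stored with CRLF; files git considers binary are reported as `i/-text` and are not counted.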

Adding the missing trailing newlines can be automated by the following:

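# Append a newline to every tracked text file that lacks a trailing one.
while IFS= read -r f; do
  # diff flags a missing final newline with "\ No newline at end of file".
  if diff /dev/null "$f" | tail -1 | grep '^\\ No newline' > /dev/null; then
    echo >> "$f";
  fi;
done < <(git ls-files '*.lua' '*.tdf' '*.h' '*.glsl' '*.fs' '*.json' '*.txt')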
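
For comparison, the same test can be sketched without diff by looking at the last byte of the file. This is an alternative, not the method used for this PR, and `lacks_trailing_newline` is a made-up name. It relies on command substitution stripping a trailing newline, so the captured byte is empty exactly when the file already ends in one.

```shell
# Hypothetical helper: succeeds when the file is non-empty and its last byte
# is not a newline. Empty files are left alone.
lacks_trailing_newline() {
  [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]
}

# Usage:
#   git ls-files '*.lua' '*.txt' | while IFS= read -r f; do
#     if lacks_trailing_newline "$f"; then echo >> "$f"; fi
#   done
```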

Converting the iso-8859-1 files to UTF-8 can be automated by the following:

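# Convert each file that file(1) reports as iso-8859-1 to UTF-8 in place.
while IFS= read -r f; do
  iconv -f iso-8859-1 -t utf-8 -o "$f.tmp" "$f";
  mv -f "$f.tmp" "$f";
done < <(
  # Print "path: type; charset=..." for every text file outside LuaUI/Fonts.
  while IFS= read -r f; do
    file -i "$f";
  done < <(
    git ls-files '*.lua' '*.tdf' '*.h' '*.glsl' '*.fs' '*.json' '*.txt' | grep -v ^LuaUI/Fonts
  ) |
  grep -v -e 'us-ascii$' -e 'utf-8$' | grep 'iso-8859-1$' | cut -f1 -d:
)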

Some components are redundant, but are useful to retain for inspectability.
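
To double check the outcome, a small validity test along these lines could be used. It is only a sketch: `is_valid_utf8` is a hypothetical name, and it assumes an iconv that exits non-zero on an invalid input sequence. Plain ASCII files pass as well, since ASCII is a subset of UTF-8.

```shell
# Hypothetical helper: succeeds when the file decodes cleanly as UTF-8.
is_valid_utf8() {
  iconv -f utf-8 -t utf-8 "$1" > /dev/null 2>&1
}

# Usage:
#   git ls-files '*.lua' '*.txt' | while IFS= read -r f; do
#     is_valid_utf8 "$f" || echo "not UTF-8: $f"
#   done
```

The files deliberately left alone, such as those under LuaUI/Fonts, would be expected to show up in such a report.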

I'm not sure which encoding this was meant to be originally, and file(1)
doesn't have any suggestions either.

This applies only to trailing whitespace after real content. Lines consisting
entirely of whitespace have been left untouched.

This process can be automated by the following:

while IFS= read -r f; do sed -i 's/\([^\t ]\)[\t ]*$/\1/' "$f"; done < <(git ls-files '*.lua' '*.tdf' '*.h' '*.glsl' '*.fs' '*.json' '*.txt')
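
To illustrate what the substitution does and does not touch, here is a stdin-to-stdout variant. It is a sketch rather than the exact command above: it uses the POSIX class `[[:blank:]]` (space and tab) instead of `\t`, and the name `strip_trailing_ws` is made up.

```shell
# Hypothetical filter: removes spaces and tabs that follow the last
# non-blank character of a line. Lines made up only of blanks are unchanged.
strip_trailing_ws() {
  sed 's/\([^[:blank:]]\)[[:blank:]]*$/\1/'
}

# Usage:
#   strip_trailing_ws < some_widget.lua > some_widget.lua.tmp
```

Because the pattern requires a non-blank character before the trailing run, a line containing only a tab passes through as-is, which matches the "not touched" note in the description.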
@GoogleFrog
Contributor

bos files were changed, but I suppose it is ok. The chili changes should be applied to Chobby immediately.

@sprunk
Member

sprunk commented Aug 16, 2019

I don't know why so many files have changes, and pushing elsewhere seems to do nothing.
