While updating the anagram test cases (issue: #413), discussion of handling non-ASCII characters came up, and we decided that we would NOT use non-ASCII characters in those tests.
@NobbZ made a good point:
I do not think, that we should test for anything that is not in ASCII.
- Most languages do not cope very well with codepoints beyond ASCII
- There are letters beyond ASCII that would need another step of normalization. E.g. the German "es-zett" (ß), which is only allowed as lowercase (but there is a capital version available which is allowed to be used on title pages and headlines); on capitalisation it is usually turned into "SS". There has been the long-s, and similar characters in ancient Greek (gamma was one of them AFAIK).
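To make the normalization point concrete, here is a minimal Python sketch of how a typical ASCII-minded letter comparison behaves on "ß". The helper name `same_letters` and the sample words are assumptions chosen for illustration; they are not taken from any track's test suite or reference solution.

```python
def same_letters(a: str, b: str) -> bool:
    """Hypothetical helper: compare lowercased, sorted characters.

    This mirrors the kind of check many anagram solutions use.
    """
    return sorted(a.lower()) == sorted(b.lower())


# Plain ASCII input behaves as expected.
ascii_result = same_letters("Listen", "Silent")

# "ß" uppercases to the two-character string "SS", so the length changes.
upper_eszett = "ß".upper()

# lower() leaves "ß" untouched, while casefold() maps it to "ss".
lower_eszett = "ß".lower()
folded_eszett = "ß".casefold()

# The same word in two spellings: the lower()-based check sees
# different letters, while a casefold()-based comparison does not.
naive_result = same_letters("Straße", "STRASSE")
folded_result = sorted("Straße".casefold()) == sorted("STRASSE".casefold())
```

Here `naive_result` is `False` and `folded_result` is `True`, so the answer depends on which normalization step the solution picks. Other languages may expose different (or no) case-folding facilities, which is part of why such test cases are hard to keep consistent across tracks.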
Isogram (as of 2016-10-31) also has non-ASCII test cases.
Are there other problems that have non-ASCII test cases?
I've created this issue so we can discuss the general policy of whether non-ASCII characters should be used in test cases, and so we have a thread to point to when it comes up again in the future.
Proposal:
All test cases should use only ASCII characters
(unless extended character handling is integral to the problem).