Discussion: Non-Ascii test cases. #428

@Insti


While updating the anagram test cases (issue #413), the question of how to handle non-ASCII characters came up, and we decided that we would NOT use non-ASCII characters in those tests.

@NobbZ made a good point:

I do not think that we should test for anything that is not in ASCII.

  1. Most languages do not cope very well with codepoints beyond ASCII.
  2. There are letters beyond ASCII that would need another step of normalization. E.g. the German "es-zett" (ß), which is only allowed as lowercase (there is a capital version available, which is allowed on title pages and in headlines); on capitalisation it is usually turned into "SS". There has been the long s, and similar characters in ancient Greek (gamma was one of them AFAIK).
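To make the second point concrete, here is a minimal Python sketch of what the extra normalization step can look like. The helper names `naive_key` and `folded_key` are hypothetical, chosen only for illustration, and the folding shown is a simplification rather than a complete caseless-matching algorithm. Other languages' standard libraries may behave differently.

```python
import unicodedata


def naive_key(word: str) -> str:
    """Hypothetical helper: case-insensitive key using only str.lower()."""
    return word.lower()


def folded_key(word: str) -> str:
    """Hypothetical helper: Unicode case folding followed by NFC normalization."""
    return unicodedata.normalize("NFC", word.casefold())


# Upper-casing "ß" yields two characters, so the length changes.
print("ß".upper())       # SS
print(len("ß".upper()))  # 2

# lower() leaves "ß" untouched, so the two spellings do not compare equal.
print(naive_key("STRASSE") == naive_key("straße"))    # False

# casefold() maps "ß" to "ss", so the folded keys match.
print(folded_key("STRASSE") == folded_key("straße"))  # True

# The same visible letter can be one codepoint or two.
composed = "\u00e9"     # "é" as a single codepoint
decomposed = "e\u0301"  # "e" followed by a combining acute accent
print(composed == decomposed)                          # False
print(folded_key(composed) == folded_key(decomposed))  # True
```

A solution that only calls `lower()` passes every ASCII test but fails on the inputs above, which is the kind of extra work non-ASCII test cases would demand from each track.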

Isogram (as of 2016-10-31) also has non-ASCII test cases.

Are there other problems that have non-ascii test cases?

I've created this issue so we can discuss the general policy of whether non-ASCII characters should be used in test cases, and so we have a thread to point to when it comes up again in the future.

Proposal:
All test cases should only use ASCII characters
(Unless extended character handling is integral to the problem.)
