Apply NFKC normalisation #1

@cmcaine

Otherwise I can fingerprint on diacritic form (precomposed vs. combining characters), ligatures, etc.

I don't know if it also removes the homoglyphs. Might want to look into that.

NFKC does change the appearance of the text a bit if you're using display variants, e.g. blackletter h vs. Latin h, but NFC normalisation permits too many fingerprinting options.

http://unicode.org/reports/tr15/#Canon_Compat_Equivalence
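A rough sketch of the difference, using Python's standard `unicodedata` module. The project's language isn't stated in this issue, so Python is an assumption, and `normalise` is a hypothetical helper name. It suggests that NFKC folds ligatures, combining sequences and font variants, but leaves cross-script homoglyphs (e.g. Cyrillic а vs. Latin a) alone, so those would probably need separate handling.

```python
import unicodedata


def normalise(text: str) -> str:
    """Hypothetical helper: apply NFKC normalisation to the input text."""
    return unicodedata.normalize("NFKC", text)


# Sample inputs that could otherwise carry a fingerprint.
ligature = "\ufb01"        # U+FB01 LATIN SMALL LIGATURE FI
combining = "e\u0301"      # e followed by U+0301 COMBINING ACUTE ACCENT
fraktur_h = "\U0001d525"   # U+1D525 MATHEMATICAL FRAKTUR SMALL H
cyrillic_a = "\u0430"      # U+0430 CYRILLIC SMALL LETTER A (looks like Latin a)

for label, sample in [
    ("ligature", ligature),
    ("combining", combining),
    ("fraktur h", fraktur_h),
    ("cyrillic a", cyrillic_a),
]:
    nfc = unicodedata.normalize("NFC", sample)
    nfkc = normalise(sample)
    # Show code points so that visually identical results can be told apart.
    print(
        label,
        [f"U+{ord(c):04X}" for c in nfc],
        [f"U+{ord(c):04X}" for c in nfkc],
    )
```

NFC only composes the combining sequence; the ligature and the Fraktur h pass through it unchanged, which is the fingerprinting surface mentioned above. Neither form touches the Cyrillic letter.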
