Allow to set error handler for decoding errors #1314
Proof of concept: the error handler is currently only a parameter of the internal dump_escaped function; it is not yet exposed through the dump function.
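For readers who want a feel for what such an error handler could look like, here is a minimal, self-contained sketch. It is not the code from this PR: the names `on_error`, `Step`, `scan`, `sanitize` and `example` are hypothetical, and the three policies are an assumption modelled on Python's `strict`, `replace` and `ignore` codec error handlers. On an ill-formed sequence the sketch either throws, substitutes one U+FFFD per maximal ill-formed subpart, or drops the offending bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical policy enum; the names are illustrative, not the library's API.
enum class on_error { strict, replace, ignore };

// Result of scanning at one position: bytes to consume and whether they are well-formed.
struct Step { std::size_t len; bool ok; };

// Scans one UTF-8 sequence starting at s[i] (requires i < s.size()).
// If it is ill-formed, len is the length of the maximal ill-formed subpart (at least 1).
Step scan(const std::string& s, std::size_t i) {
    const auto byte = [&](std::size_t k) { return static_cast<unsigned char>(s[k]); };
    const unsigned char lead = byte(i);
    if (lead < 0x80) return {1, true};            // ASCII
    std::size_t need = 0;                         // continuation bytes expected
    unsigned char lo = 0x80, hi = 0xBF;           // allowed range for the second byte
    if (lead >= 0xC2 && lead <= 0xDF) {
        need = 1;
    } else if (lead >= 0xE0 && lead <= 0xEF) {
        need = 2;
        if (lead == 0xE0) lo = 0xA0;              // reject overlong encodings
        if (lead == 0xED) hi = 0x9F;              // reject UTF-16 surrogates
    } else if (lead >= 0xF0 && lead <= 0xF4) {
        need = 3;
        if (lead == 0xF0) lo = 0x90;              // reject overlong encodings
        if (lead == 0xF4) hi = 0x8F;              // reject code points above U+10FFFF
    } else {
        return {1, false};                        // 0x80..0xC1 and 0xF5..0xFF never start a sequence
    }
    for (std::size_t k = 1; k <= need; ++k) {
        if (i + k >= s.size()) return {k, false}; // input ends mid-sequence
        const unsigned char c = byte(i + k);
        if (c < lo || c > hi) return {k, false};  // bad byte is not consumed
        lo = 0x80; hi = 0xBF;                     // later bytes use the generic range
    }
    return {need + 1, true};
}

// Copies well-formed UTF-8 through and applies the chosen policy to everything else.
std::string sanitize(const std::string& in, on_error handler) {
    std::string out;
    for (std::size_t i = 0; i < in.size();) {
        const Step st = scan(in, i);
        if (st.ok) {
            out.append(in, i, st.len);
        } else {
            switch (handler) {
                case on_error::strict:
                    throw std::invalid_argument("invalid UTF-8 byte at index " + std::to_string(i));
                case on_error::replace:
                    out += "\xEF\xBF\xBD";        // U+FFFD encoded as UTF-8
                    break;
                case on_error::ignore:
                    break;                        // silently drop the ill-formed bytes
            }
        }
        i += st.len;
    }
    return out;
}

// Usage example: a truncated three-byte sequence between 'a' and 'z'.
void example() {
    const std::string bad = std::string("a\xE2\x82") + "z";
    assert(sanitize(bad, on_error::replace) == std::string("a\xEF\xBF\xBD") + "z");
    assert(sanitize(bad, on_error::ignore) == "az");
    bool threw = false;
    try { sanitize(bad, on_error::strict); } catch (const std::invalid_argument&) { threw = true; }
    assert(threw);
}
```

Keeping the policy as a plain enum parameter means the strict behaviour can remain the default, with the more lenient modes as an opt-in.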
Test every prefix of Unicode sequences against the different dump functions.
Looks great!! Out of curiosity, I have added some tests for the "correct" number of replacement characters (as in Unicode 11, Section 3.9 -- U+FFFD Substitution of Maximal Subparts). All tests pass. Great work!
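To make the "correct number of replacement characters" concrete, here is a small standalone sketch that counts how many U+FFFD a maximal-subpart policy would emit for a byte string. The function names `count_replacements` and `check_counts` are hypothetical and the inputs are illustrative; this is not the test code added in this PR. The idea is that a lead byte together with every following byte that can still continue a well-formed sequence forms one ill-formed subpart and is replaced by a single U+FFFD, while the first byte that cannot continue it is examined again on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the U+FFFD characters a "maximal subpart" replacement policy would produce.
std::size_t count_replacements(const std::string& s) {
    std::size_t count = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i++]);
        if (lead < 0x80) continue;                     // ASCII is always fine
        std::size_t need = 0;                          // continuation bytes still expected
        unsigned char lo = 0x80, hi = 0xBF;            // allowed range for the next byte
        if (lead >= 0xC2 && lead <= 0xDF) {
            need = 1;
        } else if (lead >= 0xE0 && lead <= 0xEF) {
            need = 2;
            if (lead == 0xE0) lo = 0xA0;               // overlong forms
            if (lead == 0xED) hi = 0x9F;               // surrogates
        } else if (lead >= 0xF0 && lead <= 0xF4) {
            need = 3;
            if (lead == 0xF0) lo = 0x90;               // overlong forms
            if (lead == 0xF4) hi = 0x8F;               // above U+10FFFF
        } else {
            ++count;                                   // byte can never start a sequence
            continue;
        }
        for (; need > 0; --need) {
            if (i == s.size()) { ++count; break; }     // truncated at end of input
            const unsigned char c = static_cast<unsigned char>(s[i]);
            if (c < lo || c > hi) { ++count; break; }  // c is left for the next iteration
            ++i;
            lo = 0x80; hi = 0xBF;
        }
    }
    return count;
}

// Illustrative checks covering truncated, overlong and surrogate inputs.
void check_counts() {
    // Four truncated sequences followed by 'A': one replacement each.
    assert(count_replacements("\xE1\x80\xE2\xF0\x91\x92\xF1\xBF\x41") == 4);
    // Overlong (non-shortest) forms: every byte is replaced individually.
    assert(count_replacements("\xC0\xAF\xE0\x80\xBF\xF0\x81\x82\x41") == 8);
    // An encoded surrogate: three bytes, three replacements.
    assert(count_replacements("\xED\xA0\x80") == 3);
}
```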
@abolz Thanks for the tests, I shall add them to the test suite. FYI: I used the library's Unicode test suite to systematically create 7.5 million valid and invalid byte sequences and compared the dump outputs with those of Python. Good to know that I also covered those from the Unicode spec.
niklas88 left a comment
LGTM apart from a typo. I must say, however, that I'm not really familiar with the test macros used, e.g., in unit-unicode.cpp (in particular CAPTURE()).
Thanks everyone!
Fixes #1198