Remove obsolete workaround for slow encoding of unicode characters. by clokep · Pull Request #30 · matrix-org/python-canonicaljson

clokep · 2020-08-07T16:35:47Z

Bump the version of simplejson to a version that properly handles ensure_ascii=False (>3.14.0)
Remove the manual unascii-ifying and use ensure_ascii=False, which is now at least as fast.
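
For readers unfamiliar with the flag, here is a minimal sketch of the difference `ensure_ascii=False` makes. It uses the standard-library `json` module rather than simplejson (an assumption made to keep the example self-contained; both accept an `ensure_ascii` parameter), and the sample value is illustrative only.

```python
import json

value = "café ☃"

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes,
# so the output is pure ASCII.
escaped = json.dumps(value)

# ensure_ascii=False: characters are emitted as-is, so the result only
# needs to be encoded to UTF-8 bytes afterwards.
direct = json.dumps(value, ensure_ascii=False)
utf8_bytes = direct.encode("utf-8")

print(escaped)
print(direct)
print(len(utf8_bytes))
```

Both forms decode back to the same value; only the byte representation differs.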


clokep · 2020-08-07T16:41:33Z

(I feel like the title of this PR isn't really descriptive, by the way. Not sure of a better way to succinctly describe this change.)

clokep · 2020-08-07T17:02:40Z

Requesting review from @richvdh since this was broken off of #29, but feel free to redirect if you'd like!

richvdh · 2020-08-07T17:51:46Z

> (I feel like the title of this PR isn't really descriptive, by the way. Not sure of a better way to succinctly describe this change.)

yeah. `_unascii` didn't really "decode ASCII" - it decoded the `\uXXXX` escape sequences emitted by the JSON encoder, which were certainly not ASCII.

Remove obsolete workaround for slow encoding of unicode characters.

maybe?
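
To illustrate the kind of post-processing described above, here is a rough sketch of expanding `\uXXXX` escapes in ASCII-only JSON text. This is not the removed `_unascii` code: the helper name `expand_unicode_escapes`, the regular expression, and the surrogate handling are all assumptions made for illustration, and it uses the standard-library `json` module.

```python
import json
import re

# Match any backslash escape, so an escaped backslash is consumed as a unit
# and is never mistaken for the start of a \uXXXX sequence.
_ESCAPE = re.compile(r"\\(u[0-9a-fA-F]{4}|.)")


def expand_unicode_escapes(encoded: str) -> str:
    """Hypothetical helper: expand \\uXXXX escapes in ASCII-only JSON text."""

    def replace(match):
        body = match.group(1)
        if len(body) != 5:
            # A short escape such as \n, \" or \\: leave it untouched.
            return match.group(0)
        code = int(body[1:], 16)
        if code < 0x20:
            # Control characters must stay escaped in JSON strings.
            return match.group(0)
        return chr(code)

    expanded = _ESCAPE.sub(replace, encoded)
    # Characters outside the BMP arrive as two escaped surrogates; a UTF-16
    # round trip joins each pair into a single code point.
    return expanded.encode("utf-16", "surrogatepass").decode("utf-16")
```

For the inputs tried here, the result matches what `json.dumps(..., ensure_ascii=False)` produces directly, which is why a workaround of this shape becomes unnecessary once the direct path is fast enough.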

richvdh

code lgtm.

richvdh

hrm, I'm not seeing any tests for encoding of control characters (\u0000 through \u001F). They should come out as "\x00" through "\x1F", iirc. could you add this?

richvdh · 2020-08-07T17:58:22Z

hrm, according to https://matrix.org/docs/spec/appendices#grammar they should (with a few exceptions for "\n", "\r", etc), come out as "\u001F". Either way, can you add some tests to check behaviour is maintained?

clokep · 2020-08-07T18:55:50Z

> hrm, according to https://matrix.org/docs/spec/appendices#grammar they should (with a few exceptions for "\n", "\r", etc), come out as "\u001F". Either way, can you add some tests to check behaviour is maintained?

I added a test that checks from 0x00 to 0x7E (the last printable ASCII character). I might have gone overboard, but I figured why not. 😄

I did check the behavior of these on master first -- I can move them to a separate PR if we want CI to pass on them first.
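
A test over that range could look roughly like the sketch below. It is not the test added in this PR: it exercises the standard-library `json` module rather than canonicaljson, and `SHORT_ESCAPES`, `expected_json`, and `AsciiRangeTest` are hypothetical names. Note that the standard library emits lowercase hex digits in `\u00XX` escapes.

```python
import json
import unittest

# Short escapes that JSON encoders commonly use for specific control characters.
SHORT_ESCAPES = {0x08: "\\b", 0x09: "\\t", 0x0A: "\\n", 0x0C: "\\f", 0x0D: "\\r"}


def expected_json(code: int) -> str:
    """Return the JSON string literal expected for chr(code), 0x00 <= code <= 0x7E."""
    char = chr(code)
    if code in SHORT_ESCAPES:
        body = SHORT_ESCAPES[code]
    elif code < 0x20:
        # Remaining control characters use the generic \u00XX form.
        body = "\\u%04x" % code
    elif char in '"\\':
        # Quote and backslash are escaped with a leading backslash.
        body = "\\" + char
    else:
        # Printable ASCII is emitted unchanged.
        body = char
    return '"' + body + '"'


class AsciiRangeTest(unittest.TestCase):
    def test_0x00_to_0x7e(self):
        for code in range(0x00, 0x7F):
            with self.subTest(code=hex(code)):
                encoded = json.dumps(chr(code), ensure_ascii=False)
                self.assertEqual(encoded, expected_json(code))
                # The encoded form must also decode back to the original character.
                self.assertEqual(json.loads(encoded), chr(code))
```

Saved to a file, this can be run with `python -m unittest`; running it before and after a change like this one is one way to confirm the output is unchanged.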

clokep · 2020-08-07T20:17:06Z

Arg, lint gets me every time on this repo. I think I might just black-ify it.

Switch to directly decoding to UTF-8.

3a7e455

* Bump the version of simplejson to a version that properly handles ensure_ascii=False * Remove the manual unascii-ifying and use ensure_ascii=False, which is now at least as fast.

clokep mentioned this pull request Aug 7, 2020

Add the option to iteratively encode JSON. #29

Merged

Remove more unused code.

7f17bdc

clokep requested a review from richvdh August 7, 2020 17:02

richvdh approved these changes Aug 7, 2020

Comment thread setup.py

richvdh reviewed Aug 7, 2020

Add tests for characters 0x00 - 0x7E.

e98dc2f

clokep changed the title ~~Stop manually decoding ASCII~~ Remove obsolete workaround for slow encoding of unicode characters. Aug 7, 2020

clokep added 2 commits August 7, 2020 15:00

Add a comment to the dependencies.

d95cb64

Lint

0d1f4e7

clokep merged commit 7e16350 into master Aug 10, 2020

clokep deleted the clokep/ascii-fixes branch August 10, 2020 12:39

clokep merged 5 commits into master from clokep/ascii-fixes
