Force the tokenizer to load files in utf-8 by damphyr · Pull Request #20 · arashm/PoParser

damphyr · 2017-06-22T14:31:39Z

This comes about after trying to use PoParser on the Windows command line which has codepage 850 as its default encoding.

Most modern *ix will set LC_ALL et.al. to a reasonable UTF-* encoding thus you won't notice that reading a file with File.open(bla,'r') will just try to use the terminal's encoding and thus fail miserably and spectacularly.

Given that .po files are usually meant for internationalisation it is a safe assumption that the contents will not be limited to the old 850 codepage and that UTF-8 is a reasonable default.

This is an emergency patch to make it usable on Windows. To fix it permanently I would restructure the library to not do any file IO interactions, instead to expect the content of the .po file as a string.
You then let the user decide how to load the file. Alternately, provide a few convenience methods for specific encodings (e.g. PoParser.load_utf8(po_file) or some such)

coveralls · 2017-06-22T14:34:19Z

Coverage remained the same at 97.872% when pulling 85879d5 on damphyr:utf_handling into 848a085 on arashm:master.

arashm · 2017-06-22T14:58:11Z

Thank you! 👍
I checked and this didn't fail on any of my PO files, so it should be harmless to add this. I'm agreed on your point. I will work on a better API for next version so we accept strings by default instead of a path.

arashm · 2017-06-29T17:38:49Z

On version 3.2.0 I deprecated this behavior and parse now accepts content of a PO file as string. I also introduced parse_file to be backport compatible.

force the tokenizer to load files in utf-8

85879d5

arashm merged commit 01b9d3d into arashm:master Jun 22, 2017

damphyr deleted the utf_handling branch June 22, 2017 15:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Force the tokenizer to load files in utf-8#20

Force the tokenizer to load files in utf-8#20
arashm merged 1 commit intoarashm:masterfrom
damphyr:utf_handling

damphyr commented Jun 22, 2017 •

edited

Loading

Uh oh!

coveralls commented Jun 22, 2017 •

edited

Loading

Uh oh!

arashm commented Jun 22, 2017 •

edited

Loading

Uh oh!

arashm commented Jun 29, 2017 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

damphyr commented Jun 22, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coveralls commented Jun 22, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arashm commented Jun 22, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arashm commented Jun 29, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

damphyr commented Jun 22, 2017 •

edited

Loading

coveralls commented Jun 22, 2017 •

edited

Loading

arashm commented Jun 22, 2017 •

edited

Loading

arashm commented Jun 29, 2017 •

edited

Loading