Adding missing unicode_literals imports by EliAndrewC · Pull Request #83 · magfest/sideboard

EliAndrewC · 2017-05-09T23:25:06Z

MAGFest uses Python 3 and so doesn't care about from __future__ import unicode_literals but at my day job we still use Python 2.7 for a couple of legacy codebases and so we have our local builds check for the presence of that line in every Python file in the Sideboard codebase. I've added it in to a few new files where it was missing; I didn't think to check for it when I was reviewing those files since all of my new projects have been on Python 3 for some time.

I also removed the unnecessary Travis line we discussed in the other PR.

RobRuana · 2017-05-09T23:32:11Z

from __future__ import unicode_literals is terrible and should be stricken from the codebase.

I direct your attention to this abomination, among others. The Python 2 standard library frequently expects byte strings and chokes on unicode. Every time you wrap an argument to a standard libary call in str() because of unicode literals, an angel loses its wings.

RobRuana

Won't fix.

RobRuana · 2017-05-09T23:40:44Z

sideboard/sideboard/tests/test_sa.py

Line 246 in 24d9fd6

@declarative_base(name=str('NameOverride'))

RobRuana · 2017-05-09T23:45:01Z

Feel free to merge this, but know you do against my wishes!

Not angry, just disappointed 😢

RobRuana · 2017-05-09T23:48:42Z

$ python2.7
Python 2.7.11 (default, Jan  7 2017, 12:33:29) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import uuid
>>> uuid.uuid5(uuid.NAMESPACE_URL, 'http://google.com')
UUID('2e32ee40-baaa-52e8-8227-a4ced6c95d14')
>>> from __future__ import unicode_literals
>>> uuid.uuid5(uuid.NAMESPACE_URL, 'http://google.com')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ratface/.pyenv/versions/2.7.11/lib/python2.7/uuid.py", line 589, in uuid5
    hash = sha1(namespace.bytes + name).digest()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa7 in position 1: ordinal not in range(128)
>>>

RobRuana

Approved... begrudgingly...

EliAndrewC · 2017-05-09T23:52:14Z

from future import unicode_literals is terrible and should be stricken from the codebase.

Haha, but tell me how you really feel!

FWIW, I've updated several codebases to Python 3 and have found that from __future__ import unicode_literals is the least bad approach out of all of the annoying options available. For reference, the codebases I've migrated to Python 3 include:

Sideboard itself
protlib
ConfigObj
Stripe's Python client library (I'm still shocked that MAGFest was apparently the first Python 3 codebase to use Stripe)

In each of those cases, the annoying tricks we have to do in order to support from __future__ import unicode_literals were worth the tradeoff IMO. And that's taking into account that there are workarounds way more annoying than the one you linked to; my pick for "worst in show" would be this gem:
https://github.com/magfest/sideboard/blob/master/sideboard/server.py#L158

If you're ever in a situation where you have a big legacy Python 2.7 codebase and want to port it to Python 3, I strongly recommend that you start by adding from __future__ import unicode_literals and then getting that working before you just jump right into Python 3.

With all that being said, I'm okay with relaxing the unicode_literals requirement on new files in the Sideboard codebase. At this point there are only a few legacy Sideboard codebases which use Python 2.7 anyway, and while we still need to support them for awhile longer, new Sideboard modules are unlikely to ever be used by legacy applications anyway. All of the Python 2.7 plugins are either in the process of being upgraded to Python 3 or are in maintenance mode and aren't ever going to suddenly start caring about e.g. the new profiler.

I'm hoping that by the end of the year we'll have dropped support for enough of our Python 2.7 plugins that we can pull the trigger on officially deprecating Python 2.7 support in Sideboard and then rip out all of the kludgy hacks like that.

In the meantime, we can either merge this in or I can go into our local Jenkins jobs and disable the check for unicode_literals. I'm on board for either approach.

RobRuana · 2017-05-09T23:54:40Z

Okay, you have still failed to outline the benefits of unicode_literals, other than some hand-wavy "unicode, brah!"

RobRuana · 2017-05-10T00:00:41Z

Also, I reject your appeal to authority regarding the projects you've ported to python 3. Just because you've stabbed yourself in the eye a hundred times doesn't mean that we should all continue stabbing ourselves in the eye.

EliAndrewC · 2017-05-10T00:24:20Z

Okay, you have still failed to outline the benefits of unicode_literals, other than some hand-wavy "unicode, brah!"

Fair enough! Saying "I've ported several codebases and found it helpful" is not the same as actually taking the time to explain why. Which I'm probably going to be a jerk and head home for the night without doing since it's well after 8pm :P

Two things before I leave:

How do you feel about the compromise I proposed above, specifically:

new files added to Sideboard don't get from __future__ import unicode_literals (I'll turn off the Jenkins check for that so our builds can pass without it)
existing files keep it since they've already been tested that way
later this year we'll hopefully be able to drop Python 2.7 support from Sideboard entirely, at which point we can remove it everywhere and also get rid of six

I'm really curious about the naming convention of your usernames, based on the fact that the stacktrace you posted involved the path /Users/ratface - feel free to fill me in on your naming scheme or let it remain a mystery!

RobRuana · 2017-05-10T00:50:49Z

Feel free to put unicode_literals in every file. I have so few issues to be salty about nowadays, I get extra salty about the small things 😄

For what it's worth I have ALSO ported several projects from python 2 to python 3. I've worked on projects that use unicode_literals and projects that don't. I've had far fewer issues on the projects that don't use unicode_literals.

On your next project, try doing it without unicode_literals and see if it isn't easier.

EliAndrewC · 2017-05-10T01:49:27Z

Haha, fair enough.

FWIW, you do have me questioning whether unicode_literals is as helpful as I've been thinking. My first effort didn't use it (protlib) and that was a lot harder than the next one that did, in large part because I kept ending up with bytestrings when I thought I'd have unicode. (Protlib needed to support older versions of Python 2.x that didn't support unicode_literals.) The next time I needed to port a library, I immediately added those imports, found a few problems, and the whole thing went a lot smoother.

But now I'm wondering whether that was just a coincidence, since protlib involves a lot of serialization and I/O, which is more prone to caring about bytes/unicode distinctions. http://www.reactiongifs.com/wp-content/uploads/2013/10/tim-and-eric-mind-blown.gif

On the other hand, I probably still stand by my recommendation to use unicode_literals because it forces you to think about whether you're using bytestrings or unicode strings. I'll put up with some ugly kludges in exchange for knowing that the author gave that some thought. Though if a codebase has enough unit tests then I don't really care either way. (I'm sure one of these days I'll have the pleasure to work on a codebase with enough unit tests for that :P)

EliAndrewC added 2 commits May 9, 2017 19:15

added unicode_literals import to files missing it

f5722c1

removed unnecessary dependency line in travis config

a01ff76

RobRuana requested changes May 9, 2017

View reviewed changes

RobRuana approved these changes May 9, 2017

View reviewed changes

RobRuana merged commit b13e6c2 into master May 10, 2017

RobRuana deleted the public-resync-2017-05-09-2 branch May 10, 2017 02:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adding missing unicode_literals imports#83

Adding missing unicode_literals imports#83
RobRuana merged 2 commits intomasterfrom
public-resync-2017-05-09-2

EliAndrewC commented May 9, 2017

Uh oh!

RobRuana commented May 9, 2017

Uh oh!

RobRuana left a comment

Uh oh!

RobRuana commented May 9, 2017

Uh oh!

RobRuana commented May 9, 2017

Uh oh!

RobRuana commented May 9, 2017

Uh oh!

RobRuana left a comment

Uh oh!

EliAndrewC commented May 9, 2017

Uh oh!

RobRuana commented May 9, 2017

Uh oh!

RobRuana commented May 10, 2017

Uh oh!

EliAndrewC commented May 10, 2017

Uh oh!

RobRuana commented May 10, 2017

Uh oh!

EliAndrewC commented May 10, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

EliAndrewC commented May 9, 2017

Uh oh!

RobRuana commented May 9, 2017

Uh oh!

RobRuana left a comment

Choose a reason for hiding this comment

Uh oh!

RobRuana commented May 9, 2017

Uh oh!

RobRuana commented May 9, 2017

Uh oh!

RobRuana commented May 9, 2017

Uh oh!

RobRuana left a comment

Choose a reason for hiding this comment

Uh oh!

EliAndrewC commented May 9, 2017

Uh oh!

RobRuana commented May 9, 2017

Uh oh!

RobRuana commented May 10, 2017

Uh oh!

EliAndrewC commented May 10, 2017

Uh oh!

RobRuana commented May 10, 2017

Uh oh!

EliAndrewC commented May 10, 2017

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants