Skip to content

Adding missing unicode_literals imports#83

Merged
RobRuana merged 2 commits intomasterfrom
public-resync-2017-05-09-2
May 10, 2017
Merged

Adding missing unicode_literals imports#83
RobRuana merged 2 commits intomasterfrom
public-resync-2017-05-09-2

Conversation

@EliAndrewC
Copy link
Copy Markdown
Contributor

MAGFest uses Python 3 and so doesn't care about from __future__ import unicode_literals but at my day job we still use Python 2.7 for a couple of legacy codebases and so we have our local builds check for the presence of that line in every Python file in the Sideboard codebase. I've added it in to a few new files where it was missing; I didn't think to check for it when I was reviewing those files since all of my new projects have been on Python 3 for some time.

I also removed the unnecessary Travis line we discussed in the other PR.

@RobRuana
Copy link
Copy Markdown
Contributor

RobRuana commented May 9, 2017

from __future__ import unicode_literals is terrible and should be stricken from the codebase.

I direct your attention to this abomination, among others. The Python 2 standard library frequently expects byte strings and chokes on unicode. Every time you wrap an argument to a standard libary call in str() because of unicode literals, an angel loses its wings.

Copy link
Copy Markdown
Contributor

@RobRuana RobRuana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't fix.

@RobRuana
Copy link
Copy Markdown
Contributor

RobRuana commented May 9, 2017

@declarative_base(name=str('NameOverride'))

@RobRuana
Copy link
Copy Markdown
Contributor

RobRuana commented May 9, 2017

Feel free to merge this, but know you do against my wishes!

Not angry, just disappointed 😢

@RobRuana
Copy link
Copy Markdown
Contributor

RobRuana commented May 9, 2017

$ python2.7
Python 2.7.11 (default, Jan  7 2017, 12:33:29) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import uuid
>>> uuid.uuid5(uuid.NAMESPACE_URL, 'http://google.com')
UUID('2e32ee40-baaa-52e8-8227-a4ced6c95d14')
>>> from __future__ import unicode_literals
>>> uuid.uuid5(uuid.NAMESPACE_URL, 'http://google.com')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ratface/.pyenv/versions/2.7.11/lib/python2.7/uuid.py", line 589, in uuid5
    hash = sha1(namespace.bytes + name).digest()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa7 in position 1: ordinal not in range(128)
>>> 

Copy link
Copy Markdown
Contributor

@RobRuana RobRuana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved... begrudgingly...

@EliAndrewC
Copy link
Copy Markdown
Contributor Author

from future import unicode_literals is terrible and should be stricken from the codebase.

Haha, but tell me how you really feel!

FWIW, I've updated several codebases to Python 3 and have found that from __future__ import unicode_literals is the least bad approach out of all of the annoying options available. For reference, the codebases I've migrated to Python 3 include:

  • Sideboard itself
  • protlib
  • ConfigObj
  • Stripe's Python client library (I'm still shocked that MAGFest was apparently the first Python 3 codebase to use Stripe)

In each of those cases, the annoying tricks we have to do in order to support from __future__ import unicode_literals were worth the tradeoff IMO. And that's taking into account that there are workarounds way more annoying than the one you linked to; my pick for "worst in show" would be this gem:
https://github.com/magfest/sideboard/blob/master/sideboard/server.py#L158

If you're ever in a situation where you have a big legacy Python 2.7 codebase and want to port it to Python 3, I strongly recommend that you start by adding from __future__ import unicode_literals and then getting that working before you just jump right into Python 3.

With all that being said, I'm okay with relaxing the unicode_literals requirement on new files in the Sideboard codebase. At this point there are only a few legacy Sideboard codebases which use Python 2.7 anyway, and while we still need to support them for awhile longer, new Sideboard modules are unlikely to ever be used by legacy applications anyway. All of the Python 2.7 plugins are either in the process of being upgraded to Python 3 or are in maintenance mode and aren't ever going to suddenly start caring about e.g. the new profiler.

I'm hoping that by the end of the year we'll have dropped support for enough of our Python 2.7 plugins that we can pull the trigger on officially deprecating Python 2.7 support in Sideboard and then rip out all of the kludgy hacks like that.

In the meantime, we can either merge this in or I can go into our local Jenkins jobs and disable the check for unicode_literals. I'm on board for either approach.

@RobRuana
Copy link
Copy Markdown
Contributor

RobRuana commented May 9, 2017

Okay, you have still failed to outline the benefits of unicode_literals, other than some hand-wavy "unicode, brah!"

@RobRuana
Copy link
Copy Markdown
Contributor

Also, I reject your appeal to authority regarding the projects you've ported to python 3. Just because you've stabbed yourself in the eye a hundred times doesn't mean that we should all continue stabbing ourselves in the eye.

@EliAndrewC
Copy link
Copy Markdown
Contributor Author

Okay, you have still failed to outline the benefits of unicode_literals, other than some hand-wavy "unicode, brah!"

Fair enough! Saying "I've ported several codebases and found it helpful" is not the same as actually taking the time to explain why. Which I'm probably going to be a jerk and head home for the night without doing since it's well after 8pm :P

Two things before I leave:

  1. How do you feel about the compromise I proposed above, specifically:
  • new files added to Sideboard don't get from __future__ import unicode_literals (I'll turn off the Jenkins check for that so our builds can pass without it)
  • existing files keep it since they've already been tested that way
  • later this year we'll hopefully be able to drop Python 2.7 support from Sideboard entirely, at which point we can remove it everywhere and also get rid of six
  1. I'm really curious about the naming convention of your usernames, based on the fact that the stacktrace you posted involved the path /Users/ratface - feel free to fill me in on your naming scheme or let it remain a mystery!

@RobRuana
Copy link
Copy Markdown
Contributor

Feel free to put unicode_literals in every file. I have so few issues to be salty about nowadays, I get extra salty about the small things 😄

For what it's worth I have ALSO ported several projects from python 2 to python 3. I've worked on projects that use unicode_literals and projects that don't. I've had far fewer issues on the projects that don't use unicode_literals.

On your next project, try doing it without unicode_literals and see if it isn't easier.

@EliAndrewC
Copy link
Copy Markdown
Contributor Author

Haha, fair enough.

FWIW, you do have me questioning whether unicode_literals is as helpful as I've been thinking. My first effort didn't use it (protlib) and that was a lot harder than the next one that did, in large part because I kept ending up with bytestrings when I thought I'd have unicode. (Protlib needed to support older versions of Python 2.x that didn't support unicode_literals.) The next time I needed to port a library, I immediately added those imports, found a few problems, and the whole thing went a lot smoother.

But now I'm wondering whether that was just a coincidence, since protlib involves a lot of serialization and I/O, which is more prone to caring about bytes/unicode distinctions. http://www.reactiongifs.com/wp-content/uploads/2013/10/tim-and-eric-mind-blown.gif

On the other hand, I probably still stand by my recommendation to use unicode_literals because it forces you to think about whether you're using bytestrings or unicode strings. I'll put up with some ugly kludges in exchange for knowing that the author gave that some thought. Though if a codebase has enough unit tests then I don't really care either way. (I'm sure one of these days I'll have the pleasure to work on a codebase with enough unit tests for that :P)

@RobRuana RobRuana merged commit b13e6c2 into master May 10, 2017
@RobRuana RobRuana deleted the public-resync-2017-05-09-2 branch May 10, 2017 02:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants