Support string input #21

@ecederstrand

I have a str input stream that I would like to decode. base64.b64decode() handles str input just fine, so I would like to avoid ASCII-encoding the input first, since that would consume the entire stream unless I am very clever about it. I tried this:

>>> from base64 import b64decode
>>> from io import StringIO
>>> from base64io import Base64IO
>>> b64decode('SGVsbG8gZnJvbSB1bmljb2RlIMOmw7jDpQ==')
b'Hello from unicode \xc3\xa6\xc3\xb8\xc3\xa5'
>>> Base64IO(StringIO('SGVsbG8gZnJvbSB1bmljb2RlIMOmw7jDpQ==')).read()
  File "/usr/lib/python3.5/site-packages/base64io/__init__.py", line 276, in read
    if any([char.encode("utf-8") in data for char in string.whitespace]):
  File "/usr/lib/python3.5/site-packages/base64io/__init__.py", line 276, in <listcomp>
    if any([char.encode("utf-8") in data for char in string.whitespace]):
TypeError: 'in <string>' requires string as left operand, not bytes

which fails in https://github.com/aws/base64io-python/blob/master/src/base64io/__init__.py#L276 because the code assumes data is a bytes instance.
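The same error can be reproduced without base64io at all. The sketch below runs the whitespace check from the traceback against a str chunk; the variable names are only illustrative.

```python
import string

# A str chunk, as StringIO.read() would return it.
data = "SGVsbG8="

try:
    # Same expression as in the traceback: bytes needles, str haystack.
    any([char.encode("utf-8") in data for char in string.whitespace])
except TypeError as error:
    message = str(error)
else:
    message = None

print(message)
```

The membership test raises on the very first whitespace character, because Python refuses to search for a bytes value inside a str.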

  1. It strikes me as quite a heavy operation to pass over each piece of data 6 times (the length of string.whitespace) just to test for possible whitespace. Maybe whitespace handling should be configurable?
  2. If I change the line to if any([char in data for char in string.whitespace]): then the example code works fine. So we could check the type of data and run the version that applies.
  3. A similar, small patch is required in _read_additional_data_removing_whitespace().
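To make points 1 and 2 concrete, here is a minimal sketch of a type-aware check that also scans each chunk only once. The helper names contains_whitespace and remove_whitespace are hypothetical and not part of base64io. Using precompiled regular expressions is my own choice here, not something the library does today.

```python
import re

# Hypothetical helpers for illustration; these names are not part of base64io.
# Both patterns match the six characters in string.whitespace.
_WS_STR = re.compile(r"[ \t\n\r\x0b\x0c]")
_WS_BYTES = re.compile(rb"[ \t\n\r\x0b\x0c]")


def contains_whitespace(data):
    """Return True if data (str or bytes) contains ASCII whitespace."""
    pattern = _WS_BYTES if isinstance(data, bytes) else _WS_STR
    # A single regex search replaces one membership test per whitespace char.
    return pattern.search(data) is not None


def remove_whitespace(data):
    """Return data with ASCII whitespace removed, keeping the input type."""
    if isinstance(data, bytes):
        return _WS_BYTES.sub(b"", data)
    return _WS_STR.sub("", data)


print(contains_whitespace("SGVs bG8="))
print(contains_whitespace(b"SGVsbG8="))
print(remove_whitespace(b"SG Vs\nbG8="))
```

Selecting the pattern by the type of data keeps the needle and the haystack the same type, which is what the one-line change in point 2 achieves for str input.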

Any comments?
