Closed
6 changes: 4 additions & 2 deletions sdks/python/apache_beam/io/avroio.py
@@ -52,6 +52,8 @@
from avro import datafile
from avro import schema

from six import binary_type

import apache_beam as beam
from apache_beam.io import filebasedsink
from apache_beam.io import filebasedsource
@@ -309,8 +311,8 @@ def _decompress_bytes(data, codec):

# Compressed data includes a 4-byte CRC32 checksum which we verify.
# We take care to avoid extra copies of data while slicing large objects
# by use of a buffer.
result = snappy.decompress(buffer(data)[:-4])

Review comment (Member):

Have you tested this change? When I ran it, it failed with: TypeError: argument 1 must be string or read-only buffer, not memoryview.

This is because slicing a buffer returns the raw data as a string, whereas slicing a memoryview returns another memoryview object covering that subsection.
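
To illustrate the slicing behaviour described above, here is a minimal sketch written for Python 3 (where `buffer` no longer exists). The sample `data` value is made up for the example. It shows that a memoryview slice is still a memoryview and has to be converted explicitly before being handed to an API that expects a byte string.

```python
# Made-up block: a body followed by 4 trailing bytes standing in for the checksum.
data = b"payload-bytes" + b"\x01\x02\x03\x04"

view = memoryview(data)

# Slicing a memoryview does not copy; it returns another memoryview.
body_view = view[:-4]
assert isinstance(body_view, memoryview)

# An explicit conversion makes exactly one copy into an immutable bytes object.
body = body_view.tobytes()
assert isinstance(body, bytes)

# The trailing 4 bytes can be extracted the same way.
trailer = view[-4:].tobytes()
```

`tobytes()` is used here rather than `binary_type(...)` because, as far as I can tell, on Python 2 `binary_type` is `str`, and calling `str()` on a memoryview returns its repr rather than its contents. That is worth checking against the interpreter versions this change targets.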

# by use of a memoryview.
result = snappy.decompress(binary_type(memoryview(data)[:-4]))
avroio.BinaryDecoder(cStringIO.StringIO(data[-4:])).check_crc32(result)
return result
else:
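
The trailing-checksum layout mentioned in the `_decompress_bytes` comment can be sketched with the standard library alone. This is not the Beam implementation: `zlib` stands in for `snappy` so the example runs without extra dependencies, `pack_block` and `unpack_block` are hypothetical helper names, and the big-endian CRC32 of the uncompressed data is an assumption based on my reading of the Avro snappy codec description.

```python
import struct
import zlib


def pack_block(raw):
    """Hypothetical helper: compress `raw` and append a 4-byte CRC32.

    Assumption: the checksum is a big-endian CRC32 of the uncompressed data.
    """
    crc = zlib.crc32(raw) & 0xFFFFFFFF
    return zlib.compress(raw) + struct.pack(">I", crc)


def unpack_block(block):
    """Hypothetical helper: decompress a block and verify its trailing CRC32."""
    view = memoryview(block)
    # Slicing the memoryview avoids copying the compressed payload.
    result = zlib.decompress(view[:-4])
    (expected,) = struct.unpack(">I", view[-4:])
    if zlib.crc32(result) & 0xFFFFFFFF != expected:
        raise ValueError("Checksum failure")
    return result
```

A round trip through `unpack_block(pack_block(b"hello avro"))` should return the original bytes, and a block whose last byte has been altered should raise `ValueError`.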