How to handle snappy files generated by Trino? #140

Hello,

Since the 0.7.1 release, I can't decompress CSV files generated by Trino. I think the issue is related to Hadoop_snappy. Does anyone know how it can be fixed?

```
import io
from snappy import snappy_formats

csv_file = 'csv_67dba65a.snappy'

def read_file(file_path):
    return open(file_path, 'rb')

decompress_func, read_chunk = snappy_formats.get_decompress_function(
    'auto',
    read_file(csv_file)
)
decompressed_stream = io.BytesIO()
# Decompress the data
decompress_func(
    read_file(csv_file),
    decompressed_stream,
    start_chunk=read_chunk
)
decompressed_stream.seek(0)

print(f"Compressed file: {read_file(csv_file).read()}")
print(f"DeCompressed file: {decompressed_stream.read()}")

```
This code has different outputs based on the version:

- 0.7.0

`Compressed file: b'\x00\x00\x00\x04\x00\x00\x00\x06\x04\x0c"a"\n'`

`DeCompressed file: b'"a"\n"a"\n'`

- 0.7.1
```
  .venv/lib/python3.12/site-packages/snappy/snappy_formats.py", line 64, in get_decompress_function
      decompress_func, read_chunk = guess_format_by_header(fin)

  .venv/lib/python3.12/site-packages/snappy/snappy_formats.py", line 59, in guess_format_by_header
      raise UncompressError("Can't detect archive format")
  snappy.snappy.UncompressError: Can't detect archive format

```
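For anyone trying to reproduce this, it may help to see how the sample bytes above are laid out. The sketch below assumes the usual Hadoop block framing: a 4-byte big-endian uncompressed block length, then one or more chunks, each a 4-byte big-endian compressed length followed by a raw Snappy chunk. The names `decompress_hadoop_snappy` and `raw_snappy_decompress` are hypothetical helpers written for this illustration, not part of python-snappy. The pure-Python raw decoder is only there to keep the example self-contained. In practice you would probably pass the library's own raw decompress function as `raw_decompress`, if your version exposes one.

```python
import struct


def _read_varint(buf, pos):
    """Decode a little-endian base-128 varint; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def raw_snappy_decompress(chunk):
    """Illustrative (slow) pure-Python decoder for one raw Snappy chunk."""
    expected_len, pos = _read_varint(chunk, 0)
    out = bytearray()
    while pos < len(chunk):
        tag = chunk[pos]
        pos += 1
        kind = tag & 0x03
        if kind == 0:  # literal run
            length = tag >> 2
            if length >= 60:  # length is stored in the next 1..4 bytes
                nbytes = length - 59
                length = int.from_bytes(chunk[pos:pos + nbytes], "little")
                pos += nbytes
            length += 1
            out += chunk[pos:pos + length]
            pos += length
            continue
        if kind == 1:  # copy with 11-bit offset
            length = ((tag >> 2) & 0x07) + 4
            offset = ((tag >> 5) << 8) | chunk[pos]
            pos += 1
        else:  # copy with 2-byte or 4-byte offset
            nbytes = 2 if kind == 2 else 4
            length = (tag >> 2) + 1
            offset = int.from_bytes(chunk[pos:pos + nbytes], "little")
            pos += nbytes
        if offset == 0 or offset > len(out):
            raise ValueError("corrupt chunk: bad copy offset")
        for _ in range(length):  # byte by byte, since copies may overlap
            out.append(out[-offset])
    if len(out) != expected_len:
        raise ValueError("corrupt chunk: length mismatch")
    return bytes(out)


def decompress_hadoop_snappy(data, raw_decompress=raw_snappy_decompress):
    """Decode Hadoop-framed Snappy bytes (assumed layout, see lead-in)."""
    out = bytearray()
    pos = 0
    while pos < len(data):
        (block_len,) = struct.unpack_from(">I", data, pos)
        pos += 4
        block_start = len(out)
        # A block may be split across several compressed chunks.
        while len(out) - block_start < block_len:
            (chunk_len,) = struct.unpack_from(">I", data, pos)
            pos += 4
            out += raw_decompress(data[pos:pos + chunk_len])
            pos += chunk_len
    return bytes(out)


# The exact bytes printed as "Compressed file" above.
sample = b'\x00\x00\x00\x04\x00\x00\x00\x06\x04\x0c"a"\n'
print(decompress_hadoop_snappy(sample))  # b'"a"\n'
```

Read this way, the sample is one block that declares 4 uncompressed bytes and holds a single 6-byte chunk, and it decodes to one `"a"\n` line.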


