[R] Calling ParquetFileWriter$WriteTable with a non-Table crashes #42240

@amoeba

Describe the bug, including details regarding any error messages, version, and platform.

While comparing against how PyArrow does incremental Parquet file writing, I noticed that you can crash the R session by calling ParquetFileWriter$WriteTable with something other than the Table it expects:

```r
library(arrow)

tf <- tempfile()
fos <- FileOutputStream$create(tf)
schm <- schema(a = int32())
pfw <- ParquetFileWriter$create(sink = fos, schema = schm, ParquetWriterProperties$create(column_names = names(schm)))

# create a batch and crash when writing it
batch <- RecordBatch$create(data.frame(a = 1:10))
pfw$WriteTable(batch, chunk_size = 10)
```

When run, this produces:

```
 *** caught segfault ***
address 0x0, cause 'invalid permissions'

Traceback:
 1: parquet___arrow___FileWriter__WriteTable(self, table, chunk_size)
 2: pfw$WriteTable(batch, chunk_size = 10)
An irrecoverable exception occurred. R is aborting now ...
fish: Job 1, 'Rscript crash.R' terminated by signal SIGSEGV (Address boundary error)
```

The package generally provides more user-friendly wrappers around the R6 classes, but I thought I'd file a bug in case others want to see this fixed.
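Until the method validates its input, one possible workaround is to check or convert the object in R before calling WriteTable. The sketch below is only an illustration: `write_table_checked` is a hypothetical helper, not part of arrow, and it assumes an arrow version that provides `as_arrow_table()`. It also assumes, based on the traceback, that the segfault comes from a non-Table object reaching the C++ binding unchecked.

```r
library(arrow)

# Hypothetical helper (not part of arrow): convert or reject the input in R
# so that only a Table is ever passed to ParquetFileWriter$WriteTable.
write_table_checked <- function(writer, x, chunk_size) {
  if (inherits(x, "RecordBatch")) {
    # Assumption: as_arrow_table() is available in the installed arrow version
    x <- as_arrow_table(x)
  }
  if (!inherits(x, "Table")) {
    stop("`x` must be an arrow Table, not <", class(x)[1], ">", call. = FALSE)
  }
  writer$WriteTable(x, chunk_size = chunk_size)
}

tf <- tempfile()
fos <- FileOutputStream$create(tf)
schm <- schema(a = int32())
pfw <- ParquetFileWriter$create(
  schema = schm,
  sink = fos,
  properties = ParquetWriterProperties$create(column_names = names(schm))
)

# Same batch as in the reproducer, now converted to a Table before writing
batch <- RecordBatch$create(data.frame(a = 1:10))
write_table_checked(pfw, batch, chunk_size = 10)

# Finish the Parquet file and release the output stream
pfw$Close()
fos$close()
```

With this guard, passing any other object (for example a plain data.frame) raises an ordinary R error instead of taking down the session. A fix inside the package would presumably perform a similar check within WriteTable itself.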

Component(s)

R
