Closed
Description
I have been working on Spark integration with Arrow.
I realised that there is no way to use a socket connection with the Arrow stream format. For instance,
I want to do something like:
connStream <- socketConnection(port = 9999, blocking = TRUE, open = "wb")
rdf_slices <- list()  # placeholder: a list of data frames
stream_writer <- NULL
tryCatch({
  for (rdf_slice in rdf_slices) {
    batch <- record_batch(rdf_slice)
    if (is.null(stream_writer)) {
      # It looks like there is no way to use a socket here.
      stream_writer <- RecordBatchStreamWriter(connStream, batch$schema)
    }
    stream_writer$write_batch(batch)
  }
},
finally = {
  if (!is.null(stream_writer)) {
    stream_writer$close()
  }
})

Likewise, I cannot find a way to iterate over the stream batch by batch:

# It looks like there is no way to use a socket here either.
RecordBatchStreamReader(connStream)$batches()

This looks easily possible on the Python side, but it seems to be missing from the R API.
Reporter: Hyukjin Kwon
Assignee: Dewey Dunnington / @paleolimbot
Related issues:
Note: This issue was originally created as ARROW-4512. Please see the migration documentation for further details.