Enable data subsetting directly from S3 #2

@bsiranosian

Data subsetting when reading directly from S3 does not currently work when implemented like this:

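library(aws.s3)
library(cmapR)  # provides parse_gctx()
object.loc <- "s3://bioinformatics-loyal/processed_methylation_data/HEALTHSPAN/GH40_RRBS/matrices_processed/methylation_filtered.gctx"
# s3read_using() downloads the whole object to a temp file before calling FUN on it
mgct <- s3read_using(FUN = function(x) parse_gctx(x, rid = 1), object = object.loc)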

Instead, the whole file is downloaded to a temp directory, and a portion of it is read from there.

This should be possible as rhdf5 supports read-only access to files in S3: https://www.bioconductor.org/packages/devel/bioc/vignettes/rhdf5/inst/doc/rhdf5_cloud_reading.html

However, I currently hit the error described here and haven't gotten any further: https://support.bioconductor.org/p/9134972/
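
For reference, a minimal sketch of what a direct subset read could look like with rhdf5, assuming the installed build has S3 support enabled. The helpers `s3_uri_to_https()` and `read_gctx_rows_s3()` are hypothetical names introduced only for illustration, and the region is a placeholder, not the bucket's confirmed region. The rhdf5 vignette passes an HTTPS URL rather than an `s3://` URI, so the sketch converts one to the other, and it reads the matrix from the standard GCTX dataset path `0/DATA/0/matrix`. It does not address the error linked above.

```r
# Hypothetical helper: convert "s3://bucket/key" into a virtual-hosted-style
# HTTPS URL, the form used in the rhdf5 cloud-reading vignette.
s3_uri_to_https <- function(uri, region) {
  stopifnot(grepl("^s3://[^/]+/.+", uri))
  path   <- sub("^s3://", "", uri)
  bucket <- sub("/.*$", "", path)      # text before the first "/"
  key    <- sub("^[^/]+/", "", path)   # text after the first "/"
  sprintf("https://%s.s3.%s.amazonaws.com/%s", bucket, region, key)
}

# Hypothetical helper: read only the requested rows of the GCTX data matrix.
# `credentials` is NULL for public objects; for private ones the vignette
# describes a list holding the region, access key id, and secret access key.
read_gctx_rows_s3 <- function(uri, region, rows, credentials = NULL) {
  url <- s3_uri_to_https(uri, region)
  rhdf5::h5read(
    file = url,
    name = "0/DATA/0/matrix",
    index = list(rows, NULL),          # NULL selects every column
    s3 = TRUE,
    s3credentials = credentials
  )
}

# Example call (placeholder region, requires network access and credentials):
# m <- read_gctx_rows_s3(object.loc, region = "us-east-1", rows = 1)
```

This returns a bare matrix rather than a GCT object; row and column ids would still need to be read from `0/META/ROW/id` and `0/META/COL/id` to reproduce what `parse_gctx(x, rid = 1)` gives.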

Labels: enhancement (New feature or request)