I am using fst version 0.9.8 with R 4.3.2.
I am writing 'data.table' class data frames in a for loop. The tables have different numbers of rows because they are the result of in-silico chemical modification of a list of peptides (protein fragments).
When I write these data tables to csv format using data.table::fwrite (with append = TRUE), all modified peptides are present and written correctly.
When I write them to fst format with compress = 50 or compress = 100, data tables with 10-11 rows (resulting from peptides with 1-2 modifications) are skipped, while the larger ones are written as expected.
Unfortunately, proprietary restrictions do not allow me to share a full example. I can only show the write_fst call, with the uniform_encoding argument left at its default (for speed, as there are millions of such tables to write):
write_fst(dt1, paste0(fname, '-', i, '.fst'), compress = 100)
Here, dt1 is a 'data.table' class data frame residing in memory, fname is a character vector of length 1, and i is the current iteration index. Together, the arguments of paste0 form the unique name of the fst file written to disk. The fst files that do get written have names formatted as expected.
The upstream code is the same in both cases. The only difference at this point is the output format, which is decided by an if statement based on a user setting: if (compress == yes) write "fst" else write "csv".
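Since the real data cannot be shared, a minimal sketch of a reproduction with synthetic data might look like the following. Everything except the write_fst call itself is an assumption made for illustration: make_table, the peptide and mass columns, the row_counts vector, and the temporary output directory are hypothetical stand-ins for the proprietary pipeline. The sketch writes a mix of small and large tables and then compares the expected row counts with what is actually on disk.

```r
library(data.table)
library(fst)

# Hypothetical stand-in for the proprietary peptide tables.
make_table <- function(n_rows) {
  data.table(
    peptide = sprintf("PEP%05d", seq_len(n_rows)),
    mass    = runif(n_rows, 500, 3000)
  )
}

# Write into a fresh temporary directory so old files cannot interfere.
out_dir <- tempfile("fst-repro-")
dir.create(out_dir)
fname <- file.path(out_dir, "peptides")

# Mix of small (10-11 rows) and larger tables, as in the report.
row_counts <- c(10L, 11L, 250L, 10L, 5000L)

for (i in seq_along(row_counts)) {
  dt1 <- make_table(row_counts[i])
  write_fst(dt1, paste0(fname, "-", i, ".fst"), compress = 100)
}

# Compare the expected row counts with what is actually on disk.
paths <- paste0(fname, "-", seq_along(row_counts), ".fst")
rows_on_disk <- vapply(paths, function(p) {
  if (file.exists(p)) nrow(read_fst(p)) else NA_integer_
}, integer(1))

report <- data.table(
  i        = seq_along(row_counts),
  expected = row_counts,
  on_disk  = unname(rows_on_disk)
)
print(report)
```

An NA in the on_disk column would mark a table that never reached the disk. If every row matches here, the skipped tables are more likely explained by something upstream of write_fst in the real loop than by the table size itself.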
Thank you!