Having a key argument for fread is quite nice. It helps cut down on code like:
DT = fread('path.csv')
setkey(DT, ID)
It is also robust to renaming DT over time, since there is no need to track down every line where DT is used.
Since this wasn't a deep change to how fread works, and just entails running setkeyv right before returning ans, it would be easy to offer an index argument with similar goals.
DT = fread('path.csv', key = 'ID')
setindex(DT, ID2)
# could be streamlined as
DT = fread('path.csv', key = 'ID', index = 'ID2')
The only roadblock to implementing this right away is a small API issue: should we allow multiple indices to be set, and if so, what's the proper syntax? Most natural to me is:
- if index is atomic with length 1, split it on commas with strsplit (as is done for key);
- if index is atomic with length >1, treat it as a single multi-column index (again, like key);
- if index is a list, use an *apply function to run the logic of the first two cases on each element.
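To make the proposed rules concrete, here is a rough base-R sketch of how the argument could be normalised before any index is set. The helper name parse_index_arg is made up for illustration and is not part of data.table; it only turns index into a list with one character vector per index, which could then be handed to setindexv one element at a time.

```r
# Hypothetical helper (not part of data.table): normalise an `index`
# argument into a list holding one character vector per index.
parse_index_arg <- function(index) {
  # Handle a single atomic element according to the first two cases
  split_one <- function(x) {
    stopifnot(is.character(x), length(x) >= 1L)
    if (length(x) == 1L) {
      # Length 1: split on commas, mirroring the treatment of `key`
      strsplit(x, ",", fixed = TRUE)[[1L]]
    } else {
      # Length > 1: already the columns of a single index
      x
    }
  }
  if (is.null(index)) return(list())
  if (is.list(index)) lapply(index, split_one) else list(split_one(index))
}

# One single-column index
parse_index_arg("ID2")
# One two-column index, written two ways
parse_index_arg("ID2,ID3")
parse_index_arg(c("ID2", "ID3"))
# Three indices, mixing the forms above
parse_index_arg(list("ID2", c("ID3", "ID4"), "ID5,ID6"))
```

Under this reading a list is the only way to ask for more than one index, so the atomic forms behave exactly as they do for key.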