I've noticed that as.ITime is surprisingly slow when given character inputs (i.e. strings of the format %H:%M:%S), and discovered that it's much faster to convert strings to ITime via an intermediate step. I'll show the time difference first:
# set up some inputs (setattr and as.ITime are from data.table)
library( data.table )
set.seed( 1 )
x <- as.character( setattr(
sample( seq_len( 24*60*59 ), 1E5, replace = TRUE ),
"class",
"ITime"
) )
head( x )
# [1] "06:15:58" "08:46:56" "13:31:10" "21:26:02" "04:45:35" "21:12:08"
Doing a direct conversion using as.ITime is very slow compared to using an intermediate as.chron.ITime step (the same is true when using chron::times first):
microbenchmark::microbenchmark(
direct = { direct <- as.ITime( x ) },
twostep = { twostep <- as.ITime( as.chron.ITime( x ) ) },
times = 10
)
# (I'll just show the median times here to make it easier to read)
# Unit: milliseconds
# expr median
# direct 1808.9899
# twostep 115.9776
# check the output
identical( direct, twostep )
# [1] TRUE
Notice the significant speed increase (~15x on the medians above, roughly 1809 ms vs 116 ms), just by adding an intermediate step.
The same can be achieved in a less direct manner, by converting the numeric chron times values (fractions of a day) to integer seconds, then setting the ITime class with setattr instead of calling as.ITime. This way ("fivestep") gives basically the same speed increase, so I'm really just adding it here as an option.
microbenchmark::microbenchmark(
twostep = { twostep <- as.ITime( as.chron.ITime( x ) ) },
fivestep = {
fivestep <- as.integer( round( as.chron.ITime( x ) * 86400 ) )
setattr( fivestep, "class", "ITime" )
},
times = 100
)
# Unit: milliseconds
# expr median
# twostep 122.2765
# fivestep 119.9396
identical( direct, fivestep )
# [1] TRUE
My suggestion would be to build this into as.ITime.character, but my own attempts have failed to maintain the same reliability as the existing function.
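To make the suggestion concrete, here is a minimal sketch of what such a fast path might look like. The function name fast_as_itime is hypothetical and not part of data.table, and the sketch assumes chron is installed. It only takes the fast route for the default "%H:%M:%S" format and falls back to the existing as.ITime otherwise. It has not been checked against malformed or NA inputs, which is exactly where reliability would need testing.

```r
library(data.table)

# Hypothetical helper (not part of data.table): fast path via chron::times
# for the default format, existing as.ITime for everything else.
fast_as_itime <- function(x, format = "%H:%M:%S") {
  use_fast <- identical(format, "%H:%M:%S") &&
    requireNamespace("chron", quietly = TRUE)
  if (!use_fast) {
    return(as.ITime(x, format = format))
  }
  # chron::times stores times as fractions of a day; convert to whole seconds
  secs <- as.integer(round(unclass(chron::times(x)) * 86400))
  # set the class by reference, as in the "fivestep" benchmark
  setattr(secs, "class", "ITime")
  secs
}

fast_as_itime(c("06:15:58", "21:12:08"))
```

Restricting the fast path to one known format keeps the behaviour for every other format identical to the current implementation.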
NOTE: There is one complication. This works perfectly when the format is "%H:%M:%S", but I believe the round may need to be floor to stay consistent when the input format is "%H:%M:%OS". In my opinion round is still more appropriate, and would be more consistent with chron::times, but as.ITime currently rounds down, so floor would preserve the existing behaviour for "%H:%M:%OS" input. E.g.:
data.table::as.ITime( "12:00:00.99" )
# [1] "12:00:00"
chron::times( "12:00:00.99" )
# [1] 12:00:01
data.table::as.ITime( chron::times( "12:00:00.99" ) )
# [1] "12:00:01"
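The arithmetic behind that difference can be reproduced in base R alone. This is only an illustrative sketch: fmt_hms is a hypothetical helper written for this example, and the fraction-of-a-day value is built by hand rather than taken from chron.

```r
# "12:00:00.99" as a fraction of a day, the way chron times are stored
frac_day <- (12 * 3600 + 0.99) / 86400
secs <- frac_day * 86400

rounded <- as.integer(round(secs))  # nearest second
floored <- as.integer(floor(secs))  # always rounds down

# Hypothetical formatter, used only for this illustration
fmt_hms <- function(s) {
  sprintf("%02d:%02d:%02d", s %/% 3600L, (s %% 3600L) %/% 60L, s %% 60L)
}

fmt_hms(rounded)  # "12:00:01", matching the chron::times output
fmt_hms(floored)  # "12:00:00", matching the current as.ITime output
```

One thing to watch if floor is used: a whole-second input converted to a fraction of a day and back may land just below the integer in floating point, so floor could in principle be off by one second. That would need checking before adopting it.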