
Date and POSIXct coerced to numeric when calculating median by group  #3079

@Henrik-P

I have a data.table with a Date column ('date'), a POSIXct column ('time'), and a grouping variable 'g':

library(data.table)

d <- data.table(
   date = as.Date(c("2018-01-01", "2018-01-03", "2018-01-08",
                 "2018-01-10", "2018-01-25", "2018-01-30")),
   g = rep(letters[1:2], each = 3))
d[ , time := as.POSIXct(date)]

When calculating median of 'date' and 'time' by group, the result is coerced to numeric:

d[ , median(date), by = g]
#    g    V1
# 1: a 17534
# 2: b 17556

d[ , median(time), by = g]
#    g         V1
# 1: a 1514937600
# 2: b 1516838400
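
As a stopgap, one could compute the median on the underlying numbers and restore the class explicitly. This is a minimal sketch using the same data as above; the result column names and `tz = "UTC"` are choices made for illustration, not something the report prescribes:

```r
library(data.table)

d <- data.table(
   date = as.Date(c("2018-01-01", "2018-01-03", "2018-01-08",
                    "2018-01-10", "2018-01-25", "2018-01-30")),
   g = rep(letters[1:2], each = 3))
d[ , time := as.POSIXct(date)]

# Take the median of the plain numeric values, then convert back by hand.
# Date is stored as days since 1970-01-01, POSIXct as seconds since then.
res <- d[ , .(
   date = as.Date(median(as.numeric(date)), origin = "1970-01-01"),
   time = as.POSIXct(median(as.numeric(time)), origin = "1970-01-01", tz = "UTC")
), by = g]

class(res$date)
class(res$time)
```

With an even number of rows per group the numeric median falls halfway between two values, so a Date result may carry a fractional day.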

However, 'date' and 'time' are not coerced when calculating the median without grouping:

d[ , median(date)]
# [1] "2018-01-09"

d[ , median(time)]
# [1] "2018-01-09 01:00:00 CET"

Other things I've tried which don't coerce:

Mean 'date' and 'time' by group:

d[ , mean(date), by = g]
#    g         V1
# 1: a 2018-01-04
# 2: b 2018-01-21

d[ , mean(time), by = g]
#    g                  V1
# 1: a 2018-01-04 01:00:00
# 2: b 2018-01-21 17:00:00

Median 'date' and 'time' by group using aggregate:

aggregate(date ~ g, data = d, median)
#   g       date
# 1 a 2018-01-03
# 2 b 2018-01-25

aggregate(time ~ g, data = d, median)
#   g                time
# 1 a 2018-01-03 01:00:00
# 2 b 2018-01-25 01:00:00
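
Since only the grouped data.table call loses the class, one way to check whether the optimized grouping path (GForce) is involved is to lower `datatable.optimize` temporarily and rerun the query. This is a sketch under the assumption that level 1 keeps the basic optimizations but skips GForce; I have not confirmed that this is the cause:

```r
library(data.table)

d <- data.table(
   date = as.Date(c("2018-01-01", "2018-01-03", "2018-01-08",
                    "2018-01-10", "2018-01-25", "2018-01-30")),
   g = rep(letters[1:2], each = 3))

# Assumption: optimization level 1 does not use GForce, so median()
# is evaluated per group by the regular stats::median method.
old <- options(datatable.optimize = 1L)
res <- d[ , median(date), by = g]
options(old)   # restore the previous setting

class(res$V1)
```

If the class survives with the option lowered but not with the default, that would point at the optimized `median` rather than at grouping in general.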

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server >= 2012 x64 (build 9200)
data.table_1.11.6

Labels: GForce (issues relating to optimized grouping calculations), IDate/ITime
