One common pattern when using data.table is to return a data.table at the end of a function, like:
data.table(
A = xxx,
B = yyy,
Z = zzz
)
I often need to include some constant columns, such as a datetime or an id. Most of the time this returns what I want, except when all the other inputs are zero-length vectors. For example:
data.table::data.table(
a = double(),
b = 1
)
#> Warning in as.data.table.list(x, keep.rownames = keep.rownames, check.names
#> = check.names): Item 1 has 0 rows but longest item has 1; filled with NA
#> a b
#> 1: NA 1
So I have to use rep() explicitly every time to ensure it returns the desired result:
a <- double()
n <- length(a)
data.table::data.table(
a = a,
b = rep(1, n)
)
#> Empty data.table (0 rows and 2 cols): a,b
This is verbose and inconvenient, and it's easy to forget in the first place, because in my mind a length-1 column should always be recycled. So the behavior is actually a little inconsistent, IMO:
### gets a data.table with 2 rows
data.table::data.table(
a = double(2),
b = 1
)
### gets a data.table with 3 rows
data.table::data.table(
a = double(3),
b = 1
)
### should get a zero-row data.table, but gets a 1-row table with a warning
data.table::data.table(
a = double(0),
b = 1
)
tibble, by contrast, returns what I expect:
tibble::tibble(
a = double(0),
b = 1
)
#> # A tibble: 0 x 2
#> # ... with 2 variables: a <dbl>, b <dbl>
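As a stopgap, the rep() calls could be hidden inside a small wrapper. The following is only a sketch: dt_recycle is a hypothetical name I made up for illustration, not part of data.table, and it assumes the rule I would expect (length-1 inputs are recycled to the common length of the other inputs, including length 0, and any other length mismatch is an error).

```r
library(data.table)

# Hypothetical helper (not part of data.table): build a data.table in which
# length-1 inputs are recycled to the common length of the other inputs,
# including length 0.
dt_recycle <- function(...) {
  cols <- list(...)
  lens <- lengths(cols)
  # Distinct lengths among the inputs that are not scalars
  target <- unique(lens[lens != 1L])
  if (length(target) > 1L) {
    stop("non-scalar inputs must all have the same length")
  }
  # If every input is a scalar, the result has one row
  n <- if (length(target) == 1L) target else 1L
  # Recycle only the scalars; rep(x, 0) yields a zero-length vector
  cols <- lapply(cols, function(x) if (length(x) == 1L) rep(x, n) else x)
  as.data.table(cols)
}

dt_recycle(a = double(0), b = 1)  # zero rows, columns a and b
dt_recycle(a = double(3), b = 1)  # three rows, b recycled
```

This keeps the call site as short as the plain data.table() call, but it is a workaround on my side rather than a change in how data.table() itself recycles.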