When setDT is called on a list, it runs a series of sanity checks, e.g. the check for homogeneous column lengths:
l = list(A=1:5, B=1)
setDT(l)
# Error in setDT(l) :
# All elements in argument 'x' to 'setDT' must be of same length, but the profile of input lengths (length:frequency) is: [1:1, 5:1]
# The first entry with fewer than 5 entries is 2
However, this check is not triggered if we set the class to "data.table" or "data.frame" beforehand, so the malformed object is accepted silently:
l = list(A=1:5, B=1)
class(l) = "data.table"
setDT(l)
l
# A B
# <int> <num>
# 1: 1 1
# 2: 2 1
# 3: 3 1
# 4: 4 1
# 5: 5 1
str(l)
# Classes ‘data.table’ and 'data.frame': 5 obs. of 2 variables:
# $ A: int 1 2 3 4 5
# $ B: num 1
# - attr(*, ".internal.selfref")=<externalptr>
Only the check for multi-column columns (e.g. matrix columns) runs before the is.data.table(x) check, which then simply fixes the key, names, and internal reference.
Instead, it would be better to proceed with the list sanity checks even when the input has class "data.table" or "data.frame". To avoid the cost of these checks when a user mistakenly calls setDT on an existing valid data.table, we could simply return early if selfrefok(x) == 1L.
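A rough sketch of the kind of class-independent check meant here, written in base R only. check_equal_lengths is a hypothetical helper for illustration, not part of data.table, and its error message is not the one setDT emits:

```r
# Hypothetical helper (not a data.table function): validate column lengths
# regardless of what the class attribute of the input claims.
check_equal_lengths = function(x) {
  n = lengths(unclass(x))  # unclass() so the claimed class is ignored
  if (length(n) && any(n != max(n))) {
    stop("All elements must be of the same length, but found lengths: ",
         paste(unique(n), collapse=", "))
  }
  invisible(TRUE)
}

# The malformed input from above: class says data.table, columns disagree.
l = list(A=1:5, B=1)
class(l) = "data.table"
res = tryCatch(check_equal_lengths(l), error=function(e) conditionMessage(e))
print(res)
```

Inside setDT, a check like this would presumably sit after the proposed early return, so that a valid data.table with an intact self-reference skips it entirely.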