List checks in setDT should proceed when class data.table but selfref not ok #4176

@sritchie73

When setDT is called on a list, it runs a series of sanity checks, e.g. the check that all columns have the same length:

l = list(A=1:5, B=1)
setDT(l)
# Error in setDT(l) : 
#  All elements in argument 'x' to 'setDT' must be of same length, but the profile of input lengths (length:frequency) is: [1:1, 5:1]
# The first entry with fewer than 5 entries is 2

However, this check is not triggered if we set the class to "data.table" or "data.frame" beforehand. The result prints as if B had been recycled, but str shows that B still has length 1:

l = list(A=1:5, B=1)
class(l) = "data.table"
setDT(l)
l
#        A     B
#    <int> <num>
# 1:     1     1
# 2:     2     1
# 3:     3     1
# 4:     4     1
# 5:     5     1
str(l)
# Classes ‘data.table’ and 'data.frame':	5 obs. of  2 variables:
#  $ A: int  1 2 3 4 5
#  $ B: num 1
#  - attr(*, ".internal.selfref")=<externalptr> 

Only the check for multi-column columns (e.g. matrix columns) runs before the is.data.table(x) check, and that branch then simply fixes the key, names, and internal self-reference without running the remaining list checks.

Instead, it would be better to proceed with the list sanity checks even when the input has class "data.table" or "data.frame". To skip these checks when a user mistakenly calls setDT on an existing valid data.table, we could simply return early if selfrefok(x) == 1L.
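
A rough sketch of the proposed order of checks, in base R only. The names check_equal_lengths and setDT_sketch are hypothetical and are not data.table functions; the selfref_ok argument stands in for the result of selfrefok(x) == 1L, which is passed in by hand here so the example runs without data.table internals.

```r
# Hypothetical stand-in for the list sanity check; not data.table's actual code.
check_equal_lengths = function(x) {
  n = lengths(unclass(x))
  if (length(unique(n)) > 1L)
    stop("All elements in argument 'x' must be of same length; got lengths ",
         paste(n, collapse=", "))
  invisible(TRUE)
}

# Hypothetical sketch of the proposed control flow. `selfref_ok` is supplied
# by the caller; inside data.table it would come from selfrefok(x) == 1L.
setDT_sketch = function(x, selfref_ok=FALSE) {
  if (inherits(x, "data.table") && selfref_ok)
    return("unchanged")          # existing valid data.table: nothing to do
  check_equal_lengths(x)         # runs even when the class was set by hand
  "checked"
}

l = list(A=1:5, B=1)
class(l) = "data.table"          # class set by hand, so no valid selfref
res = tryCatch(setDT_sketch(l), error=function(e) "error")
```

With this ordering, the hand-classed list from the example above hits the length check instead of slipping through, while an input whose self-reference is already valid returns immediately.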
