malformed factor resulting from 'by' expression when using melt.data.table in 'j' expression  #2199

@vh-d

I get a data.table object that cannot be printed. Printing it results in the error

Error in as.character.factor(x) : malformed factor

and in some cases (with a large dataset) it also crashes R.

My code looks like this

require(data.table)

# generate demo dataset
ids   <- letters[1:3]
dates <- 1:2

dt <- data.table(CJ(dates, ids, ids))
setnames(dt, c("date", "id1", "id2"))

dt[, value := rnorm(18)]

# IMPORTANT: to reproduce the bug, drop some id
dt <- dt[!(date == 1 & (id1 == "a" | id2 == "a"))]

# demo function
f1 <- function(sdt) {
  dt1 <- dcast.data.table(sdt, id1 ~ id2)
  dt2 <- melt.data.table(dt1, id.vars = "id1")
  print(dt2)
  
  dt2
}

res <- dt[, f1(.SD), by = date]
#   id1 variable      value
#1:   b        b -1.3643635
#2:   c        b  0.5305674
#3:   b        c  0.2023935
#4:   c        c  0.1063894
#   id1 variable       value
#1:   a        a -0.35193161
#2:   b        a -0.65570503
#3:   c        a  0.01524152
#4:   a        b -0.07234880
#5:   b        b -1.16267653
#6:   c        b -1.41490080
#7:   a        c  0.10225144
#8:   b        c -0.84277336
#9:   c        c -0.23164772

res
#Error in as.character.factor(x) : malformed factor
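
Since printing res fails, one way to look at the affected column without going through the print method is to compare its integer codes with its levels directly. The sketch below is only an illustration: factor_looks_valid() is a hypothetical helper name, not part of data.table or base R, and it assumes nothing more than that a well-formed factor has integer codes indexing into a character vector of levels. The broken factor is constructed by hand and may not match the exact malformation produced in this report.

```r
# Hypothetical helper (not part of data.table or base R): checks the
# structural properties a well-formed factor is expected to have.
factor_looks_valid <- function(f) {
  lev <- attr(f, "levels")
  codes <- unclass(f)
  attributes(codes) <- NULL          # keep only the raw codes
  is.integer(codes) &&
    is.character(lev) &&
    all(is.na(codes) | (codes >= 1L & codes <= length(lev)))
}

# A regular factor passes the check
good <- factor(c("b", "c", "b"))

# Deliberately broken factor for illustration: code 3 has no matching level
bad <- structure(c(1L, 3L), levels = c("a", "b"), class = "factor")

factor_looks_valid(good)
factor_looks_valid(bad)
```

Applied to the result above, something like sapply(res, function(col) !is.factor(col) || factor_looks_valid(col)) should point at the offending column without printing res.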

In my case I had forgotten to pass my usual variable.factor = FALSE to melt.data.table(); with that argument everything works fine. Still, this behavior surprises me. The error only appears when the sets of factor levels differ between groups (here the ids for date = 1 and date = 2 are different sets), so if I skip the line

dt <- dt[!(date == 1 & (id1 == "a" | id2 == "a"))]

it works all right.
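
For reference, a minimal sketch of the workaround mentioned above, assuming the same demo data. The function name f2 and the explicit value.var = "value" are additions for this sketch; the only change that matters is variable.factor = FALSE, which keeps the melted variable column as character so that no factor levels have to be reconciled across groups.

```r
library(data.table)

# Same demo data as in the report; values are random, so no output is shown
dt <- CJ(date = 1:2, id1 = letters[1:3], id2 = letters[1:3])
dt[, value := rnorm(.N)]

# Drop id "a" for date 1 so the two groups have different sets of ids
dt <- dt[!(date == 1 & (id1 == "a" | id2 == "a"))]

# Variant of f1 (hypothetical name f2) that returns `variable` as character
f2 <- function(sdt) {
  wide <- dcast.data.table(sdt, id1 ~ id2, value.var = "value")
  melt.data.table(wide, id.vars = "id1", variable.factor = FALSE)
}

res2 <- dt[, f2(.SD), by = date]
print(res2)
```

With this variant the grouped result has 13 rows (4 for date 1 and 9 for date 2) and prints without the error.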

I am on the latest stable version of data.table.

My sessionInfo()

R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: KDE neon User Edition 5.10

...

other attached packages:
[1] data.table_1.10.4

loaded via a namespace (and not attached):
[1] compiler_3.4.0 tools_3.4.0
