Segfault when unlisting an exploded list column #5844

@rbutleriii

I am running into a *** caught segfault *** address 0x7, cause 'memory not mapped' error when trying to simplify a list column. In the previous step, I have a table with a list column, and I explode that column (one row per list item) as the solution does in this example. However, the resulting column still holds two values combined as a list in one row instead of splitting them fully. To fix this I try to unlist that column, and that call triggers the segfault. Is this unexpected behavior, or am I exploding the column incorrectly?

# sample data set
> dput(a)
structure(list(Reference = c("Chai et al. 2011", "Chai et al. 2011",
"Chai et al. 2011", "Chai et al. 2011", "Goodwin et al. 2021"
), Treatment = c("IgGs for MC1, PHF1 (tau pS396/S404) (DS: MC1 IgG)",
"IgGs for MC1, PHF1 (tau pS396/S404) [DS: PHF1 (tau pS396/S404) IgG]",
"IgGs for MC1, PHF1 (tau pS396/S404) (DS:MC1 IgG)", "IgGs for MC1, PHF1 (tau pS396/S404) [DS: PHF1 (tau pS396/S404) IgG]",
"Anti-tau scFvs (VL: Anti-tau intrabody PHF1i)"), Model = c("JNPL3 (P301L)",
"JNPL3 (P301L)", "hTau.P301S", "hTau.P301S", "JNPL3 (P301L), rTg4510 (P301L)"
)), row.names = c(NA, -5L), class = c("data.table", "data.frame"
))

# split the strings holding multiple values in Model, trying to explode each model to its own row with its Reference and Treatment
b = a[, .(Model = tstrsplit(Model, ", ", fixed=TRUE)), by=.(Reference, Treatment)]
> b
             Reference
1:    Chai et al. 2011
2:    Chai et al. 2011
3:    Chai et al. 2011
4: Goodwin et al. 2021
5: Goodwin et al. 2021
                                                             Treatment
1:                   IgGs for MC1, PHF1 (tau pS396/S404) (DS: MC1 IgG)
2: IgGs for MC1, PHF1 (tau pS396/S404) [DS: PHF1 (tau pS396/S404) IgG]
3:                    IgGs for MC1, PHF1 (tau pS396/S404) (DS:MC1 IgG)
4:                       Anti-tau scFvs (VL: Anti-tau intrabody PHF1i)
5:                       Anti-tau scFvs (VL: Anti-tau intrabody PHF1i)
                      Model
1:            JNPL3 (P301L)
2: JNPL3 (P301L),hTau.P301S # << this one has been collapsed as a list
3:               hTau.P301S
4:            JNPL3 (P301L)
5:          rTg4510 (P301L)

options(datatable.verbose = TRUE)

# try to unlist to expand
b[, Model := unlist(Model)]
Detected that j uses these columns: Model
Assigning to all 5 rows
RHS_list_of_columns == false

 *** caught segfault ***
address 0x7, cause 'memory not mapped'

Traceback:
 1: `[.data.table`(b, , `:=`(Model, unlist(Model)))
 2: b[, `:=`(Model, unlist(Model))]

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:
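
For comparison, here is a minimal sketch of the explode step that calls strsplit() plus unlist() inside j instead of tstrsplit(). It is an assumption about the intended result, not a confirmed fix, and it does not explain the segfault itself. The name b2 is introduced here purely for illustration, and the sample data is rebuilt with data.table() rather than pasted from dput().

```r
library(data.table)

# Same sample data as in the report, rebuilt with data.table()
a <- data.table(
  Reference = c(rep("Chai et al. 2011", 4), "Goodwin et al. 2021"),
  Treatment = c(
    "IgGs for MC1, PHF1 (tau pS396/S404) (DS: MC1 IgG)",
    "IgGs for MC1, PHF1 (tau pS396/S404) [DS: PHF1 (tau pS396/S404) IgG]",
    "IgGs for MC1, PHF1 (tau pS396/S404) (DS:MC1 IgG)",
    "IgGs for MC1, PHF1 (tau pS396/S404) [DS: PHF1 (tau pS396/S404) IgG]",
    "Anti-tau scFvs (VL: Anti-tau intrabody PHF1i)"
  ),
  Model = c("JNPL3 (P301L)", "JNPL3 (P301L)", "hTau.P301S", "hTau.P301S",
            "JNPL3 (P301L), rTg4510 (P301L)")
)

# strsplit() returns one character vector per input row; unlist() flattens
# them so every piece becomes its own row within its (Reference, Treatment) group
b2 <- a[, .(Model = unlist(strsplit(Model, ", ", fixed = TRUE))),
        by = .(Reference, Treatment)]
```

With this data, b2 should be a six-row table whose Model column is a plain character vector, so no later unlist() step is needed.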

R Session Info:

R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] ComplexHeatmap_2.16.0 RColorBrewer_1.1-3    ggrepel_0.9.3
[4] ggplot2_3.4.2         dplyr_1.1.2           stringi_1.7.12
[7] stringr_1.5.0         data.table_1.14.10

loaded via a namespace (and not attached):
 [1] utf8_1.2.3          generics_0.1.3      shape_1.4.6
 [4] digest_0.6.32       magrittr_2.0.3      iterators_1.0.14
 [7] circlize_0.4.15     foreach_1.5.2       doParallel_1.0.17
[10] GlobalOptions_0.1.2 fansi_1.0.4         scales_1.2.1
[13] codetools_0.2-19    textshaping_0.3.6   cli_3.6.1
[16] rlang_1.1.1         crayon_1.5.2        munsell_0.5.0
[19] withr_2.5.0         tools_4.3.1         parallel_4.3.1
[22] colorspace_2.1-0    GetoptLong_1.0.5    BiocGenerics_0.46.0
[25] vctrs_0.6.3         R6_2.5.1            png_0.1-8
[28] matrixStats_1.0.0   stats4_4.3.1        lifecycle_1.0.3
[31] S4Vectors_0.38.1    IRanges_2.34.1      clue_0.3-64
[34] ragg_1.2.5          cluster_2.1.4       pkgconfig_2.0.3
[37] pillar_1.9.0        gtable_0.3.3        glue_1.6.2
[40] Rcpp_1.0.10         systemfonts_1.0.4   tibble_3.2.1
[43] tidyselect_1.2.0    farver_2.1.1        rjson_0.2.21
[46] labeling_0.4.2      compiler_4.3.1
