Skip to content

COMETSC running but not giving output #8

@achamess

Description

@achamess

I'm excited to use this tool, but it's been a struggle to get it to work for me. I think the issue is my input files because I can run the example dataset.

I am working with a Seurat object. I've exported the markers, umap dimensions and cluster calls as tab separated text. I make a call to Comet from the command line it and looks like it's running. It certainly is from top. It goes for about 3 hours. An output folder is created. But the folder is empty. It doesn't throw any specific errors but it does show the following during run-time:

Creating discrete expression matrix...
Insufficient floating point precision for calculating or reporting the exact XL-mHG test statistic; the true value is too small. Using "0" instead.(The XL-mHG p-value will also be reported as "0".)
Insufficient floating point precision for calculating or reporting the exact XL-mHG test statistic; the true value is too small. Using "0" instead.(The XL-mHG p-value will also be reported as "0".)

I am not sure what the problem is. Is there a stderr or log file to see what's going on?

Also, relatedly, the docs would benefit greatly from a tutorial showing how to get the input files out from a Seurat object, since that is such a common procedure.

Here is my code to get the input files out from Seurat and to the command line.

# matrix
matrix_cometsc <- GetAssayData(so) # so is Seurat Object

write.table(as.matrix(matrix_cometsc), file=here("data", "COMETSC", "markers.txt"), row.names=TRUE, col.names=TRUE, sep = "\t", quote = FALSE)

#UMAP embeddings
umap_cometsc <- Embeddings(so, reduction = "umap")
write.table(umap_cometsc, file=here("data", "COMETSC", "vis.txt"), row.names=TRUE, col.names=FALSE, sep = "\t", quote = FALSE)

#cluster IDs
cluster_cometsc <- noquote(as.matrix(Idents(so)))
write.table(cluster_cometsc, file=here("data", "COMETSC", "cluster.txt"), row.names=TRUE, col.names=FALSE, sep = "\t", quote = FALSE)

Part of the issue is with the marker (matrix) because of that first tab above the row names. I had to manually add it like this:

sed '1s/.*/\t&/' markers.txt > markers2.txt

Also, my command to Comet is the following:

#! /bin/bash
source ~/comet/bin/activate
Comet markers2.txt vis.txt cluster.txt -C 16 -K 4 -Count true output/

And for some reference, here is a sample of markers2.txt with the tabs indicated by ^I

^ID1_TTCAGGATCAAGCCAT^ID1_GTGGAGATCTGCTTAT^ID1_GCACGGTCACTCAGAT^ID1_TATACCTGTCTTACTT
MIR1302-2HG^I0^I0.0766241526725224^I0^I0
FAM138A^I0^I0^I0^I0
OR4F5^I0^I0^I0^I0
AL627309.1^I0.103146952196364^I0.0766241526725224^I0.0823802232731239^I0.0918193591402592
AL627309.3^I0^I0^I0^I0
AL627309.2^I0^I0^I0^I0
AL627309.4^I0^I0^I0^I0
AL732372.1^I0^I0^I0^I0

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions