
Making a UMAP plot

Sequoia edited this page Mar 12, 2020 · 10 revisions

The CompassAnalyzer class has a method get_umap_components. It takes the following parameters.

| Parameter | Description |
| --- | --- |
| consistencies_matrix | Either your CompassData instance's reaction_consistencies matrix or its metareaction_consistencies matrix, depending on whether each cell's high-dimensional representation should consist of its reaction consistencies or its metareaction consistencies. In the former case, UMAP embeds each cell from (# reactions)-dimensional space into num_components dimensions; in the latter case, it embeds each cell from (# metareactions)-dimensional space into num_components dimensions. |
| num_components (Default: 2) | The number of UMAP components to calculate (i.e. the dimensionality of the embedding). |

It returns a tibble, where each row represents the low-dimensional UMAP embedding of a cell. It has the following columns.

| Column | Description |
| --- | --- |
| Your CompassSettings instance's cell_id_col_name | The unique identifier for the cell. |
| component_1 | The cell's first coordinate in the low-dimensional UMAP embedding. |
| component_2 | The cell's second coordinate in the low-dimensional UMAP embedding. |
| component_{num_components} | The cell's num_components-th (i.e. last) coordinate in the low-dimensional UMAP embedding. |
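
As a rough sketch of that shape, the snippet below builds a hand-made tibble standing in for the return value when num_components = 3 and cell_id_col_name = "cell_id". The cell IDs and coordinate values are invented for illustration, and umap_components is a hypothetical name. It shows one way to pick out the coordinate columns without hard-coding how many there are.

```r
library(tibble)
library(dplyr)

# Hypothetical stand-in for the tibble returned by get_umap_components()
# with num_components = 3. All IDs and values are made up.
umap_components <- tribble(
    ~cell_id, ~component_1, ~component_2, ~component_3,
    "cell_a",         0.12,        -1.40,         2.05,
    "cell_b",        -0.87,         0.33,         1.10,
    "cell_c",         1.56,         0.91,        -0.42
)

# Select every coordinate column, however many components were requested.
coordinates <- umap_components %>%
    select(starts_with("component_"))

# The number of coordinate columns equals num_components.
num_components <- ncol(coordinates)
```

Selecting by prefix keeps downstream code working if num_components is changed later.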

Here's an example that you can try out yourself, using the Th17 data set that ships with the package. It finds the 2-dimensional UMAP embedding of each cell based on its metareaction consistencies, and combines that information with the cell metadata and gene expression statistics accessible through our CompassData instance.

```r
library(compassR)
library(tidyverse)

# Point the settings at the Th17 example data bundled with the package.
compass_settings <- CompassSettings$new(
    user_data_directory = system.file("extdata", "Th17", package = "compassR"),
    cell_id_col_name = "cell_id",
    gene_id_col_name = "HGNC.symbol"
)

compass_data <- CompassData$new(compass_settings)
compass_analyzer <- CompassAnalyzer$new(compass_settings)

# Embed each cell in 2 dimensions (the default) based on its metareaction
# consistencies, then attach cell metadata and gene expression statistics.
cell_info_with_umap_components <-
    compass_analyzer$get_umap_components(
        compass_data$metareaction_consistencies
    ) %>%
    inner_join(
        compass_data$cell_metadata,
        by = "cell_id"
    ) %>%
    left_join(
        compass_data$gene_expression_statistics,
        by = "cell_id"
    )
```
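
The two joins in this pipeline behave differently, which matters if some cells lack metadata or expression statistics. The sketch below uses small made-up tibbles as hypothetical stand-ins for the three tables (the cell_type and metabolic_activity columns are assumed from the plotting code on this page; all values are invented). It shows that inner_join drops a cell with no metadata row, while left_join keeps a cell with no statistics row and fills the missing value with NA.

```r
library(tibble)
library(dplyr)

# Hypothetical stand-ins for the UMAP output and the two CompassData tables.
toy_umap_components <- tibble(
    cell_id = c("cell_a", "cell_b", "cell_c"),
    component_1 = c(0.12, -0.87, 1.56),
    component_2 = c(-1.40, 0.33, 0.91)
)
toy_cell_metadata <- tibble(
    cell_id = c("cell_a", "cell_b"),          # no row for cell_c
    cell_type = c("type_1", "type_2")
)
toy_gene_expression_statistics <- tibble(
    cell_id = "cell_a",                       # no row for cell_b or cell_c
    metabolic_activity = 0.42
)

toy_cell_info <- toy_umap_components %>%
    # Keeps only cells present in both tables, so cell_c is dropped.
    inner_join(toy_cell_metadata, by = "cell_id") %>%
    # Keeps every remaining cell; cell_b gets NA for metabolic_activity.
    left_join(toy_gene_expression_statistics, by = "cell_id")
```

Comparing nrow() before and after each join on your own data is a quick way to check whether any cells were dropped.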

You can then plot the cells according to the UMAP embeddings we found, like so.

```r
# Color each cell by its cell type, with the legend hidden.
ggplot(
    cell_info_with_umap_components,
    aes(x = component_1, y = component_2, color = cell_type)
) +
scale_color_discrete(guide = "none") +
geom_point(size = 1, alpha = 0.8) +
theme_bw()

# Color each cell by its metabolic activity on a continuous viridis scale.
ggplot(
    cell_info_with_umap_components,
    aes(x = component_1, y = component_2, color = metabolic_activity)
) +
scale_color_viridis_c() +
geom_point(size = 1, alpha = 0.8) +
theme_bw()
```
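
To keep a plot for later, one option is ggplot2's ggsave. The sketch below is self-contained: it uses a made-up tibble in place of cell_info_with_umap_components (the values and the names toy_cell_info, umap_plot and output_path are hypothetical) and writes a PDF to a temporary file.

```r
library(tibble)
library(ggplot2)

# Hypothetical stand-in for cell_info_with_umap_components; values are made up.
toy_cell_info <- tibble(
    cell_id = c("cell_a", "cell_b", "cell_c", "cell_d"),
    component_1 = c(0.12, -0.87, 1.56, 0.40),
    component_2 = c(-1.40, 0.33, 0.91, -0.25),
    metabolic_activity = c(0.42, 0.77, 0.15, 0.60)
)

# Store the plot in a variable so it can be saved explicitly.
umap_plot <- ggplot(
    toy_cell_info,
    aes(x = component_1, y = component_2, color = metabolic_activity)
) +
    scale_color_viridis_c() +
    geom_point(size = 1, alpha = 0.8) +
    theme_bw()

# Write the plot to a temporary PDF; width and height are in inches.
output_path <- tempfile(fileext = ".pdf")
ggsave(output_path, plot = umap_plot, width = 6, height = 4)
```

Replace output_path with a path of your choosing to keep the file.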
