+
+
+
Take in a cell embedding matrix, then find k nearest neighbors (KNN) for each cell, convert the
+KNN into a graph (adjacency matrix), then run a graph-based clustering algorithm. Each of these
+steps can be customized by passing a function that performs the step (see details).
+
+
+
+
Usage
+
cluster_cells_graph(
+ mat,
+ knn_method = knn_hnsw,
+ knn_to_graph_method = knn_to_geodesic_graph,
+ cluster_graph_method = cluster_graph_leiden,
+ threads = 0L,
+ verbose = FALSE
+)
+
+
+
+
Arguments
+
+
+
- mat
+(matrix) Cell embeddings matrix of shape (cells x n_embeddings)
+
+
+- knn_method
+(function) Function to that takes an embedding matrix as the first argument and returns a k nearest neighbors (KNN) object.
+For example, knn_hnsw(), knn_annoy(), or a parameterized version (see Details).
+
+
+- knn_to_graph_method
+(function) Function that takes a KNN object and returns a graph as an undirected graph (lower-triangular dgCMatrix adjacency matrix).
+For example, knn_to_graph(), knn_to_snn_graph(), knn_to_geodesic_graph(), or a parameterized version (see Details).
+
+
+- cluster_graph_method
+(function) Function that takes an undirected graph of cell similarity and returns a factor with cluster assignments for each cell.
+For example, cluster_graph_leiden(), cluster_graph_louvain(), cluster_graph_seurat(), or a parameterized version (see Details).
+
+
+- threads
+(integer) Number of threads to use in knn_method, knn_to_graph_method and cluster_graph_method. If these functions do not utilize
+a threads argument, this is silently ignored.
+
+
+- verbose
+(logical) Whether to print progress information in knn_method, knn_to_graph_method and cluster_graph_method. If these functions do not utilize
+a verbose argument, this is silently ignored.
+
+
+
+
Value
+
(factor) Factor vector containing the cluster assignment for each cell.
+
+
+
Details
+
Customizing clustering steps
+
All of the BPCells functions named like knn_*, knn_to_graph_*, and cluster_graph_* support customizing parameters
+via partial function application. For example, look for 20 neighbors during the k nearest neighbors search, setting
+knn_method=knn_hnsw(k=20) is a convenient shortcut for knn_method=function(x) knn_hnsw(x, k=20). Similarly, lowering the
+default clustering resolution can be done as cluster_graph_method=cluster_graph_louvain(resolution=0.5). This works because
+all these functions are written to return a partially parameterized copy of themselves as a function object when their
+first argument is missing.
+
For even more advanced customization, users can manually call the knn, knn_to_graph, and cluster_graph methods rather
+than using cluster_cells_graph() as a convenient wrapper.
+
Implementing custom clustering steps
+
The required interfaces for each step are as follows:
+
knn_method: First argument is a matrix of cell embeddings, shape (cells x n_embeddings).
+Returns a named list of two matrices of dimension (cells x k):
idx: Neighbor indices, where idx[c, n] is the index of the
+nth nearest neighbor to cell c.
+dist: Neighbor distances, where dist[c, n] is the distance between
+cell c and its nth nearest neighbor.
+Self-neighbors are allowed, so with sufficient search effort idx[c,1] == c for nearly all cells.
+
knn_to_graph_method: First argument is a KNN object as returned by knn_method.
+Returns a weighted similarity graph as a lower triangular sparse adjacency matrix (dgCMatrix).
+For cells i and j, their similarity score is in adjacency_mat[max(i,j), min(i,j)].
+
cluster_graph_method: First argument is a weighted similarity graph as returned by knn_to_graph_method.
+Returns a factor vector of length cells with a cluster assignment for each cell.
+
+
+
+