Clustree (0.5.0) - A good thing for clustering hierarchies!

Foreword

clustree is an R package for visualizing the structure and hierarchy of clustering trees. It provides a simple and intuitive way for researchers to visually explore the results of cluster analysis.

How to install the clustree package?

Execute the following command in R or RStudio to install the clustree package:

# install the package
install.packages("clustree")
# load the package
library(clustree)
# check the version
packageVersion("clustree")

Sample data

In this example, we analyze the peripheral blood mononuclear cell (PBMC) single-cell data set from 10X Genomics, which contains 2,700 single cells sequenced on an Illumina NextSeq 500. The raw data can be downloaded here.

In "Seurat V4.9.9 – A powerful R suite for single-cell analysis", we learned the basic steps of single-cell analysis, as shown in the following commands:

# load the packages
library(Seurat)
library(dplyr)
library(patchwork)

# load the PBMC data set; use the correct path on your computer and change "\" to "/"
pbmc.data <- Read10X(data.dir = "C:/Users/Administrator/Desktop/hg19")

# create the Seurat object
pbmc <- CreateSeuratObject(counts = pbmc.data, project = "pbmc3k", min.cells = 3, min.features = 200)

# compute the percentage of mitochondrial counts per cell
pbmc[["percent.mt"]] <- PercentageFeatureSet(pbmc, pattern = "^MT-")

# visualize the QC metrics with violin plots
VlnPlot(pbmc, features = c("nFeature_RNA", "nCount_RNA", "percent.mt"), ncol = 3)

# QC step
pbmc <- subset(pbmc, subset = nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)

# normalize the data
pbmc <- NormalizeData(pbmc, normalization.method = "LogNormalize", scale.factor = 10000)

# select highly variable features
pbmc <- FindVariableFeatures(pbmc, selection.method = "vst", nfeatures = 2000)

# scale the features
all.genes <- rownames(pbmc)
pbmc <- ScaleData(pbmc, features = all.genes)

# PCA dimension reduction
pbmc <- RunPCA(pbmc, features = VariableFeatures(object = pbmc))

# inspect the dimension reduction result to choose the number of PCs
ElbowPlot(pbmc)

# cluster the cells
pbmc <- FindNeighbors(pbmc, dims = 1:10)
pbmc <- FindClusters(pbmc, resolution = 0.5)

# run UMAP and tSNE to visualize the clusters
pbmc <- RunUMAP(pbmc, dims = 1:10)
pbmc <- RunTSNE(pbmc, dims = 1:10)
Figure: the UMAP result plot.
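
To make the QC step more concrete, here is a minimal base R sketch on made-up data. It assumes that the mitochondrial percentage is the share of a cell's counts coming from genes whose names start with "MT-", and it reuses the thresholds from the subset() call above. The matrix `counts`, the data frame `qc`, and all values are hypothetical and are not taken from the PBMC data set.

```r
# Toy count matrix (genes x cells); the values are made up for illustration.
counts <- matrix(
  c(5, 0, 1,
    3, 2, 0,
    1, 9, 0,
    1, 9, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("CD3D", "MS4A1", "MT-CO1", "MT-ND1"),
                  c("cell1", "cell2", "cell3"))
)

# Share of each cell's counts that comes from genes matching "^MT-", in percent.
mt_genes <- grepl("^MT-", rownames(counts))
percent_mt <- 100 * colSums(counts[mt_genes, , drop = FALSE]) / colSums(counts)

# Toy per-cell QC table (hypothetical values) filtered with the same thresholds
# as the subset() call in the article.
qc <- data.frame(
  nFeature_RNA = c(150, 800, 3000, 1200),
  percent.mt   = c(2, 3, 1, 12),
  row.names    = c("c1", "c2", "c3", "c4")
)
kept <- subset(qc, nFeature_RNA > 200 & nFeature_RNA < 2500 & percent.mt < 5)
print(percent_mt)
print(kept)
```

In the toy table only the cell that passes all three conditions is kept; on a real object the same logic is applied by Seurat's subset().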

Running the standard analysis procedure gives the result above. But cell clustering is not fixed: the cells can be split into three clusters, five clusters, or even ten clusters, so how do we choose the most appropriate number of clusters for the next step of the analysis? This is where the clustree package can help us make an appropriate choice.

How to use Clustree

Load the clustree package into the R environment with the library() function, run the FindClusters() function at several resolutions, then use the clustree() function to display all the clustering results:

library(clustree)
pbmc <- FindClusters(pbmc, resolution = c(0, 0.1, 0.5, 1, 2))
clustree(pbmc@meta.data, prefix = "RNA_snn_res.")
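
To give an idea of where the edges of a clustering tree come from, the following base R sketch cross-tabulates the cluster labels of the same cells at two adjacent resolutions. It assumes, following the clustering trees paper, that each edge carries the number of shared cells and an in-proportion, that is, the share of the higher-resolution cluster that comes from the lower-resolution cluster. The vectors `low` and `high` are made-up labels, not PBMC results.

```r
# Toy cluster labels for 8 cells at two resolutions (made-up values).
low  <- c(0, 0, 0, 0, 1, 1, 1, 1)   # labels at the lower resolution
high <- c(0, 0, 1, 1, 1, 2, 2, 2)   # labels at the higher resolution

# Each non-zero entry of the cross-tabulation corresponds to one edge.
edge_counts <- table(low = low, high = high)

# In-proportion: for every higher-resolution cluster (column), the share of
# its cells that came from each lower-resolution cluster (row).
in_prop <- prop.table(edge_counts, margin = 2)

print(edge_counts)
print(round(in_prop, 2))
```

A cluster whose column has several sizeable in-proportions receives cells from more than one parent, which is a hint that the clustering at that resolution may be unstable.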

We can see that the results obtained by clustree are consistent with the second figure in this article: there are 4 cell populations at resolution 0.1, 9 cell populations at resolution 0.5, and 16 cell populations at resolution 2.
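
The number of cell populations per resolution can also be read directly from the metadata. The sketch below assumes that FindClusters() stores one column per resolution whose name starts with the prefix "RNA_snn_res.", which is what the clustree() call above relies on. The data frame `meta` is a made-up stand-in for pbmc@meta.data.

```r
# Toy stand-in for pbmc@meta.data (hypothetical values).
meta <- data.frame(
  nCount_RNA      = c(900, 1200, 700, 1500, 1100, 950),
  RNA_snn_res.0   = factor(c(0, 0, 0, 0, 0, 0)),
  RNA_snn_res.0.1 = factor(c(0, 0, 0, 1, 1, 1)),
  RNA_snn_res.0.5 = factor(c(0, 0, 1, 2, 2, 3))
)

# Pick the clustering columns by their shared prefix.
prefix <- "RNA_snn_res."
res_cols <- names(meta)[startsWith(names(meta), prefix)]

# Count the distinct clusters at each resolution.
n_clusters <- vapply(meta[res_cols], function(x) length(unique(x)), integer(1))
names(n_clusters) <- sub(prefix, "", res_cols, fixed = TRUE)
print(n_clusters)
```

On the real object, replacing `meta` with pbmc@meta.data should give the per-resolution counts that the tree displays.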

At this point, the key functions of clustree have been covered. If you only want to observe how the cell clustering changes, the analysis so far is sufficient. The clustree package also offers many auxiliary features, such as:

1. Check the mitochondrial information of each cell population

clustree(pbmc@meta.data, prefix = "RNA_snn_res.", node_colour = "percent.mt", node_colour_aggr = "mean")
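
As a rough illustration of what node_colour_aggr = "mean" summarizes, the sketch below averages a metadata column within each cluster at one resolution using base R. The data frame `meta` and its values are hypothetical; clustree performs the equivalent aggregation internally for every resolution.

```r
# Toy metadata: cluster label at one resolution plus a per-cell value.
meta <- data.frame(
  RNA_snn_res.0.5 = c("0", "0", "1", "1", "1", "2"),
  percent.mt      = c(2.0, 4.0, 1.0, 2.0, 3.0, 5.0)
)

# One aggregated value per cluster; this is the value a node would be coloured by.
node_means <- tapply(meta$percent.mt, meta$RNA_snn_res.0.5, mean)
print(node_means)
```

Swapping mean for median in the tapply() call gives the kind of summary requested with node_colour_aggr = "median".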

2. Add node labels

clustree(pbmc, prefix = "RNA_snn_res.", node_label = "RNA_snn_res.")

3. Check the gene expression of each cell population

clustree(pbmc, prefix = "RNA_snn_res.", node_colour = "MS4A1", node_colour_aggr = "median")

4. Overlay the clustering tree on the visual analysis results

With the clustree_overlay() function, the structure of the clustering tree can be superimposed on other visualizations of the data, which helps in understanding how the clustering tree relates to other dimensions such as distribution, concentration, or category.

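# add the UMAP coordinates to the metadata as UMAP_1 and UMAP_2
pbmc <- AddMetaData(pbmc, pbmc@reductions$umap@cell.embeddings, col.name = c("UMAP_1", "UMAP_2"))
# add the PCA coordinates to the metadata under their own column names (PC_1, PC_2, ...)
pbmc <- AddMetaData(pbmc, pbmc@reductions$pca@cell.embeddings, col.name = colnames(pbmc@reductions$pca@cell.embeddings))
# overlay the clustering tree on the UMAP coordinates
clustree_overlay(pbmc, prefix = "RNA_snn_res.", x_value = "UMAP_1", y_value = "UMAP_2")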
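
The overlay needs the coordinates to be available as metadata columns next to the clustering columns. The base R sketch below mimics that step on made-up data and then computes the mean position of the cells in each cluster, under the assumption that the overlay places each node at such a cluster centre. The objects `embeddings` and `meta` and all values are hypothetical.

```r
# Toy embedding matrix (cells x 2), standing in for the UMAP coordinates.
embeddings <- matrix(
  c(-2, -1,
    -1, -2,
     2,  1,
     3,  2),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("cell", 1:4), c("umap_1", "umap_2"))
)

# Toy metadata with one clustering column.
meta <- data.frame(RNA_snn_res.0.1 = c("0", "0", "1", "1"),
                   row.names = rownames(embeddings))

# Rename the coordinates and attach them to the metadata.
colnames(embeddings) <- c("UMAP_1", "UMAP_2")
meta <- cbind(meta, embeddings)

# Mean position of the cells in each cluster (assumed node position).
centroids <- aggregate(cbind(UMAP_1, UMAP_2) ~ RNA_snn_res.0.1,
                       data = meta, FUN = mean)
print(centroids)
```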

Color the analysis results:

clustree_overlay(pbmc, prefix = "RNA_snn_res.", x_value = "UMAP_1", y_value = "UMAP_2", use_colour = "points", alt_colour = "blue") 

Label the nodes of the result:

clustree_overlay(pbmc, prefix = "RNA_snn_res.", x_value = "UMAP_1", y_value = "UMAP_2", label_nodes = TRUE)

Display side view results:

overlay_list <- clustree_overlay(pbmc, prefix = "RNA_snn_res.", x_value = "PC_1", y_value = "PC_2", plot_sides = TRUE)
names(overlay_list)
#> [1] "overlay" "x_side" "y_side"
# show the x_side result
overlay_list$x_side
# show the y_side result
overlay_list$y_side

Epilogue

clustree is a powerful and easy-to-use R package that provides an intuitive way to visualize the structure and hierarchy of clustering trees. With clustree, we can better understand and explain the results of cluster analysis and quickly discover cell clustering patterns and relationships in the data, which makes it an essential package for analyzing single-cell data.

References

1. Zappia L, Oshlack A. Clustering trees: a visualization for evaluating clusterings at multiple resolutions. GigaScience. 2018;7. DOI: 10.1093/gigascience/giy083.
