• Harmony | 超好用的单细胞测序数据合并(3‘和5‘数据合并)(二)


    1写在前面

    对于只有只有部分重叠的datasets,合并方法我们依然可以采用SeuratHarmonyrliger包,本期介绍一下Harmony包的用法。🤩

    2用到的包

    rm(list = ls())
    library(Seurat)
    library(SeuratDisk)
    library(SeuratWrappers)
    library(patchwork)
    library(harmony)
    library(rliger)
    library(RColorBrewer)
    library(tidyverse)
    library(reshape2)
    library(ggsci)
    library(ggstatsplot)
    • 1

    3示例数据

    这里我们提供13’ PBMC dataset1whole blood dataset。🥰

    umi_gz <- gzfile("./GSE149938_umi_matrix.csv.gz",'rt')  
    umi <- read.csv(umi_gz,check.names = F,quote = "")

    matrix_3p <- Read10X_h5("./3p_pbmc10k_filt.h5",use.names = T)
    • 1

    创建Seurat对象。🐶

    srat_wb <- CreateSeuratObject(t(umi),project = "whole_blood")
    srat_3p <- CreateSeuratObject(matrix_3p,project = "pbmc10k_3p")
    rm(umi_gz)
    rm(umi)
    rm(matrix_3p)
    srat_wb
    srat_3p
    • 1
    alt

    alt

    4修改metadata

    为了方便后续分析,这里我们对metadata进行一下注释修改

    colnames(srat_wb@meta.data)[1] <- "cell_type"
    srat_wb@meta.data$orig.ident <- "whole_blood"
    srat_wb@meta.data$orig.ident <- as.factor(srat_wb@meta.data$orig.ident)
    head(srat_wb[[]])
    • 1
    alt

    5初步合并

    5.1 简单合并

    这里我们先用merge2个数据集简单合并在一起。(这里我们默认做过初步过滤了哈,具体的大家可以看一下上期的教学。)😘

    wb_harmony  <- merge(srat_3p,srat_wb)
    • 1

    5.2 标准操作

    我们在这里做一下Normalization,寻找高变基因等等标准操作。👀

    wb_harmony <- NormalizeData(wb_harmony, verbose = F)
    wb_harmony <- FindVariableFeatures(wb_harmony, selection.method = "vst", nfeatures = 2000, verbose = F)
    wb_harmony <- ScaleData(wb_harmony, verbose = F)
    wb_harmony <- RunPCA(wb_harmony, npcs = 30, verbose = F)
    wb_harmony <- RunUMAP(wb_harmony, reduction = "pca", dims = 1:30, verbose = F)
    • 1

    6harmony合并数据

    6.1 合并前

    p1 <- DimPlot(object = wb_harmony, reduction = "pca", 
    pt.size = .1, group.by = "orig.ident") +
    scale_color_npg()+
    NoLegend()

    p2 <- VlnPlot(object = wb_harmony, features = "PC_1",
    group.by = "orig.ident", pt.size = .1) +
    scale_color_npg()+
    NoLegend()

    p1+p2
    • 1
    alt

    DimPlot(wb_harmony,reduction = "umap",
    group.by = "orig.ident") +
    scale_color_npg()+
    plot_annotation(title = "10k 3' PBMC and whole blood, before integration")
    • 1
    alt

    6.2 开始合并

    wb_harmony <- wb_harmony %>% 
    RunHarmony("orig.ident", plot_convergence = T)
    • 1
    alt

    6.3 查看信息

    harmony_embeddings <- Embeddings(wb_harmony, 'harmony')
    harmony_embeddings[1:5, 1:5]
    • 1
    alt

    6.4 可视化-harmony

    harmony合并后。

    p1 <- DimPlot(object = wb_harmony, reduction = "harmony", pt.size = .1, group.by = "orig.ident") + 
    scale_color_npg()+
    NoLegend()

    p2 <- VlnPlot(object = wb_harmony, features = "harmony_1", group.by = "orig.ident", pt.size = .1) +
    scale_fill_npg()+
    NoLegend()
    p1 +p2
    • 1
    alt

    6.5 可视化-UMAP

    harmony合并后。

    wb_harmony <- SetIdent(wb_harmony,value = "orig.ident")
    DimPlot(wb_harmony,reduction = "umap") +
    scale_color_npg()+
    plot_annotation(title = "10k 3' PBMC and whole blood, after integration (Harmony)")
    • 1
    alt

    7降维与聚类

    7.1 寻找clusters

    wb_harmony <- wb_harmony %>% 
    RunUMAP(reduction = "harmony", dims = 1:30, verbose = F) %>%
    FindNeighbors(reduction = "harmony", k.param = 10, dims = 1:30) %>%
    FindClusters() %>%
    identity()
    • 1
    alt

    wb_harmony <- SetIdent(wb_harmony,value = "seurat_clusters")

    ncluster <- length(unique(wb_harmony[[]]$seurat_clusters))

    mycol <- colorRampPalette(brewer.pal(8, "Set2"))(ncluster)

    DimPlot(wb_harmony,label = T,
    cols = mycol, repel = T) +
    NoLegend()
    • 1
    alt

    7.3 具体查看及可视化

    我们看下各个clusters在两个datasets各有多少细胞。

    count_table <- table(wb_harmony@meta.data$seurat_clusters, wb_harmony@meta.data$orig.ident)
    count_table

    #### 可视化
    count_table %>%
    as.data.frame() %>%
    ggbarstats(x = Var2,
    y = Var1,
    counts = Freq)+
    scale_fill_npg()
    • 1
    alt

    西红柿
    最后祝大家早日不卷!~

    需要示例数据的小伙伴,在公众号回复Merge2获取吧!

    点个在看吧各位~ ✐.ɴɪᴄᴇ ᴅᴀʏ 〰

    📍 往期精彩

    📍 🤨 Google | 谷歌翻译崩了我们怎么办!?(附完美解决方案)
    📍 🤩 scRNA-seq | 吐血整理的单细胞入门教程
    📍 🤔 Reticulate | 如何在Rstudio中优雅地调用Python!?
    📍 🤣 NetworkD3 | 让我们一起画个动态的桑基图吧~
    📍 🤩 RColorBrewer | 再多的配色也能轻松搞定!~
    📍 🧐 rms | 批量完成你的线性回归
    📍 🤒 CMplot | 连Nature上的曼哈顿图都卷起来啦
    📍 🤩 CMplot | 完美复刻Nature上的曼哈顿图
    📍 🤠 Network | 高颜值动态网络可视化工具
    📍 🤗 boxjitter | 完美复刻Nature上的高颜值统计图
    📍 🤫 linkET | 完美解决ggcor安装失败方案(附教程)
    📍 ......

    本文由 mdnice 多平台发布

  • 相关阅读:
    性能测试面试问题,一周拿3个offer不嫌多
    如何查看Android 包依赖关系
    Leetcode 754. 到达终点数字
    基于JAVA舞蹈网站计算机毕业设计源码+数据库+lw文档+系统+部署
    聊聊领导力与带团队的那些事
    每日刷题记录 (十三)
    day16-面向对象综合练习(上)
    万字string类总结
    重新理解云原生
    网络安全比赛A模块任务书
  • 原文地址:https://blog.csdn.net/m0_72224305/article/details/127628279