扩增子系之绘制物种分类堆叠图
发表于:2021-09-27 | 分类: 可视化
字数统计: 2.5k | 阅读时长: 13分钟 | 阅读量:

为何要手绘堆叠图?
QIIME2虽然可以生成堆叠图,然而图片只能在网页上查看,无法下载。此外,QIIME2的图有几分丑,且不可个性化修改,因此,手绘最靠谱。手绘当然不是用手画,而是用手敲代码。

输入文件

  • 各级别的物种分布文件,可以直接从QIIME2 生成的taxa_barplot.qzv文件提交至 https://view.qiime2.org ,随后从网页左上角“CSV”按钮处导出各分类级别的丰度表,但是要删除风度表最后几列的元数据信息,只保留丰度信息,各列之间以逗号分隔,内容如下所示,该文件是LEVEL-2(Phylum)水平的物种丰度分布:
1
2
3
4
5
6
7
8
9
10
index,Bacteria;Chloroflexi,Bacteria;Planctomycetota,Archaea;Nanoarchaeota,Bacteria;Patescibacteria,Bacteria;Proteobacteria,Bacteria;Actinobacteriota,Archaea;Aenigmarchaeota,Bacteria;Acidobacteriota,Archaea;Altiarchaeota,Bacteria;Firmicutes,Bacteria;Bdellovibrionota,Bacteria;Dependentiae,Bacteria;Spirochaetota,Bacteria;Margulisbacteria,Bacteria;Myxococcota,Archaea;Halobacterota,Archaea;Crenarchaeota,Bacteria;Latescibacterota,Bacteria;NB1-j,Bacteria;Desulfobacterota,Bacteria;Verrucomicrobiota,Archaea;Iainarchaeota,Bacteria;DTB120,Bacteria;Elusimicrobiota,Eukaryota;SAR,Archaea;Asgardarchaeota,Archaea;Euryarchaeota,Bacteria;CK-2C2-2,Bacteria;Sva0485,Archaea;Thermoplasmatota,Bacteria;Bacteroidota,Bacteria;Marinimicrobia (SAR406 clade),Bacteria;MBNT15,Bacteria;Nitrospinota,Bacteria;Zixibacteria,Bacteria;Nitrospirota,Archaea;Micrarchaeota,Bacteria;LCP-89,Bacteria;Armatimonadota,Bacteria;WOR-1,Bacteria;Gemmatimonadota,Bacteria;Fibrobacterota,Bacteria;Calditrichota,Bacteria;Sumerlaeota,Bacteria;Aerophobota,Bacteria;WS1,Bacteria;NKB15,Bacteria;SAR324 clade(Marine group B),Bacteria;Dadabacteria,Bacteria;BHI80-139,Bacteria;Desantisbacteria,Bacteria;Schekmanbacteria,Archaea;Hadarchaeota,Archaea;Hydrothermarchaeota,Bacteria;Caldatribacteriota,Bacteria;TA06,Bacteria;WS2,Eukaryota;Amorphea,Bacteria;Cyanobacteria,Bacteria;Hydrogenedentes,Bacteria;Cloacimonadota,Bacteria;WPS-2,Bacteria;Modulibacteria,Eukaryota;Archaeplastida,Bacteria;10bav-F6,Bacteria;Methylomirabilota,Bacteria;Deferrisomatota,Bacteria;Fusobacteriota,Bacteria;Fermentibacterota,Bacteria;Entotheonellaeota,Eukaryota;Discoba,Bacteria;Campylobacterota,Bacteria;PAUC34f,Bacteria;Poribacteria,Bacteria;AncK6,Bacteria;GN01,Bacteria;Acetothermia,Bacteria;FCPU426
B01,383,1058,157,274,7755,1317,0,1025,0,39,22,16,0,0,71,0,6841,20,633,49,145,0,0,0,247,13,0,0,0,13,1012,0,0,53,0,137,0,0,39,0,271,7,17,4,0,0,0,7,471,0,0,0,0,0,3,0,0,69,3,53,0,0,0,0,0,46,0,0,0,0,0,0,0,0,0,0,0,0
B02,225,507,98,27,3279,228,0,522,0,10,4,0,0,0,25,0,7995,8,336,6,54,0,0,0,9,4,0,0,0,0,259,0,12,65,0,29,0,0,0,0,163,0,16,0,0,0,0,22,106,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,71,0,0,0,0,0,0,0,0,0,0,0,0
B03,196,360,246,38,2605,119,0,587,0,8,5,0,0,0,172,0,11619,16,370,11,0,3,0,0,13,0,0,0,0,0,206,5,0,83,8,67,0,0,0,0,130,0,5,0,0,0,0,27,136,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,120,0,0,0,0,0,0,0,0,0,0,0,0
B04,437,922,636,61,4209,195,0,1051,0,9,5,6,0,0,274,0,14573,22,628,5,88,9,0,0,17,8,0,0,0,26,271,14,0,129,7,95,0,0,0,0,219,0,20,0,0,0,0,84,145,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,467,0,0,0,5,0,0,12,0,0,0,0,0
B05,240,1159,187,26,2929,149,0,778,0,8,0,0,0,0,204,0,10918,26,473,0,20,0,0,7,10,8,0,0,0,0,193,9,0,209,0,96,0,0,2,0,244,3,19,0,0,0,0,131,125,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,532,0,0,0,12,0,0,0,0,0,0,0,0
B06,211,2003,1380,168,2469,188,12,617,3,13,4,15,7,2,196,0,7802,16,383,47,94,0,0,22,7,43,25,0,0,84,223,33,0,244,5,112,0,0,0,0,204,8,28,8,0,0,0,69,105,0,5,0,0,8,0,0,0,0,0,0,0,0,0,0,0,639,0,0,0,9,0,0,0,0,0,0,0,0
B07,231,2607,4129,694,2667,231,20,399,12,36,10,21,37,0,156,0,5611,47,453,76,93,112,0,16,21,187,53,0,0,267,196,73,0,106,0,74,0,0,0,24,169,8,37,6,6,0,5,82,118,0,7,0,0,0,0,0,0,4,0,0,0,0,0,0,0,399,0,0,0,8,0,0,0,0,0,0,0,0
B08,228,2573,4942,604,2084,150,51,507,7,14,0,22,32,2,109,0,6026,28,269,58,156,94,0,31,7,193,21,0,0,264,220,11,4,212,0,67,4,0,0,65,113,3,26,0,0,0,0,81,108,0,0,5,0,0,0,0,0,0,0,13,0,0,0,0,0,468,0,0,0,21,0,0,7,0,0,0,0,0
B09,606,3235,2950,1247,1922,696,52,487,16,21,20,84,12,0,107,0,1559,80,341,478,198,148,0,34,63,131,6,3,9,144,141,30,6,106,13,87,42,0,4,0,205,6,76,5,8,11,8,142,70,9,0,3,0,15,0,0,0,0,0,7,0,0,0,0,0,508,0,0,0,18,0,5,0,0,0,0,0,0
  • 样本元数据,描述各样本特征的文件,以制表符分隔各列,内容如下:
1
2
3
4
5
6
7
8
9
10
ID	Samples	Layer	Depth	Location	Position	Site
B01 B01 0-1 5 Right Downstream B
B02 B02 1-2 5 Right Downstream B
B03 B03 2-3 5 Right Downstream B
B04 B04 3-4 5 Right Downstream B
B05 B05 4-5 5 Right Downstream B
B06 B06 5-6 10 Right Downstream B
B07 B07 6-7 10 Right Downstream B
B08 B08 7-8 10 Right Downstream B
B09 B09 8-9 10 Right Downstream B

代码

建议在Rstudio中运行。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
# 首先需要安装依赖包“amplicon”,如果没有安装“devtools”需要先安装,命令如下:
if (!requireNamespace("devtools", quietly=TRUE))
install.packages("devtools")

library(devtools)

# 安装依赖包“amplicon”,我对原作者的程序做了些许的优化,所以,可以从我的GitHub进行安装,命令如下:
if (!requireNamespace("amplicon", quietly=TRUE))
install_github("liaochenlanruo/amplicon")
suppressWarnings(suppressMessages(library(amplicon)))

# 设置工作目录,视自己的具体情况而定,工作目录中当包含输入文件:
setwd("E:/Researches/Xiaqian/NGS/CleanData/ALL/细菌V6V8-1/Analysis_20210624/barplot")

# 从文件读取元数据和物种注释
metadata=read.table("sample-metadata.tsv", header=T, row.names=1, sep="\t", comment.char="", stringsAsFactors=F)

taxonomy=read.table("taxonomy.tsv", header=T, row.names=1, sep="\t", comment.char="", stringsAsFactors=F)


# 绘制堆叠柱状图,颜色自定义。界水平物种组成表和元数据作为输入,分组列名为Group,显示前4个分类,其余归类为其他(Other),按丰度排序。
otutab1=read.table("level-1.csv", header=T, row.names=1, sep=",", comment.char="", stringsAsFactors=F)

otutab1=t(otutab1)

#按照站位分组,通过“groupID”参数设置分组信息,这里我选择根据站位(Site)分组,用户当根据自己的实际情况来设置。
(ds=tax_stackplot(otutab1, metadata, groupID="Site", topN=4, style="sample", sorted="abundance"))

ds2 = ds + scale_fill_manual(values = rev(c("#9467bd", "#ff7f0e", "#1f77b4", "#17becf"))) + theme(axis.text.x = element_text(angle = 90, hjust = 0.5, vjust = 0.5))

# 出图
ds2

#按照深度分组
(dd=tax_stackplot(otutab1, metadata, groupID="Depth", topN=4, style="sample", sorted="abundance"))

dd2 = dd + scale_fill_manual(values = rev(c("#9467bd", "#ff7f0e", "#1f77b4", "#17becf"))) + theme(axis.text.x = element_text(angle = 90, hjust = 0.5, vjust = 0.5))

dd2

# 门水平物种组成表和元数据作为输入,分组列名为Group,显示前19个分类,按丰度排序
otutab2=read.table("level-2.csv", header=T, row.names=1, sep=",", comment.char="", stringsAsFactors=F)

otutab2=t(otutab2)

#按照站位分组
(ps=tax_stackplot(otutab2, metadata, groupID="Site", topN=20, style="sample", sorted="abundance"))

ps2 = ps + scale_fill_manual(values = rev(c("#1f77b4", "#aec7e8", "#ff7f0e", "#ffbb78", "#2ca02c", "#98df8a", "#d62728", "#ff9896", "#9467bd", "#c5b0d5", "#8c564b", "#c49c94", "#e377c2", "#f7b6d2", "#7f7f7f", "#c7c7c7", "#bcbd22", "#dbdb8d", "#17becf", "#9edae5"))) + theme(axis.text.x = element_text(angle = 90, hjust = 0.5, vjust = 0.5))

ps2

#按照深度分组
(pd=tax_stackplot(otutab2, metadata, groupID="Depth", topN=20, style="sample", sorted="abundance"))

pd2 = pd + scale_fill_manual(values = rev(c("#1f77b4", "#aec7e8", "#ff7f0e", "#ffbb78", "#2ca02c", "#98df8a", "#d62728", "#ff9896", "#9467bd", "#c5b0d5", "#8c564b", "#c49c94", "#e377c2", "#f7b6d2", "#7f7f7f", "#c7c7c7", "#bcbd22", "#dbdb8d", "#17becf", "#9edae5"))) + theme(axis.text.x = element_text(angle = 90, hjust = 0.5, vjust = 0.5))

pd2

# 纲水平物种组成表和元数据作为输入,分组列名为Group,显示前19个分类,按丰度排序
otutab3=read.table("level-3.csv", header=T, row.names=1, sep=",", comment.char="", stringsAsFactors=F, quote="\"")

otutab3=t(otutab3)

#按照站位分组
(cs=tax_stackplot(otutab3, metadata, groupID="Site", topN=20, style="sample", sorted="abundance"))

cs2 = cs + scale_fill_manual(values = rev(c("#1f77b4", "#aec7e8", "#ff7f0e", "#ffbb78", "#2ca02c", "#98df8a", "#d62728", "#ff9896", "#9467bd", "#c5b0d5", "#8c564b", "#c49c94", "#e377c2", "#f7b6d2", "#7f7f7f", "#c7c7c7", "#bcbd22", "#dbdb8d", "#17becf", "#9edae5"))) + theme(axis.text.x = element_text(angle = 90, hjust = 0.5, vjust = 0.5))

cs2

#按照深度分组
(cd=tax_stackplot(otutab3, metadata, groupID="Depth", topN=20, style="sample", sorted="abundance"))

cd2 = cd + scale_fill_manual(values = rev(c("#1f77b4", "#aec7e8", "#ff7f0e", "#ffbb78", "#2ca02c", "#98df8a", "#d62728", "#ff9896", "#9467bd", "#c5b0d5", "#8c564b", "#c49c94", "#e377c2", "#f7b6d2", "#7f7f7f", "#c7c7c7", "#bcbd22", "#dbdb8d", "#17becf", "#9edae5"))) + theme(axis.text.x = element_text(angle = 90, hjust = 0.5, vjust = 0.5))

cd2

# 属水平物种组成表和元数据作为输入,分组列名为Site,显示前19个分类,按丰度排序
otutab6=read.table("level-6.csv", header=T, row.names=1, sep=",", comment.char="", stringsAsFactors=F)

otutab6=t(otutab6)

#按照站位分组
(gs=tax_stackplot(otutab6, metadata, groupID="Site", topN=20, style="sample", sorted="abundance"))

gs2 = gs + scale_fill_manual(values = rev(c("#1f77b4", "#aec7e8", "#ff7f0e", "#ffbb78", "#2ca02c", "#98df8a", "#d62728", "#ff9896", "#9467bd", "#c5b0d5", "#8c564b", "#c49c94", "#e377c2", "#f7b6d2", "#7f7f7f", "#c7c7c7", "#bcbd22", "#dbdb8d", "#17becf", "#9edae5"))) + theme(axis.text.x = element_text(angle = 90, hjust = 0.5, vjust = 0.5))

gs2

#按照深度分组
(gd=tax_stackplot(otutab6, metadata, groupID="Depth", topN=20, style="sample", sorted="abundance"))

gd2 = gd + scale_fill_manual(values = rev(c("#1f77b4", "#aec7e8", "#ff7f0e", "#ffbb78", "#2ca02c", "#98df8a", "#d62728", "#ff9896", "#9467bd", "#c5b0d5", "#8c564b", "#c49c94", "#e377c2", "#f7b6d2", "#7f7f7f", "#c7c7c7", "#bcbd22", "#dbdb8d", "#17becf", "#9edae5"))) + theme(axis.text.x = element_text(angle = 90, hjust = 0.5, vjust = 0.5))

gd2

最后的图可以在Rstudio中手动导出。

敬告:使用文中脚本请引用本文网址,请尊重本人的劳动成果,谢谢!Notice: When you use the scripts in this article, please cite the link of this webpage. Thank you!

上一篇:
Perl处理可恶的Windows换行符
下一篇:
扩增子可视化专题