Problem description
I have RNA-seq data from several time points after a treatment. Part of the table is shown below.
> View(cluster2)
> cluster2
rownames Sample expression
21 gene1 Sample1 -0.71692047
95 gene2 Sample1 -1.60358087
112 gene3 Sample1 0.29476156
113 gene4 Sample1 0.52390367
136 gene5 Sample1 -0.47093500
148 gene6 Sample1 -0.99902406
151 gene7 Sample1 -0.77891900
229 gene8 Sample1 -1.03649513
252 gene9 Sample1 -1.06392805
260 gene10 Sample1 -1.04305028
14932 gene1 Sample2 0.11824518
15006 gene2 Sample2 -0.06375086
15023 gene3 Sample2 -0.15769900
15024 gene4 Sample2 -0.94928544
15047 gene5 Sample2 -0.41254223
15059 gene6 Sample2 -0.45855777
15062 gene7 Sample2 -0.36056022
15140 gene8 Sample2 0.45096154
15163 gene9 Sample2 0.67248080
15171 gene10 Sample2 -0.59566009
29843 gene1 Sample3 0.29759959
29917 gene2 Sample3 0.48258443
29934 gene3 Sample3 -0.40674145
29935 gene4 Sample3 -1.03206336
29958 gene5 Sample3 -0.37866722
29970 gene6 Sample3 -0.37689157
29973 gene7 Sample3 -0.35649119
30051 gene8 Sample3 -0.31226370
30074 gene9 Sample3 -0.49334391
30082 gene10 Sample3 -0.36080332
44754 gene1 Sample4 0.78247333
44828 gene2 Sample4 1.64665427
44845 gene3 Sample4 1.72461980
44846 gene4 Sample4 0.12393858
44869 gene5 Sample4 0.30088996
44881 gene6 Sample4 1.73211193
44884 gene7 Sample4 0.39511615
44962 gene8 Sample4 1.69006925
44985 gene9 Sample4 0.94181113
44993 gene10 Sample4 -0.34747890
59665 gene1 Sample5 1.93571973
59739 gene2 Sample5 0.91504315
59756 gene3 Sample5 1.17766958
59757 gene4 Sample5 1.99293585
59780 gene5 Sample5 2.38539543
59792 gene6 Sample5 1.21697049
59795 gene7 Sample5 2.33208184
59873 gene8 Sample5 1.15438869
59896 gene9 Sample5 1.22935604
59904 gene10 Sample5 1.85440229
74576 gene1 Sample6 -0.58694546
74650 gene2 Sample6 -0.54178347
74667 gene3 Sample6 -0.70252704
74668 gene4 Sample6 0.41926725
74691 gene5 Sample6 -0.40225920
74703 gene6 Sample6 0.33670711
74706 gene7 Sample6 -0.27067586
74784 gene8 Sample6 -0.84741340
74807 gene9 Sample6 -1.48216198
74815 gene10 Sample6 1.23328639
89487 gene1 Sample7 -0.86542373
89561 gene2 Sample7 -0.40143953
89578 gene3 Sample7 -1.01716492
89579 gene4 Sample7 -0.62448087
89602 gene5 Sample7 -0.50543855
89614 gene6 Sample7 -0.69509192
89617 gene7 Sample7 -0.53891822
89695 gene8 Sample7 -0.78792371
89718 gene9 Sample7 -0.43037957
89726 gene10 Sample7 -0.56034284
104398 gene1 Sample8 -0.96474816
104472 gene2 Sample8 -0.43372711
104489 gene3 Sample8 -0.91291852
104490 gene4 Sample8 -0.45421567
104513 gene5 Sample8 -0.51644320
104525 gene6 Sample8 -0.75622422
104528 gene7 Sample8 -0.42163350
104606 gene8 Sample8 -0.31132355
104629 gene9 Sample8 0.62616555
104637 gene10 Sample8 -0.18035324
The idea is to plot genes that share the same expression pattern, so I looked in the literature and found this great representation in Nature: https://www.researchgate.net/figure/Pseudotime-ordering-of-cells-reveals-genes-activated-or-repressed-early-in_fig2_261034077
I clustered the gene expression values and obtained these patterns, but now I would like to produce the smooth representation shown in that paper. I tried many things with ggplot2, but nothing seems to work!
So if anyone has an idea :)
What I tried:
library(ggplot2)
ti<-ggplot(cluster2) + aes(x=as.factor(cluster2$Sample), y=expression, group=rownames) +
geom_line(size=0.7, aes(color=rownames), alpha=0.5) +
theme(legend.position="none")
ti
This gives me the curves, one line per gene.
ti<-ggplot(cluster2) + aes(x=as.factor(cluster2$Sample), y=expression, group=factor(rownames), colour="black") + geom_line(size=0.7, aes(color=rownames), alpha=0.5) + theme(legend.position="none") + geom_contour()
ti
This failed because geom_contour() needs a "z" value.
ti<-ggplot(cluster2) + aes(x=as.factor(cluster2$Sample), y=expression, group=rownames) +
geom_line(size=0.7, aes(color=rownames), alpha=0.5) +
geom_density2d() +
theme(legend.position="none")
ti
This doesn't seem to work either.
If this is an experiment with samples taken over a time course, I would use geom_line for the individual genes and geom_smooth as a trend line.
library(ggplot2)

# Extract the numeric time point from the sample name ("Sample3" -> 3)
cluster2$TimePoint <- as.numeric(sub("Sample", "", cluster2$Sample))

ggplot(cluster2, aes(TimePoint, expression)) +
  geom_hline(yintercept = 0, linetype = 2, color = "red") +
  # One line per gene
  geom_line(aes(group = rownames), linewidth = 0.5, alpha = 0.3, color = "blue") +
  # Trend line across all genes
  # (linewidth needs ggplot2 >= 3.4.0; use size on older versions)
  geom_smooth(linewidth = 2, se = FALSE, color = "orange") +
  scale_x_continuous(breaks = unique(cluster2$TimePoint)) +
  theme_classic()
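For readers without the full table, the same idea can be tried on simulated data. The sketch below is only an illustration: the data frame `toy`, its values, and the choice of `span = 0.75` are assumptions, not the asker's data or settings. It names the loess smoother explicitly, adds the confidence band (`se = TRUE`) to get closer to the look of the linked figure, and fits the same kind of curve with `stats::loess` so the smoothed values can be inspected.

```r
library(ggplot2)

set.seed(1)  # simulated data only; not the values from the question

# Hypothetical long-format table with the same columns as cluster2
genes <- paste0("gene", 1:10)
toy <- expand.grid(rownames = genes, TimePoint = 1:8,
                   stringsAsFactors = FALSE)
# Shared pattern peaking around time point 5, plus gene-level noise
toy$expression <- dnorm(toy$TimePoint, mean = 5, sd = 1) * 5 - 1 +
  rnorm(nrow(toy), sd = 0.4)

p <- ggplot(toy, aes(TimePoint, expression)) +
  geom_line(aes(group = rownames), alpha = 0.3, color = "blue") +
  # One smoother over all genes: no group aesthetic on this layer
  geom_smooth(method = "loess", formula = y ~ x, span = 0.75,
              se = TRUE, color = "orange") +
  scale_x_continuous(breaks = unique(toy$TimePoint)) +
  theme_classic()

# A loess fit made directly, to inspect the smoothed values
fit <- loess(expression ~ TimePoint, data = toy, span = 0.75)
trend <- data.frame(TimePoint = 1:8)
trend$smoothed <- predict(fit, newdata = trend)
```

Printing `p` draws the plot, and `trend` holds one smoothed value per time point. A smaller `span` follows the data more closely, while a larger one gives a flatter curve.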
Edit: here is another, similar way to plot such data. Point color shows whether the expression value is above 0.
# Uses the TimePoint column created in the previous snippet
ggplot(cluster2, aes(TimePoint, expression)) +
  geom_hline(yintercept = 0, linetype = 2, color = "grey") +
  geom_line(aes(group = rownames), linewidth = 0.5, alpha = 0.5, color = "grey90") +
  # Blue for expression <= 0, red for expression > 0
  geom_point(alpha = 0.3, aes(color = expression > 0)) +
  geom_smooth(linewidth = 2, se = FALSE, color = "orange") +
  scale_x_continuous(breaks = unique(cluster2$TimePoint)) +
  scale_y_continuous(limits = c(-3, 3)) +
  # guide = "none" replaces the deprecated guide = FALSE
  scale_color_manual(values = c("blue", "red"), guide = "none") +
  labs(title = "Expression change in cluster2",
       x = "Time point",
       y = "Expression") +
  theme_classic()
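If several clusters should appear side by side, as in the figure linked in the question, one option is `facet_wrap`. The sketch below runs on simulated data and assumes a hypothetical `cluster` column, which the table in the question does not have. The cluster labels "early" and "late" are also made up for illustration.

```r
library(ggplot2)

set.seed(2)  # simulated data only

# Hypothetical table: two clusters of five genes, eight time points.
# The `cluster` column is an assumption; cluster2 has no such column.
toy <- expand.grid(rownames = paste0("gene", 1:10), TimePoint = 1:8,
                   stringsAsFactors = FALSE)
toy$cluster <- ifelse(toy$rownames %in% paste0("gene", 1:5),
                      "early", "late")
peak <- ifelse(toy$cluster == "early", 3, 6)
toy$expression <- 2 * exp(-(toy$TimePoint - peak)^2 / 2) - 0.5 +
  rnorm(nrow(toy), sd = 0.3)

p_facets <- ggplot(toy, aes(TimePoint, expression)) +
  geom_hline(yintercept = 0, linetype = 2, color = "grey") +
  geom_line(aes(group = rownames), alpha = 0.4, color = "grey60") +
  # The smoother is computed separately within each facet panel
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE,
              color = "orange") +
  facet_wrap(~ cluster) +
  scale_x_continuous(breaks = 1:8) +
  theme_classic()
```

Printing `p_facets` draws one panel per cluster, each with its own trend line. With real data, the cluster assignment from the clustering step would take the place of the simulated `cluster` column.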