Plotting two overlapping density curves with ggplot

Problem description

I have a data frame in R with 104 columns that looks like this:

   id         vcr1       vcr2         vcr3  sim_vcr1  sim_vcr2  sim_vcr3  sim_vcr4  sim_vcr5  sim_vcr6  sim_vcr7
1 2913 -4.782992840  1.7631999  0.003768704  1.376937 -2.096857  6.903021  7.018855  6.135139  3.188382  6.905323
2 1260  0.003768704  3.1577108 -0.758378208  1.376937 -2.096857  6.903021  7.018855  6.135139  3.188382  6.905323
3 2912 -4.782992840  1.7631999  0.003768704  1.376937 -2.096857  6.903021  7.018855  6.135139  3.188382  6.905323
4 2914 -1.311132669  0.8220594  2.372950077 -4.194246 -1.460474 -9.101704 -6.663676 -5.364724 -2.717272 -3.682574
5 2915 -1.311132669  0.8220594  2.372950077 -4.194246 -1.460474 -9.101704 -6.663676 -5.364724 -2.717272 -3.682574
6 1261  2.372950077 -0.7022792 -4.951318264 -4.194246 -1.460474 -9.101704 -6.663676 -5.364724 -2.717272 -3.682574

The "sim_vcr*" variables go all the way through sim_vcr100.

I need two overlapping density curves in a single plot, looking something like this (except here you see 5 instead of 2):

One density curve should be built from all values in columns vcr1, vcr2, and vcr3. The other should be built from all values in the sim_vcr* columns (100 columns, sim_vcr1 through sim_vcr100).

Because the two curves overlap, they need to be transparent, like in the attached image. I know there is a pretty straightforward way to do this with ggplot, but I am having trouble with the syntax, as well as with getting my data frame oriented correctly so that each histogram pulls from the proper columns.

Any help is greatly appreciated.

Recommended answer

With df being the data you mentioned in your post, you can try this:

Split the data frame into the two column groups with the following code, then plot each group:

library(tidyverse)
# Column indices for each group (base R startsWith(), no extra package needed)
i1 <- which(startsWith(names(df), 'vcr'))
i2 <- which(startsWith(names(df), 'sim'))
# Keep id plus the columns of each group
df1 <- df[, c(1, i1)]
df2 <- df[, c(1, i2)]
# Reshape to long format: one row per id/column pair
M1 <- pivot_longer(df1, cols = names(df1)[-1])
M2 <- pivot_longer(df2, cols = names(df2)[-1])
# Plot 1: one density curve per vcr column
ggplot(M1) + geom_density(aes(x = value, fill = name), alpha = .5)
# Plot 2: one density curve per sim_vcr column
ggplot(M2) + geom_density(aes(x = value, fill = name), alpha = .5)
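To make the reshaping step concrete, here is a minimal sketch of what pivot_longer() returns for the vcr columns. The data frame toy and its rounded values are made up for illustration and only mimic the layout shown in the question.

```r
library(tidyr)

# Toy data frame (hypothetical values), same column layout as in the question
toy <- data.frame(
  id   = c(2913, 1260),
  vcr1 = c(-4.78, 0.004),
  vcr2 = c(1.76, 3.16),
  vcr3 = c(0.004, -0.76)
)

# One row per (id, column) pair: the former column names land in `name`,
# the cell values in `value` (the pivot_longer() defaults)
toy_long <- pivot_longer(toy, cols = -id)
print(toy_long)
```

Because the plots above map fill to name, they draw one curve per original column, which is why a separate grouping variable is needed to get just two pooled curves.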

Update

To get both curves in a single plot, use the following code:

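# Single plot
# Reshape every column except id to long format
M <- pivot_longer(df, cols = names(df)[-1])
# Label each row by the group its source column belongs to
M$var <- ifelse(startsWith(M$name, 'vcr'), 'vcr', 'sim_vcr')
# Plot 3: one pooled density curve per group
ggplot(M) + geom_density(aes(x = value, fill = var), alpha = .5)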
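For anyone who wants to try the single-plot approach without the original data, the following is a self-contained sketch on simulated values. The name df_demo, the 50 rows, the 10 (rather than 100) sim_vcr columns, and the normal distributions are all assumptions made for this demo, not properties of the asker's data.

```r
library(tidyr)
library(ggplot2)

set.seed(1)  # make the random demo data reproducible
n <- 50      # hypothetical number of rows

# Stand-in for df: id, vcr1..vcr3, and only 10 sim_vcr columns
df_demo <- data.frame(id = seq_len(n))
for (j in 1:3)  df_demo[[paste0("vcr", j)]]     <- rnorm(n, mean = 0, sd = 3)
for (j in 1:10) df_demo[[paste0("sim_vcr", j)]] <- rnorm(n, mean = 1, sd = 5)

# Long format, then tag each row with the group of its source column
long <- pivot_longer(df_demo, cols = -id)
long$var <- ifelse(startsWith(long$name, "vcr"), "vcr", "sim_vcr")

# Two semi-transparent density curves, one per group
p <- ggplot(long, aes(x = value, fill = var)) +
  geom_density(alpha = 0.5) +
  labs(x = "Value", y = "Density", fill = "Source")
print(p)
```

Each density is normalised on its own, so the two curves stay comparable even though the sim_vcr group pools many more values than the vcr group.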
