R ggplot2: geom_boxplot with custom quantiles

Problem description

I have a dataset from 100 simulations of train runs in a network with 4 trains and 6 stations. It records the lateness at arrival of each train at each station. My data looks something like this:

MyData <- data.frame(
  Simulation = rep(sort(rep(1:100, 6)), 4),
  Train_number = sort(rep(c(100, 102, 104, 106), 100*6)),
  Stations = rep(c("ST_1", "ST_2", "ST_3", "ST_4", "ST_5", "ST_6"), 100*4),
  Arrival_Lateness = c(rep(0, 60), rexp(40, 1), rep(0, 60), rexp(40, 2), rep(0, 60), rexp(40, 3), rep(0, 60), rexp(40, 5))
  )

Now I need to create a box plot that looks similar to this:

library(ggplot2)
m <- ggplot(MyData , aes(y = Arrival_Lateness, x = factor(Stations)))
m + geom_boxplot(aes(fill = factor(Train_number)))

But this doesn't work for my data, because geom_boxplot uses the inter-quartile range to define the whiskers. I would like to define my own quantiles for the boxes and whiskers. I found a Stack Overflow post, "Changing whisker definition in geom_boxplot", which partially solves my problem. But when I apply that solution (I modified the code by inserting fill = factor(Train_number) into the aes function), I get this:

f <- function(x) {
  r <- quantile(x, probs = c(0.05, 0.25, 0.5, 0.75, 0.95))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

ggplot(MyData, aes(factor(Stations), Arrival_Lateness, fill = factor(Train_number))) + stat_summary(fun.data = f, geom="boxplot")

This is clearly not what I want. I need the boxes for each train to sit side by side, as in the first image, and not overlap as in the second one. How do I do this?

I would appreciate any help!

Thanks in advance!

Recommended answer

You are so close: just add position="dodge" to the call to stat_summary(...).

ggplot(MyData, aes(factor(Stations), Arrival_Lateness,fill=factor(Train_number))) +
  stat_summary(fun.data = f, geom="boxplot",position="dodge")
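As a quick sanity check of what stat_summary(...) receives for each group, the helper f from the question can be run on a plain vector. This is only a sketch: f is copied from the question, while the input 0:100 and the name res are made up for illustration.

```r
# f is copied from the question: it maps a numeric vector to the five
# values that geom = "boxplot" expects (ymin, lower, middle, upper, ymax).
f <- function(x) {
  r <- quantile(x, probs = c(0.05, 0.25, 0.5, 0.75, 0.95))
  names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
  r
}

# Illustrative input (an assumption, not the question's data): the integers 0..100.
res <- f(0:100)
print(res)  # whiskers at the 5% and 95% quantiles, box from 25% to 75%
```

stat_summary(fun.data = f, ...) applies f separately to the y values of every x/fill group, so each box is built from one such five-number vector.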

ggplot is a fantastic tool, but one of the frustrating things about it is that the defaults differ depending on which function you are using. For geom_boxplot(...) the default position is "dodge" ("dodge2" in recent ggplot2 versions), while for stat_summary(...) the default position is "identity", which is why the boxes were drawn on top of each other.
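If you want to confirm these defaults on your own installation, one way is to read them from the function signatures with base R's formals(). This is a minimal sketch that assumes ggplot2 is installed; the variable names are illustrative, and the exact value reported for geom_boxplot may depend on the ggplot2 version.

```r
library(ggplot2)

# Read the default value of the `position` argument from each signature.
summary_default <- formals(stat_summary)$position
boxplot_default <- formals(geom_boxplot)$position

print(summary_default)  # default position for stat_summary()
print(boxplot_default)  # default position for geom_boxplot(); version dependent
```

Checking the signature this way is handy whenever a layer built with stat_*() does not line up with the equivalent geom_*() call.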
