How to calculate the 95th percentile of values by a grouping variable in R or Excel


Problem description

I'm trying to calculate the 95th percentile for multiple water quality values grouped by watershed. For example...

Watershed   WQ
50500101    62.370661
50500101    65.505046
50500101    58.741477
50500105    71.220034
50500105    57.917249

I reviewed this question: "Percentile for Each Observation w/r/t Grouping Variable". It seems very close to what I want to do, but it's for EACH observation. I need it for each grouping variable. So ideally,

Watershed   WQ - 95th
50500101    x
50500105    y

thanks

Solution

This can be achieved using the plyr library. We specify the grouping variable Watershed and ask for the 95% quantile of WQ.

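library(plyr)
# Fix the random seed so the sample data is reproducible
set.seed(42)
# Sample data: 100 observations split at random between watersheds "a" and "b"
dat <- data.frame(Watershed = sample(letters[1:2], 100, TRUE), WQ = rnorm(100))
# Split by Watershed and compute the 95th percentile of WQ within each group
ddply(dat, "Watershed", summarise, WQ95 = quantile(WQ, .95))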

And the results:

  Watershed     WQ95
1         a 1.353993
2         b 1.461711
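
If you'd rather not load plyr, the same per-group summary can probably be done in base R. The sketch below is not part of the original answer: it uses aggregate() on the five rows from the question, and the names wq and wq95 are just placeholders chosen for illustration.

```r
# Sample rows copied from the question
wq <- data.frame(
  Watershed = c(50500101, 50500101, 50500101, 50500105, 50500105),
  WQ = c(62.370661, 65.505046, 58.741477, 71.220034, 57.917249)
)

# One row per watershed; quantile() uses its default (type 7) interpolation.
# unname() drops the "95%" label so the result is a plain numeric column.
wq95 <- aggregate(WQ ~ Watershed, data = wq,
                  FUN = function(x) unname(quantile(x, probs = 0.95)))
names(wq95)[2] <- "WQ95"
print(wq95)
```

With so few rows per group the value is interpolated between the two largest observations, so treat it as a demonstration rather than a meaningful percentile. R's default type 7 is the same interpolation Excel's PERCENTILE.INC uses, so in Excel an array formula along the lines of =PERCENTILE.INC(IF(A:A=D2, B:B), 0.95) (cell references assumed) should give matching numbers.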
