Selecting groups before certain observations in R
Problem description
Suppose we have the following data:
data=structure(list(x1 = c(88L, 88L, 94L, 82L, 68L, 72L, 43L, 84L,
65L, 91L, 65L, 80L, 82L, 63L, 67L, 58L, 100L, 32L, 75L, 66L,
30L, 12L, 97L, 58L, 14L, 64L), group = structure(c(2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("female", "male"), class = "factor")), .Names = c("x1",
"group"), class = "data.frame", row.names = c(NA, -26L))
In this data there is a group variable (sex: male and female). I need to get the mean and the 25th percentile for ALL males that come before the females. Males that come after the females I don't touch, and I don't touch the females either. So the desired output is:
x1 group mean 25%
88 male 76,36 66,5
88 male 76,36 66,5
94 male 76,36 66,5
82 male 76,36 66,5
68 male 76,36 66,5
72 male 76,36 66,5
43 male 76,36 66,5
84 male 76,36 66,5
65 male 76,36 66,5
91 male 76,36 66,5
65 male 76,36 66,5
80 female
82 female
63 female
67 female
58 female
100 female
32 female
75 male
66 male
30 male
12 male
97 male
58 male
14 male
64 male
How can I do this?
x1 group
88 male
88 male
94 male
82 male
68 male
72 male
43 male
84 male
65 male
91 male
65 male
80 female
82 female
63 female
67 female
58 female
100 female
32 female
**76,36 male
**76,36 male
30 male
12 male
**76,36 male
58 male
14 male
64 male
Here is the result.
Recommended answer
library(dplyr)
library(data.table)
data %>%
group_by(group, group2 = rleid(group)) %>% # group by gender and its run position
mutate(MEAN = mean(x1[group=="male" & group2==1]), # calculate metrics only for male in position 1
Q25 = quantile(x1[group=="male" & group2==1], 0.25)) %>%
ungroup() %>% # ungroup
select(-group2) %>% # remove column
data.frame() # only for visualisation purposes
# x1 group MEAN Q25
# 1 88 male 76.36364 66.5
# 2 88 male 76.36364 66.5
# 3 94 male 76.36364 66.5
# 4 82 male 76.36364 66.5
# 5 68 male 76.36364 66.5
# 6 72 male 76.36364 66.5
# 7 43 male 76.36364 66.5
# 8 84 male 76.36364 66.5
# 9 65 male 76.36364 66.5
# 10 91 male 76.36364 66.5
# 11 65 male 76.36364 66.5
# 12 80 female NaN NA
# 13 82 female NaN NA
# 14 63 female NaN NA
# 15 67 female NaN NA
# 16 58 female NaN NA
# 17 100 female NaN NA
# 18 32 female NaN NA
# 19 75 male NaN NA
# 20 66 male NaN NA
# 21 30 male NaN NA
# 22 12 male NaN NA
# 23 97 male NaN NA
# 24 58 male NaN NA
# 25 14 male NaN NA
# 26 64 male NaN NA
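To see what rleid(group) contributes here, the run ids can be reproduced with base R's rle(). This is only an illustrative sketch: the shortened vector g and the name run_id are made up for the example and are not part of the answer.

```r
# Hypothetical shortened version of the group column:
# a block of males, a block of females, then males again
g <- c("male", "male", "female", "female", "female", "male")

# rle() finds runs of identical consecutive values
runs <- rle(g)

# Number each run and repeat that number for the length of the run;
# this mirrors what data.table::rleid(g) returns
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)

print(data.frame(g, run_id))
```

The first block of males gets run id 1 and the males after the females get run id 3, which is why the answer filters on group2 == 1.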
To update the x1 column according to the logic you mentioned, you can use this:
data %>%
group_by(group, group2 = rleid(group)) %>%
mutate(MEAN = mean(x1[group=="male" & group2==1]),
Q25 = quantile(x1[group=="male" & group2==1], 0.25)) %>%
ungroup() %>%
mutate(x1 = ifelse(group=="male" & group2==3 & x1 > unique(Q25[!is.na(Q25)]), unique(MEAN[!is.na(MEAN)]), x1)) %>%
select(-group2) %>%
data.frame()
# x1 group MEAN Q25
# 1 88.00000 male 76.36364 66.5
# 2 88.00000 male 76.36364 66.5
# 3 94.00000 male 76.36364 66.5
# 4 82.00000 male 76.36364 66.5
# 5 68.00000 male 76.36364 66.5
# 6 72.00000 male 76.36364 66.5
# 7 43.00000 male 76.36364 66.5
# 8 84.00000 male 76.36364 66.5
# 9 65.00000 male 76.36364 66.5
# 10 91.00000 male 76.36364 66.5
# 11 65.00000 male 76.36364 66.5
# 12 80.00000 female NaN NA
# 13 82.00000 female NaN NA
# 14 63.00000 female NaN NA
# 15 67.00000 female NaN NA
# 16 58.00000 female NaN NA
# 17 100.00000 female NaN NA
# 18 32.00000 female NaN NA
# 19 76.36364 male NaN NA
# 20 66.00000 male NaN NA
# 21 30.00000 male NaN NA
# 22 12.00000 male NaN NA
# 23 76.36364 male NaN NA
# 24 58.00000 male NaN NA
# 25 14.00000 male NaN NA
# 26 64.00000 male NaN NA
The extra piece of code I added (the second mutate) updates x1 only for males that come after the females (i.e. group2 == 3), and only if x1 is greater than the quantile value.
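The same update can also be sketched in base R by computing the two statistics as plain scalars first, which avoids the unique(...[!is.na(...)]) step. This is a hedged alternative, not the answer's method. It assumes the data starts with the male block, and it treats every male run after the first one as "after the females" (run_id > 1) rather than hard-coding run 3. The names run_id, first_males, later_males, m and q25 are introduced here for illustration.

```r
# Same values as in the question, rebuilt as a plain data frame
data <- data.frame(
  x1 = c(88, 88, 94, 82, 68, 72, 43, 84, 65, 91, 65,
         80, 82, 63, 67, 58, 100, 32,
         75, 66, 30, 12, 97, 58, 14, 64),
  group = rep(c("male", "female", "male"), times = c(11, 7, 8))
)

# Run ids built with base rle(); as.character() guards against factor input
runs <- rle(as.character(data$group))
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)

# Assumption: the first run is the block of males before the females
first_males <- data$group == "male" & run_id == 1
m   <- mean(data$x1[first_males])
q25 <- unname(quantile(data$x1[first_males], 0.25))

# Males in any later run whose x1 exceeds the 25th percentile get the mean
later_males <- data$group == "male" & run_id > 1
data$x1[later_males & data$x1 > q25] <- m

print(data[19:26, ])
```

With the question's data this replaces only the values 75 and 97, the same rows that change in the output above.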