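I want to reorder the levels of a factor in one column, but separately within the groups defined by a grouping column.
A simple example dataset: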
df <- structure(list(a_factor = structure(1:6, .Label = c("a", "b",
"c", "d", "e", "f"), class = "factor"), group = structure(c(1L,
1L, 1L, 2L, 2L, 2L), .Label = c("group1", "group2"), class = "factor"),
value = 1:6), class = "data.frame", row.names = c(NA, -6L
))
> df
a_factor group value
1 a group1 1
2 b group1 2
3 c group1 3
4 d group2 4
5 e group2 5
6 f group2 6
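More precisely, how can I reorder the factor levels, e.g. descending by value where df$group == "group1", but ascending by value where df$group == "group2", preferably in dplyr? An expected output might be: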
> df
a_factor group value
1 c group1 3
2 b group1 2
3 a group1 1
4 d group2 4
5 e group2 5
6 f group2 6
That said, the question is more generally about how to tackle this kind of problem in dplyr.
Best answer
To reorder the factor levels, you can use forcats (part of the tidyverse) and do something like this...
library(dplyr)    # provides %>% and mutate()
library(forcats)
# sort key: +value in group1, -value in group2
df2 <- df %>% mutate(a_factor = fct_reorder(a_factor,
                                            value * (-1 + 2 * (group == "group1"))))
levels(df2$a_factor)
[1] "f" "e" "d" "a" "b" "c"
This only changes the order of the levels; it does not rearrange the rows of the data frame itself...
df2
a_factor group value
1 a group1 1
2 b group1 2
3 c group1 3
4 d group2 4
5 e group2 5
6 f group2 6
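If the goal is the exact output shown in the question (descending by value in group1, ascending in group2, with the rows sorted as well), a minimal sketch might flip the sign of the key and then sort the rows with arrange(). The column name sort_key and the object df3 are made up for illustration and are not part of the original answer.

```r
library(dplyr)
library(forcats)

# Example data from the question
df <- data.frame(
  a_factor = factor(c("a", "b", "c", "d", "e", "f")),
  group    = factor(rep(c("group1", "group2"), each = 3)),
  value    = 1:6
)

df3 <- df %>%
  # sort_key (hypothetical helper column): -value in group1, +value in group2
  mutate(sort_key = if_else(group == "group1", -value, value),
         a_factor = fct_reorder(a_factor, sort_key)) %>%
  # arrange() sorts a factor by its level order, not alphabetically
  arrange(group, a_factor) %>%
  select(-sort_key)

levels(df3$a_factor)
# [1] "c" "b" "a" "d" "e" "f"
df3
```

Because arrange() follows the level order of a factor, sorting by group and then by a_factor puts the rows in the order c, b, a, d, e, f.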
A similar question about reordering factor levels within groups can be found on Stack Overflow: https://stackoverflow.com/questions/57696639/