I want to reorder the levels of a factor in one column, but within the groups defined by a grouping column.

A simple example dataset:

df <- structure(list(a_factor = structure(1:6, .Label = c("a", "b",
"c", "d", "e", "f"), class = "factor"), group = structure(c(1L,
1L, 1L, 2L, 2L, 2L), .Label = c("group1", "group2"), class = "factor"),
value = 1:6), class = "data.frame", row.names = c(NA, -6L
))

> df
  a_factor  group value
1        a group1     1
2        b group1     2
3        c group1     3
4        d group2     4
5        e group2     5
6        f group2     6

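More specifically, how can I reorder the factor levels, e.g. descending by value where df$group == "group1", but ascending by value where df$group == "group2", preferably in dplyr?

The expected output might be: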
> df
  a_factor  group value
1        c group1     3
2        b group1     2
3        a group1     1
4        d group2     4
5        e group2     5
6        f group2     6

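That said, the question is more generally about how to tackle this in dplyr.

Best answer

To reorder the factor levels, you can use forcats (part of the tidyverse) and do something like this...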

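library(dplyr)    # provides %>% and mutate()
library(forcats)
# the multiplier is +1 for group1 and -1 for every other group
df2 <- df %>% mutate(a_factor = fct_reorder(a_factor,
                                            value*(-1 + 2 * (group=="group1"))))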

levels(df2$a_factor)
[1] "f" "e" "d" "a" "b" "c"

This only changes the level order; it does not rearrange the rows of the data frame itself...
df2
  a_factor  group value
1        a group1     1
2        b group1     2
3        c group1     3
4        d group2     4
5        e group2     5
6        f group2     6
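
Note that the multiplier used above sorts ascending within group1 and descending within group2, which is the reverse of what the question asks for. A minimal sketch of one way to get the expected output, assuming the same example data: flip the sign so that value is negated in group1 only, then sort the rows with arrange(). The names sort_key and df3 are made up for this illustration and are not part of the original answer.

```r
library(dplyr)
library(forcats)

# Same example data as in the question, rebuilt so the snippet is self-contained
df <- data.frame(
  a_factor = factor(c("a", "b", "c", "d", "e", "f")),
  group    = factor(rep(c("group1", "group2"), each = 3)),
  value    = 1:6
)

df3 <- df %>%
  # sort_key (hypothetical helper column): negated in group1 so larger values
  # come first there, unchanged in group2 so smaller values come first
  mutate(sort_key = if_else(group == "group1", -value, value),
         a_factor = fct_reorder(a_factor, sort_key)) %>%
  # arrange() orders a factor column by its level order, not alphabetically
  arrange(group, a_factor) %>%
  select(-sort_key)

levels(df3$a_factor)
# [1] "c" "b" "a" "d" "e" "f"
df3
```

Negating the sort key only works for numeric values; for other column types a different per-group ordering strategy would be needed.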

Source: a similar question on reordering factor levels within groups on Stack Overflow: https://stackoverflow.com/questions/57696639/
