Problem description
I'm following up on this great answer. In that answer, the foo2 function helped the user identify in which unique study either of the two other selected columns (group & outcome) is constant or varies.
Now, imagine we additionally want to identify whether there are unique study values in which one of the selected variables (outcome) is constant for some rows of any other selected variable (group).
For example, in study==14, outcome is constant for some rows of group. But in study==8, outcome differs across all rows of group:
   study group outcome
17    14     1       6
18    14     2       6
19    14     3       7
20    14     4       7

   study group outcome
9      8     1       2
10     8     2       3
11     8     3       4
Is there a way to extend foo2 to additionally identify studies like study==14?
dat = read.csv("https://raw.githubusercontent.com/rnorouzian/s/main/cf.csv")
study8 = subset(dat, study==8)[1:3]; study14 = subset(dat, study==14)[1:3]
Recommended answer
We can use case_when:
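library(dplyr)  # select, group_by, mutate, across, case_when, n_distinct, n
library(rlang)  # ensyms, as_string

foo2 <- function(dat, study_col, ...) {
  dot_cols <- ensyms(...)
  str_cols <- purrr::map_chr(dot_cols, rlang::as_string)
  dat %>%
    dplyr::select({{study_col}}, !!! dot_cols) %>%
    dplyr::group_by({{study_col}}) %>%
    # Code each selected column within a study:
    # 1 = constant, 2 = all rows distinct, 3 = constant for some rows only.
    dplyr::mutate(grp = across(all_of(str_cols), ~ {
      tmp <- n_distinct(.)
      case_when(tmp == 1 ~ 1, tmp == n() ~ 2, tmp > 1 & tmp < n() ~ 3, TRUE ~ 4)
    }) %>%
      # Paste the per-column codes into one label per study.
      purrr::reduce(stringr::str_c, collapse = "")) %>%
    dplyr::ungroup(.) %>%
    # Return one data frame per distinct label.
    dplyr::group_split(grp, .keep = FALSE)
}

foo2(dat, study, group, outcome)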
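To see what the codes mean without downloading the CSV, here is a minimal base R sketch of the same classification rule, applied to the two studies printed in the question. The names toy, code_for, codes, and partly are made up for illustration and are not part of the answer's function. The sketch assumes the labels are simply the per-column codes pasted together in the order group, outcome.

```r
# Toy data copied from the two printouts in the question (studies 14 and 8).
toy <- data.frame(
  study   = c(14, 14, 14, 14, 8, 8, 8),
  group   = c(1, 2, 3, 4, 1, 2, 3),
  outcome = c(6, 6, 7, 7, 2, 3, 4)
)

# Hypothetical helper mirroring the case_when rule:
# 1 = constant, 2 = all rows distinct, 3 = constant for some rows only.
code_for <- function(x) {
  k <- length(unique(x))
  if (k == 1) 1 else if (k == length(x)) 2 else 3
}

# One label per study: the code for group followed by the code for outcome.
codes <- sapply(
  split(toy[c("group", "outcome")], toy$study),
  function(d) paste0(code_for(d$group), code_for(d$outcome))
)
print(codes)

# Studies where outcome (second character of the label) is only partly constant.
partly <- names(codes)[substr(codes, 2, 2) == "3"]
print(partly)
```

On this toy data, study 14 gets the label 23 (group all distinct, outcome constant for some rows) and study 8 gets 22, so only study 14 is picked out by the last step.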