Problem description
I seem to have stumbled upon a mutate/lag/ifelse behaviour that I cannot explain. I have the following (simplified) dataframe:
    test <- data.frame(type = c("START", "END", "START", "START", "START", "START", "START", "END"),
                       stringsAsFactors = FALSE)
    > test
       type
    1 START
    2   END
    3 START
    4 START
    5 START
    6 START
    7 START
    8   END
I would like to modify the column type in order to have a sequence of alternating START and END pairs (note that in the test dataframe only sequences of START are possible, END is never repeated):
    > desired
       type
    1 START
    2   END
    3 START
    4   END
    5 START
    6   END
    7 START
    8   END
I thought I could achieve my goal with the following code:
    library(dplyr)  # provides %>%, mutate(), lag() and lead()

    test %>%
      mutate(type = ifelse(type == "START" &
                             dplyr::lag(type, n = 1, default = "END") == "START" &
                             dplyr::lead(type, n = 1, default = "END") == "START",
                           "END", type))
The code should detect rows in which START is preceded by a START and followed by a START, in which case the type value is changed to END. After this change, the following START (row number 5 of test) should not be matched, since its previous type value is now END. Unfortunately, the output of the command is the following:
       type
    1 START
    2   END
    3 START
    4   END
    5   END
    6   END
    7 START
    8   END
It's like the value seen by lag is not affected by mutate. Is this how it is supposed to work? Is there a way to code it in a way that lag sees the effects of mutate on the previous row?
Versions: R version 3.2.3 (2015-12-10), dplyr_0.4.3
UPDATE: The reason why the above code doesn't work is explained by Paul Rougieux below: lead and lag are fixed and do not take into account further modification. So I guess the correct answer is "it cannot be done straightforwardly using dplyr".
Defining lag and lead variables separately in mutate() will show you that your call to ifelse(type == "START" & lag == "START" & lead == "START", "END" , type) is not going to work:
    library(dplyr)  # provides %>%, mutate(), lag() and lead()

    test <- data.frame(type = c("START", "END", "START", "START", "START", "START", "END"),
                       stringsAsFactors = FALSE)
    test %>%
      mutate(lag = dplyr::lag(type, n = 1, default = "END"),
             lead = dplyr::lead(type, n = 1, default = "END"),
             type2 = ifelse(type == "START" & lag == "START" & lead == "START",
                            "END", type))
    #    type   lag  lead type2
    # 1 START   END   END START
    # 2   END START START   END
    # 3 START   END START START
    # 4 START START START   END
    # 5 START START START   END
    # 6 START START   END START
    # 7   END START   END   END
dplyr::mutate() modifies the vector as a whole. Lead and lag are fixed and do not take into account further modification to the type vector. What you want is a `Reduce()` function in this case. Check help(Reduce).
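The sequential dependence described above can be sketched with base R's Reduce(). This is only an illustration under stated assumptions: the helper name alternate_starts and its internal names (nxt, step) are made up for this example, the input is the 8-row type vector printed in the question, and the rule is the one from the question's ifelse(), except that the previous value is taken from the already-updated result rather than from the original vector.

```r
# Hypothetical helper (not part of dplyr or base R): walks the vector from
# top to bottom so that each decision sees the already-updated previous value.
alternate_starts <- function(type, default = "END") {
  # Original next value for each position, padded like lead(default = "END")
  nxt <- c(type[-1], default)

  # prev_new is the updated value of the previous row, i is the current index
  step <- function(prev_new, i) {
    if (type[i] == "START" && prev_new == "START" && nxt[i] == "START") {
      "END"
    } else {
      type[i]
    }
  }

  # accumulate = TRUE keeps every intermediate result; the first element
  # is the initial value, so it is dropped
  out <- Reduce(step, seq_along(type), accumulate = TRUE, init = default)
  unlist(out)[-1]
}

type <- c("START", "END", "START", "START", "START", "START", "START", "END")
new_type <- alternate_starts(type)
print(new_type)
# "START" "END" "START" "END" "START" "END" "START" "END"
```

If this behaves as intended on your data, it could be used inside the pipeline as mutate(type = alternate_starts(type)). Note that with the 7-element vector used in the answer above, the run of four START values cannot fully alternate, so the sixth element stays START.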