Problem description
Drawing on the discussion on conditional dplyr evaluation, I would like to conditionally execute a step in a pipeline depending on whether the reference column exists in the passed data frame.
The results generated by 1) and 2) should be identical.
# 1)
mtcars %>%
  filter(am == 1) %>%
  filter(cyl == 4)
# 2)
mtcars %>%
  filter(am == 1) %>%
  {
    if("cyl" %in% names(.)) filter(cyl == 4) else .
  }
Unavailable column
# 1)
mtcars %>%
  filter(am == 1)
# 2)
mtcars %>%
  filter(am == 1) %>%
  {
    if("absent_column" %in% names(.)) filter(absent_column == 4) else .
  }
Problem
For the available column, the object passed into the braces does not correspond to the initial data frame. The original code returns the error message:
I have tried an alternative syntax, with no luck:
mtcars %>%
  filter(am == 1) %>%
  {
    if("cyl" %in% names(.)) filter(.$cyl == 4) else .
  }
Error in UseMethod("filter_") :
no applicable method for 'filter_' applied to an object of class "logical"
Follow-up
I wanted to expand this question to account for the evaluation on the right-hand side of the == in the filter call. For instance, the syntax below attempts to filter on the first available value.
mtcars %>%
  filter({
    if ("does_not_ex" %in% names(.))
      does_not_ex
    else
      NULL
  } == {
    if ("does_not_ex" %in% names(.))
      unique(.[['does_not_ex']])
    else
      NULL
  })
As expected, the call evaluates to an error message:
When applied to an existing column:
mtcars %>%
  filter({
    if ("mpg" %in% names(.))
      mpg
    else
      NULL
  } == {
    if ("mpg" %in% names(.))
      unique(.[['mpg']])
    else
      NULL
  })
It works, with a warning message:
  mpg cyl disp  hp drat   wt  qsec vs am gear carb
1  21   6  160 110  3.9 2.62 16.46  0  1    4    4
Follow-up question
Is there a neat way of extending the existing syntax to get conditional evaluation on the right-hand side of the filter call, ideally staying within the dplyr workflow?
Recommended answer
Because of the way scoping works here, you cannot access the data frame from within your if statement. Fortunately, you don't need to.
Try:
mtcars %>%
  filter(am == 1) %>%
  filter({if("cyl" %in% names(.)) cyl else NULL} == 4)
Here you can use the '.' object within the conditional to check whether the column exists and, if it does, return the column to the filter function.
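One caveat with returning NULL: when the column is absent, the comparison NULL == 4 has length zero, and how filter handles a zero-length condition may depend on the dplyr version. The following is a minimal sketch of a variant, not part of the original answer, that returns the whole condition from the if and falls back to TRUE so that every row is kept. It assumes the magrittr pipe, which is what makes '.' available inside the call; the names present and absent are only illustrative.

```r
library(dplyr)

# Column exists: the condition cyl == 4 is evaluated row by row.
present <- mtcars %>%
  filter(am == 1) %>%
  filter(if ("cyl" %in% names(.)) cyl == 4 else TRUE)

# Column is missing: the else branch yields TRUE, which is recycled,
# so the step keeps all rows. absent_column is never evaluated.
absent <- mtcars %>%
  filter(am == 1) %>%
  filter(if ("absent_column" %in% names(.)) absent_column == 4 else TRUE)
```

Because if only evaluates the branch it takes, the reference to the missing column never triggers an "object not found" error.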
As per docendo discimus' comment on the question, you can access the data frame, but not implicitly; i.e. you have to reference it explicitly with '.'.
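Applied to the braces form from the question, referencing the data frame explicitly could look like the sketch below. It assumes the magrittr pipe: when the right-hand side is wrapped in braces, magrittr does not insert the data as the first argument, so '.' has to be passed to filter by hand. The names with_col and without_col are only illustrative.

```r
library(dplyr)

# Existing column: '.' is passed to filter() explicitly.
with_col <- mtcars %>%
  filter(am == 1) %>%
  {
    if ("cyl" %in% names(.)) filter(., cyl == 4) else .
  }

# Missing column: the else branch returns the data unchanged.
without_col <- mtcars %>%
  filter(am == 1) %>%
  {
    if ("absent_column" %in% names(.)) filter(., absent_column == 4) else .
  }
```

With this change, both versions 1) and 2) in the question should give the same result for the existing column and for the unavailable one.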