
Problem description

I'm doing some programming with dplyr and am curious how to pass an expression as an argument to mapply, specifically as one of its MoreArgs.

Consider a simple function F that subsets a data.frame based on some ids and a time_range, then outputs a summary statistic based on another column x.

require(dplyr)
require(data.table)  # provides %chin%, which F uses below
F <- function(ids, time_range, df, date_column, x) {
    date_column <- enquo(date_column)
    x <- enquo(x)
    df %>%
        filter(person_id %chin% ids) %>%
        filter(time_range[1] <= (!!date_column) & (!!date_column) <= time_range[2]) %>%
        summarise(newvar = sum(!!x))
}

We can make up some example data to apply our function F to.

person_ids <- lapply(1:2, function(i) sample(letters, size = 10))
time_ranges <- lapply(list(c("2014-01-01", "2014-12-31"),
                           c("2015-01-01", "2015-12-31")), as.Date)

require(data.table)
dt <- CJ(person_id = letters,
         date_col  = seq.Date(from = as.Date('2014-01-01'), to = as.Date('2015-12-31'), by = '1 day'))
dt[, z := rnorm(nrow(dt))]  # The variable we will later sum over, i.e. apply F to.

We can successfully apply our function to each of our inputs.

F(person_ids[[1]], time_ranges[[1]], dt, date_col, z)
F(person_ids[[2]], time_ranges[[2]], dt, date_col, z)

So if I wanted, I could write a simple for-loop to solve my problem. But if we reach for syntactic sugar and wrap everything in mapply, we get an error.

mapply(F, ids = person_ids, time_range = time_ranges, MoreArgs = list(df = dt, date_column = date_col, x = z))

# Error in mapply... object 'date_col' not found
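For comparison, the for-loop fallback mentioned above might look roughly like the sketch below. To keep it runnable without dplyr or data.table, it uses a hypothetical base-R stand-in for F called sum_col, which captures the column with substitute() rather than enquo(), and a small made-up data set toy in place of dt. None of these names come from the question.

```r
# Hypothetical base-R stand-in for F: substitute() captures the column
# expression, playing the role that enquo() plays in the dplyr version.
sum_col <- function(ids, df, x) {
  x <- substitute(x)
  rows <- df[df$person_id %in% ids, , drop = FALSE]
  sum(eval(x, rows))
}

# Small made-up data set (not the dt from the question).
toy <- data.frame(person_id = c("a", "b", "c", "d"), z = c(1, 2, 3, 4))
id_sets <- list(c("a", "b"), c("c", "d"))

# The column name z is written literally in each call, so it is captured
# unevaluated and never looked up as a standalone object.
loop_results <- vector("list", length(id_sets))
for (i in seq_along(id_sets)) {
  loop_results[[i]] <- sum_col(id_sets[[i]], toy, z)
}
```

The loop works because each call names the column directly, so nothing tries to evaluate z before the function has captured it.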


Recommended answer

In mapply, MoreArgs is provided as a list, and R evaluates the list's elements when the list is built, before F is ever called; that is what causes the error. As suggested by @Gregor, you can quote the MoreArgs elements that should not be evaluated immediately, which prevents the error and lets the function proceed. This can be done with base quote or dplyr quo:

mapply(F, person_ids, time_ranges, MoreArgs = list(dt, quote(date_col), quote(z)))

mapply(F, person_ids, time_ranges, MoreArgs = list(dt, quo(date_col), quo(z)))
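The difference between the failing and the quoted call can be reproduced in base R alone. The sketch below is an analogue, not the dplyr code from the question: sum_col is a hypothetical stand-in for F that uses substitute() in place of enquo(), and toy and id_sets are made-up data.

```r
# Hypothetical base-R stand-in for F (substitute() instead of enquo()).
sum_col <- function(ids, df, x) {
  x <- substitute(x)
  rows <- df[df$person_id %in% ids, , drop = FALSE]
  sum(eval(x, rows))
}

# Made-up data; note that no standalone object named z exists.
toy <- data.frame(person_id = c("a", "b", "c", "d"), z = c(1, 2, 3, 4))
id_sets <- list(c("a", "b"), c("c", "d"))

# Unquoted: list() evaluates z while building MoreArgs, so the call fails
# before sum_col ever runs. The error message is captured as a string.
unquoted <- tryCatch(
  mapply(sum_col, id_sets, MoreArgs = list(df = toy, x = z)),
  error = function(e) conditionMessage(e)
)

# Quoted: the list stores the symbol z itself, which mapply places in each
# call to sum_col, where substitute() can capture it.
quoted <- mapply(sum_col, id_sets, MoreArgs = list(df = toy, x = quote(z)))
```

Here unquoted holds the "object not found" message, while quoted holds one sum per element of id_sets.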

Another option is to use map2 from the purrr package, which is the tidyverse equivalent of mapply with two input vectors. Arguments after the function are passed along to F rather than collected into a list first, so the column names reach F unevaluated. This avoids the error you're getting with mapply, with no need to quote the arguments:

library(purrr)

map2(person_ids, time_ranges, F, dt, date_col, z)



[[1]]
    newvar
1 40.23419

[[2]]
    newvar
1 71.42327
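One way to see why passing the column names as extra arguments can work is that arguments forwarded through ... are not evaluated until the receiving function uses them. The base-R sketch below illustrates that idea only. apply_each is a hypothetical helper and is not purrr's implementation; sum_col is a hypothetical stand-in for F that uses substitute() in place of enquo().

```r
# Hypothetical base-R stand-in for F (substitute() instead of enquo()).
sum_col <- function(ids, df, x) {
  x <- substitute(x)
  rows <- df[df$person_id %in% ids, , drop = FALSE]
  sum(eval(x, rows))
}

# Hypothetical helper, not purrr's code: it forwards extra arguments
# through ... without evaluating them first.
apply_each <- function(items, f, ...) {
  lapply(items, function(item) f(item, ...))
}

# Made-up data; no standalone object named z exists.
toy <- data.frame(person_id = c("a", "b", "c", "d"), z = c(1, 2, 3, 4))
id_sets <- list(c("a", "b"), c("c", "d"))

# z travels through ... as an unevaluated promise, so sum_col can still
# capture it as an expression.
forwarded <- apply_each(id_sets, sum_col, toy, z)
```

Unlike MoreArgs = list(...), nothing here builds a list out of the extra arguments, so z is never looked up on its own.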


More generally, you could use pmap, which iterates in parallel over any number of input vectors:

pmap(list(person_ids, time_ranges), F, dt, date_col, z)
