How can I get method dispatch for functions called inside dplyr::do?

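I have read GitHub issues #719, #3558 and #3429, which contain useful information on how to create methods for dplyr verbs, but nothing that applies specifically to dplyr::do. It is "special" in the sense that dispatch needs to happen not only for dplyr::do itself, but also for the function called inside dplyr::do (at least that is what I am after).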

Here is what I tried:

Preliminaries

library(dplyr)
#>
#> Attache Paket: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union

# Example data ------------------------------------------------------------

df <- tibble::tibble(
  id = c(rep("A", 5), rep("B", 5)),
  x = 1:10
)

df_custom <- df
class(df_custom) <- c("tbl_df_custom", class(df_custom))

# Reclass function --------------------------------------------------------

reclass <- function(x, result) {
  UseMethod('reclass')
}

reclass.default <- function(x, result) {
  class(result) <- unique(c(class(x)[[1]], class(result)))
  attr(result, class(x)[[1]]) <- attr(x, class(x)[[1]])
  result
}


Step 1: Trying to define a method for a dplyr verb

# Custom method for summarize ---------------------------------------------

summarise.tbl_df_custom <- function (.data, ...) {
  message("Custom method for `summarise`")
  result <- NextMethod("summarise")
  ret <- reclass(.data, result)
  print(class(ret))
  ret
}

ret <- df_custom %>%
  summarise(y = mean(x))
#> Custom method for `summarise`
#> [1] "tbl_df_custom" "tbl_df"        "tbl"           "data.frame"
ret %>% class()
#> [1] "tbl_df_custom" "tbl_df"        "tbl"           "data.frame"


Step 2: Trying to define a method for another dplyr verb to test a longer pipe

# Custom method for group_by ----------------------------------------------

group_by.tbl_df_custom <- function (.data, ..., add = FALSE) {
  message("Custom method for `group_by`")
  result <- NextMethod("group_by")
  ret <- reclass(.data, result)
  print(class(ret))
  ret
}

ret <- df_custom %>%
  group_by(id) %>%
  summarise(y = mean(x))
#> Custom method for `group_by`
#> [1] "tbl_df_custom" "grouped_df"    "tbl_df"        "tbl"
#> [5] "data.frame"
#> Custom method for `summarise`
#> [1] "tbl_df_custom" "tbl_df"        "tbl"           "data.frame"
ret %>% class()
#> [1] "tbl_df_custom" "tbl_df"        "tbl"           "data.frame"


Step 3: Trying the same for do

# Custom method for do ----------------------------------------------------

do.tbl_df_custom <- function (.data, ...) {
  message("custom method for `do`")
  result <- NextMethod("do")
  ret <- reclass(.data, result)
  print(class(ret))
  ret
}

foo <- function(df) {
  UseMethod("foo")
}

foo.default <- function(df) {
  message("Default method for `foo`")
  df %>%
    summarise(y = mean(x))
}

foo.tbl_df_custom <- function(df) {
  message("Custom method for `foo`")
  df %>%
    summarise(y = mean(x) * 100)
}

ret <- df_custom %>%
  group_by(id) %>%
  do(foo(.))
#> Custom method for `group_by`
#> [1] "tbl_df_custom" "grouped_df"    "tbl_df"        "tbl"
#> [5] "data.frame"
#> custom method for `do`
#> Default method for `foo`
#> Default method for `foo`
#> [1] "tbl_df_custom" "grouped_df"    "tbl_df"        "tbl"
#> [5] "data.frame"
ret
#> # A tibble: 2 x 2
#> # Groups:   id [2]
#>   id        y
#>   <chr> <dbl>
#> 1 A         3
#> 2 B         8
ret %>% class()
#> [1] "tbl_df_custom" "grouped_df"    "tbl_df"        "tbl"
#> [5] "data.frame"


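At first glance this seems to work, but the problem is that the default method of foo is called instead of the custom method (note the "Default method for `foo`" messages and the results 3 and 8 rather than 300 and 800).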

Created on 2019-01-08 by the reprex package (v0.2.1)
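The symptom above is what S3 dispatch does whenever the object reaching the generic no longer carries the custom class. The following is a minimal base R sketch of that fallback, not a trace of what do actually does internally; the generic `bar` and the objects `with_class` and `without_class` are hypothetical names made up for this illustration.

```r
# Hypothetical generic `bar`, standing in for `foo` in the question.
bar <- function(obj) UseMethod("bar")
bar.default <- function(obj) "default method"
bar.tbl_df_custom <- function(obj) "custom method"

# One object that carries the custom class, one that has lost it.
with_class <- structure(list(x = 1:3), class = c("tbl_df_custom", "list"))
without_class <- unclass(with_class)

bar(with_class)     # dispatches to bar.tbl_df_custom
bar(without_class)  # falls back to bar.default
```

So if the per-group data handed to foo inside do has been stripped of "tbl_df_custom", the default method is the expected outcome, and the fix has to make sure the class survives whatever do does to the data.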

Best answer

So the problem is related to this question I just asked. I was able to solve it by defining three new functions: ungroup.tbl_df_custom, a class constructor, and [.tbl_df_custom

ungroup.tbl_df_custom <- function (.data, ...) {
  message("custom method for `ungroup`")
  result <- NextMethod("ungroup")
  ret <- reclass(.data, result)
  ret
}


new_custom <- function(x, ...) {

  structure(x, class = c("tbl_df_custom", class(x)))
}

`[.tbl_df_custom` <- function(x, ...) {
  new_custom(NextMethod())
}



df_custom2 <- new_custom(df)


df_custom2 %>%
  group_by(id) %>%
  do(foo(.))

#> Custom method for `group_by`
#> [1] "tbl_df_custom" "grouped_df"    "tbl_df"        "tbl"           "data.frame"
#> custom method for `do`
#> custom method for `ungroup`
#> Custom method for `foo`
#> Custom method for `summarise`
#> [1] "tbl_df_custom" "tbl_df"        "tbl"           "data.frame"
#> Custom method for `foo`
#> Custom method for `summarise`
#> [1] "tbl_df_custom" "tbl_df"        "tbl"           "data.frame"
#> [1] "tbl_df_custom" "grouped_df"    "tbl_df"        "tbl"           "data.frame"
#> custom method for `ungroup`
#> # A tibble: 2 x 2
#> # Groups:   id [2]
#>   id        y
#>   <chr> <dbl>
#> 1 A       300
#> 2 B       800
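The constructor plus `[` method pattern used above can be seen in isolation with a toy class. This is only a sketch under stated assumptions: the class "my_int" and its constructor `new_my_int` are hypothetical, and an integer vector is used instead of a tibble so the example runs in base R without dplyr. Whether tibble's own `[` keeps a subclass may depend on the installed tibble version, so this shows the general mechanism rather than the exact code path inside do.

```r
# Hypothetical constructor for a toy class built on an integer vector.
new_my_int <- function(x) structure(x, class = "my_int")

v <- new_my_int(1:10)

# With no `[` method defined, default subsetting drops the class attribute.
before <- v[1:3]
class(before)

# A `[` method that subsets with the next method, then re-applies the class
# through the constructor (same shape as `[.tbl_df_custom` above).
`[.my_int` <- function(x, ...) {
  new_my_int(NextMethod())
}

after <- v[1:3]
class(after)
```

Routing the result through the constructor keeps the class logic in one place: any other method that needs to restore the class can call the same function.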

Regarding r - method dispatch for functions called inside dplyr::do, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54083208/
