Problems using dplyr in a function (group_by)


Problem description

I want to use dplyr for some data manipulation. Background: I have a survey weight and a bunch of variables (mostly Likert items). I want to sum the frequencies and percentages per category, both with and without the survey weight.

As an example, let us just use frequencies for the gender variable. The result should look like this:

gender  freq  freq.weighted
1       292   922.2906
2       279   964.7551
9         6    21.7338

I will do this for many variables, so I decided to put the dplyr code inside a function. That way I only have to change the variable name and can type less.

library(dplyr)
library(lazyeval)  # provides interp()

#example data
gender<-c("2","2","1","2","2","2","2","2","2","2","2","2","1","1","2","2","2","2","2","2","1","2","2","2","2","2","2","2","2","2")
survey_weight<-c("2.368456","2.642901","2.926698","3.628653","3.247463","3.698195","2.776772","2.972387","2.686365","2.441820","3.494899","3.133106","3.253514","3.138839","3.430597","3.769577","3.367952","2.265350","2.686365","3.189538","3.029999","3.024567","2.972387","2.730978","4.074495","2.921552","3.769577","2.730978","3.247463","3.230097")
test_dataframe<-data.frame(gender,survey_weight)

#function
weighting.function<-function(dataframe,variable){
  test_weighted<- dataframe %>%
    group_by_(variable) %>%
    summarise_(interp(freq=count(~weight)),
               interp(freq_weighted=sum(~weight)))
  return(test_weighted)
}

result_dataframe<-weighting.function(test_dataframe,"gender")

#this second step was left out in this example:
#mutate_(perc=interp(~freq/sum(~freq)*100),perc_weighted=interp(~freq_weighted/sum(~freq_weighted)*100))

This results in the following error message:

Error in UseMethod("group_by_") :
  no applicable method for 'group_by_' applied to an object of class "formula"

I have tried a lot of different things. First, I used freq = n() to count the frequencies, but I always got an error. (I checked that plyr was loaded before dplyr and not afterwards; that did not help either.)

Any ideas? I read the vignette on standard evaluation, but I keep running into problems and have no idea what the solution could be.

Recommended answer

I think you have a few nested mistakes that are causing your problems. The biggest one is using count() inside summarise_(). count() is not a summary function: it takes a data frame, groups it, and tallies the rows, which is why the error complains about group_by_ being applied to a formula. I'm guessing you wanted n():

weighting.function <- function(dataframe, variable){
  dataframe %>%
    group_by_(variable) %>%
    summarise_(
      freq = ~n(),
      freq_weighted = ~sum(survey_weight)
    )
}

weighting.function(test_dataframe, ~gender)
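
The underscore verbs used above (group_by_(), summarise_()) have been deprecated since dplyr 0.7. As a rough sketch of the same idea in current dplyr (1.0 or later), assuming the variable is still passed as a string and the weights are stored as numbers, the grouping column can be selected with across(all_of()). The name weighting_function2 and the small data frame are made up for illustration, and the mutate() step is only one possible way to add the percentages that the question left out.

```r
library(dplyr)

# Made-up example data; weights are numeric so that sum() works
df <- data.frame(
  gender = c("1", "2", "2", "1", "2"),
  survey_weight = c(2.5, 3.0, 1.5, 2.0, 3.5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: `variable` is the grouping column name as a string
weighting_function2 <- function(dataframe, variable) {
  dataframe %>%
    group_by(across(all_of(variable))) %>%
    summarise(
      freq = n(),                         # unweighted count per category
      freq_weighted = sum(survey_weight)  # weighted count per category
    ) %>%
    # summarise() drops the single grouping level, so these sums
    # run over all categories
    mutate(
      perc = freq / sum(freq) * 100,
      perc_weighted = freq_weighted / sum(freq_weighted) * 100
    )
}

res2 <- weighting_function2(df, "gender")
print(res2)
```

The weight column is hard-coded here, just as in the answer's function.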

You also had a few unneeded uses of interp(). If you do use interp(), the call should look like freq = interp(~n()), i.e. the name goes outside the call to interp(), and the expression being interpolated starts with ~.
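
As a minimal sketch of a case where interp() is actually needed, assume the weight column is also passed in as a string. The function name weighted_freq, its weight argument, and the small data frame are hypothetical, and the weights are stored as numbers here so that sum() works. It relies on the same underscore verbs as above, which recent dplyr releases mark as deprecated.

```r
library(dplyr)
library(lazyeval)

# Made-up example data; weights are numeric so that sum() works
df <- data.frame(
  gender = c("1", "2", "2", "1", "2"),
  survey_weight = c(2.5, 3.0, 1.5, 2.0, 3.5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: grouping and weight columns are both given as strings
weighted_freq <- function(dataframe, variable, weight) {
  dataframe %>%
    group_by_(variable) %>%
    summarise_(
      freq = ~n(),
      # the name stays outside interp(); `w` is replaced by the column symbol
      freq_weighted = interp(~sum(w), w = as.name(weight))
    )
}

res <- weighted_freq(df, "gender", "survey_weight")
print(res)
```

Here interp() does real work: it substitutes the placeholder w with the column whose name is only known at run time.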
