Using the stargazer package to get separate summary statistics by a categorical variable

Problem description

I would like to use stargazer to produce summary statistics for each category of a grouping variable. I could do it in separate tables, but I'd like it all in one, if that is not unreasonably challenging for this package.

For example,

library(stargazer)
stargazer(ToothGrowth, type = "text")
#>
#> =========================================
#> Statistic N   Mean  St. Dev.  Min   Max
#> -----------------------------------------
#> len       60 18.813  7.649   4.200 33.900
#> dose      60 1.167   0.629   0.500 2.000
#> -----------------------------------------

provides summary statistics for the continuous variables in ToothGrowth. I would like to split that summary by the categorical variable supp, which is also in ToothGrowth.

Two suggestions for the desired outcome:

stargazer(ToothGrowth ~ supp, type = "text")
#>
#> ==================================================
#> Statistic         N   Mean   St. Dev.  Min   Max
#> --------------------------------------------------
#> OJ       len       30 20.663  6.606   8.200 30.900
#>          dose      30  1.167  0.634   0.500  2.000
#> VC       len       30 16.963  8.266   4.200 33.900
#>          dose      30  1.167  0.634   0.500  2.000
#> --------------------------------------------------
#>
stargazer(ToothGrowth ~ supp, type = "text")
#>
#> ==================================================
#> Statistic          N   Mean   St. Dev.  Min   Max
#> --------------------------------------------------
#> len
#>        _by OJ     30 20.663  6.606   8.200 30.900
#>        _by VC     30 16.963  8.266   4.200 33.900
#> _tot              60 18.813  7.649   4.200 33.900
#>
#> dose
#>        _by OJ     30  1.167  0.634   0.500  2.000
#>        _by VC     30  1.167  0.634   0.500  2.000
#> _tot              60  1.167  0.629   0.500  2.000
#> --------------------------------------------------

Recommended answer

Solution

library(stargazer)
library(dplyr)
library(tidyr)

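ToothGrowth %>%
    group_by(supp) %>%
    mutate(id = row_number()) %>%             # unique id within each supp group
    ungroup() %>%
    gather(temp, val, len, dose) %>%          # long format: one row per id, supp and variable
    unite(temp1, supp, temp, sep = '_') %>%   # combined keys such as "OJ_len"
    spread(temp1, val) %>%                    # wide format: one column per group/variable pair
    select(-id) %>%                           # drop the helper id so it is not summarised
    as.data.frame() %>%                       # convert the tibble to a plain data.frame
    stargazer(type = 'text')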

Result

=========================================
Statistic N   Mean  St. Dev.  Min   Max
-----------------------------------------
OJ_dose   30 1.167   0.634   0.500 2.000
OJ_len    30 20.663  6.606   8.200 30.900
VC_dose   30 1.167   0.634   0.500 2.000
VC_len    30 16.963  8.266   4.200 33.900
-----------------------------------------
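
As a sanity check on the table above, the same per-group figures can be recomputed without stargazer. The following is only a minimal sketch in base R; the names by_supp, summarise_column and group_stats are made up for this illustration and are not part of the original answer.

```r
# Split the two numeric columns of ToothGrowth by the supp factor
by_supp <- split(ToothGrowth[c("len", "dose")], ToothGrowth$supp)

# Hypothetical helper: the five statistics that stargazer reports by default
summarise_column <- function(x) {
  c(N = length(x), Mean = mean(x), SD = sd(x), Min = min(x), Max = max(x))
}

# One matrix per group, with statistics in rows and variables in columns
group_stats <- lapply(by_supp, function(d) sapply(d, summarise_column))

# Transpose so that each variable is a row, as in the stargazer table
lapply(group_stats, function(m) round(t(m), 3))
```

If the reshaping in the solution is correct, the rows printed for OJ and VC should agree with the OJ_* and VC_* rows of the stargazer output.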

Explanation

This gets rid of the problem mentioned by the OP in a comment to the original answer: "What I really want is a single table with summary statistics separated by a categorical variable instead of creating separate tables." The easiest way I saw to do that with stargazer was to create a new data frame that has one variable for each group's observations, using a gather(), unite(), spread() strategy. The only trick is to avoid duplicate identifiers by creating a unique identifier within each group and dropping that variable before calling stargazer().
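
The same reshaping strategy can also be written with pivot_longer() and pivot_wider(), which tidyr (version 1.0.0 and later) provides as replacements for gather() and spread(). This is a sketch of an equivalent pipeline, not part of the original answer; it assumes a tidyr version that has these functions, and the names wide, variable and val are chosen only for this example.

```r
library(dplyr)
library(tidyr)
library(stargazer)

wide <- ToothGrowth %>%
  group_by(supp) %>%
  mutate(id = row_number()) %>%          # unique id within each supp group
  ungroup() %>%
  pivot_longer(c(len, dose),             # long format: one row per id, supp and variable
               names_to = "variable", values_to = "val") %>%
  pivot_wider(id_cols = id,              # wide format: one column per group/variable pair
              names_from = c(supp, variable),
              values_from = val,
              names_sep = "_") %>%
  select(-id) %>%                        # drop the helper id so it is not summarised
  as.data.frame()                        # convert the tibble to a plain data.frame

stargazer(wide, type = "text")
```

The id column plays the same role as before: it gives every observation within a group its own row in the wide table, so the 30 values per group end up in 30 rows instead of colliding under a single key.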
