This article covers how to use purrr to run multiple glm models on data with mixed column types.

Problem description

Suppose we have a toy data set:

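library(tidyverse)  # attaches purrr, dplyr, tidyr and tibble

# a: binary outcome; b: binary predictor; c, d: continuous predictors
tbl <- tibble(a = rep(c(0, 1), each = 5),
              b = rep(c(0, 1), times = 5),
              c = runif(10),
              d = rexp(10)) %>%
    mutate(across(1:2, as.factor))  # store a and b as factors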

where a is the dependent variable and b:d are the independent variables. The idea is to fit a separate glm model for each independent variable:

  • glm(a ~ b, data = tbl, family = "binomial")
  • glm(a ~ c, data = tbl, family = "binomial")
  • glm(a ~ d, data = tbl, family = "binomial")

My initial attempt was as follows:

tbl %>%
    pivot_longer(2:4, names_to = "key", values_to = "val") %>%
    group_split(key) %>%
    map(~ glm(a ~ val, data = .x, family = "binomial"))

This resulted in an error because the data types of b and c (or d) are not the same, so pivot_longer cannot combine them into a single val column.

Error: No common type for `b` <factor<dec08>> and `c` <double>.
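
For illustration, the type mismatch behind this error can be seen by inspecting the class of each column. This is a minimal sketch that rebuilds the same tbl; the set.seed call and the name col_classes are assumptions added here and are not part of the original question.

```r
library(tidyverse)

set.seed(42)  # assumption: seed added only so the random columns are reproducible

tbl <- tibble(a = rep(c(0, 1), each = 5),
              b = rep(c(0, 1), times = 5),
              c = runif(10),
              d = rexp(10)) %>%
    mutate(across(1:2, as.factor))

# First class of each column: a and b are factors, c and d are plain doubles,
# so b, c and d cannot be stacked into one column without a conversion
col_classes <- map_chr(tbl, ~ class(.x)[1])
col_classes
```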

How can I address this issue?

Recommended answer

Without reshaping, we can use map to apply glm to each independent variable, with reformulate building the formula for each one:

purrr::map(names(tbl)[-1], ~ glm(reformulate(.x, 'a'), data = tbl, family = 'binomial'))

#[[1]]

#Call:  glm(formula = reformulate(.x, "a"), family = "binomial", data = tbl)

#Coefficients:
#(Intercept)           b1  
#    -0.4055       0.8109  

#Degrees of Freedom: 9 Total (i.e. Null);  8 Residual
#Null Deviance:     13.86 
#Residual Deviance: 13.46   AIC: 17.46

#...
#...
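
As a possible extension of the approach above, the fitted models can be kept in a named list and their coefficients collected into one table. This is a sketch under assumptions: the names predictors, models and coefs are made up for illustration, set.seed is added for reproducibility, and broom::tidy is swapped in as one convenient way to extract coefficients (the original answer does not use broom).

```r
library(tidyverse)
library(broom)  # assumption: broom is installed; it is not used in the original answer

set.seed(42)  # assumption: seed added only for reproducibility

tbl <- tibble(a = rep(c(0, 1), each = 5),
              b = rep(c(0, 1), times = 5),
              c = runif(10),
              d = rexp(10)) %>%
    mutate(across(1:2, as.factor))

# reformulate() turns a character vector of terms plus a response into a formula
reformulate("b", response = "a")  # a ~ b

# Every column except the response is treated as a predictor
predictors <- setdiff(names(tbl), "a")

# set_names() names the vector by its own values, so the model list is named b, c, d
models <- predictors %>%
    set_names() %>%
    map(~ glm(reformulate(.x, response = "a"), data = tbl, family = "binomial"))

# One row per coefficient, with the predictor name in its own column
coefs <- models %>%
    map(tidy) %>%
    bind_rows(.id = "predictor")
coefs
```

Naming the list makes it possible to pull out a single fit directly, for example models$b, instead of relying on list position.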


09-16 08:54