How do I apply a regression to a subgroup of a population?
Question
Suppose I have the following data frame:
weight <- c(100, 137, 158, 225, 149)
age <- c(15, 18, 21, 31, 65)
gender <- c("Female", "Male", "Male", "Male", "Female")
table <- data.frame(weight, age, gender)
If I wanted to run a linear regression to see how weight predicts age, and then examine the fit, I'd do:
allData <- lm(age ~ weight, data = table)
summary(allData)
What should I do if I want to examine how weight predicts age for females only, that is, fit the model using only the female rows? I'm thinking of something like:
FemaleData <- lm(age ~ weight, data=table (gender="Female"))
Recommended answer
library(dplyr)
library(broom)
# example dataset
weight <- c(100, 137, 158, 225, 149, 148)
age <- c(15, 18, 21, 31, 65, 64)
gender <- c("Female", "Male", "Male", "Male", "Female", "Female")
table <- data.frame(weight, age, gender)
# build a model for each gender value and store its summary in a list column
tbl_models <- table %>%
  group_by(gender) %>%                                  # for each gender value
  do(model = summary(lm(age ~ weight, data = .))) %>%   # fit a model and summarise it
  ungroup()
# check what the new dataset looks like
tbl_models
# # A tibble: 2 x 2
# gender model
# * <fctr> <list>
# 1 Female <S3: summary.lm>
# 2 Male <S3: summary.lm>
# access / view model for Females
tbl_models %>% filter(gender == "Female") %>% pull(model)
# [[1]]
#
# Call:
# lm(formula = age ~ weight, data = .)
#
# Residuals:
# 1 2 3
# -0.0002125 -0.0101997 0.0104122
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) -8.706e+01 4.943e-02 -1761 0.000361 ***
# weight 1.021e+00 3.681e-04 2773 0.000230 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.01458 on 1 degrees of freedom
# Multiple R-squared: 1, Adjusted R-squared: 1
# F-statistic: 7.69e+06 on 1 and 1 DF, p-value: 0.0002296
# build model for each gender value and store it as a tidy dataset
table %>%
  group_by(gender) %>%
  do(tidy(lm(age ~ weight, data = .))) %>%
  ungroup()
# # A tibble: 4 x 6
# gender term estimate std.error statistic p.value
# <fctr> <chr> <dbl> <dbl> <dbl> <dbl>
# 1 Female (Intercept) -87.0609860 0.0494272875 -1761.39518 0.0003614292
# 2 Female weight 1.0206120 0.0003680516 2773.01334 0.0002295769
# 3 Male (Intercept) -2.3370680 0.2181313917 -10.71404 0.0592475719
# 4 Male weight 0.1480985 0.0012299556 120.40961 0.0052869963
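If only the female subgroup is of interest, a per-group pipeline may be more than is needed. The following is a minimal base R sketch, not part of the original answer, that restricts the fit to the female rows with the `subset` argument of `lm()` and, for comparison, fits one model per gender with `split()` and `lapply()`. The names `people`, `female_fit`, `female_fit2`, and `fits` are illustrative assumptions; the data are the six rows from the answer's example.

```r
# Example data copied from the answer above; 'people' is an illustrative name
weight <- c(100, 137, 158, 225, 149, 148)
age    <- c(15, 18, 21, 31, 65, 64)
gender <- c("Female", "Male", "Male", "Male", "Female", "Female")
people <- data.frame(weight, age, gender)

# Fit the regression on the female rows only, using lm()'s subset argument
female_fit <- lm(age ~ weight, data = people, subset = gender == "Female")
summary(female_fit)

# Equivalent: filter the rows first, then fit
female_fit2 <- lm(age ~ weight, data = people[people$gender == "Female", ])

# One model per gender: split the data frame by gender, then fit each piece
fits <- lapply(split(people, people$gender),
               function(d) lm(age ~ weight, data = d))
coef(fits$Female)
```

The coefficients of `female_fit` should agree with the Female rows of the tidy output above, since both fit the same three observations.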