Problem with the predict function in R


Question

I loaded the built-in R dataset 'women', which tabulates the average height of American women and the corresponding weight. The table has 15 rows. Using this data, I am trying to predict the weight for specific values of height. I fitted a linear model first and then passed new values to predict. But R still returns 15 figures computed from the original data.

I am a beginner in regression, so please tell me if I am doing anything wrong here.

> data()
> women<-data.frame(women)
> names(women)
[1] "height" "weight"
> plot(women$weight~women$height)
> model<-lm(women$weight~women$height,data=women)
> new<-data.frame(height=c(82,83,84,85))
> wgt.prediction<-predict(model,new)
Warning message:
'newdata' had 4 rows but variables found have 15 rows
> wgt.prediction
       1        2        3        4        5        6        7        8        9       10       11       12       13
112.5833 116.0333 119.4833 122.9333 126.3833 129.8333 133.2833 136.7333 140.1833 143.6333 147.0833 150.5333 153.9833
      14       15
157.4333 160.8833

Recommended answer

Note that extrapolating predictions outside the range of the original data can give poor answers. Setting that aside, try the following.
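As a rough illustration of the extrapolation caveat, the sketch below compares prediction intervals at a height inside the observed range and at one well outside it. The variable names (obs_range, pred, width) are chosen here for illustration only. The interval widens away from the data, and it reflects sampling uncertainty only; it cannot tell you whether the straight-line relationship still holds at 85 inches.

```r
# Sketch: prediction intervals inside vs. outside the observed height range.
model <- lm(weight ~ height, data = women)

obs_range <- range(women$height)        # smallest and largest observed height
new <- data.frame(height = c(65, 85))   # 65 is inside the range, 85 is outside

# interval = "prediction" adds lower ("lwr") and upper ("upr") bounds to the fit
pred <- predict(model, new, interval = "prediction")
width <- pred[, "upr"] - pred[, "lwr"]  # interval width for each new height

print(obs_range)
print(pred)
print(width)
```

The second width should be noticeably larger than the first, which is one concrete sign that the model is being asked about a region it has no data for.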

First, it is not necessary to call data() or data.frame(). The women dataset is available to you anyway, and it is already a data frame.
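A quick way to confirm this in a fresh session is sketched below. It assumes the standard datasets package is attached, which is the default in R.

```r
# Sketch: check that women is usable without data() or data.frame().
is_df  <- is.data.frame(women)  # TRUE when women is already a data frame
n_rows <- nrow(women)           # number of observations
cols   <- names(women)          # column names

print(is_df)
print(n_rows)
print(cols)
```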

Also, the question specified the model's independent variable as women$height, but the new data passed to predict names it height. R does not know that women$height and height are the same variable, so it ignores the height column in the new data, finds the original 15-row women$height instead, and issues the warning shown in the question.
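The mismatch can be seen by inspecting the coefficient names of the two formulations. This is a small sketch; bad, good and new_heights are names chosen here for illustration.

```r
# Sketch: how the formula determines the variable name predict() looks up.
bad  <- lm(women$weight ~ women$height)     # term is literally "women$height"
good <- lm(weight ~ height, data = women)   # term is "height"

print(names(coef(bad)))
print(names(coef(good)))

new_heights <- data.frame(height = c(82, 83, 84, 85))

# With the "bad" model, newdata has no column matching the term, so R falls
# back to the original women$height (the warning is suppressed here only to
# keep the output short).
n_bad  <- length(suppressWarnings(predict(bad, new_heights)))
n_good <- length(predict(good, new_heights))

print(n_bad)
print(n_good)
```

Only the model fitted with bare column names and a data argument returns one prediction per row of the new data.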

Replace all of your code with this:

fo <- weight ~ height          # bare column names, no women$ prefix
model <- lm(fo, women)
heights <- c(82, 83, 84, 85)   # new heights to predict for
weights <- predict(model, data.frame(height = heights))

giving:

> weights
       1        2        3        4
195.3833 198.8333 202.2833 205.7333

To plot the data together with the predictions (i.e. weights) and the regression line defined by model:

plot(fo, women, xlim = range(c(height, heights)), ylim = range(c(weight, weights)))
points(weights ~ heights, col = "red", pch = 20)
abline(model)

Although one normally uses predict, given the problem introduced by using $ in the formula, an alternative that keeps your original formulation is to compute the predictions directly from the coefficients:

model0 <- lm(women$weight ~ women$height)
cbind(1, 82:85) %*% coef(model0)

giving:

         [,1]
[1,] 195.3833
[2,] 198.8333
[3,] 202.2833
[4,] 205.7333
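As a sanity check, the matrix calculation and predict should agree. The sketch below compares the two; manual and via_predict are names chosen here for illustration.

```r
# Sketch: the manual calculation and predict() give the same numbers.
model0 <- lm(women$weight ~ women$height)   # original formulation
model  <- lm(weight ~ height, data = women) # recommended formulation

# Each row of the design matrix is (1, height); multiplying by the
# coefficients gives intercept + slope * height.
manual      <- drop(cbind(1, 82:85) %*% coef(model0))
via_predict <- unname(predict(model, data.frame(height = 82:85)))

print(all.equal(manual, via_predict))
```

Both models have identical coefficients; only the names attached to the terms differ, which is why the manual route works while predict with new data does not.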

