Problem description
For some reason, when I fit glms (and lms too, it turns out), R does not predict the missing values of the data. Here is an example:
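# Example 1: logistic regression; the last 50 responses are NA
y = round(runif(50))
y = c(y,rep(NA,50))
x = rnorm(100)
m = glm(y~x, family=binomial(link="logit"))
p = predict(m,na.action=na.pass)
length(p)   # 50, not the expected 100

# Example 2: linear model on the same kind of data
y = round(runif(50))
y = c(y,rep(NA,50))
x = rnorm(100)
m = lm(y~x)
p = predict(m)
length(p)   # 50 again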
The length of p should be 100, but it is 50. The weird thing is that I have other predict calls in the same script that do predict from missing data.
It turns out that those other predictions were quite wrong: I was doing imputed.value = rnorm(N, mean.from.predict, var.of.prediction.interval). This recycled the mean and sd vectors from the lm or glm predict functions when length(predict) < N, which was quite different from what I was seeking.
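The recycling described above can be reproduced in a small sketch. The names N, short.mean and draws are hypothetical and chosen only for illustration; they are not taken from the original script. rnorm() recycles its mean and sd arguments to the length of the result without any warning, so a mean vector that is too short is silently reused.

```r
# Hypothetical illustration, not the original script:
# rnorm() silently recycles a mean vector that is shorter than n.
set.seed(1)
N <- 6
short.mean <- c(0, 1000)   # length 2, shorter than N

draws <- rnorm(N, mean = short.mean, sd = 1)

# The draws alternate around 0 and 1000 because short.mean is reused in order
print(round(draws))
```

Comparing length(mean.from.predict) with N before calling rnorm() would catch this kind of mismatch.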
So my question is: what is it about my example code that stops glm and lm from predicting the missing values?
Thanks!
Recommended answer
When glm fits the model, it uses only the cases where there are no missing values. You can still get predictions for the cases where your y values are missing by constructing a data frame and passing it to predict.glm as newdata.
predict(m, newdata=data.frame(y, x))
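As a fuller sketch of the same idea, the following rebuilds the question's glm example with the data held in a data frame and compares the two predict calls. The names d, p.fit and p.all, the seed, and the use of type = "response" are assumptions added for illustration. It relies on x having no missing values; rows with a missing predictor would still give NA.

```r
# Sketch: same setup as the question, with the data in a data frame
set.seed(42)
y <- c(round(runif(50)), rep(NA, 50))   # last 50 responses are missing
x <- rnorm(100)
d <- data.frame(y = y, x = x)

# With the default na.action (na.omit), only the 50 complete cases are used
m <- glm(y ~ x, family = binomial(link = "logit"), data = d)

p.fit <- predict(m)                                  # fitted cases only
p.all <- predict(m, newdata = d, type = "response")  # one value per row of d

length(p.fit)   # 50
length(p.all)   # 100
anyNA(p.all)    # FALSE, because x is fully observed
```

The response column is ignored when predicting from newdata, so the NA values in y do not prevent predictions for those rows.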