Problem description
I want to do linear regression with the lm function. My dependent variable is a factor called AccountStatus:
1: 0 days in arrears, 2: 30-60 days in arrears, 3: 60-90 days in arrears and 4: 90+ days in arrears (four levels).
As independent variables I have several numeric variables: loan to value, debt to income and interest rate.
Is it possible to do a linear regression with these variables? I looked on the internet and found something about dummy variables, but those were all for the independent variables.
This doesn't work:
fit <- lm(factor(AccountStatus) ~ OriginalLoanToValue, data=mydata)
summary(fit)
Recommended answer
Linear regression does not take a categorical variable as the dependent variable; it has to be continuous. Since your AccountStatus variable has only four levels, it is not feasible to treat it as continuous. Before commencing any statistical analysis, one should be aware of the measurement levels of one's variables.
What you can do is use multinomial logistic regression, see here for instance. Alternatively, you can recode AccountStatus as dichotomous and use simple logistic regression.
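As a rough sketch of the two options just mentioned, the following fits both models on simulated data. It assumes the nnet package (for multinom) is installed. Apart from AccountStatus and OriginalLoanToValue, every name here is hypothetical: the columns DebtToIncome and InterestRate, the derived InArrears indicator, and the rule used to generate AccountStatus are made up for illustration and do not come from the question.

```r
library(nnet)  # provides multinom(); assumed to be installed

set.seed(1)
n <- 400

# Simulated stand-in for mydata; DebtToIncome and InterestRate are
# hypothetical column names.
mydata <- data.frame(
  OriginalLoanToValue = runif(n, 0.3, 1.1),
  DebtToIncome        = runif(n, 0.1, 0.6),
  InterestRate        = runif(n, 0.02, 0.09)
)

# Made-up rule: a higher loan-to-value pushes accounts toward worse arrears.
score <- 4 * mydata$OriginalLoanToValue + rnorm(n)
mydata$AccountStatus <- cut(score, breaks = c(-Inf, 2, 3, 4, Inf), labels = 1:4)

# Option 1: multinomial logistic regression on the four-level factor.
# Level "1" (0 days in arrears) is the reference category.
multi_fit <- multinom(
  AccountStatus ~ OriginalLoanToValue + DebtToIncome + InterestRate,
  data = mydata, trace = FALSE
)
summary(multi_fit)

# Predicted probability of each status for every account (one column per level).
probs <- predict(multi_fit, type = "probs")
head(probs)

# Option 2: recode as dichotomous (in arrears vs. not) and fit an
# ordinary logistic regression with glm().
mydata$InArrears <- as.integer(mydata$AccountStatus != "1")
bin_fit <- glm(
  InArrears ~ OriginalLoanToValue + DebtToIncome + InterestRate,
  data = mydata, family = binomial
)
summary(bin_fit)
```

The multinomial fit returns one set of coefficients per non-reference level, each interpreted relative to level 1. The dichotomous recoding is simpler to read but discards the distinction between 30-60, 60-90 and 90+ days in arrears.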
Sorry to disappoint you, but this is an inherent restriction of multiple regression and has nothing to do with R. If you want to learn more about which statistical technique is appropriate for different combinations of measurement levels of dependent and independent variables, I can wholeheartedly recommend this book.