I am trying to run the code from this post, "looping with iterations over two lists of variables for a multiple regression in R", with modified variable and data frame names, because it seems to do exactly what I want and uses a very similar dataset. However, it keeps giving me an error and I don't know why. I would really appreciate it if someone could help me understand the error, or the line of code that causes it, so I can try to figure out what's wrong.
for(i in 1:n) {
  vars = names(output)[names(output) %in% paste0(c(".PRE", ".POST"), i)]
  models[[as.character(i)]] = lm(paste("growth_rate ~ ", paste(vars, collapse=" + ")),
                                 data = output)
}
Error in parse(text = x, keep.source = FALSE) :
<text>:2:0: unexpected end of input
1: growth_rate ~
My dataset looks almost like the one in the post mentioned above, except that my "RDPI_T" and "DRY_T" variables are in alternating order (which I don't think matters in this case). The analogous variables I have are 69 PRE variables called id1.PRE, id2.PRE ... id69.PRE and 69 POST variables called id1.POST, id2.POST ... id69.POST, all in the output dataset. growth_rate is in the same dataset, output.
Additionally, I would like to add 2 more independent variables that are regular columns and do not come from a list: country and year. I am unsure how to incorporate them here.
Any help would be appreciated. Thank you!
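If your columns are called id1.PRE, id2.PRE, then the paste0 call you have above will not work, and that is most likely what throws the error. paste0(c(".PRE", ".POST"), i) produces names like ".PRE1" and ".POST1", which match none of your columns, so vars is empty and the formula string becomes "growth_rate ~ " with nothing on the right-hand side. That is the "unexpected end of input" that parse() reports.
Please run dput(head(output)) and paste the result, so that we can see the column names and confirm why it doesn't work.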
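To make the mismatch concrete, here is a minimal sketch using a made-up vector of column names. The names in `cols` are an assumption based on the pattern described in the question, not the real dataset:

```r
# Hypothetical column names following the pattern described in the question
cols <- c("growth_rate", "id1.PRE", "id1.POST", "id2.PRE", "id2.POST")
i <- 1

# The original pattern puts the index after the suffix: ".PRE1", ".POST1"
wrong <- paste0(c(".PRE", ".POST"), i)
print(wrong)

# None of the columns match, so vars ends up empty
vars <- cols[cols %in% wrong]
print(length(vars))

# The formula string then has no right-hand side, which parse() rejects
print(paste("growth_rate ~ ", paste(vars, collapse = " + ")))

# Building the names as id<i>.PRE / id<i>.POST does match
right <- paste0("id", i, c(".PRE", ".POST"))
print(cols[cols %in% right])
```

Running the same `%in%` check against `names(output)` should show quickly whether the real column names follow this pattern.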
Try something like the following, based on how you describe the column names:
# simulate data (assumed shape: 100 rows, growth_rate in the first column)
n = 69
output = data.frame(matrix(rnorm(100 * (2 * n + 1)), nrow = 100))
colnames(output)[1] = "growth_rate"
colnames(output)[-1] = c(paste("id",1:n,".PRE",sep=""),paste("id",1:n,".POST",sep=""))
output$year = 1901:2000
output$country = sample(letters,nrow(output),replace=TRUE)
# create list to hold models
models = vector("list",n)
for(i in 1:n) {
  vars = paste0("id",i,c(".PRE", ".POST"))
  # i think it works without as.formula, but better to be safe
  FORMULA = as.formula(paste("growth_rate ~ ", paste(vars, collapse=" + ")))
  models[[i]] = lm(FORMULA,data = output)
}
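Once a list of fits like `models` exists, the coefficients can be collected into a single table. The following is only a sketch on a small simulated data frame: `n = 3`, the 50 rows, and the names `coef_table` and `r2` are illustrative assumptions, not part of the original question.

```r
set.seed(1)
n <- 3  # small number of id pairs, for illustration only

# Simulated stand-in for the real data: growth_rate plus id<i>.PRE / id<i>.POST
output <- data.frame(growth_rate = rnorm(50))
for (i in seq_len(n)) {
  output[[paste0("id", i, ".PRE")]] <- rnorm(50)
  output[[paste0("id", i, ".POST")]] <- rnorm(50)
}

# Fit one model per id pair, as in the loop above
models <- vector("list", n)
for (i in seq_len(n)) {
  vars <- paste0("id", i, c(".PRE", ".POST"))
  f <- as.formula(paste("growth_rate ~", paste(vars, collapse = " + ")))
  models[[i]] <- lm(f, data = output)
}

# One row per model: intercept, PRE slope, POST slope
coef_table <- t(sapply(models, coef))
colnames(coef_table) <- c("intercept", "PRE", "POST")
print(coef_table)

# R-squared of each fit
r2 <- sapply(models, function(m) summary(m)$r.squared)
print(r2)
```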
If you want to include other variables:
for(i in 1:n) {
  vars = paste0("id",i,c(".PRE", ".POST"))
  # add other variables
  vars = c(vars,"country","year")
  FORMULA = as.formula(paste("growth_rate ~ ", paste(vars, collapse=" + ")))
  models[[i]] = lm(FORMULA,data = output)
}
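As an alternative to pasting the formula string by hand, base R's `reformulate()` builds a formula from a character vector of term names. The sketch below also checks that the expected columns exist before fitting, which would have turned the original problem into a readable error message. The helper `fit_one`, the small `n = 3`, and the simulated data are assumptions made for illustration.

```r
set.seed(42)
n <- 3  # small number of id pairs, for illustration only

# Simulated stand-in for the real data
output <- data.frame(
  growth_rate = rnorm(100),
  year = 1901:2000,
  country = sample(letters[1:4], 100, replace = TRUE)
)
for (i in seq_len(n)) {
  output[[paste0("id", i, ".PRE")]] <- rnorm(100)
  output[[paste0("id", i, ".POST")]] <- rnorm(100)
}

# Hypothetical helper: fit the model for one id pair plus the fixed covariates
fit_one <- function(i, data) {
  vars <- c(paste0("id", i, c(".PRE", ".POST")), "country", "year")
  missing_cols <- setdiff(vars, names(data))
  if (length(missing_cols) > 0) {
    stop("columns not found: ", paste(missing_cols, collapse = ", "))
  }
  lm(reformulate(vars, response = "growth_rate"), data = data)
}

models <- lapply(seq_len(n), fit_one, data = output)
names(models) <- paste0("id", seq_len(n))

# Inspect the formula that was used for the second pair
print(formula(models[["id2"]]))
```

Naming the list elements (`id1`, `id2`, ...) makes it easier to look up a particular fit later than indexing by position.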