Why does R's update function change a glmer fit when called with one argument?

Question

Below is a simplified example that reproduces what looks to me like a problem with R's update function:

library('lme4')
f <- function(formula) {
  data <- data.frame(a = c(4, 5), rowi = c(1, 2), b = c(2, 2))
  fit0 <- glmer(formula, data = data, family = poisson(log))
  fit1 <- update(fit0)
  cat('f likelihoods: ', logLik(fit0), logLik(fit1), '\n')
}
g <- function() {
  f(a ~ -1 + (1|rowi) + offset(b))
  data <- data.frame(a = c(4, 5), rowi = c(1, 2), b = c(20, 40))
  f(a ~ -1 + (1|rowi) + offset(b))
  cat('g likelihood: ', logLik(glmer(a ~ -1 + (1|rowi) + offset(b),
      data = data, family = poisson(log))), '\n')
}
g()
data <- data.frame(a = c(4, 5), rowi = c(1, 2), b = c(50, 80))
g()
cat('global likelihood: ', logLik(glmer(a ~ -1 + (1|rowi) + offset(b),
    data = data, family = poisson(log))), '\n')

This code outputs:

f likelihoods:  -4.712647 -4.712647
f likelihoods:  -4.712647 -12.6914
g likelihood:  -12.6914
f likelihoods:  -4.712647 -14.22997
f likelihoods:  -4.712647 -12.6914
g likelihood:  -12.6914
global likelihood:  -14.22997

What surprises me is that update(fit0) changes the model when a variable named data is defined in the environment of the formula. Why is that? How should update be used to avoid pitfalls like this?

Recommended answer

I've run into this, too. The short answer is that update.merMod(model) uses environment(formula(model)) to determine which environment to use when refitting the model (if that fails, it tries the enclosing environment, and so on). The result is that update() refits the model using the environment in which the formula was created, not the environment in which the original merMod object was created. This is consistent with the behavior of your example: the formula is created in g() (or at top level), so the refit picks up whatever data is visible from there rather than the data local to f().
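The lookup described above can be illustrated in base R, without lme4. This is only a minimal sketch of the mechanism, not the actual update.merMod code: the function names refit_lookup and caller are hypothetical, and eval(quote(data), ...) stands in for re-evaluating a stored model call in the formula's environment.

```r
# A formula remembers the environment in which it was created.
refit_lookup <- function(formula) {
  # The data the model would originally have been fit on (local to this function).
  data <- data.frame(b = c(2, 2))
  # Resolve the symbol `data` in the formula's environment, which is
  # where a refit evaluated in environment(formula) would look first.
  eval(quote(data), envir = environment(formula))
}

caller <- function() {
  # A different object that happens to share the name `data`.
  data <- data.frame(b = c(20, 40))
  # The formula is created here, so its environment is caller()'s frame.
  refit_lookup(a ~ b)
}

found <- caller()
print(found$b)  # 20 40: the caller's data, not the one inside refit_lookup()
```

The symbol resolves to the data frame that lives next to the formula, which mirrors why the refit in the question silently switches data sets.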

My clumsy workaround is to pass formulas around as strings and convert them to formulas inside the same function body where the model is originally fit, e.g.:

f <- function(formula_string) {
  formula <- as.formula(formula_string)
  data <- data.frame(a = c(4, 5), rowi = c(1, 2), b = c(2, 2))
  fit0 <- glmer(formula, data = data, family = poisson(log))
  fit1 <- update(fit0)
  cat('f likelihoods: ', logLik(fit0), logLik(fit1), '\n')
}
g <- function() {
  f("a ~ -1 + (1|rowi) + offset(b)")
  data <- data.frame(a = c(4, 5), rowi = c(1, 2), b = c(20, 40))
  f("a ~ -1 + (1|rowi) + offset(b)")
  cat('g likelihood: ', logLik(glmer(a ~ -1 + (1|rowi) + offset(b),
      data = data, family = poisson(log))), '\n')
}
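The string approach works because as.formula() assigns the calling function's frame as the formula's environment. A possible alternative, sketched here in base R as an assumption rather than anything the answer itself proposes, is to keep passing a formula object and reset its environment inside the fitting function. The function names env_from_string and env_from_reset are hypothetical, and the sketch only checks which environment the formula ends up with; it does not fit a model.

```r
# Variant 1: build the formula from a string inside the function.
env_from_string <- function(formula_string) {
  formula <- as.formula(formula_string)  # env defaults to this function's frame
  identical(environment(formula), environment())
}

# Variant 2: accept a formula object and re-point its environment.
env_from_reset <- function(formula) {
  environment(formula) <- environment()  # now the same frame as a local `data`
  identical(environment(formula), environment())
}

string_ok <- env_from_string("a ~ -1 + (1|rowi) + offset(b)")
reset_ok  <- env_from_reset(a ~ -1 + (1|rowi) + offset(b))
print(c(string_ok, reset_ok))  # TRUE TRUE
```

One caveat with resetting the environment: any variable the formula relied on in its original environment is no longer found there unless it is also reachable from the new one.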

I'm not sure whether the current behavior is desirable for some reason (that's a question for @benbolker and the other lme4 developers), or what a lower-level fix might look like, aside from either explicitly setting/saving the environment of the merMod object at creation, or a recursive frame search that uses identical() (à la where() in pryr). There are probably good arguments against these.
