r - 多层次建模的协方差结构

我有一个大约300名患者的多级重复测量数据集，每组最多有10个重复测量可预测肌钙蛋白升高。数据集中还有其他变量，但这里没有包括它们。
我正在尝试使用nlme创建随机斜率，随机截距模型，其中患者之间的影响会有所不同，并且不同患者的时间影响会有所不同。当我尝试引入一阶协方差结构以允许由于时间而导致测量值的相关性时，我收到以下错误消息。

Error in `coef<-.corARMA`(`*tmp*`, value = value[parMap[, i]]) : Coefficient matrix not invertible

我已经包含了我的代码和数据集的样本，对于任何智慧的话我都会非常感激。

#baseline model includes only the intercept. Random slopes - intercept varies across patients
randomintercept <- lme(troponin ~ 1,
                       data = df, random = ~1|record_id, method = "ML",
                       na.action = na.exclude,
                       control = list(opt="optim"))

#random intercept and time as fixed effect
timeri <- update(randomintercept,.~. + day)
#random slopes and intercept: effect of time is different in different people
timers <- update(timeri, random = ~ day|record_id)
#model covariance structure. corAR1() first order autoregressive covariance structure, timepoints equally spaced
armodel <- update(timers, correlation = corAR1(0, form = ~day|record_id))
Error in `coef<-.corARMA`(`*tmp*`, value = value[parMap[, i]]) : Coefficient matrix not invertible

数据：

record_id   day troponin
1   1   32
2   0     NA
2   1   NA
2   2   NA
2   3   8
2   4   6
2   5   7
2   6   7
2   7   7
2   8   NA
2   9   9
3   0   14
3   1   1167
3   2   1935
4   0   19
4   1   16
4   2   29
5   0   NA
5   1   17
5   2   47
5   3   684
6   0   46
6   1   45440
6   2   47085
7   0   48
7   1   87
7   2   44
7   3   20
7   4   15
7   5   11
7   6   10
7   7   11
7   8   197
8   0   28
8   1   31
9   0   NA
9   1   204
10  0   NA
10  1   19

最佳答案

如果将优化器更改为“ nlminb”（或至少与您发布的精简数据集一起使用），则可以适合此要求。

armodel <- update(timers,
              correlation = corAR1(0, form = ~day|record_id),
              control=list(opt="nlminb"))

但是，如果您查看拟合模型，就会发现有问题-估计的AR1参数为-1，并且随机截距和斜率项与r = 0.998相关。

我认为问题在于数据的性质。大部分数据似乎在10-50范围内，但存在一到两个数量级的偏移（例如，单个6个偏移，最大约45000）。很难使模型适合这种尖峰数据。我强烈建议您对数据进行日志转换；标准诊断图（plot(randomintercept)）如下所示：

而适合对数刻度

rlog <- update(randomintercept,log10(troponin) ~ .)
plot(rlog)

尽管仍有一些证据表明存在异方差性，但它还是比较合理的。

AR +随机斜率模型适合：

ar.rlog <- update(rlog,
              random = ~day|record_id,
              correlation = corAR1(0, form = ~day|record_id))
## Linear mixed-effects model fit by maximum likelihood
## ...
## Random effects:
##  Formula: ~day | record_id
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr
## (Intercept) 0.1772409 (Intr)
## day         0.6045765 0.992
## Residual    0.4771523
##
##  Correlation Structure: ARMA(1,0)
##  Formula: ~day | record_id
##  Parameter estimate(s):
##       Phi1
## 0.09181557
## ...

快速浏览intervals(ar.rlog)表明自回归参数的置信区间为（-0.52,0.65），因此可能不值得保留...

有了模型中的随机斜率，异方差似乎不再是问题...

plot(rlog,sqrt(abs(resid(.)))~fitted(.),type=c("p","smooth"))

关于r - 多层次建模的协方差结构，我们在Stack Overflow上找到一个类似的问题：https://stackoverflow.com/questions/39291148/