Problem description
I am currently dealing with a very small data set (20 observations; I know it's terrible), but I need to forecast the values somehow. When I simply regress the dependent variable on time, I am able to get predictions, but when I add lagged or differenced variables, the model does not predict more than one year into the future. Is this due to having too few observations?
Here is my code for context. The two lines I have commented out give a better fit to the existing data, but generate only one future prediction.
use "scrappage.dta", clear
drop if year == 1993
tsappend, add(12)
tsset year, y
reg scrappagerate year
*reg scrappagerate year l.scrappagerate l2.scrappagerate
*reg scrappagerate year d.scrappagerate d2.scrappagerate
predict p
predict yp if year>year(2013)
tsline yp p scrappagerate
Sorry if this is a stupid question; this is my first time using Stata to predict values.
Recommended answer
Here's your problem:
The reason you're obtaining only one prediction is not the predict command itself but the nature of your data. Let's say you have N observations. In your case, you used tsappend, add(12), so you now have N+12 observations, and your lagged variable l1.y carries down to row N+1, because the lag in that row is the last observed value of y.
Stata's predict command generates a prediction for every observation in which all the predictors are non-missing. Since your independent variable l1.y is populated in row N+1, Stata predicts that observation. From row N+2 onward the lag is missing, so no prediction is produced. (Similarly, predict won't predict the 1st observation, since your lagged predictor is missing there.)
To get dynamic predictions using OLS regression in Stata, you need to feed this (N+1)th prediction into an X matrix and use the regression coefficient matrix to predict the (N+2)th observation. You then iterate.
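Written out for a single lag, the iteration is the recursion below. This is only a sketch of the idea: the symbols \hat\beta_0 (constant) and \hat\beta_1 (coefficient on the lag) are notation assumed here, not taken from the original post, and the horizon of 12 matches the tsappend, add(12) call.

```latex
\begin{aligned}
\hat{y}_{N+1} &= \hat\beta_0 + \hat\beta_1\, y_{N}
  && \text{(uses the last observed value)} \\
\hat{y}_{N+h} &= \hat\beta_0 + \hat\beta_1\, \hat{y}_{N+h-1},
  \quad h = 2, \dots, 12
  && \text{(uses the previous forecast)}
\end{aligned}
```

Only the first line can be computed by predict directly, because it is the only out-of-sample row whose lag is observed.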
* Example: dynamic prediction using OLS regression and a lagged variable
clear
set obs 12
gen time = _n
gen y = rnormal(100,100)
tsset time
tsappend, add(12)
gen y_lag1 = l1.y
* Fit the regression and save the coefficients (slope first, then constant)
regress y y_lag1
matrix a = r(table)'
matrix beta = a[1..2,1]
* Predict through the N+1 value (notice y_lag1 is populated in the 13th row)
predict yhat
* Predict the remaining values recursively
local lag = 1
forval i = 14/24 {
    local last_y = yhat[`i'-`lag']
    * Row vector ordered to match beta: (lagged y, constant)
    matrix xinput = (`last_y', 1)
    * Compute the next forecast of y
    matrix next_y = xinput*beta
    replace yhat = next_y[1,1] in `i'
}
Compare this with an ARIMA model (as per Dimitriy V. Masterov's comment), and you get nearly identical results.
arima y l1.y
predict yhat_ar, dyn(13)
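To check the two approaches side by side, something like the following self-contained sketch could be used. It is not the answerer's code: the seed value is an arbitrary assumption added for reproducibility, and the recursion uses _b[] coefficient references in place of the matrix multiplication shown earlier, which is a substitution made here for brevity.

```stata
* Sketch: hand-rolled recursive OLS forecast vs. arima dynamic forecast
clear
* Arbitrary seed (an assumption) so the simulated data are reproducible
set seed 12345
set obs 12
gen time = _n
gen y = rnormal(100, 100)
tsset time
tsappend, add(12)

* OLS with one lag; predict fills rows 2 through 13 only
regress y l.y
predict yhat

* Rows 14 to 24: feed the previous forecast back in as the lag
forvalues i = 14/24 {
    replace yhat = _b[_cons] + _b[L.y] * yhat[`i' - 1] in `i'
}

* Same model via arima, with dynamic forecasts starting at time 13
arima y l.y
predict yhat_ar, dynamic(13)

* Compare the two forecast columns over the appended periods
list time yhat yhat_ar in 13/24
```

If the two columns agree closely, that is consistent with the "nearly identical" remark above; any small differences would come from arima estimating by maximum likelihood rather than OLS.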