This explanation is fairly detailed. Basically, you can think of drift as the time series growing linearly over time.
When fitting ARIMA models with R, a constant term is NOT included in the model if there is any differencing. The best R will do by default is fit a mean if there is no differencing [type ?arima for details]. What's wrong with this? Well (with a time series in x), for example:
arima(x, order = c(1, 1, 0)) # (1)
will not produce the same result as
arima(diff(x), order = c(1, 0, 0)) # (2)
because in (1), R will fit the model [with ∇x(s) = x(s)-x(s-1)]
∇x(t)= φ*∇x(t-1) + w(t) (no constant)
whereas in (2), R will fit the model
∇x(t) = α + φ*∇x(t-1) + w(t). (constant)
If there's drift (i.e., α is NOT zero), the two fits can be extremely different and using (1) will lead to an incorrect fit and consequently bad forecasts (see Issue 3 below).
If α is NOT zero, then what you have to do to correct (1) is use xreg as follows:
arima(x, order = c(1, 1, 0), xreg=1:length(x)) # (1+)
Why does this work? In symbols, xreg = t and consequently, R will replace x(t) with y(t) = x(t)-β*t; that is, it will fit the model
∇y(t)= φ*∇y(t-1) + w(t),
or
∇[x(t) - β*t] = φ*∇[x(t-1) - β*(t-1)] + w(t).
Simplifying,
∇x(t) = α + φ*∇x(t-1) + w(t) where α = β*(1-φ).
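As a quick numerical check of the relation α = β*(1-φ), here is a minimal sketch on simulated data where φ is not zero. The seed, the sample size, the true values phi = 0.5 and alpha = 1, and the column name "drift" are all illustrative choices made for this sketch; they are not part of the example that follows.

```r
# Sketch: check alpha = beta * (1 - phi) on simulated data (all values are assumptions).
set.seed(42)      # arbitrary seed
n     <- 1000
phi   <- 0.5      # true AR coefficient of diff(x)
alpha <- 1        # true intercept of the model for diff(x)

# Simulate dx(t) = alpha + phi * dx(t-1) + w(t), then integrate to get x.
dx <- as.numeric(stats::filter(alpha + rnorm(n), filter = phi, method = "recursive"))
x  <- cumsum(dx)

# Fit (1+); naming the xreg column "drift" makes its coefficient easy to extract.
fit <- arima(x, order = c(1, 1, 0), xreg = cbind(drift = seq_along(x)))

phi_hat   <- coef(fit)[["ar1"]]
beta_hat  <- coef(fit)[["drift"]]       # estimates beta, the mean of diff(x)
alpha_hat <- beta_hat * (1 - phi_hat)   # implied intercept of the model for diff(x)

print(c(phi = phi_hat, beta = beta_hat, alpha = alpha_hat))
```

With these settings the true β is α/(1-φ) = 2, so the xreg coefficient should come out near 2 while the implied intercept comes out near 1.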
If you want to see the differences, generate a random walk with drift and try to fit an ARIMA(1,1,0) model to it. Here's how:
set.seed(1) # so you can reproduce the results
v = rnorm(100,1,1) # v contains 100 iid N(1,1) variates
x = cumsum(v) # x is a random walk with drift = 1
plot.ts(x) # pretty picture...
arima(x, order = c(1, 1, 0)) #(1)
Coefficients:
         ar1
      0.6031
s.e.  0.0793
arima(diff(x), order = c(1, 0, 0)) #(2)
Coefficients:
          ar1  intercept   <-- remember, this is the mean of diff(x)
      -0.0031     1.1163       and NOT the intercept
s.e.   0.1002     0.0897
arima(x, order = c(1, 1, 0), xreg=1:length(x)) #(1+)
Coefficients:
          ar1  1:length(x)   <-- this is β, the drift (the mean of diff(x));
      -0.0031       1.1169       the intercept is α = β*(1-φ), nearly the same
s.e.   0.1002       0.0897       here because φ is about 0... got a headache?
Let me explain what's going on here. The model generating the data is
x(t) = 1 + x(t-1) + w(t)
where w(t) is N(0,1) noise. Another way to write this is
[x(t)-x(t-1)] = 1 + 0*[x(t-1)-x(t-2)] + w(t)
or
∇x(t) = 1 + 0*∇x(t-1) + w(t)
so, if you fit an AR(1) to ∇x(t), the estimates should be, approximately, ar1 = 0 and intercept = 1.
Note that (1) gives the WRONG answer because it's forcing the regression to go through the origin. But, (2) and (1+) give the correct answers expressed in two different ways.
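To see what the wrong fit does to forecasts, the following sketch compares 10-step-ahead predictions from (1) and (1+) on the same simulated series. The horizon h = 10, the object names, and the column name "drift" are assumptions made for illustration; note that predict() needs the future values of the regressor passed through newxreg.

```r
# Sketch: forecasts from the fit without drift (1) versus with drift (1+).
set.seed(1)
x <- cumsum(rnorm(100, 1, 1))   # same random walk with drift = 1 as above
h <- 10                         # forecast horizon (arbitrary choice)

fit_nodrift <- arima(x, order = c(1, 1, 0))                        # (1)
fit_drift   <- arima(x, order = c(1, 1, 0),
                     xreg = cbind(drift = seq_along(x)))           # (1+)

# Forecasts from (1): no regressor needed.
pred_nodrift <- predict(fit_nodrift, n.ahead = h)$pred

# Forecasts from (1+): supply the time index for the next h steps.
future_t   <- cbind(drift = length(x) + 1:h)
pred_drift <- predict(fit_drift, n.ahead = h, newxreg = future_t)$pred

print(round(cbind(no_drift = pred_nodrift, drift = pred_drift), 2))
```

The forecasts from (1) should level off after a few steps, while those from (1+) keep climbing by roughly the estimated drift at each step, which is what a random walk with drift calls for.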