Dynamic Regression Models
Luc Bauwens
Michel Lubrano
Jean-François Richard
DOI:10.1093/acprof:oso/9780198773122.003.0005
Abstract and Keywords
This chapter examines the use of dynamic regression models for inference and prediction with dynamic econometric models. It shows how to extend to the dynamic case the notion of Bayesian cut seen in the static case to justify conditional inference. The chapter also explains how Bayesian inference can be used for single-equation dynamic models. It then treats the particular case of models with autoregressive errors, discusses the specific issues raised by moving average errors, and illustrates the empirical use of the error correction model by an analysis of a money demand function for Belgium.
Keywords: dynamic regression models, econometric models, Bayesian inference, single-equation models, autoregressive errors, moving average errors, error correction, money demand
5.1 Introduction
The previous chapters have introduced all the essential tools of Bayesian analysis. Beyond this, our purpose in the rest of this book is to explain and illustrate how these tools can be used for inference and prediction with dynamic econometric models. This class of models is obviously very large, but stochastic difference equations that are linear in the variables (although not necessarily in the parameters) have been intensively used by econometricians for the last 20 years or so. Their justification, which is to a large extent due to their relative empirical success in economics, has even been grounded by the statistical theory of ‘reduction of dynamic experiments’ of Florens and Mouchart (1982, 1985a, 1985b) as explained in Section 5.2. We show how to extend to the dynamic case the notion of Bayesian cut seen in the static case to justify conditional inference, how to take account of non-stationarity in the Bayesian approach, and how to treat initial conditions which necessarily occur in dynamic models. In Section 5.3, we explain how Bayesian inference can be used for single-equation dynamic models and particularly a popular reparameterization known as the error correction model after Hendry and Richard (1982). In Section 5.4, we treat the particular case of models with autoregressive errors, and in Section 5.5, we discuss the specific issues of moving average errors. Finally, in Section 5.6, we illustrate the empirical use of the error correction model by an analysis of a money demand function for Belgium.
5.2 Statistical Issues Specific to Dynamic Models
Broadly speaking, a model is dynamic whenever the variables are indexed by time and appear with different time lags. For instance, yt = β0xt + β1xt−1 + ut is a simple dynamic model, called a distributed lag model. Here the dynamic structure appears in the exogenous variables. It can also appear in the endogenous variables, as in yt = αyt−1 + ut, which is an autoregressive (AR) model. Finally, the dynamic structure can also appear in the error process, as considered in Sections 5.4 and 5.5. In Chapter 8, we consider models where the dynamic structure is non-linear in the lags of the dependent variable, whereas it is linear in the models of this chapter. In Chapter 7, we consider a different class of models, where the dynamic structure is put on the variance of the error process. In Chapter 9, we consider linear dynamic systems of equations.
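As a purely illustrative sketch (the parameter values, variable names, and zero initial conditions below are our own choices, not part of the text), the two simple dynamic structures just mentioned can be simulated as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 200
beta0, beta1, alpha = 1.0, 0.5, 0.8

# exogenous variable and white-noise errors
x = rng.normal(size=T)
u = rng.normal(scale=0.1, size=T)

# distributed lag model: y_t = beta0*x_t + beta1*x_{t-1} + u_t
y_dl = np.empty(T)
y_dl[0] = beta0 * x[0] + u[0]  # x_{-1} set to zero as initial condition
y_dl[1:] = beta0 * x[1:] + beta1 * x[:-1] + u[1:]

# autoregressive model: y_t = alpha*y_{t-1} + u_t, started at y_0 = 0
y_ar = np.empty(T)
y_ar[0] = u[0]
for t in range(1, T):
    y_ar[t] = alpha * y_ar[t - 1] + u[t]
```

Note how each simulated value of the AR process depends on the previous one, so an initial condition must be supplied before the recursion can start; this is the issue taken up in Subsection 5.2.3.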
(p.130) 5.2.1 Reductions: Exogeneity and Causality
Reductions by marginalization or by conditioning were introduced in Section 2.5 quite generally. No attention was paid to the issue of whether the model might be dynamic rather than static. The notions of cut and of exogeneity were defined relative to a sample of size T, i.e. they were global notions. In a static model (i.e. of independent observations), it does not make any difference if the cut is defined for the complete sample or for each observation. In the sequel of this book, we consider dynamic data generating processes, i.e. processes where the generated random variable is indexed by time. The function which associates t to xt is called the trajectory of the process and the ordered collection of observations the history of the process. It is convenient to note
Xst = (xs, xs+1, …, xt), 0 ≤ s ≤ t ≤ T, (5.1)
so that X0T denotes the whole history and X0t−1 the past of xt.
The first observation x0 plays a specific role and is called the initial condition of the process. It represents presample information, the state of the system when it begins to be observable. The model can be characterized by its data density f(X1T|x0, θ), obtained by conditioning on the initial value x0 (other approaches are discussed in Subsection 5.2.3). For model building, it is convenient to consider the data density from the point of view of sequential analysis, which is based on the analysis of the generating process of an observation xt conditional on the past and on the initial condition. This amounts to considering the following factorization:
f(X1T|x0, θ) = ∏t=1T f(xt|X0t−1, θ). (5.2)
In a sequential model like (5.2) it can be made apparent how prior information is revised by the arrival of a new observation. We can now introduce the definition of a sequential cut.
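The factorization (5.2) can be checked numerically in a simple case. The sketch below (a Gaussian AR(1) process with arbitrary parameter values of our choosing) evaluates the log data density both sequentially, as a sum of one-step-ahead conditional log densities, and directly, as the log density of the joint normal distribution of the sample given x0; the two evaluations coincide up to rounding error.

```python
import numpy as np

rng = np.random.default_rng(1)
a, sig, T, x0 = 0.6, 0.5, 50, 1.0

# simulate x_t = a*x_{t-1} + u_t conditionally on the initial value x_0
u = rng.normal(scale=sig, size=T)
x = np.empty(T)
prev = x0
for t in range(T):
    x[t] = a * prev + u[t]
    prev = x[t]

# (a) sequential evaluation (5.2): sum of one-step conditional log densities
lags = np.concatenate(([x0], x[:-1]))
resid = x - a * lags
loglik_seq = np.sum(-0.5 * np.log(2 * np.pi * sig**2)
                    - resid**2 / (2 * sig**2))

# (b) direct evaluation: x_1..x_T given x_0 is jointly normal with mean
# a^t x_0 and covariance sig^2 * L L', where L[t, i] = a^(t-i) for i <= t
t_idx = np.arange(1, T + 1)
mean = a ** t_idx * x0
L = np.tril(a ** (t_idx[:, None] - t_idx[None, :]))
Sigma = sig**2 * L @ L.T
dev = x - mean
_, logdet = np.linalg.slogdet(Sigma)
loglik_joint = -0.5 * (T * np.log(2 * np.pi) + logdet
                       + dev @ np.linalg.solve(Sigma, dev))
```

The sequential form is of course far cheaper to compute, which is one reason it is the natural starting point for model building.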
Definition 5.1
Let us consider a reparameterization of θ in α and β and a partition of xt into yt and zt. A Bayesian sequential cut is obtained if α and β are a priori independent and if
f(X1T|x0, θ) = ∏t=1T f(yt|zt, X0t−1, α) ∏t=1T f(zt|X0t−1, β). (5.3)
Viewed as a function of the parameters, the likelihood then factorizes correspondingly:
l(α, β; X1T) ∝ ∏t=1T f(yt|zt, X0t−1, α) ∏t=1T f(zt|X0t−1, β). (5.4)
An immediate consequence of a sequential cut is that the two members of the likelihood function (5.4) can be treated separately for inference as we shall see below. We have the following theorem given in Florens and Mouchart (1985a).
Theorem 5.2
If α, β, and zt operate a Bayesian sequential cut, then α and β are a posteriori independent.
Engle, Hendry, and Richard (1983) call this type of exogeneity weak exogeneity. In a dynamic model, there are subtleties due to the occurrence of lagged (p.131) variables. In (5.3), the first product does not represent the sampling density of (Y1T|Z1T, x0) and the second product is not the sampling density of (Z1T|x0). Therefore, at the stage of model building, a sequential cut is not a sufficient condition to separate the generating process into two subprocesses that could be specified separately. Because we are in a dynamic framework, we have to introduce a new definition, considering what we can call a global cut (called an initial cut by Florens and Mouchart 1985a).
Definition 5.3
Let us consider a reparameterization of θ in α and β and a partition of xt into yt and zt. A Bayesian global cut is obtained if α and β are a priori independent and if the data density can be factorized as
f(X1T|x0, θ) = f(Y1T|Z1T, x0, α) f(Z1T|x0, β). (5.5)
We must point out that the parameters α and β are not necessarily the same in the sequential and in the global cut. A global cut introduces a restriction on the marginal process of zt which is
f(zt|X0t−1, β) = f(zt|Z0t−1, β), ∀t.
This means that the past of yt is of no use for predicting zt. This is the notion of non-causality due to Granger (1969). When both a sequential and a global cut hold, we have the notion of strong exogeneity introduced by Engle, Hendry, and Richard (1983).
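Granger non-causality is easy to visualize by simulation. In the bivariate sketch below (with arbitrary coefficients of our choosing), zt is generated without any feedback from past yt; regressing zt on one lag of both variables then yields a coefficient on lagged yt close to zero, while lagged zt does help predict yt.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 5000

# bivariate VAR(1) in (y, z) with A_zy = 0: past y does not enter z's process
y = np.zeros(T)
z = np.zeros(T)
for t in range(1, T):
    z[t] = 0.5 * z[t - 1] + rng.normal()               # z depends on own past only
    y[t] = 0.4 * y[t - 1] + 0.3 * z[t - 1] + rng.normal()

# regress z_t on a constant, z_{t-1}, and y_{t-1}
X = np.column_stack([np.ones(T - 1), z[:-1], y[:-1]])
coef, *_ = np.linalg.lstsq(X, z[1:], rcond=None)
# coef[2], the coefficient on lagged y, is close to zero: y does not
# Granger-cause z in this design
```

Here zt is nevertheless correlated with current and past yt; non-causality is a statement about predictive irrelevance of the past of y, not about independence.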
Definition 5.4
Let us consider a stochastic process in xt indexed by θ and a partition of xt into yt and zt. The variable zt is said to be strongly exogenous if the reparameterization of θ into α and β operates a sequential cut (zt is weakly exogenous for inference on α) and yt does not Granger-cause zt.
If there is weak exogeneity, the only part of the data density that is relevant for inference on α is the first product in (5.4), which is indeed the likelihood kernel of α. The posterior density of α is obtained by
φ(α|X1T, x0) ∝ φ(α) ∏t=1T f(yt|zt, X0t−1, α). (5.6)
The important thing to notice in (5.6) is that we do not need the second product in (5.3) even though it depends on y. We do not need to specify the marginal density f(zt|X0t−1, β) for inference on α.
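To illustrate (5.6), the following sketch computes the posterior of a scalar coefficient α in the conditional model yt = αzt + ut with known error variance and a normal prior. Note that the code never specifies the marginal process of zt, exactly as weak exogeneity permits; the prior values and the data generating choices are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(3)
T, sig2 = 200, 1.0

# z follows some marginal process whose parameters (beta) are never needed
z = np.cumsum(rng.normal(size=T)) * 0.1 + rng.normal(size=T)
alpha_true = 0.7
y = alpha_true * z + rng.normal(scale=np.sqrt(sig2), size=T)

# normal prior alpha ~ N(m0, v0); the posterior uses only the conditional
# likelihood prod_t f(y_t | z_t, alpha), as in (5.6)
m0, v0 = 0.0, 10.0
v_post = 1.0 / (1.0 / v0 + z @ z / sig2)          # posterior variance
m_post = v_post * (m0 / v0 + z @ y / sig2)        # posterior mean
```

This is the standard normal-normal updating formula; the only role of z is as a conditioning variable in the likelihood of α.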
For predictive inference on y given z, weak exogeneity is not sufficient. Let us start from the predictive density of X1T which is
f(X1T|x0) = ∫ f(X1T|x0, θ)φ(θ) dθ. (5.7)
(p.132) If there is a global cut, this becomes
f(X1T|x0) = ∫ f(Y1T|Z1T, x0, α)φ(α) dα × ∫ f(Z1T|x0, β)φ(β) dβ, (5.8)
because
φ(θ) = φ(α)φ(β) (5.9)
and the data density factorizes as in (5.5).
From (5.8) we see that we can forget f(Z1T|x0, β) and φ(β) to compute the predictive density of y given z. However, if there is only weak exogeneity, the first product in (5.3) is not the conditional density f(Y1T|Z1T, x0, α) that we need to compute f(Y1T|Z1T, x0), because the second product in (5.3) depends on y. So in dynamic models, weak exogeneity is necessary and sufficient for posterior inference, as in static models, but strong exogeneity is necessary for prediction.
5.2.2 Reduction of a VAR Model to an ADL Equation
A VAR (vector autoregressive) model with independent normal error terms is a commonly used representation of a dynamic multivariate stochastic process. Let us consider the k-dimensional random variable xt. The VAR model is written
[Ik − A(L)]xt = vt (5.10)
where A(L) is a matrix of lag polynomials of order p (without a term of degree 0 in L):
A(L) = A1L + A2L2 + … + ApLp (5.11)
and vt ~ Nk(0,Σ). For simplicity, we do not introduce deterministic variables in (5.10). In what follows, we assume that the initial conditions x0 … x−p are known, but we do not write them explicitly as conditioning variables. So we write simply Xt−1 for the past of xt, including the initial conditions. The VAR model gained considerable popularity with the work of Sims (1980). Because it requires many parameters, and therefore many observations, and because it often lacks a ‘structural’ interpretation, econometricians are interested in admissible reductions of this model. We can partition xt and Σ conformably in
xt = (yt, zt′)′,  Σ = [Σyy Σyz; Σzy Σzz], (5.12)
where yt is a scalar and zt has k − 1 elements. This partition is made because we wish to find a regression equation where yt is the explained variable and zt contains the explanatory variables. We continue by proposing the conformable partitioning of A(L) in
A(L) = (Ay(L); Az(L)) = [Ayy(L) Ayz(L); Azy(L) Azz(L)], (5.13)
where Ay(L) is the first row of A(L) and Az(L) contains the remaining k − 1 rows.
(p.133) We can factorize the normal distribution of xt
xt|Xt−1 ~ Nk(A(L)xt,Σ) (5.14)
into the marginal distribution of zt
zt|Xt−1 ~ Nk−1(Az(L)xt, Σzz) (5.15)
and the conditional distribution of yt|zt
yt|zt,Xt−1 ~ N(c′zt + b(L)′xt,σ2), (5.16)
where
c = Σzz−1Σzy,
b(L) = Ay(L) − c′Az(L), (5.17)
σ2 = Σyy − ΣyzΣzz−1Σzy.
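The mapping (5.17) from the joint parameters to the conditional-model parameters can be checked numerically. In the sketch below (with an arbitrary positive definite Σ of our choosing), c and σ2 are computed from the partition of Σ, and a Monte Carlo regression of y on z recovers c.

```python
import numpy as np

rng = np.random.default_rng(4)

# a 3-variable example: y_t is the first component, z_t the remaining two
Sigma = np.array([[2.0, 0.8, 0.5],
                  [0.8, 1.5, 0.3],
                  [0.5, 0.3, 1.0]])
Syy, Syz = Sigma[0, 0], Sigma[0, 1:]
Szy, Szz = Sigma[1:, 0], Sigma[1:, 1:]

# conditional-model parameters as in (5.17)
c = np.linalg.solve(Szz, Szy)                 # c = Szz^{-1} Szy
s2 = Syy - Syz @ np.linalg.solve(Szz, Szy)    # sigma^2

# Monte Carlo check: regressing y on z recovers c
V = rng.multivariate_normal(np.zeros(3), Sigma, size=200000)
y, z = V[:, 0], V[:, 1:]
c_hat, *_ = np.linalg.lstsq(z, y, rcond=None)
```

As in the static case, these are just the formulas for conditioning in a multivariate normal distribution, here applied to the innovation of the VAR.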
A sequential cut is obtained if we define the parameters α and β introduced in (5.3) as
α = [c, b(L), σ2],
β = [Az(L), Σzz]. (5.18)
We assume prior independence between α and β. From (5.15) and (5.16), we see that z is weakly exogenous for α without further restrictions, because (5.3) holds automatically in the VAR model. For strong exogeneity, however, we need the restriction of Granger non-causality:
Azy(L)=0, (5.19)
so that lagged values of yt do not appear in the marginal model (5.15). As weak exogeneity is automatically satisfied, the conditional model (5.16) can be analysed independently of the marginal model (5.15). This leads to the regression equation
yt = c′zt + b(L)′xt + ut, (5.20)
where ut ~ N(0,σ2). Introducing the partition
b(L)′ = (by(L), bz(L)′) (5.21)
we can express (5.20) as
yt = by(L)yt + c′ zt + bz(L)′zt + ut. (5.22)
Inference in this type of dynamic regression model, called the ADL (Autoregressive Distributed Lag) model is studied in the next section.
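As an illustration of the ADL form (5.22), the sketch below simulates an ADL(1,1) equation with one explanatory variable and arbitrary coefficients of our choosing, and recovers them by least squares from the stacked regressor matrix (yt−1, zt, zt−1).

```python
import numpy as np

rng = np.random.default_rng(5)
T = 10000
by, c, bz = 0.6, 0.5, -0.2   # illustrative ADL(1,1) coefficients

z = rng.normal(size=T)
u = rng.normal(scale=0.3, size=T)
y = np.zeros(T)
for t in range(1, T):
    # ADL(1,1): y_t = by*y_{t-1} + c*z_t + bz*z_{t-1} + u_t
    y[t] = by * y[t - 1] + c * z[t] + bz * z[t - 1] + u[t]

# stack the regressors (y_{t-1}, z_t, z_{t-1}) and estimate by least squares
X = np.column_stack([y[:-1], z[1:], z[:-1]])
theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
```

With |by| < 1 the process is stable and least squares is consistent despite the lagged dependent variable; Bayesian inference in this model is the topic of the next section.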
(p.134) As in the static case, the (weak) exogeneity property in the dynamic case is a direct consequence of the properties of the multivariate normal distribution. Suppose now that we have incidental parameters, so that (5.10) becomes
[Ik − A(L)](xt − μt) = vt. (5.23)
Let us partition the incidental mean vector μt as xt in (5.12) and let us assume that μt is constrained by the linear relation
μyt = cˉ′ μzt. (5.24)
We have the joint distribution of xt
xt|Xt−1 ~ Nk(μt + A(L) (xt − μt), Σ), (5.25)
which factorizes into the marginal model
zt|Xt−1 ~ Nk−1(μzt + Az(L) (xt − μt), Σzz), (5.26)
and, given (5.24), the conditional model
yt|zt, Xt−1 ~ N(c′zt + b(L)′xt − b(L)′μt + (cˉ − c)′μzt, σ2). (5.27)
The parameters of (5.27) are defined by (5.17). To obtain a conditional model without incidental parameters, we need the restriction
c = cˉ (5.28)
to eliminate the term (cˉ − c)′μzt, and, given (5.28) and (5.24),
cˉ′Azz(L) − Ayz(L) = [Ayy(L) − cˉ′Azy(L)]cˉ′ (5.29)
to eliminate the term b(L)′μt. With these restrictions, if we define the parameters α and β of (5.4) by
α = [cˉ, b(L), σ2],
β = [Az(L), Σzz, μ1, …, μT], (5.30)
z is weakly exogenous for α. Imposing (5.19) also gives strong exogeneity.
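The role of these restrictions can be verified numerically. In the sketch below (one lag, a scalar yt, a two-dimensional zt, coefficients drawn at random by us), Ayz is chosen so that, with c = cˉ imposed, the term b(L)′μt vanishes whenever the incidental means satisfy μyt = cˉ′μzt; the eliminated term is then zero up to rounding error.

```python
import numpy as np

rng = np.random.default_rng(6)

# one lag, y scalar, z with 2 elements; pick the free blocks of A at random
cbar = np.array([0.4, -0.3])
Ayy = 0.5
Azy = rng.normal(size=2)        # coefficients of lagged y in the z-rows
Azz = rng.normal(size=(2, 2))   # coefficients of lagged z in the z-rows
# choose Ayz so that b(L)' mu_t vanishes whenever mu_yt = cbar' mu_zt
Ayz = cbar @ Azz - (Ayy - cbar @ Azy) * cbar

# b(L)' = Ay(L) - c' Az(L), with c = cbar imposed
b_y = Ayy - cbar @ Azy
b_z = Ayz - cbar @ Azz

# any constrained incidental mean mu_t = (cbar' mu_zt, mu_zt')'
mu_z = rng.normal(size=2)
mu_y = cbar @ mu_z
val = b_y * mu_y + b_z @ mu_z   # b(L)' mu_t for this single lag
```

With the restriction imposed, b_z = −(Ayy − cbar′Azy)cbar′, so the two terms of b(L)′μt cancel for every admissible μzt.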
5.2.3 Treatment of Initial Observations
In a dynamic model, the distribution of the initial observations (hereafter denoted y0) plays a special role.