GARCH is a time series model, and R-squared is not a suitable measure for evaluating this kind of model; AIC and BIC are more commonly used.
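As a rough sketch of what comparing models by AIC/BIC involves, the snippet below applies the standard definitions AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L. The log-likelihood, parameter count and sample size are hypothetical numbers chosen only for illustration; in practice they would come from the fitted GARCH model.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical values, NOT from any real fit: a GARCH(1,1) with
# 3 parameters (omega, alpha, beta) fitted to 1000 observations.
log_lik = -1400.0
k = 3
n = 1000

model_aic = aic(log_lik, k)
model_bic = bic(log_lik, k, n)
print(model_aic, model_bic)
```

When two candidate specifications are fitted to the same data, the one with the lower AIC or BIC is usually preferred; BIC penalises extra parameters more heavily once n is larger than about 8.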
To quote an answer from Cross Validated:
What is the problem with using R-squared in time series models?
(http://stats.stackexchange.com/questions/101546/what-is-the-problem-with-using-r-squared-in-time-series-models)
Some aspects of the issue:
If somebody gives us a vector of numbers y and a conformable matrix of numbers X, we do not need to know what is the relation between them to execute some estimation algebra, treating y as the dependent variable. The algebra will result, irrespective of whether these numbers represent cross-sectional or time series or panel data, or of whether the matrix X contains lagged values of y etc.
The fundamental definition of the coefficient of determination $R^2$ is
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$
where $SS_{res}$ is the sum of squared residuals from some estimation procedure, and $SS_{tot}$ is the sum of squared deviations of the dependent variable from its sample mean.
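The definition above translates directly into code. The following is a minimal sketch in plain Python; the function name `r_squared` and the toy data are illustrative choices, not part of the quoted answer.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    # Sum of squared residuals from whatever procedure produced y_hat
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    # Sum of squared deviations of y from its sample mean
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y, y)            # fitted values reproduce the data
mean_only = r_squared(y, [2.5] * 4)  # always predicting the sample mean
print(perfect, mean_only)
```

Note that nothing in the function cares where `y_hat` came from, which is the point made in the quoted answer: the number can always be computed, whether or not it is informative.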
Combining, the $R^2$ will always be uniquely calculated, for a specific data sample, a specific formulation of the relation between the variables, and a specific estimation procedure, subject only to the condition that the estimation procedure is such that it provides point estimates of the unknown quantities involved (and hence point estimates of the dependent variable, and hence point estimates of the residuals). If any of these three aspects change, the arithmetic value of $R^2$ will in general change, but this holds for any type of data, not just time-series.
So the issue with $R^2$ and time-series, is not whether it is "unique" or not (since most estimation procedures for time-series data provide point estimates). The issue is whether the "usual" time series specification framework is technically friendly for the $R^2$, and whether $R^2$ provides some useful information.
The interpretation of $R^2$ as "proportion of dependent variable variance explained" depends critically on the residuals adding up to zero. In the context of linear regression (on whatever kind of data), and of Ordinary Least Squares estimation, this is guaranteed only if the specification includes a constant term in the regressor matrix (a "drift" in time-series terminology). In autoregressive time-series models, a drift is in many cases not included.
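A small numerical sketch of this point, using made-up data: the same series is fitted by OLS once through the origin (no constant) and once with a constant. Without the constant the residuals need not sum to zero, and R-squared computed from the definition can even be negative. All variable names and numbers here are illustrative assumptions.

```python
def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Toy data: y hovers around 10 while x trends upward
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.2, 10.1, 9.9, 10.0, 9.8]
n = len(y)

# OLS without a constant: y = b * x
b_no_const = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
fit_no_const = [b_no_const * xi for xi in x]
resid_no_const = [yi - fi for yi, fi in zip(y, fit_no_const)]

# OLS with a constant: y = a + b * x
x_bar = sum(x) / n
y_bar = sum(y) / n
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
fit_const = [intercept + slope * xi for xi in x]
resid_const = [yi - fi for yi, fi in zip(y, fit_const)]

r2_no_const = r_squared(y, fit_no_const)
r2_const = r_squared(y, fit_const)
print("no constant:  ", sum(resid_no_const), r2_no_const)
print("with constant:", sum(resid_const), r2_const)
```

With the constant included, the residuals sum to zero (up to rounding) and the value lies between 0 and 1; without it, neither property holds for this data, so the "share of variance explained" reading breaks down.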
More generally, when we are faced with time-series data, "automatically" we start thinking about how the time-series will evolve into the future. So we tend to evaluate a time-series model based more on how well it predicts future values, than how well it fits past values. But the $R^2$ mainly reflects the latter, not the former. The well-known fact that $R^2$ is non-decreasing in the number of regressors means that we can obtain a perfect fit by continuing to add regressors (any regressors, i.e. any series of numbers, perhaps totally unrelated conceptually to the dependent variable). Experience shows that a perfect fit obtained thus, will also give abysmal predictions outside the sample.
Intuitively, this perhaps counter-intuitive trade-off happens because by capturing the whole variability of the dependent variable into an estimated equation, we turn unsystematic variability into systematic one, as regards prediction. (Here, "unsystematic" should be understood relative to our knowledge: from a purely deterministic philosophical point of view, there is no such thing as "unsystematic variability". But to the degree that our limited knowledge forces us to treat some variability as "unsystematic", the attempt to nevertheless turn it into a systematic component brings prediction disaster.)
In fact this is perhaps the most convincing way to show somebody why $R^2$ should not be the main diagnostic/evaluation tool when dealing with time series: increase the number of regressors up to a point where $R^2 \approx 1$. Then take the estimated equation and try to predict the future values of the dependent variable.
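The experiment described above can be sketched in plain Python. Everything here is an illustrative assumption: the dependent variable is a simulated AR(1) series, the regressors are pure noise unrelated to it, and OLS is done by solving the normal equations with a small hand-written Gaussian elimination. The in-sample R-squared climbs toward 1 as regressors are added, and the out-of-sample value can then be compared against it.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(X, beta):
    return [sum(xi * bi for xi, bi in zip(row, beta)) for row in X]

def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

random.seed(0)
n_train, n_test = 20, 20

# Simulated AR(1) series (assumed data-generating process)
y = [0.0]
for _ in range(n_train + n_test - 1):
    y.append(0.5 * y[-1] + random.gauss(0, 1))

# Noise regressors that have nothing to do with y
noise = [[random.gauss(0, 1) for _ in range(n_train - 1)]
         for _ in range(n_train + n_test)]

results = {}
for k in (1, 10, n_train - 1):
    X = [[1.0] + row[:k] for row in noise]  # constant plus first k noise columns
    beta = ols(X[:n_train], y[:n_train])
    r2_in = r_squared(y[:n_train], predict(X[:n_train], beta))
    r2_out = r_squared(y[n_train:], predict(X[n_train:], beta))
    results[k] = (r2_in, r2_out)
    print(k, "regressors: in-sample", round(r2_in, 3), "out-of-sample", round(r2_out, 3))
```

With 19 noise regressors plus a constant there are as many parameters as training observations, so the in-sample fit is essentially perfect even though the regressors carry no information about the series.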
I have attached the Granger and Newbold paper.
I hope this helps the original poster.
Working hard to save up forum coins for next semester's textbooks...