Wikipedia's explanation:
http://en.wikipedia.org/wiki/Degrees_of_freedom_(statistics)
An only slightly less simple example is that of least-squares estimation of a and b in the model

$$Y_i = a + b x_i + \varepsilon_i, \quad i = 1, \dots, n,$$

where the $\varepsilon_i$ and hence the $Y_i$ are random. Let $\hat{a}$ and $\hat{b}$ be the least-squares estimates of a and b. Then the residuals

$$e_i = y_i - (\hat{a} + \hat{b} x_i)$$

are constrained to lie within the space defined by the two equations

$$\sum_{i=1}^{n} e_i = 0, \qquad \sum_{i=1}^{n} x_i e_i = 0.$$

One says that there are n − 2 degrees of freedom for error.
The capital Y is used in specifying the model, and lower-case y in the definition of the residuals. That is because the former are hypothesized random variables and the latter are data.
We can generalise this to multiple regression involving p parameters and covariates (e.g. p − 1 predictors and one mean), in which case the cost in degrees of freedom of the fit is p.
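The two constraint equations in the quoted passage can be checked numerically: the residuals of any least-squares line automatically satisfy them, which is why the residual vector is confined to an (n − 2)-dimensional subspace. A minimal sketch in Python with NumPy, using made-up data (the specific x, y values are illustrative, not from the article):

```python
import numpy as np

# Hypothetical data: any x, y of length n would do.
rng = np.random.default_rng(0)
n = 10
x = np.arange(n, dtype=float)
y = 2.0 + 3.0 * x + rng.normal(size=n)

# Least-squares fit of y = a + b*x via the design matrix [1, x].
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a_hat, b_hat = coef

# Residuals e_i = y_i - (a_hat + b_hat * x_i).
e = y - (a_hat + b_hat * x)

# The two linear constraints from the normal equations:
#   sum(e_i) = 0   and   sum(x_i * e_i) = 0,
# so e lives in an (n - 2)-dimensional subspace:
# n - 2 degrees of freedom for error.
print(np.allclose(e.sum(), 0.0))        # True
print(np.allclose((x * e).sum(), 0.0))  # True
```

With p parameters in the fit (here p = 2), the same check generalises: the residuals satisfy p linear constraints, one per column of the design matrix.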
To be honest, these explanations all operate at the same level: they rely mainly on natural language. The actual mathematical derivation is never given.
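For completeness, here is a hedged sketch of the missing derivation, using the standard hat-matrix argument from linear-model theory (this is my addition, not part of the quoted article). Write the model in matrix form as

$$Y = X\beta + \varepsilon,$$

where $X \in \mathbb{R}^{n \times p}$ has full column rank $p$ (for the simple line, $p = 2$). The fitted values are $\hat{Y} = HY$ with the hat matrix $H = X(X^\top X)^{-1}X^\top$, so the residual vector is

$$e = (I - H)Y = (I - H)\varepsilon,$$

since $(I - H)X = 0$. Because $I - H$ is an orthogonal projection of rank $n - p$, the residual vector is confined to an $(n - p)$-dimensional subspace. Under i.i.d. normal errors this gives

$$\frac{\|e\|^2}{\sigma^2} \sim \chi^2_{n-p},$$

which is the precise sense in which the error has $n - p$ (here $n - 2$) degrees of freedom.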