Why not just read the help file?
-----------------------------------------------------------------------------------------------------------
help for regress manual: [R] regress
dialogs: regress predict
-----------------------------------------------------------------------------------------------------------
Linear regression
regress depvar [varlist] [weight] [if exp] [in range] [, level(#) beta robust cluster(varname)
score(newvar) hc2 hc3 hascons noconstant tsscons noheader eform(string) depname(varname)
mse1 plus ]
by ... : may be used with regress; see help by.
aweights, fweights, iweights, and pweights are allowed; see help weights.
depvar and the varlist following depvar may contain time-series operators; see help varlist.
regress shares the features of all estimation commands; see help estcom.
regress may be used with sw to perform stepwise estimation; see help sw.
The syntax of predict following regress is
predict [type] newvarname [if exp] [in range] [, statistic]
where statistic is
        xb                   fitted values; the default
        pr(a,b)              Pr(y | a<y<b)
        e(a,b)               E(y | a<y<b)
        ystar(a,b)           E(y*), y*=max(a,min(y,b))
        cooksd               Cook's distance
        leverage | hat       leverage (diagonal elements of hat matrix)
        residuals            residuals
        rstandard            standardized residuals
        rstudent             Studentized (jackknifed) residuals
        stdp                 standard error of the prediction
        stdf                 standard error of the forecast
        stdr                 standard error of the residual
    (*) covratio             COVRATIO
    (*) dfbeta(varname)      DFBETA for varname
    (*) dfits                DFITS
    (*) welsch               Welsch distance
where a and b may be numbers or variables; a missing (a >= .) means minus infinity; and b missing
(b >= .) means plus infinity.
Unstarred statistics are available both in and out of sample; type "predict ... if e(sample) ..." if
wanted only for the estimation sample. Starred statistics are calculated for the estimation sample
even when "if e(sample)" is not specified.
Description
regress fits a model of depvar on varlist using linear regression.
Here is an abbreviated list of other regression commands that may be of interest. See [R] estimation
commands for a complete list.
    help anova         analysis of variance and covariance
    help cnreg         censored-normal regression
    help heckman       Heckman selection model
    help intreg        interval regression
    help ivreg         instrumental variables (2SLS) regression
    help newey         regression with Newey-West standard errors
    help prais         Prais-Winsten, Cochrane-Orcutt, or Hildreth-Lu regression
    help qreg          quantile (including median) regression
    help reg3          three-stage least squares regression
    help rreg          robust regression (NOT robust standard errors)
    help sureg         seemingly unrelated regression
    help svyheckman    Heckman selection model with survey data
    help svyintreg     interval regression with survey data
    help svyivreg      instrumental variables regression with survey data
    help svyregress    linear regression with survey data
    help tobit         tobit regression
    help treatreg      treatment effects model
    help truncreg      truncated regression
    help xtabond       Arellano-Bond linear, dynamic panel-data estimator
    help xtintreg      panel data interval regression models
    help xtreg         fixed- and random-effects linear models
    help xtregar       fixed- and random-effects linear models with an AR(1) disturbance
    help xttobit       panel data tobit models
Options
level(#) specifies the confidence level, in percent, for confidence intervals of the coefficients;
see help level.
beta requests that normalized beta coefficients be reported instead of confidence intervals. beta
may not be specified with cluster().
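To make the beta option concrete, here is a minimal Python sketch (not Stata code, and not Stata's implementation) of the usual definition of a normalized coefficient: the raw slope multiplied by sd(x)/sd(y). The helper names `ols_simple`, `sample_sd`, and `standardized_beta` and the data are made up for illustration, and the sketch assumes a single regressor.

```python
import math

def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def sample_sd(v):
    """Sample standard deviation (divisor n - 1)."""
    n = len(v)
    mean = sum(v) / n
    return math.sqrt(sum((vi - mean) ** 2 for vi in v) / (n - 1))

def standardized_beta(x, y):
    """Slope rescaled to standard-deviation units of x and y."""
    _, slope = ols_simple(x, y)
    return slope * sample_sd(x) / sample_sd(y)

# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
beta = standardized_beta(x, y)
```

With one regressor the normalized coefficient coincides with the sample correlation between x and y, which is a handy sanity check.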
robust specifies that the Huber/White/sandwich estimator of variance is to be used in place of the
traditional calculation. robust combined with cluster() further allows observations which are
not independent within cluster (although they must be independent between clusters). See [U]
23.14 Obtaining robust variance estimates.
cluster(varname) specifies that the observations are independent across groups (clusters) but not
necessarily independent within groups. varname specifies to which group each observation
belongs; e.g., cluster(personid) in data with repeated observations on individuals. cluster()
can be used with pweights to produce estimates for unstratified cluster-sampled data, but see
help svyregress for a command especially designed for survey data. Specifying cluster() implies
robust.
score(newvar) creates a new variable for the scores from the equation in the model. The new variable
contains each observation's contribution to the score; see [U] 23.15 Obtaining scores.
hc2 and hc3 specify an alternative bias correction for the robust variance calculation. hc2 and hc3
may not be specified with cluster().
hc2 uses u_j^2/(1-h_j) as the observation's variance estimate.
hc3 uses u_j^2/(1-h_j)^2 as the observation's variance estimate.
Specifying either hc2 or hc3 implies robust.
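The two formulas above can be tried out with a small Python sketch (not Stata code). It assumes a single regressor plus a constant, so the leverage h_j has the closed form 1/n + (x_j - xbar)^2/Sxx. The function names and data are hypothetical, and any overall scaling factor that the default robust calculation may apply is deliberately left out.

```python
def leverages(x):
    """Diagonal of the hat matrix for a regression on a constant and x."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]

def ols_residuals(x, y):
    """Residuals u_j from the least-squares line y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]

def variance_terms(x, y):
    """Per-observation variance estimates: uncorrected, hc2, and hc3."""
    terms = []
    for u_j, h_j in zip(ols_residuals(x, y), leverages(x)):
        terms.append({
            "plain": u_j ** 2,                  # squared residual, no correction
            "hc2": u_j ** 2 / (1 - h_j),        # hc2 formula from the text
            "hc3": u_j ** 2 / (1 - h_j) ** 2,   # hc3 formula from the text
        })
    return terms

# Illustrative data; the last point has high leverage.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [1.2, 1.9, 3.1, 4.2, 9.5]
terms = variance_terms(x, y)
```

Because 0 < h_j < 1, each observation's term grows from the uncorrected value to hc2 to hc3, and the inflation is largest for high-leverage points.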
hascons indicates that a user-defined constant or its equivalent is specified among the independent
variables. Some caution is recommended when using this option as resulting estimates may not be
as accurate as they otherwise would be. Use of this option requires "sweeping" the constant
last, so the moment matrix must be accumulated in absolute rather than deviation form. This
option may be safely specified when the means of the dependent and independent variables are all
"reasonable" and there are not large amounts of collinearity between the independent variables.
The best procedure is to view hascons as a reporting option -- estimate with and without hascons
and verify that the coefficients and standard errors of the variables not affected by the
identity of the constant are unchanged. If you do not understand this warning, it is best to
avoid this option.
noconstant suppresses the constant term (intercept) in the regression.
tsscons forces the total sum of squares to be computed as though the model has a constant; i.e., as
deviations from the mean of the dependent variable. This is a rarely used option that has an
effect only when specified with noconstant. It affects only the total sum of squares and all
results derived from the total sum of squares.
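The arithmetic behind tsscons can be sketched in Python (not Stata code). The sketch assumes R-squared is computed as 1 - RSS/TSS, fits a one-regressor model with no constant, and compares the total sum of squares taken around zero with the one taken around the mean of the dependent variable. The function name and data are hypothetical.

```python
def r2_through_origin(x, y):
    """R-squared for y = b*x (no constant) under two definitions of TSS."""
    b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    rss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
    ybar = sum(y) / len(y)
    tss_uncentered = sum(yi ** 2 for yi in y)          # deviations from zero
    tss_centered = sum((yi - ybar) ** 2 for yi in y)   # deviations from the mean
    return {
        "uncentered": 1 - rss / tss_uncentered,
        "centered": 1 - rss / tss_centered,
    }

# Illustrative data whose true intercept is clearly not zero.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 3.9, 5.2, 5.8, 7.1]
r2 = r2_through_origin(x, y)
```

The centered total sum of squares is never larger than the uncentered one, so the centered R-squared is never higher; only statistics derived from the total sum of squares change, while the coefficient b is the same in both cases.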
noheader, eform(), depname(), mse1, and plus are for ado-file writers; see [R] regress.
Options for predict
xb, the default, calculates the fitted values.
pr(a,b) calculates the Pr(a < xb+u < b), the probability that y|x would be observed in the interval
(a,b).
a and b may be specified as numbers or variable names;
pr(20,30) calculates Pr(20 < xb+u < 30);
pr(lb,ub) calculates Pr(lb < xb+u < ub); and
pr(20,ub) calculates Pr(20 < xb+u < ub).
a missing (a >= .) means minus infinity; pr(.,30) calculates Pr(xb+u < 30) and pr(lb,30)
calculates Pr(xb+u < 30) in observations for which lb >= . (and calculates Pr(lb < xb+u < 30)
elsewhere).
b missing (b >= .) means plus infinity; pr(20,.) calculates Pr(xb+u > 20) and pr(20,ub) calculates
Pr(xb+u > 20) in observations for which ub >= . (and calculates Pr(20 < xb+u < ub) elsewhere).
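The interval probability described above can be sketched in Python (not Stata code). The sketch assumes the disturbance u is normal with mean 0 and a known standard deviation `sigma`, which is an assumption of this illustration rather than something stated in the help text. `None` stands in for a missing bound, and the name `pr_interval` is hypothetical.

```python
from statistics import NormalDist

def pr_interval(xb, sigma, a=None, b=None):
    """Pr(a < xb + u < b) for u ~ N(0, sigma^2).

    a=None is treated as minus infinity and b=None as plus infinity,
    mirroring how a missing bound is described in the text.
    """
    dist = NormalDist(mu=xb, sigma=sigma)
    lower = 0.0 if a is None else dist.cdf(a)
    upper = 1.0 if b is None else dist.cdf(b)
    return max(upper - lower, 0.0)

# Analogues of pr(20,30), pr(.,30), and pr(20,.) for one observation
# with fitted value 25 and an assumed sigma of 5.
p_both = pr_interval(25.0, 5.0, 20.0, 30.0)
p_upper_only = pr_interval(25.0, 5.0, None, 30.0)
p_lower_only = pr_interval(25.0, 5.0, 20.0, None)
```

When the bounds are variables rather than numbers, the same function would simply be applied observation by observation.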
e(a,b) calculates E(xb+u | a < xb+u < b), the expected value of y|x conditional on y|x being in the
interval (a,b), which is to say, y|x is truncated. a and b are specified as they are for pr().
ystar(a,b) calculates E(y*), where y* = a if xb+u < a, y* = b if xb+u > b, and y* = xb+u otherwise,
which is to say, y* is censored. a and b are specified as they are for pr().
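The two expectations can be worked through in Python (not Stata code) using the standard normal-theory formulas. As before, the sketch assumes u is normal with mean 0 and a known `sigma`, and `None` stands in for a missing bound. The names `e_ab` and `ystar_ab` are hypothetical stand-ins for the two predict options.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal

def _pdf_cdf(bound, xb, sigma, cdf_if_missing):
    """Standard-normal pdf and cdf at the standardized bound.

    A missing bound (None) sits at +/- infinity: pdf 0, cdf 0 or 1.
    """
    if bound is None:
        return 0.0, cdf_if_missing
    z = (bound - xb) / sigma
    return _STD.pdf(z), _STD.cdf(z)

def e_ab(xb, sigma, a=None, b=None):
    """E(xb+u | a < xb+u < b): mean of y given that y lies in (a, b)."""
    pdf_a, cdf_a = _pdf_cdf(a, xb, sigma, 0.0)
    pdf_b, cdf_b = _pdf_cdf(b, xb, sigma, 1.0)
    return xb + sigma * (pdf_a - pdf_b) / (cdf_b - cdf_a)

def ystar_ab(xb, sigma, a=None, b=None):
    """E(y*), where y* is y clamped to the interval [a, b]."""
    pdf_a, cdf_a = _pdf_cdf(a, xb, sigma, 0.0)
    pdf_b, cdf_b = _pdf_cdf(b, xb, sigma, 1.0)
    # Contribution of outcomes that fall strictly inside (a, b).
    total = xb * (cdf_b - cdf_a) + sigma * (pdf_a - pdf_b)
    if a is not None:
        total += a * cdf_a            # mass piled up at the lower limit
    if b is not None:
        total += b * (1.0 - cdf_b)    # mass piled up at the upper limit
    return total

inside_mean = e_ab(0.0, 1.0, 0.0, None)       # mean of y given y > 0
clamped_mean = ystar_ab(0.0, 1.0, 0.0, None)  # mean of max(y, 0)
```

The conditional mean discards outcomes outside the interval, while the clamped mean keeps them at the nearest limit, so with a lower bound only the clamped mean is the smaller of the two.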
cooksd calculates Cook's D influence statistic.
leverage and hat calculate the diagonal elements of the projection hat matrix.
residuals calculates the residuals.
rstandard calculates the standardized residuals.
rstudent calculates the studentized (jackknifed) residuals.
stdp calculates the standard error of the prediction.
stdf calculates the standard error of the forecast. This is often informally referred to as the
standard error of the prediction.
stdr calculates the standard error of the residuals.
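The difference between stdp, stdf, and stdr is easiest to see from their textbook formulas in terms of the root mean squared error s and the leverage h: s*sqrt(h), s*sqrt(1+h), and s*sqrt(1-h). The Python sketch below (not Stata code) evaluates them for a one-regressor model with a constant. The function name and data are hypothetical, and the residual standard error only makes sense at a point whose leverage is below 1.

```python
import math

def prediction_standard_errors(x, y, x_at):
    """Standard errors at x_at for the least-squares line y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))            # root mean squared error
    h = 1.0 / n + (x_at - xbar) ** 2 / sxx  # leverage at x_at
    return {
        "s": s,
        "stdp": s * math.sqrt(h),        # uncertainty in the fitted line only
        "stdf": s * math.sqrt(1.0 + h),  # fitted line plus a new disturbance
        "stdr": s * math.sqrt(1.0 - h),  # residual at an in-sample point
    }

# Illustrative data; evaluate at the first in-sample point.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 2.3, 2.8, 4.1, 5.2, 5.8]
se = prediction_standard_errors(x, y, x[0])
```

The forecast standard error is always the largest of the three because it adds the variance of a new disturbance to the variance of the fitted value.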
covratio calculates COVRATIO (Belsley, Kuh, and Welsch 1980), a measure of the influence of the jth
observation based on considering the effect on the variance-covariance matrix of the estimates.
The calculation is automatically restricted to the estimation sample.
dfits calculates DFITS (Welsch and Kuh 1977) and attempts to summarize the information in the
leverage versus residual-squared plot into a single statistic. The calculation is automatically
restricted to the estimation sample.
welsch calculates Welsch Distance (Welsch 1982) and is a variation on dfits. The calculation is
automatically restricted to the estimation sample.
dfbeta(varname) calculates the DFBETA for varname, the difference between the regression coefficient
when the jth observation is included and excluded, said difference being scaled by the estimated
standard error of the coefficient. varname must have been included among the regressors in the
previously fitted model. The calculation is automatically restricted to the estimation sample.
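The idea behind dfbeta can be checked by brute force in Python (not Stata code): refit the model with each observation left out and scale the change in the coefficient. The sketch assumes one regressor plus a constant and uses the Belsley, Kuh, and Welsch scaling, i.e. the leave-one-out root MSE divided by sqrt(Sxx) from the full sample. That exact scaling is an assumption of this illustration, and the function names and data are hypothetical.

```python
import math

def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, root MSE, Sxx)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, math.sqrt(rss / (n - 2)), sxx

def dfbeta_slope(x, y):
    """Scaled change in the slope when each observation is dropped in turn."""
    _, b_full, _, sxx_full = fit_line(x, y)
    result = []
    for j in range(len(x)):
        x_drop = x[:j] + x[j + 1:]
        y_drop = y[:j] + y[j + 1:]
        _, b_drop, s_drop, _ = fit_line(x_drop, y_drop)
        # Scale by a standard error built from the leave-one-out root MSE.
        result.append((b_full - b_drop) / (s_drop / math.sqrt(sxx_full)))
    return result

# Illustrative data; the last point is far from the others and off the line.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 4.0]
dfb = dfbeta_slope(x, y)
most_influential = max(range(len(dfb)), key=lambda j: abs(dfb[j]))
```

In this made-up data set the last observation pulls the slope down sharply, so it has by far the largest DFBETA in absolute value, and its sign is negative.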
Examples: linear regression
. regress y x1 x2 x3 x4 x5
. test x1 x2
. test x3=5
. test x3=(x4+x5)/2
. predict yhat if e(sample)
. predict r, resid
. regress y x1 x2 x3 [freq=pop]
. regress y x1 x2 x3 [pweight=pop]
. regress yavg x1avg x2avg x3avg [aweight=pop]
. regress y x1 x2 x3 x4 x5 if region==1
. by region: regress y x1 x2 x3 x4 x5
. by region: regress y x1 x2 x3 x4 x5 if sex=="male"
Examples: regression with robust standard errors
. regress y x1 x2, robust
. regress y x1 x2, robust cluster(patid)
. regress y x1 x2 [pweight=pop], robust
. regress y x1 x2 [pweight=pop]
(Note, specifying pweights implies robust.)
Also see
Manual: [U] 23 Estimation and post-estimation commands,
[U] 29 Overview of Stata estimation commands,
[R] regress
Online: help for estcom, postest, regdiag, sw
[This post was edited by the author on 2007-10-12 12:49:44]