http://repec.sowi.unibe.ch/stata/coefplot/confidence-intervals.html
Confidence intervals
Changing the plot type
By default coefplot draws confidence intervals as spikes. Use ciopts(recast())
to change the plot type. For example, to use capped spikes, type:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store domestic
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store foreign
. coefplot domestic foreign, drop(_cons) xline(0) ciopts(recast(rcap))
Multiple levels
By default, coefplot draws 95% confidence intervals (or uses the level set by set level). To specify a different level or to include multiple confidence intervals, use the levels() option. Here is an example with 99.9%, 99%, and 95% confidence intervals:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn
(output omitted)
. coefplot, drop(_cons) xline(0) msymbol(s) mfcolor(white) ///
levels(99.9 99 95) legend(order(1 "99.9" 2 "99" 3 "95") rows(1))
Line widths are (logarithmically) increased across the confidence intervals. To use different line widths, specify the lwidth() suboption within ciopts():
. coefplot, drop(_cons) xline(0) msymbol(s) mfcolor(white) ///
levels(99.9 99 95) legend(order(1 "99.9" 2 "99" 3 "95") rows(1)) ///
ciopts(lwidth(*1 *3 *6))
Here is a further example inspired by Harrell (2001, Figure 20.4):
. coefplot, drop(_cons) xline(0) msymbol(d) mcolor(white) ///
levels(99 95 90 80 70) ciopts(lwidth(3 ..) lcolor(*.2 *.4 *.6 *.8 *1)) ///
legend(order(1 "99" 2 "95" 3 "90" 4 "80" 5 "70") rows(1))
And here is an example inspired by Cleveland (1994, Figure 3.78):
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store domestic
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store foreign
. coefplot domestic foreign, drop(_cons) xline(0) levels(95 50) ciopts(recast(. rcap))
How CIs are retrieved
To compute confidence intervals, coefplot collects the variances of the coefficients from the diagonal of e(V) (or e(V_mi) for estimates from mi). It then applies the standard formulas for confidence intervals, based on the t-distribution if degrees of freedom are available in scalar e(df_r) (or in matrix e(df_mi) for estimates from mi) and on the normal distribution otherwise. Custom degrees of freedom can be provided through option df(). If variances are stored under a different name than e(V), use the v() option to provide the appropriate name or, alternatively, use option se() to provide custom standard errors (in which case variances from e(V) will be ignored). Likewise, if your estimation command provides precomputed confidence intervals, use the ci() option to include them in the plot (see the example on plotting bootstrap CIs below).
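The retrieval logic described above can be retraced by hand for a single coefficient. The following is a minimal sketch, assuming a plain regress fit in which e(df_r) is available, so that the t-distribution applies. The names V, se, and tcrit are made up for illustration, and this is not coefplot's internal code:

```stata
* Fit a model and recompute the 95% CI of the first coefficient (mpg)
sysuse auto, clear
regress price mpg trunk length turn

* The variance of mpg is the first diagonal element of e(V)
matrix V = e(V)
scalar se = sqrt(V[1,1])

* Critical value from the t-distribution with e(df_r) degrees of freedom
scalar tcrit = invttail(e(df_r), 0.025)

* Lower and upper confidence limits
display _b[mpg] - tcrit*se
display _b[mpg] + tcrit*se
```

The two displayed values should agree with the confidence limits that regress reports for mpg. Without degrees of freedom, the critical value would be invnormal(0.975) instead.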
For example, in survey estimation, you might want to compare the design-based confidence intervals with the confidence intervals you would obtain in a hypothetical simple random sample (SRS) of the same size. The svy command stores the hypothetical SRS variances in e(V_srs). Hence, to display design-based and SRS-based confidence intervals, you could type:
. webuse nhanes2f, clear
. svyset psuid [pweight=finalwgt], strata(stratid)
(output omitted)
. svy: regress zinc age age2 weight female black orace rural
(running regress on estimation sample)
Survey: Linear regression

Number of strata = 31                            Number of obs   =       9,189
Number of PSUs   = 62                            Population size = 104,176,071
                                                 Design df       =          31
                                                 F(7, 25)        =       62.50
                                                 Prob > F        =      0.0000
                                                 R-squared       =      0.0698

             |             Linearized
        zinc | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         age |  -.1701161   .0844192    -2.02   0.053    -.3422901     .002058
        age2 |   .0008744   .0008655     1.01   0.320    -.0008907    .0026396
      weight |   .0535225   .0139115     3.85   0.001     .0251499    .0818951
      female |  -6.134161   .4403625   -13.93   0.000    -7.032286   -5.236035
       black |  -2.881813   1.075958    -2.68   0.012    -5.076244    -.687381
       orace |  -4.118051   1.621121    -2.54   0.016    -7.424349   -.8117528
       rural |  -.5386327   .6171836    -0.87   0.390    -1.797387    .7201216
       _cons |   92.47495   2.228263    41.50   0.000     87.93038    97.01952
. coefplot (., label(design-based)) (., v(V_srs) label(SRS-based)) ///
, keep(female black orace rural) xlabel(,grid)
When computing the SRS-based confidence intervals, you might also want to take into account that the residual degrees of freedom of the model would be different in a hypothetical SRS. By default, coefplot uses the information in e(df_r), which is equal to 31 in the example. In an SRS, however, the degrees of freedom would be e(N) - e(df_m) - 1, which is equal to 9181 in the example. To use the corrected degrees of freedom for displaying the SRS-based confidence intervals, you could type:
. local df_r = e(N) - e(df_m) - 1
. coefplot (., label(design-based)) (., v(V_srs) df(`df_r') label(SRS-based)) ///
, keep(female black orace rural) xlabel(,grid)
Comparing the two graphs you will see that, due to the increased degrees of freedom, the SRS-based CIs in the second graph are slightly narrower than in the first graph.
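The size of this effect can be gauged from the critical values alone. Here is a small sketch using Stata's invttail() function with the two degrees of freedom from the example:

```stata
* 97.5% quantile of the t-distribution with the design df of 31
display invttail(31, 0.025)

* Same quantile with the SRS degrees of freedom, 9189 - 7 - 1 = 9181
display invttail(9181, 0.025)
```

The first value is about 2.04 and the second about 1.96, so the change in degrees of freedom alone shortens the SRS-based intervals by roughly 4%.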
Plotting bootstrap CIs
Bootstrap estimates obtained by the vce(bootstrap) option or the bootstrap command provide normal-approximation, percentile, and bias-corrected confidence intervals (for the confidence level specified at the time of estimation) in e(ci_normal), e(ci_percentile), and e(ci_bc). Use the ci() option to plot these confidence intervals:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn, vce(bootstrap)
(output omitted)
. coefplot (., ci(ci_normal) label(normal)) ///
(., ci(ci_percentile) label(percentile)) ///
(., ci(ci_bc) label(bc)) ///
, drop(_cons) xline(0) legend(rows(1))
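To check which limits ci() picks up, the stored matrices can be listed after estimation. This is a small sketch; the seed and the number of replications are arbitrary choices for illustration, not values used in the example above:

```stata
sysuse auto, clear

* Arbitrary seed so that the resampling is reproducible
set seed 12345
regress price mpg trunk length turn, vce(bootstrap, reps(200))

* Each matrix has one column per coefficient,
* with rows for the lower and upper limits
matrix list e(ci_normal)
matrix list e(ci_percentile)
matrix list e(ci_bc)
```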
Smoothed CIs
Option cismooth adds smoothed confidence intervals (inspired by code provided in a post by David B. Sparks). By default, cismooth generates confidence intervals for 50 equally spaced levels (1, 3, …, 99) with graduated color intensities and varying line widths, as illustrated in the following example:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store domestic
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store foreign
. coefplot domestic foreign, drop(_cons) xline(0) cismooth grid(none)
The smoothed confidence intervals are produced independently of levels() and ci() and are not affected by ciopts(). Their appearance, however, can be set by a number of suboptions. If cismooth is specified together with levels() or ci(), the smoothed confidence intervals are placed behind the confidence intervals from levels() or ci().
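As a hedged sketch of how such suboptions might be used, the following call assumes the suboptions n() (number of levels) and lwidth() (range of line widths) as described in help coefplot, and assumes that the stored estimates domestic and foreign from the example above are still in memory:

```stata
* Fewer levels and a narrower range of line widths than the defaults
coefplot domestic foreign, drop(_cons) xline(0) ///
    cismooth(n(20) lwidth(2 10)) grid(none)
```

See help coefplot for the full list of suboptions and their defaults.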
CIs for proportions
When plotting proportions, you may want to apply option citype(logit) to ensure that the confidence limits stay within 0 and 1 (see help proportion):
. sysuse auto, clear
(1978 automobile data)
. proportion rep78 if foreign==0
Proportion estimation                       Number of obs = 48

             |                                   Logit
             | Proportion   Std. err.     [95% conf. interval]
-------------+------------------------------------------------
       rep78 |
          1  |   .0416667   .0288424      .0100647    .1567801
          2  |   .1666667   .0537914      .0840476    .3035829
          3  |      .5625   .0716027      .4172638    .6977581
          4  |      .1875   .0563367      .0988312    .3268658
          5  |   .0416667   .0288424      .0100647    .1567801
. estimates store domestic
. proportion rep78 if foreign==1
Proportion estimation                       Number of obs = 21

             |                                   Logit
             | Proportion   Std. err.     [95% conf. interval]
-------------+------------------------------------------------
       rep78 |
          3  |   .1428571   .0763604      .0434141    .3796739
          4  |   .4285714   .1079898      .2301427    .6529748
          5  |   .4285714   .1079898      .2301427    .6529748
. estimates store foreign
. coefplot domestic foreign, xtitle(Repair Record 1978) ytitle(Proportion) ///
vertical recast(bar) barwidth(0.25) finten(60) ///
citop citype(logit) ciopts(recast(rcap)) rename(*.rep78 = "")
(Option rename() can be omitted in Stata 15 or lower, or if version is set to 15 or lower.)
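For intuition about what a logit-type interval does, the limits reported by proportion can be retraced by hand. The sketch below assumes the usual construction (an interval computed on the logit scale with a t critical value on N - 1 degrees of freedom, then transformed back) and uses the estimate for category 3 among domestic cars from the output above. The scalar names are made up for illustration:

```stata
* Estimate and standard error for rep78 == 3, domestic cars (N = 48)
scalar prop    = .5625
scalar se_prop = .0716027
scalar tcrit   = invttail(47, 0.025)

* Half-width of the interval on the logit scale (delta method)
scalar halfw = tcrit * se_prop / (prop * (1 - prop))

* Back-transform the limits to the probability scale
display invlogit(logit(prop) - halfw)
display invlogit(logit(prop) + halfw)
```

The results should be close to the limits .4172638 and .6977581 shown in the table. Because invlogit() maps any real number into the interval (0, 1), the limits cannot leave that range.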
Truncated confidence spikes
Sometimes it makes sense to truncate wide confidence intervals so that the rest of the information in the plot is easier to see. The following example illustrates how such truncation can be achieved using the transform() option. When truncating the confidence intervals, make sure that the truncated spikes go all the way to the edge of the plot region. This is why the margin of the plot region is set to zero in the example:
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. regress wage ibn.occupation, nocons
(output omitted)
. coefplot, transform(* = min(max(@,1.5),12.5)) ///
xscale(range(1.5 12.5)) plotregion(margin(zero))
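The expression inside transform() clamps each value to the interval [1.5, 12.5], with @ standing for the value being transformed. A quick way to see its effect is to evaluate the same expression for a few made-up numbers:

```stata
* Below the lower bound: raised to 1.5
display min(max(0.7, 1.5), 12.5)

* Inside the bounds: left unchanged at 5
display min(max(5, 1.5), 12.5)

* Above the upper bound: lowered to 12.5
display min(max(20, 1.5), 12.5)
```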
An alternative might be as follows:
. coefplot, transform(* = min(max(@,2),12)) ///
plotregion(color(gray) icolor(white)) grid(nogextend)
Furthermore, here is a somewhat involved example that uses the if() option to select a different plot type depending on truncation:
. coefplot (., pstyle(p1) if(@ll>2&@ul<12)) ///
(., pstyle(p1) if(@ll>2&@ul>=12) ciopts(recast(pcarrow))) ///
(., pstyle(p1) if(@ll<=2&@ul<12) ciopts(recast(pcrarrow))) ///
(., pstyle(p1) if(@ll<=2&@ul>=12) ciopts(recast(pcbarrow))) ///
, nooffset transform(* = min(max(@,2),12)) legend(off)