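My data are individual-level inpatient reimbursement expenses.
The dependent variable is expen (inpatient expenses).
The independent variables are the patient's age and gender.
The intervention is a policy.
There are two groups, identified by treated (=1 for the intervention group, =0 for the control group).
I want to study the effect of the policy on expen.
Working in Stata 12, I first ran the following commands for PSM: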
global treatment treated
global ylist expen
global xlist age male
global breps 5
describe $treatment $ylist $xlist
summarize $treatment $ylist $xlist
bysort $treatment: summarize $treatment $ylist $xlist
* Propensity score matching with common support
pscore $treatment $xlist, pscore(myscore) blockid(myblock) comsup
Running the commands above produces the following output:
****************************************************
Algorithm to estimate the propensity score
****************************************************
The treatment is treated
treated | Freq. Percent Cum.
------------+-----------------------------------
0 | 97,720 58.57 58.57
1 | 69,134 41.43 100.00
------------+-----------------------------------
Total | 166,854 100.00
Estimation of the propensity score
Iteration 0: log likelihood = -112190.35
Iteration 1: log likelihood = -110807.85
Iteration 2: log likelihood = -110807.42
Probit regression Number of obs = 165176
LR chi2(2) = 2765.86
Prob > chi2 = 0.0000
Log likelihood = -110807.42 Pseudo R2 = 0.0123
------------------------------------------------------------------------------
treated | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
age | .0073063 .0001407 51.92 0.000 .0070305 .0075821
male | -.0175782 .0063238 -2.78 0.005 -.0299726 -.0051838
_cons | -.516637 .0075214 -68.69 0.000 -.5313786 -.5018953
------------------------------------------------------------------------------
Note: the common support option has been selected
The region of common support is [.29659633, .5818742]
Description of the estimated propensity score
in region of common support
Estimated propensity score
-------------------------------------------------------------
Percentiles Smallest
1% .2991284 .2965963
5% .306783 .2965963
10% .3155763 .2965963 Obs 165168
25% .3636981 .2965963 Sum of Wgt. 165168
50% .4243344 Mean .4165852
Largest Std. Dev. .0636628
75% .4688119 .5732959
90% .4950135 .5761594 Variance .004053
95% .5084037 .5778587 Skewness -.2840554
99% .5241467 .5818742 Kurtosis 2.009277
******************************************************
Step 1: Identification of the optimal number of blocks
Use option detail if you want more detailed output
******************************************************
The final number of blocks is 26
This number of blocks ensures that the mean propensity score
is not different for treated and controls in each blocks
**********************************************************
Step 2: Test of balancing property of the propensity score
Use option detail if you want more detailed output
**********************************************************
Variable age is not balanced in block 15
Variable male is not balanced in block 15
Variable age is not balanced in block 25
Variable male is not balanced in block 25
Variable male is not balanced in block 26
The balancing property is not satisfied
Try a different specification of the propensity score
Inferior |
of block | treated
of pscore | 0 1 | Total
-----------+----------------------+----------
.29375 | 235 223 | 458
.296875 | 1,093 516 | 1,609
.3015625 | 1,796 415 | 2,211
.3023438 | 120 124 | 244
.303125 | 1,425 347 | 1,772
.3046875 | 639 327 | 966
.30625 | 5,281 1,292 | 6,573
.3125 | 3,607 917 | 4,524
.31875 | 2,450 869 | 3,319
.325 | 825 353 | 1,178
.328125 | 291 121 | 412
.3296875 | 473 317 | 790
.33125 | 1,157 625 | 1,782
.334375 | 882 557 | 1,439
.3375 | 2,375 1,640 | 4,015
.34375 | 889 614 | 1,503
.346875 | 476 353 | 829
.3484375 | 292 293 | 585
.35 | 14,532 9,640 | 24,172
.4 | 57,495 49,292 | 106,787
-----------+----------------------+----------
Total | 96,333 68,835 | 165,168
Note: the common support option has been selected
*******************************************
End of the algorithm to estimate the pscore
*******************************************
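The log above ends with "Try a different specification of the propensity score". With a sample this large, the balancing tests can reject even small differences, and a common response is to make the probit model more flexible. The following is a minimal sketch of that idea, assuming squared and interaction terms are worth trying here. The names age2, age_male, myscore2 and myblock2 are hypothetical and do not appear in the original code, and there is no guarantee that this specification will pass the test.

```stata
* Sketch only: a more flexible propensity score specification.
* age2, age_male, myscore2, myblock2 are hypothetical names.
gen age2 = age^2
gen age_male = age*male

* Same pscore call as before, with the extra terms added.
* The detail option (mentioned in the log) prints the block-by-block tests.
pscore treated age age2 age_male male, pscore(myscore2) blockid(myblock2) comsup detail
```

The interaction is generated by hand because pscore is a user-written command and may not accept factor-variable notation.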
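My questions are:
1. The part of the output shown in red reports a problem (the balancing property is not satisfied). What causes this, and how should I fix it?
2. I looked up the diff command in Stata and found the following examples:
(1) Diff-in-Diff with covariates.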
diff fte, t(treated) p(t) cov(bk kfc roys)
diff fte, t(treated) p(t) cov(bk kfc roys) report
diff fte, t(treated) p(t) cov(bk kfc roys) report bs
(2) Kernel Propensity Score Diff-in-Diff.
diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id)
diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id) support
diff fte, t(treated) p(t) cov(bk kfc roys) kernel id(id) support addcov(wendys)
diff fte, t(treated) p(t) kernel id(id) ktype(gaussian) pscore(_ps)
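If I use (2), can I skip the separate PSM step and do everything directly with the commands in this example? Also, what does id stand for, and how should I set it up or generate it in my dataset?
I look forward to your help and answers. Thank you!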
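To make the id question concrete: in the diff examples, id() names a variable that identifies the same unit in both periods, and p() names a before/after period indicator. A minimal sketch of how this might look on my data, assuming the dataset has a raw patient identifier and a period dummy. Both patient_code and post are hypothetical names that are not in my description above, and whether my claims data actually follow the same patients across periods is itself an assumption.

```stata
* Sketch only. Hypothetical variables:
*   patient_code : raw patient identifier (string or numeric)
*   post         : 0 = before the policy, 1 = after the policy

* Build a numeric id that is constant for the same patient across periods
egen long id = group(patient_code)

* Check that id and post together identify one row each (panel structure);
* this stops with an error if a patient has several records in one period
isid id post

* Kernel propensity score diff-in-diff with the variables from this question
diff expen, t(treated) p(post) cov(age male) kernel id(id) support
```

If the data are repeated cross-sections rather than a panel, an id built this way would not link anyone across periods. In that case the installed version's help file (help diff) is the place to check which kernel options apply.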









