人大经济论坛 (Renmin University Economics Forum): attachment download

File name: IV-Oprobit.doc
Download link: https://bbs.pinggu.org/a-3515292.html
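When working in Stata we sometimes need an ordered response model, estimated with oprobit. To deal with endogeneity one would like to use instrumental variables, but standard textbooks only cover IV-Probit, and I have hardly ever seen an IV-Oprobit method described. So in this post I try to put together an example based on my own experience.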

Step 1: the command we need is cmp. Run help cmp to read its documentation.


On version requirements: it is best to use Stata 13 or later. From the help file:

Versions 8.0.0 and 8.2.0 of cmp, released in mid-2017 and early 2018,
include changes that can somewhat affect results in hierarchical models.
An older version, 7.1.0, is available as a Github archive, and can be
directly installed, in Stata 13 or later, via "net from
https://raw.github.com/droodman/cmp/v7.1.0".

Version 8.6.2, released in June 2021, requires Stata 13 or later. The
previous version works in Stata 11 and 12 too.


Step 2: install the commands. cmp depends on ghk2, so install both from SSC:

. ssc install cmp, replace

. ssc install ghk2, replace
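Before moving on it can help to confirm that both packages are actually on the ado-path. The lines below are a minimal sketch of that check; they install a package only when which cannot find it, and otherwise leave the existing copy alone.

```stata
* Look for cmp and ghk2; install from SSC only if they are missing
capture which cmp
if _rc {
    ssc install cmp
}
capture which ghk2
if _rc {
    ssc install ghk2
}

* Print the location and version line of each installed file
which cmp
which ghk2
```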

Step 3: get a rough idea of what the equations and indicators in a cmp command mean. From the help file:

To inform cmp about the natures of the dependent variables and about which
equations apply to which observations, the user must include the
indicators() option after the comma in the cmp command line. This must
contain one expression for each equation. The expression can be a
constant, a variable name, or a formula. Formulas that contain spaces or
parentheses should be enclosed in quotes. For each observation, each
expression must evaluate to one of the following codes, with the meanings
shown:

0 = observation is not in this equation's sample
. = observation is in this equation's sample but dependent variable
    unobserved for this observation
1 = equation is "continuous" for this observation, i.e., is linear
    with Gaussian error or is an uncensored observation in a tobit
    equation
2 = observation is left-censored for this (tobit) equation at the
    value stored in the dependent variable
3 = observation is right-censored at the value stored in the dependent
    variable
4 = equation is probit for this observation
5 = equation is ordered probit for this observation
6 = equation is multinomial probit for this observation
7 = equation is interval-censored for this observation
8 = equation is truncated on the left and/or right (obsolete because
    truncation is now a general modeling feature)
9 = equation is rank-ordered probit for this observation
10 = equation is fractional probit for this observation
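To see how an indicator expression can vary by observation, here is a minimal sketch using the auto dataset that ships with Stata. The censoring point of 17 and the rescaled variable wgt are assumptions made purely for illustration. The quoted formula returns code 2 (left-censored) for observations sitting at the limit and code 1 (continuous) for the rest, which should make cmp fit the same model as tobit.

```stata
sysuse auto, clear
generate wgt = weight/1000

* Artificially censor mpg from below at 17 (illustrative assumption)
replace mpg = 17 if mpg < 17

* Reference fit with the built-in tobit command
tobit mpg wgt, ll(17)

* Same model in cmp: one indicator expression for the single equation.
* The formula contains spaces and parentheses, so it is quoted.
cmp (mpg = wgt), indicators("cond(mpg > 17, 1, 2)") quietly
```

Numeric codes are used here on purpose, so the example does not depend on the macros defined by cmp setup.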

For clarity, users can execute the cmp setup subcommand, which defines
global macros that can then be used in cmp command lines:

$cmp_out      = 0
$cmp_missing  = .
$cmp_cont     = 1
$cmp_left     = 2
$cmp_right    = 3
$cmp_probit   = 4
$cmp_oprobit  = 5
$cmp_mprobit  = 6
$cmp_int      = 7
$cmp_trunc    = 8 (deprecated)
$cmp_roprobit = 9
$cmp_frac     = 10
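Since these macros are just names for the numeric codes, a quick way to check that cmp setup has been run in the current session is to display them. This is a small sketch of that check; if the macros are undefined, the display line prints blanks and the assert stops with an error.

```stata
cmp setup

* Show the two codes used later for the IV-Oprobit model
display "cmp_oprobit = $cmp_oprobit, cmp_cont = $cmp_cont"

* Stop early if the macros are missing or were mistyped
assert "$cmp_oprobit" == "5" & "$cmp_cont" == "1"
```

Global macros are lost when Stata is restarted, so cmp setup has to be run again in each new session before any $cmp_* name is used.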


Step 4: worked examples

. cmp setup

. webuse laborsup


Start with an ordinary oprobit and an ordinary mprobit, each followed by its cmp equivalent:

. oprobit kids fem_inc male_educ

. margins, dydx(*) predict(outcome(#2))

. cmp (kids = fem_inc male_educ), ind($cmp_oprobit) qui

. margins, dydx(*) predict(eq(#1) outcome(#2) pr)


. webuse sysdsn3

. mprobit insure age male nonwhite site2 site3

. margins, dydx(nonwhite) predict(outcome(2))

. cmp (insure = age male nonwhite site2 site3, iia), nolr ind($cmp_mprobit) qui

. margins, dydx(nonwhite) predict(eq(#2) pr)



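The help file has no direct IV-Oprobit example, so first borrow from ivprobit. The mprobit example above switched datasets, so reload laborsup first:

. webuse laborsup, clear

. ivprobit fem_work fem_educ kids (other_inc = male_educ), first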

. version 13: margins, predict(pr) dydx(*)

. cmp (fem_work = other_inc fem_educ kids) (other_inc = fem_educ kids male_educ), ind($cmp_probit $cmp_cont)

. margins, predict(pr) dydx(*) force


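From this we can roughly infer the syntax.

Take y1 as an example: in the questionnaire it is a five-category variable coded 1-5, which can be recoded into a binary variable y2 coded 1 and 0. The regressors are x1, x2, x3 and x4, where x1 is the key (endogenous) explanatory variable, mv is its instrument, and x2, x3 and x4 are controls.

The ivprobit model is then:

ivprobit y2 x2 x3 x4 (x1 = mv)

For IV-Oprobit this becomes:

cmp (y1 = x1 x2 x3 x4) (x1 = x2 x3 x4 mv), ind($cmp_oprobit $cmp_cont) technique(dfp) nolrtest

cmp is followed by two equations. The first, y1 = x1 x2 x3 x4, contains every variable of the ordinary oprobit regression. The second, x1 = x2 x3 x4 mv, is the first stage: it regresses the key explanatory variable x1 on its instrument plus all the other regressors.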
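To try the two-equation structure without real survey data, the following sketch simulates a dataset with the same variable names. Everything about the data-generating process (sample size, coefficients, the shared shock u that makes x1 endogenous, cutting the latent index into five equal-sized groups) is an assumption chosen for illustration, not something taken from my actual data.

```stata
clear
set seed 12345
set obs 2000

* Exogenous variables: instrument mv and controls x2-x4
generate mv = rnormal()
generate x2 = rnormal()
generate x3 = rnormal()
generate x4 = rnormal()

* u enters both equations, so x1 is correlated with the outcome error
generate u  = rnormal()
generate x1 = 0.8*mv + 0.3*x2 + u + rnormal()
generate ystar = 0.5*x1 + 0.4*x2 - 0.2*x3 + 0.1*x4 + 0.7*u + rnormal()

* Cut the latent index into five ordered categories coded 1-5
egen y1 = cut(ystar), group(5)
replace y1 = y1 + 1

cmp setup

* Naive ordered probit that ignores the endogeneity of x1
oprobit y1 x1 x2 x3 x4

* IV-Oprobit: ordered probit equation plus linear first stage
cmp (y1 = x1 x2 x3 x4) (x1 = x2 x3 x4 mv), ind($cmp_oprobit $cmp_cont) nolrtest
```

Comparing the x1 coefficient across the two fits gives a feel for how much the instrument changes the estimate in a setting where the endogeneity is known by construction.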

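The ind($cmp_oprobit $cmp_cont) part is fixed syntax: one code per equation, in the same order as the equations, with the meanings listed in Step 3. I recommend copying the command straight from the help file. When I typed it myself I kept getting errors, possibly because of the parentheses, and it only ran after I corrected the line. Note that the $cmp_* macros exist only after cmp setup has been run, and that they must be typed with no space after the $ sign. A typical error message: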

The indicators() option must contain 2 variables, one for each equation. Did you forget to type cmp setup?

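technique(dfp) nolrtest hardly appears in the help file; I picked it up from someone else's question and added it without fully understanding it at first. technique(dfp) is a standard Stata maximization option that selects the Davidon-Fletcher-Powell algorithm, and nolrtest tells cmp to skip the likelihood-ratio test of the model against a constant-only model.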

Then wait for the results, which come in two parts.

Part one:

Fitting individual models as starting point for full model fit.
Note: For programming reasons, these initial estimates may deviate from your specification.
For exact fits of each equation alone, run cmp separately on each.

Iteration 0: log likelihood = -14235.01
Iteration 1: log likelihood = -13216.218
Iteration 2: log likelihood = -13210.845
Iteration 3: log likelihood = -13210.842
Iteration 4: log likelihood = -13210.842

Ordered probit regression (output of the first model)


Then the following appears:

Warning: regressor matrix for _cmp_y1 equation appears ill-conditioned. (Condition number = 4140.8843.)
This might prevent convergence. If it does, and if you have not done so already, you may need to remove nearly
collinear regressors to achieve convergence. Or you may need to add a nrtolerance(#) or nonrtolerance option to the command line.
See cmp tips.


This is followed by the output of another model, and then:


Warning: regressor matrix for erzi equation appears ill-conditioned. (Condition number = 5012.1327.)
This might prevent convergence. If it does, and if you have not done so already, you may need to remove nearly
collinear regressors to achieve convergence. Or you may need to add a nrtolerance(#) or nonrtolerance option to the command line.
See cmp tips.
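The warning itself names two remedies: removing nearly collinear regressors, or adding nrtolerance(#) or nonrtolerance. The sketch below shows both steps on the laborsup dataset used earlier rather than on my own data; the choice of kids as the ordered outcome and male_educ as the instrument is only an illustrative assumption.

```stata
webuse laborsup, clear
cmp setup

* Look for near-collinearity and very different scales among the regressors
correlate other_inc fem_educ male_educ
summarize other_inc fem_educ male_educ

* Refit with the nonrtolerance option suggested by the warning
cmp (kids = other_inc fem_educ) (other_inc = fem_educ male_educ), ///
    ind($cmp_oprobit $cmp_cont) nolrtest nonrtolerance
```

If correlate shows two regressors moving almost one for one, dropping one of them is usually the cleaner fix; nonrtolerance only relaxes the convergence check.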

Fitting full model.


Iteration 0: log likelihood = -18709.494

Then comes a long wait: the run went on for more than 260 iterations before it finally finished.


Iteration 261: log likelihood = -18698.597
Iteration 262: log likelihood = -18698.593
Iteration 263: log likelihood = -18698.593
Iteration 264: log likelihood = -18698.593

Mixed-process regression


At last the model results appear. Finally, compute the marginal effects:

margins, dydx(*)
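For an ordered outcome it is often more informative to get the effect on the probability of each category. The sketch below assumes that the predict(eq(#1) outcome(#k) pr) syntax from the plain oprobit example carries over to the two-equation model, and that force is needed as in the ivprobit comparison; I have not checked either point against the help file. It uses laborsup, with kids as an illustrative ordered outcome.

```stata
webuse laborsup, clear
cmp setup

* Illustrative IV-Oprobit fit: kids is the ordered outcome,
* other_inc the endogenous regressor, male_educ its instrument
cmp (kids = other_inc fem_educ) (other_inc = fem_educ male_educ), ///
    ind($cmp_oprobit $cmp_cont) nolrtest

* Count the outcome categories instead of hard-coding them
quietly tabulate kids
local ncat = r(r)

* Marginal effect of other_inc on Pr(kids = k-th category), one k at a time
forvalues k = 1/`ncat' {
    margins, dydx(other_inc) predict(eq(#1) outcome(#`k') pr) force
}
```

Because the category probabilities sum to one, the marginal effects across all categories should sum to roughly zero, which is a handy sanity check on the output.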


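Because the forum reformats posts automatically, a number of things may display incorrectly, for example the $ sign, which breaks $cmp_oprobit $cmp_cont. I am attaching a Word document for reference.
This is just my own working experience. To be honest I am still somewhat unsure whether all of it is correct, so corrections and criticism are welcome.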






