Original poster: pq366

[Regression analysis help] How can 2SLS with instruments for multiple endogenous variables be implemented in Stata?

Original post
pq366 posted 2010-6-1 22:37:00

How can instrumental-variables estimation with multiple endogenous variables be carried out in Stata using 2SLS? Thanks.

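2
jiangbogz posted 2010-12-20 16:22:08
Same question here.

3
efei200x posted 2010-12-21 20:53:15
Could you give more detail? Do you mean several endogenous variables and a single instrument?

4
hanaoxue posted 2012-5-5 10:39:05
Hello, have you figured it out? Could you show me how? Thanks.

5
yarsuse posted 2013-8-19 16:47:16
Same question here.

6
捣蛋布叮 posted 2013-11-20 15:54:11
Same question here.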
7
蓝色 posted 2013-11-20 19:12:19
ivregress 2sls y x1 x2 (x3=iv1 iv2) (x4=iv1 iv2)

8
捣蛋布叮 posted 2013-11-26 09:44:40
ivregress 2sls y x1 x2 (x3=iv1 iv2) (x4=iv1 iv2)
Good.

9
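xiaoyuertutu posted 2018-5-2 15:56:36
蓝色 posted 2013-11-20 19:12
ivregress 2sls y x1 x2 (x3=iv1 iv2) (x4=iv1 iv2)
Hello. In this command, do the two endogenous variables x3 and x4 share the same instruments, namely iv1 and iv2?

I have another question:
x1 and x2 are exogenous; x3, x4, and x5 are endogenous, with instruments ivx3, ivx4, and ivx5 respectively. The 2SLS command would then be:
ivregress 2sls y x1 x2 (x3 x4 x5=ivx3 ivx4 ivx5),r first
The problem: each of the endogenous variables x3, x4, and x5 has exactly one instrument of its own, but Stata treats ivx3, ivx4, and ivx5 as instruments for x3, for x4, and for x5 alike.
Following the idea in your command, would it work to change it to: ivregress 2sls y x1 x2 (x3=ivx3) (x4=ivx4) (x5=ivx5)?
I look forward to your reply.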

10
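蓝色 posted 2018-5-2 20:34:02
xiaoyuertutu posted 2018-5-2 15:56
Hello. In this command, do the two endogenous variables x3 and x4 share the same instruments, namely iv1 and iv2?

I have another question:
The command should be
ivregress 2sls y x1 x2 (x3 x4 x5=ivx3 ivx4 ivx5),r first
The software should not accept any other format.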
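As a minimal sketch of the single-group syntax, the example below fits a model with two endogenous regressors on Stata's bundled auto dataset. The choice of variables and instruments is purely an illustrative assumption (nothing here argues that these instruments are valid). As far as I know, the ivregress syntax has exactly one parenthesized group, and every instrument listed in it enters the first-stage regression of every endogenous variable, together with the included exogenous regressors.

```stata
* Sketch only: dataset and instrument choice are illustrative assumptions.
sysuse auto, clear

* Two endogenous regressors (mpg, headroom) and three excluded instruments.
* All endogenous variables go to the left of "=" inside ONE pair of
* parentheses; all excluded instruments go to the right.
ivregress 2sls price weight (mpg headroom = displacement turn gear_ratio), vce(robust) first

* First-stage diagnostics for the endogenous regressors
estat firststage
```

The `first` option prints one first-stage regression per endogenous variable, which makes it easy to confirm that each one uses the full instrument list.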
https://www.stata.com/support/faqs/statistics/instrumental-variables-regression/

Must I use all of my exogenous variables as instruments when estimating instrumental variables regression?
Title: Two-stage least-squares regression
Author: Vince Wiggins, StataCorp

Note: This model could also be fit with sem, using maximum likelihood instead of a two-step method.
You can find examples for recursive models fit with sem in the “Structural models: Dependencies between response variables” section of [SEM] intro 5 — Tour of models.

Someone posed the following question:

I am estimating an equation:

        Y = a + bX + cZ + dW

I then want to instrument W with Q. I know the first-stage regression is supposed to be

        W = e + fX + gZ + hQ

(i.e., use all the exogenous variables in the first stage). Actually this is automatically done if I use the ivregress command. However, I only want to use Q to instrument W without using X and Z in the first stage. Is there a way I can do it in Stata? I can regress W on Q and get the predicted W, and then use it in the second-stage regression. The standard errors will, however, be incorrect.

ivregress will not let you do this and, moreover, if you believe W to be endogenous because it is part of a system, then you must include X and Z as instruments, or you will get biased estimates for b, c, and d.
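To make the point about the first stage concrete, here is a small sketch on the auto dataset. The variable roles are my own assumption for illustration (price as Y, mpg as W, displacement as Q, turn and weight as X and Z). When the hand-run first stage includes X and Z alongside Q, the second-stage coefficients should match those from ivregress; only the standard errors of the hand-run second stage are off.

```stata
* Sketch only: variable roles in the auto data are assumed for illustration.
sysuse auto, clear

* What ivregress does: Q plus the included exogenous X and Z are all instruments
ivregress 2sls price turn weight (mpg = displacement)
scalar b_iv = _b[mpg]

* The same point estimate by hand: the first stage includes X and Z
regress mpg displacement turn weight
predict double mpg_hat, xb

* Coefficients match ivregress; the reported standard errors do not
regress price mpg_hat turn weight

display "ivregress: " b_iv "   by hand: " _b[mpg_hat]
```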

Consider the system

        Y1 = a0 + a1*Y2 + a2*X1 + a3*X2 + e1               (1)
        Y2 = b0 + b1*Y1 + b2*X3 + b3*X4 + e2               (2)

Warning: Assume we are estimating structural equation (1); if X1 and X2 are exogenous, then they must be kept as instruments or your estimates will be biased. In a general system, such exogenous variables must be used as instruments for any endogenous variables when the instrumented value for the endogenous variables appears in an equation in which the exogenous variable also appears.

Consider the reduced forms of your two equations:

        Y1 = e0 + e1*X1 + e2*X2 + e3*X3 + e4*X4 + u1        (1r)
        Y2 = f0 + f1*X1 + f2*X2 + f3*X3 + f4*X4 + u2        (2r)

where e# and f# are combinations of the a# and b# coefficients from (1) and (2) and u1 and u2 are linear combinations of e1 and e2.
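For reference, the coefficients of (2r) can be worked out by substituting (1) into (2) and solving for Y2. This derivation is not part of the FAQ text; it assumes a1*b1 is not equal to 1 so that the system can be solved.

```latex
\begin{aligned}
Y_2 &= b_0 + b_1\,(a_0 + a_1 Y_2 + a_2 X_1 + a_3 X_2 + e_1) + b_2 X_3 + b_3 X_4 + e_2 \\
(1 - a_1 b_1)\,Y_2 &= (b_0 + a_0 b_1) + a_2 b_1 X_1 + a_3 b_1 X_2 + b_2 X_3 + b_3 X_4 + (b_1 e_1 + e_2) \\
f_1 &= \frac{a_2 b_1}{1 - a_1 b_1}, \quad
f_2 = \frac{a_3 b_1}{1 - a_1 b_1}, \quad
f_3 = \frac{b_2}{1 - a_1 b_1}, \quad
f_4 = \frac{b_3}{1 - a_1 b_1}, \quad
u_2 = \frac{b_1 e_1 + e_2}{1 - a_1 b_1}
\end{aligned}
```

So f2 is nonzero whenever both a3 and b1 are nonzero, and f1 = f2 = 0 when b1 = 0, that is, when Y2 does not depend on Y1.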

All exogenous variables appear in each equation for an endogenous variable. This is the nature of simultaneous systems, so efficiency argues that all exogenous variables be included as instruments for each endogenous variable.

Here is the real problem. Take (1): the reduced-form equation for Y2, (2r), clearly shows that Y2 is correlated with X2 (by the coefficient f2). If we do not include X2 among the instruments for Y2, then we will have failed to account for the correlation of Y2 with X2 in its instrumented values. Since we did not account for this correlation, when we estimate (1) with the instrumented values for Y2, the coefficient a3 will be forced to account for this correlation. This approach will lead to biased estimates of both a1 and a3.

For a brief reference, see Baltagi (2011). See the whole discussion of 2SLS, particularly the paragraph after equation 11.40, on page 265. (I have no idea why this issue is not emphasized in more books.)

Failing to include X4 affects only efficiency and not bias.

However, there is one case where it is not necessary to include X1 and X2 as instruments for Y2. That is when the system is triangular such that Y2 does not depend on Y1, but you believe it is weakly endogenous because the disturbances are correlated between the equations. You are still consistent here to do what ivregress does and retain X1 and X2 as instruments. They are, however, no longer required. Then you could do what you suggested and just regress on the predicted instruments from the first stage.

If you do use this method of indirect least squares, you will have to perform the adjustment to the covariance matrix yourself. Consider the structural equation

        y1 = y2 + x1 + e

where you have an instrument z1 and you do not think that y2 is a function of y1.

The following example uses only z1 as an instrument for y2. Let’s begin by creating a dataset (containing made-up data) on y1, y2, x1, and z1:

. sysuse auto
(1978 Automobile Data)

. rename price y1

. rename mpg y2

. rename displacement z1

. rename turn x1

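Now we perform the first-stage regression and get predictions for the instrumented variable, which we must do for each endogenous right-hand-side variable.

. regress y2 z1

      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  1,    72) =   71.41
       Model |  1216.67534     1  1216.67534           Prob > F      =  0.0000
    Residual |  1226.78412    72  17.0386683           R-squared     =  0.4979
-------------+------------------------------           Adj R-squared =  0.4910
       Total |  2443.45946    73  33.4720474           Root MSE      =  4.1278

------------------------------------------------------------------------------
          y2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          z1 |  -.0444536   .0052606    -8.45   0.000    -.0549405   -.0339668
       _cons |   30.06788   1.143462    26.30   0.000     27.78843    32.34733
------------------------------------------------------------------------------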

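. predict double y2hat
(option xb assumed; fitted values)

. * perform IV regression
. regress y1 y2hat x1

      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  2,    71) =   12.41
       Model |   164538571     2  82269285.5           Prob > F      =  0.0000
    Residual |   470526825    71  6627138.38           R-squared     =  0.2591
-------------+------------------------------           Adj R-squared =  0.2382
       Total |   635065396    73  8699525.97           Root MSE      =  2574.3

------------------------------------------------------------------------------
          y1 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       y2hat |  -463.4688    117.187    -3.95   0.000    -697.1329   -229.8046
          x1 |  -126.4979   108.7468    -1.16   0.249    -343.3328    90.33697
       _cons |   21051.36   6451.837     3.26   0.002     8186.762    33915.96
------------------------------------------------------------------------------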


Now we correct the variance–covariance by applying the correct mean squared error:

. rename y2hat y2hold

. rename y2 y2hat

. predict double res, residual

. rename y2hat y2                       /* put back real y2 */

. rename y2hold y2hat

. replace res = res^2
(74 real changes made)

. summarize res

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         res |        74     7553657    1.43e+07   117.4375   1.06e+08

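. scalar realmse = r(mean)*r(N)/e(df_r)                                   /* much ado about small sample */

. matrix bmatrix = e(b)

. matrix Vmatrix = e(V)

. matrix Vmatrix = e(V) * realmse / e(rmse)^2

. ereturn post bmatrix Vmatrix, noclear

. ereturn display

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       y2hat |  -463.4688   127.7267    -3.63   0.001    -718.1485    -208.789
          x1 |  -126.4979   118.5274    -1.07   0.289    -362.8348    109.8389
       _cons |   21051.36   7032.111     2.99   0.004      7029.73    35072.99
------------------------------------------------------------------------------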
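The adjustment can be summarized in a formula and checked against the printed output. The notation below (the beta-hats, s-squared, N and k) is my own shorthand, not the FAQ's: realmse is the residual variance computed with the actual y2 rather than its fitted values, and e(rmse)^2 is the residual mean square reported by the second-stage regress. The arithmetic uses the rounded figures shown in the tables, so it is only approximate.

```latex
\begin{aligned}
s^2_{\text{real}} &= \frac{1}{N-k}\sum_{i=1}^{N}\bigl(y_{1i} - \hat\beta_0 - \hat\beta_1\,y_{2i} - \hat\beta_2\,x_{1i}\bigr)^2
  \approx \frac{7553657 \times 74}{71} \approx 7.873\times 10^{6} \\
\hat V_{\text{corrected}} &= \hat V_{\text{second stage}} \times \frac{s^2_{\text{real}}}{s^2_{\text{second stage}}},
  \qquad \frac{s^2_{\text{real}}}{s^2_{\text{second stage}}} \approx \frac{7.873\times 10^{6}}{6627138.38} \approx 1.188 \\
\text{se}_{\text{corrected}}(\hat\beta_1) &\approx 117.187 \times \sqrt{1.188} \approx 127.73
\end{aligned}
```

This agrees with the corrected standard error of 127.7267 reported for y2hat, and the same scale factor applies to x1 and _cons.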


Reference
Baltagi, B. H. 2011. Econometrics. New York: Springer.
