Original poster: zzzcards

Help: a basic question about taking logs

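Could an expert please help me with this?
Today I ran a test on housing prices using data from 1991 to 2008: t (time); pri (house price); pop (population); IR (annual interest rate); cor (housing construction cost); urb (degree of urbanization)

The first model is: PRI_t = α + β1·INC_t + β2·POP_t + β3·IR_t + β4·COR_t + β5·URB_t + ε_t
The results were poor: judging by the P>|t| column, only cor has a sizable effect, and the remaining independent variables have essentially no effect. Here is the output: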

. regress pri  inc  pop ir  cor  urb


      Source |       SS       df       MS              Number of obs =      18
-------------+------------------------------           F(  5,    12) =    5.55
       Model |   .09597453     5  .019194906           Prob > F      =  0.0071
    Residual |  .041508229    12  .003459019           R-squared     =  0.6981
-------------+------------------------------           Adj R-squared =  0.5723
       Total |  .137482759    17  .008087221           Root MSE      =  .05881


------------------------------------------------------------------------------
         pri |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         inc |     .07411     .44087     0.17   0.869    -.8864633    1.034683
         pop |   4.480694    35.3166     0.13   0.901    -72.46756    81.42894
          ir |  -.0036233   .0082428    -0.44   0.668    -.0215827    .0143362
         cor |   .7221647   .2370482     3.05   0.010     .2056811    1.238648
         urb |  -.1003941   1.372459    -0.07   0.943    -3.090726    2.889938
       _cons |   .0469459    .786432     0.06   0.953    -1.666542    1.760434
------------------------------------------------------------------------------
Then I tried a second model: ln(PRI_t) = α + β1·ln(INC_t) + β2·ln(POP_t) + β3·ln(IR_t) + β4·ln(COR_t) + β5·ln(URB_t) + ε_t
and the results were much better:

regress logpri  loginc  logpop logir  logcor  logurb


      Source |       SS       df       MS              Number of obs =       8
-------------+------------------------------           F(  5,     2) =   19.08
       Model |  3.22763639     5  .645527277           Prob > F      =  0.0506
    Residual |  .067671343     2  .033835671           R-squared     =  0.9795
-------------+------------------------------           Adj R-squared =  0.9281
       Total |  3.29530773     7  .470758247           Root MSE      =  .18394


------------------------------------------------------------------------------
      logpri |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      loginc |   1.165712   .2963565     3.93   0.059    -.1094067    2.440831
      logpop |   10.09308   2.080364     4.85   0.040     1.141994    19.04416
       logir |  -.2454173   .2198075    -1.12   0.380    -1.191173     .700338
      logcor |   .2216548    .104405     2.12   0.168    -.2275636    .6708731
      logurb |   17.14887   3.925733     4.37   0.049     .2578032    34.03993
       _cons |    67.6519   14.33035     4.72   0.042     5.993377    129.3104
------------------------------------------------------------------------------



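I believe I am using an OLS model, so there should be no heteroskedasticity (scatter) problem, yet taking logs really did change the P>|t| (significance) values.
The first model also had a serial correlation problem, and switching to the second (log) model fixed that as well.
I was lazy and just ran the DW test (durbina). The results are as follows:
Model 1: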
. durbina


Durbin's alternative test for autocorrelation
---------------------------------------------------------------------------
    lags(p)  |          chi2               df                 Prob > chi2
-------------+-------------------------------------------------------------
       1     |          1.608               1                   0.2047
---------------------------------------------------------------------------
                        H0: no serial correlation
Serial correlation is present, because Prob > chi2 is greater than 0.1.
Model 2:
Durbin's alternative test for autocorrelation
---------------------------------------------------------------------------
    lags(p)  |          chi2               df                 Prob > chi2
-------------+-------------------------------------------------------------
       1     |          3.374               1                   0.0662
---------------------------------------------------------------------------
                        H0: no serial correlation
For this model the value is below 0.1, so the problem is solved. Why is that?




Attachment: data.png (11.77 KB)

#2
zzzcards, posted 2010-8-29 05:30:24
Could anyone please take a look at this? *sob*


#3
fenggrace, posted 2010-8-29 06:57:03
I am afraid that you misread the results of the Durbin test. Let's go through it from the beginning.
1. You have time series data. Generally, you don't need to test for heteroskedasticity with time series data, since heteroskedasticity is mostly a problem in cross-sectional data. But you do need to consider serial correlation, which you did. We will come back to this later.
2. Taking logs can improve the fit of a regression because it dampens the effect of outliers. Taking logs can also sometimes remove serial correlation that comes from a misspecified model.
3. The test you ran in Stata is not the DW test. As the output header says, it is Durbin's alternative test, a close relative of the Durbin h-test, and tests of this kind are mainly needed when a lagged dependent variable appears in the regression. Since you don't have lagged housing price as an independent variable, I think you should still run a DW test for first-order serial correlation.
4. OK, now let's look at the results of your Durbin test. Before you took natural logs (model 1), the P value (0.2047) is greater than 0.1, which means you CANNOT reject the null hypothesis. The null hypothesis is that there is NO serial correlation. So the conclusion is that there is no evidence of serial correlation in the first model. In the second model, however, the small P value (0.0662 < 0.1) indicates that there IS serial correlation.
5. Again, strictly speaking, you should perform a DW test rather than rely on this test.
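
The reading of the two test results in point 4 can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the default 10% level are assumptions chosen to match the 0.1 threshold used in this thread, and the two p-values are the Prob > chi2 figures from the first post.

```python
def decide(p_value, alpha=0.10):
    """Return the decision for H0: no serial correlation."""
    if p_value < alpha:
        return "reject H0: evidence of serial correlation"
    return "cannot reject H0: no evidence of serial correlation"


# Prob > chi2 values copied from the two durbina outputs in the first post
results = {"model 1": 0.2047, "model 2": 0.0662}

for name, p in results.items():
    print(name, "->", decide(p))
```

Note that at a stricter 5% level (`alpha=0.05`) the second model's p-value of 0.0662 would no longer lead to rejection, so the conclusion depends on the level chosen.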

Hope it helps.
Walk to where the water ends; sit and watch the clouds rise.


#4
zzzcards, posted 2010-8-29 08:22:46
3# fenggrace
Thank you so much, dude. You really helped me a lot.
I did make a mistake in using the Durbin test (durbina) to test for serial correlation; thank you for pointing that out.
I still have some problems with those two models, and I would really appreciate your help.


The results of the DW test are given below:


For the first model:

PRI_t = α + β1·INC_t + β2·POP_t + β3·IR_t + β4·COR_t + β5·URB_t + ε_t


Durbin-Watson d-statistic(6, 18) = 2.161911
Number of obs = 18

RESULT:
4-dU (1.94) < 2.16 < 4-dL (3.29)
This indicates that the DW statistic falls in the "zone of indecision".
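
The zone check above can be written out as a small Python sketch. The bounds dL = 0.71 and dU = 2.06 are simply backed out of the 4-dU = 1.94 and 4-dL = 3.29 figures in this post; the function name `dw_zone` is illustrative, and textbooks differ on whether the boundary values themselves count as inconclusive.

```python
def dw_zone(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic d (between 0 and 4) using table bounds."""
    if d < d_lower:
        return "positive serial correlation"
    if d <= d_upper:
        return "indecision"
    if d < 4 - d_upper:
        return "no first-order serial correlation"
    if d <= 4 - d_lower:
        return "indecision"
    return "negative serial correlation"


# Bounds implied by the post: 4 - dU = 1.94 and 4 - dL = 3.29
d_lower, d_upper = 0.71, 2.06

print(dw_zone(2.161911, d_lower, d_upper))
```

Because the statistic is above 2, the relevant comparison is against 4-dU and 4-dL, which is the check the post performs.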

For the second model:
ln(PRI_t) = α + β1·ln(INC_t) + β2·ln(POP_t) + β3·ln(IR_t) + β4·ln(COR_t) + β5·ln(URB_t) + ε_t


Number of gaps in sample: 3

Durbin-Watson d-statistic(6, 8) = 1.641205


This really confuses me because I can't find dU and dL.
Question 1: How should I deal with the problem above in the second model?
Question 2: How should I interpret the "P>|t|" column in both models?
In the first model, almost all of the P>|t| values are greater than 0.1, except for construction cost (cor).
         pri |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         inc |     .07411     .44087     0.17   0.869    -.8864633    1.034683
         pop |   4.480694    35.3166     0.13   0.901    -72.46756    81.42894
          ir |  -.0036233   .0082428    -0.44   0.668    -.0215827    .0143362
         cor |   .7221647   .2370482     3.05   0.010     .2056811    1.238648
         urb |  -.1003941   1.372459    -0.07   0.943    -3.090726    2.889938
       _cons |   .0469459    .786432     0.06   0.953    -1.666542    1.760434
------------------------------------------------------------------------------

However, in the 2nd model, almost all of the P>|t| values are less than 0.1. Does this mean the log model corrected the P>|t| values? Also, how do I explain that the P>|t| of the interest rate is greater than 0.1?
      logpri |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      loginc |   1.165712   .2963565     3.93   0.059    -.1094067    2.440831
      logpop |   10.09308   2.080364     4.85   0.040     1.141994    19.04416
       logir |  -.2454173   .2198075    -1.12   0.380    -1.191173     .700338
      logcor |   .2216548    .104405     2.12   0.168    -.2275636    .6708731
      logurb |   17.14887   3.925733     4.37   0.049     .2578032    34.03993
       _cons |    67.6519   14.33035     4.72   0.042     5.993377    129.3104
------------------------------------------------------------------------------
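
As background for question 2, the t column in these tables is each coefficient divided by its standard error, and P>|t| is the two-sided p-value of that ratio under a t distribution with the residual degrees of freedom (12 in the first model and 2 in the second, per the df column of the regression output). A minimal Python sketch that recomputes the t column of the second table from the posted figures; the helper name `t_stat` is illustrative:

```python
def t_stat(coef, std_err):
    """t statistic for H0: coefficient = 0."""
    return coef / std_err


# (Coef., Std. Err., reported t) copied from the log-model table above
rows = {
    "loginc": (1.165712, 0.2963565, 3.93),
    "logpop": (10.09308, 2.080364, 4.85),
    "logir": (-0.2454173, 0.2198075, -1.12),
    "logcor": (0.2216548, 0.104405, 2.12),
    "logurb": (17.14887, 3.925733, 4.37),
    "_cons": (67.6519, 14.33035, 4.72),
}

for name, (coef, se, reported) in rows.items():
    print(f"{name:>7}  t = {t_stat(coef, se):6.2f}  (Stata reports {reported})")
```

The recomputed ratios agree with the reported t column to two decimals, so the change in P>|t| between the two models comes from the coefficients, the standard errors, and the much smaller degrees of freedom, not from a different formula.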


#5
fenggrace, posted 2010-8-29 08:37:11
1. You have zero and negative values in your data. Taking the natural log of them creates missing values, because the log of zero or of a negative number is not mathematically defined. That's why you have gaps in your sample. But you can still find dU and dL based on the number of observations actually included in the sample. Get a DW table and look those values up.
2. When P<0.1, the variable is "significant", that is, its coefficient is significantly different from zero (at the 10% level). If P>0.1, the coefficient is not significantly different from zero.
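
Point 1 can be illustrated with a short Python sketch. The values in `series` are hypothetical and are not the poster's data; `safe_log` is an illustrative helper that mimics how a statistics package marks the log of a non-positive number as missing, after which the regression uses only the complete observations.

```python
import math


def safe_log(x):
    """Natural log, or None where it is undefined (x <= 0 or already missing)."""
    if x is None or x <= 0:
        return None
    return math.log(x)


# Hypothetical series containing a zero and a negative value
series = [2.5, 0.0, 1.2, -0.3, 4.0]

logged = [safe_log(x) for x in series]
usable = [v for v in logged if v is not None]

print(len(series), "observations ->", len(usable), "usable after taking logs")
```

The same mechanism would explain why the log model in this thread reports 8 observations instead of 18.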

#6
zzzcards, posted 2010-8-29 08:58:13
6# fenggrace
1. In the 2nd model, n=8 and k=6.
However, there is a restriction that k <= n-4, and here k = 6 while n-4 = 4. Sigh, I don't know what I should do now :(

2. Actually, you may have been misled by my question, sorry about that :)
Let me rephrase my question:
In model 1, almost all of the P values are greater than 0.1. When I used the log model, most of the P values became less than 0.1. Why did this happen?
And the only P value in model 1 that was less than 0.1 became greater than 0.1. Why did this happen?

3. Meanwhile, if I do get values of dU and dL for the 2nd model, the DW statistic is still in the zone of indecision.
Does that mean the model is incorrect or invalid?
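
The sample-size problem in point 1 can be made concrete. This is a minimal sketch, assuming k counts the intercept (6 parameters, as in the `d-statistic(6, 8)` output) and taking the k <= n-4 restriction exactly as quoted in this post; both function names are illustrative.

```python
def residual_df(n_obs, n_params):
    """Degrees of freedom left for the residuals after estimating n_params."""
    return n_obs - n_params


def within_quoted_restriction(n_obs, n_params):
    """Check the k <= n - 4 restriction as quoted in the post above."""
    return n_params <= n_obs - 4


k = 6  # five regressors plus the constant
for label, n in [("model 1", 18), ("model 2", 8)]:
    print(label, "residual df =", residual_df(n, k),
          "| restriction satisfied:", within_quoted_restriction(n, k))
```

The residual degrees of freedom printed here match the df column of the two regression outputs (12 and 2). With only 2 left in the log model, its estimates rest on very little information, which is worth keeping in mind when comparing its P>|t| values with those of the first model.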
