Original poster: TeddyJakusch

Panel-data maximum likelihood: trouble with mlsum


1 (original post)
TeddyJakusch posted on 2012-4-20 20:27:19

Dear Members of this Blog,

I just found this blog and thought that some of you might be able to help me with a question concerning the ml environment in Stata 10.0, which I'm currently using.

For a recent project, I am trying to perform a panel-data probit maximum likelihood estimation to recover the preference parameters of investors at the individual level. The whole approach is loosely adapted from Harrison's (2008) "Maximum Likelihood Estimation of Utility Functions Using Stata"; I implemented a trading model in it, which I now want to calibrate using ML. The original dataset is similar in structure to the one in Gould, Pitblado and Sribney (2006), "Maximum Likelihood Estimation with Stata", p. 110. It looks more or less like this (I hope it is not too scrambled):

Obs. No.   Investor_ID   Security_ID   Date   Choice   Charact. of Sec.   L.L.
1          1             1             t_0    1                           .
2          1             1             t_1    1                           .
3          1             1             t_2    0                           ln(L1)
4          1             2             t_0    1                           .
5          1             2             t_1    1                           .
6          1             2             t_2    1                           .
7          1             2             t_3    0                           ln(L2)
8          2             1             t_0    1                           .
... and so on ...

As I'm interested in which characteristics of the securities might also have an impact on the hold (=1) or sell (=0) decision of the respective investor, I intend to generate "sub"-log-likelihood functions at the last observation of "Choice" for each security and to aggregate the sum of these log-likelihoods at investor level.

A sketch of my attempts looks like this (I'm using Stata 10.0 and have already run "update query"):

.....
program define ML_My_problematic_model_1 // define maximum likelihood program for the panel dataset
    args todo b lnf  //define variables and coefficient vector b
    tempvar alpha lambda gamma  last lj **some more variables** utility_diff
    mleval `alpha' = `b', eq(1)
    mleval `lambda' = `b', eq(2)
    mleval `gamma' = `b', eq(3) //Variables of interest are alpha, lambda and gamma
    quietly {        **contains more or less specifications of the model** // define likelihood function per security_ID:
        by security_ID: gen double `utility_diff'=`utility_alternative_2' -`utility_alternative_1' //here I tried to generate the sub-likelihood functions for each security_ID (in line with Harrison (2008) as mentioned above)
        by security_ID: gen byte `last'=_n==_N //construct likelihood for utility difference under iid assumption:
        gen double `lj'=.
        by security_ID: replace `lj' =(normal(`utility_diff')) if $ML_y1==0
        by security_ID: replace `lj' =(normal(-`utility_diff')) if $ML_y1==1
        mlsum `lnf' = ln(`lj') if `last'==1 //sum the added likelihood functions at the end of each security_ID
        if (`todo'==0 | `lnf'>=.) exit
    }
end
.....

It is the indented middle part that worries me. My problem is that I am trying to generate a sub-log-likelihood at the last observation of each group (here "security_ID") for the "Choice" variable. Summing these up with mlsum returns either a log likelihood of 0 or an error message stating that the numerical derivatives are flat or cannot be obtained, no matter what I do. Furthermore, mlsum seems to sum over the whole column of my dataset, so the "by security_ID: ..." prefix appears to be ignored. ml check works fine and indicates no serious issues.

I obtained the results (if not interrupted after 300 iterations) using:

statsby [alpha]_cons [lambda]_cons [gamma]_cons, by(person_ID) clear: ml model d0 ML_My_problematic_model_1 (alpha: Choice  **a lot of other variables** = ) (lambda: ) (gamma: ), maximize technique(dfp nr)

To check the results I got with this program, I created a "test" sample to play around with. For it I also wrote a likelihood program for one particular investor, but with the securities listed side by side. Surprisingly, that program works quite well, but it would obviously be messy to apply to all investors (if it helps, I can also post it here). I can't see a large difference from the code shown in Gould, Pitblado and Sribney (2006), p. 111, which confuses me.

I hope the information is sufficient for a first assessment of the problem and for an indication of where my mistakes are. Thank you very much in advance!

Teddy


2
primmxz posted on 2012-4-20 20:32:52
OP, could you lay out your reasoning more clearly?

3
TeddyJakusch posted on 2012-4-20 20:35:04
Hi, sorry. Here is a clearer version of it (the post above seems to be too scrambled):

http://www.stata.com/statalist/archive/2012-04/msg00712.html

Hope it works..
Teddy

4
TeddyJakusch posted on 2012-4-20 20:40:24
OK, I'll give it another try here; I hope it's not too scrambled.

This is my program:

program define ML_My_problematic_model_1 // define maximum likelihood program
    args todo b lnf  //define variables and coefficient vector b
    tempvar alpha lambda gamma  last lj **some more variables** utility_diff
    mleval `alpha' = `b', eq(1)
    mleval `lambda' = `b', eq(2)
    mleval `gamma' = `b', eq(3) //Variables of interest are alpha, lambda and gamma
    quietly {        **contains more or less specifications of the model** // define likelihood function per security_ID:
        by security_ID: gen double `utility_diff'=`utility_alternative_2' -`utility_alternative_1' //here I tried to generate the sub-likelihood functions for each security_ID
        by security_ID: gen byte `last'=_n==_N //construct sub- likelihood
        gen double `lj'=.
        by security_ID: replace `lj' =(normal(`utility_diff')) if $ML_y1==0
        by security_ID: replace `lj' =(normal(-`utility_diff')) if $ML_y1==1
        mlsum `lnf' = ln(`lj') if `last'==1 //sum the added likelihood functions at the end of each security_ID
        if (`todo'==0 | `lnf'>=.) exit
    }
end
.....

My problem is that the middle part (by security_ID: ...) doesn't really generate the sub-likelihood functions.

mlsum here forms the likelihood across all security_IDs rather than adding up the sub-likelihoods for each security. I think there is some flaw in the part of the program where "by security_ID:" shows up.

Thank you for your help
Teddy
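
For reference, the log likelihood that the evaluator above seems to be aiming for can be written out as follows. This is only a sketch read off the posted code, so the notation is an assumption and is not taken from Harrison's paper: j indexes the security, t the row within a security, T_j the number of rows for security j, y_{jt} is Choice, and \Delta u_{jt} is utility_diff.

```latex
\begin{align*}
\ell_{jt} &=
  \begin{cases}
    \Phi(\Delta u_{jt})  & \text{if } y_{jt} = 0 \text{ (sell)},\\
    \Phi(-\Delta u_{jt}) & \text{if } y_{jt} = 1 \text{ (hold)},
  \end{cases}\\[4pt]
\ln L_j &= \sum_{t=1}^{T_j} \ln \ell_{jt},\\[4pt]
\ln L &= \sum_{j} \ln L_j \;=\; \sum_{j} \sum_{t=1}^{T_j} \ln \ell_{jt}.
\end{align*}
```

Written this way, two things stand out. First, "mlsum ... if last==1" applied directly to ln(lj) keeps only the t = T_j term of each inner sum, because nothing has accumulated the earlier rows of the group. Second, every term satisfies \ln \ell_{jt} \le 0, so a total of exactly 0 would require each retained \ell_{jt} to evaluate to 1 in double precision.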

5
sungmoo posted on 2012-4-21 10:22:00
"My problem is that the middle part (by Security_ID:...) doesn't really generate the sub-likelihood functions.."
by without the sort option requires that the data be sorted:

cap pr drop ML_My_problematic_model_1
pr ML_My_problematic_model_1
args todo b lnf
tempvar alpha lambda gamma last lj utility_diff
mleval `alpha' = `b', eq(1)
mleval `lambda' = `b', eq(2)
mleval `gamma' = `b', eq(3)
qui{
sort security_ID
by security_ID: g double `utility_diff'=`utility_alternative_2' -`utility_alternative_1'
by security_ID: g byte `last'=_n==_N
g double `lj'=.
by security_ID: replace `lj' =normal(`utility_diff'*(-1)^$ML_y1)
mlsum `lnf' = ln(`lj') if `last'
if (!`todo'|`lnf'>=.) exit
}
end

(1) It seems that the first macro after "args" stands for the general term in the cumulative sum.
(2) "**  **" cannot be used as "in-line" comment delimiters.
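
One more thing that may be worth checking, independent of the sort order: mlsum restricted to the last row of each group only picks up whatever is stored on that row, so the group total has to be built up first. A common pattern is a running sum within the group. The following is a minimal sketch of that pattern, not the model from this thread: the program name ml_group_sketch, the single equation xb (standing in for utility_diff, whose construction is not shown here) and the covariate x1 in the usage lines are all hypothetical.

```stata
capture program drop ml_group_sketch
program define ml_group_sketch
    args todo b lnf
    tempvar xb lj sumln last
    // hypothetical single index; stands in for utility_diff
    mleval `xb' = `b', eq(1)
    quietly {
        // per-row probability, same sign convention as in the posts above
        gen double `lj' = normal(`xb')  if $ML_y1 == 0
        replace    `lj' = normal(-`xb') if $ML_y1 == 1
        // running sum of ln(lj) within each security;
        // assumes the data were sorted by security_ID before ml model.
        // Note: sum() treats missing values as 0, so a missing ln(lj)
        // does not propagate into the group total.
        by security_ID: gen double `sumln' = sum(ln(`lj'))
        by security_ID: gen byte `last' = (_n == _N)
        // the last row of each group now holds that group's ln L_j
        mlsum `lnf' = `sumln' if `last' == 1
    }
end

// usage (x1 is a hypothetical covariate)
sort security_ID
ml model d0 ml_group_sketch (xb: Choice = x1)
ml maximize
```

With independent rows, adding up the group totals gives the same number as "mlsum `lnf' = ln(`lj')" without any if condition, so the grouped form should only matter once a group's contribution is not simply the sum of its rows. If the model is fitted investor by investor, the grouping would presumably also need to respect the investor identifier, so that each estimation sample contains the flagged last row of its own groups.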

6
TeddyJakusch posted on 2012-4-21 20:29:52
Hi,

thank you for your reply.

I had already sorted the data by security_ID one line before. Alternatively, I used "bysort security_ID" and also tried a foreach loop, using "levelsof security_ID, local(security)" to generate the list to loop over. Unfortunately, still without results.. :-(
I can also post the results if it helps to identify the problem.

Thanks again.
Regards,
Teddy

7
sungmoo posted on 2012-4-21 20:39:22
What do utility_alternative_1 and utility_alternative_2 stand for?

8
TeddyJakusch posted on 2012-4-21 21:14:19
utility_alternative_1 and utility_alternative_2 are functions of the parameters that I am trying to estimate (alpha, gamma and lambda). It's a static optimization model from Harrison and Rutström (2008), "Maximum Likelihood Estimation of Utility Functions Using Stata". Would it help if I posted the output and/or the whole code?

9
sungmoo posted on 2012-4-21 22:43:10
"Would it help if i post the output or/and the whole code?"
Yes, please.

10
TeddyJakusch posted on 2012-4-22 15:33:13
The output of the program above looks like this:

initial:       log likelihood =     -<inf>  (could not be evaluated)
feasible:      log likelihood =          0
rescale:       log likelihood =          0
rescale eq:    log likelihood =          0
(setting optimization to DFP)
numerical derivatives are approximate
flat or discontinuous region encountered
numerical derivatives are approximate
flat or discontinuous region encountered
numerical derivatives are approximate
flat or discontinuous region encountered
Iteration 0:   log likelihood =          0  
numerical derivatives are approximate
flat or discontinuous region encountered
numerical derivatives are approximate
flat or discontinuous region encountered
Iteration 1:   log likelihood =          0  
numerical derivatives are approximate
flat or discontinuous region encountered
numerical derivatives are approximate
flat or discontinuous region encountered

. ml display

                                                  Number of obs   =       1564
                                                  Wald chi2(0)    =          .
Log likelihood =          0                       Prob > chi2     =          .

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
alpha        |
       _cons |        -.5          .        .       .            .           .
-------------+----------------------------------------------------------------
lambda       |
       _cons |        -.5   793.6529    -0.00   0.999    -1556.031    1555.031
-------------+----------------------------------------------------------------
gamma        |
       _cons |        -.5          .        .       .            .           .
------------------------------------------------------------------------------


A similar program that I wrote for the two-security case generates the following output for the same dataset:
initial:       log likelihood =     -<inf>  (could not be evaluated)
feasible:      log likelihood =  -1059.786
rescale:       log likelihood =  -1059.786
rescale eq:    log likelihood = -1027.9832
(setting optimization to DFP)
Iteration 0:   log likelihood = -1027.9832  
Iteration 1:   log likelihood = -1027.7447  (backed up)
Iteration 2:   log likelihood = -1027.5696  
Iteration 3:   log likelihood = -1027.4468  
Iteration 4:   log likelihood = -1026.6417  
(switching optimization to Newton-Raphson)
Iteration 5:   log likelihood = -1026.6203  
Iteration 6:   log likelihood = -1026.6203  

. ml display

                                                  Number of obs   =        782
                                                  Wald chi2(0)    =          .
Log likelihood = -1026.6203                       Prob > chi2     =          .

------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
alpha        |
       _cons |   .4306489   .0745162     5.78   0.000     .2845999     .576698
-------------+----------------------------------------------------------------
lambda       |
       _cons |  -.6196399   .1814276    -3.42   0.001    -.9752314   -.2640483
-------------+----------------------------------------------------------------
gamma        |
       _cons |   .9426656   .1456149     6.47   0.000     .6572657    1.228065
------------------------------------------------------------------------------
As it's the same dataset, just with a different ordering (in the latter case the securities are side by side), I tried to align the two programs so that the former produces this output, but without success.
