- program lfprobit
- version 10.1
- args lnf xb
- local y "$ML_y1"
- quietly replace `lnf' = ln( normal(`xb')) if `y'==1
- quietly replace `lnf' = ln(1-normal(`xb')) if `y'==0
- end
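The improvement is actually very simple: use the in qualifier instead of the if qualifier. The example below makes this clear.
Spoiler: in tests on a data set with 200,000 observations, computing the likelihood with in ranges rather than the default if conditions cut run time by 40% to 50% relative to the default approach (roughly 42% to 46% in the timings reported below).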
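For reference, the replace statements in the test code appear to evaluate an ordered-probit-style log-likelihood with three outcome categories and cutpoints fixed at 0 and 2. Written out, with $y_i$ standing for depvar, $x_i\beta$ for xb, and $\Phi$ for the standard normal CDF (this notation is an assumption of this note, not the original post's):

```latex
\ln L_i =
\begin{cases}
  \ln \Phi(0 - x_i\beta) & \text{if } y_i = 1 \\
  \ln\left[\Phi(2 - x_i\beta) - \Phi(0 - x_i\beta)\right] & \text{if } y_i = 2 \\
  \ln\left[1 - \Phi(2 - x_i\beta)\right] & \text{if } y_i = 3
\end{cases}
```

Every method timed below fills lnf with these same values; only the way observations are selected (if conditions, indicator multiplication, or in ranges) differs.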
- // testing speed of updating likelihood function
- // =============================================
- foreach iter in 100 200 500 1000 {
- clear all
- // set up testing parameters
- set seed 130507
- set obs 200000
- // setting up dependent variable so it assumes values 1 to 3
- cap drop depvar
- gen depvar = runiform()
- qui su depvar, d
- replace depvar = 1 if depvar<r(p25)
- replace depvar = 2 if depvar!=1&depvar<r(p75)
- replace depvar = 3 if depvar!=1&depvar!=2
- // confirm that depvar is set correctly
- noi su depvar
- noi levelsof depvar
- // setting up xb variable and placeholder for likelihood
- cap drop lnf xb
- gen lnf = .
- gen xb = runiform()
- // initialize timer
- timer clear 1
- timer clear 2
- timer clear 3
- timer clear 4
- // NOTE: the data is NOT sorted by depvar.
- // first method of updating the likelihood, the default, as benchmark
- timer on 1
- forvalues i = 1/`iter' {
- replace lnf=ln(normal(0-xb)) if depvar==1
- replace lnf=ln(normal(2-xb)-normal(0-xb)) if depvar==2
- replace lnf=ln(1-normal(2-xb)) if depvar==3
- }
- timer off 1
- // second method of updating the likelihood
- timer on 2
- forvalues i = 1/`iter' {
- replace lnf=ln(normal(0-xb))*(depvar==1)+ln(normal(2-xb)-normal(0-xb))*(depvar==2)+ln(1-normal(2-xb))*(depvar==3)
- }
- timer off 2
- // now sort the data and see whether sorting speeds up the default method
- sort depvar
- timer on 3
- forvalues i = 1/`iter' {
- replace lnf=ln(normal(0-xb)) if depvar==1
- replace lnf=ln(normal(2-xb)-normal(0-xb)) if depvar==2
- replace lnf=ln(1-normal(2-xb)) if depvar==3
- }
- timer off 3
- // finally, try the updating method using in conditions
- // as counting is part of the procedure, start timer before counting
- timer on 4
- // note that a forvalues loop can be easily constructed if number of
- // categories of the dependent variable is unknown
- sort depvar
- count if depvar==1
- local start1 = 1
- local end1 = r(N)
- count if depvar==2
- local start2 = `end1'+1
- local end2 = `start2'+r(N)-1
- local start3 = `end2'+1
- // finished counting, start updating using in conditions
- // note: without loss of generality, the entire data set is assumed to be in use.
- // otherwise, preserve the data, drop the unused observations, and restore
- // after the program has finished
- forvalues i = 1/`iter' {
- replace lnf=ln(normal(0-xb)) in `start1'/`end1'
- replace lnf=ln(normal(2-xb)-normal(0-xb)) in `start2'/`end2'
- replace lnf=ln(1-normal(2-xb)) in `start3'/l
- }
- timer off 4
- // list results
- noi di "Number of iterations: `iter'"
- noi di "Benchmark: using if conditions before sorting"
- noi timer list 1
- noi di "Method 2"
- noi timer list 2
- noi di "Using default method after sorting"
- noi timer list 3
- noi di "Proposed alternative using in conditions"
- noi timer list 4
- // clean up
- drop depvar lnf xb
- timer clear 1
- timer clear 2
- timer clear 3
- timer clear 4
- } // end foreach
- Number of iterations: 100
- Benchmark: using if conditions before sorting
- 1: 2.36 / 1 = 2.3560
- Method 2
- 2: 4.87 / 1 = 4.8670
- Using default method after sorting
- 3: 4.35 / 1 = 4.3530
- Proposed alternative using in conditions
- 4: 1.37 / 1 = 1.3730
- Number of iterations: 200
- Benchmark: using if conditions before sorting
- 1: 4.88 / 1 = 4.8830
- Method 2
- 2: 9.59 / 1 = 9.5940
- Using default method after sorting
- 3: 8.74 / 1 = 8.7360
- Proposed alternative using in conditions
- 4: 2.65 / 1 = 2.6520
- Number of iterations: 500
- Benchmark: using if conditions before sorting
- 1: 11.72 / 1 = 11.7160
- Method 2
- 2: 23.81 / 1 = 23.8050
- Using default method after sorting
- 3: 21.67 / 1 = 21.6680
- Proposed alternative using in conditions
- 4: 6.63 / 1 = 6.6300
- Number of iterations: 1000
- Benchmark: using if conditions before sorting
- 1: 23.35 / 1 = 23.3540
- Method 2
- 2: 48.55 / 1 = 48.5470
- Using default method after sorting
- 3: 43.29 / 1 = 43.2900
- Proposed alternative using in conditions
- 4: 13.26 / 1 = 13.2600
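A comment in the test code notes that a loop can build the in ranges when the number of categories is not known in advance. The following is a minimal sketch of that idea, assuming the categories are integer-valued. The simulated data and the macro names start_k and end_k are illustrative choices, not part of the original post.

```stata
clear
set seed 130507
set obs 1000
// hypothetical outcome whose number of categories is treated as unknown
gen depvar = ceil(4*runiform())
gen xb = runiform()
gen lnf = .

// sort once so that each category occupies one contiguous block of rows
sort depvar
quietly levelsof depvar, local(levels)

// walk through the categories and record the first and last row of each
local start = 1
foreach k of local levels {
    quietly count if depvar == `k'
    local start_`k' = `start'
    local end_`k'   = `start' + r(N) - 1
    local start     = `end_`k'' + 1
}

// each category can now be updated with an in range, e.g. category 1
replace lnf = ln(normal(0 - xb)) in `start_1'/`end_1'
```

The sort and the counting are done once, outside any iteration loop, so their cost is spread over all subsequent likelihood updates. The ranges stay valid only as long as the sort order of the data does not change.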

