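With the unbalanced.i-j-k.dta files generated, the next step is to open unbalanced.1998-1999-2000.dta and add the data for 2001, 2002, 2003, and so on through 2007, producing a ten-year unbalanced panel data file. This stage is fixed and needs no changes. The program is as follows: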
* Start from the 1998-2000 file and build a merge key from the 2000 variables
use unbalanced.1998-1999-2000.dta, clear
tab match_status_1998_1999_2000
gen code=id2000+string(revenue2000)+string(employment2000)+string(profit2000)+province2000
sort code
compress
saveold test1.dta, replace
**step 110: add 2001 from 1999-2000-2001
* First, the 2001 records linked to a 2000 record, keyed on the 2000 variables
use unbalanced.1999-2000-2001.dta, clear
tab match_status_1999_2000_2001
keep if match_status_1999_2000_2001=="1999-2000-2001"|match_status_1999_2000_2001=="2000-2001 only"
gen code=id2000+string(revenue2000)+string(employment2000)+string(profit2000)+province2000
sort code
compress
saveold test2.dta, replace
* Attach them to the running panel by the 2000 key
use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
* Re-key on the 1999 variables for the next merge
gen code=id1999+string(revenue1999)+string(employment1999)+string(profit1999)+province1999
sort code
compress
saveold test3.dta, replace
* Next, the 2001 records linked only to a 1999 record
use unbalanced.1999-2000-2001.dta, clear
tab match_status_1999_2000_2001
keep if match_status_1999_2000_2001=="1999-2001 only"
gen code=id1999+string(revenue1999)+string(employment1999)+string(profit1999)+province1999
sort code
compress
saveold test4.dta, replace
* With update, only missing values in the panel are filled; existing values are kept
use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace
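* Repeat with update replace; _merge==5 flags rows where the panel and the using
* file hold different non-missing values. Keep their 2001 variables as separate rows
use test3.dta, clear
merge code using test4.dta, update replace
tab _merge
keep if _merge==5
keep *2001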
compress
saveold test6.dta, replace
* Finally, the 2001 records flagged as unmatched
use unbalanced.1999-2000-2001.dta, clear
keep if match_status_1999_2000_2001=="2001 no match"
display _N
compress
saveold test7.dta, replace
* Stack the three pieces and re-key on the 2001 variables for step 120
use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2001+string(revenue2001)+string(employment2001)+string(profit2001)+province2001
sort code
compress
saveold test1.dta, replace
**step 120: extract 2002 from 2000-2001-2002 and add it:
use unbalanced.2000-2001-2002.dta, clear
tab match_status_2000_2001_2002
keep if match_status_2000_2001_2002=="2000-2001-2002"|match_status_2000_2001_2002=="2001-2002 only"
gen code=id2001+string(revenue2001)+string(employment2001)+string(profit2001)+province2001
sort code
compress
saveold test2.dta, replace
use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2000+string(revenue2000)+string(employment2000)+string(profit2000)+province2000
sort code
compress
saveold test3.dta, replace
use unbalanced.2000-2001-2002.dta, clear
tab match_status_2000_2001_2002
keep if match_status_2000_2001_2002=="2000-2002 only"
gen code=id2000+string(revenue2000)+string(employment2000)+string(profit2000)+province2000
sort code
compress
saveold test4.dta, replace
use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace
use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2002
compress
saveold test6.dta, replace
use unbalanced.2000-2001-2002.dta, clear
keep if match_status_2000_2001_2002=="2002 no match"
display _N
compress
saveold test7.dta, replace
use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2002+string(revenue2002)+string(employment2002)+string(profit2002)+province2002
sort code
compress
saveold test1.dta, replace
**step 130: extract 2003 from 2001-2002-2003 and add it:
use unbalanced.2001-2002-2003.dta, clear
tab match_status_2001_2002_2003
keep if match_status_2001_2002_2003=="2001-2002-2003"|match_status_2001_2002_2003=="2002-2003 only"
gen code=id2002+string(revenue2002)+string(employment2002)+string(profit2002)+province2002
sort code
compress
saveold test2.dta, replace
use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2001+string(revenue2001)+string(employment2001)+string(profit2001)+province2001
sort code
compress
saveold test3.dta, replace
use unbalanced.2001-2002-2003.dta, clear
tab match_status_2001_2002_2003
keep if match_status_2001_2002_2003=="2001-2003 only"
gen code=id2001+string(revenue2001)+string(employment2001)+string(profit2001)+province2001
sort code
compress
saveold test4.dta, replace
use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace
use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2003
compress
saveold test6.dta, replace
use unbalanced.2001-2002-2003.dta, clear
keep if match_status_2001_2002_2003=="2003 no match"
display _N
compress
saveold test7.dta, replace
use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2003+string(revenue2003)+string(employment2003)+string(profit2003)+province2003
sort code
compress
saveold test1.dta, replace
**step 140: extract 2004 from 2002-2003-2004 and add it:
use unbalanced.2002-2003-2004.dta, clear
tab match_status_2002_2003_2004
keep if match_status_2002_2003_2004=="2002-2003-2004"|match_status_2002_2003_2004=="2003-2004 only"
gen code=id2003+string(revenue2003)+string(employment2003)+string(profit2003)+province2003
sort code
compress
saveold test2.dta, replace
use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2002+string(revenue2002)+string(employment2002)+string(profit2002)+province2002
sort code
compress
saveold test3.dta, replace
use unbalanced.2002-2003-2004.dta, clear
tab match_status_2002_2003_2004
keep if match_status_2002_2003_2004=="2002-2004 only"
gen code=id2002+string(revenue2002)+string(employment2002)+string(profit2002)+province2002
sort code
compress
saveold test4.dta, replace
use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace
use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2004
compress
saveold test6.dta, replace
use unbalanced.2002-2003-2004.dta, clear
keep if match_status_2002_2003_2004=="2004 no match"
display _N
compress
saveold test7.dta, replace
use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2004+string(revenue2004)+string(employment2004)+string(profit2004)+province2004
sort code
compress
saveold test1.dta, replace
**step 150: extract 2005 from 2003-2004-2005 and add it:
use unbalanced.2003-2004-2005.dta, clear
tab match_status_2003_2004_2005
keep if match_status_2003_2004_2005=="2003-2004-2005"|match_status_2003_2004_2005=="2004-2005 only"
gen code=id2004+string(revenue2004)+string(employment2004)+string(profit2004)+province2004
sort code
compress
saveold test2.dta, replace
use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2003+string(revenue2003)+string(employment2003)+string(profit2003)+province2003
sort code
compress
saveold test3.dta, replace
use unbalanced.2003-2004-2005.dta, clear
tab match_status_2003_2004_2005
keep if match_status_2003_2004_2005=="2003-2005 only"
gen code=id2003+string(revenue2003)+string(employment2003)+string(profit2003)+province2003
sort code
compress
saveold test4.dta, replace
use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace
use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2005
compress
saveold test6.dta, replace
use unbalanced.2003-2004-2005.dta, clear
keep if match_status_2003_2004_2005=="2005 no match"
display _N
compress
saveold test7.dta, replace
use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2005+string(revenue2005)+string(employment2005)+string(profit2005)+province2005
sort code
compress
saveold test1.dta, replace
**step 160: extract 2006 from 2004-2005-2006 and add it:
use unbalanced.2004-2005-2006.dta, clear
tab match_status_2004_2005_2006
keep if match_status_2004_2005_2006=="2004-2005-2006"|match_status_2004_2005_2006=="2005-2006 only"
gen code=id2005+string(revenue2005)+string(employment2005)+string(profit2005)+province2005
sort code
compress
saveold test2.dta, replace
use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2004+string(revenue2004)+string(employment2004)+string(profit2004)+province2004
sort code
compress
saveold test3.dta, replace
use unbalanced.2004-2005-2006.dta, clear
tab match_status_2004_2005_2006
keep if match_status_2004_2005_2006=="2004-2006 only"
gen code=id2004+string(revenue2004)+string(employment2004)+string(profit2004)+province2004
sort code
compress
saveold test4.dta, replace
use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace
use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2006
compress
saveold test6.dta, replace
use unbalanced.2004-2005-2006.dta, clear
keep if match_status_2004_2005_2006=="2006 no match"
display _N
compress
saveold test7.dta, replace
use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2006+string(revenue2006)+string(employment2006)+string(profit2006)+province2006
sort code
compress
saveold test1.dta, replace
**step 170: extract 2007 from 2005-2006-2007 and add it:
use unbalanced.2005-2006-2007.dta, clear
tab match_status_2005_2006_2007
keep if match_status_2005_2006_2007=="2005-2006-2007"|match_status_2005_2006_2007=="2006-2007 only"
gen code=id2006+string(revenue2006)+string(employment2006)+string(profit2006)+province2006
sort code
compress
saveold test2.dta, replace
use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2005+string(revenue2005)+string(employment2005)+string(profit2005)+province2005
sort code
compress
saveold test3.dta, replace
use unbalanced.2005-2006-2007.dta, clear
tab match_status_2005_2006_2007
keep if match_status_2005_2006_2007=="2005-2007 only"
gen code=id2005+string(revenue2005)+string(employment2005)+string(profit2005)+province2005
sort code
compress
saveold test4.dta, replace
use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace
use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2007
compress
saveold test6.dta, replace
use unbalanced.2005-2006-2007.dta, clear
keep if match_status_2005_2006_2007=="2007 no match"
display _N
compress
saveold test7.dta, replace
use test5.dta, clear
append using test6.dta
append using test7.dta
drop match_status*
compress
*Finally, save the ten-year unbalanced panel as unbalanced.1998--2007.dta:
saveold unbalanced.1998--2007.dta, replace
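**Delete the intermediate data files created along the way
**(note: the loop erases every .dta file in the current directory except the final panel):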
local file_list: dir . files "*.dta"
foreach file of local file_list{
if "`file'" == "unbalanced.1998--2007.dta"{
continue
}
disp "erase `file'"
erase "`file'"
}
*/
*use "unbalanced.1998--2007.dta",clear
keep id_in_source*
gen id_in_panel=_n
reshape long id_in_source, i(id_in_panel) j(year)
drop if id_in_source == .
sort id_in_panel year
*Save the ID lookup table:
saveold "PanelID_1998-2007.dta",replace
With that, our novice has finished matching the 1998-2007 industrial enterprise database and can get on with writing the thesis!
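
Steps 110 through 170 run the same sequence each time, with only the three years shifted forward by one. As a rough sketch (not the original author's program), the repetition could be folded into a single forvalues loop. The local macros i, j, k, f, and ms are introduced here purely for illustration. The sketch assumes the file names, variable names, and match_status labels follow exactly the pattern used in the program above, keeps the old-style merge syntax of the original, and leaves out the tab and display diagnostics.

```stata
* Sketch only: steps 110-170 as one loop over k, the year being added.
* Assumes the same input files and variable names as the program above.
use unbalanced.1998-1999-2000.dta, clear
gen code=id2000+string(revenue2000)+string(employment2000)+string(profit2000)+province2000
sort code
compress
saveold test1.dta, replace

forvalues k = 2001/2007 {
    * j is the middle year and i the first year of the three-year file
    local j = `k' - 1
    local i = `k' - 2
    local f  "unbalanced.`i'-`j'-`k'.dta"
    local ms "match_status_`i'_`j'_`k'"

    * (a) year-k records linked to a year-j record: merge on the year-j key
    use "`f'", clear
    keep if `ms'=="`i'-`j'-`k'" | `ms'=="`j'-`k' only"
    gen code=id`j'+string(revenue`j')+string(employment`j')+string(profit`j')+province`j'
    sort code
    compress
    saveold test2.dta, replace
    use test1.dta, clear
    merge code using test2.dta
    drop _merge code
    gen code=id`i'+string(revenue`i')+string(employment`i')+string(profit`i')+province`i'
    sort code
    compress
    saveold test3.dta, replace

    * (b) year-k records linked only to a year-i record: merge on the year-i key
    use "`f'", clear
    keep if `ms'=="`i'-`k' only"
    gen code=id`i'+string(revenue`i')+string(employment`i')+string(profit`i')+province`i'
    sort code
    compress
    saveold test4.dta, replace
    use test3.dta, clear
    merge code using test4.dta, update
    drop code _merge
    compress
    saveold test5.dta, replace

    * (c) conflicting rows (_merge==5) are kept as separate year-k records
    use test3.dta, clear
    merge code using test4.dta, update replace
    keep if _merge==5
    keep *`k'
    compress
    saveold test6.dta, replace

    * (d) year-k records flagged as unmatched
    use "`f'", clear
    keep if `ms'=="`k' no match"
    compress
    saveold test7.dta, replace

    * stack the pieces; re-key on year k unless this is the last year
    use test5.dta, clear
    append using test6.dta
    append using test7.dta
    if `k' < 2007 {
        gen code=id`k'+string(revenue`k')+string(employment`k')+string(profit`k')+province`k'
        sort code
    }
    compress
    saveold test1.dta, replace
}

use test1.dta, clear
drop match_status*
compress
saveold unbalanced.1998--2007.dta, replace
```

One simple way to check such a loop against the step-by-step program would be to run `dis _N` on the final file from each and confirm that the observation counts agree.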