A Beginner vs. the China Industrial Enterprise Database (6): Generating the Final Ten-Year Unbalanced Panel Data

Posted by liuyangclick (original poster) on 2017-8-10 19:40:22

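The unbalanced.i-j-k.dta files have already been generated. The next step is to open unbalanced.1998-1999-2000.dta and add the data for 2001, 2002, 2003, and so on through 2007, producing a single ten-year unbalanced panel data file. This stage is fixed: the program can be run as is, without modification. The program is as follows: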

use unbalanced.1998-1999-2000.dta, clear
tab match_status_1998_1999_2000
gen code=id2000+string(revenue2000)+string(employment2000)+string(profit2000)+province2000
sort code
compress
saveold test1.dta, replace
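
The matching key `code` used throughout is a plain concatenation of the firm ID, revenue, employment, profit, and province. The toy example below is only a sketch with made-up values (the variable names mirror the program; ID and province are assumed to be strings and the other three numeric, as the use of `string()` implies). It shows how the key is built and how one might check whether two different records end up with the same key, which can happen in principle because the pieces are joined without a separator.

```stata
* Toy data: values are invented for illustration only
clear
input str9 id2000 float revenue2000 float employment2000 float profit2000 str2 province2000
"A00000001"  12 35   8 "11"
"A00000001" 123  5   8 "11"
"B00000002" 560 12 -15 "44"
end

* Same key construction as in the program above
gen code = id2000 + string(revenue2000) + string(employment2000) + string(profit2000) + province2000

* Flag keys that are shared by more than one record
duplicates tag code, generate(dup)
list id2000 revenue2000 employment2000 code dup
```

On the real data, `duplicates report code` before each `merge` gives a quick count of such shared keys.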

**step 110  Add 2001 from 1999-2000-2001:

use unbalanced.1999-2000-2001.dta, clear
tab match_status_1999_2000_2001
keep if match_status_1999_2000_2001=="1999-2000-2001"|match_status_1999_2000_2001=="2000-2001 only"
gen code=id2000+string(revenue2000)+string(employment2000)+string(profit2000)+province2000

sort code
compress
saveold test2.dta, replace

use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id1999+string(revenue1999)+string(employment1999)+string(profit1999)+province1999
sort code
compress
saveold test3.dta, replace

use unbalanced.1999-2000-2001.dta, clear
tab match_status_1999_2000_2001
keep if match_status_1999_2000_2001=="1999-2001 only"
gen code=id1999+string(revenue1999)+string(employment1999)+string(profit1999)+province1999
sort code
compress
saveold test4.dta, replace

use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace

use test3.dta, clear
merge code using test4.dta, update replace
tab _m
keep if _merge==5
keep *2001
compress
saveold test6.dta, replace

use unbalanced.1999-2000-2001.dta, clear
keep if match_status_1999_2000_2001=="2001 no match"
display _N
compress
saveold test7.dta, replace

use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2001+string(revenue2001)+string(employment2001)+string(profit2001)+province2001
sort code
compress
saveold test1.dta, replace
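
Step 110 relies on how `merge` with the `update` option codes `_merge`: a value of 4 means a missing value in the master file was filled from the using file, and 5 means both files had nonmissing values that disagree. The sketch below uses two invented records and the newer `merge 1:1` syntax rather than the older `merge code using ...` form of the program, so it illustrates the codes and is not a copy of the author's step. The names `using_part`, `x2000`, and `x2001` are hypothetical.

```stata
* Using-side toy file: two firms with a 2001 value
clear
input str2 code float x2001
"k1" 11
"k2" 99
end
tempfile using_part
save `using_part'

* Master-side toy file: k1 has no 2001 value yet, k2 already has one
clear
input str2 code float x2000 float x2001
"k1" 10 .
"k2" 20 21
end

* update: fill missing master values, keep nonmissing master values on conflict
merge 1:1 code using `using_part', update
list code x2000 x2001 _merge
```

With `update replace` instead of `update`, the conflicting record would take the using-file value. The program uses exactly that contrast: test5.dta keeps the master values, while test6.dta keeps the `_merge==5` records with the using-file values.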

**step 120  Extract 2002 from 2000-2001-2002 and add it:

use unbalanced.2000-2001-2002.dta, clear
tab match_status_2000_2001_2002
keep if match_status_2000_2001_2002=="2000-2001-2002"|match_status_2000_2001_2002=="2001-2002 only"
gen code=id2001+string(revenue2001)+string(employment2001)+string(profit2001)+province2001
sort code
compress
saveold test2.dta, replace

use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2000+string(revenue2000)+string(employment2000)+string(profit2000)+province2000
sort code
compress
saveold test3.dta, replace

use unbalanced.2000-2001-2002.dta, clear
tab match_status_2000_2001_2002
keep if match_status_2000_2001_2002=="2000-2002 only"
gen code=id2000+string(revenue2000)+string(employment2000)+string(profit2000)+province2000
sort code
compress
saveold test4.dta, replace

use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace

use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2002
compress
saveold test6.dta, replace

use unbalanced.2000-2001-2002.dta, clear
keep if match_status_2000_2001_2002=="2002 no match"
display _N
compress
saveold test7.dta, replace

use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2002+string(revenue2002)+string(employment2002)+string(profit2002)+province2002
sort code
compress
saveold test1.dta, replace

**step 130  Extract 2003 from 2001-2002-2003 and add it:

use unbalanced.2001-2002-2003.dta, clear
tab match_status_2001_2002_2003
keep if match_status_2001_2002_2003=="2001-2002-2003"|match_status_2001_2002_2003=="2002-2003 only"
gen code=id2002+string(revenue2002)+string(employment2002)+string(profit2002)+province2002
sort code
compress
saveold test2.dta, replace

use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2001+string(revenue2001)+string(employment2001)+string(profit2001)+province2001
sort code
compress
saveold test3.dta, replace

use unbalanced.2001-2002-2003.dta, clear
tab match_status_2001_2002_2003
keep if match_status_2001_2002_2003=="2001-2003 only"
gen code=id2001+string(revenue2001)+string(employment2001)+string(profit2001)+province2001
sort code
compress
saveold test4.dta, replace

use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace

use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2003
compress
saveold test6.dta, replace

use unbalanced.2001-2002-2003.dta, clear
keep if match_status_2001_2002_2003=="2003 no match"
display _N
compress
saveold test7.dta, replace

use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2003+string(revenue2003)+string(employment2003)+string(profit2003)+province2003
sort code
compress
saveold test1.dta, replace

**step 140  Extract 2004 from 2002-2003-2004 and add it:

use unbalanced.2002-2003-2004.dta, clear
tab match_status_2002_2003_2004
keep if match_status_2002_2003_2004=="2002-2003-2004"|match_status_2002_2003_2004=="2003-2004 only"
gen code=id2003+string(revenue2003)+string(employment2003)+string(profit2003)+province2003
sort code
compress
saveold test2.dta, replace

use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2002+string(revenue2002)+string(employment2002)+string(profit2002)+province2002
sort code
compress
saveold test3.dta, replace

use unbalanced.2002-2003-2004.dta, clear
tab match_status_2002_2003_2004
keep if match_status_2002_2003_2004=="2002-2004 only"
gen code=id2002+string(revenue2002)+string(employment2002)+string(profit2002)+province2002
sort code
compress
saveold test4.dta, replace

use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace

use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2004
compress
saveold test6.dta, replace

use unbalanced.2002-2003-2004.dta, clear
keep if match_status_2002_2003_2004=="2004 no match"
display _N
compress
saveold test7.dta, replace

use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2004+string(revenue2004)+string(employment2004)+string(profit2004)+province2004
sort code
compress
saveold test1.dta, replace

**step 150  Extract 2005 from 2003-2004-2005 and add it:

use unbalanced.2003-2004-2005.dta, clear
tab match_status_2003_2004_2005
keep if match_status_2003_2004_2005=="2003-2004-2005"|match_status_2003_2004_2005=="2004-2005 only"
gen code=id2004+string(revenue2004)+string(employment2004)+string(profit2004)+province2004
sort code
compress
saveold test2.dta, replace

use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2003+string(revenue2003)+string(employment2003)+string(profit2003)+province2003
sort code
compress
saveold test3.dta, replace

use unbalanced.2003-2004-2005.dta, clear
tab match_status_2003_2004_2005
keep if match_status_2003_2004_2005=="2003-2005 only"
gen code=id2003+string(revenue2003)+string(employment2003)+string(profit2003)+province2003
sort code
compress
saveold test4.dta, replace

use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace

use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2005
compress
saveold test6.dta, replace

use unbalanced.2003-2004-2005.dta, clear
keep if match_status_2003_2004_2005=="2005 no match"
display _N
compress
saveold test7.dta, replace

use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2005+string(revenue2005)+string(employment2005)+string(profit2005)+province2005
sort code
compress
saveold test1.dta, replace

**step 160  Extract 2006 from 2004-2005-2006 and add it:

use unbalanced.2004-2005-2006.dta, clear
tab match_status_2004_2005_2006
keep if match_status_2004_2005_2006=="2004-2005-2006"|match_status_2004_2005_2006=="2005-2006 only"
gen code=id2005+string(revenue2005)+string(employment2005)+string(profit2005)+province2005
sort code
compress
saveold test2.dta, replace

use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2004+string(revenue2004)+string(employment2004)+string(profit2004)+province2004
sort code
compress
saveold test3.dta, replace

use unbalanced.2004-2005-2006.dta, clear
tab match_status_2004_2005_2006
keep if match_status_2004_2005_2006=="2004-2006 only"
gen code=id2004+string(revenue2004)+string(employment2004)+string(profit2004)+province2004
sort code
compress
saveold test4.dta, replace

use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
compress
saveold test5.dta, replace

use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2006
compress
saveold test6.dta, replace

use unbalanced.2004-2005-2006.dta, clear
keep if match_status_2004_2005_2006=="2006 no match"
display _N
compress
saveold test7.dta, replace

use test5.dta, clear
append using test6.dta
dis _N
append using test7.dta
dis _N
gen code=id2006+string(revenue2006)+string(employment2006)+string(profit2006)+province2006
sort code
compress
saveold test1.dta, replace

**step 170  Extract 2007 from 2005-2006-2007 and add it:

use unbalanced.2005-2006-2007.dta, clear
tab match_status_2005_2006_2007
keep if match_status_2005_2006_2007=="2005-2006-2007"|match_status_2005_2006_2007=="2006-2007 only"
gen code=id2006+string(revenue2006)+string(employment2006)+string(profit2006)+province2006
sort code
compress
saveold test2.dta, replace

use test1.dta, clear
merge code using test2.dta
tab _merge
drop _merge code
gen code=id2005+string(revenue2005)+string(employment2005)+string(profit2005)+province2005
sort code
compress
saveold test3.dta, replace

use unbalanced.2005-2006-2007.dta, clear
tab match_status_2005_2006_2007
keep if match_status_2005_2006_2007=="2005-2007 only"
gen code=id2005+string(revenue2005)+string(employment2005)+string(profit2005)+province2005
sort code
compress
saveold test4.dta, replace

use test3.dta, clear
merge code using test4.dta, update
tab _merge
drop code _merge
saveold test5.dta, replace

use test3.dta, clear
merge code using test4.dta, update replace
keep if _merge==5
keep *2007
compress
saveold test6.dta, replace

use unbalanced.2005-2006-2007.dta, clear
keep if match_status_2005_2006_2007=="2007 no match"
display _N
compress
saveold test7.dta, replace

use test5.dta, clear
append using test6.dta
append using test7.dta

drop match_status*
compress
*Finally, save the ten-year unbalanced panel data file, named unbalanced.1998--2007.dta:
saveold unbalanced.1998--2007.dta, replace
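
Steps 110 through 170 repeat one template with the years shifted by one, so the whole build could in principle be written as a loop. The following is only a sketch under that assumption, not the original program: the local macros `a`, `b`, `y`, `src`, `ms`, and `key_*` are names introduced here for illustration, the `tab` and `compress` calls are left out for brevity, and the result has not been checked against the step-by-step version. It assumes the same unbalanced.i-j-k.dta files and variable names as above.

```stata
* Starting point: the 1998-1999-2000 file, keyed on the year-2000 variables
use "unbalanced.1998-1999-2000.dta", clear
gen code = id2000 + string(revenue2000) + string(employment2000) + string(profit2000) + province2000
sort code
saveold test1.dta, replace

forvalues y = 2001/2007 {
    local b = `y' - 1                      // middle year of the three-year window
    local a = `y' - 2                      // first year of the three-year window
    local src "unbalanced.`a'-`b'-`y'.dta"
    local ms  "match_status_`a'_`b'_`y'"
    foreach t in `a' `b' `y' {
        local key_`t' "id`t' + string(revenue`t') + string(employment`t') + string(profit`t') + province`t'"
    }

    * Firms linked from year b to year y: join on the year-b key
    use "`src'", clear
    keep if `ms' == "`a'-`b'-`y'" | `ms' == "`b'-`y' only"
    gen code = `key_`b''
    sort code
    saveold test2.dta, replace

    use test1.dta, clear
    merge code using test2.dta
    drop _merge code
    gen code = `key_`a''
    sort code
    saveold test3.dta, replace

    * Firms linked only from year a to year y: join on the year-a key
    use "`src'", clear
    keep if `ms' == "`a'-`y' only"
    gen code = `key_`a''
    sort code
    saveold test4.dta, replace

    use test3.dta, clear
    merge code using test4.dta, update
    drop code _merge
    saveold test5.dta, replace

    * Conflicting records are kept as separate year-y observations
    use test3.dta, clear
    merge code using test4.dta, update replace
    keep if _merge == 5
    keep *`y'
    saveold test6.dta, replace

    * Year-y firms with no link to earlier years
    use "`src'", clear
    keep if `ms' == "`y' no match"
    saveold test7.dta, replace

    use test5.dta, clear
    append using test6.dta
    append using test7.dta
    gen code = `key_`y''
    sort code
    saveold test1.dta, replace
}

drop code match_status*
compress
saveold "unbalanced.1998--2007.dta", replace
```

One difference from the original: the last pass also builds and sorts on the 2007 key before dropping it, so the row order of the final file may differ even though the contents should be the same.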

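**Delete the data files produced by the intermediate steps.
**Caution: this loop erases every .dta file in the current working directory except unbalanced.1998--2007.dta, so run it only in a folder that holds nothing else you need.

local file_list: dir . files "*.dta"
foreach file of local file_list {
    if "`file'" == "unbalanced.1998--2007.dta" {
        continue
    }
    disp "erase `file'"
    erase "`file'"
}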
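
If the working folder also holds files that should survive, a narrower cleanup is possible. The sketch below is an alternative suggested here, not part of the original program: it removes only the scratch files test1.dta through test7.dta that the steps above write, and uses `capture` so that a missing file does not stop the run.

```stata
* Remove only the scratch files created by steps 110-170
forvalues i = 1/7 {
    capture erase "test`i'.dta"
}
```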
*use "unbalanced.1998--2007.dta",clear
keep id_in_source*
gen id_in_panel=_n
reshape long id_in_source, i(id_in_panel) j(year)
drop if id_in_source == .
sort id_in_panel year
*Save the ID lookup table:
saveold "PanelID_1998-2007.dta",replace
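
To see what the lookup table looks like, here is a toy version of the same `reshape` step with two invented firms and three years. The values are made up, and `id_in_source` is assumed to be numeric, as the `== .` test in the program implies.

```stata
* Toy wide file: one row per panel firm, one id_in_source column per year
clear
input float id_in_source1998 float id_in_source1999 float id_in_source2000
101 205   .
  . 206 310
end

gen id_in_panel = _n
reshape long id_in_source, i(id_in_panel) j(year)
drop if id_in_source == .
sort id_in_panel year
list id_in_panel year id_in_source
```

Each remaining row links one record in a yearly source file to its firm number in the panel, so the table can be merged back onto the yearly raw data by `year` and `id_in_source`.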

With that, the beginner has finished matching the 1998-2007 industrial enterprise database and can get started on the graduation thesis!