- *Table 2 - Are Prices Lower When More Students Are Enrolled Online?*
- foreach x in pub nonpub {
- xi: reg log_tuit share_alldist share_somedist i.year log_enroll `covs' if `x', robust
- outreg2 using price_online, keep(share_alldist share_somedist) alpha(0.01, 0.05, 0.10) bracket nocons excel dec(3)
- xtset cbsa
- xi: xtreg log_tuit share_alldist share_somedist i.year log_enroll `covs' if `x'==1, fe vce(cluster cbsa)
- outreg2 using price_online, keep(share_alldist share_somedist) alpha(0.01, 0.05, 0.10) bracket nocons excel dec(3)
- xtset opeid_year_group
- xi: xtreg log_tuit share_alldist share_somedist i.year log_enroll `covs' if `x'==1, fe vce(cluster opeid_year_group)
- outreg2 using price_online, keep(share_alldist share_somedist) alpha(0.01, 0.05, 0.10) bracket nocons excel dec(3)
- }
I don't know why uploading the do-file failed, so I'm pasting it here:
- /*
- Replication file for Deming, Goldin, Katz and Yuchtman. "Can Online Education Bend the Higher Education Cost Curve?"
- American Economic Review: Papers & Proceedings, 105(5): 496-501.
- We provide a cleaned data set for replication. To replicate from the raw IPEDS files, here are the steps:
- Start by obtaining raw csv files from the IPEDS website
- They can be found here - http://nces.ed.gov/ipeds/datacenter/DataFiles.aspx
- The site includes stata do files for reading csv into .dta format and labeling vars.
- For tables 1 and 2, only 2012 and 2013 data are necessary.
- Download the following files - hd2012, efa2012 and efa2012_dist - same for 2013.
- Add local and state unemployment rate data from the BLS - http://www.bls.gov/lau/#data.
- Finally, add data from the 2009 Barron's college rankings, merging on unitid.
- */
- cd "YOUR DIR HERE"
- use DGKY_online_replication, clear
- drop if opeflag==5 | opeflag==6
- keep if year==2012 | year==2013
- drop if cyactive!=1
- drop if sector==0
- gen type_cat=type
- replace type_cat=6 if type_cat==5
- replace type_cat=5 if type_cat==4
- replace type_cat=4 if type_cat==3
- replace type_cat=3 if type_cat==2 & (sector==4 | sector==7)
- *replace tot variable with enroll_UGdegseek - small number of institutions that didn't respond to distance ed survey - all FPs*
- *assume zero online enrollment*
- replace tot=enroll_UGdegseek if tot==.
- foreach x in alldist somedist nodist {
- gen share_`x'=`x'/tot
- }
- foreach x in alldist_instate alldist_outstate alldist_outUS {
- gen share_`x'=`x'/alldist
- }
- *Table 1 - Enrollment in Online Courses by Undergraduate, Degree-Seeking US Students 2013*
- *Divide totals in first three columns by fourth column to get percentages*
- tabstat alldist somedist nodist tot if year==2013, by(type_cat) s(sum)
- *Divide totals in first three columns by fourth column to get percentages*
- tabstat alldist_instate alldist_outstate alldist_outUS alldist if year==2013, by(type_cat) s(sum)
- *Set up for price regressions*
- gen log_enroll=ln(tot)
- gen log_tuit=ln(statetuit)
- gen log_charge=ln(hrchg2)
- *geographic FE use CBSA, CSA, county - results don't change that much*
- *for chain results, use first 6 digits of opeid (after zero-padding to 8 digits)*
- gen length=length(opeid)
- gen temp1="0"
- gen temp2="00"
- egen temp3=concat(temp1 opeid) if length==7
- egen temp4=concat(temp2 opeid) if length==6
- gen opeid_string=temp3 if length==7
- replace opeid_string=temp4 if length==6
- replace opeid_string=opeid if length==8
- gen opeid_6=substr(opeid_string,1,6)
- egen opeid_year_group=group(opeid_6 year sector)
- set more off
- local covs "i.sector i.type_cat i.iclevel i.control i.hloffer i.deggrant i.locale unemp_rate miss_unemp i.barrons_rank_2009 i.admcon1 i.admcon2 i.admcon3 i.admcon4 i.admcon5 i.admcon6 i.admcon7 i.admcon8 i.admcon9 i.openadmp satpct-actwr75 satpct_miss-actwr75_miss"
- gen nonpub=nfp==1 | fp==1
- *Table 2 - Are Prices Lower When More Students Are Enrolled Online?*
- foreach x in pub nonpub {
- xi: reg log_tuit share_alldist share_somedist i.year log_enroll `covs' if `x', robust
- outreg2 using price_online, keep(share_alldist share_somedist) alpha(0.01, 0.05, 0.10) bracket nocons excel dec(3)
- xtset cbsa
- xi: xtreg log_tuit share_alldist share_somedist i.year log_enroll `covs' if `x'==1, fe vce(cluster cbsa)
- outreg2 using price_online, keep(share_alldist share_somedist) alpha(0.01, 0.05, 0.10) bracket nocons excel dec(3)
- xtset opeid_year_group
- xi: xtreg log_tuit share_alldist share_somedist i.year log_enroll `covs' if `x'==1, fe vce(cluster opeid_year_group)
- outreg2 using price_online, keep(share_alldist share_somedist) alpha(0.01, 0.05, 0.10) bracket nocons excel dec(3)
- }
-
- /*
- For Figure 1, need trend data going back to 2000.
- Pull the following IPEDS files (in addition to files from 2000-2011):
- hd (directory variables)
- efa (enrollment - note file is by race and gender but just use overall)
- ic_ay (tuition and fees - note some schools report tuition by program rather than overall; when overall tuition is missing, replace with tuition of the most common program)
- Merge these files on unitid and year.
- */
- use DGKY_online_replication, clear
- *Create indicator variable for online schools, based on having more than 50% of students enrolled in exclusively distance education in 2012*
- *Backcast after that for years with no distance education data*
- gen share_alldist= alldist/ tot if year==2012
- gen temp=share_alldist>0.5 & share_alldist!=.
- bysort unitid: egen online=max(temp)
- drop temp
- *drop very competitive or higher, using Barron's rankings*
- drop if barrons_rank>=1 & barrons_rank<=3
- *Restrict to active, Title IV eligible institutions*
- keep if cyactive==1
- keep if opeflag==1
- *Fix an obvious mistake in U Phoenix tuition variable - in 2001, data was reported as the total cost of a 4 year degree instead of per-year tuition cost*
- *This typically happens when tuition is reported at the program level*
- *Ideally would not use this variable, but then tuition is missing for many for-profits and not evenly by year*
- *Price goes from around $15k to $63k in a single year, and then back down*
- *Solution is to divide by four*
- gen phoenix=regexm(instnm, "University Of Phoenix")
- replace statetuit=statetuit/4 if phoenix==1 & year==2001
- *Fix a small number of other similar mistakes - dividing by four or two, as appropriate*
- *Issue is that tuition is inconsistently reported at program vs. school level*
- *These fixes are conservative - tuition needs to be almost exactly 4x or 2x its level in the surrounding years*
- *This is an art, not a science!*
- replace statetuit=statetuit/4 if unitid==102845 & year>=2002 & year<=2003
- replace statetuit=statetuit/2 if unitid==438601 & year>=2001 & year<=2002
- replace statetuit=statetuit/4 if unitid==438601 & year>=2003 & year<=2004
- replace statetuit=statetuit/4 if unitid==448628 & year==2006
- replace statetuit=statetuit/2 if unitid==121275 & year>=2003 & year<=2004
- replace statetuit=statetuit/2 if unitid==234216 & year>=2002 & year<=2003
- replace statetuit=statetuit/4 if unitid==234216 & year>=2004 & year<=2009
- replace statetuit=statetuit/4 if unitid==142054 & year>=2004 & year<=2005
- replace statetuit=statetuit/2 if unitid==197832 & year>=2002 & year<=2004
- replace statetuit=statetuit/2 if unitid==440925 & year>=2002 & year<=2003
- replace statetuit=statetuit/2 if unitid==441061 & year>=2003 & year<=2005
- replace statetuit=statetuit/2 if unitid==262305 & year>=2003 & year<=2005
- replace statetuit=statetuit/2 if unitid==436474 & year>=2003 & year<=2004
- replace statetuit=statetuit/2 if unitid==260813 & year>=2000 & year<=2004
- replace statetuit=statetuit/2 if unitid==432834 & year>=2003 & year<=2009
- replace statetuit=statetuit/2 if unitid==260789 & year>=2003 & year<=2009
- replace statetuit=statetuit/4 if unitid==437848 & year==2002
- replace statetuit=statetuit/2 if unitid==135939 & year>=2004 & year<=2005
- replace statetuit=statetuit/2 if unitid==103893 & year>=2000 & year<=2001
- *adjust for inflation, expressing in 2014 dollars, using http://data.bls.gov/cgi-bin/cpicalc.pl*
- replace statetuit=statetuit*1.37 if year==2000
- replace statetuit=statetuit*1.34 if year==2001
- replace statetuit=statetuit*1.32 if year==2002
- replace statetuit=statetuit*1.29 if year==2003
- replace statetuit=statetuit*1.25 if year==2004
- replace statetuit=statetuit*1.21 if year==2005
- replace statetuit=statetuit*1.17 if year==2006
- replace statetuit=statetuit*1.14 if year==2007
- replace statetuit=statetuit*1.10 if year==2008
- replace statetuit=statetuit*1.10 if year==2009
- replace statetuit=statetuit*1.09 if year==2010
- replace statetuit=statetuit*1.05 if year==2011
- replace statetuit=statetuit*1.03 if year==2012
- replace statetuit=statetuit*1.02 if year==2013
- gen cat=1 if online==1 & pub!=1 & (barrons_rank==5 | barrons_rank==.)
- replace cat=2 if online!=1 & (barrons_rank==5 | barrons_rank==.) & (sector==2 | sector==3)
- replace cat=3 if sector==1
- collapse (mean) statetuit [w=enrollFTUG], by(cat year)
- drop if cat==.
- forvalues y=1(1)3 {
- gen temp=statetuit if cat==`y'
- bysort year: egen statetuit_`y'=max(temp)
- drop temp
- }
- drop cat statetuit
- duplicates drop
- tsset year
- *Figure 1 - Trends in Tuition by Institution Type*
- tsline statetuit_1 statetuit_2 statetuit_3
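A side note on the `forvalues` loop before `tsset year`: together with `duplicates drop`, it turns the collapsed long data (one row per cat and year) into one row per year with a separate tuition column for each category. The same layout should be obtainable with `reshape wide`. Below is a minimal sketch on made-up numbers; the tuition values are invented for illustration and are not from the replication data. It assumes rows with missing `cat` have already been dropped, as the do-file does.

```stata
* Toy long-format data standing in for the output of
* collapse (mean) statetuit ..., by(cat year)
* (values are hypothetical, for illustration only)
clear
input year cat statetuit
2012 1 10000
2012 2 12000
2012 3 5000
2013 1 10500
2013 2 12500
2013 3 5200
end

* Rename so that reshape produces statetuit_1, statetuit_2, statetuit_3,
* matching the variable names used in the do-file
rename statetuit statetuit_
reshape wide statetuit_, i(year) j(cat)

* One row per year now, so the data can be declared a time series and plotted
tsset year
tsline statetuit_1 statetuit_2 statetuit_3
```

This is only an alternative way to get the same wide data; the original loop works as posted.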