楼主: ReneeBK
2318 10

【Lecture Notes】Political Science 207 [推广有奖]

  • 1关注
  • 62粉丝

VIP

已卖:4900份资源

学术权威

14%

还不是VIP/贵宾

-

TA的文库  其他...

R资源总汇

Panel Data Analysis

Experimental Design

威望
1
论坛币
49655 个
通用积分
55.9937
学术水平
370 点
热心指数
273 点
信用等级
335 点
经验
57805 点
帖子
4005
精华
21
在线时间
582 小时
注册时间
2005-5-8
最后登录
2023-11-26

楼主
ReneeBK 发表于 2017-1-22 02:58:06 |AI写论文

+2 论坛币
k人 参与回答

经管之家送您一份

应届毕业生专属福利!

求职就业群
赵安豆老师微信:zhaoandou666

经管之家联合CDA

送您一个全额奖学金名额~ !

感谢您参与论坛问题回答

经管之家送您两个论坛币!

+2 论坛币
Political Science 207Winter Quarter 2014


Final ExamHomeworksData and R CodeReadings
  • Week 9: Difference-in-Difference, Matching, and Regression Discontinuity
    • Card, David, and Krueger, Alan B. 1994. "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania". American Economic Review 84: 772–793.
    • Ho, Daniel E., Kosuke Imai, Gary King, and Elizabeth A. Stuart. 2007. "Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference." Political Analysis 15:199-236.
    • Gerber, Elisabeth R., and Daniel J. Hopkins. 2011. "When Mayors Matter: Estimating the Impact of Mayoral Partisanship on City Policy." American Journal of Political Science 55:326-339.
  • Week 8: Multilevel Models
  • Week 7: Multiple Imputation for Missing Data
    • King, Gary, James Honaker, Anne Joseph, and Kenneth Scheve, 2001. "Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation." American Political Science Review 95:49-70
    • Honaker, James, and Gary King. 2010. "What to do About Missing Values in Time Series Cross-Section Data." American Journal of Political Science 54:561-581.
  • Week 6: Selection Models
    • Berinksy, Adam. 1999. "The Two Faces of Public Opinion." American Journal of Political Science 43:1209-1230.
    • Bushway, Shaun, Brian D. Johnson, and Lee Ann Slocum. 2007. "Is the Magic Still There? The Use of the Heckman Two-Step Correction for Selection Bias in Criminology." Journal of Quantitative Criminology 23:151-178.
    • Dubin, Jeffrey A., and Douglas Rivers, 1990. "Selection Bias in Linear Regression, Logit, and Probit Models." Sociologial Methods and Research 18:360-390.
  • Week 5: Binary TSCS Models
    • Green, Donald P., Soo Yeon Kim, and David H. Yoon. 2001. "Dirty Pool" International Organization 55:441–468.
    • Beck, Nathaniel, and Jonathan N. Katz. 2001. "Throwing Out the Baby with the Bath Water: A Comment on Green, Kim, and Yoon." International Organization 55:487–495.
    • Beck, Nathaniel, Jonathan Katz, and Richard Tucker. 1998. "Taking Time Seriously: Time-Series-Cross-Section Analysis with a Binary Dependent Variable." American Journal of Political Science 42:1260-1288.
    • Carter, David B., and Curtis S. Signorino. 2010. "Back to the Future: Modeling Time Dependence in Binary Data." Political Analysis 18:271-292.
  • Week 4: Survival Models
    • Box-Steffensmeier, Janet M. and Bradford S. Jones. 1997. "Time is of the Essence: Event History Models in Political Science."American Journal of Political Science 45:972–988.
    • Beck, Nathaniel, Jonathan Katz, and Richard Tucker. 1998. "Taking Time Seriously: Time-Series-Cross-Section Analysis with a Binary Dependent Variable." American Journal of Political Science 42:1260-1288.
  • Week 3: Nonspherical Errors in TSCS Data
    • Beck, Nathaniel, and Jonathan N. Katz, 1995. "What to Do (and Not to Do) with Time-Series Cross-Section Data." American Political Science Review 89:634-647.
    • Beck, Nathaniel, and Jonathan N. Katz, 1996. "Nuisance vs. Substance: Specifying and Estimating Time-Series-Cross-Section Models." Political Analysis 6:1-36.
    • Alvarez, R. Michael, Geoffrey Garrett, and Peter Lange, 1991. "Government Partisanship, Labor Organization, and Macroeconomic Performance." American Political Science Review 85:539-556.
    • Beck, Nathaniel, Jonathan N. Katz, R. Michael Alvarez, Geoffrey Garrett, and Peter Lange, 1993. "Government Partisanship, Labor Organization, and Macroeconomic Performance: A Corrigendum." American Political Science Review 87:945-948.
  • Week 2: Instrumental Variables
    • Benoit, Kenneth, and Michael Marsh. 2008. "The Campaign Value of Incumbency: A New Solution to the Puzzle of Less Effective Incumbent Spending." American Journal of Political Science , 52:874-890.
    • Gerber, Alan S., and Donald P. Green. 2000. "The Effects of Canvassing, Telephone Calls, and Direct Mail on Voter Turnout: A Field Experiment." American Political Science Review 94:653-663.
    • Jacobson, Gary. 1978. "The Effects of Campaign Spending in Congressional Elections." American Political Science Review72:469-491.
    • Alvarez, R. Michael, and Garrett Glasgow. 1999. "Two-Stage Estimation of Non-Recursive Choice Models." Political Analysis8:147-165.
  • Week 1: Panel Data
    • Knack, Stephen, 1995. "Does 'Motor Voter' Work? Evidence from State-Level Data." Journal of Politics 57:796-811.
    • Stimson, James A., 1985. "Regression in Space and Time: A Statistical Essay." American Journal of Political Science 29:914-947.
    • Worrall, John L. 2010. "A User-friendly Introduction to Panel Data Modeling." Journal of Criminal Justice Education 21:182-196.

All course readings are available for download through the library website, either through JSTOR or our subscription to the electronic version of the journal. All campus URLs have access to these resources; if you want to download articles from home, see this.Course Materials


二维码

扫码加我 拉你入群

请注明:姓名-公司-职位

以便审核进群资格,未注明则拒绝

关键词:matching travel

沙发
ReneeBK 发表于 2017-1-22 02:58:43
  1. ## GIS in R ##

  2. install.packages("maps")
  3. library(maps)
  4. install.packages("mapdata")
  5. library(mapdata)

  6. map("usa", col="blue")

  7. map("state", col="blue")

  8. map("county", col="blue")

  9. install.packages("mapproj")
  10. library(mapproj)

  11. map("usa", col="blue", proj="sp_albers", par=c(30,40))  # conic projection
  12. map("usa", col="blue", proj="sp_mercator")  # cylindrical projection

  13. map("worldHires", "Canada", xlim=c(-141, -53), ylim=c(40, 85), col="gray", fill=TRUE)
  14. map("worldHires", "Canada", xlim=c(-140, -110), ylim=c(48, 60), col="gray", fill=TRUE)


  15. install.packages("rgdal")
  16. library(rgdal)

  17. sf.precincts <- readOGR(".", "sfprecincts")

  18. redtags <- read.csv("redtags.csv")


  19. install.packages("sp")  # should already be installed, but just in case
  20. library(sp)


  21. redtag.points <- SpatialPoints(cbind(redtags$lon, redtags$lat))
  22. proj4string(redtag.points) <- proj4string(sf.precincts)


  23. plot(sf.precincts)
  24. plot(redtag.points, add=TRUE, col="red", pch=20)

  25. sf.precincts@data$id <- seq(1,nrow(sf.precincts@data),1)

  26. redtag.overlay <- over(redtag.points, sf.precincts)
  27. redtag.overlay2 <- as.matrix(cbind(rownames(table(redtag.overlay$precinct)), table(redtag.overlay$precinct)))
  28. colnames(redtag.overlay2) <- c("precinct", "redtags")

  29. precinct.redtags <- merge(sf.precincts@data, redtag.overlay2, all.x=TRUE, sort=FALSE)
  30. precinct.redtags$redtags <- as.numeric(as.character(precinct.redtags$redtags))
  31. table(precinct.redtags$redtags)

  32. sf.precincts@data <- precinct.redtags[order(precinct.redtags$id),]
  33. sf.precincts@data$redtags[is.na(sf.precincts@data$redtags)] <- 0


  34. install.packages("RColorBrewer")
  35. library(RColorBrewer)


  36. # plotting red tagged buildings

  37. colors <- brewer.pal(5, "Reds")
  38. brks <- c(0,1,2,5,10,30)
  39. brknames <- c("0","1","2-4","5-9","10+")

  40. plot(sf.precincts, col=colors[findInterval(sf.precincts$redtags, brks, all.inside=TRUE)])
  41. plot(redtag.points, add=TRUE, col="blue", pch=1)


  42. pdf("redtag_plot.pdf")
  43. plot(sf.precincts, col=colors[findInterval(sf.precincts$redtags, brks, all.inside=TRUE)])
  44. legend("topright", legend=brknames, fill=colors)
  45. title("Red Tagged Buildings By Voting Precinct")
  46. dev.off()


  47. # plotting voter turnout

  48. sf.precincts@data$turnout87[sf.precincts@data$turnout87==1] <- NA # set parks to missing
  49. colors2 <- brewer.pal(5, "Greens")
  50. brks2 <- c(0,quantile(sf.precincts@data$turnout87, na.rm=TRUE))
  51. brknames2 <- c(paste(brks2[1], brks2[2], sep="-"), paste(brks2[2], brks2[3], sep="-"), paste(brks2[3], brks2[4], sep="-"), paste(brks2[4], brks2[5], sep="-"), paste(brks2[5], brks2[6], sep="-"))

  52. colors2.map <- colors2[findInterval(sf.precincts$turnout87, brks2, all.inside=TRUE)]
  53. colors2.map[is.na(colors2.map)] <- "#666666"

  54. pdf("turnout_plot.pdf")
  55. plot(sf.precincts, col=colors2.map)
  56. legend("topright", legend=brknames2, fill=colors2)
  57. title("Voter Turnout By Voting Precinct")
  58. dev.off()


  59. # calculation of distances

  60. install.packages("geosphere")
  61. library(geosphere)


  62. # a simple spatial autocorrelation factor as in the bridges paper

  63. precinct.dists <- as.matrix(dist(coordinates(sf.precincts)))
  64. precinct.dists.inv <- 1/precinct.dists
  65. diag(precinct.dists.inv) <- 0

  66. install.packages("ape")
  67. library(ape)

  68. Moran.I(sf.precincts@data$redtags, precinct.dists.inv, na.rm=TRUE)

  69. sf.precincts@data$sp.auto <- (precinct.dists.inv/1000) %*% sf.precincts@data$turnout89

  70. reg.model1 <- lm(turnout89 ~ redtags + sp.auto, data = sf.precincts@data)
  71. summary(reg.model1)
复制代码

藤椅
ReneeBK 发表于 2017-1-22 02:59:17
  1. ##########################
  2. ## 12/8/2013


  3. library(igraph)
  4. library(ANN)
  5. library(rgeos)
  6. library(rgdal)
  7. library(FNN)
  8. library(geosphere)

  9. sf.streets <- readOGR("C:\\R_group\\travel_routing", "tl_2012_06075_roads")

  10. sf.streets@data$kph <- 1
  11. sf.streets@data$kph[sf.streets@data$MTFCC=="S1100"] <- 100  # 65 mph
  12. sf.streets@data$kph[sf.streets@data$MTFCC=="S1200"] <- 65   # 45 mph
  13. sf.streets@data$kph[sf.streets@data$MTFCC=="S1400"] <- 40   # 25 mph
  14. sf.streets@data$kph[sf.streets@data$MTFCC=="S1630"] <- 30
  15. sf.streets@data$kph[sf.streets@data$MTFCC=="C3062"] <- 25
  16. sf.streets@data$kph[sf.streets@data$MTFCC=="S1730"] <- 15

  17. edge.count <- gIntersection(sf.streets,sf.streets)  ## printing shows 35542 edges


  18. edges <- matrix(NA,100000,6)

  19. row.start <- 0

  20. # This loop takes about 80 minutes to run on my home machine
  21. for (i in 1:length(sf.streets)) {
  22.   temp.int <- gIntersection(sf.streets, sf.streets[i,])
  23.   temp.edges <- do.call(rbind, lapply(temp.int@lines[[1]]@Lines, function(ls) {
  24.     as.vector(t(ls@coords))  }))
  25.   
  26.   temp.lengths <- (distHaversine(temp.edges[,1:2], temp.edges[,3:4])/(sf.streets@data$kph[i]*1000)) * 60  # travel time in minutes
  27.   
  28.   edges[(row.start+1):(row.start+nrow(temp.edges)), 1:4] <- temp.edges
  29.   edges[(row.start+1):(row.start+nrow(temp.edges)), 5] <- temp.lengths
  30.   edges[(row.start+1):(row.start+nrow(temp.edges)), 6] <- i
  31.   
  32.   row.start <- row.start + nrow(temp.edges)
  33.    
  34. }

  35. edges <- unique(na.omit(edges))

  36. # start from here

  37. library(igraph)
  38. library(ANN)
  39. library(rgeos)
  40. library(rgdal)
  41. library(FNN)
  42. library(geosphere)



  43. froms <- paste(edges[,1], edges[,2])
  44. tos <- paste(edges[,3], edges[,4])

  45. graph <- graph.edgelist(cbind(froms, tos), directed = FALSE)
  46. E(graph)$weight <- edges[,5]

  47. xy <- do.call(rbind, strsplit(V(graph)$name, " "))

  48. V(graph)$x <- as.numeric(xy[, 1])
  49. V(graph)$y <- as.numeric(xy[, 2])
  50. xyg <- cbind(V(graph)$x, V(graph)$y)

  51. ## pick route points here
  52. ## Coit Tower -122.405833, 37.8025 -- 37.8025, -122.405833
  53. ## Golden Gate Bridge (mid-span) -122.478611, 37.819722 -- 37.819722, -122.478611
  54. ## Candlestick Park -122.386111, 37.713611 -- 37.713611,-122.386111
  55. ## SF State -122.479722, 37.723333 -- 37.723333,-122.479722

  56. from <- cbind(-122.478611, 37.819722)
  57. to <- cbind(-122.405833, 37.8025)


  58. ifrom <- get.knnx(xyg, from, 1)$nn.index[1, 1]
  59. ito <- get.knnx(xyg, to, 1)$nn.index[1, 1]
  60. v.path <- get.shortest.paths(graph, ifrom, ito, output = "vpath")[[1]]
  61. e.path <- get.shortest.paths(graph, ifrom, ito, output = "epath")[[1]]



  62. route <- xyg[v.path,]
  63. totaldist <- 0
  64. cardinal.directions <- NULL
  65. for (i in 2:nrow(route)) {
  66.   tempdist <- distHaversine(route[i-1,], route[i,])
  67.   totaldist <- totaldist + tempdist
  68.   direction <- bearing(route[i-1, ], route[i, ])
  69.   cardinal.directions <- rbind(cardinal.directions, direction)
  70. }
  71. distance.km <- totaldist/1000 # distance in kilometers
  72. distance.miles <- totaldist/1609.34  # distance in miles
  73. travel.time <- sum(E(graph)$weight[v.path])  # travel time

  74. street.names <- as.character(sf.streets@data[edges[e.path,6],2])
  75. street.names[is.na(street.names)] <- "Unknown Road"

  76. repeated <- matrix(FALSE,length(street.names),1)
  77. for (j in 2:length(street.names)) {
  78.   repeated[j] <- (street.names[j] == street.names[j-1])
  79.   
  80. }

  81. compass.points <- cardinal.directions
  82. compass.points[cardinal.directions>337.5 | cardinal.directions<22.5] <- "N"
  83. compass.points[cardinal.directions>22.5 & cardinal.directions<67.5] <- "NE"
  84. compass.points[cardinal.directions>67.5 & cardinal.directions<112.5] <- "E"
  85. compass.points[cardinal.directions>112.5 & cardinal.directions<157.5] <- "SE"
  86. compass.points[cardinal.directions>157.5 & cardinal.directions<202.5] <- "S"
  87. compass.points[cardinal.directions>202.5 & cardinal.directions<247.5] <- "SW"
  88. compass.points[cardinal.directions>247.5 & cardinal.directions<292.5] <- "W"
  89. compass.points[cardinal.directions>292.5 & cardinal.directions<337.5] <- "NW"

  90. all.turns <- cbind(street.names, compass.points, repeated)
  91. turn.by.turn <- subset(all.turns, repeated==FALSE)[,1:2]
  92. rownames(turn.by.turn) <- NULL
  93. colnames(turn.by.turn) <- c("Street Name", "Direction")

  94. route.results <- cbind(travel.time, distance.miles)
  95. colnames(route.results) <- c("Travel Time (Minutes)", "Distance (Miles)")
  96. print(route.results)
  97. print(turn.by.turn)

  98. plot(sf.streets, col="gray")
  99. lines(V(graph)[v.path]$x, V(graph)[v.path]$y, lwd = 2, col = "red")
复制代码

板凳
ReneeBK 发表于 2017-1-22 03:00:26
  1. ## PS 207 class 9
  2. ## DiD, matching, RD

  3. # earned income tax credit data
  4. # EITC = earned income tax credit, a tax credit for lower income families with children.  
  5. # Did more women work after it was expanded in 1994?
  6. # urate = state unemployment rate
  7. # finc = family income
  8. # earn = earned income
  9. # unearn = unearned income

  10. library(foreign)
  11. eitc <- read.dta("eitc.dta")

  12. # Create two dummy variables to indicate before/after and treatment/control groups.

  13. # the EITC went into effect in the year 1994
  14. eitc$post93 <- as.numeric(eitc$year >= 1994)

  15. # The EITC only affects women with at least one child, so the treatment group is all women with children.
  16. eitc$anykids <- as.numeric(eitc$children >= 1)

  17. # First calculate treatment effect through simple algebra

  18. # Compute the four data points needed in the DiD calculation:
  19. a <- mean(eitc$work[eitc$post93==0 & eitc$anykids==0], na.rm=T)
  20. b <- mean(eitc$work[eitc$post93==0 & eitc$anykids==1], na.rm=T)
  21. c <- mean(eitc$work[eitc$post93==1 & eitc$anykids==0], na.rm=T)
  22. d <- mean(eitc$work[eitc$post93==1 & eitc$anykids==1], na.rm=T)

  23. # Compute the effect of the EITC on the employment of women with children:
  24. (d-c)-(b-a)

  25. ## Now a DiD regression model

  26. did.model1 <- lm(work ~ post93*anykids, data = eitc)
  27. summary(did.model1)

  28. did.model2 <- lm(work ~ post93*anykids + nonwhite + age + I(age^2) + ed + finc + I(finc-earn), data = eitc)
  29. summary(did.model2)

  30. ## placebo model
  31. ## demonstrate we don't get a treatment effect when picking a different year for the "treatment"

  32. ## subset to pre-treatment years
  33. eitc.sub <- eitc[eitc$year <= 1993,]

  34. # Create a new "after treatment" dummy variable
  35. eitc.sub$post91 <- as.numeric(eitc.sub$year >= 1992)

  36. # Run a placebo regression where placebo treatment = post91*anykids
  37. did.model3 <- lm(work ~ post91*anykids, data = eitc.sub)
  38. summary(did.model3)

  39. #########################
  40. ## Matching techniques ##
  41. #########################

  42. bridges <- read.dta("bridges.dta",  convert.factors=FALSE)

  43. install.packages("MatchIt")
  44. library(MatchIt)

  45. # exact matching (fails) #

  46. bridge.match <- matchit(totbexp ~ spcorr_all  + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totpop2564m, data=MatchData2, method="exact")



  47. # propensity score matching #

  48. bridge.match <- matchit(totbexp ~ spcorr_all  + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totpop2564m, data=MatchData2, method="nearest")

  49. summary(bridge.match)

  50. plot(bridge.match)

  51. plot(bridge.match, type="jitter") # interactive!

  52. plot(bridge.match, type="hist")

  53. m.data <- match.data(bridge.match)

  54. nrow(m.data)


  55. # testing for balance with t-tests
  56. t.test(avgunemp ~ totbexp, data=bridges)
  57. t.test(avgunemp ~ totbexp, data=m.data)


  58. # mahalanobis distance matching #

  59. bridge.match2 <- matchit(totbexp ~ spcorr_all  + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totpop2564m, data=MatchData2, method="nearest", distance="mahalanobis")

  60. summary(bridge.match2)

  61. plot(bridge.match2)

  62. # jitter and histogram plots are only for propensity scores

  63. m2.data <- match.data(bridge.match2)

  64. nrow(m2.data)


  65. # propensity score matching, 2 matches per treated observation #

  66. bridge.match3 <- matchit(totbexp ~ spcorr_all  + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totpop2564m, data=MatchData2, method="nearest", ratio=2)

  67. summary(bridge.match3)

  68. plot(bridge.match3)

  69. plot(bridge.match3, type="jitter")

  70. plot(bridge.match3, type="hist")

  71. m3.data <- match.data(bridge.match3)

  72. nrow(m3.data)


  73. # propensity score matching,  discarding observations outside support #

  74. bridge.match4 <- matchit(totbexp ~ spcorr_all  + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totpop2564m, data=MatchData2, method="nearest", discard="both")

  75. summary(bridge.match4)

  76. plot(bridge.match4)

  77. plot(bridge.match4, type="jitter")

  78. plot(bridge.match4, type="hist")

  79. m4.data <- match.data(bridge.match4)

  80. nrow(m4.data)


  81. #install.packages("Zelig")
  82. #library(Zelig)
  83. # Zelig crashing, so using my own code

  84. # Negative binomial model with unmatched data #

  85. nbmod1 <- glm.nb(jumps2564_m  ~ spcorr_all  + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totbexp + offset(log(totpop2564m)),  data=bridges)
  86. summary(nbmod1)

  87. # Negative binomial model with propensity score matched data #

  88. nbmod2 <- glm.nb(jumps2564_m  ~ spcorr_all  + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totbexp + offset(log(totpop2564m)),  data=m.data)
  89. summary(nbmod2)

  90. # Negative binomial model with mahalanobis distance matched data #

  91. nbmod3 <- glm.nb(jumps2564_m  ~ spcorr_all  + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totbexp + offset(log(totpop2564m)),  data=m2.data)
  92. summary(nbmod3)

  93. # Negative binomial model with propensity score matching, 2 controls per treated #

  94. nbmod4 <- glm.nb(jumps2564_m  ~ spcorr_all  + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totbexp + offset(log(totpop2564m)),  data=m3.data)
  95. summary(nbmod4)

  96. # Negative binomial model with propensity score matching, discarding observations w/o common support #

  97. nbmod5 <- glm.nb(jumps2564_m  ~ spcorr_all  + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totbexp + offset(log(totpop2564m)),  data=m4.data)
  98. summary(nbmod5)

  99. x.t0 <- cbind(t(colMeans(cbind(1, bridges$spcorr_all, bridges$whitepct2564m, bridges$amindpct2564m, bridges$avgunemp, bridges$avgurban90))), 0)
  100. x.t1 <- cbind(t(colMeans(cbind(1, bridges$spcorr_all, bridges$whitepct2564m, bridges$amindpct2564m, bridges$avgunemp, bridges$avgurban90))), 1)
  101. offset.var <- mean(log(bridges$totpop2564m))

  102. mod1.treat0 <- exp(nbmod1$coef%*%t(x.t0) + offset.var)
  103. mod1.treat1 <- exp(nbmod1$coef%*%t(x.t1) + offset.var)
  104. att1 <- mod1.treat1 - mod1.treat0
  105. mod1.treat1
  106. mod1.treat0
  107. att1

  108. mod5.treat0 <- exp(nbmod5$coef%*%t(x.t0) + offset.var)
  109. mod5.treat1 <- exp(nbmod5$coef%*%t(x.t1) + offset.var)
  110. att5 <- mod5.treat1 - mod5.treat0
  111. mod5.treat1
  112. mod5.treat0
  113. att5




  114. ##############################
  115. ## Regression Discontinuity ##
  116. ##############################

  117. # incumbency advantage
  118. votedata <- read.dta("LMBdata.dta")

  119. votedata$treatment <- (votedata$lagdemvoteshare >= 0.5)

  120. rd.model1 <- lm(demvoteshare ~ lagdemvoteshare + treatment, data=votedata)
  121. summary(rd.model1)

  122. plot(demvoteshare ~ lagdemvoteshare, data=votedata)
  123. curve(rd.model1$coefficients[1] + rd.model1$coefficients[2]*x  + rd.model1$coefficients[3]*(x>0.5), from=0, to = 0.5, col="red", lw=2, add=TRUE)
  124. curve(rd.model1$coefficients[1] + rd.model1$coefficients[2]*x  + rd.model1$coefficients[3]*(x>0.5), from=0.501, to=1, col="red", lw=2, add=TRUE)


  125. rd.model2 <- lm(demvoteshare ~ lagdemvoteshare*treatment, data=votedata)
  126. summary(rd.model2)

  127. plot(demvoteshare ~ lagdemvoteshare, data=votedata)
  128. curve(rd.model2$coefficients[1] + rd.model2$coefficients[2]*x  + rd.model2$coefficients[3]*(x>0.5) + rd.model2$coefficients[4]*(x>0.5)*x, from=0, to = 0.5, col="red", lw=2, add=TRUE)
  129. curve(rd.model2$coefficients[1] + rd.model2$coefficients[2]*x  + rd.model2$coefficients[3]*(x>0.5) + rd.model2$coefficients[4]*(x>0.5)*x, from=0.501, to = 1, col="red", lw=2, add=TRUE)


  130. rd.model3 <- lm(demvoteshare ~ lagdemvoteshare*treatment + I(lagdemvoteshare^2)*treatment, data=votedata)
  131. summary(rd.model3)

  132. plot(demvoteshare ~ lagdemvoteshare, data=votedata)
  133. curve(rd.model3$coefficients[1] + rd.model3$coefficients[2]*x  + rd.model3$coefficients[3]*(x>0.5) +  rd.model3$coefficients[4]*x*x + rd.model3$coefficients[5]*(x>0.5)*x + rd.model3$coefficients[6]*(x>0.5)*x*x, from=0, to = 0.5, col="red", lw=2, add=TRUE)
  134. curve(rd.model3$coefficients[1] + rd.model3$coefficients[2]*x  + rd.model3$coefficients[3]*(x>0.5) +  rd.model3$coefficients[4]*x*x + rd.model3$coefficients[5]*(x>0.5)*x + rd.model3$coefficients[6]*(x>0.5)*x*x, from=0.501, to = 1, col="red", lw=2, add=TRUE)


  135. install.packages("rdd")
  136. library(rdd)


  137. rd.model4 <- RDestimate(demvoteshare ~ lagdemvoteshare, cutpoint = 0.5, data=votedata)
  138. summary(rd.model4)
  139. plot(rd.model4)

  140. # placebo test around median of lagged vote share

  141. rd.model5 <- RDestimate(demvoteshare ~ lagdemvoteshare, cutpoint = median(votedata$lagdemvoteshare, na.rm=T), data=votedata)
  142. summary(rd.model5)

  143. # trying different bandwidths

  144. rd.model4$bw

  145. rd.model.05 <- RDestimate(demvoteshare ~ lagdemvoteshare, cutpoint = 0.5, data=votedata, bw=0.05)
  146. summary(rd.model.05)
  147. plot(rd.model.05)

  148. rd.model.025 <- RDestimate(demvoteshare ~ lagdemvoteshare, cutpoint = 0.5, data=votedata, bw=0.025)
  149. summary(rd.model.025)
  150. plot(rd.model.025)

  151. rd.model.01 <- RDestimate(demvoteshare ~ lagdemvoteshare, cutpoint = 0.5, data=votedata, bw=0.01)
  152. summary(rd.model.01)
  153. plot(rd.model.01)
复制代码

报纸
ReneeBK 发表于 2017-1-22 03:00:50
  1. ## multilevel model code for R ##

  2. install.packages("lme4")
  3. library(lme4)



  4. #### First example -- exam scores
  5. # normexam = test scores
  6. # school = school id
  7. # standLRT = individual score on a different test
  8. # schavg = average intake score

  9. install.packages("mlmRev")
  10. library(mlmRev)

  11. data(Exam)
  12. head(Exam)

  13. lmer(normexam ~ 1 + (1 | school), data=Exam) # random intercept only

  14. lmer(normexam ~ standLRT + (1 | school), data=Exam) # ranomd intercept plus fixed effect

  15. lmer(normexam ~ standLRT + (standLRT | school), data=Exam) # random intercept, random slope

  16. lmer(normexam ~ standLRT + schavg + (1 + standLRT | school), data=Exam) # random intercept, individual and group predictor

  17. lmer(normexam ~ standLRT * schavg + (1 + standLRT | school), data=Exam) # random intercept, cross-level interaction







  18. data(InstEval)
  19. # s = student number
  20. # d = professor number
  21. # studage = student age (# semesters enrolled)
  22. # lectage = lecture age (# semesters lecture was in the past)
  23. # service = dummy variable for course taught outside department
  24. # dept = department number
  25. # y = student rating, higher is better

  26. InstEval$studage <- as.numeric(as.character(InstEval$studage)) # convert from factor to number
  27. head(InstEval)

  28. ## linear regression

  29. linmod1 <- lm(y ~ studage + service, data=InstEval)
  30. summary(linmod1)

  31. linmod2 <- lm(y ~ studage + service + as.factor(dept), data=InstEval)
  32. summary(linmod2)


  33. ## linear mixed model, random intercept

  34. mixedmod1 <- lmer(y ~ studage + service + (1 | dept), data=InstEval)
  35. summary(mixedmod1)

  36. ## linear regression, interaction model

  37. linmod3 <- lm(y ~ studage + service + (studage * dept), data=InstEval)
  38. summary(linmod3)

  39. ## linear mixed model, random intercept and random coefficient on student age by department
  40. ## note studage enters twice -- otherwise mean effect assumed to be zero

  41. mixedmod2 <- lmer(y ~ service + studage + (1 + studage | dept), data=InstEval)
  42. summary(mixedmod2)

  43. # examine random and fixed effects
  44. fixef(mixedmod2)
  45. ranef(mixedmod2)

  46. # plot results

  47. dotplot(ranef(mixedmod2))

  48. randeffs <- unlist(ranef(mixedmod2))
  49. reglines <- cbind((randeffs[1:14] + fixef(mixedmod2)[1]), (randeffs[15:28] + fixef(mixedmod2)[2]))

  50. plot(y ~ studage, ylim = c(2,3.5), data=InstEval)
  51. for (i in 1:nrow(reglines)) {
  52.   curve(reglines[i,1] + reglines[i,2] * x, col=i, add=TRUE)
  53. }


  54. # compare multilevel to separate regressions -- shrinkage estimator


  55. mixedmod3 <- lmer(y ~ studage + (1 + studage | dept), data=InstEval)
  56. summary(mixedmod3)
  57. randeffs3 <- unlist(ranef(mixedmod3))
  58. reglines3 <- cbind((randeffs3[1:14] + fixef(mixedmod3)[1]), (randeffs3[15:28] + fixef(mixedmod3)[2]))

  59. alpha.hat <- NULL
  60. beta.hat <- NULL
  61. for(i in levels(InstEval$dept)){
  62.   unit.lm <- lm(y ~ studage, data = subset(InstEval, dept == i) )
  63.   alpha.hat <- append(alpha.hat, coef(unit.lm)[1])
  64.   beta.hat <- append(beta.hat, coef(unit.lm)[2])
  65. }

  66. pooled.model <- lm(y ~ studage, data=InstEval)
  67. pooled.beta <- coef(pooled.model)[2]

  68. plot(reglines3[,2] ~ beta.hat)
  69. curve(1*x, add=TRUE)
  70. abline(v=pooled.beta)



  71. ## Using Zelig

  72. install.packages("ZeligMultilevel")
  73. library(ZeligMultilevel)


  74. # example 1

  75. z.reg1 <- zelig(y ~ studage + service + tag(1 | dept), data=InstEval, model="ls.mixed")
  76. summary(z.reg1)

  77. x.high <- setx(z.reg1, studage=8)
  78. x.low <- setx(z.reg1, studage=2)

  79. s.reg1 <- sim(z.reg1, x=x.high, x1=x.low)
  80. summary(s.reg1)

  81. # example 2 -- note assumed intercept in random component

  82. z.reg2 <- zelig(y ~ studage + service + tag(studage | dept), data=InstEval, model="ls.mixed")
  83. summary(z.reg2)

  84. x.high <- setx(z.reg2, studage=8)
  85. x.low <- setx(z.reg2, studage=2)

  86. s.reg2 <- sim(z.reg2, x=x.high, x1=x.low)
  87. summary(s.reg2)





  88. ## multilevel logits

  89. library(foreign)
  90. votedata <- read.dta("polls.dta")

  91. pooled.logit1 <- glm(bush ~ black + female + edu + age, family=binomial(link="logit"), data=votedata)
  92. summary(pooled.logit1)


  93. ## Multilevel logit -- random intercept
  94. multi.logit1 <- glmer(bush ~ black + female + edu + age + (1 | state), family=binomial(link="logit"), data=votedata)
  95. summary(multi.logit1)


  96. ## Multilevel logit -- random slopes

  97. multi.logit2 <- glmer(bush ~ black + female + age + (1 + edu | state), family=binomial(link="logit"), data=votedata)
  98. summary(multi.logit2)


  99. ## Using Zelig

  100. # logit example 1

  101. z.logit1 <- zelig(bush ~ black + female + edu + age + tag(1 | state), data=votedata, model="logit.mixed")
  102. summary(z.logit1)

  103. x.high <- setx(z.logit1, age=4)
  104. x.low <- setx(z.logit1, age=1)

  105. s.logit1 <- sim(z.logit1, x=x.high, x1=x.low)
  106. summary(s.logit1)

  107. # logit example 2 -- note assumed intercept in random component

  108. z.logit2 <- zelig(bush ~ black + female + age + tag(edu| state), data=votedata, model="logit.mixed")
  109. summary(z.logit2)

  110. ## something goes wrong from here with setx command

  111. #x.high <- setx(z.logit2, age=4)
  112. #x.low <- setx(z.logit2, age=1)

  113. #s.logit2 <- sim(z.logit2, x=x.high, x1=x.low)
  114. #summary(s.logit2)
复制代码

地板
ReneeBK 发表于 2017-1-22 03:01:17
  1. ## multiple imputation code for R ##

  2. ## Example 1 ##

  3. library(foreign)
  4. CCSData <- read.dta("CCS2010v2.dta", convert.factors=FALSE)

  5. install.packages("Rcpp") # we need to do this because otherwise Amelia might load a defunct version that doesn't work

  6. install.packages("Amelia")
  7. library(Amelia)

  8. ## imputation ##

  9. impute.out <- amelia(CCSData, m=5, noms=c("develop1","develop2"))

  10. # built-in diagnostics #

  11. summary(impute.out)

  12. missmap(impute.out)

  13. plot(impute.out)

  14. overimpute(impute.out, var="age")

  15. ## save imputed datasets ##

  16. write.amelia(obj=impute.out, file.stem = "impdata", format = "dta")

  17. impd1 <- subset(impute.out$imputations[[1]])
  18. impd2 <- subset(impute.out$imputations[[2]])
  19. impd3 <- subset(impute.out$imputations[[3]])
  20. impd4 <- subset(impute.out$imputations[[4]])
  21. impd5 <- subset(impute.out$imputations[[5]])


  22. install.packages("Zelig")
  23. library(Zelig)
  24. install.packages("ZeligChoice")
  25. library(ZeligChoice)

  26. #regmod1.out <- zelig(as.factor(develop1) ~ incgroup + rent + southcoast + NEP, model="mlogit", data=CCSData)
  27. #summary(regmod1.out)

  28. #impmod1.out <- zelig(as.factor(develop1) ~ incgroup + rent + southcoast + NEP, model="mlogit", data=mi(impd1, impd2, impd3, impd4, impd5))
  29. #summary(impmod1.out)  # not working -- we can do it by hand

  30. install.packages("nnet")
  31. library(nnet)

  32. regmod1.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=CCSData))
  33. coeffs <- cbind(t(regmod1.out$coefficients[1,]), t(regmod1.out$coefficients[2,]))
  34. ses <- sqrt(diag(solve(regmod1.out$Hessian)))
  35. zs <- coeffs/ses
  36. ps <- 2*(1 - pnorm(abs(zs)))

  37. reg.final <- t(rbind(coeffs, ses, zs, ps))
  38. colnames(reg.final) <- c("Coeff", "SE", "Z", "P")

  39. impmod1.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=impd1))
  40. impmod2.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=impd2))
  41. impmod3.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=impd3))
  42. impmod4.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=impd4))
  43. impmod5.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=impd5))

  44. coeffs1 <- cbind(t(impmod1.out$coefficients[1,]), t(impmod1.out$coefficients[2,]))
  45. se1 <- sqrt(diag(solve(impmod1.out$Hessian)))
  46. coeffs2 <- cbind(t(impmod2.out$coefficients[1,]), t(impmod2.out$coefficients[2,]))
  47. se2 <- sqrt(diag(solve(impmod2.out$Hessian)))
  48. coeffs3 <- cbind(t(impmod3.out$coefficients[1,]), t(impmod3.out$coefficients[2,]))
  49. se3 <- sqrt(diag(solve(impmod3.out$Hessian)))
  50. coeffs4 <- cbind(t(impmod4.out$coefficients[1,]), t(impmod4.out$coefficients[2,]))
  51. se4 <- sqrt(diag(solve(impmod4.out$Hessian)))
  52. coeffs5 <- cbind(t(impmod5.out$coefficients[1,]), t(impmod5.out$coefficients[2,]))
  53. se5 <- sqrt(diag(solve(impmod5.out$Hessian)))

  54. mi.coeffs <- rbind(coeffs1, coeffs2, coeffs3, coeffs4, coeffs5)
  55. mi.se <- rbind(se1, se2, se3, se4, se5)

  56. mi.results <- mi.meld(mi.coeffs, mi.se)
  57. mi.results

  58. mi.z <- mi.results$q.mi/mi.results$se.mi
  59. mi.p <- 2*(1 - pnorm(abs(mi.z)))

  60. mi.final <- cbind(t(mi.results$q.mi), t(mi.results$se.mi), t(mi.z), t(mi.p))
  61. colnames(mi.final) <- c("Coeff", "SE", "Z", "P")

  62. # without imputation
  63. reg.final
  64. #with imputation
  65. mi.final

  66. ## Example 2 ##

  67. MIData2 <- read.dta("development.dta", convert.factors=FALSE)

  68. MIData2sub <- subset(MIData2, select=c(gxpdhlth, gini, g, glag, dictator, births, infmort, cath, moslem, femsec, fertil,country,year))

  69. impute2.out <- amelia(MIData2sub, m=5, cs="country", ts="year", noms="dictator")

  70. tscsPlot(impute2.out, var="infmort", cs="1")

  71. impute3.out <- amelia(MIData2sub, m=5, cs="country", ts="year", noms="dictator", polytime=3)

  72. tscsPlot(impute3.out, var="infmort", cs="1")

  73. regmod2.out <- zelig(infmort ~ gxpdhlth + glag + dictator + femsec, model="ls", data=MIData2)
  74. summary(regmod2.out)

  75. impd1 <- subset(impute3.out$imputations[[1]])
  76. impd2 <- subset(impute3.out$imputations[[2]])
  77. impd3 <- subset(impute3.out$imputations[[3]])
  78. impd4 <- subset(impute3.out$imputations[[4]])
  79. impd5 <- subset(impute3.out$imputations[[5]])

  80. impmod2.out <- zelig(infmort ~ gxpdhlth + glag + dictator + femsec, model="ls", data=mi(impd1, impd2, impd3, impd4, impd5))
  81. summary(impmod2.out)

  82. impmod2.out <- zelig(g ~ gxpdhlth + glag + dictator + femsec, model="ls", data=mi(impd1, impd2, impd3, impd4, impd5))
  83. summary(impmod2.out)
复制代码

7
ReneeBK 发表于 2017-1-22 03:02:02
  1. ## Selection models in R ##

  2. install.packages("sampleSelection")
  3. library(sampleSelection)

  4. data("Mroz87")
  5. Mroz87$kids <- (Mroz87$kids5 + Mroz87$kids618 > 0)
  6. head(Mroz87)

  7. # Female labor supply (lfp = labour force participation)

  8. ## Outcome equations without correcting for selection
  9. # I() means "as-is" -- do calculation in parentheses then use as variable

  10. ## Comparison of linear regression and selection model

  11. outcome1 <- lm(wage ~ exper, data = Mroz87)
  12. summary(outcome1)

  13. selection1 <- selection(selection = lfp ~ age + I(age^2) + faminc + kids + educ, outcome = wage ~ exper,
  14. data = Mroz87, method = "2step")
  15. summary(selection1)

  16. plot(Mroz87$wage ~ Mroz87$exper)
  17. curve(outcome1$coeff[1] + outcome1$coeff[2]*x, col="black", lwd="2", add=TRUE)
  18. curve(selection1$coeff[1] + selection1$coeff[2]*x, col="orange", lwd="2", add=TRUE)


  19. ## A more complete model comparison

  20. outcome2 <- lm(wage ~ exper + I( exper^2 ) + educ + city, data = Mroz87)
  21. summary(outcome1)

  22. ## Correcting for selection

  23. selection.twostep2 <- selection(selection = lfp ~ age + I(age^2) + faminc + kids + educ, outcome = wage ~ exper + I(exper^2) + educ + city,
  24. data = Mroz87, method = "2step")
  25. summary(selection.twostep2)

  26. selection.mle <- selection(selection = lfp ~ age + I(age^2) + faminc + kids + educ, outcome = wage ~ exper + I(exper^2) + educ + city,
  27. data = Mroz87, method = "mle")
  28. summary(selection.mle)


  29. ## Heckman model selection "by hand" ##

  30. seleqn1 <- glm(lfp ~ age + I(age^2) + faminc + kids + educ, family=binomial(link="probit"), data=Mroz87)
  31. summary(seleqn1)

  32. ## Calculate inverse Mills ratio by hand ##

  33. Mroz87$IMR <- dnorm(seleqn1$linear.predictors)/pnorm(seleqn1$linear.predictors)

  34. ## Outcome equation correcting for selection ##

  35. outeqn1 <- lm(wage ~ exper + I(exper^2) + educ + city + IMR, data=Mroz87, subset=(lfp==1))
  36. summary(outeqn1)

  37. ## compare to selection package -- coefficients right, se's wrong
  38. summary(selection.twostep2)


  39. ## interpretation
  40. ## If our independent variables does not appear in the selection equation, we can interpret beta as in linear regression
  41. ## If it does appear in the selection equation, we must calculate:

  42. beta.educ.sel <- selection.twostep2$coefficients[6]
  43. beta.educ.out <- selection.twostep2$coefficients[10]
  44. beta.IMR <- selection.twostep2$coefficients[12]
  45. delta <- selection.twostep2$imrDelta

  46. marginal.effect <- beta.educ.out - beta.educ.sel * beta.IMR * delta
  47. mr2 <- marginal.effect * Mroz87$educ


  48. plot(Mroz87$wage ~ Mroz87$educ)
  49. lines(mr2 ~ Mroz87$educ, type="l", col="green", lwd="2")

  50. ## Selection with a binary outcome variable
  51. ## Data from Kimball (2006)

  52. library(foreign)
  53. conflict.data <- read.dta("MissingLink_JPRfinal.dta",  convert.factors=FALSE)
  54. conflict.data <- na.omit(conflict.data)
  55. head(conflict.data)

  56. ## A probit model for conflict

  57. probit.model <- glm(conflict ~ relcap+contig+jtdem+jaut+pwrs+allform2, family=binomial(link=probit), data=conflict.data)
  58. summary(probit.model)

  59. install.packages("Zelig")
  60. library(Zelig)
  61. install.packages("ZeligChoice")
  62. library(ZeligChoice)


  63. selection.formula <- list(mu1 = conflict ~ relcap+contig+jtdem+jaut+pwrs,
  64. mu2 = allform2 ~ relcap+logdist+contig+jtdem+jaut+sharerival)

  65. selection.binary <- zelig(selection.formula, model = "bprobit", data = conflict.data)
  66. summary(selection.binary)

  67. x.contig <- setx(selection.binary, contig=1)
  68. sim.binary1 <- sim(selection.binary, x = x.contig)
  69. summary(sim.binary1)
  70. plot(sim.binary1)

  71. x.noncontig <- setx(selection.binary, contig=0)
  72. sim.binary2 <- sim(selection.binary, x = x.contig, x1=x.noncontig)
  73. summary(sim.binary2)
  74. plot(sim.binary2)
复制代码

8
ReneeBK 发表于 2017-1-22 03:05:22
  1. ## Models for BTSCS, Class 5

  2. IRdata <- read.table("OR.txt", header=T, sep="\t")

  3. IRdata <- as.data.frame(na.omit(IRdata))

  4. View(IRdata)

  5. ## Pooled logit ##

  6. logitmod1 <- glm(dispute ~ dem + growth + allies + contig + capratio + trade, family=binomial(link="logit"), data=IRdata, x=TRUE)
  7. summary(logitmod1)

  8. ## Fixed effects logit ##
  9. ## the pglm package can supposedly also do this, but it seems to crash ##

  10. install.packages("glmmML")
  11. library(glmmML)

  12. logitfemod1 <- glmmboot(dispute ~ dem + growth + allies + contig + capratio + trade, family=binomial(link="logit"), data=IRdata, cluster=ordyid)
  13. summary(logitfemod1)

  14. ## list first 10 estimated fixed effects ##

  15. logitfemod1$frail[1:10]

  16. ## how many observations in each model?

  17. length(logitmod1$fitted.values)

  18. dyadrep <- as.vector(table(IRdata$ordyid))
  19. validobs <- as.numeric(logitfemod1$frail > -Inf)
  20. sum(dyadrep * validobs)

  21. ## dynamics
  22. ## Logit with time dummy variables ##

  23. logitmod2 <- glm(dispute ~ dem + growth + allies + contig + capratio + trade + as.factor(py), family=binomial(link="logit"), data=IRdata)
  24. summary(logitmod2)

  25. ## Logit with time effects: cubic polynomial ##

  26. IRdata$py2 <- (IRdata$py)^2
  27. IRdata$py3 <- (IRdata$py)^3

  28. logitmod3 <- glm(dispute ~ dem + growth + allies + contig + capratio + trade + py + py2 + py3, family=binomial(link="logit"), data=IRdata)
  29. summary(logitmod3)

  30. ## plot hazard rate over time

  31. xb <- (colMeans(cbind(1, IRdata$dem, IRdata$growth, IRdata$allies, IRdata$contig, IRdata$capratio, IRdata$trade))) %*% logitmod3$coefficients[1:7]
  32. time <- seq(1,50,1)

  33. hazard <- plogis(xb + (time*logitmod3$coefficients[8]) + ((time^2)*logitmod3$coefficients[9]) + ((time^3)*logitmod3$coefficients[10]))
  34. plot(hazard~time, type="l")
复制代码

9
ReneeBK 发表于 2017-1-22 03:06:55
  1. ## survival models in R ##

  2. library(foreign)

  3. coalition.data <- read.dta("coalition.dta",  convert.factors=FALSE)
  4. coalition.data$fractionalization <- coalition.data$fractionalization/1000

  5. ## duration measured in months

  6. olsmodel1 <- lm(duration ~ investiture + fractionalization + polarization + majority_government + crisis, data=coalition.data)
  7. summary(olsmodel1)

  8. install.packages("Zelig")
  9. library(Zelig)

  10. ## the exponential survival model

  11. exp.survival <- zelig(Surv(duration, censor12) ~ investiture + fractionalization + polarization + majority_government + crisis,
  12. model="exp", data=coalition.data)
  13. summary(exp.survival)

  14. # expected values and first differences
  15. x.minority <- setx(exp.survival, majority_government = 0)
  16. x.majority <- setx(exp.survival, majority_government = 1)

  17. exp.survival.sim <- sim(exp.survival, x=x.minority, x1=x.majority)
  18. summary(exp.survival.sim)

  19. plot(exp.survival.sim)


  20. ## The Weibull model -- currently not working in Zelig, so we can work around it

  21. #weibull.survival <- zelig(Surv(duration, censor12) ~ identifiability + volatility + response + investiture + polarization + fractionalization + majority_government, model="weibull", data=coalition.data)

  22. weibull.survival <- survreg(Surv(duration, censor12) ~ investiture + fractionalization + polarization + majority_government + crisis,
  23. dist="weibull", data=coalition.data, x=TRUE)  # note x=TRUE
  24. summary(weibull.survival)

  25. # scale is <1 so hazard rate decreasing over time

  26. # plot hazard rate over time
  27. p <- 1/weibull.survival$scale
  28. t <- seq(1,60,1)
  29. lambda <- exp(colMeans(weibull.survival$x) %*% weibull.survival$coefficients)
  30. weibull.hazard <- lambda * p * (lambda * t)^(1-p)
  31. plot(weibull.hazard, type="l")


  32. # expected values and first differences

  33. x.minority <- append(colMeans(weibull.survival$x),1)  # The 1 is so we can pick up the scale parameter
  34. x.minority["majority_government"] <- 0
  35. x.majority <- x.minority
  36. x.majority["majority_government"] <- 1

  37. weibull.coeffs <- append(weibull.survival$coefficients, log(weibull.survival$scale))

  38. betas <- mvrnorm(1000, weibull.coeffs, vcov(weibull.survival))

  39. expect.min <- 1/exp(-x.minority %*% t(betas))
  40. expect.maj <- 1/exp(-x.majority %*% t(betas))

  41. weibull.fd <- expect.maj - expect.min

  42. mean.min <- mean(expect.min)
  43. mean.maj <- mean(expect.maj)
  44. sd.min <- apply(expect.min,1,sd)
  45. sd.maj <- apply(expect.maj,1,sd)
  46. mean.fd <- mean(weibull.fd)
  47. sd.fd <- apply(weibull.fd,1,sd)

  48. fd.results <- rbind(cbind(mean.min, sd.min), cbind(mean.maj, sd.maj), cbind(mean.fd, sd.fd))
  49. colnames(fd.results) <- c("Mean", "SD")
  50. rownames(fd.results) <- c("Minority Govt", "Majority Govt", "FD")
  51. print(fd.results)


  52. ## The Cox proportional hazard model -- once again not working in Zelig

  53. #coxph.survival <- exp.survival <- zelig(Surv(duration, censor12) ~ identifiability + volatility + response + investiture + polarization + fractionalization + majority_government, model="coxph", data=coalition.data)

  54. #install.packages("survival") -- already installed with Zelig
  55. #library(survival)

  56. coxph.survival <- coxph(Surv(duration, censor12) ~ identifiability + volatility + response + investiture + polarization + fractionalization + majority_government, data=coalition.data, x=TRUE)  # note x=TRUE
  57. summary(coxph.survival)  # note reversal of signs!

  58. # estimated survival function

  59. plot(survfit(coxph.survival), xlab="Months", ylab="Governments Surviving")

  60. x.minority <- colMeans(coxph.survival$x)  
  61. x.minority["majority_government"] <- 0
  62. x.majority <- x.minority
  63. x.majority["majority_government"] <- 1
  64. hyp.data <- as.data.frame(rbind(x.minority,x.majority))

  65. plot(survfit(coxph.survival, newdata=hyp.data), xlab="Months", ylab="Governments Surviving", conf.int=T, lty=c(1,2))
  66. legend("topright", legend=c("minority govt","majority govt"), lty=c(1,2))

  67. # interpreting in terms of expected time to failure is hard because we do not estimate a baseline
  68. # instead look at changes in the hazard rate

  69. hazard.pct.change <- ((exp(x.minority %*% t(betas)) - exp(x.majority %*% t(betas)))/exp(x.minority %*% t(betas)))*100
  70. hazard.mean <- mean(hazard.pct.change)
  71. hazard.sd <- apply(hazard.pct.change,1,sd)
  72. cbind(hazard.mean, hazard.sd)  # minority gov't 47% more likely to fail
复制代码

10
ReneeBK 发表于 2017-1-22 03:07:19
  1. ## 2SLS in R ##

  2. library(foreign)

  3. AllData <- read.dta("newdail2002v2.dta",  convert.factors=FALSE)
  4. CampaignData <- na.omit(subset(AllData, select=c(votes1st,incumb,spend_regular,spend_regularXinc,spend_public,partyquota1997,electoraK,dublin,senator,councillor,wonseat)))

  5. ## OLS ##

  6. olsmodel <- lm(votes1st ~ incumb + spend_regular + spend_regularXinc, data=CampaignData)
  7. summary(olsmodel)

  8. olsmodel2 <- lm(votes1st ~ incumb + spend_regular + spend_regularXinc + spend_public, data=CampaignData)
  9. summary(olsmodel2)

  10. ## 2SLS by hand -- this replicates models 1 and 2, Table 3 in Benoit and Laver 2008

  11. first.stage.1 <- lm(spend_regular ~ partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
  12. summary(first.stage.1)

  13. CampaignData$instrumented.spending <- first.stage.1$fitted.values
  14. CampaignData$inst.spend.inc <- CampaignData$instrumented.spending * CampaignData$incumb

  15. second.stage.1 <- lm(votes1st ~ incumb + instrumented.spending + inst.spend.inc, data=CampaignData)
  16. summary(second.stage.1)


  17. first.stage.2 <- lm(spend_public ~ partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
  18. summary(first.stage.2)

  19. CampaignData$instrumented.pspending <- first.stage.2$fitted.values

  20. second.stage.2 <- lm(votes1st ~ incumb + instrumented.spending + inst.spend.inc + instrumented.pspending, data=CampaignData)
  21. summary(second.stage.2)


  22. ## Issue to consider -- is interaction also endogenous?
  23. ## Example of difference in standard errors

  24. second.stage.1v2 <- lm(votes1st ~ instrumented.spending, data=CampaignData)
  25. summary(second.stage.1v2)

  26. ## 2SLS ##
  27. ## compare standard errors

  28. install.packages("AER")
  29. library(AER)

  30. tslsmodel <- ivreg(votes1st  ~ spend_regular | partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
  31. summary(tslsmodel)





  32. ## Complete models with 2SLS
  33. ## note incumbency is also an instrument

  34. tslsmodel.1 <- ivreg(votes1st  ~ incumb + spend_regular + spend_regularXinc | incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
  35. summary(tslsmodel.1)

  36. tslsmodel.2 <- ivreg(votes1st  ~ incumb + spend_regular + spend_regularXinc + spend_public | incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
  37. summary(tslsmodel.2)


  38. ## An alternate package

  39. install.packages("sem")
  40. library(sem)

  41. tslsmodel.3 <- tsls(votes1st  ~ incumb + spend_regular + spend_regularXinc, ~ incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
  42. summary(tslsmodel.3)

  43. tslsmodel.4 <- tsls(votes1st  ~ incumb + spend_regular + spend_regularXinc + spend_public , ~ incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
  44. summary(tslsmodel.4)


  45. ## 2 stage estimator with a binary dv

  46. probit.model <- glm(wonseat ~ incumb + spend_regular + spend_regularXinc + spend_public, family=binomial(link=probit), data=CampaignData)
  47. summary(probit.model)


  48. ## (sort of) replication of table 5

  49. tspmodel.1 <- glm(wonseat ~ incumb + instrumented.spending + inst.spend.inc + instrumented.pspending, family=binomial(link=probit), data=CampaignData)
  50. summary(tspmodel.1)

  51. ## 2SCML

  52. reg.inst1 <- lm(spend_regular ~ incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
  53. reg.inst2 <- lm(spend_regularXinc ~ incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
  54. reg.inst3 <- lm(spend_public ~ incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)

  55. CampaignData$inst1 <- reg.inst1$residuals
  56. CampaignData$inst2 <- reg.inst2$residuals
  57. CampaignData$inst3 <- reg.inst3$residuals

  58. tscml.model <- glm(wonseat ~ incumb + spend_regular + spend_regularXinc + spend_public + inst1 + inst2 + inst3, family=binomial(link=probit), data=CampaignData)
  59. summary(tscml.model)

  60. ## test for endogeneity

  61. install.packages("lmtest")
  62. library(lmtest)

  63. lrtest(probit.model, tscml.model)
复制代码

您需要登录后才可以回帖 登录 | 我要注册

本版微信群
加好友,备注jltj
拉您入交流群
GMT+8, 2026-1-23 02:56