【Lecture Notes】Political Science 207

Post #1 (original poster)

ReneeBK posted on 2017-1-22 02:58:06


Political Science 207, Winter Quarter 2014

Final Exam

Homeworks

Data and R Code

Readings

Week 9: Difference-in-Difference, Matching, and Regression Discontinuity
- Card, David, and Alan B. Krueger. 1994. "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania." American Economic Review 84:772–793.
- Ho, Daniel E., Kosuke Imai, Gary King, and Elizabeth A. Stuart. 2007. "Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference." Political Analysis 15:199-236.
- Gerber, Elisabeth R., and Daniel J. Hopkins. 2011. "When Mayors Matter: Estimating the Impact of Mayoral Partisanship on City Policy." American Journal of Political Science 55:326-339.
Week 8: Multilevel Models
- Jones, Bradford S. 2009. "Multilevel Modeling." In The Oxford Handbook of Political Methodology, J. Box-Steffensmeier, H. Brady, and D. Collier (eds). New York: Oxford University Press.
- Steenbergen, Marco R., and Bradford S. Jones. 2002. "Modeling Multilevel Data Structures." American Journal of Political Science 46:218-237.
Week 7: Multiple Imputation for Missing Data
- King, Gary, James Honaker, Anne Joseph, and Kenneth Scheve. 2001. "Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation." American Political Science Review 95:49-70.
- Honaker, James, and Gary King. 2010. "What to do About Missing Values in Time Series Cross-Section Data." American Journal of Political Science 54:561-581.
Week 6: Selection Models
- Berinsky, Adam. 1999. "The Two Faces of Public Opinion." American Journal of Political Science 43:1209-1230.
- Bushway, Shaun, Brian D. Johnson, and Lee Ann Slocum. 2007. "Is the Magic Still There? The Use of the Heckman Two-Step Correction for Selection Bias in Criminology." Journal of Quantitative Criminology 23:151-178.
- Dubin, Jeffrey A., and Douglas Rivers. 1990. "Selection Bias in Linear Regression, Logit, and Probit Models." Sociological Methods and Research 18:360-390.
Week 5: Binary TSCS Models
- Green, Donald P., Soo Yeon Kim, and David H. Yoon. 2001. "Dirty Pool." International Organization 55:441–468.
- Beck, Nathaniel, and Jonathan N. Katz. 2001. "Throwing Out the Baby with the Bath Water: A Comment on Green, Kim, and Yoon." International Organization 55:487–495.
- Beck, Nathaniel, Jonathan Katz, and Richard Tucker. 1998. "Taking Time Seriously: Time-Series-Cross-Section Analysis with a Binary Dependent Variable." American Journal of Political Science 42:1260-1288.
- Carter, David B., and Curtis S. Signorino. 2010. "Back to the Future: Modeling Time Dependence in Binary Data." Political Analysis 18:271-292.
Week 4: Survival Models
- Box-Steffensmeier, Janet M., and Bradford S. Jones. 1997. "Time is of the Essence: Event History Models in Political Science." American Journal of Political Science 41:1414–1461.
- Beck, Nathaniel, Jonathan Katz, and Richard Tucker. 1998. "Taking Time Seriously: Time-Series-Cross-Section Analysis with a Binary Dependent Variable." American Journal of Political Science 42:1260-1288.
Week 3: Nonspherical Errors in TSCS Data
- Beck, Nathaniel, and Jonathan N. Katz, 1995. "What to Do (and Not to Do) with Time-Series Cross-Section Data." American Political Science Review 89:634-647.
- Beck, Nathaniel, and Jonathan N. Katz, 1996. "Nuisance vs. Substance: Specifying and Estimating Time-Series-Cross-Section Models." Political Analysis 6:1-36.
- Alvarez, R. Michael, Geoffrey Garrett, and Peter Lange, 1991. "Government Partisanship, Labor Organization, and Macroeconomic Performance." American Political Science Review 85:539-556.
- Beck, Nathaniel, Jonathan N. Katz, R. Michael Alvarez, Geoffrey Garrett, and Peter Lange, 1993. "Government Partisanship, Labor Organization, and Macroeconomic Performance: A Corrigendum." American Political Science Review 87:945-948.
Week 2: Instrumental Variables
- Benoit, Kenneth, and Michael Marsh. 2008. "The Campaign Value of Incumbency: A New Solution to the Puzzle of Less Effective Incumbent Spending." American Journal of Political Science 52:874-890.
- Gerber, Alan S., and Donald P. Green. 2000. "The Effects of Canvassing, Telephone Calls, and Direct Mail on Voter Turnout: A Field Experiment." American Political Science Review 94:653-663.
- Jacobson, Gary. 1978. "The Effects of Campaign Spending in Congressional Elections." American Political Science Review 72:469-491.
- Alvarez, R. Michael, and Garrett Glasgow. 1999. "Two-Stage Estimation of Non-Recursive Choice Models." Political Analysis 8:147-165.
Week 1: Panel Data
- Knack, Stephen, 1995. "Does 'Motor Voter' Work? Evidence from State-Level Data." Journal of Politics 57:796-811.
- Stimson, James A., 1985. "Regression in Space and Time: A Statistical Essay." American Journal of Political Science 29:914-947.
- Worrall, John L. 2010. "A User-friendly Introduction to Panel Data Modeling." Journal of Criminal Justice Education 21:182-196.

All course readings are available for download through the library website, either through JSTOR or our subscription to the electronic version of the journal. All campus URLs have access to these resources; if you want to download articles from home, see this.

Course Materials

Syllabus


Post #2

ReneeBK posted on 2017-1-22 02:58:43

## GIS in R ##
install.packages("maps")
library(maps)
install.packages("mapdata")
library(mapdata)
map("usa", col="blue")
map("state", col="blue")
map("county", col="blue")
install.packages("mapproj")
library(mapproj)
map("usa", col="blue", proj="sp_albers", par=c(30,40)) # conic projection
map("usa", col="blue", proj="sp_mercator") # cylindrical projection
map("worldHires", "Canada", xlim=c(-141, -53), ylim=c(40, 85), col="gray", fill=TRUE)
map("worldHires", "Canada", xlim=c(-140, -110), ylim=c(48, 60), col="gray", fill=TRUE)
install.packages("rgdal")
library(rgdal)
sf.precincts <- readOGR(".", "sfprecincts")
redtags <- read.csv("redtags.csv")
install.packages("sp") # should already be installed, but just in case
library(sp)
redtag.points <- SpatialPoints(cbind(redtags$lon, redtags$lat))
proj4string(redtag.points) <- proj4string(sf.precincts)
plot(sf.precincts)
plot(redtag.points, add=TRUE, col="red", pch=20)
sf.precincts@data$id <- seq(1,nrow(sf.precincts@data),1)
redtag.overlay <- over(redtag.points, sf.precincts)
redtag.overlay2 <- as.matrix(cbind(rownames(table(redtag.overlay$precinct)), table(redtag.overlay$precinct)))
colnames(redtag.overlay2) <- c("precinct", "redtags")
precinct.redtags <- merge(sf.precincts@data, redtag.overlay2, all.x=TRUE, sort=FALSE)
precinct.redtags$redtags <- as.numeric(as.character(precinct.redtags$redtags))
table(precinct.redtags$redtags)
sf.precincts@data <- precinct.redtags[order(precinct.redtags$id),]
sf.precincts@data$redtags[is.na(sf.precincts@data$redtags)] <- 0
install.packages("RColorBrewer")
library(RColorBrewer)
# plotting red tagged buildings
colors <- brewer.pal(5, "Reds")
brks <- c(0,1,2,5,10,30)
brknames <- c("0","1","2-4","5-9","10+")
plot(sf.precincts, col=colors[findInterval(sf.precincts$redtags, brks, all.inside=TRUE)])
plot(redtag.points, add=TRUE, col="blue", pch=1)
pdf("redtag_plot.pdf")
plot(sf.precincts, col=colors[findInterval(sf.precincts$redtags, brks, all.inside=TRUE)])
legend("topright", legend=brknames, fill=colors)
title("Red Tagged Buildings By Voting Precinct")
dev.off()
# plotting voter turnout
sf.precincts@data$turnout87[sf.precincts@data$turnout87==1] <- NA # set parks to missing
colors2 <- brewer.pal(5, "Greens")
brks2 <- c(0,quantile(sf.precincts@data$turnout87, na.rm=TRUE))
brknames2 <- c(paste(brks2[1], brks2[2], sep="-"), paste(brks2[2], brks2[3], sep="-"), paste(brks2[3], brks2[4], sep="-"), paste(brks2[4], brks2[5], sep="-"), paste(brks2[5], brks2[6], sep="-"))
colors2.map <- colors2[findInterval(sf.precincts$turnout87, brks2, all.inside=TRUE)]
colors2.map[is.na(colors2.map)] <- "#666666"
pdf("turnout_plot.pdf")
plot(sf.precincts, col=colors2.map)
legend("topright", legend=brknames2, fill=colors2)
title("Voter Turnout By Voting Precinct")
dev.off()
# calculation of distances
install.packages("geosphere")
library(geosphere)
# a simple spatial autocorrelation factor as in the bridges paper
precinct.dists <- as.matrix(dist(coordinates(sf.precincts)))
precinct.dists.inv <- 1/precinct.dists
diag(precinct.dists.inv) <- 0
install.packages("ape")
library(ape)
Moran.I(sf.precincts@data$redtags, precinct.dists.inv, na.rm=TRUE)
sf.precincts@data$sp.auto <- (precinct.dists.inv/1000) %*% sf.precincts@data$turnout89
reg.model1 <- lm(turnout89 ~ redtags + sp.auto, data = sf.precincts@data)
summary(reg.model1)
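
The spatial autocorrelation step above builds an inverse-distance weight matrix and hands it to Moran.I(). As a rough illustration of what that statistic measures, here is a minimal base-R sketch on six made-up points along a line. The function name moran.by.hand and the toy data are hypothetical and are not part of the course files. Moran.I() may rescale the weights internally, so its value need not match this unscaled version exactly.

```r
# Moran's I with an arbitrary weight matrix w (zero diagonal), written out by hand:
# I = (n / sum(w)) * sum_ij w_ij * z_i * z_j / sum_i z_i^2, with z = x - mean(x)
moran.by.hand <- function(x, w) {
  z <- x - mean(x)
  n <- length(x)
  (n / sum(w)) * sum(w * outer(z, z)) / sum(z^2)
}

# Toy layout (hypothetical): six locations equally spaced on a line
coords <- cbind(1:6, 0)
w <- 1 / as.matrix(dist(coords))   # inverse-distance weights, as in the script above
diag(w) <- 0                       # a location is not its own neighbor

clustered   <- c(1, 1, 1, 5, 5, 5) # similar values sit next to each other
alternating <- c(1, 5, 1, 5, 1, 5) # neighbors are as different as possible

i.clustered   <- moran.by.hand(clustered, w)
i.alternating <- moran.by.hand(alternating, w)
print(c(clustered = i.clustered, alternating = i.alternating))
```

A positive value indicates that nearby locations tend to have similar values, and a negative value indicates that neighbors tend to differ.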

Post #3

ReneeBK posted on 2017-1-22 02:59:17

##########################
## 12/8/2013
library(igraph)
library(ANN)
library(rgeos)
library(rgdal)
library(FNN)
library(geosphere)
sf.streets <- readOGR("C:\\R_group\\travel_routing", "tl_2012_06075_roads")
sf.streets@data$kph <- 1
sf.streets@data$kph[sf.streets@data$MTFCC=="S1100"] <- 100 # 65 mph
sf.streets@data$kph[sf.streets@data$MTFCC=="S1200"] <- 65 # 45 mph
sf.streets@data$kph[sf.streets@data$MTFCC=="S1400"] <- 40 # 25 mph
sf.streets@data$kph[sf.streets@data$MTFCC=="S1630"] <- 30
sf.streets@data$kph[sf.streets@data$MTFCC=="C3062"] <- 25
sf.streets@data$kph[sf.streets@data$MTFCC=="S1730"] <- 15
edge.count <- gIntersection(sf.streets,sf.streets) ## printing shows 35542 edges
edges <- matrix(NA,100000,6)
row.start <- 0
# This loop takes about 80 minutes to run on my home machine
for (i in 1:length(sf.streets)) {
temp.int <- gIntersection(sf.streets, sf.streets[i,])
temp.edges <- do.call(rbind, lapply(temp.int@lines[[1]]@Lines, function(ls) {
as.vector(t(ls@coords)) }))
temp.lengths <- (distHaversine(temp.edges[,1:2], temp.edges[,3:4])/(sf.streets@data$kph[i]*1000)) * 60 # travel time in minutes
edges[(row.start+1):(row.start+nrow(temp.edges)), 1:4] <- temp.edges
edges[(row.start+1):(row.start+nrow(temp.edges)), 5] <- temp.lengths
edges[(row.start+1):(row.start+nrow(temp.edges)), 6] <- i
row.start <- row.start + nrow(temp.edges)
}
edges <- unique(na.omit(edges))
# start from here
library(igraph)
library(ANN)
library(rgeos)
library(rgdal)
library(FNN)
library(geosphere)
froms <- paste(edges[,1], edges[,2])
tos <- paste(edges[,3], edges[,4])
graph <- graph.edgelist(cbind(froms, tos), directed = FALSE)
E(graph)$weight <- edges[,5]
xy <- do.call(rbind, strsplit(V(graph)$name, " "))
V(graph)$x <- as.numeric(xy[, 1])
V(graph)$y <- as.numeric(xy[, 2])
xyg <- cbind(V(graph)$x, V(graph)$y)
## pick route points here
## Coit Tower -122.405833, 37.8025 -- 37.8025, -122.405833
## Golden Gate Bridge (mid-span) -122.478611, 37.819722 -- 37.819722, -122.478611
## Candlestick Park -122.386111, 37.713611 -- 37.713611,-122.386111
## SF State -122.479722, 37.723333 -- 37.723333,-122.479722
from <- cbind(-122.478611, 37.819722)
to <- cbind(-122.405833, 37.8025)
ifrom <- get.knnx(xyg, from, 1)$nn.index[1, 1]
ito <- get.knnx(xyg, to, 1)$nn.index[1, 1]
v.path <- get.shortest.paths(graph, ifrom, ito, output = "vpath")[[1]]
e.path <- get.shortest.paths(graph, ifrom, ito, output = "epath")[[1]]
route <- xyg[v.path,]
totaldist <- 0
cardinal.directions <- NULL
for (i in 2:nrow(route)) {
tempdist <- distHaversine(route[i-1,], route[i,])
totaldist <- totaldist + tempdist
direction <- bearing(route[i-1, ], route[i, ])
cardinal.directions <- rbind(cardinal.directions, direction)
}
distance.km <- totaldist/1000 # distance in kilometers
distance.miles <- totaldist/1609.34 # distance in miles
travel.time <- sum(E(graph)$weight[e.path]) # travel time in minutes; edge weights are indexed by edge path, not vertex path
street.names <- as.character(sf.streets@data[edges[e.path,6],2])
street.names[is.na(street.names)] <- "Unknown Road"
repeated <- matrix(FALSE,length(street.names),1)
for (j in 2:length(street.names)) {
repeated[j] <- (street.names[j] == street.names[j-1])
}
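# bearing() can return negative degrees for westward headings; map everything to [0, 360) before binning
cardinal.directions <- (cardinal.directions + 360) %% 360
compass.points <- cardinal.directions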
compass.points[cardinal.directions>337.5 | cardinal.directions<22.5] <- "N"
compass.points[cardinal.directions>22.5 & cardinal.directions<67.5] <- "NE"
compass.points[cardinal.directions>67.5 & cardinal.directions<112.5] <- "E"
compass.points[cardinal.directions>112.5 & cardinal.directions<157.5] <- "SE"
compass.points[cardinal.directions>157.5 & cardinal.directions<202.5] <- "S"
compass.points[cardinal.directions>202.5 & cardinal.directions<247.5] <- "SW"
compass.points[cardinal.directions>247.5 & cardinal.directions<292.5] <- "W"
compass.points[cardinal.directions>292.5 & cardinal.directions<337.5] <- "NW"
all.turns <- cbind(street.names, compass.points, repeated)
turn.by.turn <- subset(all.turns, repeated==FALSE)[,1:2]
rownames(turn.by.turn) <- NULL
colnames(turn.by.turn) <- c("Street Name", "Direction")
route.results <- cbind(travel.time, distance.miles)
colnames(route.results) <- c("Travel Time (Minutes)", "Distance (Miles)")
print(route.results)
print(turn.by.turn)
plot(sf.streets, col="gray")
lines(V(graph)[v.path]$x, V(graph)[v.path]$y, lwd = 2, col = "red")
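
The turn-by-turn section above bins each bearing into one of eight compass points with a chain of strict inequalities, which leaves values that fall exactly on a boundary (22.5, 67.5, ...) unlabeled. A compact alternative, sketched below in base R, normalizes the bearing and computes the bin arithmetically. The helper name bearing.to.compass is hypothetical and is not part of the original script.

```r
# Map bearings in degrees (any sign) to eight compass points.
# Each sector is 45 degrees wide and centered on its compass direction.
bearing.to.compass <- function(b) {
  points <- c("N", "NE", "E", "SE", "S", "SW", "W", "NW")
  b <- (b + 360) %% 360                   # normalize to [0, 360)
  points[(floor((b + 22.5) / 45) %% 8) + 1]
}

example.bearings <- c(0, 45, 180, -90, 350)
example.points <- bearing.to.compass(example.bearings)
print(example.points)
```

In the script above, the same helper could be applied to cardinal.directions in place of the eight assignment lines.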

Post #4

ReneeBK posted on 2017-1-22 03:00:26

## PS 207 class 9
## DiD, matching, RD
# earned income tax credit data
# EITC = earned income tax credit, a tax credit for lower income families with children.
# Did more women work after it was expanded in 1994?
# urate = state unemployment rate
# finc = family income
# earn = earned income
# unearn = unearned income
library(foreign)
eitc <- read.dta("eitc.dta")
# Create two dummy variables to indicate before/after and treatment/control groups.
# the EITC went into effect in the year 1994
eitc$post93 <- as.numeric(eitc$year >= 1994)
# The EITC only affects women with at least one child, so the treatment group is all women with children.
eitc$anykids <- as.numeric(eitc$children >= 1)
# First calculate treatment effect through simple algebra
# Compute the four data points needed in the DiD calculation:
a <- mean(eitc$work[eitc$post93==0 & eitc$anykids==0], na.rm=T)
b <- mean(eitc$work[eitc$post93==0 & eitc$anykids==1], na.rm=T)
c <- mean(eitc$work[eitc$post93==1 & eitc$anykids==0], na.rm=T)
d <- mean(eitc$work[eitc$post93==1 & eitc$anykids==1], na.rm=T)
# Compute the effect of the EITC on the employment of women with children:
(d-c)-(b-a)
## Now a DiD regression model
did.model1 <- lm(work ~ post93*anykids, data = eitc)
summary(did.model1)
did.model2 <- lm(work ~ post93*anykids + nonwhite + age + I(age^2) + ed + finc + I(finc-earn), data = eitc)
summary(did.model2)
## placebo model
## demonstrate we don't get a treatment effect when picking a different year for the "treatment"
## subset to pre-treatment years
eitc.sub <- eitc[eitc$year <= 1993,]
# Create a new "after treatment" dummy variable
eitc.sub$post91 <- as.numeric(eitc.sub$year >= 1992)
# Run a placebo regression where placebo treatment = post91*anykids
did.model3 <- lm(work ~ post91*anykids, data = eitc.sub)
summary(did.model3)
#########################
## Matching techniques ##
#########################
bridges <- read.dta("bridges.dta", convert.factors=FALSE)
# MatchData2 is used below but never defined in this script; assumed here to be the bridges data
# (matchit() may additionally require dropping rows with missing values first)
MatchData2 <- bridges
install.packages("MatchIt")
library(MatchIt)
# exact matching (fails) #
bridge.match <- matchit(totbexp ~ spcorr_all + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totpop2564m, data=MatchData2, method="exact")
# propensity score matching #
bridge.match <- matchit(totbexp ~ spcorr_all + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totpop2564m, data=MatchData2, method="nearest")
summary(bridge.match)
plot(bridge.match)
plot(bridge.match, type="jitter") # interactive!
plot(bridge.match, type="hist")
m.data <- match.data(bridge.match)
nrow(m.data)
# testing for balance with t-tests
t.test(avgunemp ~ totbexp, data=bridges)
t.test(avgunemp ~ totbexp, data=m.data)
# mahalanobis distance matching #
bridge.match2 <- matchit(totbexp ~ spcorr_all + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totpop2564m, data=MatchData2, method="nearest", distance="mahalanobis")
summary(bridge.match2)
plot(bridge.match2)
# jitter and histogram plots are only for propensity scores
m2.data <- match.data(bridge.match2)
nrow(m2.data)
# propensity score matching, 2 matches per treated observation #
bridge.match3 <- matchit(totbexp ~ spcorr_all + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totpop2564m, data=MatchData2, method="nearest", ratio=2)
summary(bridge.match3)
plot(bridge.match3)
plot(bridge.match3, type="jitter")
plot(bridge.match3, type="hist")
m3.data <- match.data(bridge.match3)
nrow(m3.data)
# propensity score matching, discarding observations outside support #
bridge.match4 <- matchit(totbexp ~ spcorr_all + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totpop2564m, data=MatchData2, method="nearest", discard="both")
summary(bridge.match4)
plot(bridge.match4)
plot(bridge.match4, type="jitter")
plot(bridge.match4, type="hist")
m4.data <- match.data(bridge.match4)
nrow(m4.data)
#install.packages("Zelig")
#library(Zelig)
# Zelig crashing, so using my own code
library(MASS) # provides glm.nb()
# Negative binomial model with unmatched data #
nbmod1 <- glm.nb(jumps2564_m ~ spcorr_all + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totbexp + offset(log(totpop2564m)), data=bridges)
summary(nbmod1)
# Negative binomial model with propensity score matched data #
nbmod2 <- glm.nb(jumps2564_m ~ spcorr_all + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totbexp + offset(log(totpop2564m)), data=m.data)
summary(nbmod2)
# Negative binomial model with mahalanobis distance matched data #
nbmod3 <- glm.nb(jumps2564_m ~ spcorr_all + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totbexp + offset(log(totpop2564m)), data=m2.data)
summary(nbmod3)
# Negative binomial model with propensity score matching, 2 controls per treated #
nbmod4 <- glm.nb(jumps2564_m ~ spcorr_all + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totbexp + offset(log(totpop2564m)), data=m3.data)
summary(nbmod4)
# Negative binomial model with propensity score matching, discarding observations w/o common support #
nbmod5 <- glm.nb(jumps2564_m ~ spcorr_all + whitepct2564m + amindpct2564m + avgunemp + avgurban90 + totbexp + offset(log(totpop2564m)), data=m4.data)
summary(nbmod5)
x.t0 <- cbind(t(colMeans(cbind(1, bridges$spcorr_all, bridges$whitepct2564m, bridges$amindpct2564m, bridges$avgunemp, bridges$avgurban90))), 0)
x.t1 <- cbind(t(colMeans(cbind(1, bridges$spcorr_all, bridges$whitepct2564m, bridges$amindpct2564m, bridges$avgunemp, bridges$avgurban90))), 1)
offset.var <- mean(log(bridges$totpop2564m))
mod1.treat0 <- exp(nbmod1$coef%*%t(x.t0) + offset.var)
mod1.treat1 <- exp(nbmod1$coef%*%t(x.t1) + offset.var)
att1 <- mod1.treat1 - mod1.treat0
mod1.treat1
mod1.treat0
att1
mod5.treat0 <- exp(nbmod5$coef%*%t(x.t0) + offset.var)
mod5.treat1 <- exp(nbmod5$coef%*%t(x.t1) + offset.var)
att5 <- mod5.treat1 - mod5.treat0
mod5.treat1
mod5.treat0
att5
##############################
## Regression Discontinuity ##
##############################
# incumbency advantage
votedata <- read.dta("LMBdata.dta")
votedata$treatment <- (votedata$lagdemvoteshare >= 0.5)
rd.model1 <- lm(demvoteshare ~ lagdemvoteshare + treatment, data=votedata)
summary(rd.model1)
plot(demvoteshare ~ lagdemvoteshare, data=votedata)
curve(rd.model1$coefficients[1] + rd.model1$coefficients[2]*x + rd.model1$coefficients[3]*(x>0.5), from=0, to = 0.5, col="red", lw=2, add=TRUE)
curve(rd.model1$coefficients[1] + rd.model1$coefficients[2]*x + rd.model1$coefficients[3]*(x>0.5), from=0.501, to=1, col="red", lw=2, add=TRUE)
rd.model2 <- lm(demvoteshare ~ lagdemvoteshare*treatment, data=votedata)
summary(rd.model2)
plot(demvoteshare ~ lagdemvoteshare, data=votedata)
curve(rd.model2$coefficients[1] + rd.model2$coefficients[2]*x + rd.model2$coefficients[3]*(x>0.5) + rd.model2$coefficients[4]*(x>0.5)*x, from=0, to = 0.5, col="red", lw=2, add=TRUE)
curve(rd.model2$coefficients[1] + rd.model2$coefficients[2]*x + rd.model2$coefficients[3]*(x>0.5) + rd.model2$coefficients[4]*(x>0.5)*x, from=0.501, to = 1, col="red", lw=2, add=TRUE)
rd.model3 <- lm(demvoteshare ~ lagdemvoteshare*treatment + I(lagdemvoteshare^2)*treatment, data=votedata)
summary(rd.model3)
plot(demvoteshare ~ lagdemvoteshare, data=votedata)
curve(rd.model3$coefficients[1] + rd.model3$coefficients[2]*x + rd.model3$coefficients[3]*(x>0.5) + rd.model3$coefficients[4]*x*x + rd.model3$coefficients[5]*(x>0.5)*x + rd.model3$coefficients[6]*(x>0.5)*x*x, from=0, to = 0.5, col="red", lw=2, add=TRUE)
curve(rd.model3$coefficients[1] + rd.model3$coefficients[2]*x + rd.model3$coefficients[3]*(x>0.5) + rd.model3$coefficients[4]*x*x + rd.model3$coefficients[5]*(x>0.5)*x + rd.model3$coefficients[6]*(x>0.5)*x*x, from=0.501, to = 1, col="red", lw=2, add=TRUE)
install.packages("rdd")
library(rdd)
rd.model4 <- RDestimate(demvoteshare ~ lagdemvoteshare, cutpoint = 0.5, data=votedata)
summary(rd.model4)
plot(rd.model4)
# placebo test around median of lagged vote share
rd.model5 <- RDestimate(demvoteshare ~ lagdemvoteshare, cutpoint = median(votedata$lagdemvoteshare, na.rm=T), data=votedata)
summary(rd.model5)
# trying different bandwidths
rd.model4$bw
rd.model.05 <- RDestimate(demvoteshare ~ lagdemvoteshare, cutpoint = 0.5, data=votedata, bw=0.05)
summary(rd.model.05)
plot(rd.model.05)
rd.model.025 <- RDestimate(demvoteshare ~ lagdemvoteshare, cutpoint = 0.5, data=votedata, bw=0.025)
summary(rd.model.025)
plot(rd.model.025)
rd.model.01 <- RDestimate(demvoteshare ~ lagdemvoteshare, cutpoint = 0.5, data=votedata, bw=0.01)
summary(rd.model.01)
plot(rd.model.01)
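
In rd.model2 and rd.model3 above, the running variable is not centered, so the coefficient on treatment is the gap between the two fitted lines at lagdemvoteshare = 0 rather than at the 0.5 cutoff. A minimal sketch of the centered version follows, on simulated data rather than LMBdata.dta. All variable names and the true jump of 0.10 are assumptions made for illustration.

```r
# Simulated sharp RD (hypothetical data): the outcome jumps by 0.10 at the cutoff
set.seed(207)
n <- 2000
cutoff <- 0.5
x <- runif(n)                            # running variable
treat <- as.numeric(x >= cutoff)         # treatment assigned by crossing the cutoff
y <- 0.3 + 0.4 * x + 0.10 * treat + rnorm(n, sd = 0.05)

# Center the running variable so the treatment coefficient is the jump AT the cutoff
xc <- x - cutoff
rd.sim <- lm(y ~ xc * treat)             # separate slopes on each side
jump.hat <- unname(coef(rd.sim)["treat"])
print(jump.hat)
```

With real data the same idea would amount to regressing on I(lagdemvoteshare - 0.5) instead of lagdemvoteshare; RDestimate() handles the cutoff through its cutpoint argument.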

Post #5

ReneeBK posted on 2017-1-22 03:00:50

## multilevel model code for R ##
install.packages("lme4")
library(lme4)
#### First example -- exam scores
# normexam = test scores
# school = school id
# standLRT = individual score on a different test
# schavg = average intake score
install.packages("mlmRev")
library(mlmRev)
data(Exam)
head(Exam)
lmer(normexam ~ 1 + (1 | school), data=Exam) # random intercept only
lmer(normexam ~ standLRT + (1 | school), data=Exam) # random intercept plus fixed effect
lmer(normexam ~ standLRT + (standLRT | school), data=Exam) # random intercept, random slope
lmer(normexam ~ standLRT + schavg + (1 + standLRT | school), data=Exam) # random intercept, individual and group predictor
lmer(normexam ~ standLRT * schavg + (1 + standLRT | school), data=Exam) # random intercept, cross-level interaction
data(InstEval)
# s = student number
# d = professor number
# studage = student age (# semesters enrolled)
# lectage = lecture age (# semesters lecture was in the past)
# service = dummy variable for course taught outside department
# dept = department number
# y = student rating, higher is better
InstEval$studage <- as.numeric(as.character(InstEval$studage)) # convert from factor to number
head(InstEval)
## linear regression
linmod1 <- lm(y ~ studage + service, data=InstEval)
summary(linmod1)
linmod2 <- lm(y ~ studage + service + as.factor(dept), data=InstEval)
summary(linmod2)
## linear mixed model, random intercept
mixedmod1 <- lmer(y ~ studage + service + (1 | dept), data=InstEval)
summary(mixedmod1)
## linear regression, interaction model
linmod3 <- lm(y ~ studage + service + (studage * dept), data=InstEval)
summary(linmod3)
## linear mixed model, random intercept and random coefficient on student age by department
## note studage enters twice -- otherwise mean effect assumed to be zero
mixedmod2 <- lmer(y ~ service + studage + (1 + studage | dept), data=InstEval)
summary(mixedmod2)
# examine random and fixed effects
fixef(mixedmod2)
ranef(mixedmod2)
# plot results
library(lattice) # dotplot() is a lattice generic; load it in case it is not already attached
dotplot(ranef(mixedmod2))
randeffs <- unlist(ranef(mixedmod2))
reglines <- cbind((randeffs[1:14] + fixef(mixedmod2)[1]), (randeffs[15:28] + fixef(mixedmod2)[2]))
plot(y ~ studage, ylim = c(2,3.5), data=InstEval)
for (i in 1:nrow(reglines)) {
curve(reglines[i,1] + reglines[i,2] * x, col=i, add=TRUE)
}
# compare multilevel to separate regressions -- shrinkage estimator
mixedmod3 <- lmer(y ~ studage + (1 + studage | dept), data=InstEval)
summary(mixedmod3)
randeffs3 <- unlist(ranef(mixedmod3))
reglines3 <- cbind((randeffs3[1:14] + fixef(mixedmod3)[1]), (randeffs3[15:28] + fixef(mixedmod3)[2]))
alpha.hat <- NULL
beta.hat <- NULL
for(i in levels(InstEval$dept)){
unit.lm <- lm(y ~ studage, data = subset(InstEval, dept == i) )
alpha.hat <- append(alpha.hat, coef(unit.lm)[1])
beta.hat <- append(beta.hat, coef(unit.lm)[2])
}
pooled.model <- lm(y ~ studage, data=InstEval)
pooled.beta <- coef(pooled.model)[2]
plot(reglines3[,2] ~ beta.hat)
curve(1*x, add=TRUE)
abline(v=pooled.beta)
## Using Zelig
install.packages("ZeligMultilevel")
library(ZeligMultilevel)
# example 1
z.reg1 <- zelig(y ~ studage + service + tag(1 | dept), data=InstEval, model="ls.mixed")
summary(z.reg1)
x.high <- setx(z.reg1, studage=8)
x.low <- setx(z.reg1, studage=2)
s.reg1 <- sim(z.reg1, x=x.high, x1=x.low)
summary(s.reg1)
# example 2 -- note assumed intercept in random component
z.reg2 <- zelig(y ~ studage + service + tag(studage | dept), data=InstEval, model="ls.mixed")
summary(z.reg2)
x.high <- setx(z.reg2, studage=8)
x.low <- setx(z.reg2, studage=2)
s.reg2 <- sim(z.reg2, x=x.high, x1=x.low)
summary(s.reg2)
## multilevel logits
library(foreign)
votedata <- read.dta("polls.dta")
pooled.logit1 <- glm(bush ~ black + female + edu + age, family=binomial(link="logit"), data=votedata)
summary(pooled.logit1)
## Multilevel logit -- random intercept
multi.logit1 <- glmer(bush ~ black + female + edu + age + (1 | state), family=binomial(link="logit"), data=votedata)
summary(multi.logit1)
## Multilevel logit -- random slopes
multi.logit2 <- glmer(bush ~ black + female + age + (1 + edu | state), family=binomial(link="logit"), data=votedata)
summary(multi.logit2)
## Using Zelig
# logit example 1
z.logit1 <- zelig(bush ~ black + female + edu + age + tag(1 | state), data=votedata, model="logit.mixed")
summary(z.logit1)
x.high <- setx(z.logit1, age=4)
x.low <- setx(z.logit1, age=1)
s.logit1 <- sim(z.logit1, x=x.high, x1=x.low)
summary(s.logit1)
# logit example 2 -- note assumed intercept in random component
z.logit2 <- zelig(bush ~ black + female + age + tag(edu| state), data=votedata, model="logit.mixed")
summary(z.logit2)
## something goes wrong from here with setx command
#x.high <- setx(z.logit2, age=4)
#x.low <- setx(z.logit2, age=1)
#s.logit2 <- sim(z.logit2, x=x.high, x1=x.low)
#summary(s.logit2)
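
The "shrinkage estimator" comparison earlier in this post plots multilevel slopes against separate per-department regressions. The intuition is easiest to see for a random-intercept model with known variances, where a group's estimate is a weighted average of its own mean and the grand mean. The sketch below uses that textbook formula; the function name shrink.mean and all numbers are hypothetical, and lmer() estimates the variances from the data rather than taking them as given.

```r
# Partial-pooling estimate of one group's mean, treating the variances as known:
#   lambda = tau2 / (tau2 + sigma2 / n.j)
#   estimate = lambda * (group mean) + (1 - lambda) * (grand mean)
# sigma2 = within-group variance, tau2 = between-group variance
shrink.mean <- function(ybar.j, n.j, grand.mean, sigma2, tau2) {
  lambda <- tau2 / (tau2 + sigma2 / n.j) # weight placed on the group's own mean
  lambda * ybar.j + (1 - lambda) * grand.mean
}

# Same observed group mean (2) and grand mean (0), different group sizes
small.group <- shrink.mean(ybar.j = 2, n.j = 4,   grand.mean = 0, sigma2 = 4, tau2 = 1)
large.group <- shrink.mean(ybar.j = 2, n.j = 400, grand.mean = 0, sigma2 = 4, tau2 = 1)
print(c(small = small.group, large = large.group))
```

Small groups are pulled strongly toward the grand mean, while large groups keep estimates close to their own mean.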

Post #6

ReneeBK posted on 2017-1-22 03:01:17

## multiple imputation code for R ##
## Example 1 ##
library(foreign)
CCSData <- read.dta("CCS2010v2.dta", convert.factors=FALSE)
install.packages("Rcpp") # we need to do this because otherwise Amelia might load a defunct version that doesn't work
install.packages("Amelia")
library(Amelia)
## imputation ##
impute.out <- amelia(CCSData, m=5, noms=c("develop1","develop2"))
# built-in diagnostics #
summary(impute.out)
missmap(impute.out)
plot(impute.out)
overimpute(impute.out, var="age")
## save imputed datasets ##
write.amelia(obj=impute.out, file.stem = "impdata", format = "dta")
impd1 <- subset(impute.out$imputations[[1]])
impd2 <- subset(impute.out$imputations[[2]])
impd3 <- subset(impute.out$imputations[[3]])
impd4 <- subset(impute.out$imputations[[4]])
impd5 <- subset(impute.out$imputations[[5]])
install.packages("Zelig")
library(Zelig)
install.packages("ZeligChoice")
library(ZeligChoice)
#regmod1.out <- zelig(as.factor(develop1) ~ incgroup + rent + southcoast + NEP, model="mlogit", data=CCSData)
#summary(regmod1.out)
#impmod1.out <- zelig(as.factor(develop1) ~ incgroup + rent + southcoast + NEP, model="mlogit", data=mi(impd1, impd2, impd3, impd4, impd5))
#summary(impmod1.out) # not working -- we can do it by hand
install.packages("nnet")
library(nnet)
regmod1.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=CCSData))
coeffs <- cbind(t(regmod1.out$coefficients[1,]), t(regmod1.out$coefficients[2,]))
ses <- sqrt(diag(solve(regmod1.out$Hessian)))
zs <- coeffs/ses
ps <- 2*(1 - pnorm(abs(zs)))
reg.final <- t(rbind(coeffs, ses, zs, ps))
colnames(reg.final) <- c("Coeff", "SE", "Z", "P")
impmod1.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=impd1))
impmod2.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=impd2))
impmod3.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=impd3))
impmod4.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=impd4))
impmod5.out <- summary(multinom(develop1 ~ incgroup + rent + southcoast + NEP, Hess=TRUE, data=impd5))
coeffs1 <- cbind(t(impmod1.out$coefficients[1,]), t(impmod1.out$coefficients[2,]))
se1 <- sqrt(diag(solve(impmod1.out$Hessian)))
coeffs2 <- cbind(t(impmod2.out$coefficients[1,]), t(impmod2.out$coefficients[2,]))
se2 <- sqrt(diag(solve(impmod2.out$Hessian)))
coeffs3 <- cbind(t(impmod3.out$coefficients[1,]), t(impmod3.out$coefficients[2,]))
se3 <- sqrt(diag(solve(impmod3.out$Hessian)))
coeffs4 <- cbind(t(impmod4.out$coefficients[1,]), t(impmod4.out$coefficients[2,]))
se4 <- sqrt(diag(solve(impmod4.out$Hessian)))
coeffs5 <- cbind(t(impmod5.out$coefficients[1,]), t(impmod5.out$coefficients[2,]))
se5 <- sqrt(diag(solve(impmod5.out$Hessian)))
mi.coeffs <- rbind(coeffs1, coeffs2, coeffs3, coeffs4, coeffs5)
mi.se <- rbind(se1, se2, se3, se4, se5)
mi.results <- mi.meld(mi.coeffs, mi.se)
mi.results
mi.z <- mi.results$q.mi/mi.results$se.mi
mi.p <- 2*(1 - pnorm(abs(mi.z)))
mi.final <- cbind(t(mi.results$q.mi), t(mi.results$se.mi), t(mi.z), t(mi.p))
colnames(mi.final) <- c("Coeff", "SE", "Z", "P")
# without imputation
reg.final
#with imputation
mi.final
## Example 2 ##
MIData2 <- read.dta("development.dta", convert.factors=FALSE)
MIData2sub <- subset(MIData2, select=c(gxpdhlth, gini, g, glag, dictator, births, infmort, cath, moslem, femsec, fertil,country,year))
impute2.out <- amelia(MIData2sub, m=5, cs="country", ts="year", noms="dictator")
tscsPlot(impute2.out, var="infmort", cs="1")
impute3.out <- amelia(MIData2sub, m=5, cs="country", ts="year", noms="dictator", polytime=3)
tscsPlot(impute3.out, var="infmort", cs="1")
regmod2.out <- zelig(infmort ~ gxpdhlth + glag + dictator + femsec, model="ls", data=MIData2)
summary(regmod2.out)
impd1 <- subset(impute3.out$imputations[[1]])
impd2 <- subset(impute3.out$imputations[[2]])
impd3 <- subset(impute3.out$imputations[[3]])
impd4 <- subset(impute3.out$imputations[[4]])
impd5 <- subset(impute3.out$imputations[[5]])
impmod2.out <- zelig(infmort ~ gxpdhlth + glag + dictator + femsec, model="ls", data=mi(impd1, impd2, impd3, impd4, impd5))
summary(impmod2.out)
impmod2.out <- zelig(g ~ gxpdhlth + glag + dictator + femsec, model="ls", data=mi(impd1, impd2, impd3, impd4, impd5))
summary(impmod2.out)
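
The mi.meld() call in Example 1 pools coefficients and standard errors across the five imputed datasets. For a single coefficient, the standard combining rules (Rubin's rules) can be written out in a few lines of base R. The sketch below uses made-up estimates; the function name rubin.combine is hypothetical and is not part of Amelia.

```r
# Combine m estimates of one coefficient and their standard errors:
#   pooled estimate = mean of the estimates
#   total variance  = mean(se^2) + (1 + 1/m) * var(estimates)
rubin.combine <- function(q, se) {
  m <- length(q)
  q.bar   <- mean(q)           # pooled point estimate
  within  <- mean(se^2)        # average within-imputation variance
  between <- var(q)            # between-imputation variance
  total   <- within + (1 + 1 / m) * between
  c(estimate = q.bar, se = sqrt(total))
}

# Made-up estimates from three imputed datasets
pooled <- rubin.combine(q = c(1, 2, 3), se = c(1, 1, 1))
print(pooled)
```

The pooled standard error exceeds the individual ones whenever the estimates disagree across imputations, which is how the extra uncertainty from the missing data enters the results.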

Post #7

ReneeBK posted on 2017-1-22 03:02:02

## Selection models in R ##
install.packages("sampleSelection")
library(sampleSelection)
data("Mroz87")
Mroz87$kids <- (Mroz87$kids5 + Mroz87$kids618 > 0)
head(Mroz87)
# Female labor supply (lfp = labour force participation)
## Outcome equations without correcting for selection
# I() means "as-is" -- do calculation in parentheses then use as variable
## Comparison of linear regression and selection model
outcome1 <- lm(wage ~ exper, data = Mroz87)
summary(outcome1)
selection1 <- selection(selection = lfp ~ age + I(age^2) + faminc + kids + educ, outcome = wage ~ exper,
data = Mroz87, method = "2step")
summary(selection1)
plot(Mroz87$wage ~ Mroz87$exper)
curve(outcome1$coeff[1] + outcome1$coeff[2]*x, col="black", lwd=2, add=TRUE)
curve(selection1$coeff[1] + selection1$coeff[2]*x, col="orange", lwd=2, add=TRUE)
## A more complete model comparison
outcome2 <- lm(wage ~ exper + I( exper^2 ) + educ + city, data = Mroz87)
summary(outcome2)
## Correcting for selection
selection.twostep2 <- selection(selection = lfp ~ age + I(age^2) + faminc + kids + educ, outcome = wage ~ exper + I(exper^2) + educ + city,
data = Mroz87, method = "2step")
summary(selection.twostep2)
selection.mle <- selection(selection = lfp ~ age + I(age^2) + faminc + kids + educ, outcome = wage ~ exper + I(exper^2) + educ + city,
data = Mroz87, method = "mle")
summary(selection.mle)
## Heckman model selection "by hand" ##
seleqn1 <- glm(lfp ~ age + I(age^2) + faminc + kids + educ, family=binomial(link="probit"), data=Mroz87)
summary(seleqn1)
## Calculate inverse Mills ratio by hand ##
Mroz87$IMR <- dnorm(seleqn1$linear.predictors)/pnorm(seleqn1$linear.predictors)
## Outcome equation correcting for selection ##
outeqn1 <- lm(wage ~ exper + I(exper^2) + educ + city + IMR, data=Mroz87, subset=(lfp==1))
summary(outeqn1)
## compare to the sampleSelection fit -- the by-hand coefficients match, but the by-hand se's are wrong (they ignore that IMR is itself estimated)
summary(selection.twostep2)
## interpretation
## If an independent variable does not appear in the selection equation, we can interpret its beta as in linear regression
## If it does appear in the selection equation, we must calculate the marginal effect:
beta.educ.sel <- selection.twostep2$coefficients[6]
beta.educ.out <- selection.twostep2$coefficients[10]
beta.IMR <- selection.twostep2$coefficients[12]
delta <- selection.twostep2$imrDelta
marginal.effect <- beta.educ.out - beta.educ.sel * beta.IMR * delta
mr2 <- marginal.effect * Mroz87$educ
plot(Mroz87$wage ~ Mroz87$educ)
lines(mr2 ~ Mroz87$educ, type="l", col="green", lwd=2)
## Selection with a binary outcome variable
## Data from Kimball (2006)
library(foreign)
conflict.data <- read.dta("MissingLink_JPRfinal.dta", convert.factors=FALSE)
conflict.data <- na.omit(conflict.data)
head(conflict.data)
## A probit model for conflict
probit.model <- glm(conflict ~ relcap+contig+jtdem+jaut+pwrs+allform2, family=binomial(link=probit), data=conflict.data)
summary(probit.model)
install.packages("Zelig")
library(Zelig)
install.packages("ZeligChoice")
library(ZeligChoice)
selection.formula <- list(mu1 = conflict ~ relcap+contig+jtdem+jaut+pwrs,
mu2 = allform2 ~ relcap+logdist+contig+jtdem+jaut+sharerival)
selection.binary <- zelig(selection.formula, model = "bprobit", data = conflict.data)
summary(selection.binary)
x.contig <- setx(selection.binary, contig=1)
sim.binary1 <- sim(selection.binary, x = x.contig)
summary(sim.binary1)
plot(sim.binary1)
x.noncontig <- setx(selection.binary, contig=0)
sim.binary2 <- sim(selection.binary, x = x.contig, x1=x.noncontig)
summary(sim.binary2)
plot(sim.binary2)
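
The "by hand" Heckman step and the marginal-effect calculation in this post both rest on the inverse Mills ratio and its derivative term. The sketch below evaluates both in base R at a few values of the probit linear predictor. It assumes delta is defined as lambda * (lambda + z), which is the standard textbook form; the helper names and the three coefficient values are hypothetical and not taken from the Mroz87 fit.

```r
# Inverse Mills ratio at a probit linear predictor z
imr <- function(z) dnorm(z) / pnorm(z)
# Derivative term: d(imr)/dz = -imr(z) * (imr(z) + z), so delta is the negated derivative
imr_delta <- function(z) imr(z) * (imr(z) + z)

z <- c(-2, 0, 2)
lambda <- imr(z)
delta <- imr_delta(z)
print(cbind(z, lambda, delta))

# Marginal effect of a regressor that appears in both equations
# (hypothetical coefficient values, for illustration only)
beta_out <- 0.5    # coefficient in the outcome equation
gamma_sel <- 0.2   # coefficient in the selection equation
beta_imr <- -1.0   # coefficient on the inverse Mills ratio
marginal_effect <- beta_out - gamma_sel * beta_imr * delta
print(marginal_effect)
```

Because delta varies with z, the marginal effect differs across observations, which is why the code above multiplies by an observation-level delta rather than a single number.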


Post #8

ReneeBK posted on 2017-1-22 03:05:22

## Models for BTSCS, Class 5
IRdata <- read.table("OR.txt", header=TRUE, sep="\t")
IRdata <- as.data.frame(na.omit(IRdata))
View(IRdata)
## Pooled logit ##
logitmod1 <- glm(dispute ~ dem + growth + allies + contig + capratio + trade, family=binomial(link="logit"), data=IRdata, x=TRUE)
summary(logitmod1)
## Fixed effects logit ##
## the pglm package can supposedly also do this, but it seems to crash ##
install.packages("glmmML")
library(glmmML)
logitfemod1 <- glmmboot(dispute ~ dem + growth + allies + contig + capratio + trade, family=binomial(link="logit"), data=IRdata, cluster=ordyid)
summary(logitfemod1)
## list first 10 estimated fixed effects ##
logitfemod1$frail[1:10]
## how many observations in each model?
length(logitmod1$fitted.values)
dyadrep <- as.vector(table(IRdata$ordyid))
validobs <- as.numeric(logitfemod1$frail > -Inf)
sum(dyadrep * validobs)
## dynamics
## Logit with time dummy variables ##
logitmod2 <- glm(dispute ~ dem + growth + allies + contig + capratio + trade + as.factor(py), family=binomial(link="logit"), data=IRdata)
summary(logitmod2)
## Logit with time effects: cubic polynomial ##
IRdata$py2 <- (IRdata$py)^2
IRdata$py3 <- (IRdata$py)^3
logitmod3 <- glm(dispute ~ dem + growth + allies + contig + capratio + trade + py + py2 + py3, family=binomial(link="logit"), data=IRdata)
summary(logitmod3)
## plot hazard rate over time
## as.numeric() drops the 1x1 matrix so that xb can be added to the time vector below
xb <- as.numeric((colMeans(cbind(1, IRdata$dem, IRdata$growth, IRdata$allies, IRdata$contig, IRdata$capratio, IRdata$trade))) %*% logitmod3$coefficients[1:7])
time <- seq(1,50,1)
hazard <- plogis(xb + (time*logitmod3$coefficients[8]) + ((time^2)*logitmod3$coefficients[9]) + ((time^3)*logitmod3$coefficients[10]))
plot(hazard~time, type="l")


Post #9

ReneeBK posted on 2017-1-22 03:06:55

## survival models in R ##
library(foreign)
coalition.data <- read.dta("coalition.dta", convert.factors=FALSE)
coalition.data$fractionalization <- coalition.data$fractionalization/1000
## duration measured in months
olsmodel1 <- lm(duration ~ investiture + fractionalization + polarization + majority_government + crisis, data=coalition.data)
summary(olsmodel1)
install.packages("Zelig")
library(Zelig)
## the exponential survival model
exp.survival <- zelig(Surv(duration, censor12) ~ investiture + fractionalization + polarization + majority_government + crisis,
model="exp", data=coalition.data)
summary(exp.survival)
# expected values and first differences
x.minority <- setx(exp.survival, majority_government = 0)
x.majority <- setx(exp.survival, majority_government = 1)
exp.survival.sim <- sim(exp.survival, x=x.minority, x1=x.majority)
summary(exp.survival.sim)
plot(exp.survival.sim)
## The Weibull model -- currently not working in Zelig, so we can work around it
#weibull.survival <- zelig(Surv(duration, censor12) ~ identifiability + volatility + response + investiture + polarization + fractionalization + majority_government, model="weibull", data=coalition.data)
weibull.survival <- survreg(Surv(duration, censor12) ~ investiture + fractionalization + polarization + majority_government + crisis,
dist="weibull", data=coalition.data, x=TRUE) # note x=TRUE
summary(weibull.survival)
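# survreg reports scale = 1/p, so scale < 1 means p > 1 and the hazard rate increases over time
# plot hazard rate over time
p <- 1/weibull.survival$scale
t <- seq(1,60,1)
# survreg is in accelerated failure time form, so the rate is exp(-xb); as.numeric() drops the 1x1 matrix
lambda <- as.numeric(exp(-colMeans(weibull.survival$x) %*% weibull.survival$coefficients))
weibull.hazard <- lambda * p * (lambda * t)^(p-1)
plot(t, weibull.hazard, type="l")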
# expected values and first differences
x.minority <- append(colMeans(weibull.survival$x),1) # The 1 is so we can pick up the scale parameter
x.minority["majority_government"] <- 0
x.majority <- x.minority
x.majority["majority_government"] <- 1
weibull.coeffs <- append(weibull.survival$coefficients, log(weibull.survival$scale))
library(MASS) # provides mvrnorm()
betas <- mvrnorm(1000, weibull.coeffs, vcov(weibull.survival))
expect.min <- 1/exp(-x.minority %*% t(betas))
expect.maj <- 1/exp(-x.majority %*% t(betas))
weibull.fd <- expect.maj - expect.min
mean.min <- mean(expect.min)
mean.maj <- mean(expect.maj)
sd.min <- apply(expect.min,1,sd)
sd.maj <- apply(expect.maj,1,sd)
mean.fd <- mean(weibull.fd)
sd.fd <- apply(weibull.fd,1,sd)
fd.results <- rbind(cbind(mean.min, sd.min), cbind(mean.maj, sd.maj), cbind(mean.fd, sd.fd))
colnames(fd.results) <- c("Mean", "SD")
rownames(fd.results) <- c("Minority Govt", "Majority Govt", "FD")
print(fd.results)
## The Cox proportional hazard model -- once again not working in Zelig
#coxph.survival <- exp.survival <- zelig(Surv(duration, censor12) ~ identifiability + volatility + response + investiture + polarization + fractionalization + majority_government, model="coxph", data=coalition.data)
#install.packages("survival") -- already installed with Zelig
#library(survival)
coxph.survival <- coxph(Surv(duration, censor12) ~ identifiability + volatility + response + investiture + polarization + fractionalization + majority_government, data=coalition.data, x=TRUE) # note x=TRUE
summary(coxph.survival) # note reversal of signs!
# estimated survival function
plot(survfit(coxph.survival), xlab="Months", ylab="Governments Surviving")
x.minority <- colMeans(coxph.survival$x)
x.minority["majority_government"] <- 0
x.majority <- x.minority
x.majority["majority_government"] <- 1
hyp.data <- as.data.frame(rbind(x.minority,x.majority))
plot(survfit(coxph.survival, newdata=hyp.data), xlab="Months", ylab="Governments Surviving", conf.int=T, lty=c(1,2))
legend("topright", legend=c("minority govt","majority govt"), lty=c(1,2))
# interpreting in terms of expected time to failure is hard because we do not estimate a baseline
# instead look at changes in the hazard rate
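betas <- mvrnorm(1000, coef(coxph.survival), vcov(coxph.survival)) # draw from the Cox estimates; the earlier betas came from the Weibull fit
hazard.pct.change <- ((exp(x.minority %*% t(betas)) - exp(x.majority %*% t(betas)))/exp(x.minority %*% t(betas)))*100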
hazard.mean <- mean(hazard.pct.change)
hazard.sd <- apply(hazard.pct.change,1,sd)
cbind(hazard.mean, hazard.sd) # minority gov't 47% more likely to fail
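
The Weibull hazard plot in this post depends on how `survreg()` parameterizes the model: the shape is 1/scale and the linear predictor enters in accelerated failure time form. The base-R sketch below writes the hazard out under that parameterization and cross-checks it against R's own Weibull density and survival functions. The function name `weibull_hazard` and the values of `xb` and `scale` are hypothetical, not estimates from coalition.dta.

```r
# Weibull hazard under the survreg() parameterization:
# shape p = 1/scale, rate lambda = exp(-xb), h(t) = lambda * p * (lambda * t)^(p - 1)
weibull_hazard <- function(t, xb, scale) {
  p <- 1 / scale
  lambda <- exp(-xb)
  lambda * p * (lambda * t)^(p - 1)
}

# Hypothetical values, for illustration only
t <- seq(1, 60, 1)
xb <- 3
scale <- 0.8
h <- weibull_hazard(t, xb, scale)

# Cross-check: hazard = density / survival, using base R's Weibull functions
h_check <- dweibull(t, shape = 1/scale, scale = exp(xb)) /
  pweibull(t, shape = 1/scale, scale = exp(xb), lower.tail = FALSE)
print(all.equal(h, h_check))
```

With scale below 1 the shape exceeds 1, so the hazard in `h` rises with `t`; a scale above 1 would give a falling hazard.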


Post #10

ReneeBK posted on 2017-1-22 03:07:19

## 2SLS in R ##
library(foreign)
AllData <- read.dta("newdail2002v2.dta", convert.factors=FALSE)
CampaignData <- na.omit(subset(AllData, select=c(votes1st,incumb,spend_regular,spend_regularXinc,spend_public,partyquota1997,electoraK,dublin,senator,councillor,wonseat)))
## OLS ##
olsmodel <- lm(votes1st ~ incumb + spend_regular + spend_regularXinc, data=CampaignData)
summary(olsmodel)
olsmodel2 <- lm(votes1st ~ incumb + spend_regular + spend_regularXinc + spend_public, data=CampaignData)
summary(olsmodel2)
## 2SLS by hand -- this replicates models 1 and 2, Table 3 in Benoit and Laver 2008
first.stage.1 <- lm(spend_regular ~ partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
summary(first.stage.1)
CampaignData$instrumented.spending <- first.stage.1$fitted.values
CampaignData$inst.spend.inc <- CampaignData$instrumented.spending * CampaignData$incumb
second.stage.1 <- lm(votes1st ~ incumb + instrumented.spending + inst.spend.inc, data=CampaignData)
summary(second.stage.1)
first.stage.2 <- lm(spend_public ~ partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
summary(first.stage.2)
CampaignData$instrumented.pspending <- first.stage.2$fitted.values
second.stage.2 <- lm(votes1st ~ incumb + instrumented.spending + inst.spend.inc + instrumented.pspending, data=CampaignData)
summary(second.stage.2)
## Issue to consider -- is interaction also endogenous?
## Example of difference in standard errors
second.stage.1v2 <- lm(votes1st ~ instrumented.spending, data=CampaignData)
summary(second.stage.1v2)
## 2SLS ##
## compare standard errors
install.packages("AER")
library(AER)
tslsmodel <- ivreg(votes1st ~ spend_regular | partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
summary(tslsmodel)
## Complete models with 2SLS
## note incumbency is also an instrument
tslsmodel.1 <- ivreg(votes1st ~ incumb + spend_regular + spend_regularXinc | incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
summary(tslsmodel.1)
tslsmodel.2 <- ivreg(votes1st ~ incumb + spend_regular + spend_regularXinc + spend_public | incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
summary(tslsmodel.2)
## An alternate package
install.packages("sem")
library(sem)
tslsmodel.3 <- tsls(votes1st ~ incumb + spend_regular + spend_regularXinc, ~ incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
summary(tslsmodel.3)
tslsmodel.4 <- tsls(votes1st ~ incumb + spend_regular + spend_regularXinc + spend_public , ~ incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
summary(tslsmodel.4)
## 2 stage estimator with a binary dv
probit.model <- glm(wonseat ~ incumb + spend_regular + spend_regularXinc + spend_public, family=binomial(link=probit), data=CampaignData)
summary(probit.model)
## (sort of) replication of table 5
tspmodel.1 <- glm(wonseat ~ incumb + instrumented.spending + inst.spend.inc + instrumented.pspending, family=binomial(link=probit), data=CampaignData)
summary(tspmodel.1)
## 2SCML
reg.inst1 <- lm(spend_regular ~ incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
reg.inst2 <- lm(spend_regularXinc ~ incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
reg.inst3 <- lm(spend_public ~ incumb + partyquota1997 + electoraK + dublin + senator + councillor, data=CampaignData)
CampaignData$inst1 <- reg.inst1$residuals
CampaignData$inst2 <- reg.inst2$residuals
CampaignData$inst3 <- reg.inst3$residuals
tscml.model <- glm(wonseat ~ incumb + spend_regular + spend_regularXinc + spend_public + inst1 + inst2 + inst3, family=binomial(link=probit), data=CampaignData)
summary(tscml.model)
## test for endogeneity
install.packages("lmtest")
library(lmtest)
lrtest(probit.model, tscml.model)
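
The "compare standard errors" step in this post turns on one detail: running the second stage with `lm()` gives the right coefficients but computes residuals from the fitted regressor rather than the observed one. The sketch below shows the correction on simulated data in base R. All variable names and the data-generating values are assumptions made for illustration; nothing here comes from newdail2002v2.dta.

```r
set.seed(207)
n <- 2000
z <- rnorm(n)                      # instrument
u <- rnorm(n)                      # unobserved confounder
x <- 0.8 * z + u + rnorm(n)        # endogenous regressor
y <- 1 + 2 * x - 3 * u + rnorm(n)  # true slope on x is 2

# Two stages by hand
stage1 <- lm(x ~ z)
x_hat <- fitted(stage1)
stage2 <- lm(y ~ x_hat)
b <- coef(stage2)

# Naive standard errors: taken straight from the second-stage lm()
se_naive <- sqrt(diag(vcov(stage2)))

# 2SLS standard errors: residuals use the observed x, not x_hat
resid_2sls <- y - b[1] - b[2] * x
X_hat <- cbind(1, x_hat)
sigma2 <- sum(resid_2sls^2) / (n - 2)
se_2sls <- sqrt(diag(sigma2 * solve(crossprod(X_hat))))

print(rbind(coef = b, se_naive = se_naive, se_2sls = se_2sls))
ols_slope <- coef(lm(y ~ x))[2]    # OLS for comparison: biased because x is correlated with u
print(ols_slope)
```

Packaged estimators such as `ivreg()` apply this kind of residual correction internally, which is why their standard errors differ from the by-hand second stage even when the coefficients agree.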

