Thread starter: Lisrelchen

Course Notes: Applied Multilevel Modeling


#1 (OP)
Lisrelchen posted on 2016-5-29 05:29:17


Applied Multilevel Modeling

This course covers the theory and application of multilevel statistical models. Research data in the social sciences are often grouped in ways that affect our statistical analyses (e.g., in marital studies, spouses are more similar to one another than to other study participants) and that lead to interesting, substantive hypotheses (e.g., how do qualities of the relationship interact with an individual's personality?). We will focus on why such data are problematic for classical statistics and on the advantages of multilevel approaches. The seminar heavily emphasizes the practical application of multilevel models and relies on examples to show when they are needed and how they are used. We will cover the general aspects of multilevel models as well as their extension to longitudinal and multivariate data.

Note that the R code below is a modified version of the R code posted at UCLA's ATS website for Singer and Willett's Applied Longitudinal Data Analysis book: http://www.ats.ucla.edu/stat/R/examples/alda/
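Before the chapter scripts, here is a minimal sketch of the core idea the course addresses: once observations are nested within groups, an ordinary pooled regression treats correlated observations as independent, while a model with group-level random effects does not. The names below (df, y, x, group) are hypothetical placeholders, not objects from the course data:

library(nlme)

# pooled OLS ignores the grouping, so its standard errors are typically too optimistic
fit.ols <- lm(y ~ x, data = df)          # df, y, x: placeholder names

# random-intercept model: each group keeps its own baseline level of y
fit.mlm <- lme(y ~ x, random = ~ 1 | group, data = df, method = "ML")
summary(fit.mlm)
VarCorr(fit.mlm)  # between-group vs. within-group variance components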







#2
Lisrelchen posted on 2016-5-29 05:31:06
### R code to accompany Singer & Willett's ALDA book (12/27/06)
#
################################################################################
###
###                         Chapter 2
###
################################################################################
#
### NOTE: datasets can be downloaded from:
#
### http://www.ats.ucla.edu/stat/examples/alda.htm
#
### The R code below uses the "comma separated text files" linked toward the top
### of the page.
#
# inputting and printing the tolerance "person-level" data
tolerance <- read.table(file = file.choose(), sep=",", header=TRUE)
names(tolerance) <- tolower(names(tolerance))
summary(tolerance)
str(tolerance)
print(tolerance)

### NOTE: perhaps something on converting wide to long formats using reshape lib
#
tol.df2 <- reshape(tolerance, idvar = "id", direction = "long",
                  varying = list(names(tolerance)[2:6]), v.names = "tolerance")
### equivalent, spelling out the column names
tol.df2 <- reshape(tolerance, idvar = "id", direction = "long",
                  varying = list(c("tol11","tol12","tol13","tol14","tol15")), v.names = "tolerance")
tol.df2$age <- tol.df2$time + 10
### see the reshape library as well (and its functions melt() and cast())
#
### re-order the data.frame by id and time
tol.df2 <- with(tol.df2, tol.df2[order(id, time),])

### also see sort_df in the reshape library

# inputting and printing the tolerance "person-period" data
tolerance.pp <- read.table(file = file.choose(), sep=",", header=T)
names(tolerance.pp) <- tolower(names(tolerance.pp))
summary(tolerance.pp)

print(tolerance.pp[c(1:9, 76:80), ])

#    Table 2.1, p. 20.
#    Bivariate correlations among tolerance scores assessed on five occasions.

round(cor(tolerance[ , 2:6]), 2)

#    Fig. 2.2, p. 25.
#    Empirical growth plots.

library(lattice)  # needed for xyplot()
tolerance.pp$id <- factor(tolerance.pp$id)
### NOTE: changing id to a factor will print id levels in panels

trellis.device(theme = col.whitebg)
xyplot(tolerance ~ age | id, data = tolerance.pp, as.table = TRUE,
      xlab = "AGE", ylab = "TOL", pch = 16, ylim = c(0, 4))

#    Fig. 2.3, p. 27.
#    Smooth nonparametric trajectories superimposed on empirical growth plots.

### NOTE: the following won't work with panel.loess due to small n
xyplot(tolerance ~ age | id, data = tolerance.pp,
  panel = function(x, y){
    panel.xyplot(x, y, pch = 16)
    panel.lines(smooth.spline(x, y, df=4), type = "l", col = "red", lwd = 2)
  }, xlab = "AGE", ylab = "TOL", ylim=c(0, 4), as.table=T)

xyplot(tolerance ~ age | male, data = tolerance.pp, as.table = TRUE,
      xlab = "AGE", ylab = "TOL", pch = 16, ylim = c(0, 4),
      type = c("p","smooth"))
#    Table 2.2, p. 30.
#    Fitting separate within-person OLS regression models.

### alternative using lmList
library(nlme)

### fit regressions by id
tol.lis <- lmList(tolerance ~ time | id, data = tolerance.pp)

### get individual coefficients "augmented" by other variables in the data.frame
tol.lis.aug <- coef(tol.lis, augFrame = TRUE)
names(tol.lis.aug)[1] <- "intercept"
print(tol.lis.aug)

### summary of individual fits
tol.lis.sum <- summary(tol.lis)
print(tol.lis.sum)
str(tol.lis.sum)

### plot 95% CIs around intercepts and slopes
plot(intervals(tol.lis))

#    Fig. 2.4, p. 31.

### stem and leaf of intercepts
stem(coef(tol.lis)[[1]])

### stem and leaf of slopes
stem(coef(tol.lis)[[2]])

# stem plot for R squared
names(tol.lis.sum)
stem(tol.lis.sum$r.squared)

#    Fig. 2.5, p. 32.
#    Fitted OLS trajectories superimposed on the empirical growth plots.

trellis.device(theme = col.whitebg)
xyplot(tolerance ~ age | id, data = tolerance.pp,
  panel = function(x, y){
    panel.xyplot(x, y)
    panel.lmline(x, y)
  }, xlab = "AGE", ylab = "TOL", ylim=c(0, 4), as.table=T)

#    Fig. 2.6, p. 34.
#    Raw trajectories in the top panel; fitted OLS trajectories in the bottom panel.

# plot of the trajectories
trellis.device(theme = col.whitebg)
xyplot(tolerance ~ age, data = tolerance.pp, groups = id,
        panel = panel.superpose, panel.groups = panel.lmline,
        xlab = "AGE", ylab = "TOL", ylim=c(0, 4))

### alternative with mean line
trellis.device(theme = col.whitebg)
xyplot(tolerance ~ age, data = tolerance.pp, groups = id,
        panel = function(x, y, subscripts, groups){
            panel.grid()
            panel.superpose(x, y, subscripts, groups, panel.groups = "panel.lmline")
            panel.lmline(x,y, lwd=3)
            },
        xlab = "AGE", ylab = "TOL", ylim=c(0, 4)) # auto.key = TRUE adds a key, but it is too big

#    Table 2.3, p. 37.
#    Descriptive statistics of the estimates obtained by fitting the linear model by id.

# obtaining the intercepts from the linear model by id
mean(coef(tol.lis)[[1]])
sd(coef(tol.lis)[[1]])

# obtaining the slopes from the linear model by id
mean(coef(tol.lis)[[2]])
sd(coef(tol.lis)[[2]])
cor(coef(tol.lis))

#    Fig. 2.7, p. 38.
#    OLS fitted trajectories separated by levels of selected predictors.
#    The first two panels are separated by gender; the last two by level of exposure.

### convert to categorical variables (i.e., factors) to use in trellis plots
tolerance.pp$male <- factor(tolerance.pp$male, 0:1, c("Female","Male"))
### NOTE: male is coded 1 = male, so the labels run c("Female","Male")
tolerance.pp$exp.cut <- tolerance.pp$exposure >= 1.145
tolerance.pp$exp.cut <- factor(tolerance.pp$exp.cut, c(FALSE, TRUE), c("Low","High"))
### NOTE: values at or above the 1.145 cutoff are the "High" exposure group

trellis.device(theme = col.whitebg)
xyplot(tolerance ~ age | male, data = tolerance.pp, groups = id,
      xlab = "AGE", ylab = "TOL", type = "r", lwd = 2)

### with mean line
xyplot(tolerance ~ age | male, data = tolerance.pp, groups = id,
        panel = function(x, y, subscripts, groups){
            panel.grid()
            panel.superpose(x, y, subscripts, groups,
                            panel.groups = "panel.lmline")
            panel.lmline(x,y, lwd=3)
            }, xlab = "AGE", ylab = "TOL", ylim=c(0, 4))

trellis.device(theme = col.whitebg)
xyplot(tolerance ~ age | exp.cut, data = tolerance.pp, groups = id,
      xlab = "AGE", ylab = "TOL", type = "r", lwd = 2)

### with mean line
xyplot(tolerance ~ age | exp.cut, data = tolerance.pp, groups = id,
        panel = function(x, y, subscripts, groups){
            panel.grid()
            panel.superpose(x, y, subscripts, groups,
                            panel.groups = "panel.lmline")
            panel.lmline(x,y, lwd=3)
            }, xlab = "AGE", ylab = "TOL", ylim=c(0, 4))

#    Fig. 2.8, p. 40.
#    OLS estimates plotted against the predictors male and exposure.

par(mfrow=c(2,2))
plot(intercept ~ exposure, data = tol.lis.aug, pch = 16)
plot(intercept ~ male, data = tol.lis.aug, col = "yellow") # boxplot by default
plot(time ~ exposure, data = tol.lis.aug, pch = 16)
plot(time ~ male, data = tol.lis.aug, col = "yellow") # boxplot by default

### correlations between OLS intercepts/slopes and predictors
with(tol.lis.aug, cor(as.numeric(male), intercept))
with(tol.lis.aug, cor(as.numeric(male), time))
with(tol.lis.aug, cor(exposure, intercept))
with(tol.lis.aug, cor(exposure, time))

### NOTE: something on reliability, precision, and variance?
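As the melt()/cast() note in the script suggests, the wide-to-long step can also be done with melt(); a sketch using reshape2 (the successor to the reshape package), assuming the person-level tolerance data loaded above with columns id, male, exposure, and tol11-tol15:

library(reshape2)

tol.long <- melt(tolerance, id.vars = c("id", "male", "exposure"),
                 measure.vars = paste0("tol", 11:15),
                 variable.name = "age", value.name = "tolerance")
# recover numeric ages 11-15 from the column names tol11..tol15
tol.long$age <- as.numeric(sub("tol", "", tol.long$age))
tol.long <- tol.long[order(tol.long$id, tol.long$age), ]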

#3
Lisrelchen posted on 2016-5-29 05:32:17
### R code to accompany Singer & Willett's ALDA book (12/27/06)
#
################################################################################
###
###                         Chapter 3
###
################################################################################

#    Inputting and printing the early intervention data set, table 3.1, p. 48.

# reading in the opposites data
#
### NOTE: the early intervention data used in Chapter 3 was not "released" to the
###       public by the researchers; thus, the code below uses the "opposites" data
###       from Chapter 7, but the analyses map directly onto those done in the chapter
opposites <- read.table(file.choose(), header = TRUE, sep=",")
names(opposites) <- tolower(names(opposites))
summary(opposites)

### "cut" cog into low/high to mimic the program variable in the early intervention data
opposites$cog2 <- cut(opposites$cog, breaks = 2, labels = c("Low","High"))

### change id to a factor for correct labels in xyplot strips
opposites$id <- factor(opposites$id)

# Table 3.1, p. 48
print(opposites[1:12,])

#    Fig. 3.1, p. 50.
#    OLS trajectories superimposed on the empirical growth plots.

library(lattice)

trellis.device(device = pdf, height = 8.5, width = 11,
    file = "My plots.pdf", theme = col.whitebg)
xyplot(opp ~ time | id, data = opposites, type = c("g","p","r"),
        as.table = TRUE,
        layout = c(4,3,3))
### insert more graphs...
dev.off()

### for a more useful ordering, order panels by fitted intercept

trellis.device(theme = col.whitebg)
xyplot(opp ~ time | id, data = opposites, type = c("g","p","r"),
        as.table = TRUE, index.cond = function(x, y) coef(lm(y ~ x))[1])

### same, but order by slopes
xyplot(opp ~ time | id, data = opposites, type = c("g","p","r"),
        as.table = TRUE, index.cond = function(x, y) coef(lm(y ~ x))[2])

#    Fig. 3.3, p. 57.
#    Fitted OLS trajectories and stem plots of fitted initial status and fitted rate of change by id.

# fitting the linear model by id

library(nlme)
lis.3.3 <- lmList(opp ~ time | id, data = opposites)
coef.3.3 <- coef(lis.3.3, augFrame = TRUE)
sum.3.3 <- summary(lis.3.3)

# plotting the linear fit by id
trellis.device(theme = col.whitebg)
xyplot(opp ~ time, data = opposites, groups = id,
        panel = function(x, y, subscripts, groups){
            panel.grid()
            panel.superpose(x, y, subscripts, groups, panel.groups = "panel.lmline")
            panel.lmline(x,y, lwd=3)
            },
        xlab = "TIME", ylab = "OPP") # auto.key = TRUE adds a key, but it is too big

# stem plot for fitted initial value
stem(coef.3.3[,1])

# stem plot for fitted rate of change
stem(coef.3.3[,2])

# stem plot for R squared
stem(sum.3.3$r.squared)

#    Fig. 3.4, p. 59.
#    The top panel represents fitted OLS trajectories for program = 0;
#    the bottom panel represents fitted OLS trajectories for program = 1.

xyplot(opp ~ time | cog2, data = opposites, groups = id,
       type = c("g","r"))
xyplot(opp ~ time | cog2, data = opposites, groups = id,
        panel = function(x, y, subscripts, groups){
            panel.grid()
            panel.superpose(x, y, subscripts, groups, panel.groups = "panel.lmline")
            panel.loess(x,y, lwd=3)
            },
        xlab = "TIME", ylab = "OPP") # auto.key = TRUE adds a key, but it is too big

#    Table 3.3, p. 69.

model.3.3 <- lme(opp ~ cog2*time, data = opposites,
                random = ~ time | id, method = "ML",
                control = list(msVerbose = TRUE, niterEM = 0, opt = "nlminb"))
summary(model.3.3)

#    Fig. 3.5, p. 71.

pred.3.3 <- expand.grid(time = 0:3, cog2 = c("Low","High"))
pred.3.3$pred <- predict(model.3.3, pred.3.3, level = 0)

library(lattice)
trellis.device(theme = col.whitebg)
xyplot(pred ~ time, data = pred.3.3, groups = cog2,
      type = c("g","l"), lwd = 2, auto.key = list(points = FALSE, lines = TRUE))


model.3.3.1 <- update(model.3.3, opp ~ -1 + cog2/time)
summary(model.3.3.1)

library(gmodels)
fixef(model.3.3.1)
c.mat <- rbind(
            'H vs. L at T0'   = c(1, -1, 0, 0),
            'H vs. L at T1'   = c(1, -1, 1, -1),
            'H vs. L at T2'   = c(1, -1, 2, -2),
            'H vs. L at T3'   = c(1, -1, 3, -3),
            'H vs. L at T4'   = c(1, -1, 4, -4)
            )
estimable(model.3.3.1, c.mat)
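As a cross-check on the lme() fits, the same growth model can be estimated with lme4; a sketch, assuming the opposites data and the cog2 factor created above (REML = FALSE mirrors method = "ML", and the fixed-effect estimates should agree closely):

library(lme4)

# random intercept and slope for time within id, as in model.3.3
model.3.3.lmer <- lmer(opp ~ cog2 * time + (time | id),
                       data = opposites, REML = FALSE)
summary(model.3.3.lmer)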

#4
Lisrelchen posted on 2016-5-29 05:33:03
### R code to accompany Singer & Willett's ALDA book (12/27/06)
#
################################################################################
###
###                         Chapter 4
###
################################################################################

alcohol1 <- read.table(file = file.choose(), header = TRUE, sep=",")
names(alcohol1) <- tolower(names(alcohol1))
alcohol1$id <- factor(alcohol1$id)
alcohol1$coa <- factor(alcohol1$coa, 0:1, c("Not CoA", "CoA"))

#    Fig. 4.1, p. 77.
#    Empirical growth plots with superimposed OLS trajectories.

### how to subset a data.frame
mysub <- sample(levels(alcohol1$id), 5, replace = FALSE)
sample(1:20, 5)  # just demonstrating sample()
### subset the whole dataset by groups in "mysub"
mysub.df <- subset(alcohol1, id %in% mysub)
### or subset by a condition
mysub.df <- subset(alcohol1, coa == "CoA")

library(lattice)
trellis.device(theme = col.whitebg)
xyplot(alcuse ~ age | id,
        data = alcohol1[alcohol1$id %in% c(4, 14, 23, 32, 41, 56, 65, 82), ],
        type = c("p","r"),
        ylim=c(-1, 4), as.table=T)

xyplot(alcuse ~ age | id,
        data = alcohol1,
        type = c("p","r"),
        ylim=c(-1, 4), as.table=T,
        subset = alcohol1$id %in% mysub)


#    Fig. 4.2, p. 79.
#    Fitted OLS trajectories displayed separately by coa status and peer levels.

### alternative
library(nlme)
alcuse.lis <- lmList(alcuse ~ age_14 | id, data = alcohol1)
alcuse.lis.aug <- coef(alcuse.lis, augFrame = TRUE)
names(alcuse.lis.aug)[1] <- "intercept"

trellis.device(theme = col.whitebg)
xyplot(alcuse ~ age | coa,
      data = alcohol1,
      groups = id,
      type = "r", lwd = 2)

### alternative for the peer plot - coplot
coplot(alcuse ~ jitter(age) | peer, data = alcohol1,
      panel = panel.smooth, lwd=2, col = "red", span=1)

#    Table 4.1, p. 94-95.

# Model A
model.a <- lme(alcuse ~ 1, data = alcohol1, random = ~1 |id,
              method = "ML")
summary(model.a)
VarCorr(model.a)

# Model B
model.b <- lme(alcuse ~ age_14 , data = alcohol1, random = ~ age_14 | id,
           method = "ML")
summary(model.b)

# Model C
model.c <- update(model.b, alcuse ~ coa*age_14)
summary(model.c)

# Model D
model.d <- update(model.b, alcuse ~ age_14*(coa + peer))
summary(model.d)

# Model E
model.e <- update(model.b, alcuse ~ coa + peer*age_14)
summary(model.e)

# Model F
model.f <- update(model.b, alcuse ~ coa + cpeer*age_14)
summary(model.f)

# Model G
model.g <- update(model.b, alcuse ~ ccoa + cpeer*age_14)
summary(model.g)

### compare models
anova(model.a, model.b, model.c, model.d, model.e, model.f, model.g)

#    Fig. 4.3, p. 99.
#    Model B

### alternative
#
### could use predict for all these...
#
### model b
pred.b <- data.frame(age_14 = 0:2)
pred.b$pred <- predict(model.b, pred.b, level=0)
plot(pred ~ I(age_14 + 14), data = pred.b, type = "l", lwd = 2,
    ylim = c(0, 2), xlim = c(13, 17),
    ylab = "Alcuse", xlab = "Age")

### model c
pred.c <- expand.grid(age_14 = 0:2, coa = c("Not CoA","CoA"))
pred.c$pred <- predict(model.c, pred.c, level=0)
xyplot(pred ~ I(age_14 + 14), data = pred.c, type = "l", lwd = 2,
      groups = coa, lty = 1:2, auto.key = list(points = FALSE, lines = TRUE),
      ylim = c(0, 2), xlim = c(13, 17),
      ylab = "Alcuse", xlab = "Age")

### model e
#
### peer is continuous, so let's plot the regression surface
quantile(alcohol1$peer, c(0.05, 0.95))
pred.e <- expand.grid(age_14 = 0:2, coa = c("Not CoA","CoA"),
                      peer = seq(0, 2, 0.1))
pred.e$pred <- predict(model.e, pred.e, level=0)
trellis.device(theme = col.whitebg)
wireframe(pred ~ I(age_14 + 14) + peer, data = pred.e, type = "l", lwd = 2,
      groups = coa, lty = 1:2, auto.key = list(points = FALSE, lines = TRUE),
      ylim = c(0, 2), xlim = c(13, 17),
      zlab = "Alcuse", ylab = "Peer", xlab = "Age")

### NOTE: would be easier to see with a dynamic 3D graph, e.g., rgl or rggobi


#    Fig. 4.4
par(mfrow=c(2,2))
plot(intercept ~ coa, data = alcuse.lis.aug, col = "yellow")
plot(intercept ~ peer, data = alcuse.lis.aug, pch=16)
abline(h=0)
plot(age_14 ~ coa, data = alcuse.lis.aug, col = "yellow")
plot(age_14 ~ peer, data = alcuse.lis.aug, pch=16)
abline(h=0)

#    Fig. 4.5, p. 131.
#    Normality assumption plots.

### alternative
qqnorm(model.f, ~ resid(., type = "r"))

#    Upper right panel

### alternative
plot(model.f, resid(., type = "p") ~ as.numeric(id), abline = 0, pch = 16)
plot(model.f, resid(., type = "p") ~ fitted(.), abline = 0, pch = 16)

### NOTE: id was converted to a factor, and plot() wants a numeric var
#    Middle left panel

### alternative
qqnorm(model.f, ~ranef(.))

#    Middle right panel

### alternative
qqnorm(model.f, ~ranef(.), standard = TRUE, abline = c(0,1))
apply(ranef(model.f, standard = TRUE), 2, mean)
apply(ranef(model.f, standard = TRUE), 2, sd)

### NOTE: standardized REs don't have SD of 1...?

plot(model.f, ranef(., standard = TRUE)[[1]] ~ as.numeric(id), abline = 0, pch = 16)
plot(model.f, ranef(., standard = TRUE)[[2]] ~ as.numeric(id), abline = 0, pch = 16)

#    Fig. 4.6, p. 133.
#    Homoscedasticity plots.
#

### alternative
plot(model.f, resid(.) ~ age, xlim = c(13,17), abline = 0,
    type = c("p","smooth"), lwd = 2)
### NOTE: smoother suggests some nonlinearity

plot(model.f, ranef(.)[[1]] ~ coa, abline = 0, pch=16)
plot(model.f, ranef(.)[[1]] ~ peer, abline = 0, pch=16,
    type = c("p","smooth"), lwd = 2)
plot(model.f, ranef(.)[[2]] ~ coa, abline = 0, pch=16)
plot(model.f, ranef(.)[[2]] ~ peer, abline = 0, pch=16,
    type = c("p","smooth"), lwd = 2)

###   Fig. 4.7, p. 136
plot(comparePred(alcuse.lis, model.b, primary = ~ alcohol1$age_14, length.out = 2))
### NOTE: above not working quite right for some reason...

plot(augPred(model.b, level = 0:1, primary = ~ age_14, length.out = 2))
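One quantity worth reporting alongside Model A (the unconditional means model) is the intraclass correlation, the share of total outcome variance that lies between persons. This is not computed in the script above, so the following is a sketch assuming model.a has been fit as shown:

vc <- VarCorr(model.a)                               # character matrix of variance components
tau00  <- as.numeric(vc["(Intercept)", "Variance"])  # between-person variance
sigma2 <- as.numeric(vc["Residual", "Variance"])     # within-person variance
tau00 / (tau00 + sigma2)                             # the ICC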

#5
Lisrelchen posted on 2016-5-29 05:33:40
################################################################################
###
###                         Chapter 5
###
################################################################################

reading <- read.table(file = file.choose(), header=T, sep=",")
names(reading) <- tolower(names(reading))
reading$id <- factor(reading$id)
summary(reading)

#    Table 5.1, p. 141.

reading[reading$id %in% c(4, 27, 31, 33, 41, 49, 69, 77, 87), ]

#    Fig. 5.1, p. 143.
#    Empirical change plots with superimposed OLS trajectories.
#    The +'s and solid lines correspond to time using the child's target
#    age at data collection, whereas the dots and dashed lines correspond
#    to time using the child's observed age.

xyplot(piat ~ age | id,
      data = reading[reading$id %in% c(4, 27, 31, 33, 41, 49, 69, 77, 87), ],
        panel = function(x, y, subscripts){
                 panel.xyplot(x, y, pch=16)
                 panel.lmline(x,y, lty=4)
                 panel.xyplot(reading$agegrp[subscripts], y, pch=3)
                 panel.lmline(reading$agegrp[subscripts],y)
    },
    ylim=c(0, 80), as.table=T, subscripts=T)

#    Creating the centered variables called agegrp.c and age.c, p. 144.

mat2 <- reading[ ,3:4] - 6.5
dimnames(mat2)[[2]] <- c("agegrp.c", "age.c")
reading <- cbind(reading, mat2)

### or, a bit more transparently...
reading$age.c <- reading$age - 6.5
reading$agegrp.c <- reading$agegrp - 6.5

#    Table 5.2, p. 145.
#    Note: the degrees of freedom used in the calculation of the intercept
#    differ from the results in the book, and this difference in partitioning
#    also produces slight differences in the standard errors of the estimates.

# Using the agegrp variable.
lme.agegrp <- lme(piat ~ agegrp.c, reading, random = ~ agegrp.c | id, method="ML")
summary(lme.agegrp)

# Using the age variable.
lme.age <- lme(piat ~ age.c, reading, random= ~ age.c | id, method="ML")
summary(lme.age)

#    Inputting the wages data set.

wages <- read.table(file = file.choose(), header=T, sep=",")
names(wages) <- tolower(names(wages))
### NOTE: for some reason, names have underscores and not periods
names(wages) <- gsub("_",".", names(wages))

wages$id <- factor(wages$id)
wages$black <- factor(wages$black, 0:1, c("White/Latino","Black"))

#    Table 5.3, p. 147.

wages[wages$id %in% c(206, 332, 1028), c(1, 3, 2, 6, 8, 10)]

#    Table 5.4, p. 149.

# Model A
model.5a <- lme(lnw ~ exper, wages, random= ~ exper | id, method="ML",
                control = list(msVerbose = TRUE, niterEM = 200))
### NOTE: boosting EM iterations speeds convergence
summary(model.5a)

# Model B
model.5b <- update(model.5a, lnw ~ exper*hgc.9 + exper*black)
summary(model.5b)

# Model C
model.5c <- update(model.5b, lnw ~ exper + exper:black + hgc.9)
summary(model.5c)

#    Fig. 5.2, p. 150.
#    Log wage trajectories for four prototypical dropouts from model C.

pred.5c <- expand.grid(exper = 0:11, black = c("White/Latino", "Black"),
                      hgc.9 = c(0, 3))
pred.5c$pred <- predict(model.5c, pred.5c, level=0)
pred.5c$hgc.9 <- factor(pred.5c$hgc.9, c(0,3), c("9th Grade","12th Grade"))

xyplot(pred ~ exper | hgc.9, data = pred.5c, groups = black, type = "l", lwd = 2,
      ylab = "LNW", xlab = "EXPER")

#    Inputting the small wages data set.

wages.small <- read.table(file.choose(), header=T, sep=",")
names(wages.small) <- tolower(names(wages.small))
### NOTE: for some reason, names have underscores and not periods
names(wages.small) <- gsub("_",".", names(wages.small))

#    Table 5.5, p. 154.

# Model A
model.5.5a <- lme(lnw ~ exper + hgc.9 + exper:black, wages.small,
                  random = ~ exper | id, method = "ML",
                  control = list(msVerbose = TRUE, niterEM = 1000))
### NOTE: jack up EM iterations to 1000
summary(model.5.5a)

# Model C
model.5.5c <- update(model.5.5a, random = ~ 1 | id)
anova(model.5.5c, model.5.5a)

### simpler model preferred

#    Inputting the unemployment data set.

unemployment <- read.table(file = file.choose(), header=T, sep=",")
names(unemployment) <- tolower(names(unemployment))
unemployment$id <- factor(unemployment$id)

#    Table 5.6, p. 161.

unemployment[unemployment$id %in% c(7589, 55697, 67641, 65441, 53782),]

#    Table 5.7, p. 163.

# Model A
model.5.7a <- lme(cesd ~ months, data = unemployment, random = ~ months | id,
              control = list(msVerbose = TRUE, niterEM = 100), method="ML")
summary(model.5.7a)
qqnorm(model.5.7a, ~ resid(., type = "r"))
hist(resid(model.5.7a), col = "yellow")
### NOTE: bit of skew in the residuals

# Model B
model.5.7b <- update(model.5.7a, cesd ~ months + unemp)
summary(model.5.7b)

# Model C
model.5.7c <- update(model.5.7b, cesd ~ months*unemp)
summary(model.5.7c)

# Model D
model.5.7d <- update(model.5.7b, cesd ~ unemp + months:unemp,
                    random = ~ unemp + months:unemp | id,
                    control = list(msVerbose = TRUE, niterEM = 100,
                                    opt = "optim"))
summary(model.5.7d)
### NOTE: interesting that switching the optimizer to "optim" vs. "nlminb"
###       made the difference in convergence...

#    Fig. 5.3, p. 165
pred.5.3 <- data.frame(months = rep(0:15, 4),
                        unemp = c(rep(1, 16), rep(c(1,0), c(6,10)),
                                  rep(c(1,0), c(11,5)), rep(c(1,0,1), c(6, 5, 5))))
pred.5.3$quad <- factor(rep(1:4, each = nrow(pred.5.3)/4))
pred.5.3$pred <- predict(model.5.7b, pred.5.3, level=0)

trellis.device(theme = col.whitebg)
xyplot(pred ~ months | quad, data = pred.5.3, type = "l", lwd = 2,
      ylab = "CES-D", xlab = "Months since job loss")
### NOTE: with significant futzing, we could make it identical...

#    Fig. 5.4, p. 167.

pred.5.7 <- expand.grid(months = 0:15, unemp = c(0,1))
pred.5.7$pred.b <- predict(model.5.7b, pred.5.7, level = 0)
pred.5.7$pred.c <- predict(model.5.7c, pred.5.7, level = 0)
pred.5.7$pred.d <- predict(model.5.7d, pred.5.7, level = 0)

xyplot(pred.b ~ months, data = pred.5.7, groups = unemp,
      type = "l", lwd = 2, lty = 1:2, auto.key = list(lines = TRUE),
      ylab = "CES-D", xlab = "Months since job loss")
xyplot(pred.c ~ months, data = pred.5.7, groups = unemp,
      type = "l", lwd = 2, lty = 1:2, auto.key = list(lines = TRUE),
      ylab = "CES-D", xlab = "Months since job loss")
xyplot(pred.d ~ months, data = pred.5.7, groups = unemp,
      type = "l", lwd = 2, lty = 1:2, auto.key = list(lines = TRUE),
      ylab = "CES-D", xlab = "Months since job loss")

#    Table 5.8, p. 175.

# Model A
model.5.8a <- lme(lnw ~ hgc.9 + ue.7 + exper + exper:black, data = wages,
                  random = ~ exper | id, method = "ML",
                  control = list(msVerbose = TRUE, niterEM = 100))
summary(model.5.8a)

# Model B
### NOTE: don't see where ue.mean and ue.person.centered are defined in the data
model.5.8b <- update(model.5.8a, lnw ~ hgc.9 + ue.mean + ue.person.centered +
                                  exper + exper:black)
summary(model.5.8b)

# Model C
model.5.8c <- update(model.5.8b, lnw ~ hgc.9 + ue1 + ue.centert1 + exper + exper:black)
summary(model.5.8c)

#    Inputting the medication data set.

medication <- read.table(file = file.choose(), header=T, sep=",")
names(medication) <- tolower(names(medication))
medication$id <- factor(medication$id)

#    Table 5.9, p. 182.

medication[c(1:6, 11, 16:21), c(3:8)]

#    Table 5.10, p. 184.

# Using time (Model A)
model.5.10a <- lme(pos ~ treat*time, data = medication,
                    random = ~ time | id, method = "ML")
summary(model.5.10a)

# Using time - 3.33 (Model B)
model.5.10b <- update(model.5.10a, pos ~ treat*time333)
summary(model.5.10b)

# Using time - 6.67 (Model C)
model.5.10c <- update(model.5.10b, pos ~ treat*time667)
summary(model.5.10c)

#    Fig. 5.5, p. 185.
#    Note: the vertical lines reflect the magnitude of the treatment effect
#    when time is centered at different values.

days.seq <- seq(0, 7)
fixef.a <- fixef(model.5.10a)

trt <- fixef.a[[1]] + fixef.a[[2]] + (fixef.a[[3]]+fixef.a[[4]])*days.seq
cnt <- fixef.a[[1]] + fixef.a[[3]]*days.seq

### NOTE: most of these plots look better with PDF or postscript
pdf("Figure 5.5.pdf", height = 8, width = 10.5)
plot(days.seq, trt, ylim=c(140, 190), xlim=c(0, 7), type="l", lwd = 2,
     xlab="Days", ylab=expression(widehat(POS)))
lines(days.seq, cnt, lty=4, lwd = 2)
legend(0, 190, c("treatment", "control"), lty=c(1, 4), lwd = 2)
segments(0, fixef.a[[1]] + fixef.a[[3]]*0, 0,
        fixef.a[[1]] + fixef.a[[2]] + (fixef.a[[3]]+fixef.a[[4]])*0, lty = 2)
segments(3.33, fixef.a[[1]] + fixef.a[[3]]*3.33, 3.33,
        fixef.a[[1]] + fixef.a[[2]] + (fixef.a[[3]]+fixef.a[[4]])*3.33, lty = 2)
segments(6.670, fixef.a[[1]] + fixef.a[[3]]*6.670, 6.670,
        fixef.a[[1]] + fixef.a[[2]] + (fixef.a[[3]]+fixef.a[[4]])*6.670, lty = 2)
dev.off()
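Models B and C above recenter time at 3.33 and 6.67 days. If the raw medication file does not already contain time333 and time667 (an assumption about the file; check names(medication)), they take one line each to construct:

# hypothetical reconstruction, only needed if the file lacks these columns
medication$time333 <- medication$time - 3.33   # centers time at day 3.33
medication$time667 <- medication$time - 6.67   # centers time at day 6.67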

#6
Lisrelchen posted on 2016-5-29 05:34:17
################################################################################
###
###                         Chapter 6
###
################################################################################

#    Table 6.1, p. 192.
#    Creating the ged*exper variable in the wages data set.

wages$ged.exper <- wages$ged*wages$exper
print(wages[wages$id %in% c(206,2365,4384),c(1:5, 16)])

#    Table 6.2, p. 203.

model.6.2a <- lme(lnw ~ exper + hgc.9 + exper:black + ue.7, data = wages,
                  random = ~ exper | id, method = "ML",
                  control = list(msVerbose = TRUE, opt = "optim", trace = 1))
2*model.6.2a$logLik

model.6.2b <- update(model.6.2a, lnw ~ exper + hgc.9 + exper:black + ue.7 + ged,
                    random = ~ exper + ged | id)
2*model.6.2b$logLik

model.6.2c <- update(model.6.2b, random = ~ exper | id)
2*model.6.2c$logLik

anova(model.6.2a, model.6.2c, model.6.2b)

model.6.2d <- update(model.6.2a, lnw ~ exper + hgc.9 + exper:black + ue.7 + postexp,
                      random = ~ exper + postexp | id,
                      control = list(msVerbose = TRUE, niterEM = 100, opt = "optim"))
2*model.6.2d$logLik

anova(model.6.2a, model.6.2d)

model.6.2e <- update(model.6.2d, random = ~ exper| id)
2*model.6.2e$logLik

model.6.2f <- update(model.6.2a, lnw ~ exper + hgc.9 + exper:black + ue.7 + postexp + ged,
                      random = ~ exper + postexp + ged | id,
                      control = list(msVerbose = TRUE, niterEM = 500, opt = "optim"))
### NOTE: with big var-cov models, a bigger EM burn-in with optim boosts convergence
###       notably
2*model.6.2f$logLik

model.6.2g <- update(model.6.2f, random = ~ exper + ged | id)
2*model.6.2g$logLik

model.6.2h <- update(model.6.2f, random = ~ exper + postexp | id)
2*model.6.2h$logLik

model.6.2i <- update(model.6.2f, lnw ~ exper + hgc.9 + exper:black + ue.7 + ged + exper:ged,
                    random = ~ exper + ged + exper:ged | id)
2*model.6.2i$logLik

model.6.2j <- update(model.6.2i, random = ~ ged + exper | id)
2*model.6.2j$logLik

### NOTE: examine all models; the LR tests won't (all) be correct, but AIC/BIC can be
###       compared
anova(model.6.2a, model.6.2b, model.6.2c, model.6.2d, model.6.2e, model.6.2f,
      model.6.2g, model.6.2h, model.6.2i, model.6.2j)
### NOTE: AIC prefers model F, but BIC prefers several ahead of F, with E its favorite

#    Table 6.3, p. 205.
#    Summary of model F.

summary(model.6.2f)
pairs(model.6.2f, pch = ".", cex = 2)
### NOTE: notable correlations among the random effects

#    Fig. 6.4, p. 209.
#    We must first read in the alcohol data set and then run model e from table 4.1, p. 94.

### we can do these using predict
pred.6.4e <- expand.grid(age_14 = seq(0, 2, 0.1), coa = c("Not CoA","CoA"),
                      peer = c(0.655, 1.381))
pred.6.4e$pred <- predict(model.e, pred.6.4e, level=0)
pred.6.4e$pred <- pred.6.4e$pred^2  # square to go back to the original metric

trellis.device(theme = col.whitebg)
xyplot(pred ~ I(age_14 + 14) | peer, data = pred.6.4e, type = "l", lwd = 2,
      groups = coa, lty = 1:2, auto.key = list(points = FALSE, lines = TRUE, lwd=2),
      xlim = c(13, 17),
      ylab = "Alcuse", xlab = "Age")

# reading in the alcohol data
alcohol <- read.table("d:/alcohol1_pp.txt", header=T, sep=",")

# table 4.1, model e
model.e <- lme(alcuse~coa+peer*age_14 , data=alcohol, random= ~ age_14 | id,
           method="ML")
# obtaining the fixed effects parameters
fixef.e <- fixef(model.e)
# obtaining the predicted values and squaring them
fit2.ec0p0 <- (fixef.e[[1]] + .655*fixef.e[[3]] +
             alcohol$age_14[1:3]*fixef.e[[4]] +
             .655*alcohol$age_14[1:3]*fixef.e[[5]])^2
fit2.ec0p1 <- (fixef.e[[1]] + 1.381*fixef.e[[3]] +
             alcohol$age_14[1:3]*fixef.e[[4]] +
             1.381*alcohol$age_14[1:3]*fixef.e[[5]] )^2
fit2.ec1p0 <- (fixef.e[[1]] + fixef.e[[2]] + .655*fixef.e[[3]] +
             alcohol$age_14[1:3]*fixef.e[[4]] +
             .655*alcohol$age_14[1:3]*fixef.e[[5]] )^2
fit2.ec1p1 <- (fixef.e[[1]] + fixef.e[[2]] + 1.381*fixef.e[[3]] +
             alcohol$age_14[1:3]*fixef.e[[4]] +
             1.381*alcohol$age_14[1:3]*fixef.e[[5]])^2

plot(alcohol$age[1:3], fit2.ec0p0, ylim=c(0, 3), type="n",
     ylab="predicted alcuse squared", xlab="age")
lines(spline(alcohol$age[1:3], fit2.ec0p0), pch=2, type="b")
lines(spline(alcohol$age[1:3], fit2.ec0p1), type="b", pch=0)
lines(spline(alcohol$age[1:3], fit2.ec1p0), type="b", pch=17)
lines(spline(alcohol$age[1:3], fit2.ec1p1), type="b", pch=15)

title("Non-linear Change")
legend(14, 3, c("COA=0, low peer", "COA=0, high peer",
       "COA=1, low peer", "COA=1, high peer"), pch=c(2, 0, 17, 15))

#    Reading in the berkeley data set and then creating the transformed variables for iq and age.

berkeley <- read.table(file.choose(), header=T, sep=",")
names(berkeley) <- tolower(names(berkeley))
berkeley$age2.3 <- berkeley$age^(1/2.3)
berkeley$iq2.3 <- berkeley$iq^2.3

#    Fig. 6.6, p. 212.

par(mfrow=c(1,3))
plot(berkeley$age, berkeley$iq, type="p", ylim=c(0, 250), xlim=c(0, 60),
     ylab="IQ", xlab="TIME", pch=16)
plot(berkeley$age, berkeley$iq2.3, type="p", ylim=c(0, 300000), xlim=c(0, 60),
     ylab="IQ^(2.3)", xlab="TIME", pch=16)
plot(berkeley$age2.3, berkeley$iq, type="p", ylim=c(0, 250), xlim=c(0, 6),
     ylab="IQ", xlab="TIME^(1/2.3)", pch=16)

### Table 6.4?

#    Reading in the external data set.

external <- read.table(file.choose(), header = TRUE, sep=",")
names(external) <- tolower(names(external))

#    Creating the higher-order variables for grade.
#
### NOTE: not needed...
external$grade2 <- external$grade^2
external$grade3 <- external$grade^3
external$grade4 <- external$grade^4

#    Fig. 6.7, p. 218.

par(mfrow=c(2,4))
model.6.7a <- lm(external ~ poly(grade, 2), data = external,
                  subset = id == "1")
plot(1:6, fitted(model.6.7a), type = "l", lwd = 2, col = "red",
    ylim = c(0,60), ylab = "EXTERNAL", xlab = "GRADE")
points(1:6, subset(external, id == "1", external)[[1]], pch = 16)
lines(spline(1:6, subset(external, id == "1", external)[[1]]), lty = 3)

model.6.7b <- lm(external ~ poly(grade, 2), data = external,
                  subset = id == "6")
plot(1:6, fitted(model.6.7b), type = "l", lwd = 2, col = "red",
    ylim = c(0,60), ylab = "EXTERNAL", xlab = "GRADE")
points(1:6, subset(external, id == "6", external)[[1]], pch = 16)
lines(spline(1:6, subset(external, id == "6", external)[[1]]), lty = 3)

model.6.7c <- lm(external ~ grade, data = external,
                  subset = id == "11")
plot(1:6, fitted(model.6.7c), type = "l", lwd = 2, col = "red",
    ylim = c(0,60), ylab = "EXTERNAL", xlab = "GRADE")
points(1:6, subset(external, id == "11", external)[[1]], pch = 16)
lines(spline(1:6, subset(external, id == "11", external)[[1]]), lty = 3)
### NOTE: looks like a quadratic would fit better...

model.6.7d <- lm(external ~ grade, data = external,
                  subset = id == "25")
plot(1:6, fitted(model.6.7d), type = "l", lwd = 2, col = "red",
    ylim = c(0,60), ylab = "EXTERNAL", xlab = "GRADE")
points(1:6, subset(external, id == "25", external)[[1]], pch = 16)
lines(spline(1:6, subset(external, id == "25", external)[[1]]), lty = 3)

model.6.7e <- lm(external ~ poly(grade, 3), data = external,
                  subset = id == "34")
plot(1:6, fitted(model.6.7e), type = "l", lwd = 2, col = "red",
    ylim = c(0,60), ylab = "EXTERNAL", xlab = "GRADE")
points(1:6, subset(external, id == "34", external)[[1]], pch = 16)
lines(spline(1:6, subset(external, id == "34", external)[[1]]), lty = 3)

model.6.7f <- lm(external ~ poly(grade, 4), data = external,
                  subset = id == "36")
plot(1:6, fitted(model.6.7f), type = "l", lwd = 2, col = "red",
    ylim = c(0,60), ylab = "EXTERNAL", xlab = "GRADE")
points(1:6, subset(external, id == "36", external)[[1]], pch = 16)
lines(spline(1:6, subset(external, id == "36", external)[[1]]), lty = 3)

model.6.7g <- lm(external ~ poly(grade, 2), data = external,
                  subset = id == "40")
plot(1:6, fitted(model.6.7g), type = "l", lwd = 2, col = "red",
    ylim = c(0,60), ylab = "EXTERNAL", xlab = "GRADE")
points(1:6, subset(external, id == "40", external)[[1]], pch = 16)
lines(spline(1:6, subset(external, id == "40", external)[[1]]), lty = 3)

model.6.7h <- lm(external ~ poly(grade, 2), data = external,
                  subset = id == "26")
plot(1:6, fitted(model.6.7h), type = "l", lwd = 2, col = "red",
    ylim = c(0,60), ylab = "EXTERNAL", xlab = "GRADE")
points(1:6, subset(external, id == "26", external)[[1]], pch = 16)
lines(spline(1:6, subset(external, id == "26", external)[[1]]), lty = 3)

### alternative in a for loop
ex.ids <- c("1","6","11","25","34","36","40","26")

par(mfrow=c(2,4))
for (i in seq_along(ex.ids)){
    fit <- lm(external ~ poly(grade, 4), data = external,
                  subset = id == ex.ids[i])
    pred <- predict(fit, data.frame(grade = seq(1, 6, 0.1)))
    plot(x = seq(1, 6, 0.1), y = pred, type = "l", lwd = 2, col = "red",
          ylim = c(0,60), ylab = "EXTERNAL", xlab = "GRADE")
    points(1:6, subset(external, id == ex.ids[i], external)[[1]], pch = 16)
    lines(spline(1:6, subset(external, id == ex.ids[i], external)[[1]]), lty = 3)
    }
### NOTE: could use drop1() to find the best fit if we wanted...

#    Table 6.5, p. 221.
#    Comparison of alternative polynomial change trajectories fit to the external data set.

model.6.5a <- lme(external ~ 1, random =  ~ 1 | id, method = "ML", external)
summary(model.6.5a)

model.6.5b <- update(model.6.5a, external ~ time, random =  ~ time | id)
summary(model.6.5b)
### NOTE: could use poly(time, degree) for orthogonal polynomials

model.6.5c <- update(model.6.5a, external ~ time + I(time^2),
                      random =  ~ time + I(time^2) | id,
                      control = list(msVerbose = TRUE, niterEM = 100, opt = "optim"))
summary(model.6.5c)

model.6.5d <- update(model.6.5c, external ~ time + I(time^2) + I(time^3),
                    random = ~ time + I(time^2) + I(time^3) | id)
summary(model.6.5d)
pairs(model.6.5d, pch = 16) # super high correlations

model.6.5d.1 <- update(model.6.5c, external ~ poly(time, 3),
                    random = ~ poly(time, 3) | id)
summary(model.6.5d.1)
### NOTE: converges faster with orthogonal polynomials, which remove non-essential
###       correlation

#    Reading in the fox and geese data.

fg.df <- read.table(file.choose(), header = TRUE, sep=",")
names(fg.df) <- tolower(names(fg.df))
fg.df$id <- factor(fg.df$id)
str(fg.df)

#    Fig. 6.8, p. 227.
#    Empirical growth plots for 8 children in the fox and geese data.

xyplot(nmoves ~ game | id,
        data=fg.df[fg.df$id %in% c(1, 4, 6, 7, 8, 11, 12, 15), ],
        ylim=c(0, 25), as.table = T, pch = 16)

#    Table 6.6, p. 231.
#    Fitting a logistic model to the fox and geese data.

model.6.6a <- nlme(nmoves ~ 1 + 19/(1 + xmid*exp( -scal*game + u)),
                    fixed = scal + xmid ~ 1,
                    random = scal + u ~ 1 |id,
                    start = c(scal = 0.2, xmid = 12),
                    data = fg.df,
                    control = list(msVerbose = TRUE, niterEM = 100))
summary(model.6.6a)


model.6.6b <- nlme(nmoves ~ 1 + 19/(1 + xmid*exp(-scal10*game - scal01*read.c - scal11*readc.game + u)),
                    fixed = scal10 + scal01 + scal11 + xmid ~ 1,
                    random = scal10 + u ~ 1 |id,
                    start = c(scal10 = 0.12, scal01 = -0.4, scal11 = 0.04, xmid = 12),
                    data = fg.df)
summary(model.6.6b)
### NOTE: need to check the book for read.c and readc.game

#    Fig. 6.10, p. 232.

fixef.a <- fixef(model.6.6a)
fit.a <- 1 + 19/(1 + fixef.a[[2]]*exp(-fixef.a[[1]]*fg.df$game[1:27]))

plot(fg.df$game[1:27], fit.a, ylim=c(0, 25), type="l",
     ylab="predicted nmoves", xlab="game")
title("Model A \n Unconditional logistic growth")

fixef.b <- fixef(model.6.6b)
fit.b.high <- 1 + 19/(1+fixef.b[[4]]*exp(-fixef.b[[1]]*fg.df$game[1:27] -
                 fixef.b[[2]]*1.58 - fixef.b[[3]]*1.58*fg.df$game[1:27]))

fit.b.low <- 1 + 19/(1+fixef.b[[4]]*exp(-fixef.b[[1]]*fg.df$game[1:27] -
                 fixef.b[[2]]*(-1.58) - fixef.b[[3]]*(-1.58)*fg.df$game[1:27]))

plot(fg.df$game[1:27], fit.b.high, ylim=c(0, 25), type="l",
     ylab="predicted nmoves", xlab="game")
lines(fg.df$game[1:27], fit.b.low, lty=3)

title("Model B \n Fitted logistic growth by reading level")
legend(1, 25, c("High reading level","Low reading level"), lty=c(1, 3))
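A more compact way to draw the Model A curve in Fig. 6.10 is curve(), evaluating the fitted logistic function directly. A sketch, assuming model.6.6a converged and that its fixed effects are named scal and xmid (check names(fixef(model.6.6a))):

fixef.a <- fixef(model.6.6a)
# plug the fixed effects into the logistic growth function, random effects set to zero
curve(1 + 19/(1 + fixef.a[["xmid"]] * exp(-fixef.a[["scal"]] * x)),
      from = 1, to = 27, ylim = c(0, 25),
      ylab = "predicted nmoves", xlab = "game")
title("Model A \n Unconditional logistic growth")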

#7
Lisrelchen posted on 2016-5-29 05:34:56
################################################################################
###
###                         Chapter 7
###
################################################################################

opposites <- read.table(file.choose(), header = TRUE, sep=",")
names(opposites) <- tolower(names(opposites))
opposites$id <- factor(opposites$id)

#    Table 7.2, p. 246.

opp.reml <- lme(opp ~ time*ccog,
                data = opposites,
                random = ~ time | id,
                control = list(msVerbose = TRUE, niterEM = 100, opt = "optim"))
summary(opp.reml)

#    Table 7.3, p. 258-259.

### NOTE: why did S&W drop the random effects in fitting these models?
#
# compound symmetry
fit.cs <- gls(opp ~ time * ccog, data = opposites,
              correlation = corCompSymm(form =  ~ time | id))
summary(fit.cs) ; 2 * logLik(fit.cs) ; fit.cs$sigma^2

# autoregressive
fit.ar1 <- gls(opp ~ time * ccog, data = opposites,
                corr = corAR1(form =  ~ wave | id))
2 * logLik(fit.ar1) ; fit.ar1$sigma^2

#    Table 7.4, p. 265.

# standard error covariance structure
summary(opp.reml)

# unstructured error covariance structure
opp.unstr <- update(opp.reml, correlation = corSymm(form =  ~ wave | id))
summary(opp.unstr)

### additional analyses
#
### ACF plot from the random intercept/slope model
plot(ACF(opp.reml), alpha = 0.01)

### add AR1
opp.reml.ar1 <- update(opp.reml, corr = corAR1(form =  ~ wave | id))
anova(opp.reml, opp.reml.ar1)

### not preferred

anova(fit.cs, fit.ar1, opp.reml, opp.reml.ar1, opp.unstr)

### either the gls with AR1 or random intercepts/slopes
#
### NOTE: possibly examine the variogram and other correlation methods
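Following up on the closing note, nlme's Variogram() is one way to examine residual serial correlation; a sketch, assuming opp.reml from above:

# semivariogram of the normalized within-id residuals against time separation;
# a roughly flat pattern is consistent with the fitted covariance structure
plot(Variogram(opp.reml, form = ~ time | id, resType = "normalized"))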

#8
richardgu26 posted on 2016-5-29 07:01:56

#9
dingyuezhang posted on 2016-5-29 08:06:55
Thanks for sharing, OP!

#10
frankly1020 (employment verified) posted on 2016-5-29 10:42:47
Great book, studying it now!
