Thread starter: Lisrelchen

Jared E. Knowles: Fun with merMod Objects


Original post
Lisrelchen posted on 2016-7-2 23:46:16

Fun with merMod Objects
Jared E. Knowles
Saturday, May 17, 2014

Introduction

First of all, be warned: the terminology surrounding multilevel models is vastly inconsistent. For example, multilevel models themselves may be referred to as hierarchical linear models, random effects models, multilevel models, random intercept models, random slope models, or pooling models. Depending on the discipline, the software used, and the academic literature, many of these terms may refer to the same general modeling strategy. In this tutorial I will attempt to provide a user guide to multilevel modeling by demonstrating how to fit multilevel models in R and by connecting the model fitting procedure to the terminology commonly used for these models.

We will cover the following topics:

  • The structure and methods of merMod objects
  • Extracting random effects from merMod objects
  • Plotting and interpreting merMod objects

If you haven't already, make sure you head over to the Getting Started With Multilevel Models tutorial to ensure you have set up your environment correctly and installed all the necessary packages. The tl;dr is that you will need (a quick setup sketch follows this list):

  • A current version of R (2.15 or greater)
  • The lme4 package (install.packages("lme4"))
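As a minimal setup sketch (not part of the original post; the version check and the extra packages are just conveniences for following along):

# Install the packages used in this tutorial if they are missing, then load them.
# Assumes an internet connection for install.packages().
pkgs <- c("lme4", "arm", "lattice")
missing <- pkgs[!pkgs %in% rownames(installed.packages())]
if (length(missing) > 0) install.packages(missing)
invisible(lapply(pkgs, library, character.only = TRUE))
getRversion()  # should be 2.15 or greater for lme4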



Reply #2
Lisrelchen posted on 2016-7-2 23:47:02
Read in the data

Multilevel models are appropriate for a particular kind of data structure where units are nested within groups (generally 5+ groups) and where we want to model the group structure of the data. We will use data from Jon Starkweather at the University of North Texas on student social attributes within schools and classes. Visit the excellent tutorial available here for more.

library(lme4) # load library
library(arm) # convenience functions for regression in R
lmm.data <- read.table("http://www.unt.edu/rss/class/Jon/R_SC/Module9/lmm.data.txt",
   header=TRUE, sep=",", na.strings="NA", dec=".", strip.white=TRUE)
#summary(lmm.data)
head(lmm.data)
##   id extro  open agree social class school
## 1  1 63.69 43.43 38.03  75.06     d     IV
## 2  2 69.48 46.87 31.49  98.13     a     VI
## 3  3 79.74 32.27 40.21 116.34     d     VI
## 4  4 62.97 44.41 30.51  90.47     c     IV
## 5  5 64.25 36.86 37.44  98.52     d     IV
## 6  6 50.97 46.26 38.83  75.22     d      I

Here we have data on the extroversion of subjects nested within classes and within schools.

Let's understand the structure of the data a bit before we begin:

str(lmm.data)
## 'data.frame':    1200 obs. of  7 variables:
##  $ id    : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ extro : num  63.7 69.5 79.7 63 64.2 ...
##  $ open  : num  43.4 46.9 32.3 44.4 36.9 ...
##  $ agree : num  38 31.5 40.2 30.5 37.4 ...
##  $ social: num  75.1 98.1 116.3 90.5 98.5 ...
##  $ class : Factor w/ 4 levels "a","b","c","d": 4 1 4 3 4 4 4 4 1 2 ...
##  $ school: Factor w/ 6 levels "I","II","III",..: 4 6 6 4 4 1 3 4 3 1 ...

Here we see we have two possible grouping variables – class and school. Let's explore them a bit further:

table(lmm.data$class)
##
##   a   b   c   d
## 300 300 300 300
table(lmm.data$school)
##
##   I  II III  IV   V  VI
## 200 200 200 200 200 200
table(lmm.data$class, lmm.data$school)
##
##      I II III IV  V VI
##   a 50 50  50 50 50 50
##   b 50 50  50 50 50 50
##   c 50 50  50 50 50 50
##   d 50 50  50 50 50 50

This is a perfectly balanced dataset. In all likelihood you aren't working with a perfectly balanced dataset, but we'll explore the implications of that in the future. For now, let's plot the data a bit. Using the excellent xyplot function in the lattice package, we can explore the relationship between schools and classes across our variables (a numeric summary sketch follows the code).

require(lattice)
xyplot(extro ~ open + social + agree | class, data = lmm.data,
                 auto.key = list(x = .85, y = .035, corner = c(0, 0)),
       layout = c(4,1), main = "Extroversion by Class")
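As a complementary, hedged sketch (not part of the original post), we can also summarize extroversion numerically by group using base R:

# Mean extroversion by school and by class, using base R's aggregate()
aggregate(extro ~ school, data = lmm.data, FUN = mean)
aggregate(extro ~ class, data = lmm.data, FUN = mean)
# A two-way table of cell means, class by school
with(lmm.data, tapply(extro, list(class, school), mean))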

Reply #3
Lisrelchen posted on 2016-7-2 23:51:24
By school we see that students are tightly grouped, but that school I and school VI show substantially more dispersion than the other schools. The same pattern among our predictors holds between schools as it did between classes. Let's put it all together:

xyplot(extro ~ open + social + agree | school + class, data = lmm.data,
                 auto.key = list(x = .85, y = .035, corner = c(0, 0)),
       main = "Extroversion by School and Class")

Reply #4
Lisrelchen posted on 2016-7-2 23:52:21
Exploring the Internals of a merMod Object

In the last tutorial we fit a series of random intercept models to our nested data. We will examine the lmerMod object produced when we fit this model in much more depth in order to understand how to work with mixed effect models in R. We start by fitting the basic example below, with observations grouped by school:

MLexamp1 <- lmer(extro ~ open + agree + social + (1|school), data=lmm.data)
class(MLexamp1)
## [1] "lmerMod"
## attr(,"package")
## [1] "lme4"

First, we see that MLexamp1 is now an R object of the class lmerMod. This lmerMod object is an S4 class, and to explore its structure, we use slotNames:

slotNames(MLexamp1)
##  [1] "resp"    "Gp"      "call"    "frame"   "flist"   "cnms"    "lower"  
##  [8] "theta"   "beta"    "u"       "devcomp" "pp"      "optinfo"
#showMethods(classes="lmerMod")

Within the lmerMod object we see a number of objects that we may wish to explore. To look at any of these, we can simply type MLexamp1@ and then the slot name itself. For example:

MLexamp1@call # returns the model call
## lmer(formula = extro ~ open + agree + social + (1 | school),
##     data = lmm.data)
MLexamp1@beta # returns the betas
## [1] 59.116514  0.009751  0.027788 -0.002151
class(MLexamp1@frame) # returns the class for the frame slot
## [1] "data.frame"
head(MLexamp1@frame) # returns the model frame
##   extro  open agree social school
## 1 63.69 43.43 38.03  75.06     IV
## 2 69.48 46.87 31.49  98.13     VI
## 3 79.74 32.27 40.21 116.34     VI
## 4 62.97 44.41 30.51  90.47     IV
## 5 64.25 36.86 37.44  98.52     IV
## 6 50.97 46.26 38.83  75.22      I

The merMod object has a number of methods available – too many to enumerate here. But we will go over some of the more common ones in the list below:

methods(class = "merMod")
##  [1] anova.merMod*        as.function.merMod*  coef.merMod*        
##  [4] confint.merMod       deviance.merMod*     df.residual.merMod*
##  [7] drop1.merMod*        extractAIC.merMod*   extractDIC.merMod*  
## [10] family.merMod*       fitted.merMod*       fixef.merMod*      
## [13] formula.merMod*      isGLMM.merMod*       isLMM.merMod*      
## [16] isNLMM.merMod*       isREML.merMod*       logLik.merMod*      
## [19] model.frame.merMod*  model.matrix.merMod* ngrps.merMod*      
## [22] nobs.merMod*         plot.merMod*         predict.merMod*     
## [25] print.merMod*        profile.merMod*      qqmath.merMod*      
## [28] ranef.merMod*        refit.merMod*        refitML.merMod*     
## [31] residuals.merMod*    sigma.merMod*        simulate.merMod*   
## [34] summary.merMod*      terms.merMod*        update.merMod*      
## [37] VarCorr.merMod*      vcov.merMod          weights.merMod*     
##
##    Non-visible functions are asterisked

A common need is to extract the fixed effects from a merMod object. fixef extracts a named numeric vector of the fixed effects, which is handy.

fixef(MLexamp1)
## (Intercept)        open       agree      social
##   59.116514    0.009751    0.027788   -0.002151

If you want to get a sense of the p-values or statistical significance of these parameters, first consult the lme4 help by running ?mcmcsamp for a rundown of various ways of doing this. One convenient way built into the package is:

confint(MLexamp1, level = 0.99)
##                0.5 %   99.5 %
## .sig01       4.91840 23.88758
## .sigma       2.53287  2.81456
## (Intercept) 46.27751 71.95610
## open        -0.02465  0.04415
## agree       -0.01164  0.06721
## social      -0.01493  0.01063

From here we can see first that the intervals for our fixed effect parameters overlap 0, indicating no evidence of an effect. We can also see that .sig01, which is our estimate of the variability in the random effect, is very large and very widely defined. This indicates we may have a lack of precision between our groups – either because the group effect is small between groups, we have too few groups to get a more precise estimate, we have too few units within each group, or a combination of all of the above. A sketch of how to pull out the estimated variance components directly follows.
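As a hedged aside (not part of the original post), the variance components themselves can be extracted with VarCorr, which is often easier to read than the .sig01 labels in the confint output:

# Extract the estimated variance components from the fitted model.
# .sig01 in the confint output corresponds to the school-level
# standard deviation reported here.
VarCorr(MLexamp1)                 # printed summary of random-effect SDs
as.data.frame(VarCorr(MLexamp1))  # grp, var1, var2, vcov, sdcor columns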

Another common need is to extract the residual standard error, which is necessary for calculating effect sizes. To get a named vector of the residual standard error:

sigma(MLexamp1)
## [1] 2.671

For example, it is common practice in education research to standardize fixed effects into "effect sizes" by dividing the fixed effect parameters by the residual standard error, which can be accomplished in lme4 easily:

fixef(MLexamp1) / sigma(MLexamp1)
## (Intercept)        open       agree      social
##  22.1336705   0.0036508   0.0104042  -0.0008055

From this, we can see that our predictors of openness, agreeableness, and social are virtually useless in predicting extroversion – as our plots showed. Let's turn our attention to the random effects next.

Explore Group Variation and Random Effects

In all likelihood you fit a mixed-effect model because you are directly interested in the group-level variation in your model. It is not immediately clear how to explore this group-level variation from the results of summary.merMod. What we get from this output is the variance and the standard deviation of the group effect, but we do not get effects for individual groups. This is where the ranef function comes in handy.

ranef(MLexamp1)
## $school
##     (Intercept)
## I       -14.091
## II       -6.183
## III      -1.971
## IV        1.966
## V         6.331
## VI       13.948

Running the ranef function gives us the intercepts for each school, but not much additional information – for example, the precision of these estimates. To do that, we need some additional commands:

re1 <- ranef(MLexamp1, condVar=TRUE) # save the ranef.mer object
class(re1)
## [1] "ranef.mer"
attr(re1[[1]], which = "postVar")
## , , 1
##
##         [,1]
## [1,] 0.03565
##
## , , 2
##
##         [,1]
## [1,] 0.03565
##
## , , 3
##
##         [,1]
## [1,] 0.03565
##
## , , 4
##
##         [,1]
## [1,] 0.03565
##
## , , 5
##
##         [,1]
## [1,] 0.03565
##
## , , 6
##
##         [,1]
## [1,] 0.03565

The ranef.mer object is a list which contains a data.frame for each group level. The data.frame contains the random effects for each group (here we only have an intercept for each school). When we ask lme4 for the conditional variance of the random effects, it is stored in an attribute of those data.frames as a list of variance-covariance matrices. A sketch of how to turn these into approximate standard errors follows.
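As a hedged sketch (not part of the original post; the data.frame built here is purely illustrative), the conditional variances can be converted into approximate standard errors and lined up with the random intercepts:

# Pair each school's random intercept with an approximate standard
# error taken from the square root of its conditional variance.
pv <- attr(re1$school, "postVar")          # 1 x 1 x 6 array of variances
school_re <- data.frame(
  school    = rownames(re1$school),
  intercept = re1$school[, "(Intercept)"],
  approx_se = sqrt(pv[1, 1, ])             # one SE per school
)
school_re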

This structure is indeed complicated, but it is powerful as it allows for nested, grouped, and cross-level random effects. Also, the creators of lme4 have provided users with some simple shortcuts to get what they are really interested in out of a ranef.mer object.

re1 <- ranef(MLexamp1, condVar=TRUE, whichel = "school")
print(re1)
## $school
##     (Intercept)
## I       -14.091
## II       -6.183
## III      -1.971
## IV        1.966
## V         6.331
## VI       13.948
##
## with conditional variances for "school"
dotplot(re1)
## $school

[Figure: caterpillar plot (dotplot) of the school random intercepts with interval estimates]

This graphic shows a dotplot of the random effect terms, also known as a caterpillar plot. Here you can clearly see the effect of each school on extroversion, as well as their standard errors, to help identify how distinct the random effects are from one another. Interpreting random effects is notably tricky, but for assistance I would recommend looking at a few of these resources:

  • Gelman and Hill 2006 – Data Analysis Using Regression and Multilevel/Hierarchical Models
  • John Fox – An R Companion to Applied Regression web appendix

Reply #5
Lisrelchen posted on 2016-7-2 23:52:45
Using Simulation and Plots to Explore Random Effects

A common econometric approach is to create what are known as empirical Bayes estimates of the group-level terms. Unfortunately, there is not much agreement about what constitutes a proper standard error for the random effect terms or even how to consistently define empirical Bayes estimates.[1] However, in R there are a few additional ways to get estimates of the random effects that provide the user with information about the relative sizes of the effects for each unit and the precision of those estimates. To do this, we use the sim function in the arm package.[2]

# A function to extract simulated estimates of random effect parameters from
# lme4 objects using the sim function in arm
# whichel = the character name of the grouping effect to extract estimates for
# nsims = the number of simulations to pass to sim
# x = model object
REsim <- function(x, whichel=NULL, nsims){
  require(plyr)
  mysim <- sim(x, n.sims = nsims)
  if(missing(whichel)){
    dat <- plyr::adply(mysim@ranef[[1]], c(2, 3), plyr::each(c(mean, median, sd)))
    warning("Only returning 1st random effect because whichel not specified")
  } else{
    dat <- plyr::adply(mysim@ranef[[whichel]], c(2, 3), plyr::each(c(mean, median, sd)))
  }
  return(dat)
}

REsim(MLexamp1, whichel = "school", nsims = 1000)
##    X1          X2    mean  median    sd
## 1   I (Intercept) -14.053 -14.093 3.990
## 2  II (Intercept)  -6.141  -6.122 3.988
## 3 III (Intercept)  -1.934  -1.987 3.981
## 4  IV (Intercept)   2.016   2.051 3.993
## 5   V (Intercept)   6.378   6.326 3.981
## 6  VI (Intercept)  13.992  14.011 3.987

For each school, the REsim function returns the level name (X1), the estimate name (X2), the mean of the estimated values, the median, and the standard deviation of the estimates. A sketch of how to turn these into rough interval estimates follows.
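As a hedged follow-up (not part of the original post; the object name re_sims is illustrative), the simulation means and standard deviations can be turned into rough normal-approximation intervals:

# Rough 95% intervals from the simulated means and standard deviations,
# assuming approximate normality of the simulated effects.
re_sims <- REsim(MLexamp1, whichel = "school", nsims = 1000)
re_sims$lower <- re_sims$mean - 1.96 * re_sims$sd
re_sims$upper <- re_sims$mean + 1.96 * re_sims$sd
re_sims[, c("X1", "mean", "lower", "upper")]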

Another convenience function can help us plot these results to see how they compare to the results of dotplot:

# dat = results of REsim
# scale = factor to multiply sd by
# var = character of "mean" or "median"
# sd = character of "sd"
plotREsim <- function(dat, scale, var, sd){
  require(eeptools)
  dat[, sd] <- dat[, sd] * scale
  dat[, "ymax"] <- dat[, var] + dat[, sd]
  dat[, "ymin"] <- dat[, var] - dat[, sd]
  dat[order(dat[, var]), "id"] <- c(1:nrow(dat))
  ggplot(dat, aes_string(x = "id", y = var, ymax = "ymax",
                         ymin = "ymin")) +
    geom_pointrange() + theme_dpi() +
    labs(x = "Group", y = "Effect Range", title = "Effect Ranges") +
    theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
          axis.text.x = element_blank(), axis.ticks.x = element_blank()) +
    geom_hline(yintercept = 0, color = I("red"), size = I(1.1))
}

plotREsim(REsim(MLexamp1, whichel = "school", nsims = 1000), scale = 1.2,
          var = "mean", sd = "sd")

[Figure: point-range plot of the simulated school effects with a zero reference line]

This presents a more conservative view of the variation between random effect components. Depending on how your data were collected and your research question, alternative ways of estimating these effect sizes are possible. However, proceed with caution.[3] One quick alternative is sketched below.
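For instance, as a hedged sketch (not part of the original post), the arm package also exposes se.ranef, which reports standard errors derived from the conditional variances rather than from simulation; comparing the two is a useful sanity check:

# Conditional-variance-based standard errors of the random effects,
# to compare against the simulation-based sd column from REsim.
se.ranef(MLexamp1)$school
# And the corresponding point estimates:
ranef(MLexamp1)$school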

Another approach recommended by the authors of lme4 involves the RLRsim package. Using this package we can test whether or not inclusion of the random effects improves model fit, and we can evaluate the p-value of additional random effect terms using a likelihood ratio test based on simulation.[4]

library(RLRsim)
m0 <- lm(extro ~ agree + open + social, data =lmm.data) # fit the null model
exactLRT(m = MLexamp1, m0 = m0)
##
##  simulated finite sample distribution of LRT. (p-value based on
##  10000 simulated values)
##
## data:  
## LRT = 2958, p-value < 2.2e-16

Here exactLRT issues a warning because we originally fit the model with REML instead of full maximum likelihood. Fortunately, the refitML function in lme4 allows us to easily refit our model using full maximum likelihood and conduct an exact test.

mA <- refitML(MLexamp1)
exactLRT(m= mA, m0 = m0)
##
##  simulated finite sample distribution of LRT. (p-value based on
##  10000 simulated values)
##
## data:  
## LRT = 2958, p-value < 2.2e-16

Here we can see that the inclusion of our grouping variable is significant, even though the effect of each individual group may be substantively small and/or imprecisely measured. This is important in understanding the correct specification of the model. Our next tutorial will cover specification tests like this in more detail. A quick information-criterion comparison is sketched below.
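As a hedged aside (not part of the original post), an information-criterion comparison of the ML-fitted multilevel model against the pooled linear model tells a similar story:

# Compare the pooled OLS model with the ML-refitted multilevel model
# using AIC and BIC (both models were fit by maximum likelihood).
AIC(m0, mA)
BIC(m0, mA)
# A lower AIC/BIC for mA favors keeping the school random intercept.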

Reply #6
Lisrelchen posted on 2016-7-2 23:53:15
Why do Random Effects Matter?

How do we interpret the substantive impact of our random effects? This is often critical in observational work trying to use a multilevel structure to understand the impact that the grouping can have on the individual observation. To do this, we select 12 random cases and then simulate their predicted value of extro if they were placed in each of the 6 schools. Note that this is a very simple simulation, using just the mean of the fixed effects and the conditional mode of the random effect, and not replicating or sampling to get a sense of the variability. That will be left as an exercise for the reader and/or a future tutorial!

# Simulate
# Let's create 12 cases of students
#
# sample some rows
simX <- sample(lmm.data$id, 12)
simX <- lmm.data[lmm.data$id %in% simX, c(3:5)] # get their data
# add an intercept
simX[, "Intercept"] <- 1
simX <- simX[, c(4, 1:3)] # reorder
simRE <- REsim(MLexamp1, whichel = "school", nsims = 1000) # simulate random effects
simX$case <- row.names(simX) # create a case ID
# expand a grid of case IDs by schools
simDat <- expand.grid(case = row.names(simX), school = levels(lmm.data$school))
simDat <- merge(simX, simDat) # merge in the data
# Create the fixed effect predictor
simDat[, "fepred"] <- (simDat[, 2] * fixef(MLexamp1)[1]) + (simDat[, 3] * fixef(MLexamp1)[2]) +
          (simDat[, 4] * fixef(MLexamp1)[3]) + (simDat[, 5] * fixef(MLexamp1)[4])
# Add the school effects
simDat <- merge(simDat, simRE[, c(1, 3)], by.x = "school", by.y="X1")
simDat$yhat <- simDat$fepred + simDat$mean # add the school specific intercept

Now that we have set up a simulated data frame, let's plot it, first by case (a complementary by-school view is sketched after the code):

qplot(school, yhat, data = simDat) + facet_wrap(~case) + theme_dpi()
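As a hedged complement (not part of the original post), the same simulated predictions can be faceted by school instead of by case, which makes the school-level shift easier to see:

# Same simulated data, but faceting by school so each panel shows
# how the 12 cases line up within a given school.
qplot(case, yhat, data = simDat) + facet_wrap(~school) + theme_dpi() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))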

Reply #7
Lisrelchen posted on 2016-7-2 23:53:46
Here we can clearly see that within each school the cases are relatively the same, indicating that the group effect is larger than the individual effects.

These plots are useful in demonstrating the relative importance of group and individual effects in a substantive fashion. Even more can be done to make the graphs more informative, such as placing reference lines for the total variability of the outcome (a small sketch follows) and looking at the distance that moving groups shifts each observation away from its true value.
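As a hedged illustration of that first suggestion (not part of the original post), reference lines at the overall mean of extro plus or minus one standard deviation can be layered onto the case-level plot:

# Add horizontal reference lines at the grand mean of extro and at
# +/- one standard deviation, to show the simulated shifts in context.
qplot(school, yhat, data = simDat) + facet_wrap(~case) + theme_dpi() +
  geom_hline(yintercept = mean(lmm.data$extro), linetype = "dashed") +
  geom_hline(yintercept = mean(lmm.data$extro) + c(-1, 1) * sd(lmm.data$extro),
             linetype = "dotted")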

Conclusion

lme4 provides a very powerful object-oriented toolset for dealing with mixed effect models in R. Understanding model fit and confidence intervals of lme4 objects requires some diligent research and the use of a variety of functions and extensions of lme4 itself. In our next tutorial we will explore how to identify a proper specification of a random-effect model and Bayesian extensions of the lme4 framework for difficult-to-specify models. We will also explore the generalized linear model framework and the glmer function for multilevel generalized linear modeling.

Appendix

print(sessionInfo(), locale=FALSE)
## R version 3.1.0 (2014-04-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
##
## other attached packages:
##  [1] RLRsim_2.1-3    eeptools_0.3.1  ggplot2_0.9.3.1 plyr_1.8.1     
##  [5] lattice_0.20-29 arm_1.7-03      MASS_7.3-33     lme4_1.1-6     
##  [9] Rcpp_0.11.1     Matrix_1.1-3    knitr_1.5      
##
## loaded via a namespace (and not attached):
##  [1] abind_1.4-0         car_2.0-20          coda_0.16-1        
##  [4] colorspace_1.2-4    data.table_1.9.2    digest_0.6.4      
##  [7] evaluate_0.5.5      foreign_0.8-61      formatR_0.10      
## [10] grid_3.1.0          gtable_0.1.2        labeling_0.2      
## [13] maptools_0.8-29     memisc_0.96-9       minqa_1.2.3        
## [16] munsell_0.4.2       nlme_3.1-117        nnet_7.3-8         
## [19] proto_0.3-10        RcppEigen_0.3.2.1.2 reshape2_1.4      
## [22] rmarkdown_0.1.25    scales_0.2.4        sp_1.0-15         
## [25] splines_3.1.0       stringr_0.6.2       tools_3.1.0        
## [28] yaml_2.1.11

Footnotes

[1] See the message from lme4 co-author Doug Bates on this subject: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2009q4/002984.html

[2] Andrew Gelman and Yu-Sung Su (2014). arm: Data Analysis Using Regression and Multilevel/Hierarchical Models. R package version 1.7-03. http://CRAN.R-project.org/package=arm

[3] WikiDot FAQ from the R Mixed Models Mailing List.

[4] There is also an extensive series of references available in the References section of the help, obtained by running ?exactLRT and ?exactRLRT.
