you can use proc glimmix which is intended for the categorical outcome with random effect
Basic Features
The GLIMMIX procedure enables you to specify a generalized linear mixed model
and to perform confirmatory inference in such models. The syntax is similar to that
of the MIXED procedure and includes CLASS, MODEL, and RANDOM statements.
The following are some of the basic features of PROC GLIMMIX.
•
SUBJECT= and GROUP= options, which enable blocking of variance matrices and parameter heterogeneity
•
choice of linearization about expected values or expansion about current solutions of best linear unbiased predictors
•
flexible covariance structures for random and residual random effects, including variance components, unstructured, autoregressive, and spatial structures
•
CONTRAST, ESTIMATE, LSMEANS and LSMESTIMATE statements, which produce hypothesis tests and estimable linear combinations of effects
•
NLOPTIONS statement, which enables you to exercise control over the numerical optimization. You can choose techniques, update methods, line search
algorithms, convergence criteria, and more. Or, you can choose the default
optimization strategies selected for the particular class of model you are fitting
•
computed variables with SAS programming statements inside of PROC GLIMMIX (except for variables listed in the CLASS statement). These computed
variables can appear in the MODEL, RANDOM, WEIGHT, or FREQ
statements.
•
grouped data analysis •
user-specified link and variance functions •
choice of model-based variance-covariance estimators for the fixed effects or empirical (sandwich) estimators to make analysis robust against misspecification
of the covariance structure and to adjust for small-sample bias
•
joint modeling for multivariate data. For example, you can model binary and normal responses from a subject jointly and use random effects to relate (fuse)
the two outcomes.
•
multinomial models for ordinal and nominal outcomes •
univariate and multivariate low-rank smoothing Assumptions
The primary assumptions underlying the analyses performed by PROC GLIMMIX
are as follows:
•
If the model contains random effects, the distribution of the data conditional on the random effects is known. This distribution is either a member of the
exponential family of distributions or one of the supplementary distributions
provided by the GLIMMIX procedure. In models without random effects, the
Notation for the Generalized Linear Mixed Model
7 unconditional (marginal) distribution is assumed to be known for maximum
likelihood estimation, or the first two moments are known in the case of quasilikelihood
estimation.
•
The conditional expected value of the data takes the form of a linear mixed model after a monotonic transformation is applied.
•
The problem of fitting the GLMM can be cast as a singly or doubly iterative optimization problem. The objective function for the optimization is a function
of either the actual log likelihood, an approximation to the log likelihood, or
the log likelihood of an approximated model.
For a model containing random effects, the GLIMMIX procedure, by default, estimates
the parameters by applying pseudo-likelihood techniques as in Wolfinger and
O’Connell (1993) and Breslow and Clayton (1993). In a model without random
effects (GLM models), PROC GLIMMIX estimates the parameters by maximum
likelihood, restricted maximum likelihood, or quasi-likelihood. See the
“Singly or Doubly Iterative Fitting”
section on page 140 on when the GLIMMIX procedure applies noniterative, singly and doubly iterative algorithms, and the
“Default Estimation Techniques”
section on page 142 on the default estimation methods. Once the parameters have been estimated, you can perform statistical inferences for
the fixed effects and covariance parameters of the model. Tests of hypotheses for
the fixed effects are based on Wald-type tests and the estimated variance-covariance
matrix.
PROC GLIMMIX uses the Output Delivery System (ODS) for displaying and controlling
the output from SAS procedures. ODS enables you to convert any of the
output from PROC GLIMMIX into a SAS data set. See the
“ODS Table Names” section on page 160.
ODS statistical graphics are available with the GLIMMIX procedure. For more information,
see the PLOTS options of the PROC GLIMMIX and LSMEANS statements.
For more information on the ODS GRAPHICS statement, see Chapter 15, “Statistical
Graphics Using ODS” (
SAS/STAT User’s Guide). Notation for the Generalized Linear Mixed Model
This section introduces the mathematical notation used throughout the chapter to
describe the generalized linear mixed model (GLMM). See the
“Details” section on page 108 for a description of the fitting algorithms and the mathematical-statistical
details.
The Basic Model
Suppose
Y represents the (n × 1) vector of observed data and is a (r × 1) vector of random effects. Models fit by the GLIMMIX procedure assume that
E
[Y|] = g−1(X + Z) where
g(·) is a differentiable monotonic link function and g−1(·) is its inverse. The matrix
Xis a (n×p) matrix of rank k, and Z is a (n×r) design matrix for the random 8
The GLIMMIX Procedure effects. The random effects are assumed to be normally distributed with mean
0 and variance matrix
G. The GLMM contains a linear mixed model inside the inverse link function. This
model component is referred to as the linear predictor,
= X + Z The variance of the observations, conditional on the random effects, is
var
[Y|] = A1/2RA1/2 The matrix
A is a diagonal matrix and contains the variance functions of the model. The variance function expresses the variance of a response as a function of the mean.
The GLIMMIX procedure determines the variance function from the
DIST= option in the MODEL statement or from the user-supplied variance function (see the
“Implied Variance Functions”
section on page 104). The matrix R is a variance matrix specified by the user through the RANDOM statement. If the conditional distribution of
the data contains an additional scale parameter, it is either part of the variance functions
or part of the
R matrix. For example, the gamma distribution with mean μ has variance function
a(μ) = μ2 and var[Y |] = μ2-. If your model calls for G-side random effects only (see below), the procedure models
R = -I, where I is the identity matrix.
Table 9 on page 104 identifies the distributions for which - 1. G-side and R-side Random Effects
The GLIMMIX procedure distinguishes two types of random effects. Depending on
whether the variance of the random effect is contained in
Gor inR, these are referred to as “G-side” and “R-side” random effects. R-side effects are also called “residual”
effects. Simply put, if a random effect is an element of , it is a G-side effect;
otherwise, it is an R-side effect. Models without G-side effects are also known as
marginal (or population-averaged) models. Models fit with the GLIMMIX procedure
can have none, one, or more of each type of effect.
Note that an R-side effect in the GLIMMIX procedure is equivalent to a REPEATED
effect in the MIXED procedure. In the GLIMMIX procedure all random effects are
specified through the RANDOM statement.
The columns of
X are constructed from effects listed on the right-hand side in the MODEL statement. Columns of
Z and the variance matricesGandRare constructed from the RANDOM statement.
The
R matrix is by default the scaled identity matrix, R = -I. The scale parameter -
is set to one if the distribution does not have a scale parameter, for example, in the case of the the binary, binomial, Poisson, and exponential distribution (see
Table 9
on page 104). To specify a different R matrix, use the RANDOM statement with the –RESIDUAL– keyword or the RESIDUAL option. For example, to specify that
the
Time effect for each patient is an R-side effect with a first-order autoregressive covariance structure, use the RESIDUAL option:
PROC GLIMMIX Contrasted with Other SAS Procedures
9 random time / type=ar(1) subject=patient residual;
To add a multiplicative overdispersion parameter, use the –RESIDUAL– keyword
random _residual_;
You specify the link function
g(·) with the LINK= option of the MODEL statement or with programming statements. You specify the variance function that controls the
matrix
A with the DIST= option of the MODEL statement or with programming statements.
Unknown quantities subject to estimation are the fixed-effects parameter vector
and the covariance parameter vector
that comprises all unknowns in G and R. The random effects
are not parameters of the model in the sense that they are not estimated. The vector
is a vector of random variables. The solutions for are predictors of these random variables.
Some fitting algorithms require that the best linear unbiased predictors (BLUPs) of
be computed at every iteration.
Relationship with Generalized Linear Models
Generalized linear models (Nelder and Wedderburn 1972; McCullagh and Nelder
1989) are a special case of GLMMs. If
= 0 and R = -I, the GLMM reduces to either a generalized linear model (GLM) or a GLM with overdispersion. For example,
if
Y is a vector of Poisson variables so that A is a diagonal matrix containing E
[Y] = μ on the diagonal, then the model is a Poisson regression model for - = 1 and overdispersed relative to a Poisson distribution for
- > 1. Since the Poisson distribution does not have an extra scale parameter, you can model overdispersion by
adding the statement
random _residual_;
to your GLIMMIX statements. If the only random effect is an overdispersion effect,
PROC GLIMMIX fits the model by (restricted) maximum likelihood and not one of
the methods specific to GLMMs.
PROC GLIMMIX Contrasted with Other SAS Procedures
The GLIMMIX procedure generalizes the MIXED and GENMOD procedures in two
important ways. First, the response can have a nonnormal distribution. The MIXED
procedure assumes that the response is normally (Gaussian) distributed. Second,
the GLIMMIX procedure incorporates random effects in the model and so allows
for subject-specific (conditional) and population-averaged (marginal) inference. The
GENMOD procedure only allows for marginal inference.
The GLIMMIX and MIXED procedure are closely related; see the syntax and feature
comparison in the section
“Comparing PROC GLIMMIX with PROC MIXED” 10
The GLIMMIX Procedure on page 138. The remainder of this section compares PROC GLIMMIX with the
GENMOD, NLMIXED, LOGISTIC, and CATMOD procedures.
The GENMOD procedure fits generalized linear models for independent data by
maximum likelihood. It can also handle correlated data through the marginal GEE
approach of Liang and Zeger (1986) and Zeger and Liang (1986). The GEE implementation
in the GENMOD procedure is a marginal method that does not incorporate
random effects. The GEE estimation in the GENMOD procedure relies on R-side covariances
only, and the unknown parameters in
R are estimated by the method of moments. The GLIMMIX procedure allows G-side random effects and R-side covariances.
The parameters are estimated by likelihood-based techniques.
Many of the fit statistics and tests in the GENMOD procedure are based on the likelihood.
In a GLMM it is not always possible to derive the log likelihood of the data.
Even if the log likelihood is tractable, it may be computationally infeasible. In some
cases, the objective function must be constructed based on a substitute model. In other
cases, only the first two moments of the marginal distribution can be approximated.
Consequently, obtaining likelihood-based tests and statistics is difficult in the majority
of generalized linear mixed models. The GLIMMIX procedure relies heavily on
linearization and Taylor-series techniques to construct Wald-type test statistics and
confidence intervals. Likelihood ratio tests and confidence intervals are not available
in the GLIMMIX procedure.
The NLMIXED procedure also fits generalized linear mixed models but the class of
models it can accommodate is more narrow. The NLMIXED procedure relies on approximating
the marginal log likelihood by integral approximation through Gaussian
quadrature. Like the GLIMMIX procedure, the NLMIXED procedure defines the
problem of obtaining solutions for the parameter estimates as an optimization problem.
The objective function for the NLMIXED procedure is the marginal log likelihood
obtained by integrating out the random effects from the joint distribution of
responses and random effects using quadrature techniques. Although these are very
accurate, the number of random effects that can be practically managed is limited.
Also, R-side random effects cannot be accommodated with the NLMIXED procedure.
The GLIMMIX procedure, on the other hand, determines the marginal log
likelihood as that of an approximate linear mixed model. This allows multiple random
effects, nested and crossed random effects, multiple cluster types, and R-side
random components. The disadvantage is a doubly iterative fitting algorithm and the
absence of a true log likelihood.
The LOGISTIC and CATMOD procedures also fit generalized linear models but accommodate
the independence case only. Binary, binomial, multinomial models for
ordered data, and generalized logit models that can be fit with PROC LOGISTIC,
can also be fit with the GLIMMIX procedure. The diagnostic tools and capabilities
specific to such data implemented in the LOGISTIC procedure go beyond the
capabilities of the GLIMMIX procedure.
Logistic Regressions with Random Intercepts
11 Getting Started
Logistic Regressions with Random Intercepts
Researchers investigated the performance of two medical procedures in a multicenter
study. They randomly selected 15 centers for inclusion. One of the study goals
was to compare the occurrence of side effects for the procedures. In each center
nA patients were randomly selected and assigned to procedure “A,” and
nB patients were randomly assigned to procedure “B”. The following DATA step creates the data set
for the analysis.
data multicenter;
input center group$ n sideeffect;
datalines;
1 A 32 14
1 B 33 18
2 A 30 4
2 B 28 8
3 A 23 14
3 B 24 9
4 A 22 7
4 B 22 10
5 A 20 6
5 B 21 12
6 A 19 1
6 B 20 3
7 A 17 2
7 B 17 6
8 A 16 7
8 B 15 9
9 A 13 1
9 B 14 5
10 A 13 3
10 B 13 1
11 A 11 1
11 B 12 2
12 A 10 1
12 B 9 0
13 A 9 2
13 B 9 6
14 A 8 1
14 B 8 1
15 A 7 1
15 B 8 0
;
The variable
group identifies the two procedures, n is the number of patients who received a given procedure in a particular center, and
sideeffect is the number of patients who reported side effects.
If
YiA and YiB denote the number of patients in center i who report side effects for procedures
A and B, respectively, then—for a given center—these are independent 12
The GLIMMIX Procedure binomial random variables. To model the probability of side effects for the two drugs,
iA and iB, you need to account for the fixed group effect and the random selection of centers. One possibility is to assume a model that relates group and center effects
linearly to the logit of the probabilities:
log
iA 1
− iA =
0 + A + i log
iB 1
− iB =
0 + B + i In this model,
A − B measures the difference in the logits of experiencing side effects, and the
i are independent random variables due to the random selection of centers. If you think of
0 as the overall intercept in the model, then the i are random intercept adjustments. Observations from the same center receive the same
adjustment, and these vary randomly from center to center with variance var
[i] = 2 c
. Since
iA is the conditional mean of the sample proportion, E[YiA/niA|i] = iA, you can model the sample proportions as binomial ratios in a generalized linear mixed
model. The following statements request this analysis under the assumption of normally
distributed center effects with equal variance and a logit link function.
proc glimmix data=multicenter;
class center group;
model sideeffect/n = group / solution;
random intercept / subject=center;
run;
The PROC GLIMMIX statement invokes the procedure. The CLASS statement instructs
the procedure to treat the variables
center and group as classification variables. The MODEL statement specifies the response variable as a sample proportion
using the
events/trials syntax. In terms of the previous formulas, sideeffect/n corresponds to
YiA/niA for observations from Group A and to YiB/niB for observations from Group B. The SOLUTION option in the MODEL statement requests a listing
of the solutions for the fixed-effects parameter estimates. Note that because of the
events/trials
syntax, the GLIMMIX procedure defaults to the binomial distribution, and that distribution’s default link is the logit link. The RANDOM statement specifies
that the linear predictor contains an intercept term that randomly varies at the
level of the
center effect. In other words, a random intercept is drawn separately and independently for each center in the study.
The results of this analysis are shown in the following tables.
The “Model Information Table” in
Figure 1 summarizes important information about the model you fit and about aspects of the estimation technique.
Logistic Regressions with Random Intercepts
13 The GLIMMIX Procedure
Model Information
Data Set WORK.MULTICENTER
Response Variable (Events) sideeffect
Response Variable (Trials) n
Response Distribution Binomial
Link Function Logit
Variance Function Default
Variance Matrix Blocked By center
Estimation Technique Residual PL
Degrees of Freedom Method Containment
Figure 1.
Model Information PROC GLIMMIX recognizes the variables
sideeffect and n as the numerator and denominator in the
events/trials syntax, respectively. The distribution—conditional on the random center effects—is binomial. The marginal variance matrix is blockdiagonal,
and observations from the same center form the blocks. The default estimation
technique in generalized linear mixed models is residual pseudo-likelihood with
a subject-specific expansion (METHOD=RSPL).
In
Figure 2, the “Class Level Information” table lists the levels of the variables specified in the CLASS statement and the ordering of the levels. The “Number of
Observations” table displays the number of observations read and used in the analysis.
Class Level Information
Class Levels Values
center 15 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
group 2 A B
Number of Observations Read 30
Number of Observations Used 30
Number of Events 155
Number of Trials 503
Figure 2.
Class Level Information and Number of Observations There are two variables listed in the CLASS statement. The
center variable has fifteen levels, and the
group variable has two levels. Since the response is specified through the
events/trial syntax, the “Number of Observations” table also contains the total number of events and trials used in the analysis.
The “Dimensions” table in
Figure 3 lists the size of relevant matrices. 14
The GLIMMIX Procedure Dimensions
G-side Cov. Parameters 1
Columns in X 3
Columns in Z per Subject 1
Subjects (Blocks in V) 15
Max Obs per Subject 2
Figure 3.
Dimensions There are three columns in the
X matrix, corresponding to an intercept and the two levels of the
group variable. For each subject (center), the Z matrix contains only an intercept column.
The “Optimization Information” table in
Figure 4 provides information about the methods and size of the optimization problem.
Optimization Information
Optimization Technique Dual Quasi-Newton
Parameters in Optimization 1
Lower Boundaries 1
Upper Boundaries 0
Fixed Effects Profiled
Starting From Data
Figure 4.
Optimization Information The default optimization technique for generalized linear mixed models is the Quasi-
Newton method. Because a residual likelihood technique is used to compute the objective
function, only the covariance parameters are participating in the optimization.
A lower boundary constraint is placed on the variance component for the random
center
effect. The solution for this variance cannot be less than zero. The “Iteration History” table in
Figure 5 displays information about the progress of the optimization process.
Logistic Regressions with Random Intercepts
15 Iteration History
Objective Max
Iteration Restarts Subiterations Function Change Gradient
0 0 5 79.688580269 0.11807224 7.851E-7
1 0 3 81.294622554 0.02558021 8.209E-7
2 0 2 81.438701534 0.00166079 4.061E-8
3 0 1 81.444083567 0.00006263 2.311E-8
4 0 1 81.444265216 0.00000421 0.000025
5 0 1 81.444277364 0.00000383 0.000023
6 0 1 81.444266322 0.00000348 0.000021
7 0 1 81.44427636 0.00000316 0.000019
8 0 1 81.444267235 0.00000287 0.000017
9 0 1 81.444275529 0.00000261 0.000016
10 0 1 81.44426799 0.00000237 0.000014
11 0 1 81.444274843 0.00000216 0.000013
12 0 1 81.444268614 0.00000196 0.000012
13 0 1 81.444274277 0.00000178 0.000011
14 0 1 81.444269129 0.00000162 9.772E-6
15 0 0 81.444273808 0.00000000 9.102E-6
Convergence criterion (PCONV=1.11022E-8) satisfied.
Figure 5.
Iteration History and Convergence Status After the initial optimization, the GLIMMIX procedure performed 15 updates before
the convergence criterion was met. At convergence, the largest absolute value of the
gradient was near zero. This indicates that the process stopped at an extremum of the
objective function.
The “Fit Statistics” table in
Figure 6 lists information about the fitted model. Fit Statistics
-2 Res Log Pseudo-Likelihood 81.44
Generalized Chi-Square 30.69
Gener. Chi-Square / DF 1.10
Figure 6.
Fit Statistics Twice the negative of the residual log lilikelihood in the final pseudo-model equaled
81.44. The ratio of the generalized chi-square statistic and its degrees of freedom is
close to 1. This is a measure of the residual variability in the marginal distribution of
the data.
The “Covariance Parameter Estimates” table in
Figure 7 displays estimates and asymptotic estimated standard errors for all covariance parameters.
16
The GLIMMIX Procedure Covariance Parameter Estimates
Standard
Cov Parm Subject Estimate Error
Intercept center 0.6176 0.3181
Figure 7.
Covariance Parameter Estimates The variance of the random center intercepts on the logit scale is estimated as
b2 c
= 0
.6176. The “Parameter Estimates” table in
Figure 8 displays the solutions for the fixed effects in the model.
Solutions for Fixed Effects
Standard
Effect group Estimate Error DF t Value Pr > |t|
Intercept -0.8071 0.2514 14 -3.21 0.0063
group A -0.4896 0.2034 14 -2.41 0.0305
group B 0 . . . .
Figure 8.
Parameter Estimates Because of the fixed-effects parameterization used in the GLIMMIX procedure, the
“Intercept” effect is an estimate of
0 + B, and the “A” group effect is an estimate of
A − B, the log-odds ratio. The associated estimated probabilities of side effects in the two groups are
b
A = 1
1 + exp
{0.8071 + 0.4896} = 0
.2147 b
B = 1
1 + exp
{0.8071} = 0
.3085 There is a significant difference between the two groups (
p=0.0305). The “Type III Tests of Fixed Effect” table in
Figure 9 displays significance tests for the fixed effects in the model.
Type III Tests of Fixed Effects
Num Den
Effect DF DF F Value Pr > F
group 1 14 5.79 0.0305
Figure 9.
Type III Tests of Fixed Effects Logistic Regressions with Random Intercepts
17 Because the
group effect has only two levels, the p-value for the effect is the same as in the “Parameter Estimates” table, and the “
F Value” is the square of the “t Value” shown there.
You can produce the estimates of the average logits in the two groups and their predictions
on the scale of the data with the LSMEANS statement in PROC GLIMMIX.
ods select lsmeans;
proc glimmix data=multicenter;
class center group;
model sideeffect/n = group / solution;
random intercept / subject=center;
lsmeans group / cl ilink;
run;
The LSMEANS statement requests the least-squares means of the
group effect on the logit scale. The CL option requests their confidence limits. The ILINK option adds
estimates, standard errors, and confidence limits on the mean (probability) scale. The
table in
Figure 10 displays the results. The GLIMMIX Procedure
group Least Squares Means
Standard
group Estimate Error DF t Value Pr > |t| Alpha Lower Upper
A -1.2966 0.2601 14 -4.99 0.0002 0.05 -1.8544 -0.7388
B -0.8071 0.2514 14 -3.21 0.0063 0.05 -1.3462 -0.2679
group Least Squares Means
Standard
Error Lower Upper
group Mean Mean Mean Mean
A 0.2147 0.04385 0.1354 0.3233
B 0.3085 0.05363 0.2065 0.4334
Figure 10.
Least-squares Means The “Estimate” column displays the least-squares mean estimate on the logit scale,
and the “Mean” column represents its mapping onto the probability scale. The
“Lower” and “Upper” columns are 95% confidence limits for the logits in the two
groups. The “Lower Mean” and “Upper Mean” columns are the corresponding confidence
limits for the probabilities of side effects. These limits are obtained by inversely
linking the confidence bounds on the linear scale, and thus are not symmetric
about the estimate of the probabilities.
Basic Features
The GLIMMIX procedure enables you to specify a generalized linear mixed model
and to perform confirmatory inference in such models. The syntax is similar to that
of the MIXED procedure and includes CLASS, MODEL, and RANDOM statements.
The following are some of the basic features of PROC GLIMMIX.
•
SUBJECT= and GROUP= options, which enable blocking of variance matrices and parameter heterogeneity
•
choice of linearization about expected values or expansion about current solutions of best linear unbiased predictors
•
flexible covariance structures for random and residual random effects, including variance components, unstructured, autoregressive, and spatial structures
•
CONTRAST, ESTIMATE, LSMEANS and LSMESTIMATE statements, which produce hypothesis tests and estimable linear combinations of effects
•
NLOPTIONS statement, which enables you to exercise control over the numerical optimization. You can choose techniques, update methods, line search
algorithms, convergence criteria, and more. Or, you can choose the default
optimization strategies selected for the particular class of model you are fitting
•
computed variables with SAS programming statements inside of PROC GLIMMIX (except for variables listed in the CLASS statement). These computed
variables can appear in the MODEL, RANDOM, WEIGHT, or FREQ
statements.
•
grouped data analysis •
user-specified link and variance functions •
choice of model-based variance-covariance estimators for the fixed effects or empirical (sandwich) estimators to make analysis robust against misspecification
of the covariance structure and to adjust for small-sample bias
•
joint modeling for multivariate data. For example, you can model binary and normal responses from a subject jointly and use random effects to relate (fuse)
the two outcomes.
•
multinomial models for ordinal and nominal outcomes •
univariate and multivariate low-rank smoothing Assumptions
The primary assumptions underlying the analyses performed by PROC GLIMMIX
are as follows:
•
If the model contains random effects, the distribution of the data conditional on the random effects is known. This distribution is either a member of the
exponential family of distributions or one of the supplementary distributions
provided by the GLIMMIX procedure. In models without random effects, the
Notation for the Generalized Linear Mixed Model
7 unconditional (marginal) distribution is assumed to be known for maximum
likelihood estimation, or the first two moments are known in the case of quasilikelihood
estimation.
•
The conditional expected value of the data takes the form of a linear mixed model after a monotonic transformation is applied.
•
The problem of fitting the GLMM can be cast as a singly or doubly iterative optimization problem. The objective function for the optimization is a function
of either the actual log likelihood, an approximation to the log likelihood, or
the log likelihood of an approximated model.
For a model containing random effects, the GLIMMIX procedure, by default, estimates
the parameters by applying pseudo-likelihood techniques as in Wolfinger and
O’Connell (1993) and Breslow and Clayton (1993). In a model without random
effects (GLM models), PROC GLIMMIX estimates the parameters by maximum
likelihood, restricted maximum likelihood, or quasi-likelihood. See the
“Singly or Doubly Iterative Fitting”
section on page 140 on when the GLIMMIX procedure applies noniterative, singly and doubly iterative algorithms, and the
“Default Estimation Techniques”
section on page 142 on the default estimation methods. Once the parameters have been estimated, you can perform statistical inferences for
the fixed effects and covariance parameters of the model. Tests of hypotheses for
the fixed effects are based on Wald-type tests and the estimated variance-covariance
matrix.
PROC GLIMMIX uses the Output Delivery System (ODS) for displaying and controlling
the output from SAS procedures. ODS enables you to convert any of the
output from PROC GLIMMIX into a SAS data set. See the
“ODS Table Names” section on page 160.
ODS statistical graphics are available with the GLIMMIX procedure. For more information,
see the PLOTS options of the PROC GLIMMIX and LSMEANS statements.
For more information on the ODS GRAPHICS statement, see Chapter 15, “Statistical
Graphics Using ODS” (
SAS/STAT User’s Guide). Notation for the Generalized Linear Mixed Model
This section introduces the mathematical notation used throughout the chapter to
describe the generalized linear mixed model (GLMM). See the
“Details” section on page 108 for a description of the fitting algorithms and the mathematical-statistical
details.
The Basic Model
Suppose
Y represents the (n × 1) vector of observed data and is a (r × 1) vector of random effects. Models fit by the GLIMMIX procedure assume that
E
[Y|] = g−1(X + Z) where
g(·) is a differentiable monotonic link function and g−1(·) is its inverse. The matrix
Xis a (n×p) matrix of rank k, and Z is a (n×r) design matrix for the random 8
The GLIMMIX Procedure effects. The random effects are assumed to be normally distributed with mean
0 and variance matrix
G. The GLMM contains a linear mixed model inside the inverse link function. This
model component is referred to as the linear predictor,
= X + Z The variance of the observations, conditional on the random effects, is
var
[Y|] = A1/2RA1/2 The matrix
A is a diagonal matrix and contains the variance functions of the model. The variance function expresses the variance of a response as a function of the mean.
The GLIMMIX procedure determines the variance function from the
DIST= option in the MODEL statement or from the user-supplied variance function (see the
“Implied Variance Functions”
section on page 104). The matrix R is a variance matrix specified by the user through the RANDOM statement. If the conditional distribution of
the data contains an additional scale parameter, it is either part of the variance functions
or part of the
R matrix. For example, the gamma distribution with mean μ has variance function
a(μ) = μ2 and var[Y |] = μ2-. If your model calls for G-side random effects only (see below), the procedure models
R = -I, where I is the identity matrix.
Table 9 on page 104 identifies the distributions for which - 1. G-side and R-side Random Effects
The GLIMMIX procedure distinguishes two types of random effects. Depending on
whether the variance of the random effect is contained in
Gor inR, these are referred to as “G-side” and “R-side” random effects. R-side effects are also called “residual”
effects. Simply put, if a random effect is an element of , it is a G-side effect;
otherwise, it is an R-side effect. Models without G-side effects are also known as
marginal (or population-averaged) models. Models fit with the GLIMMIX procedure
can have none, one, or more of each type of effect.
Note that an R-side effect in the GLIMMIX procedure is equivalent to a REPEATED
effect in the MIXED procedure. In the GLIMMIX procedure all random effects are
specified through the RANDOM statement.
The columns of
X are constructed from effects listed on the right-hand side in the MODEL statement. Columns of
Z and the variance matricesGandRare constructed from the RANDOM statement.
The
R matrix is by default the scaled identity matrix, R = -I. The scale parameter -
is set to one if the distribution does not have a scale parameter, for example, in the case of the the binary, binomial, Poisson, and exponential distribution (see
Table 9
on page 104). To specify a different R matrix, use the RANDOM statement with the –RESIDUAL– keyword or the RESIDUAL option. For example, to specify that
the
Time effect for each patient is an R-side effect with a first-order autoregressive covariance structure, use the RESIDUAL option:
PROC GLIMMIX Contrasted with Other SAS Procedures
9 random time / type=ar(1) subject=patient residual;
To add a multiplicative overdispersion parameter, use the –RESIDUAL– keyword
random _residual_;
You specify the link function
g(·) with the LINK= option of the MODEL statement or with programming statements. You specify the variance function that controls the
matrix
A with the DIST= option of the MODEL statement or with programming statements.
Unknown quantities subject to estimation are the fixed-effects parameter vector
and the covariance parameter vector
that comprises all unknowns in G and R. The random effects
are not parameters of the model in the sense that they are not estimated. The vector
is a vector of random variables. The solutions for are predictors of these random variables.
Some fitting algorithms require that the best linear unbiased predictors (BLUPs) of
be computed at every iteration.
Relationship with Generalized Linear Models
Generalized linear models (Nelder and Wedderburn 1972; McCullagh and Nelder
1989) are a special case of GLMMs. If
= 0 and R = -I, the GLMM reduces to either a generalized linear model (GLM) or a GLM with overdispersion. For example,
if
Y is a vector of Poisson variables so that A is a diagonal matrix containing E
[Y] = μ on the diagonal, then the model is a Poisson regression model for - = 1 and overdispersed relative to a Poisson distribution for
- > 1. Since the Poisson distribution does not have an extra scale parameter, you can model overdispersion by
adding the statement
random _residual_;
to your GLIMMIX statements. If the only random effect is an overdispersion effect,
PROC GLIMMIX fits the model by (restricted) maximum likelihood and not one of
the methods specific to GLMMs.
PROC GLIMMIX Contrasted with Other SAS Procedures
The GLIMMIX procedure generalizes the MIXED and GENMOD procedures in two
important ways. First, the response can have a nonnormal distribution. The MIXED
procedure assumes that the response is normally (Gaussian) distributed. Second,
the GLIMMIX procedure incorporates random effects in the model and so allows
for subject-specific (conditional) and population-averaged (marginal) inference. The
GENMOD procedure only allows for marginal inference.
The GLIMMIX and MIXED procedure are closely related; see the syntax and feature
comparison in the section
“Comparing PROC GLIMMIX with PROC MIXED” 10
The GLIMMIX Procedure on page 138. The remainder of this section compares PROC GLIMMIX with the
GENMOD, NLMIXED, LOGISTIC, and CATMOD procedures.
The GENMOD procedure fits generalized linear models for independent data by
maximum likelihood. It can also handle correlated data through the marginal GEE
approach of Liang and Zeger (1986) and Zeger and Liang (1986). The GEE implementation
in the GENMOD procedure is a marginal method that does not incorporate
random effects. The GEE estimation in the GENMOD procedure relies on R-side covariances
only, and the unknown parameters in
R are estimated by the method of moments. The GLIMMIX procedure allows G-side random effects and R-side covariances.
The parameters are estimated by likelihood-based techniques.
Many of the fit statistics and tests in the GENMOD procedure are based on the likelihood.
In a GLMM it is not always possible to derive the log likelihood of the data.
Even if the log likelihood is tractable, it may be computationally infeasible. In some
cases, the objective function must be constructed based on a substitute model. In other
cases, only the first two moments of the marginal distribution can be approximated.
Consequently, obtaining likelihood-based tests and statistics is difficult in the majority
of generalized linear mixed models. The GLIMMIX procedure relies heavily on
linearization and Taylor-series techniques to construct Wald-type test statistics and
confidence intervals. Likelihood ratio tests and confidence intervals are not available
in the GLIMMIX procedure.
The NLMIXED procedure also fits generalized linear mixed models but the class of
models it can accommodate is more narrow. The NLMIXED procedure relies on approximating
the marginal log likelihood by integral approximation through Gaussian
quadrature. Like the GLIMMIX procedure, the NLMIXED procedure defines the
problem of obtaining solutions for the parameter estimates as an optimization problem.
The objective function for the NLMIXED procedure is the marginal log likelihood
obtained by integrating out the random effects from the joint distribution of
responses and random effects using quadrature techniques. Although these are very
accurate, the number of random effects that can be practically managed is limited.
Also, R-side random effects cannot be accommodated with the NLMIXED procedure.
The GLIMMIX procedure, on the other hand, determines the marginal log
likelihood as that of an approximate linear mixed model. This allows multiple random
effects, nested and crossed random effects, multiple cluster types, and R-side
random components. The disadvantage is a doubly iterative fitting algorithm and the
absence of a true log likelihood.
The LOGISTIC and CATMOD procedures also fit generalized linear models but accommodate
the independence case only. Binary, binomial, multinomial models for
ordered data, and generalized logit models that can be fit with PROC LOGISTIC,
can also be fit with the GLIMMIX procedure. The diagnostic tools and capabilities
specific to such data implemented in the LOGISTIC procedure go beyond the
capabilities of the GLIMMIX procedure.
Logistic Regressions with Random Intercepts
11 Getting Started
Logistic Regressions with Random Intercepts
Researchers investigated the performance of two medical procedures in a multicenter
study. They randomly selected 15 centers for inclusion. One of the study goals
was to compare the occurrence of side effects for the procedures. In each center
nA patients were randomly selected and assigned to procedure “A,” and
nB patients were randomly assigned to procedure “B”. The following DATA step creates the data set
for the analysis.
data multicenter;
input center group$ n sideeffect;
datalines;
1 A 32 14
1 B 33 18
2 A 30 4
2 B 28 8
3 A 23 14
3 B 24 9
4 A 22 7
4 B 22 10
5 A 20 6
5 B 21 12
6 A 19 1
6 B 20 3
7 A 17 2
7 B 17 6
8 A 16 7
8 B 15 9
9 A 13 1
9 B 14 5
10 A 13 3
10 B 13 1
11 A 11 1
11 B 12 2
12 A 10 1
12 B 9 0
13 A 9 2
13 B 9 6
14 A 8 1
14 B 8 1
15 A 7 1
15 B 8 0
;
The variable
group identifies the two procedures, n is the number of patients who received a given procedure in a particular center, and
sideeffect is the number of patients who reported side effects.
If
YiA and YiB denote the number of patients in center i who report side effects for procedures
A and B, respectively, then—for a given center—these are independent 12
The GLIMMIX Procedure binomial random variables. To model the probability of side effects for the two drugs,
iA and iB, you need to account for the fixed group effect and the random selection of centers. One possibility is to assume a model that relates group and center effects
linearly to the logit of the probabilities:
log
iA 1
− iA =
0 + A + i log
iB 1
− iB =
0 + B + i In this model,
A − B measures the difference in the logits of experiencing side effects, and the
i are independent random variables due to the random selection of centers. If you think of
0 as the overall intercept in the model, then the i are random intercept adjustments. Observations from the same center receive the same
adjustment, and these vary randomly from center to center with variance var
[i] = 2 c
. Since
iA is the conditional mean of the sample proportion, E[YiA/niA|i] = iA, you can model the sample proportions as binomial ratios in a generalized linear mixed
model. The following statements request this analysis under the assumption of normally
distributed center effects with equal variance and a logit link function.
proc glimmix data=multicenter;
class center group;
model sideeffect/n = group / solution;
random intercept / subject=center;
run;
The PROC GLIMMIX statement invokes the procedure. The CLASS statement instructs
the procedure to treat the variables
center and group as classification variables. The MODEL statement specifies the response variable as a sample proportion
using the
events/trials syntax. In terms of the previous formulas, sideeffect/n corresponds to
YiA/niA for observations from Group A and to YiB/niB for observations from Group B. The SOLUTION option in the MODEL statement requests a listing
of the solutions for the fixed-effects parameter estimates. Note that because of the
events/trials
syntax, the GLIMMIX procedure defaults to the binomial distribution, and that distribution’s default link is the logit link. The RANDOM statement specifies
that the linear predictor contains an intercept term that randomly varies at the
level of the
center effect. In other words, a random intercept is drawn separately and independently for each center in the study.
The results of this analysis are shown in the following tables.
The “Model Information Table” in
Figure 1 summarizes important information about the model you fit and about aspects of the estimation technique.
Logistic Regressions with Random Intercepts
13 The GLIMMIX Procedure
Model Information
Data Set WORK.MULTICENTER
Response Variable (Events) sideeffect
Response Variable (Trials) n
Response Distribution Binomial
Link Function Logit
Variance Function Default
Variance Matrix Blocked By center
Estimation Technique Residual PL
Degrees of Freedom Method Containment
Figure 1.
Model Information PROC GLIMMIX recognizes the variables
sideeffect and n as the numerator and denominator in the
events/trials syntax, respectively. The distribution—conditional on the random center effects—is binomial. The marginal variance matrix is blockdiagonal,
and observations from the same center form the blocks. The default estimation
technique in generalized linear mixed models is residual pseudo-likelihood with
a subject-specific expansion (METHOD=RSPL).
In
Figure 2, the “Class Level Information” table lists the levels of the variables specified in the CLASS statement and the ordering of the levels. The “Number of
Observations” table displays the number of observations read and used in the analysis.
Class Level Information
Class Levels Values
center 15 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
group 2 A B
Number of Observations Read 30
Number of Observations Used 30
Number of Events 155
Number of Trials 503
Figure 2.
Class Level Information and Number of Observations There are two variables listed in the CLASS statement. The
center variable has fifteen levels, and the
group variable has two levels. Since the response is specified through the
events/trial syntax, the “Number of Observations” table also contains the total number of events and trials used in the analysis.
The “Dimensions” table in
Figure 3 lists the size of relevant matrices. 14
The GLIMMIX Procedure Dimensions
G-side Cov. Parameters 1
Columns in X 3
Columns in Z per Subject 1
Subjects (Blocks in V) 15
Max Obs per Subject 2
Figure 3.
Dimensions There are three columns in the
X matrix, corresponding to an intercept and the two levels of the
group variable. For each subject (center), the Z matrix contains only an intercept column.
The “Optimization Information” table in
Figure 4 provides information about the methods and size of the optimization problem.
Optimization Information
Optimization Technique Dual Quasi-Newton
Parameters in Optimization 1
Lower Boundaries 1
Upper Boundaries 0
Fixed Effects Profiled
Starting From Data
Figure 4.
Optimization Information The default optimization technique for generalized linear mixed models is the Quasi-
Newton method. Because a residual likelihood technique is used to compute the objective
function, only the covariance parameters are participating in the optimization.
A lower boundary constraint is placed on the variance component for the random
center
effect. The solution for this variance cannot be less than zero. The “Iteration History” table in
Figure 5 displays information about the progress of the optimization process.
Logistic Regressions with Random Intercepts
15 Iteration History
Objective Max
Iteration Restarts Subiterations Function Change Gradient
0 0 5 79.688580269 0.11807224 7.851E-7
1 0 3 81.294622554 0.02558021 8.209E-7
2 0 2 81.438701534 0.00166079 4.061E-8
3 0 1 81.444083567 0.00006263 2.311E-8
4 0 1 81.444265216 0.00000421 0.000025
5 0 1 81.444277364 0.00000383 0.000023
6 0 1 81.444266322 0.00000348 0.000021
7 0 1 81.44427636 0.00000316 0.000019
8 0 1 81.444267235 0.00000287 0.000017
9 0 1 81.444275529 0.00000261 0.000016
10 0 1 81.44426799 0.00000237 0.000014
11 0 1 81.444274843 0.00000216 0.000013
12 0 1 81.444268614 0.00000196 0.000012
13 0 1 81.444274277 0.00000178 0.000011
14 0 1 81.444269129 0.00000162 9.772E-6
15 0 0 81.444273808 0.00000000 9.102E-6
Convergence criterion (PCONV=1.11022E-8) satisfied.
Figure 5.
Iteration History and Convergence Status After the initial optimization, the GLIMMIX procedure performed 15 updates before
the convergence criterion was met. At convergence, the largest absolute value of the
gradient was near zero. This indicates that the process stopped at an extremum of the
objective function.
The “Fit Statistics” table in
Figure 6 lists information about the fitted model. Fit Statistics
-2 Res Log Pseudo-Likelihood 81.44
Generalized Chi-Square 30.69
Gener. Chi-Square / DF 1.10
Figure 6.
Fit Statistics Twice the negative of the residual log lilikelihood in the final pseudo-model equaled
81.44. The ratio of the generalized chi-square statistic and its degrees of freedom is
close to 1. This is a measure of the residual variability in the marginal distribution of
the data.
The “Covariance Parameter Estimates” table in
Figure 7 displays estimates and asymptotic estimated standard errors for all covariance parameters.
16
The GLIMMIX Procedure Covariance Parameter Estimates
Standard
Cov Parm Subject Estimate Error
Intercept center 0.6176 0.3181
Figure 7.
Covariance Parameter Estimates The variance of the random center intercepts on the logit scale is estimated as
b2 c
= 0
.6176. The “Parameter Estimates” table in
Figure 8 displays the solutions for the fixed effects in the model.
Solutions for Fixed Effects
Standard
Effect group Estimate Error DF t Value Pr > |t|
Intercept -0.8071 0.2514 14 -3.21 0.0063
group A -0.4896 0.2034 14 -2.41 0.0305
group B 0 . . . .
Figure 8.
Parameter Estimates Because of the fixed-effects parameterization used in the GLIMMIX procedure, the
“Intercept” effect is an estimate of
0 + B, and the “A” group effect is an estimate of
A − B, the log-odds ratio. The associated estimated probabilities of side effects in the two groups are
b
A = 1
1 + exp
{0.8071 + 0.4896} = 0
.2147 b
B = 1
1 + exp
{0.8071} = 0
.3085 There is a significant difference between the two groups (
p=0.0305). The “Type III Tests of Fixed Effect” table in
Figure 9 displays significance tests for the fixed effects in the model.
Type III Tests of Fixed Effects
Num Den
Effect DF DF F Value Pr > F
group 1 14 5.79 0.0305
Figure 9.
Type III Tests of Fixed Effects Logistic Regressions with Random Intercepts
17 Because the
group effect has only two levels, the p-value for the effect is the same as in the “Parameter Estimates” table, and the “
F Value” is the square of the “t Value” shown there.
You can produce the estimates of the average logits in the two groups and their predictions
on the scale of the data with the LSMEANS statement in PROC GLIMMIX.
ods select lsmeans;
proc glimmix data=multicenter;
class center group;
model sideeffect/n = group / solution;
random intercept / subject=center;
lsmeans group / cl ilink;
run;
The LSMEANS statement requests the least-squares means of the
group effect on the logit scale. The CL option requests their confidence limits. The ILINK option adds
estimates, standard errors, and confidence limits on the mean (probability) scale. The
table in
Figure 10 displays the results. The GLIMMIX Procedure
group Least Squares Means
Standard
group Estimate Error DF t Value Pr > |t| Alpha Lower Upper
A -1.2966 0.2601 14 -4.99 0.0002 0.05 -1.8544 -0.7388
B -0.8071 0.2514 14 -3.21 0.0063 0.05 -1.3462 -0.2679
group Least Squares Means
Standard
Error Lower Upper
group Mean Mean Mean Mean
A 0.2147 0.04385 0.1354 0.3233
B 0.3085 0.05363 0.2065 0.4334
Figure 10.
Least-squares Means The “Estimate” column displays the least-squares mean estimate on the logit scale,
and the “Mean” column represents its mapping onto the probability scale. The
“Lower” and “Upper” columns are 95% confidence limits for the logits in the two
groups. The “Lower Mean” and “Upper Mean” columns are the corresponding confidence
limits for the probabilities of side effects. These limits are obtained by inversely
linking the confidence bounds on the linear scale, and thus are not symmetric
about the estimate of the probabilities.
[此贴子已经被作者于2007-9-16 0:27:22编辑过]