Stata多層面分析案例

1关注
62粉丝

VIP

已卖：4897份资源

学术权威

14%

还不是VIP/贵宾

-

TA的文库 其他...

R资源总汇

Panel Data Analysis

Experimental Design

0%

威望: 1 级
论坛币: 49635 个
通用积分: 55.7537
学术水平: 370 点
热心指数: 273 点
信用等级: 335 点
经验: 57805 点
帖子: 4005
精华: 21
在线时间: 582 小时
注册时间: 2005-5-8
最后登录: 2023-11-26

楼主

ReneeBK 发表于 2014-4-4 01:29:05 |AI写论文

是否 +2 论坛币

k人参与回答

经管之家送您一份

应届毕业生专属福利!

求职就业群

赵安豆老师微信：zhaoandou666

经管之家联合CDA

送您一个全额奖学金名额~ !

立即领取

感谢您参与论坛问题回答

经管之家送您两个论坛币！

+2 论坛币

Consider a longitudinaldataset, used by both Ruppert, Wand, and Carroll (2003) and Diggle etal. (2002), consisting of weight measurements of 48 pigs on 9 successive weeks.Pigs are identified by the variable id. Below is a plot of the growth curvesfor the first 10 pigs.

. usehttp://www.stata-press.com/data/r13/pig (Longitudinalanalysis of pig weights)

. twowayconnected weight week if id<=10, connect(L)

It seems clear that each pig experiences a linear trend ingrowth and that overall weight measurements vary from pig to pig. Because weare not really interested in these particular 48 pigs per se, we instead treatthem as a random sample from a larger population and model the between-pigvariability as a random effect, or in the terminology of (2), as arandom-intercept term at the pig level. We thus wish to fit the model

weightij= β0 + β1weekij + uj + Єij

states that we want one overall regression linerepresenting the population average. The random effect uj serves to shift thisregression line up or down according to each pig. Because the random effectsoccur at the pig level (id), we fit the model by typing

. mixedweight week || id:

Notes:

By typing weight week, we specified the response, weight, and thefixed portion of the model in the same way that we would if we were using regressor any other estimation command. Our fixed effects are a coefficient on week and a constant term.
When we added || id:, wespecified random effects at the level identified by the group variable id, that is, the pig level (level two). Becausewe wanted only a random intercept, that is all we had to type.
The estimation log consists of three parts: (a)A set of EM iterations used to refine starting values. By default,the iterations themselves are not displayed, but you can display them with the emlogoption. (b)A set of gradient-based iterations. By default, these areNewton–Raphson iterations, but other methods are available by specifying theappropriate maximize option.(c)The message “Computing standard errors”. This is just to informyou that mixed has finished its iterative maximization and is nowreparameterizing from a matrix-based parameterization to the natural metric ofvariance components and their estimated standard errors.
The output title, “Mixed-effects ML regression”, informs us thatour model was fit using ML, the default. For EML estimates, use the reml option. Becausethis model is a simple random-intercept model fit by ML, it would be equivalentto using xtreg with its mle option.
The first estimation tablereports the fixed effects. We estimate β0 = 19.36 and β1 = 6.21.
The second estimation table shows theestimated variance components. The first section of the table is labeled id:Identity, meaning that these are random effects at the id (pig) level and that theirvariance–covariance matrix is a multiple of the identity matrix. Because wehave only one random effect at this level, mixed knew that Identity is the onlypossible covariance structure. In any case, the variance of the level-twoerrors is estimated as 4.82 with standard error 3.12.
The row labeled var(Residual) displays theestimated variance of the overall error term; the variance of the level-oneerrors is 4.38.
Finally,a likelihood-ratio test comparing the model with one-level ordinary linearregression, model (4) without uj , is provided and is highly significant forthese data.

ME294

扫码加我拉你入群

请注明：姓名-公司-职位

以便审核进群资格，未注明则拒绝

分享0 收藏3 回帖

关键词：Stata tata 分析案例 Longitudinal Measurements identified growth

相关帖子

沙发

ReneeBK 发表于 2014-4-4 01:43:25

Extending aboveexample to allow for a random slope on week yields the model

weightij = β0 + β1weekij + u0j + u1jweekij+ Єij

Because we did not specify a covariance structure for therandom effects, mixed used thedefault Independentstructure;
We can use lrtest andour two stored estimation results to verify this fact:

. lrtest randslope randint

The near-zero significance level favors the modelthat allows for a random pig-specific regression line over the model thatallows only for a pig-specific shift.

藤椅

ReneeBK 发表于 2014-4-4 01:51:17

Specify covariance to be unstructured to allow u0j and u1j to be correlated

. mixed weight week || id: week, covariance(unstructured)

Conduct Likelihood-ratio test

. lrtest randslope

Specify covariance to be identity, restricting u0j and u1j to not only be independent but also to have common variance,
specify covariance to be exchangeable, which imposes a common variance but allows for a nonzero correlation.

板凳

ReneeBK 发表于 2014-4-4 02:01:54

Baltagi, Song, and Jung (2001) estimate a Cobb–Douglas production function examining the productivity of public capitalin each state’s private output. Originally provided by Munnell (1990), the data were recorded over 1970–1986 for 48 states groupedinto nine regions.

. use http://www.stata-press.com/data/r13/productivity(Public Capital Productivity)

. describe

Because the states are nested within regions,we fit a three-level mixed model with random intercepts at both the region andthe state-within-region levels.

. mixed gsp private emp hwy water otherunemp || region: || state:

Our model now has two random-effects equations, separatedby ||. The first is a random intercept (constant only) at the regionlevel(level three), and the second is a random intercept at the state level(level two). The order in which these are specified (from left to right) issignificant—mixed assumes that state isnested within region.
The information on groups is now displayed as a table, withone row for each grouping. You can suppress this table with the nogrouporthe noheader option, which will suppress the rest of the header, aswell.
The variance-component estimates are now organized andlabeled according to level. After adjusting for the nested-level errorstructure, we find that the highway and water components of public capital hadsignificant positive effects on private output, whereas the other publicbuildings component had a negative effect.
In the previous example, the states are coded 1–48 and are nested within nine regions. mixed treated the states as nested within regions, regardless of whether the codes for each state were unique between regions. That is, even if codes for states were duplicated between regions, mixed would have enforced the nesting and produced the same results.
The group information at the top of the mixed output and that produced by the postestimation command estat group take the nesting into account. The statistics are thus not necessarily what you would get if you instead tabulated each group variable individually.

ME302

报纸

ReneeBK 发表于 2014-4-4 02:35:02

•       Returning to our productivity data, we now add random coefficients on hwy and unemp at the region level. This only slightly changes the estimates of the fixed effects, so we focus our attention on the variance components:
. mixed gsp private emp hwy water other unemp || region: hwy unemp || state:,nolog nogroup nofetable
•       The random-effects specification at the state level remains unchanged; that is, ∑2 is still treated as the scalar variance of the random intercepts at the state level.

•       An LR test comparing this model with that from example 4 favors the inclusion of the two random coefficients.
•       The estimated variance components, upon examination, reveal that the variances of the random coefficients on hwy and unemp could be treated as equal.
•       We construct block-diagonal covariances by repeating level specifications:
. mixed gsp private emp hwy water other unemp || region: hwy unemp,
> cov(identity) || region: || state:, nolog nogroup nofetable
•       We specified two equations for the region level: the first for the random coefficients on hwy and unemp with covariance set to Identity and the second for the random intercept cons, whose covariance defaults to Identity because it is of dimension 1. mixed labeled the estimate of ∑2  as var(hwy unemp) to designate that it is common to the random coefficients on both hwy and unemp.
•       An LR test shows that the constrained model fits equally well.
. lrtest . prodrc
•       Because the null hypothesis for this test is one of equality (H0 : σ2a = σ2b ), it is not on the boundary of the parameter space. As such, we can take the reported significance as precise rather than a conservative estimate
•       You can repeat level specifications as often as you like, defining successive blocks of a blockdiagonal covariance matrix. However, repeated-level equations must be listed consecutively; otherwise, mixed will give an error.
•       In the previous estimation output, there was no constant term included in the first region equation, even though we did not use the noconstant option. When you specify repeated-level equations, mixed knows not to put constant terms in each equation because such a model would be unidentified. By default, it places the constant in the last repeated-level equation, but you can use noconstant creatively to override this. Linear mixed-effects models can also be fit using meglm with the default gaussian family. Meglm provides two more covariance structures through which you can impose constraints on variance components.

地板

Lisrelchen 发表于 2014-4-4 03:11:26

• Following Rabe-Hesketh and Skrondal (2012, sec. 7.2), we analyze data from Asian children in a British community who were weighed up to four times, roughly between the ages of 6 weeks and 27 months. The dataset is a random sample of data previously analyzed by Goldstein (1986) and Prosser, Rasbash, and Goldstein (1991).
. use http://www.stata-press.com/data/r13/childweight
(Weight data on Asian children)
. describe
• .
graph twoway (line weight age, connect(ascending)), by(girl)
> xtitle(Age in years) ytitle(Weight in kg)

• Ignoring gender effects for the moment, we begin with the following model for the ith measurement on the jth child:
weightij = β0 + β1ageij + β2age2ij + uj0 + uj1ageij + Єij

• This models overall mean growth as quadratic in age and allows for two child-specific random effects: a random intercept uj0, which represents each child’s vertical shift from the overall mean, and a random age slope uj1, which represents each child’s deviation in linear growth rate from the overall mean linear growth rate (_1). For simplicity, we do not consider child-specific changes in the quadratic component of growth.
. mixed weight age c.age#c.age || id: age, nolog
• it is always a good idea to first fit a model with the covariance(unstructured) option.
• Next we introduce gender effects into the fixed portion of the model by including a main gender effect and a gender–age interaction for overall mean growth:
. mixed weight i.girl i.girl#c.age c.age#c.age || id: age, nolog
• The main gender effect is significant at the 5% level, but the gender–age interaction is not:
. test 0.girl#c.age = 1.girl#c.age

• On average, boys are heavier than girls, but their average linear growth rates are not significantly different.
• In the above model, we introduced a gender effect on average growth, but we still assumed that the variability in child-specific deviations from this average was the same for boys and girls. To check this assumption, we introduce gender into the random component of the model. Because support for factor-variable notation is limited in specifications of random we need to generate the interactions ourselves.
. gen boy = !girl
. gen boyXage = boy*age
. gen girlXage = girl*age
. mixed weight i.girl i.girl#c.age c.age#c.age || id: boy boyXage, noconstant
> || id: girl girlXage, noconstant nolog nofetable

• Our previous model had the random-effects specification
|| id: age
which we have replaced with the dual repeated-level specification
|| id: boy boyXage, noconstant || id: girl girlXage, noconstant

• The former models a random intercept and random slope on age, and does so treating all children as a random sample from one population. The latter also specifies a random intercept and random slope on age, but allows for the variability of the random intercepts and slopes to differ between boys and girls. In other words, it allows for heteroskedasticity in random effects due to gender. We use the noconstant option so that we can separate the overall random intercept (automatically provided by the former syntax) into one specific to boys and one specific to girls.
• There seems to be a large gender effect in the variability of linear growth rates. We can compare both models with an LR test, recalling that we stored the previous estimation results under the name homoskedastic:
. lrtest homoskedastic heteroskedastic

• Because the null hypothesis here is one of equality of variances and not that variances are 0, the above does not test on the boundary; thus we can treat the significance level as precise and not conservative. Either way, the results favor the new model with heteroskedastic random effects.

7楼

Lisrelchen 发表于 2014-4-4 03:23:22

West, Welch, and Galecki (2007, chap. 7)analyze data studying the effect of ceramic dental veneer placement on gingival(gum) health. Data on 55 teeth located in the maxillary arches of 12 patients wereconsidered.
use http://www.stata-press.com/data/r13/veneer, clear (Dentalveneer data)
describe

Veneers were placed to match the original contour of thetooth as closely as possible, and researchers were interested in how contourdifferences (variable cda) impacted gingival health. Gingival health was measured asthe amount of gingival cervical fluid (GCF) at each tooth, measured at baseline(variable base gcf) and at two posttreatment follow-ups at 3 and 6 months.The variable gcf records GCF at follow-up, and the variable followuprecordsthe follow-up time. Because two measurements were taken for each tooth andthere exist multiple teeth per patient, we fit a three-level model with thefollowing random effects: a random intercept and random slope on follow-up timeat the patient level, and a random intercept at the tooth level. For the ithmeasurement of the jth tooth from the kth patient, we have
gcfijk = β0 + β1followupijk + β2base gcfijk + β3cdaijk + β4ageijk+u0k + u1kfollowupijk + v0jk + Єijk
mixed gcf followup base_gcf cdaage || patient: followup, cov(un) || tooth:, reml nolog

We modify the above to estimate two distinct errorvariances: one for the 3-month follow-up and one for the 6-month follow-up. Tofit this model, we add the residuals(independent, by(followup)) option,which maintains independence of residual errors but allows forheteroskedasticity with respect to follow-up time.
mixed gcf followup base_gcf cda age || patient:followup, cov (un) || tooth: residuals (independent, by (followup)) reml nolog

The default residual-variance structure is independent, and when specified without by() is

equivalent to the defaultbehavior of mixed: estimating one overall residual standard variance for the entiremodel

8楼

tamtam7010 发表于 2014-4-25 00:12:06

thanks for your sharing, xie xie

已有 1 人评分	论坛币	收起理由
Lisrelchen	+ 20	鼓励积极发帖

总评分: 论坛币 + 20 查看全部评分

9楼

ReneeBK 发表于 2014-6-3 20:39:31

We begin by importing the tab-delimited version of the Rat Pup data set into Stata, assuming that the rat_pup.dat file is located in the C:\temp directory. Note that we present the Stata commands including the prompt (.), which is not entered as part of the commands. . insheet using "C:\temp\rat_pup.dat", tab

We utilize the xtmixed command (first available in Stata Release 9) to fit the models
for this example.
Step 1: Fit a model with a “loaded” mean structure.

Stata by default treats the lowest-valued level of a categorical variable (alphabetically or numerically) as the reference category, so the default reference level for SEX would be Female in this case. To be consistent with the other software procedures, we use the char command to set the reference category for SEX to be Male:
. char sex[omit] Male

Because “Control” is the lowest level of TREATMENT alphabetically, we do not need to recode this variable. The xtmixed command used to fit Model 3.1 is specified as follows:
. * Model 3.1 fit .
. xi: xtmixed weight i.treatment*i.sex litsize, || litter:,covariance(identity) variance

The xi: (interaction expansion) option is used before invoking the xtmixed command to create dummy variables for the categories of the fixed factors TREATMENT and SEX, and for the corresponding interaction terms. The i. notation (i.treatment*i.sex) automatically specifies that fixed effects associated with the interaction between TREATMENT and SEX should be included in the model, along with the main effects for both of these terms.
This tells us that Control is the reference level of TREATMENT (Control omitted), i.e., no dummy variable is created for the control treatment, and that Male is the reference level of SEX (Male omitted).
The xtmixed command syntax has three parts, separated by commas. The first part specifies the fixed effects, the second part specifies the random effects, and the third part specifies the covariance structure for the random effects, in addition to miscellaneous options. We discuss these parts of the syntax in detail later. The first variable listed after the xtmixed command is the continuous dependent variable, WEIGHT. The variables following the dependent variable are the terms that will have associated fixed effects in the model. We include i.treatment*i.sex and litsize.
The two vertical bars (||) precede the variable that defines clusters of observations (litter:).
The absence of additional variables after the colon indicates that there will only be a single random effect associated with the intercept for each litter in the model.
The covariance option after an additional comma specifies the covariance structure for the random effects (or the D matrix). Because Model 3.1 includes only a single random effect associated with the intercept (and therefore a single variance parameter associated with the random effects), it has an identity covariance structure. The covariance option is actually not necessary in this simple case.
Finally, the variance option requests that the estimated variances of the random effects and the residuals be displayed in the output, rather than their estimated standard deviations, which is the default. The xtmixed procedure uses REML estimation by default.
The AIC and BIC information criteria can be obtained by using the following command after the xtmixed command has finished running:

. * Information criteria.
. estat ic

By default, the xtmixed command does not display F-tests for the fixed effects in the model. Instead, omnibus Wald chi-square tests for the fixed effects in the model can be performed using the test command. For example, to test the overall significance of the fixed treatment effects, the following command can be used:

. * Test overall significance of the fixed treatment effects.
. test _Itreatment_2 _Itreatment_3

The two terms listed after the test command are the dummy variables automatically generated by Stata for the fixed treatment effects (we obtain the names for these dummy variables from the initial output for the model shown above). The fixed effects associated with these variables will be displayed in the output, and the indicator variables for the high and low levels of TREATMENT will automatically be saved in the data set. The test command is testing the null hypothesis that the two fixed effects associated with these terms are equal to zero (i.e., the null hypothesis that the treatment means are all equal).
Similar omnibus tests may be obtained for the fixed SEX effect, the fixed LITSIZE effect, and the interaction between TREATMENT and SEX:

. * Omnibus tests for SEX, LITSIZE and TREATMENT*SEX interaction.
. test _Isex_1
. test litsize
. test _ItreXsex_2_1 _ItreXsex_3_1

The dummy variables that Stata generates for the interaction effects (_ItreXsex_2_1 and _ItreXsex_3_1) can be somewhat confusing to identify, but they are located in the Variables window in Stata and are also displayed in the initial model output. Once a model has been fitted using the xtmixed command, EBLUPs of the random effects associated with the levels of the random factor (LITTER) can be saved in a new variable (named EBLUPS) using the following syntax:

. predict eblups, reffects

The saved EBLUPs can then be used to check for random effects that may be outliers.

Step 2: Select a structure for the random effects (Model 3.1 vs. Model 3.1A).
We test Hypothesis 3.1 to decide whether the random effects associated with the intercept for each litter can be omitted from Model 3.1, using a likelihood ratio test, based on the following output generated by the xtmixed command:

Stata reports “chibar2(01),” indicating that it uses the correct null hypothesis distribution of the test statistic each with equal weight of 0.5 (see Subsection 3.5.1). The likelihood ratio test reported by the xtmixed command is an overall test of the covariance parameters associated with all random effects in the model. In models with a single random effect for each cluster, as in Model 3.1, it is appropriate to use this test to decide if that random effect should be included in the model. The significant result of this test (p < .001) suggests that the random litter effects should be retained in Model 3.1. The current version of the xtmixed command in Stata does not allow one to fit LMMs with residual variances that vary between different levels of a “group” variable in the data set (e.g., Models 3.2A, 3.2B, and 3.3). We therefore do not consider additional models in Stata.

10楼

oasises 发表于 2019-6-1 21:07:16 来自手机

ReneeBK 发表于 2014-4-4 01:29
Consider a longitudinaldataset, used by both Ruppert, Wand, and Carroll (2003) and Diggle etal. (200 ...

好的。

Stata多層面分析案例 [推广有奖]

经管之家送您一份

经管之家联合CDA

感谢您参与论坛问题回答

扫码加我拉你入群

相关帖子

浏览过的帖子

浏览过的版块

本版微信群

Stata多層面分析案例 [推广有奖]

经管之家送您一份

经管之家联合CDA

感谢您参与论坛问题回答

扫码加我 拉你入群

相关帖子

浏览过的帖子

浏览过的版块

本版微信群

扫码加我拉你入群