Original poster: ainur

HLM sample data

Original post
ainur posted on 2014-4-1 17:01:53

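What is the minimum sample size at each level when analyzing data with HLM? My model has two levels: the individual level and the team level.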

2
ReneeBK posted on 2014-4-1 22:31:42
Although there is not much you can do from a power perspective, there are some precautions you can take to ensure that the estimates are unbiased. For the variance components, using REML will provide unbiased estimates. A study by Browne and Draper (2006) in Bayesian Analysis attained unbiased variance component estimates with as few as 6 clusters using REML.
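
As a rough illustration of the REML point, the sketch below fits the same random-intercept model by ML and by REML to one simulated data set with 6 clusters. The data-generating values (10 observations per cluster, both variances equal to 1) and the helper name `tau2` are assumptions made for the example, not taken from Browne and Draper; `nlme` is used only because it ships with standard R installations.

```r
library(nlme)  # recommended package bundled with standard R installations

set.seed(1)
N <- 6    # number of clusters (assumed for illustration)
n <- 10   # observations per cluster (assumed for illustration)

# Simulate a balanced random-intercept data set: cluster variance 1, residual variance 1.
d <- data.frame(grp = gl(N, n))
d$y <- 1 + rnorm(N)[d$grp] + rnorm(N * n)

# Fit the same intercept-only model with ML and with REML.
fit_ml   <- lme(y ~ 1, random = ~ 1 | grp, data = d, method = "ML")
fit_reml <- lme(y ~ 1, random = ~ 1 | grp, data = d, method = "REML")

# Hypothetical helper: pull the estimated cluster-level variance out of a fit.
tau2 <- function(fit) as.numeric(VarCorr(fit)["(Intercept)", "Variance"])

c(ML = tau2(fit_ml), REML = tau2(fit_reml))
```

In a balanced intercept-only design like this one, the REML estimate of the cluster-level variance is larger than the ML estimate whenever the between-cluster mean square exceeds the within-cluster one, because REML accounts for the degree of freedom spent on the intercept. A single data set says nothing about bias, so a proper check would repeat this over many simulated data sets.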

For fixed effects, using a Kenward-Roger degrees-of-freedom adjustment has been shown to provide unbiased estimates with small sample sizes. There is an advance article in Methodology by Bell et al. that uses Kenward-Roger, and the estimates showed negligible bias with as few as 10 clusters. Kenward-Roger is available in SAS through the DDFM= option in the MODEL statement (DDFM=KR).

Another option is to use a Bayesian framework. A 2010 study in the International Journal of Biostatistics by Austin found that Bayesian estimates were unbiased with as few as about 7 clusters with only 10 observations in each cluster.

Hope that helps!

Dan McNeish

3
ReneeBK posted on 2014-4-1 22:33:12
If you look into it, you will find that multilevel models will not perform adequately with as few as 8 groups. There are debates about how many groups are enough, but the debate does not get anywhere near 8. It's not even a question of power, but of bias in the estimation.

Robert Brennan  

4
ReneeBK posted on 2014-4-1 22:34:30
There is a good chance of negative estimated variance components with n2=8. It is better to use survey-sensitive analysis software that corrects the variance estimates for clustering without trying to estimate components of variance at the same time.
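
To get a feel for how often a negative variance component can turn up with 8 groups, here is a minimal base-R sketch using the ANOVA (method-of-moments) estimator, (MSB - MSW) / n. The setup is assumed purely for illustration: 10 observations per group, a true group-level variance of 0.05 against a residual variance of 1, and 500 replications. The function name `anova_tau2` is hypothetical.

```r
set.seed(3)
G      <- 8     # number of groups, as in the post
n      <- 10    # observations per group (assumed)
tau2   <- 0.05  # true group-level variance (assumed)
sigma2 <- 1     # true residual variance (assumed)
nsims  <- 500

# Simulate one balanced data set and return the ANOVA estimate of the group variance.
anova_tau2 <- function() {
  grp <- gl(G, n)
  y   <- rnorm(G, sd = sqrt(tau2))[grp] + rnorm(G * n, sd = sqrt(sigma2))
  ms  <- anova(lm(y ~ grp))[["Mean Sq"]]  # c(between-group MS, within-group MS)
  (ms[1] - ms[2]) / n
}

est <- replicate(nsims, anova_tau2())

# Share of replications in which the estimated group variance is negative.
frac_negative <- mean(est < 0)
frac_negative
```

The share depends heavily on the assumed intraclass correlation: the larger the true group-level variance relative to the residual variance, the rarer negative estimates become. Software that constrains variances to be non-negative reports these cases as zero instead.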

Dave Judkins

5
ReneeBK posted on 2014-4-1 22:35:31
I'm not sure if we are referring to the same method, but if by clustered regression you mean design-based methods such as GEEs and sandwich estimators, those methods encounter similar difficulties, with downwardly biased standard errors at small sample sizes. There are some attempts to correct the bias (e.g. Pan and Wall 2002 or Morel, Bokossa, and Neerchal 2003), but from my understanding they are not any more effective than using Kenward-Roger, and they also require the assumption that the model is properly specified. Kenward-Roger also allows the variance components to be estimated, rather than only producing marginal estimates as is the case with GEEs.

Dan McNeish

6
ReneeBK posted on 2014-4-1 22:37:58
Clustered regression (which is also what Dave calls the survey-sensitive methods) is not advisable for a number of groups as small as 8.

For clustered regression, I reviewed the literature on the sandwich estimator, and this is summarized in Section 12.2 of the 2nd edition of Snijders & Bosker, "Multilevel Analysis" (Sage, 2012). The executive summary is that it is doubtful for small numbers of groups, such as fewer than 20 or 30. It certainly is not a panacea.

Best regards,

Tom Snijders

7
ReneeBK posted on 2014-4-1 22:39:13
First, here is a StataCorp FAQ on what they do for cluster-adjusted robust SEs: http://www.stata.com/support/faqs/statistics/references/

Secondly, especially regarding a small number of clusters, I recommend the following article: Cameron, AC and Miller, DL (2011), "Robust inference with clustered data," in A Ullah and DE Giles, _Handbook of Empirical Economics and Finance_, CRC Press, pp. 1-28. Their basic answer is that cluster-adjusted SEs should not be used with a small number of clusters; they suggest a variation on the clustered bootstrap, for which there is some, but not full, code available online.

Richard Goldstein  

8
ReneeBK posted on 2014-4-1 22:42:26
I think Bob's and Dave's suggestions are good (treat characteristics of the sites as fixed effects, and don't try to estimate level 2 variance components). You could at least account for some of the dependency within sites by treating site as a clustering variable in Mplus (not as a second level of a multilevel model) using Type = Complex, and using MLR to obtain robust standard errors. You could also do this with the regress command in Stata by using vce(cluster site) to obtain robust standard errors that account for within-site dependency. In both programs, you can also use this approach for non-normally distributed dependent variables, e.g. binary outcomes examined with logistic regression.
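
For readers working in R rather than Stata or Mplus, the calculation behind cluster-robust standard errors can be sketched by hand in base R. The simulated data (8 sites, 20 observations each) and all variable names are assumptions for illustration. The small-sample factor G/(G-1) * (N-1)/(N-K) is the one commonly described for Stata's vce(cluster); it is given here as my understanding and has not been checked against Stata output.

```r
set.seed(2)
G    <- 8                      # number of sites (assumed)
n    <- 20                     # observations per site (assumed)
site <- rep(1:G, each = n)

# Simulated data: a site-level predictor plus a shared site effect.
x <- rnorm(G)[site]
y <- 1 + x + rnorm(G)[site] + rnorm(G * n)

# Ordinary single-level regression that ignores the clustering.
fit <- lm(y ~ x)
X   <- model.matrix(fit)
u   <- resid(fit)

# Sandwich estimator: bread = (X'X)^-1, meat = sum over sites of s_g s_g',
# where s_g is the sum of the score contributions x_i * u_i within site g.
bread  <- solve(crossprod(X))
scores <- rowsum(X * u, site)  # G x K matrix, one row per site
meat   <- crossprod(scores)

# Small-sample correction (assumed to match the usual Stata convention).
adj <- G / (G - 1) * (nrow(X) - 1) / (nrow(X) - ncol(X))
V   <- adj * bread %*% meat %*% bread

se_cluster <- sqrt(diag(V))
se_naive   <- sqrt(diag(vcov(fit)))
rbind(naive = se_naive, cluster = se_cluster)
```

With only 8 sites these standard errors can still be too small, as other replies in this thread point out, so the comparison with the naive values is best read as a lower bound on the adjustment rather than a fix.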

Bruce Cooper

9
ReneeBK posted on 2014-4-1 22:44:05
The references I had cited were methods to obtain unbiased estimates for all estimates. From studies by Maas and Hox 2004 and 2005, fixed effect standard errors and level two variance components are the biggest worry with small samples. REML can address the level two variance components and Kenward-Roger can address the fixed effects standard error bias.

Also, Bethany Bell and colleagues presented a paper at the M3 conference at UConn in 2011 that compared an MLM with KR and REML at small sample sizes to a single-level model with clusters as fixed effects, and found that the MLM estimates were much closer to the true values with 10 clusters in their simulation study. I don't believe this study has been published yet, however.

Dan McNeish

10
ReneeBK posted on 2014-4-1 22:46:22
I was interested in this discussion, and wondered about results from an extreme case -- say just five clusters, with six observations in each. The results of a quick simulation, and the R code for conducting the simulation, are below. The first line of output gives the means across 1000 simulations (which take just a couple of minutes to run), and the second line the standard deviations. The columns are the intercept, the coefficients of x1 and x2, the group-level variance, and the residual variance. The means should all be 1s.

Maybe people will find this useful. The biases don't seem large to me, but there are lots of other issues to take into account. I haven't checked whether the estimated SEs are anti- (or over-)conservative, for instance, though that wouldn't be hard to do.
=================================================================
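library(lme4)  # the original post loaded lme4.0, a legacy build of lme4; results may differ slightly
nsims <- 1000
set.seed(080813)

# Generate N groups of n observations; x1 varies within groups, x2 only between groups.
dgp <- function(N = 5, n = 6) {
  within(data.frame(grp = gl(N, n),
                    x1  = rnorm(N * n),
                    x2  = runif(N)[rep(1:N, each = n)]),
         y <- 1 + x1 + x2 + rnorm(N)[grp] + rnorm(N * n))
}

# Extract the fixed effects, the group-level variance, and the residual variance.
c.mer <- function(mod) {
  c(fixef(mod), unlist(lapply(VarCorr(mod), diag)), attr(VarCorr(mod), "sc")^2)
}

# Run one simulation; the first argument (the replication index from sapply) is ignored.
sim <- function(i, N = 5, n = 6) c.mer(lmer(y ~ x1 + x2 + (1 | grp), dgp(N, n)))

# Run nsims simulations; row 1 of the result holds the means, row 2 the SDs.
apply(sapply(1:nsims, sim), 1, function(xx) c(mean(xx), sd(xx)))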

[1,]    1.050036 1.0013884 0.9049677       1.0322423 1.0104995
[2,]    1.429368 0.2163859 2.7632334       0.9928147 0.2887274
=================================================================


Malcolm Fairbrother  

