MISSING DATA
Conventional methods for missing data, like listwise deletion or regression imputation, are prone to three serious problems:
- Inefficient use of the available information, leading to low power and Type II errors.
- Biased estimates of standard errors, leading to incorrect p-values.
- Biased parameter estimates, due to failure to adjust for selectivity in missing data.
More accurate and reliable results can be obtained with maximum likelihood or multiple imputation.
These new methods for handling missing data have been around for at least a decade, but have only become practical in the last few years with the introduction of widely available and user-friendly software. Maximum likelihood and multiple imputation have very similar statistical properties. If the assumptions are met, they are approximately unbiased and efficient; that is, they have minimum sampling variance. What's remarkable is that these newer methods depend on less demanding assumptions than those required for conventional methods for handling missing data. At present, maximum likelihood is best suited for linear models or log-linear models for contingency tables. Multiple imputation, on the other hand, can be used for virtually any statistical problem.
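To make the bias problem listed above concrete, here is a minimal Python sketch, not taken from the course materials. It assumes a simulated dataset in which y is missing whenever an observed covariate x exceeds 0.5, then compares the mean of y on all cases with the mean after listwise deletion. The variable names, the cutoff, and the sample size are illustrative assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

n = 10_000
# Simulated data (an assumption for illustration): y = x + noise, so the true mean of y is 0.
x = [random.gauss(0, 1) for _ in range(n)]
y = [xi + random.gauss(0, 1) for xi in x]

# Selective missingness: y is unobserved whenever x > 0.5.
# Listwise deletion keeps only the cases where y is observed.
observed_y = [yi for xi, yi in zip(x, y) if xi <= 0.5]

full_mean = statistics.fmean(y)                    # what we would get with no missing data
complete_case_mean = statistics.fmean(observed_y)  # what listwise deletion gives

print(f"cases kept after listwise deletion: {len(observed_y)} of {n}")
print(f"mean of y, all cases:               {full_mean:.3f}")
print(f"mean of y, complete cases only:     {complete_case_mean:.3f}")
```

Because high-x cases (which tend to have high y) are dropped, the complete-case mean falls well below the full-data mean, and the smaller sample also wastes information.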
This course will cover the theory and practice of both maximum likelihood and multiple imputation. Maximum likelihood for linear models will be demonstrated with Amos 4, a software package designed for estimating structural equation models with latent variables. Multiple imputation will be demonstrated with two new SAS procedures, PROC MI and PROC MIANALYZE.
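As a rough illustration of the analysis step in multiple imputation, the sketch below pools estimates from several imputed datasets using the standard combining rules (Rubin's rules). It is written in Python rather than SAS and is not the course's code; the function name `pool_rubin` and the input numbers are hypothetical.

```python
import math
import statistics

def pool_rubin(estimates, std_errors):
    """Combine one parameter's estimates from m imputed datasets (Rubin's rules).

    Hypothetical helper for illustration: returns the pooled estimate
    and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors from at least 2 imputations")

    pooled = statistics.fmean(estimates)                   # average of the m estimates
    within = statistics.fmean([se ** 2 for se in std_errors])  # average sampling variance
    between = statistics.variance(estimates)               # variance across imputations (m - 1 divisor)
    total = within + (1 + 1 / m) * between                 # total variance of the pooled estimate
    return pooled, math.sqrt(total)

# Made-up regression coefficients and standard errors from 5 imputed datasets
coefs = [0.52, 0.47, 0.55, 0.49, 0.51]
ses = [0.10, 0.11, 0.10, 0.12, 0.10]
estimate, se = pool_rubin(coefs, ses)
print(f"pooled estimate = {estimate:.3f}, pooled SE = {se:.3f}")
```

The between-imputation term is what carries the extra uncertainty due to the missing values; it is the piece that single-imputation methods such as regression imputation leave out, which is why their standard errors tend to be too small.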
Materials
In addition to Professor Allison's text Missing Data, participants receive a bound manual containing detailed lecture notes (with equations and graphics), examples of computer printout, and many other useful features. This manual frees participants from the distracting task of note taking.
Course outline
1. Assumptions for missing data methods
2. Problems with conventional methods
3. Maximum likelihood (ML)
4. ML with EM algorithm
5. Direct ML with Amos
6. ML for contingency tables
7. Multiple Imputation (MI)
8. MI under multivariate normal model
9. MI with SAS
10. MI with categorical and nonnormal data
11. Interactions and nonlinearities
12. Using auxiliary variables
13. Other parametric approaches to MI
14. Linear hypotheses and likelihood ratio tests
15. Nonparametric and partially parametric methods
16. Sequential generalized regression models
17. MI and ML for nonignorable missing data
Comments by April 2005 Participants
Participants in the April 2005 seminar were asked to rate the course on a scale of 1 (worst) to 10 (best). The average score for 27 respondents was 9.2. They were also asked if they wished to make an attributed statement regarding the course. Here are all the comments that were received:
"This has been a great learning experience for me. Intensive, yet reasonably paced, it offered a balanced combination of theories of missing data adjustment and practical applications. For someone like me who has had little previous experience with missing data analysis, this is a good way to get started."
Anca Romantan, Annenberg School for Communication, University of Pennsylvania
"Wonderful course! Makes you realize what your data/analysis is 'missing'."
Faika Zanjani, University of Pennsylvania
"Dr. Allison explains things thoroughly and with enough detail that the student is able to use the material after the course. A large amount of material is carefully condensed and presented in such a way as to still be easily comprehended. The course has an amazing balance between theory and practice. The presentations are engaging."
Jim Godbold, Mount Sinai School of Medicine
"This is a great class. I would recommend it for anyone doing applied or simulation research with missing data."
Carolyn Furlow, Georgia State University
"Even for a novice researcher with no SAS experience, this course has been an invaluable review of conceptual and practical issues related to missing data. Clear, cogent and thorough."
Angela Duckworth, Positive Psychology Center, University of Pennsylvania
"This course is very helpful and Dr. Allison explains complicated contents very easily."
Sunhee Park, University of Pennsylvania School of Nursing
"Theoretically informed, but a very practical 'how-to-do' approach to very common problems. Readily applicable to 'real-world' situations."
Daniel K. Cooper, Harris Interactive
"Missing data is becoming a big issue in all industries, from telecommunications to bank/financial services. Professor Allison taught us how to tackle this problem with the most up-to-date methodologies (both theoretical and practical approaches)."
Shakuntala Choudhury, Senior Marketing Statistician