Brief Contents

PART I: INTRODUCTION 1
Chapter 1 Discriminant Analysis in Research 3
Chapter 2 Preliminaries 15

PART II: ONE-FACTOR MANOVA/DDA 33
Chapter 3 Group Separation 35
Chapter 4 Assessing MANOVA Effects 61
Chapter 5 Describing MANOVA Effects 81
Chapter 6 Deleting and Ordering Variables 103
Chapter 7 Reporting DDA Results 117

PART III: COMPLEX MANOVA 129
Chapter 8 Factorial MANOVA 131
Chapter 9 Analysis of Covariance 163
Chapter 10 Repeated-Measures Analysis 193
Chapter 11 Mixed-Model Analysis 227

PART IV: GROUP-MEMBERSHIP PREDICTION 253
Chapter 12 Classification Basics 255
Chapter 13 Multivariate Normal Rules 269
Chapter 14 Classification Results 285
Chapter 15 Hit Rate Estimation 295
Chapter 16 Effectiveness of Classification Rules 315
Chapter 17 Deleting and Ordering Predictors 335
Chapter 18 Two-Group Classification 349
Chapter 19 Nonnormal Rules 361
Chapter 20 Reporting PDA Results 375
Chapter 21 PDA-Related Analyses 385

PART V: ISSUES AND PROBLEMS 391
Chapter 22 Issues in PDA and DDA 393
Chapter 23 Problems in PDA and DDA 401

Contents

List of Figures xix
List of Tables xxi
Preface to Second Edition xxv
Acknowledgments xxvii
Preface to First Edition xxix
Notation xxxi

I INTRODUCTION 1

1 Discriminant Analysis in Research 3
1.1 A Little History, 3
1.2 Overview, 5
1.3 Descriptive Discriminant Analysis, 5
1.4 Predictive Discriminant Analysis, 7
1.5 Design in Discriminant Analysis, 9
1.5.1 Grouping Variables, 9
1.5.2 Response Variables, 9
Exercises, 13

2 Preliminaries 15
2.1 Introduction, 15
2.2 Research Context, 15
2.3 Data, Analysis Units, Variables, and Constructs, 16
2.4 Summarizing Data, 18
2.5 Matrix Operations, 21
2.5.1 SSCP Matrix, 22
2.5.2 Determinant, 23
2.5.3 Inverse, 24
2.5.4 Eigenanalysis, 25
2.6 Distance, 26
2.7 Linear Composite, 28
2.8 Probability, 28
2.9 Statistical Testing, 29
2.10 Judgment in Data Analysis, 30
2.11 Summary, 31
Further Reading, 31
Exercises, 32

II ONE-FACTOR MANOVA/DDA 33

3 Group Separation 35
3.1 Introduction, 35
3.2 Two-Group Analyses, 35
3.2.1 Univariate Analysis, 35
3.2.2 Multivariate Analysis, 39
3.3 Test for Covariance Matrix Equality, 41
3.4 Yao Test, 43
3.5 Multiple-Group Analyses—Single Factor, 44
3.5.1 Univariate Analysis, 44
3.5.2 Multivariate Analysis, 47
3.6 Computer Application, 52
3.7 Summary, 56
Exercises, 57

4 Assessing MANOVA Effects 61
4.1 Introduction, 61
4.2 Strength of Association, 62
4.2.1 Univariate, 62
4.2.2 Multivariate, 62
4.2.3 Bias, 65
4.3 Computer Application I, 66
4.4 Group Contrasts, 67
4.4.1 Univariate, 67
4.4.2 Multivariate, 68
4.5 Computer Application II, 72
4.6 Covariance Matrix Heterogeneity, 74
4.7 Sample Size, 74
4.8 Summary, 75
Technical Notes, 76
Exercises, 77

5 Describing MANOVA Effects 81
5.1 Introduction, 81
5.2 Omnibus Effects, 82
5.2.1 An Eigenanalysis, 82
5.2.2 Linear Discriminant Functions, 83
5.3 Computer Application I, 85
5.4 Standardized LDF Weights, 87
5.5 LDF Space Dimension, 88
5.5.1 Statistical Tests, 89
5.5.2 Proportion of Variance, 91
5.5.3 LDF Plots, 91
5.6 Computer Application II, 93
5.7 Computer Application III, 94
5.8 Contrast Effects, 96
5.9 Computer Application IV, 96
5.10 Summary, 98
Technical Note, 99
Further Reading, 100
Exercises, 100

6 Deleting and Ordering Variables 103
6.1 Introduction, 103
6.2 Variable Deletion, 103
6.2.1 Purposes of Deletion, 103
6.2.2 McCabe Analysis, 104
6.2.3 Computer Application, 105
6.3 Variable Ordering, 106
6.3.1 Meaning of Importance, 106
6.3.2 Computer Application I, 108
6.3.3 Variable Ranking, 110
6.4 Contrast Analyses, 110
6.5 Computer Application II, 111
6.6 Comments, 113
Further Reading, 114
Exercises, 115

7 Reporting DDA Results 117
7.1 Introduction, 117
7.2 Example of Reporting DDA Results, 117
7.3 Computer Package Information, 122
7.4 Reporting Terms, 123
7.5 MANOVA/DDA Applications, 124
7.6 Concerns, 124
7.7 Overview, 126
Further Reading, 127
Exercises, 127

III FACTORIAL MANOVA, MANCOVA, AND REPEATED MEASURES 129

8 Factorial MANOVA 131
8.1 Introduction, 131
8.2 Research Context, 131
8.3 Univariate Analysis, 134
8.4 Multivariate Analysis, 136
8.4.1 Omnibus Tests, 136
8.4.2 Distribution Assumptions, 138
8.5 Computer Application I, 139
8.6 Computer Application II, 146
8.7 Nonorthogonal Design, 150
8.8 Outcome Variable Ordering and Deletion, 151
8.9 Summary, 152
Technical Notes, 152
Exercises, 159

9 Analysis of Covariance 163
9.1 Introduction, 163
9.2 Research Context, 164
9.3 Univariate ANCOVA, 166
9.3.1 Testing for Equality of Regression Slopes, 166
9.3.2 Omnibus Test of Adjusted Means, 168
9.4 Multivariate ANCOVA (MANCOVA), 170
9.4.1 Matrix Calculations, 170
9.4.2 Testing for Equal Slopes, 171
9.5 Computer Application I, 173
9.6 Comparing Adjusted Means—Omnibus Test, 174
9.7 Computer Application II, 175
9.8 Contrast Analysis, 180
9.9 Computer Application III, 180
9.10 Summary, 184
Technical Note, 184
Exercises, 190

10 Repeated-Measures Analysis 193
10.1 Introduction, 193
10.2 Research Context, 195
10.3 Univariate Analyses, 196
10.3.1 Omnibus Test, 196
10.3.2 Contrast Analysis, 197
10.4 Multivariate Analysis, 199
10.5 Computer Application I, 202
10.6 Univariate and Multivariate Analyses, 204
10.7 Testing for Sphericity, 207
10.8 Computer Application II, 210
10.9 Contrast Analysis, 212
10.10 Computer Application III, 214
10.11 Summary, 216
Technical Notes, 217
Exercises, 223

11 Mixed-Model Analysis 227
11.1 Introduction, 227
11.2 Research Context, 228
11.3 Univariate Analysis, 229
11.4 Multivariate Analysis, 231
11.4.1 Group-by-Time Interaction, 232
11.4.2 Repeated-Measures Variable Main Effect, 235
11.5 Computer Application I, 237
11.6 Contrast Analysis, 240
11.7 Computer Application II, 243
11.8 Summary, 246
Technical Note, 247
Exercises, 249

IV GROUP MEMBERSHIP PREDICTION 253

12 Classification Basics 255
12.1 Introduction, 255
12.2 Notion of Distance, 256
12.3 Distance and Classification, 259
12.4 Classification Rules in General, 260
12.4.1 Maximum Likelihood, 260
12.4.2 Typicality Probability, 261
12.4.3 Posterior Probability, 262
12.4.4 Prior Probability, 263
12.5 Comments, 264
Technical Note, 265
Further Reading, 265
Exercises, 266

13 Multivariate Normal Rules 269
13.1 Introduction, 269
13.2 Normal Density Functions, 269
13.3 Classification Rules Based on Normality, 271
13.4 Classification Functions, 273
13.4.1 Quadratic Functions, 273
13.4.2 Linear Functions, 274
13.4.3 Distance-Based Classification, 275
13.5 Summary of Classification Statistics, 277
13.6 Choice of Rule Form, 278
13.6.1 Normal-Based Rule, 278
13.6.2 Covariance Matrix Equality, 279
13.6.3 Rule Choice, 280
13.6.4 Priors, 281
13.7 Comments, 281
Technical Notes, 283
Further Reading, 283
Exercises, 284

14 Classification Results 285
14.1 Introduction, 285
14.2 Research Context, 285
14.3 Computer Application, 286
14.4 Individual Unit Results, 287
14.4.1 In-Doubt Units, 288
14.4.2 Outliers, 289
14.5 Group Results, 290
14.6 Comments, 291
Technical Note, 291
Exercises, 292

15 Hit Rate Estimation 295
15.1 Introduction, 295
15.2 True Hit Rates, 296
15.3 Hit Rate Estimators, 297
15.3.1 Formula Estimators, 297
15.3.2 Internal Analysis, 299
15.3.3 External Analysis, 300
15.3.4 Maximum-Posterior-Probability Method, 302
15.4 Computer Application, 304
15.5 Choice of Hit Rate Estimator, 306
15.6 Outliers and In-Doubt Units, 306
15.6.1 Outliers, 307
15.6.2 In-Doubt Units, 307
15.7 Sample Size, 309
15.8 Comments, 310
Further Reading, 311
Exercises, 312

16 Effectiveness of Classification Rules 315
16.1 Introduction, 315
16.2 Proportional Chance Criterion, 316
16.2.1 Definition, 316
16.2.2 Statistical Test, 317
16.3 Maximum-Chance Criterion, 319
16.4 Improvement over Chance, 320
16.5 Comparison of Rules, 320
16.6 Computer Application I, 321
16.7 Effect of Unequal Priors, 323
16.8 PDA Validity/Reliability, 325
16.9 Applying a Classification Rule to New Units, 325
16.9.1 Computer Application II, 326
16.9.2 Computer Application III, 327
16.10 Comments, 330
Technical Notes, 330
Further Reading, 331
Exercises, 332

17 Deleting and Ordering Predictors 335
17.1 Introduction, 335
17.2 Predictor Deletion, 336
17.2.1 Purposes of Deletion, 336
17.2.2 Deletion Methods, 336
17.2.3 Package Analyses, 337
17.2.4 All Possible Subsets, 337
17.3 Computer Application, 337
17.4 Predictor Ordering, 340
17.4.1 Meaning of Importance, 340
17.4.2 Variable Ranking, 340
17.5 Reanalysis, 343
17.6 Comments, 343
17.7 Side Note, 345
Further Reading, 346
Exercises, 347

18 Two-Group Classification 349
18.1 Introduction, 349
18.2 Two-Group Rule, 349
18.3 Regression Analogy, 351
18.4 MRA–PDA Relationship, 353
18.5 Necessary Sample Size, 355
18.6 Univariate Classification, 356
Further Reading, 357
Exercises, 359

19 Nonnormal Rules 361
19.1 Introduction, 361
19.2 Continuous Variables, 362
19.2.1 Rank Transformation Analysis, 362
19.2.2 Nearest-Neighbor Analyses, 363
19.2.3 Another Density Estimation Analysis, 366
19.2.4 Other Analyses, 366
19.3 Categorical Variables, 366
19.3.1 Direct Probability Estimation Analysis, 367
19.3.2 Dummy Variable Analysis, 367
19.3.3 Overall–Woodward Analysis, 368
19.3.4 Fisher–Lancaster Analysis, 368
19.3.5 Other Analyses, 369
19.4 Predictor Mixtures, 369
19.5 Comments, 370
Further Reading, 371
Exercises, 373

20 Reporting PDA Results 375
20.1 Introduction, 375
20.2 Example of Reporting PDA Results, 375
20.3 Some Additional Specific PDA Information, 378
20.4 Computer Package Information, 379
20.5 Reporting Terms, 379
20.6 Sources of PDA Applications, 381
20.7 Concerns, 381
20.8 Overview, 382
Further Reading, 383
Exercises, 383

21 PDA-Related Analyses 385
21.1 Introduction, 385
21.2 Nonlinear Methods, 385
21.2.1 Classification and Regression Trees (CART), 385
21.2.2 Logistic Regression, 385
21.2.3 Neural Networks, 386
21.3 Other Methods, 386
21.3.1 Cluster Analysis, 386
21.3.2 Image Analysis, 387
21.3.3 Optimal Allocation, 387
21.3.4 Pattern Recognition, 387
Further Reading, 388

V ISSUES AND PROBLEMS 391

22 Issues in PDA and DDA 393
22.1 Introduction, 393
22.2 Five Choices in PDA, 393
22.2.1 Linear Versus Quadratic Rules, 393
22.2.2 Nonnormal Classification Rules, 394
22.2.3 Prior Probabilities, 394
22.2.4 Misclassification Costs, 394
22.2.5 Hit-Rate Estimation, 395
22.3 Stepwise Analyses, 395
22.4 Standardized Weights Versus Structure r’s, 396
22.5 Data-Based Structure, 398
Further Reading, 400

23 Problems in PDA and DDA 401
23.1 Introduction, 401
23.2 Missing Data, 401
23.2.1 Data Inspection, 401
23.2.2 Data Imputation, 402
23.2.3 Missing G Values, 404
23.2.4 Ad Hoc Strategy, 404
23.3 Outliers and Influential Observations, 405
23.3.1 Outlier Identification, 405
23.3.2 Influential Observations, 406
23.4 Initial Group Misclassification, 406
23.5 Misclassification Costs, 407
23.6 Statistical Versus Clinical Prediction, 407
23.7 Other Problems, 409
Further Reading, 409

Appendix A Data Set Descriptions 411
Appendix B Some DA-Related Originators 415
Appendix C List of Computer Syntax 419
Appendix D Contents of Wiley Website 421
References 425
Answers to Exercises 449
Index 481

List of Figures

1.1 Classification of multivariate methods. 6
2.1 Distance in a plane. 27
5.1 LDF plot of group centroids for Baumann study. 92
5.2 Plot of group centroids in LDF space. 95
6.1 Plot of Wilks values versus best subset size for the 3-group Ethington data. 107
7.1 LDF plot of group centroids. 121
7.2 MANOVA and descriptive discriminant analysis. 126
8.1 LDF plot for the three school levels. 145
9.1 Two dimensional plot of adjusted group centroids. 179
12.1 Distance in a plane. 257
12.2 Graphical representations of two density functions. 261
17.1 Total group L-O-O hit rate versus best-subset size for the 3-group Ethington data. 339
20.1 Predictive discriminant analysis. 382
List of Tables

2.1 Scores on the Error Detection Task (Y1) and Degrees of Reading Power (Y2) for the Think Aloud (TA) and Directed Reading Activity (DRA) Groups 16
2.2 Mean, Sum-of-Squares, and Variance for Test Scores on the Error Detection Task (Y1) and Degrees of Reading Power (Y2) for the Think Aloud (TA) and Directed Reading Activity (DRA) Groups (n1 = n2 = 22) 19
3.1 Scores on the Error Detection Task (Y1) and Degrees of Reading Power (Y2) for the Think Aloud (TA), Directed Reading Activity (DRA), and Directed Reading and Think Aloud (DRTA) Groups 45
3.2 Means and Variances for Test Scores on the Error Detection Task (Y1) and Degrees of Reading Power (Y2) for the Think Aloud (TA), Directed Reading Activity (DRA), and Directed Reading and Think Aloud (DRTA) Groups 45
3.3 Summary of Four MANOVA Test Statistics 56
4.1 Five Multivariate Effect Size Indices 65
5.1 Summary of Dimensionality Tests 90
6.1 Partial McCabe Output for the 3-Group Ethington Data 106
6.2 Results Used to Order Outcome Variables for the 3-Group Ethington Data 110
7.1 Descriptive Information for the 3-Group Ethington Data 119
7.2 Variable Ordering for the 3-Group Ethington Data 120
7.3 Test of Dimensionality for the 3-Group Ethington Data 120
7.4 LDFs at Group Centroids 120
7.5 Structure r’s for the 3-Group Ethington Data 121
7.6 Structure r’s for Group 1 versus Group 3 122
7.7 DDA Printout Information 123
7.8 DDA versus PDA; Context: J Groups of Units, p Response Variables 125
8.1 Test Scores on Four Measures of Stress for Three School Levels and Two Levels of Gender 132
8.2 Means and Standard Deviations for Four Measures of Stress from Three School Levels and Two Levels of Gender 133
8.3 Univariate Sum-of-Squares 134
8.4 ANOVA Summary for Variable Y1 135
8.5 Summary of Omnibus Univariate Results for Outcome Variables Y2, Y3, and Y4 135
8.6 Summary of Pairwise Contrasts Among School Levels with Bonferroni Adjusted P Values for Outcome Variables Y2, Y3, and Y4 136
8.7 (i) Values for the Two-Factor (3 × 3) Ethington Data 151
9.1 Vocabulary Scores for Three Treatment Interventions and a Control Group 165
9.2 Means and Variances for Morphemic Only (MO), Context Only (CO), Morphemic and Context (MC), and Control (C) Groups on Five Vocabulary Tests 166
9.3 SSCP and E Matrices 167
9.4 Analysis of Covariance Summary Table 169
9.5 Adjusted Means for the Four Participating Groups 169
9.6 Sum-of-Squares and Cross-Products for Grand-Mean Centered (Total), Each of the Group-Mean Centered (MO, CO, MC, and C), and the Error Matrices 172
10.1 Scores from the Rosenberg Self-Esteem Inventory 196
10.2 ANOVA Summary Table for Changes in Self-Esteem Over the Second and Third Trimesters 197
10.3 Coefficients for Linear to Quintic Polynomial Trend Analysis for Six Measurements 198
10.4 Orthonormal Polynomial Coefficients 205
11.1 Self-Esteem Scores for Pregnant and Nonpregnant Women 229
11.2 Self-Esteem Means and (Standard Deviations) by Group and Month 229
11.3 Formulas for the Univariate Sum-of-Squares for the Mixed-Model Analysis of Variance 230
11.4 Univariate Analysis of Variance Summary Table for the Mixed Model 230
11.5 Separate Group and Summed Error SSCP Matrices 233
13.1 LCFs for the 3-Group Ethington Data 275
13.2 Classification Statistics 277
13.3 Alternative Forms of Classification Statistics 278
14.1 Some Unit Classification Results for the 3-Group Ethington Data 288
14.2 Classification Table for J = 3 290
14.3 Classification Table for the 3-Group Ethington Data 291
15.1 PDA and MCA/MRA Indices 297
15.2 Leave-One-Out Results for the 3-Group Ethington Data 305
15.3 Hit Rate Estimates for the 3-Group Ethington Data 305
15.4 SAS Linear L-O-O Results for the 3-Group Ethington Data with THRESHOLD = .45 308
15.5 Threshold Classification Rates for Group 2 of the 3-Group Ethington Data 309
15.6 Smallest Group Sizes for a PDA 310
16.1 Classification Table Notation 316
16.2 Linear L-O-O Results for the 3-Group Ethington Data 318
16.3 Hypothetical Classification Table 319
16.4 Comparison of Rules 321
16.5 Summary of Linear and Quadratic L-O-O Classification Results for the 3-Group Ethington Data 323
16.6 Linear L-O-O Results for the 3-Group Ethington Data Using Equal Priors 324
16.7 Summary of Total-Group Linear L-O-O Results for the 3-Group Ethington Data 324
16.8 Scores on Nine Predictor Variables for Five Hypothetical New Students 326
16.9 Classification Results for New Students 328
16.10 Classification Results for New Students Using a Quadratic Rule 329
17.1 Total-Group L-O-O Hit Rates for Variable Subsets from the 3-Group Ethington Data 339
17.2 Linear L-O-O Group 2 Hit Rates, Transformed Hit Rates, and Predictor Ranks for the 3-Group Ethington Data 341
18.1 Regression Classification Results for Groups 1 and 2 of the 3-Group Ethington Data 353
18.2 Minimum Sample Size, n(= n1 = n2), in Each Group Required for P to be Within γ of P(o) 356
19.1 Linear L-O-O Rank-Based PDA Results for the 3-Group Ethington Data 363
19.2 L-O-O Linear 3-NN Results for the 3-Group Ethington Data 365
19.3 Category Weights for X3 and X4 in the HSB Data 370
19.4 Suggested Ways of Handling Nonnormal Predictors 371
20.1 Linear L-O-O Group Classification Results 377
20.2 Classification Rule Weights (and Constants) 377
20.3 DA Printout Information 379
A.1 Variables Selected from the CCSEQ 412
A.2 Cell Sizes for the Race-by-Grade Design 412
A.3 Categorical Response Variables 412