This contribution is from David Corliss. David teaches a class on this subject, giving a (very brief) description of 23 regression methods in just an hour, with an example and the package and procedures used for each case. Here you can check the webcast done for Central Michigan University. The slide deck can be found here. Below is the presentation transcript.

If you know some other types of regression, you can list them in the comment section below. For instance, I would add piecewise linear regression, as well as regression on unusual domains (on a sphere or on the simplex). For more on regression, click here.

Presentation transcript

1 Speed Dating with Regression Procedures. David J Corliss, PhD, Wayne State University, Physics and Astronomy / Public Outreach
2 Model Selection Flowchart: NON-LINEAR, LINEAR, MIXED, NON-PARAMETRIC
3 Decision: Continuous or Discrete Outcome. PROC LOGISTIC, PROC REG
4 Simple Linear Regression. Regression Type: Continuous, linear. General regression procedure with a number of options but limited specialized capabilities, for which other procedures or packages have been developed. Offers a choice of variable selection methods (e.g., Forward, Backwards, Best Subsets), can be coded for polynomial regression, supports multiple model statements and features interactive capability. SAS = REG, R = lm function, regress
5 Simple Linear Regression. Example: Homeless Students by State. Solid performance of the model across the range from low to high homelessness states indicates consistency of factors correlated with the number
of homeless students. r² = 0.652. Axes: Actual Percent; Model - Percent of Student Population
6 Special Data Needs: Problems with Outliers. Robust Regression. Regression Type: Continuous, linear. Robust regression is achieved by identifying outliers, limiting their influence by assigning weights and then performing standard regression. Choice of methods for outlier detection, e.g. M, LTS, S and MM estimation; robust ANOVA. SAS = ROBUSTREG, R = robustbase, robust
7 PROC ROBUSTREG. Example: Log-Log Regression With Weighted Outliers. SAS/STAT® 9.2 User’s Guide, support.sas.com. In Robust Regression, the outliers need not be disregarded: weights can be assigned and incorporated in the regression
8 Special Data Needs: Ill-Conditioned Data. Regression Using Givens Rotations. Regression Type: Continuous, linear. Regression using the Gentleman-Givens procedure instead of collecting crossproducts. For ill-conditioned data, where small errors in the data may cause large errors in the results; more accurate than simple regression. SAS = ORTHOREG, R = givens
9 Givens Rotation Regression. Example: Fitting a Higher-Order Polynomial. SAS/STAT® 9.2 User’s Guide, support.sas.com. An example of fitting a 9th-degree polynomial, where near singularities must be distinguished from true ones
10 Special Data Needs: Transformation. Regression with
Data Transformation. Regression Type: Continuous, linear. Regression with a number of data transformations, including smooth, spline, Box-Cox and other non-linear forms. Supports fitting splines with a user-specified degree and number of knots; capable of piecewise solutions. SAS = TRANSREG, R = reg, betareg
11 Regression with Data Transformation. Example: Spline Regression to a Complex Form. Splines used to fit a spectrographic line profile to determine the radial velocity of erupting gas from a star
12 Special Model Types: General Linear. General Linear Models. Regression Type: Continuous, linear. General-purpose procedure for continuous least squares regression using classification as well as continuous predictor variables. While capable of many types of models and analysis, another procedure is often better for a specific task. SAS = GLM, R = glm function
13 General Linear Model. Example: Age Group as a Categorical Predictor Variable. GLM used with box-and-whisker output. An Overview of ODS Statistical Graphics in SAS® 9.3, Robert N.
Rodriguez, SAS Institute Inc., Cary, NC. Plot: Distribution of Response by agegroup
14 Special Model Types: By Quantile. Quantile Regression. Regression Type: Continuous, linear. While other procedures model the mean, quantile regression models the median and other specified quantiles to provide a more complete picture of the response variable. Uncertainties for individual quantiles can be estimated by bootstrapping. SAS = QUANTREG, R = quantreg
15 Quantile Regression. Example: 5/10/25/50/75/90/95% Quantiles. An example of Quantile Regression demonstrating greater detail than possible with ordinary regression. Quantile regression with PROC QUANTREG, Peter L.
Flom, Peter Flom Consulting, New York, NY. Plot: Predicted birth weight by maternal weight gain
16 Special Model Types: PLS, PCA Regression. Partial Least Squares & Principal Components. Regression Type: Continuous, linear. Partial Least Squares and Principal Component regression: predictor and response variables are projected into a new coordinate system, possibly with reduced complexity. Supports reduced rank regression with cross validation of the number of components. SAS = PLS, R = pls
17 Partial Least Squares / Principal Components. Example: Variable Importance Plot. Principal Component variables derived from the original, observed variables. Quantile regression with PROC QUANTREG, Peter L.
Flom, Peter Flom Consulting, New York, NY
18 Special Model Types: Survey Data. Survey Regression. Regression Type: Continuous, linear. Special capabilities for analysis in the presence of common survey data features, including stratification, clustering and weighting. Supports several methods for sampling and estimation of sampling error using either Taylor series or primary sample units. SAS = SURVEYREG, R = survey
19 Survey Regression. Example: Regression with Stratified Sampling. Example output from application to survey data, with summary statistics and model parameters. PROC SURVEYREG, sas.support.com, example 98.4.
Stratum Information. Columns: Stratum Index, State, Region, N Obs, Population Total, Sampling Rate. Rows include stratum 1 (Iowa) and stratum 4 (Nebraska).
Tests of Model Effects. Columns: Effect, Num DF, F Value, Pr > F. Effects: Model, Intercept, FarmArea. Note: The denominator degrees of freedom for the F tests is 14.
Estimated Regression Coefficients. Columns: Parameter, Estimate, Standard Error, t Value, Pr > |t|. Parameters: Intercept, FarmArea.
Covariance of Estimated Regression Coefficients. Rows and columns: Intercept, FarmArea.
20 Special Model Types: PH on Survey Data. Proportional Hazards with Survey Data. Regression Type: Continuous, linear. Performs Cox Proportional Hazards modeling on survey data with truncation, supporting stratification, clustering and weighting. Estimates the variance of model parameters by Taylor series, BRR or jackknife. SAS = SURVEYPHREG, R = survey
21 Proportional Hazards with Survey Data. Example: Stratified Sampling with Truncated Data. Example output for Proportional Hazards regression on survey data with truncation: summary statistics and model parameters. PROC SURVEYPHREG, sas.support.com, example 97.2.
Analysis of Maximum Likelihood Estimates. Columns: Parameter, DF, Estimate, Standard Error, t Value, Pr > |t|, Hazard Ratio. Parameters: BodyWeight, Smoke (four rows).
Type III Tests of Model Effects. Columns: Effect, Num DF, Den DF, F Value, Pr > F. Effects: BodyWeight, Smoke.
Estimate. Columns: Label, Estimate, Standard Error, DF, t Value, Pr > |t|, Exponentiated. Label: Row.
22 Special Model Types: Categorical. Regression on Categorical Data. Regression Type: Continuous, linear. A generalization of continuous methods to categorical data; performs linear regression and other analyses on data that can be expressed in contingency tables. Supports both ordinary and
logistic regression, log-linear and repeated measures. SAS = CATMOD, R = catdata, vgam
23 Regression on Categorical Data. Example: Bartlett's Data, No 3-Variable Interaction. Example output from regression on categorical data, with summary statistics and model parameters. PROC CATMOD, sas.support.com, example 28.4.
Data Summary. Response: Length*Time*Status; Response Levels: 8; Weight Variable: wt; Populations: 1; Data Set: BARTLETT; Total Frequency: 960; Frequency Missing: 0; Observations: 8.
Response Profiles. Columns: Response, Length, Time, Status.
Maximum Likelihood Analysis of Variance. Columns: Source, DF, Chi-Square, Pr > ChiSq. Sources: Length; Time; Length*Time; Status (1, 48.94, <.0001); Length*Status (1, 48.94, <.0001); Time*Status (1, 95.01, <.0001); Likelihood Ratio.
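
As a rough illustration of the robust regression idea on slide 6 (identify outliers, limit their influence by assigning weights, then perform a standard regression), here is a minimal Python sketch. It is not taken from the presentation, which uses SAS and R. The choice of Huber weights (one form of M estimation), the tuning constant 1.345, the MAD-style scale estimate, the helper names and the toy data are all assumptions made for the example.

```python
from statistics import median


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b


def huber_line_fit(x, y, c=1.345, iterations=50):
    """Iteratively reweighted least squares with Huber weights (hypothetical helper)."""
    w = [1.0] * len(x)
    a, b = weighted_line_fit(x, y, w)  # start from the ordinary fit
    for _ in range(iterations):
        resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for normal errors.
        scale = median([abs(r) for r in resid]) / 0.6745
        if scale == 0:
            break
        cutoff = c * scale
        # Full weight inside the cutoff, shrinking weight for large residuals.
        w = [1.0 if abs(r) <= cutoff else cutoff / abs(r) for r in resid]
        a, b = weighted_line_fit(x, y, w)
    return a, b, w


# Made-up data: roughly y = 2 + 3x, with the last point replaced by an outlier.
x = list(range(10))
y = [2 + 3 * xi + (0.5 if xi % 2 == 0 else -0.5) for xi in x]
y[-1] = 100.0

a_ols, b_ols = weighted_line_fit(x, y, [1.0] * len(x))
a_rob, b_rob, w = huber_line_fit(x, y)
print("ordinary slope:", round(b_ols, 3))
print("robust slope:", round(b_rob, 3))
print("weight given to the outlier:", round(w[-1], 3))
```

The outlier is kept in the fit but given a small weight, which matches the remark on slide 7 that outliers need not be disregarded. In practice one would use the procedures named on the slides (SAS ROBUSTREG or the R robustbase package) rather than a hand-written loop.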