Original poster: hanszhu

[Academic Frontiers] [Recommended] Introduction to Generalized Linear Models

<P>Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the Normal distribution, such as the Poisson, Binomial, and Multinomial. Generalized Linear Models also relax the requirement of equal (constant) variances that hypothesis tests in traditional linear models impose.</P>
<P><B>The General Linear Univariate Model (GLUM)</B></P>
<P>Most parametric statistical analyses can be viewed as a process of fitting a linear model to the observed data and testing hypotheses about the fitted model’s parameters. Even the lowly <I>t</I>-test is a form of the General Linear Univariate Model (GLUM). The Analysis of Variance (ANOVA), Regression, Multiple Regression, and the Analysis of Covariance (ANCOVA) are more complicated forms of the GLUM.</P>
<P>The least squares criterion is used to obtain estimates of the parameters of these GLUM models. Additional assumptions must be met in order to test hypotheses about the model’s parameters. Besides independence of the observations, which underlies most standard statistical analyses, hypothesis tests derived from GLUMs require normality of the response variable (conditional on the predictors) and constancy, or homogeneity, of variances.</P>
<P><B>The General Linear Multivariate Model (GLMM)</B></P>
<P>When attempting to explain variation in more than one response variable simultaneously, the modeling exercise is to fit the General Linear Multivariate Model (GLMM) to the data. Commonly used multivariate statistical procedures such as Multivariate Analysis of Variance (MANOVA), Multivariate Analysis of Covariance (MANCOVA), Discriminant Function Analysis (DFA), Canonical Correlation Analysis (CCA), and Principal Components Analysis (PCA) are all forms of the GLMM. To perform hypothesis tests in the context of the GLMM, one must assume that the response variables are multivariate normal and that the variance-covariance matrices are homogeneous.</P>
<P>When the distribution of the response variable(s) is not normal or multivariate normal, or if the variances or the variance-covariance matrices are not homogeneous, then applying hypothesis tests to GLUMs or GLMMs can lead to Type I and Type II error rates that differ from the nominal rates. Traditionally, transformations of the scale of the response variables have been applied to ensure that the assumptions required for hypothesis tests are met. For example, count data are often Poisson distributed and tend to be right skewed. Furthermore, the variance of a Poisson random variable is equal to its mean. Hence, for count data a transformation must both normalize the data and eliminate the inherent variance heterogeneity. Count data are commonly transformed to a logarithmic or square-root scale; however, such transformations are not always successful in achieving the desired end. In fact, there is no a priori reason to believe that a scale exists that will ensure that data meet both the normality and variance homogeneity assumptions.</P>
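<P>The mean-variance relationship described above can be checked numerically. The following is a minimal Python sketch, not part of the original text: it sums the Poisson probability mass function directly to get the variance of a count and of its square root. The helper names <I>poisson_pmf</I> and <I>moments</I> and the truncation point <I>kmax</I> are illustrative assumptions.</P>

```python
import math

def poisson_pmf(k, mu):
    """Probability that a Poisson(mu) variable equals k (computed on the log scale)."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))

def moments(mu, transform=lambda y: y, kmax=400):
    """Mean and variance of transform(Y) for Y ~ Poisson(mu).

    The sum is truncated at kmax, an assumed cutoff that is far into the
    tail for the means used below.
    """
    probs = [poisson_pmf(k, mu) for k in range(kmax)]
    mean = sum(p * transform(k) for k, p in enumerate(probs))
    var = sum(p * (transform(k) - mean) ** 2 for k, p in enumerate(probs))
    return mean, var

for mu in (2.0, 10.0, 50.0):
    _, raw_var = moments(mu)                 # variance on the original scale
    _, sqrt_var = moments(mu, math.sqrt)     # variance after a square-root transform
    print(f"mu={mu:5.1f}  Var(Y)={raw_var:7.3f}  Var(sqrt(Y))={sqrt_var:.3f}")
```

<P>On the original scale the variance tracks the mean. The square-root scale brings the variance close to 1/4 when the mean is large, but less so for small means, which is one reason a transformation may not fully remove the heterogeneity.</P>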
<P><B>General<I>izing</I> the Linear Model</B></P>
<P>The Generalized Linear Model is an extension of the General Linear Model to include response variables that follow any probability distribution in the exponential family of distributions. The exponential family includes such useful distributions as the Normal, Binomial, Poisson, Multinomial, Gamma, Negative Binomial (for a fixed dispersion parameter), and others. Hypothesis tests applied to the Generalized Linear Model do not require normality of the response variable, nor do they require homogeneity of variances. Hence, Generalized Linear Models can be used when response variables follow distributions other than the Normal distribution, and when variances are not constant. For example, count data would be appropriately analyzed as a Poisson random variable within the context of the Generalized Linear Model.</P>
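<P>For readers who want the formula behind "exponential family", one common way to write it is the form used in McCullagh and Nelder (1989). The symbols θ (canonical parameter), φ (dispersion parameter), and the functions a, b, c are notation introduced here for illustration; they do not appear in the original text.</P>

```latex
f(y;\theta,\phi) = \exp\!\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y,\phi) \right\},
\qquad
\mathrm{E}(Y) = b'(\theta),
\qquad
\mathrm{Var}(Y) = b''(\theta)\, a(\phi).

\text{Poisson}(\mu):\quad \theta = \log\mu,\; b(\theta) = e^{\theta},\; a(\phi) = 1
\;\Longrightarrow\; \mathrm{E}(Y) = \mathrm{Var}(Y) = \mu .

\text{Normal}(\mu,\sigma^{2}):\quad \theta = \mu,\; b(\theta) = \tfrac{1}{2}\theta^{2},\; a(\phi) = \sigma^{2}
\;\Longrightarrow\; \mathrm{Var}(Y) = \sigma^{2} .
```

<P>Because the variance is b''(θ)a(φ), it is allowed to change with the mean, which is why constant variance is not required.</P>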
<P>Parameter estimates are obtained using the principle of maximum likelihood; therefore, hypothesis tests are based on comparisons of the likelihoods or the deviances of nested models.</P>
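<P>As a rough illustration of maximum likelihood fitting and a deviance comparison of nested models, here is a minimal pure-Python sketch of a Poisson regression with one predictor, fitted by iteratively reweighted least squares. The data are invented for illustration, and the function name <I>fit_poisson</I> and its arguments are hypothetical; statistical packages use more robust implementations.</P>

```python
import math

def fit_poisson(x, y, use_slope=True, iterations=50):
    """Fit log(mu) = b0 + b1*x by iteratively reweighted least squares.

    With use_slope=False only the intercept is fitted (the nested null model).
    Returns (b0, b1, deviance).
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Working response for the log link: z = eta + (y - mu) / mu, with weights mu.
        z = [b0 + b1 * xi + (yi - mi) / mi for xi, yi, mi in zip(x, y, mu)]
        sw = sum(mu)
        swz = sum(mi * zi for mi, zi in zip(mu, z))
        if not use_slope:
            b0 = swz / sw
            continue
        swx = sum(mi * xi for mi, xi in zip(mu, x))
        swxx = sum(mi * xi * xi for mi, xi in zip(mu, x))
        swxz = sum(mi * xi * zi for mi, xi, zi in zip(mu, x, z))
        # Solve the 2x2 weighted normal equations.
        det = sw * swxx - swx * swx
        b0, b1 = (swxx * swz - swx * swxz) / det, (sw * swxz - swx * swz) / det
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    deviance = 2.0 * sum(
        (yi * math.log(yi / mi) if yi > 0 else 0.0) - (yi - mi)
        for yi, mi in zip(y, mu)
    )
    return b0, b1, deviance

# Invented example data: counts that rise with x.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 2, 4, 6, 7, 12, 16]

b0, b1, dev_full = fit_poisson(x, y)
_, _, dev_null = fit_poisson(x, y, use_slope=False)

# The drop in deviance is the likelihood-ratio statistic (1 degree of freedom here).
lr = dev_null - dev_full
p_value = math.erfc(math.sqrt(lr / 2.0))   # upper tail of chi-square with 1 df
print(f"b0={b0:.3f}  b1={b1:.3f}  LR={lr:.2f}  p={p_value:.2g}")
```

<P>The test compares the model with a slope against the intercept-only model; a large drop in deviance relative to a chi-square distribution suggests that the slope is needed.</P>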
<H3><B>Applications</B></H3>
<P>Several forms of the Generalized Linear Model are now commonly used and implemented in many statistical software packages. <a href="http://userwww.sfsu.edu/~efc/classes/biol710/logistic/logisticreg.htm" target="_blank" >Logistic Regression</a>, Multiway Frequency Analysis (<a href="http://userwww.sfsu.edu/~efc/classes/biol710/loglinear/Log%20Linear%20Models.htm" target="_blank" >Log-Linear Models</a>), Logit Models, and Poisson Regression are all forms of the Generalized Linear Model. In Logistic Regression, the binary response variable is modeled as a Binomial random variable with the logit link function. For Multiway Frequency Analysis (Log-Linear Models), the response variable is usually modeled as a Poisson random variable with the log link function. One could instead assume that the response variable is Binomial or Multinomial, but the results would not differ from those obtained by assuming the response variable to be Poisson distributed (Agresti 1996). For logit models, binary response variables are modeled as Binomial random variables, while polychotomous response variables are modeled as Multinomial random variables; in both instances the link function is the logit function. In Poisson regression, the response variable is modeled as a Poisson random variable with the log link function.</P>
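<P>To make the two link functions named above concrete, the short Python sketch below maps the same linear predictor through the inverse logit (Binomial mean) and the inverse log (Poisson mean). The coefficients <I>b0</I> and <I>b1</I> are hypothetical values chosen only for illustration.</P>

```python
import math

def logit(p):
    """Logit link: maps a probability in (0, 1) to the whole real line."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Inverse logit: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients, not estimated from any data.
b0, b1 = -2.0, 0.5

for x in (0, 2, 4, 6):
    eta = b0 + b1 * x          # linear predictor
    p = inv_logit(eta)         # Binomial mean under the logit link
    rate = math.exp(eta)       # Poisson mean under the log link
    print(f"x={x}  eta={eta:+.1f}  P(y=1)={p:.3f}  Poisson mean={rate:.3f}")

# Under the logit link, exp(b1) is the odds ratio for a one-unit change in x.
odds_ratio = odds(inv_logit(b0 + b1)) / odds(inv_logit(b0))
print(f"odds ratio per unit x = {odds_ratio:.4f}, exp(b1) = {math.exp(b1):.4f}")
```

<P>Under the log link the same quantity, exp(b1), is instead a ratio of expected counts.</P>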
<H2>Software</H2>
<P>GLZs can be fit and evaluated using S-PLUS, SAS, SPSS, and a number of other statistical packages. Of the major packages, S-PLUS and SAS provide greater flexibility in fitting and evaluating GLZs.</P>
<P><B>References</B></P>
<P>Agresti, A. 1996. An Introduction to Categorical Data Analysis. John Wiley & Sons: New York. (A very readable introduction to the many forms of the generalized linear model.)</P>
<P>McCullagh, P. and J.A. Nelder. 1989. Generalized Linear Models. Chapman and Hall: London. (The mathematical statistics of the generalized linear model.)</P>
<P><B>Ecological Applications of Generalized Linear Models</B></P>
<P>Vincent, P.J. and J.M. Haworth. 1983. Poisson regression models of species abundance. Journal of Biogeography 10: 153-160.</P>
<P>Connor, E.F., E. Hosfield, D. Meeter, and X. Niu. 1997. Tests for aggregation and size-based sample-unit selection when sample units vary in size. Ecology 78: 1238-1249.</P>

[This post was edited by the author on 2005-2-28 0:05:00]


2
xiaozhangz posted on 2005-2-28 00:02:00
Thank you so much... this is just what I needed...

3
hanszhu posted on 2005-2-28 00:06:00

4
hanszhu posted on 2005-2-28 00:06:00
Generalized Linear Models in MATLAB
Programmed by Peter Dunn; Current Version: glmlab version 2.3.1 dated 11 July 1999

  • About glmlab: A basic overview and introduction for new users of glmlab; what glmlab can do.
  • The Features of glmlab: What it can be used for and its features, including screen shots and details of menu items.
  • Downloading glmlab: How to get the latest official version from MathWorks, and unofficial versions locally.
  • On-Line Manual: Links to the on-line manual, which includes a section on installing glmlab.
  • Cost: More details about using glmlab for free.
  • Comments about glmlab: What other users have said about glmlab.
  • Links to Other Useful Pages: A couple of links to other statistics and GLM pages.
  • MATLAB Conference Paper: A conference paper about glmlab presented at the 1997 MATLAB Conference in Sydney.
  • Giving Feedback: How to supply feedback to the author.
  • Frequently Asked Questions: Find answers to common questions regarding glmlab.

5
hanszhu posted on 2005-2-28 00:07:00

Generalized Linear Models

Dr. Joseph Hilbe

Aim of the Course: This course will explain the theory of generalized linear models (GLM), outline the algorithms used for GLM estimation, and explain how to determine which algorithm to use for a given data analysis.

Generalized linear modeling is a unified method that extends the general linear model, or ordinary least squares (OLS) regression, to responses other than normal. The response distributions in GLMs are all members of the exponential family, which allows the modeling of responses, or dependent variables, that take the form of counts, proportions, dichotomies (1/0), and positive continuous values, as well as values that follow the normal (Gaussian) distribution. Logistic, Poisson, and negative binomial regression models are three of the most noteworthy members of the GLM family.

The course will detail the basic theory of GLM and will schematically outline the various algorithms that have been used in GLM estimation. Explanations will involve determining which algorithms and models are optimal for a given data analysis as well as how to best interpret parameter estimates, standard errors, p-values, scale/dispersion values, and fit statistics.

Each type of GLM will be addressed, with separate discussion sections given to continuous response and to discrete response data situations. Particular emphasis will be given to goodness-of-fit, residual analysis, and adjustments of standard errors for discrete response models when there is excess correlation in the data, a problem known as overdispersion.
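One widely used standard-error adjustment for overdispersion (not necessarily the one taught in this course) scales the standard errors by the square root of the Pearson dispersion statistic. The Python sketch below shows the arithmetic for a Poisson model. The counts, fitted means, and unadjusted standard error are hypothetical numbers made up for illustration, and the function name pearson_dispersion is an assumption.

```python
import math

def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square divided by residual degrees of freedom.

    For a Poisson model the variance function is V(mu) = mu, so each term is
    (y - mu)^2 / mu. Values well above 1 suggest overdispersion.
    """
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)

# Hypothetical observed counts and hypothetical fitted means from a
# two-parameter Poisson regression (not the result of a real fit).
y = [0, 3, 1, 9, 2, 14, 4, 20]
mu = [2.0, 2.6, 3.4, 4.5, 5.9, 7.7, 10.1, 13.2]

phi = pearson_dispersion(y, mu, n_params=2)

# Hypothetical unadjusted standard error of a coefficient.
naive_se = 0.12
adjusted_se = naive_se * math.sqrt(phi)
print(f"dispersion = {phi:.2f}, adjusted SE = {adjusted_se:.3f}")
```

When the dispersion estimate is near 1 the adjustment changes little; when it is well above 1 the unadjusted standard errors are too small.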

The course concludes by discussing how the basic GLM algorithm can be adjusted for certain data situations that do not follow explicit GLM model assumptions; e.g. truncated, censored, and zero-inflated models.

Who Should Take This Course: Analysts in any field who need to move beyond standard multiple linear regression models for modeling their data.

Instructor: Dr. Joseph Hilbe, Professor Emeritus, University of Hawaii, and Adjunct Professor of sociology and statistics, Arizona State University. Dr. Hilbe has lectured worldwide on the topic of generalized linear models, has written extensively in the area, and wrote the first GLM command for the Stata statistical package in 1992 and a well-used negative binomial macro for SAS in 1993. He is the co-author (with James Hardin) of Generalized Linear Models and Extensions and Generalized Estimating Equations. Dr. Hilbe is currently software reviews editor for The American Statistician and is on the editorial board of the international journals Health Services Outcomes and Research Methodology and the Journal of Modern Applied Statistical Methods. He was also the founding editor of the Stata Technical Bulletin (1991), was a biostatistical consultant to the Health Care Financing Administration (HCFA), and was lead biostatistician for both NRMI-2 and FASTRAK, the U.S. and Canadian national cardiovascular registries respectively.

Prerequisite: Participants should be familiar with basic probability and statistics, including multiple linear regression. Basic Concepts in Probability and Statistics and Introduction to Statistics: Design and Analysis at statistics.com cover introductory statistics, including a brief treatment of linear regression. For a more complete coverage of regression, see Introduction to Regression.

Organization of the Course: The course takes place over the Internet, at statistics.com. Course participants will be given an alias and access to a private bulletin board that serves as a forum for discussion of ideas, problem solving, and interaction with the instructor. The course is scheduled to take place over four weeks, and should require about 10 hours per week. At the beginning of each week, participants receive the relevant material, in addition to answers to exercises from the previous session. During the week, participants are expected to go over the course materials and work through exercises. Discussion among participants is encouraged. The instructor will provide answers and comments.

Course Requirements: James Hardin and Joseph Hilbe (2001), Generalized Linear Models and Extensions (not included in course price). PLEASE ORDER YOUR COPY IN TIME FOR THE COURSE STARTING DATE. In some lessons, you will benefit from being able to implement models in a software program that can fit GLMs (for example, Stata, SAS, S-PLUS, R).

Course Program: The course is structured as follows:

SESSION 1: General overview of GLM
  • Derivation of GLM functions
  • GLM algorithms: OIM, EIM
  • Fit and residual statistics
SESSION 2: Continuous response models
  • Gaussian
  • Log-normal
  • Gamma
  • Log-gamma models for survival analysis
  • Inverse Gaussian
SESSION 3: Discrete response models
  • Binomial models: logit, probit, cloglog, loglog, others
  • Count models: Poisson, negative binomial, geometric
SESSION 4: Extending the model
  • Dealing with overdispersion
  • Truncated, censored, and zero-inflated models

[This post was edited by the author on 2005-2-28 0:10:53]

6
hanszhu posted on 2005-2-28 00:12:00
Generalized linear models (Littell, chapter 10)
This site supports tutorial instruction on linear models, based on Littell, et al. (2002) SAS for Linear Models. The materials in this site are appropriate for someone who has a reasonable command of basic linear regression. In a basic regression course, it is usually assumed that we are interested in modeling effects based on independent and identically distributed observations (e.g. a single cross section with simple random sampling). In the materials in this site, we expand the application of the linear model to the analysis of data arising from more complex design and sampling scenarios (e.g. experimental and quasi-experimental designs, cases with nested or clustered samples). The site is provided by Robert Hanneman, in the Department of Sociology at the University of California, Riverside. Your comments and suggestions are welcome, as is your use of any materials you find here.
This page is parallel to the organization of Littell's chapter:

7
hanszhu posted on 2005-2-28 00:13:00

An Introduction to Generalized Linear Models, 2nd Edition, by Annette J. Dobson

This is one of the books available for loan from Academic Technology Services (see Statistics Books for Loan for other such books, and details about borrowing). See Where to buy books for tips on different places you can buy these books.

Stata        SAS        Chapter Title
Chapter 1    Chap 1     Introduction
Chapter 2    Chap 2     Model Fitting
Chapter 3    Chap 3     Exponential Family and Generalized Linear Models
Chapter 4    Chap 4     Estimation
Chapter 5    Chap 5     Inference
Chapter 6    Chap 6     Normal Linear Models
Chapter 7    Chap 7     Binary Variables and Logistic Regression
Chapter 8    Chap 8     Nominal and Ordinal Logistic Regression
Chapter 9    Chap 9     Count Data, Poisson Regression and Log-Linear Models
Chapter 10   Chap 10    Survival Analysis
Chapter 11   Chap 11    Clustered and Longitudinal Data

[This post was edited by the author on 2005-2-28 0:14:56]

8
hanszhu posted on 2005-2-28 00:16:00

Software: Generalized Linear Models

Genstat and GLIM

  • Genstat. VSN International.
  • GLIM. NAG, Oxford.
  • GARMA Generalized Autoregressive Moving Average Models. Mikis Stasinopoulos, University of North London.
  • MADAM: Mean and Dispersion Additive Models. Mikis Stasinopoulos, University of North London.

LispStat

Matlab

  • glmlab. Interactive modules in MATLAB for fitting GLMs, by Pete Dunn, University of Southern Queensland.
  • StatBox. Statistical toolbox including binomial and Poisson generalized linear models. Gordon Smyth, Walter and Eliza Hall Institute of Medical Research.

R

SAS

  • Negative Binomial Regression. Uses the GENMOD procedure to fit a Log Negative Binomial Regression model, estimating the heterogeneity parameter.
  • Generalized Linear Models. Examples and SAS code for a reading course on generalized linear models by Robert Hanneman, University of California, Riverside.

SciLab

  • GLMBOX. An implementation of generalized linear models for the SciLab environment.

S-Plus

  • Generalized Linear Models in S. Notes on S-Plus by Bill Venables. Erik Moledor, Duke University.
  • Statlib S Archive. Contains a number of entries applicable to generalized linear models.
  • StatMod Library. A library of S-Plus functions for statistical modelling, including generalized linear models. Gordon Smyth, Walter and Eliza Hall Institute of Medical Research.
  • Digamma Family. The gamma deviance family. Gordon Smyth, Walter and Eliza Hall Institute of Medical Research.
  • Double generalized linear models. Simultaneously model the mean and the dispersion in generalized linear models. Gordon Smyth, Walter and Eliza Hall Institute of Medical Research.
  • Negative Binomial Family. An S-Plus function by Bill Venables. Gordon Smyth, Walter and Eliza Hall Institute of Medical Research.
  • Randomized quantile residuals. An improvement on deviance and Pearson residuals, especially when the response takes on a relatively small number of distinct values. Gordon Smyth, Walter and Eliza Hall Institute of Medical Research.
  • Tweedie Family. Generalized linear models with any power variance function and any Box-Cox family link. Gordon Smyth, Walter and Eliza Hall Institute of Medical Research.

Stata

  • Stata Handouts for WWS509. Handouts in ps or pdf on using Stata for a graduate course at Princeton. Includes linear models, logit models, Poisson regression, contingency tables, multinomial responses, ordered response models and survival models. Germán Rodríguez, Princeton University.

XploRe

Reviews

  • Hilbe, J. M. (1994). Generalized linear models. American Statistician, 48(3), 255-265. Reviews software for generalized linear models including Genstat, SAS, GLIM, S-Plus and others.

[This post was edited by the author on 2005-2-28 0:17:56]

9
hanszhu posted on 2005-2-28 00:27:00

Logistic Regression and Generalized Linear Models

York Summer Programme in Data Analysis

June 2004

John Fox

Course Materials


[This post was edited by the author on 2005-2-28 0:33:23]

10
hanszhu posted on 2005-2-28 00:35:00
