Original poster: xuehe

What software can be used for the negative binomial (NB), Poisson, and ordered probit (OP) models?

Has anyone done this? Could you briefly explain how?

2
hanszhu posted on 2007-7-27 03:49:00

1. Introduction

An event count is the realization of a nonnegative integer-valued random variable (Cameron and Trivedi 1998). Examples are the number of car accidents per month, thunderstorms per year, and wildfires per year. Applying the ordinary least squares (OLS) method to event count data results in biased, inefficient, and inconsistent estimates (Long 1997). Thus, researchers have developed various nonlinear models that are based on the Poisson distribution and the negative binomial distribution.

1.1 Count Data Regression Models

The left-hand side (LHS) of the equation has event count data. Independent variables are, as in the OLS, located at the right-hand side (RHS). These RHS variables may be interval, ratio, or binary (dummy). Table 1 below summarizes the categorical dependent variable regression models (CDVMs) according to the level of measurement of the dependent variable.

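Table 1. Comparison between OLS and CDVMs

| Model | | Dependent (LHS) | Method | Independent (RHS) |
|---|---|---|---|---|
| OLS | Ordinary least squares | Interval or ratio scale | Moment-based method | A linear function of interval/ratio or binary independent variables |
| CDVMs | Binary response | Binary (0 or 1) | Maximum likelihood method | A linear function of interval/ratio or binary independent variables |
| CDVMs | Ordinal response | Ordinal (1st, 2nd, ...) | Maximum likelihood method | A linear function of interval/ratio or binary independent variables |
| CDVMs | Nominal response | Nominal (A, B, ...) | Maximum likelihood method | A linear function of interval/ratio or binary independent variables |
| CDVMs | Event count data | Count (0, 1, 2, ...) | Maximum likelihood method | A linear function of interval/ratio or binary independent variables |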

The Poisson regression model (PRM) and negative binomial regression model (NBRM) are basic models for count data analysis. Either the zero-inflated Poisson (ZIP) or the zero-inflated negative binomial regression model (ZINB) is used when there are many zero counts. Other count models are developed to handle censored, truncated, or sample selected count data. This document, however, focuses on PRM, NBRM, ZIP, and ZINB.

1.2 Poisson Models versus Negative Binomial Models

The Poisson probability distribution, P(y|mu)=exp(-mu)*mu^y/y!, has the same mean and variance (equidispersion), Var(y)=E(y)=mu. As the mean of a Poisson distribution increases, the probability of zeros decreases and the distribution approximates a normal distribution (Figure 1). The Poisson distribution also has the strong assumption that events are independent. Thus, this distribution does not fit well if the rate mu differs across observations (heterogeneity) (Long 1997).
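As a quick numerical check of equidispersion and of the shrinking share of zeros, here is a minimal Python sketch using only the standard library. The function names are illustrative and not part of any package mentioned in this thread; the means are the ones listed for Figure 1.

```python
import math

def poisson_pmf(y, mu):
    """P(y | mu) = exp(-mu) * mu**y / y! for a nonnegative integer y."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def poisson_moments(mu, max_y=100):
    """Mean and variance obtained by summing over the support 0..max_y.

    max_y=100 truncates a negligible tail for the small means used here.
    """
    probs = [poisson_pmf(y, mu) for y in range(max_y + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

for mu in (0.5, 1, 2, 5):  # the means listed for Figure 1
    mean, var = poisson_moments(mu)
    print(f"mu={mu}: P(y=0)={poisson_pmf(0, mu):.4f}, mean={mean:.4f}, var={var:.4f}")
```

The printed mean and variance should agree with each other for every mu, while P(y=0)=exp(-mu) falls as mu grows.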

The Poisson regression model (PRM) incorporates observed heterogeneity into the Poisson distribution function by letting the mean depend on the independent variables, Var(y|x)=E(y|x)=mu=exp(xb). As mu increases, the conditional variance of y increases, the proportion of predicted zeros decreases, and the distribution around the expected value becomes approximately normal (Long 1997). The conditional mean of the errors is zero, but their variance is a function of the independent variables, Var(y|x)=exp(xb), so the errors are heteroscedastic. In practice the PRM rarely fits well, because the conditional variance of real count data usually exceeds the conditional mean (overdispersion) (Long 1997; Maddala 1983).
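To make the exponential mean function concrete, the following Python sketch evaluates mu=exp(xb) for one observation. The coefficient and covariate values are made up for illustration and do not come from any fitted model in this thread.

```python
import math

def conditional_mean(x, b):
    """E(y | x) = exp(xb); x and b are equal-length lists, x[0] = 1 for the constant."""
    return math.exp(sum(xj * bj for xj, bj in zip(x, b)))

b = [0.2, 0.5, -0.3]   # hypothetical coefficients: constant, x1, x2
x = [1.0, 2.0, 1.0]    # hypothetical observation

mu = conditional_mean(x, b)
p_zero = math.exp(-mu)        # predicted probability of a zero count
factor = math.exp(b[1])       # multiplicative change in mu per unit increase in x1

print(f"expected count = {mu:.4f}")
print(f"predicted P(y=0) = {p_zero:.4f}")
print(f"factor change for x1 = {factor:.4f}")
```

Because the mean is exponential in xb, a one-unit change in a covariate multiplies the expected count by exp(b) rather than adding a constant to it.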

Figure 1. Poisson Probability Distribution with Means of .5, 1, 2, and 5

The negative binomial probability distribution is P(y|x) = [Gamma(y+v) / (y! Gamma(v))] * (v/(v+mu))^v * (mu/(v+mu))^y, where 1/v=alpha determines the degree of dispersion and Gamma() is the gamma function. As the dispersion parameter alpha increases, the variance of the negative binomial distribution also increases, Var(y|x)=mu(1+mu/v)=mu(1+alpha*mu).
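The variance expression can be recovered from the standard Poisson-gamma mixture argument. The sketch below assumes a multiplicative heterogeneity term, written delta here (a symbol introduced only for this illustration), that follows a gamma distribution with mean 1 and variance 1/v.

```latex
\begin{aligned}
y \mid \mu, \delta &\sim \text{Poisson}(\mu\delta), \qquad
\delta \sim \text{Gamma}(v, v), \quad E(\delta) = 1, \quad \operatorname{Var}(\delta) = 1/v = \alpha \\
E(y \mid x) &= E_{\delta}[\mu\delta] = \mu \\
\operatorname{Var}(y \mid x)
  &= E_{\delta}\big[\operatorname{Var}(y \mid \mu, \delta)\big]
   + \operatorname{Var}_{\delta}\big[E(y \mid \mu, \delta)\big] \\
  &= \mu + \mu^{2}/v = \mu\,(1 + \mu/v)
\end{aligned}
```

The mean is unchanged by the mixing, while the variance gains the extra term mu^2/v, which vanishes as v grows (alpha goes to zero).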

The negative binomial regression model (NBRM) incorporates observed and unobserved heterogeneity into the conditional mean, mu=exp(xb+e) (Long 1997). Thus, the conditional variance of y becomes larger than its conditional mean, E(y|x)=mu, which remains unchanged. Figure 2 illustrates how the probabilities for small and larger counts increase in the negative binomial distribution as the conditional variance of y increases, given mu=2.

Figure 2. Negative Binomial Probability Distribution with Alpha of .01, .5, 1, and 5

The PRM and NBRM, however, have the same mean structure. If alpha=0, the NBRM reduces to the PRM (Cameron and Trivedi 1998; Long 1997).
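The relationship between the two distributions can be checked numerically. This is a minimal standard-library sketch, assuming the usual NB2 parameterization with mean mu and dispersion alpha=1/v; the function names are illustrative.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability exp(-mu) * mu**y / y!."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def negbin_pmf(y, mu, alpha):
    """Negative binomial probability with mean mu and dispersion alpha = 1/v.

    Computed on the log scale with lgamma to avoid overflow.
    """
    v = 1.0 / alpha
    log_p = (math.lgamma(y + v) - math.lgamma(v) - math.lgamma(y + 1)
             + v * math.log(v / (v + mu)) + y * math.log(mu / (v + mu)))
    return math.exp(log_p)

def negbin_variance(mu, alpha, max_y=2000):
    """Variance by direct summation over 0..max_y (tail beyond max_y is negligible here)."""
    probs = [negbin_pmf(y, mu, alpha) for y in range(max_y + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    return sum((y - mean) ** 2 * p for y, p in enumerate(probs))

mu = 2  # the mean used for Figure 2
for alpha in (0.01, 0.5, 1, 5):  # the alpha values listed for Figure 2
    print(f"alpha={alpha}: P(y=0)={negbin_pmf(0, mu, alpha):.4f}, "
          f"var={negbin_variance(mu, alpha):.4f}, formula={mu * (1 + alpha * mu):.4f}")

# With alpha close to zero the NB probabilities approach the Poisson ones.
print(negbin_pmf(3, mu, 1e-6), poisson_pmf(3, mu))
```

The summed variance should match mu(1+alpha*mu), and the probability of a zero rises with alpha even though the mean stays at 2.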

1.3 Overdispersion

When Var(y|x) > E(y|x), we are said to have overdispersion. Estimates of a PRM for overdispersed data are consistent but inefficient, with standard errors biased downward (Cameron and Trivedi 1998; Long 1997). The likelihood ratio test for overdispersion compares the PRM with the NBRM under the null hypothesis of alpha=0. The LR statistic follows the Chi-squared distribution with one degree of freedom. If the null hypothesis is rejected, NBRM is preferred to PRM.
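The test itself is simple arithmetic once both models have been fitted. The sketch below assumes you already have the two maximized log-likelihoods from your software; the numbers used are hypothetical. It follows the text in using the plain chi-squared(1) reference distribution. Because alpha=0 lies on the boundary of the parameter space, some packages (Stata's nbreg output labels the test chibar2(01)) report half of this p-value.

```python
import math

def lr_test_alpha_zero(loglik_poisson, loglik_negbin):
    """LR statistic and chi-squared(1) p-value for H0: alpha = 0.

    LR = 2 * (lnL_NBRM - lnL_PRM). For one degree of freedom,
    P(chi2 > x) = erfc(sqrt(x / 2)).
    """
    lr = max(2.0 * (loglik_negbin - loglik_poisson), 0.0)
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value

# Hypothetical log-likelihoods, for illustration only.
lr, p = lr_test_alpha_zero(loglik_poisson=-1350.2, loglik_negbin=-1290.7)
print(f"LR = {lr:.2f}, p = {p:.3g}")
print("prefer NBRM" if p < 0.05 else "no evidence against PRM")
```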

Zero-inflated models handle overdispersion by changing the mean structure to explicitly model the production of zero counts (Long 1997). These models assume two latent groups. One is the always-zero group and the other is not-always-zero or sometime-zero group (Long 1997). Thus, zero counts come from the former group and some of the latter group with a certain probability.
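The two-group idea can be written as a mixture. In the sketch below, phi denotes the probability of belonging to the always-zero group; the symbol and the function names are assumptions made for this illustration.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability exp(-mu) * mu**y / y!."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def zip_pmf(y, mu, phi):
    """Zero-inflated Poisson probability.

    With probability phi the observation is in the always-zero group;
    otherwise the count is Poisson(mu), which can also produce a zero.
    """
    from_count_process = (1.0 - phi) * poisson_pmf(y, mu)
    return phi + from_count_process if y == 0 else from_count_process

def zip_moments(mu, phi, max_y=100):
    """Mean and variance by direct summation over 0..max_y."""
    probs = [zip_pmf(y, mu, phi) for y in range(max_y + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

mu, phi = 2.0, 0.3  # hypothetical values
mean, var = zip_moments(mu, phi)
print(f"P(y=0): Poisson={poisson_pmf(0, mu):.4f}, ZIP={zip_pmf(0, mu, phi):.4f}")
print(f"ZIP mean={mean:.4f}, variance={var:.4f}")
```

The mean works out to (1-phi)*mu and the variance to (1-phi)*mu*(1+phi*mu), so the variance exceeds the mean whenever phi is positive. That is how the extra zeros account for overdispersion.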

A likelihood ratio test of the null hypothesis alpha=0 compares the ZIP and ZINB, since the ZIP is nested in the ZINB. The PRM and ZIP, and likewise the NBRM and ZINB, cannot be compared by a likelihood ratio test because neither pair is nested. Vuong's statistic V compares these non-nested models. If V is greater than 1.96, the ZIP or ZINB is favored. If V is less than -1.96, the PRM or NBRM is preferred (Long 1997).
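For readers who want to see what the Vuong statistic computes, here is a minimal sketch. It uses V = sqrt(N) * mean(m) / sd(m), where m_i is the log of the ratio of the two models' predicted probabilities for observation i. The counts and the fixed parameter values are hypothetical (in practice each model's probabilities come from its fitted parameters), and the population standard deviation is used here as an assumption, since implementations differ on that detail.

```python
import math
import statistics

def poisson_pmf(y, mu):
    """Poisson probability exp(-mu) * mu**y / y!."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def zip_pmf(y, mu, phi):
    """Zero-inflated Poisson probability with always-zero share phi."""
    base = (1.0 - phi) * poisson_pmf(y, mu)
    return phi + base if y == 0 else base

def vuong_statistic(logp_model1, logp_model2):
    """V = sqrt(N) * mean(m) / sd(m), with m_i = ln P1(y_i) - ln P2(y_i).

    Large positive V favors model 1; large negative V favors model 2.
    """
    m = [a - b for a, b in zip(logp_model1, logp_model2)]
    return math.sqrt(len(m)) * statistics.mean(m) / statistics.pstdev(m)

# Hypothetical counts with many zeros; parameters are fixed by hand, not estimated.
counts = [0, 0, 0, 0, 0, 0, 1, 2, 2, 3, 3, 4, 5, 0, 0, 2]
logp_zip = [math.log(zip_pmf(y, 2.5, 0.45)) for y in counts]
logp_prm = [math.log(poisson_pmf(y, 1.375)) for y in counts]

v = vuong_statistic(logp_zip, logp_prm)
if v > 1.96:
    verdict = "ZIP favored"
elif v < -1.96:
    verdict = "PRM favored"
else:
    verdict = "neither model favored"
print(f"V = {v:.3f}: {verdict}")
```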

1.4 Estimation in SAS, STATA, and LIMDEP

The SAS GENMOD procedure estimates Poisson and negative binomial regression models. STATA has individual commands (e.g., .poisson and .nbreg) for the corresponding count data models. LIMDEP has Poisson$ and Negbin$ commands to estimate various count data models including zero-inflated and zero-truncated models. Table 2 summarizes the procedures and commands for count data regression models.

Table 2. Comparison of the Procedures and Commands for Count Data Models

| Model | SAS 9.1 | STATA 9.0 SE | LIMDEP 8.0 |
|---|---|---|---|
| Poisson Regression (PRM) | GENMOD | .poisson | Poisson$ |
| Negative Binomial Regression (NBRM) | GENMOD | .nbreg | Negbin$ |
| Zero-inflated Poisson (ZIP) | - | .zip | Poisson; Zip; Rh2$ |
| Zero-inflated Negative Binomial (ZINB) | - | .zinb | Negbin; Zip; Rh2$ |
| Zero-truncated Poisson (ZTP) | - | .ztp | Poisson; Truncation$ |
| Zero-truncated Negative Binomial (ZTNB) | - | .ztnb | Negbin; Truncation$ |

The example here examines how waste quotas (emps) and the strictness of policy implementation (strict) affect the frequency of waste spill accidents of plants (accident).

1.5 Long and Freese's SPost Module

STATA users may take advantage of user-written modules such as SPost, written by J. Scott Long and Jeremy Freese. The module allows researchers to conduct follow-up analyses of various CDVMs including event count data models. See 2.3 for examples of major SPost commands.

In order to install SPost, execute the following commands consecutively. For more details, visit J. Scott Long’s Web site at http://www.indiana.edu/~jslsoc/spost_install.htm.

. net from http://www.indiana.edu/~jslsoc/stata/

. net install spost9_ado, replace

. net get spost9_do, replace

3
hanszhu posted on 2007-7-27 03:51:00

4
hanszhu posted on 2007-7-27 03:53:00

Negative Binomial; Testing For Overdispersion in Poisson regression

http://www.uky.edu/ComputingCenter/SSTARS/P_NB_3.htm

[Edited by the author on 2007-7-27 3:56:05]

5
wtingn posted on 2007-7-27 09:21:00
Stata is the best tool for this.

6
zjuxmz posted on 2010-4-10 22:30:11
Stata is fairly concise and fast.

7
bobguy posted on 2010-4-10 23:19:21
In reply to xuehe (posted on 2007-7-26 19:35): "Has anyone done this? Could you briefly explain how?"
The COUNTREG procedure in SAS can do them all. Here is an overview.



Overview: COUNTREG Procedure

The COUNTREG (count regression) procedure analyzes regression models in which the dependent variable takes nonnegative integer or count values. The dependent variable is usually an event count, which refers to the number of times an event occurs. For example, an event count might represent the number of ship accidents per year for a given fleet. In count regression, the conditional mean of the dependent variable, E(y_i|x_i), is assumed to be a function of a vector of covariates, x_i.
The Poisson (log-linear) regression model is the most basic model that explicitly takes into account the nonnegative integer-valued aspect of the outcome. With this model, the probability of an event count is determined by a Poisson distribution, where the conditional mean of the distribution is a function of a vector of covariates. However, the basic Poisson regression model is limited because it forces the conditional mean of the outcome to equal the conditional variance. This assumption is often violated in real-life data. Negative binomial regression is an extension of Poisson regression in which the conditional variance may exceed the conditional mean. Also, an often encountered characteristic of count data is that the number of zeros in the sample exceeds the number of zeros predicted by either the Poisson or negative binomial model. Zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models explicitly model the production of zero counts to account for excess zeros and also enable the conditional variance of the outcome to differ from the conditional mean.
Under zero-inflated models, additional zeros occur with probability phi_i, which is determined by a separate model, phi_i=F(z_i'gamma), where F is the normal or logistic distribution function resulting in a probit or logistic model, and z_i is a set of covariates.
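The zero-inflation probability is just a distribution function applied to a linear index. The sketch below uses the names phi, F, z, and gamma for the probability, the distribution function, the covariates, and the coefficients; the numeric values are hypothetical. It is not PROC COUNTREG code, only an illustration of the two link choices.

```python
import math

def logistic_cdf(t):
    """Logistic distribution function, giving a logistic zero-inflation model."""
    return 1.0 / (1.0 + math.exp(-t))

def normal_cdf(t):
    """Standard normal distribution function, giving a probit zero-inflation model."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))

def zero_inflation_prob(z, gamma, link="logistic"):
    """phi = F(z'gamma) for one observation; z and gamma are equal-length lists."""
    index = sum(zj * gj for zj, gj in zip(z, gamma))
    if link == "logistic":
        return logistic_cdf(index)
    if link == "normal":
        return normal_cdf(index)
    raise ValueError("link must be 'logistic' or 'normal'")

z = [1.0, 0.8]        # hypothetical covariates (constant, one regressor)
gamma = [-0.5, 1.2]   # hypothetical coefficients
for link in ("logistic", "normal"):
    print(link, round(zero_inflation_prob(z, gamma, link), 4))
```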
PROC COUNTREG supports the following models for count data:
  • Poisson regression
  • negative binomial regression with quadratic (NEGBIN2) and linear (NEGBIN1) variance functions (Cameron and Trivedi 1986)
  • zero-inflated Poisson (ZIP) model (Lambert 1992)
  • zero-inflated negative binomial (ZINB) model
In recent years, count data models have been used extensively in economics, political science, and sociology. For example, Hausman, Hall, and Griliches (1984) examine the effects of R&D expenditures on the number of patents received by U.S. companies. Cameron and Trivedi (1986) study factors affecting the number of doctor visits. Greene (1994) studies the number of derogatory reports to a credit reporting agency for a group of credit card applicants. As a final example, Long (1997) analyzes the number of doctoral publications in the final three years of Ph.D. studies.
The COUNTREG procedure uses maximum likelihood estimation. When a model with a dependent count variable is estimated using linear ordinary least squares (OLS) regression, the count nature of the dependent variable is ignored. This leads to negative predicted counts and to parameter estimates with undesirable properties in terms of statistical efficiency, consistency, and unbiasedness unless the mean of the counts is high, in which case the Gaussian approximation and linear regression may be satisfactory.
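To illustrate what maximum likelihood estimation of a Poisson regression involves, here is a generic Newton-Raphson sketch for a model with a constant and one covariate. It is not a description of the optimizer PROC COUNTREG uses, which the overview does not specify, and the data are hypothetical.

```python
import math

def fit_poisson(x, y, iterations=25):
    """Maximize the Poisson log-likelihood for mu_i = exp(b0 + b1 * x_i).

    Each Newton-Raphson step solves (X'WX) * step = X'(y - mu) with W = diag(mu),
    using the closed-form inverse of the 2x2 matrix.
    """
    b0 = math.log(sum(y) / len(y))  # start from the constant-only estimate
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score (gradient of the log-likelihood)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Negative Hessian X'WX
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data, for illustration only.
x = [0, 1, 2, 3, 4, 5]
y = [1, 0, 2, 3, 6, 9]
b0, b1 = fit_poisson(x, y)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print([round(math.exp(b0 + b1 * xi), 3) for xi in x])  # fitted means
```

Because the fitted mean is exp(b0 + b1*x), every predicted count is positive, which is one way the count model avoids the negative predictions a linear fit can produce.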

8
shanxixiu posted on 2010-11-3 10:55:08
thanks a lot

9
offandon posted on 2011-10-9 02:20:18
Thanks for sharing.

10
peyzf posted on 2012-8-27 21:51:27
With panel data, how do you decide whether to use NB or Poisson?
