Which software can be used to estimate negative binomial (NB), Poisson, and ordered probit (OP) models?

Original post

xuehe posted on 2007-7-26 19:35:00


Has anyone done this? Please briefly explain how.


#2

hanszhu posted on 2007-7-27 03:49:00

1. Introduction

An event count is the realization of a nonnegative integer-valued random variable (Cameron and Trivedi 1998). Examples are the number of car accidents per month, thunderstorms per year, and wildfires per year. Applying the ordinary least squares (OLS) method to event count data yields biased, inefficient, and inconsistent estimates (Long 1997). Thus, researchers have developed various nonlinear models that are based on the Poisson distribution and the negative binomial distribution.

1.1 Count Data Regression Models

The left-hand side (LHS) of the equation holds the event count variable. Independent variables are, as in OLS, located on the right-hand side (RHS). These RHS variables may be interval, ratio, or binary (dummy). Table 1 below summarizes the categorical dependent variable regression models (CDVMs) according to the level of measurement of the dependent variable.

Table 1. Comparison between OLS and CDVMs

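Category	Model	Dependent (LHS)	Method	Independent (RHS)
OLS	Ordinary least squares	Interval or ratio scale	Moment based method	A linear function of interval/ratio or binary independent variables
CDVMs	Binary response	Binary (0 or 1)	Maximum likelihood method	A linear function of interval/ratio or binary independent variables
CDVMs	Ordinal response	Ordinal (1st, 2nd, ...)	Maximum likelihood method	A linear function of interval/ratio or binary independent variables
CDVMs	Nominal response	Nominal (A, B, ...)	Maximum likelihood method	A linear function of interval/ratio or binary independent variables
CDVMs	Event count data	Count (0, 1, 2, ...)	Maximum likelihood method	A linear function of interval/ratio or binary independent variables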

The Poisson regression model (PRM) and negative binomial regression model (NBRM) are basic models for count data analysis. Either the zero-inflated Poisson (ZIP) or the zero-inflated negative binomial regression model (ZINB) is used when there are many zero counts. Other count models are developed to handle censored, truncated, or sample selected count data. This document, however, focuses on PRM, NBRM, ZIP, and ZINB.

1.2 Poisson Models versus Negative Binomial Models

The Poisson probability distribution,

$$P(y \mid \mu) = \frac{e^{-\mu}\mu^{y}}{y!}, \qquad y = 0, 1, 2, \ldots,$$

has the same mean and variance (equidispersion), Var(y)=E(y)=mu. As the mean of a Poisson distribution increases, the probability of zeros decreases and the distribution approximates a normal distribution (Figure 1). The Poisson distribution also has the strong assumption that events are independent. Thus, this distribution does not fit well if mu differs across observations (heterogeneity) (Long 1997).
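The equidispersion property can be checked numerically. The following is only a minimal sketch using the Python standard library; the function names and the truncation point `y_max` are assumptions made for illustration, and the means are the ones listed in the caption of Figure 1.

```python
import math

def poisson_pmf(y, mu):
    """P(y | mu) = exp(-mu) * mu**y / y!, computed on the log scale."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def mean_and_variance(mu, y_max=200):
    """Mean and variance taken from the pmf, truncating the infinite sum at y_max."""
    probs = [poisson_pmf(y, mu) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var

# The probability of a zero count falls as mu grows, while mean and variance stay equal.
for mu in (0.5, 1.0, 2.0, 5.0):
    mean, var = mean_and_variance(mu)
    print(f"mu={mu}: P(0)={poisson_pmf(0, mu):.4f}, mean={mean:.4f}, var={var:.4f}")
```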

The Poisson regression model (PRM) incorporates observed heterogeneity into the Poisson distribution function, Var(y|x)=E(y|x)=mu=exp(xb). As mu increases, the conditional variance of y increases, the proportion of predicted zeros decreases, and the distribution around the expected value becomes approximately normal (Long 1997). The conditional mean of the errors is zero, but the variance of the errors is a function of independent variables, var(y|x)=exp(xb). The errors are heteroscedastic. Thus, the PRM rarely fits in practice due to overdispersion (Long 1997; Maddala 1983).
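To make the mean structure mu=exp(xb) concrete, here is a rough sketch of maximum likelihood estimation of a PRM with an intercept and one regressor by Newton-Raphson. It is not how SAS, STATA, or LIMDEP implement estimation; the data and the function name `fit_poisson` are hypothetical and serve only as an illustration.

```python
import math

def fit_poisson(x, y, iterations=25):
    """Newton-Raphson for mu_i = exp(b0 + b1 * x_i); returns (b0, b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: sum of (y_i - mu_i) * (1, x_i)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix: sum of mu_i * (1, x_i)(1, x_i)'
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step b <- b + H^{-1} g, with the 2x2 inverse written out
        b0, b1 = (b0 + (h11 * g0 - h01 * g1) / det,
                  b1 + (h00 * g1 - h01 * g0) / det)
    return b0, b1

# Hypothetical data, for illustration only.
x = [0, 1, 2, 3, 4, 5]
y = [0, 1, 1, 3, 4, 9]
b0, b1 = fit_poisson(x, y)
mu_hat = [math.exp(b0 + b1 * xi) for xi in x]
print(f"b0={b0:.4f}, b1={b1:.4f}")
print("fitted means:", [round(m, 3) for m in mu_hat])
```

At the maximum the score is zero, so the fitted means sum to the observed total count. Since the conditional variance equals mu_hat, the implied error variance grows with the fitted mean.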

Figure 1. Poisson Probability Distribution with Means of .5, 1, 2, and 5

The negative binomial probability distribution is

$$P(y \mid x) = \frac{\Gamma(y + v)}{y!\,\Gamma(v)} \left(\frac{v}{v + \mu}\right)^{v} \left(\frac{\mu}{v + \mu}\right)^{y},$$

where 1/v=alpha determines the degree of dispersion and Gamma denotes the gamma function. As the dispersion parameter alpha increases, the variance of the negative binomial distribution also increases, Var(y|x)=mu(1+mu/v)=mu(1+alpha*mu).

The negative binomial regression model (NBRM) incorporates observed and unobserved heterogeneity into the conditional mean, mu=exp(xb+e) (Long 1997). Thus, the conditional variance of y becomes larger than its conditional mean, E(y|x)=mu, which remains unchanged. Figure 2 illustrates how the probabilities for small and larger counts increase in the negative binomial distribution as the conditional variance of y increases, given mu=2.

Figure 2. Negative Binomial Probability Distribution with Alpha of .01, .5, 1, and 5

The PRM and NBRM, however, have the same mean structure. If alpha=0, the NBRM reduces to the PRM (Cameron and Trivedi 1998; Long 1997).
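The variance formula Var(y|x)=mu(1+mu/v) and the reduction to the Poisson case can be illustrated with a short sketch. It assumes the standard parameterization of the negative binomial with mean mu and alpha=1/v; the function names and the truncation point `y_max` are illustrative choices, and the values of alpha and mu=2 follow Figure 2.

```python
import math

def poisson_pmf(y, mu):
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def negbin_pmf(y, mu, alpha):
    """Negative binomial pmf with mean mu and dispersion alpha = 1/v."""
    v = 1.0 / alpha
    log_p = (math.lgamma(y + v) - math.lgamma(y + 1) - math.lgamma(v)
             + v * math.log(v / (v + mu)) + y * math.log(mu / (v + mu)))
    return math.exp(log_p)

def negbin_variance(mu, alpha, y_max=2000):
    """Variance taken from the pmf, truncating the infinite sum at y_max."""
    probs = [negbin_pmf(y, mu, alpha) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    return sum((y - mean) ** 2 * p for y, p in enumerate(probs))

mu = 2.0
for alpha in (0.01, 0.5, 1.0, 5.0):
    print(f"alpha={alpha}: P(0)={negbin_pmf(0, mu, alpha):.4f}, "
          f"var={negbin_variance(mu, alpha):.4f}, mu(1+alpha*mu)={mu * (1 + alpha * mu):.4f}")

# As alpha approaches 0 the probabilities approach the Poisson ones.
print(negbin_pmf(3, mu, 1e-6), poisson_pmf(3, mu))
```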

1.3 Overdispersion

When Var(y|x) > E(y|x), we are said to have overdispersion. Estimates of a PRM for overdispersed data are unbiased, but inefficient with standard errors biased downward (Cameron and Trivedi 1998; Long 1997). The likelihood ratio test for overdispersion examines the null hypothesis of alpha=0. The LR statistic follows the Chi-squared distribution with one degree of freedom. If the null hypothesis is rejected, NBRM is preferred to PRM.
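As a small sketch of the likelihood ratio test described above: given the maximized log-likelihoods of the PRM and the NBRM, the statistic is twice their difference, and it is referred here to the Chi-squared distribution with one degree of freedom as stated in the text. The log-likelihood values below are hypothetical, and the function name is an assumption for illustration.

```python
import math

def lr_overdispersion_test(loglik_poisson, loglik_negbin):
    """Return the LR statistic for H0: alpha=0 and its chi-squared(1) p-value."""
    lr = 2.0 * (loglik_negbin - loglik_poisson)
    # For one degree of freedom, P(X > lr) = erfc(sqrt(lr / 2)).
    p_value = math.erfc(math.sqrt(max(lr, 0.0) / 2.0))
    return lr, p_value

# Hypothetical log-likelihoods, not taken from any fitted model in this thread.
lr, p_value = lr_overdispersion_test(loglik_poisson=-120.0, loglik_negbin=-110.0)
print(f"LR = {lr:.3f}, p = {p_value:.6f}")
if p_value < 0.05:
    print("Reject alpha=0: the NBRM is preferred to the PRM.")
else:
    print("Do not reject alpha=0: the PRM is adequate.")
```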

Zero-inflated models handle overdispersion by changing the mean structure to explicitly model the production of zero counts (Long 1997). These models assume two latent groups. One is the always-zero group and the other is not-always-zero or sometime-zero group (Long 1997). Thus, zero counts come from the former group and some of the latter group with a certain probability.
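The two-group idea can be sketched for the zero-inflated Poisson case. Here `psi` is an assumed symbol for the probability of belonging to the always-zero group and `mu` is the Poisson mean of the other group; both values are hypothetical.

```python
import math

def zip_pmf(y, mu, psi):
    """Zero-inflated Poisson: always-zero group with probability psi, Poisson(mu) otherwise."""
    poisson = math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))
    if y == 0:
        # Zeros come from the always-zero group and from the Poisson group.
        return psi + (1.0 - psi) * poisson
    return (1.0 - psi) * poisson

mu, psi = 2.0, 0.3  # hypothetical values
probs = [zip_pmf(y, mu, psi) for y in range(200)]
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
print(f"P(0) = {probs[0]:.4f} versus Poisson P(0) = {math.exp(-mu):.4f}")
print(f"mean = {mean:.4f}, variance = {var:.4f}")
```

With these values the variance exceeds the mean, which is how the changed mean structure absorbs overdispersion caused by excess zeros.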

A likelihood ratio test of the null hypothesis alpha=0 compares the ZIP and ZINB, which are nested. The PRM and ZIP, and likewise the NBRM and ZINB, cannot be compared by this likelihood ratio test, since neither pair is nested. The Vuong statistic compares these non-nested models. If V is greater than 1.96, the ZIP or ZINB is favored. If V is less than -1.96, the PRM or NBRM is preferred (Long 1997).
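A minimal sketch of the Vuong comparison follows. It assumes the usual form V = sqrt(N) * mean(m) / sd(m), where m_i is the difference in log predicted probabilities of the observed count under the two models, and it uses the N-denominator standard deviation. The per-observation log probabilities are hypothetical and the function names are illustrative.

```python
import math

def vuong_statistic(logp_1, logp_2):
    """V = sqrt(N) * mean(m) / sd(m), with m_i = ln P1(y_i) - ln P2(y_i)."""
    m = [a - b for a, b in zip(logp_1, logp_2)]
    n = len(m)
    m_bar = sum(m) / n
    s_m = math.sqrt(sum((mi - m_bar) ** 2 for mi in m) / n)
    return math.sqrt(n) * m_bar / s_m

def vuong_decision(v, critical=1.96):
    if v > critical:
        return "model 1 (ZIP or ZINB) favored"
    if v < -critical:
        return "model 2 (PRM or NBRM) favored"
    return "neither model favored"

# Hypothetical log probabilities of the observed counts under each fitted model.
logp_zip = [-0.9, -1.4, -1.1, -2.0, -0.8, -1.6]
logp_prm = [-1.5, -1.6, -1.9, -2.1, -1.7, -1.8]
v = vuong_statistic(logp_zip, logp_prm)
print(f"V = {v:.3f}: {vuong_decision(v)}")
```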

1.4 Estimation in SAS, STATA, and LIMDEP

The SAS GENMOD procedure estimates Poisson and negative binomial regression models. STATA has individual commands (e.g., .poisson and .nbreg) for the corresponding count data models. LIMDEP has Poisson$ and Negbin$ commands to estimate various count data models including zero-inflated and zero-truncated models. Table 2 summarizes the procedures and commands for count data regression models.

Table 2. Comparison of the Procedures and Commands for Count Data Models

Model	SAS 9.1	STATA 9.0 SE	LIMDEP 8.0
Poisson Regression (PRM)	GENMOD	.poisson	Poisson$
Negative Binomial Regression (NBRM)	GENMOD	.nbreg	Negbin$
Zero-inflated Poisson (ZIP)	-	.zip	Poisson; Zip; Rh2$
Zero-inflated Negative Binomial (ZINB)	-	.zinb	Negbin; Zip; Rh2$
Zero-truncated Poisson (ZTP)	-	.ztp	Poisson; Truncation$
Zero-truncated Negative Binomial (ZTNB)	-	.ztnb	Negbin; Truncation$

The example here examines how waste quotas (emps) and the strictness of policy implementation (strict) affect the frequency of waste spill accidents of plants (accident).

1.5 Long and Freese's SPost Module

STATA users may take advantage of user-written modules such as SPost, written by J. Scott Long and Jeremy Freese. The module allows researchers to conduct follow-up analyses of various CDVMs including event count data models. See 2.3 for examples of major SPost commands.

In order to install SPost, execute the following commands consecutively. For more details, visit J. Scott Long’s Web site at http://www.indiana.edu/~jslsoc/spost_install.htm.

. net from http://www.indiana.edu/~jslsoc/stata/

. net install spost9_ado, replace

. net get spost9_do, replace



#4

hanszhu posted on 2007-7-27 03:53:00

Negative Binomial; Testing For Overdispersion in Poisson regression

http://www.uky.edu/ComputingCenter/SSTARS/P_NB_3.htm


#5

wtingn posted on 2007-7-27 09:21:00

Stata is the best choice for this.

#6

zjuxmz posted on 2010-4-10 22:30:11

Stata is relatively concise and fast.

#7

bobguy posted on 2010-4-10 23:19:21

Quoting xuehe (posted 2007-7-26 19:35):
Has anyone done this? Please briefly explain how.

The COUNTREG procedure in SAS can do them all. Here is an overview.

Overview: COUNTREG Procedure

The COUNTREG (count regression) procedure analyzes regression models in which the dependent variable takes nonnegative integer or count values. The dependent variable is usually an event count, which refers to the number of times an event occurs. For example, an event count might represent the number of ship accidents per year for a given fleet. In count regression, the conditional mean $E(y_i \mid x_i)$ of the dependent variable, $y_i$, is assumed to be a function of a vector of covariates, $x_i$.
The Poisson (log-linear) regression model is the most basic model that explicitly takes into account the nonnegative integer-valued aspect of the outcome. With this model, the probability of an event count is determined by a Poisson distribution, where the conditional mean of the distribution is a function of a vector of covariates. However, the basic Poisson regression model is limited because it forces the conditional mean of the outcome to equal the conditional variance. This assumption is often violated in real-life data. Negative binomial regression is an extension of Poisson regression in which the conditional variance may exceed the conditional mean. Also, an often encountered characteristic of count data is that the number of zeros in the sample exceeds the number of zeros predicted by either the Poisson or negative binomial model. Zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models explicitly model the production of zero counts to account for excess zeros and also enable the conditional variance of the outcome to differ from the conditional mean.
Under zero-inflated models, additional zeros occur with probability $\varphi_i$, which is determined by a separate model, $\varphi_i = F(z_i'\gamma)$, where $F$ is the normal or logistic distribution function resulting in a probit or logistic model, and $z_i$ is a set of covariates.
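The separate model for the probability of additional zeros can be sketched as follows. This is not PROC COUNTREG code; it only evaluates a distribution function F at a linear index of the covariates, and the covariate values and coefficients are hypothetical.

```python
import math

def logistic_cdf(t):
    return 1.0 / (1.0 + math.exp(-t))

def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def inflation_probability(z, gamma, link="logistic"):
    """Probability of an additional zero, F(z'gamma), under a logistic or normal F."""
    index = sum(zj * gj for zj, gj in zip(z, gamma))
    if link == "logistic":
        return logistic_cdf(index)
    if link == "normal":
        return normal_cdf(index)
    raise ValueError("link must be 'logistic' or 'normal'")

# Hypothetical covariates (with a constant term) and coefficients.
z_i = [1.0, 0.8]
gamma = [-1.0, 0.5]
print("logistic:", round(inflation_probability(z_i, gamma, "logistic"), 4))
print("normal:  ", round(inflation_probability(z_i, gamma, "normal"), 4))
```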
PROC COUNTREG supports the following models for count data:

Poisson regression
negative binomial regression with quadratic (NEGBIN2) and linear (NEGBIN1) variance functions (Cameron and Trivedi 1986)
zero-inflated Poisson (ZIP) model (Lambert 1992)
zero-inflated negative binomial (ZINB) model
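The two negative binomial variance functions named in the list differ in how the variance grows with the conditional mean. A brief sketch, assuming the usual forms mu + alpha*mu^2 for NEGBIN2 and mu + alpha*mu for NEGBIN1, with a hypothetical alpha:

```python
def negbin2_variance(mu, alpha):
    """Quadratic variance function (NEGBIN2): mu + alpha * mu**2."""
    return mu + alpha * mu ** 2

def negbin1_variance(mu, alpha):
    """Linear variance function (NEGBIN1): mu + alpha * mu."""
    return mu + alpha * mu

alpha = 0.5  # hypothetical dispersion parameter
for mu in (1.0, 2.0, 5.0):
    v1 = negbin1_variance(mu, alpha)
    v2 = negbin2_variance(mu, alpha)
    print(f"mu={mu}: NEGBIN1 var={v1:.2f} (ratio {v1 / mu:.2f}), "
          f"NEGBIN2 var={v2:.2f} (ratio {v2 / mu:.2f})")
```

Under NEGBIN1 the variance-to-mean ratio is constant, while under NEGBIN2 it rises with the mean. Both reduce to the Poisson variance when alpha is 0.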

In recent years, count data models have been used extensively in economics, political science, and sociology. For example, Hausman, Hall, and Griliches (1984) examine the effects of R&D expenditures on the number of patents received by U.S. companies. Cameron and Trivedi (1986) study factors affecting the number of doctor visits. Greene (1994) studies the number of derogatory reports to a credit reporting agency for a group of credit card applicants. As a final example, Long (1997) analyzes the number of doctoral publications in the final three years of Ph.D. studies.
The COUNTREG procedure uses maximum likelihood estimation. When a model with a dependent count variable is estimated using linear ordinary least squares (OLS) regression, the count nature of the dependent variable is ignored. This leads to negative predicted counts and to parameter estimates with undesirable properties in terms of statistical efficiency, consistency, and unbiasedness unless the mean of the counts is high, in which case the Gaussian approximation and linear regression may be satisfactory.
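The point about negative predicted counts is easy to reproduce. The sketch below fits a simple OLS line to hypothetical count data with many zeros; the data and function name are made up for illustration.

```python
def ols_fit(x, y):
    """Simple linear regression of y on x; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical counts that are zero for small x and rise quickly afterwards.
x = [0, 1, 2, 3, 4, 5]
y = [0, 0, 0, 1, 3, 8]
intercept, slope = ols_fit(x, y)
predictions = [intercept + slope * xi for xi in x]
print("OLS predictions:", [round(p, 3) for p in predictions])
print("negative predicted counts:", sum(p < 0 for p in predictions))
```

A count model with mean exp(xb) cannot produce a negative prediction, which is one reason it is preferred when the mean of the counts is low.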
