The SPSS ordinal regression procedure allows you to use an ordinal dependent variable with a mix of categorical and numeric predictors. Because the dependent variable's categories are NOT numbers, we need a way around this in a prediction equation. One type of ordinal regression lets you estimate cumulative probabilities: the probability that a case falls at or beyond a particular ordered category. For example, if our dependent variable were degree level, we could ask: what's the probability (in a logit solution, the odds) that a person has at least a high school degree, or at least a BA degree? This is the type of regression the SPSS program fits. The shorthand name for the procedure is PLUM.
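To make the cumulative idea concrete, here is a minimal sketch in Python using the OrderedModel class in statsmodels (not SPSS, and not the PLUM procedure itself). The data, variable names, and effect sizes are all made up for illustration.

import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

# Made-up data: an ordered "degree" outcome plus two predictors.
rng = np.random.default_rng(0)
n = 300
age = rng.integers(20, 70, size=n)
female = rng.integers(0, 2, size=n)
# Let the chance of a higher degree rise gently with age (purely artificial).
latent = 0.04 * age + 0.3 * female + rng.logistic(size=n)
degree = pd.cut(latent, bins=[-np.inf, 1.5, 2.5, np.inf],
                labels=["none", "HS", "BA"], ordered=True)
df = pd.DataFrame({"degree": degree, "age": age, "female": female})

# distr="logit" gives the cumulative logit (proportional odds) model,
# comparable to the SPSS procedure with the logit link.
model = OrderedModel(df["degree"], df[["age", "female"]], distr="logit")
result = model.fit(method="bfgs", disp=False)
print(result.summary())

# Predicted probability of each ordered category for every case; cumulative
# probabilities such as P(at least HS) follow by summing across categories.
probs = result.predict(df[["age", "female"]])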
One of your decisions in constructing an ordinal regression model, of course, is selecting the predictors for the location component of the model. Covariates can be interval or ratio; the assumption is that they are numeric...but I still wouldn't use too many distinct values. The program is still constructing a table, and if your covariates have many values you will receive warnings about empty cells. The program will even begin collapsing some of those values into combined cells so it can do the estimates. So if YOU want to be in charge, condense the categories yourself and check the multivariate table for zero cells.
Adding a small constant (.5 is the usual) through the delta option will also "smooth" out the empty cells.
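Here is a rough sketch of the "do it yourself" advice: band a many-valued covariate, check the crosstab with the outcome for zero cells, and smooth any empty cells by adding .5, in the spirit of the delta option. This is Python with pandas, not SPSS syntax, and every variable name is invented for illustration.

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "degree": rng.choice(["none", "HS", "BA"], size=200, p=[0.3, 0.45, 0.25]),
    "age": rng.integers(18, 80, size=200),
})

# Collapse age into four bands instead of feeding ~60 distinct values to the program.
df["age_band"] = pd.cut(df["age"], bins=[17, 30, 45, 60, 80],
                        labels=["18-30", "31-45", "46-60", "61-80"])

table = pd.crosstab(df["age_band"], df["degree"])
print(table)                                   # look for zero cells here
print("empty cells:", int((table == 0).sum().sum()))

# Delta-style smoothing: put .5 in any cell that came up empty.
smoothed = table.astype(float).mask(table == 0, 0.5)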
You need to select a link function. This is a transformation of the cumulative probabilities that allows you to estimate your model (see above). Five link functions are available in the ordinal regression procedure; I recommend the logit link, which is comparable to what we have recently been studying. Because, remember, you will need to describe what is happening in your data when you are all done! Agresti discusses link functions, and he says more about them in the "big Agresti" (2002).
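For concreteness, here is a small Python sketch of what the five link functions do to a set of cumulative probabilities. The formulas are the standard textbook ones (logit, probit, complementary log-log, negative log-log, and Cauchit), written out by hand rather than taken from SPSS output, and the probabilities are made up.

import numpy as np
from scipy.stats import norm, cauchy

# Hypothetical cumulative probabilities, e.g. P(Y <= category j) for j = 1, 2, 3.
cum_p = np.array([0.25, 0.60, 0.90])

# Each link maps a probability in (0, 1) onto an unbounded scale
# where a linear model can be estimated.
links = {
    "logit":                 np.log(cum_p / (1 - cum_p)),
    "probit":                norm.ppf(cum_p),
    "complementary log-log": np.log(-np.log(1 - cum_p)),
    "negative log-log":      -np.log(-np.log(cum_p)),
    "cauchit":               cauchy.ppf(cum_p),
}
for name, values in links.items():
    print(f"{name:>22}: {np.round(values, 3)}")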
The scale component is optional. Much of the time, you don't need a scale component. The "location only" model will provide a good summary of the data. SPSS says "In the interests of keeping things simple, it's usually best to start with a location-only model, and add a scale component only if there is evidence that the location-only model is inadequate for your data. Following this philosophy, you will begin with a location-only model."
"The scale component is an optional modification to the basic model to account for differences in variability for different values of the predictor variables. For example, if men have more variability than women in their account status values, using a scale component to account for this may improve your model. The model with a scale component follows the form shown in this equation"
When SPSS suggests keeping things simple, I nearly always believe them.
Basically, the scale component is a correction for what we call "heteroscedasticity" in OLS regression. Heteroscedasticity occurs when the variability of your dependent variable differs depending on the values of your independent variable--or combinations of independent variables. For example, there is usually a larger standard deviation in weight for tall people than for short people. Because you typically have far fewer values and cruder measurement on an ordinal dependent variable, this is less likely to happen in ordinal regression than in Ordinary Least Squares regression.
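A toy numeric version of the height/weight example, just to show what "different variability at different predictor values" looks like; the numbers are simulated, not real measurements.

import numpy as np

rng = np.random.default_rng(2)
height = rng.uniform(150, 200, size=5000)          # cm
# The noise standard deviation grows with height, so weight spreads out
# more among tall people than among short people.
weight = 0.9 * height - 80 + rng.normal(0.0, 0.15 * (height - 140))

print("SD of weight, height < 165 cm:", round(weight[height < 165].std(), 1))
print("SD of weight, height > 185 cm:", round(weight[height > 185].std(), 1))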
Be careful about including variables in these programs (especially the multinomial logistic regression program) if you don't plan to use them in a particular analysis. In the multinomial program in particular, any independent variable that is read in will be used in constructing the n-dimensional table, even if you don't specify a relationship between that variable and the dependent variable. This leads to misleading parameters, inference statistics, and degrees of freedom. You may be surprised to see a variable that you placed in the multinomial regression instructions, but did not put in the model design, pop up when you study the table of observed and expected frequencies.
Remember! If you have an overall causal model and want to test the entire model, including indirect effects, you will need to use the loglinear model to do so. If you simply want the G2, degrees of freedom, and probability level for the final model, using the HILOG program to test the model will work fine here.
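For reference, G2 is the likelihood-ratio chi-square, G2 = 2 * sum of observed * ln(observed/expected), summed over the non-empty cells of the table. A quick Python sketch with made-up counts:

import numpy as np

observed = np.array([45, 30, 15, 10], dtype=float)   # cell counts from the data
expected = np.array([42.5, 32.5, 17.5, 7.5])         # fitted counts from some model

mask = observed > 0                                   # empty observed cells contribute 0
g2 = 2 * np.sum(observed[mask] * np.log(observed[mask] / expected[mask]))
print("G2 =", round(g2, 2))   # compare to a chi-square distribution on the model's df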
As the number of variables grows, the number of possible models grows too. The "aim of the game" is the simplest model with the smallest G2 and the largest degrees of freedom. But with a great many variables, it is possible to get comparable model statistics from quite different models.