Original poster: ReneeBK

[Q&A] Cox Regression When the Reference Group Has Zero Events

Original post
ReneeBK posted on 2014-4-12 02:39:11


My study aims to find out how well a blood test predicts mortality in patients followed up for 1 year. Patients were divided into quartiles, and the first quartile is used as the reference group. I would like to determine the unadjusted and adjusted hazard ratios (HRs) for quartiles 2-4 using SPSS Cox Regression.

The reference group (Quartile 1, the first 25%) did not have any events, and the hazard ratios for the 2nd, 3rd and 4th quartiles are as high as 80000+. I also got error messages from SPSS. Is there anyone who could advise me on this issue?



Reply 1
ReneeBK posted on 2014-4-12 02:40:11
I think the solution here is to use your continuous covariate as continuous, rather than categorizing it. Categorizing continuous covariates is commonly done; it may make interpretation easier for clinicians, but statistically it is always a poor choice.

There seems to be some concern here about whether or not to convert continuous variables into variables with just two (or more) categories. Let me address that here, rather than in a comment. I would keep all of your variables as continuous. There are several reasons to avoid categorizing continuous variables:

  • By categorizing you would be throwing information away: some observations are further from the dividing line and others are closer to it, but they're treated as though they were the same. In science, our goal is to gather more and better information and to better organize and integrate that information. Throwing information away is simply antithetical to good science in my opinion;
  • You tend to lose statistical power;
  • You lose the ability to detect non-linear relationships;
  • What if someone reads your work and wonders what would happen if the line between categories were drawn in a different place? (For example, consider your BMI example: what if someone else 10 years from now, based on what's happening in the literature at that time, also wants to know about people who are underweight and those who are morbidly obese?) They would simply be out of luck, but if you keep everything in its original form, each reader can assess their own preferred categorization scheme;
  • There are rarely 'bright lines' in nature, and so by categorizing you fail to reflect the situation under study as it really is. If you are concerned that there may be an actual bright line at some point for a-priori theoretical reasons, you could fit a spline to assess this. Imagine a variable, X, that runs from 0 to 1, and you think the relationship between this variable and a response variable suddenly and fundamentally changes at .7, then you create a new variable (called a spline) like this:

    $$X_{\text{spline}} = \begin{cases} 0 & \text{if } X \le .7 \\ X - .7 & \text{if } X > .7 \end{cases}$$


    then add this new Xspline variable to your model in addition to your original X variable. The model output will show a sharp break at .7, and you can assess whether this enhances our understanding of the data.
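
For anyone who wants to see the spline variable spelled out, here is a minimal sketch in plain Python. The function names `hinge` and `linear_predictor` and the coefficients b0, b1, b2 are made up for illustration; they are not from any package or from the original poster's data. It only shows how the new variable is built and why the slope changes at .7.

```python
def hinge(x, knot=0.7):
    """Spline term: 0 at or below the knot, x - knot above it."""
    return max(0.0, x - knot)


def linear_predictor(x, b0, b1, b2, knot=0.7):
    """Model with both X and the spline term.

    The slope is b1 up to the knot and b1 + b2 beyond it,
    so b2 measures how sharply the relationship bends at the knot.
    """
    return b0 + b1 * x + b2 * hinge(x, knot)


# Hypothetical X values running from 0 to 1
xs = [0.1, 0.5, 0.7, 0.8, 1.0]
x_spline = [hinge(x) for x in xs]  # the new column to add next to X
```

If the estimated coefficient on the spline term is close to zero, there is little evidence of a break at that point.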

Reply 2
ReneeBK posted on 2014-4-12 02:41:34

Well, what you're doing wrong is using a group with zero events as the reference group. Instead of hazard ratios, think in the (in my opinion) simpler terms of incidence rate ratios (IRRs), where the incidence rate (IR) is IR = number of cases / total person-time.

$$\mathrm{IRR}_{\text{quartile 4 vs. quartile 1}} = \frac{\mathrm{IR}_{\text{quartile 4}}}{\mathrm{IR}_{\text{quartile 1}}}$$

What happens if $\mathrm{IR}_{\text{quartile 1}} = 0$?
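
A small sketch of the arithmetic, in plain Python. The event counts and person-years below are invented purely for illustration (they are not the original poster's data), and the helper names `incidence_rate` and `rate_ratio` are hypothetical.

```python
def incidence_rate(events, person_time):
    """Incidence rate = number of cases / total person-time."""
    return events / person_time


def rate_ratio(events_a, pt_a, events_ref, pt_ref):
    """Incidence rate ratio of group A versus the reference group.

    Returns None when the reference rate is zero, because the
    ratio would require dividing by zero and is undefined.
    """
    ir_ref = incidence_rate(events_ref, pt_ref)
    if ir_ref == 0:
        return None
    return incidence_rate(events_a, pt_a) / ir_ref


# Made-up numbers: quartile 1 has 0 events in 250 person-years,
# quartile 4 has 12 events in 240 person-years.
q4_vs_q1 = rate_ratio(12, 240, 0, 250)   # undefined: reference rate is 0
q1_vs_q4 = rate_ratio(0, 250, 12, 240)   # defined, because the denominator is positive
```

With a zero-event reference group the denominator is zero, so every ratio against that group is undefined, which is what the huge hazard ratios are signalling.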

You can change your categorisation (use tertiles or some other meaningful categorisation) or, even better, if you have a continuous predictor you can treat it as such and examine potential nonlinear relationships using polynomial terms, fractional polynomials or restricted cubic splines, for example.

Reply 3
ReneeBK posted on 2014-4-12 02:48:22
The hazard of an event is the instantaneous rate at which the event occurs at time t, conditional on it not having previously occurred.

Your problem should be clear instantly: with no events, the estimated hazard is zero. Borrowing from andrea's example, the incidence rate is equivalent to a constant hazard; in your case, a constant hazard of zero.
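
For reference, the standard definitions behind this can be written out as follows. The symbols $T$ (event time), $d$ (number of events) and $T_{\text{total}}$ (total person-time) are my notation, not the original poster's.

```latex
% Hazard: instantaneous event rate at time t, given survival up to t
h(t) = \lim_{\Delta t \to 0}
       \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}

% Under a constant hazard \lambda, with d events observed over
% total person-time T_total, the likelihood and its maximizer are
L(\lambda) = \lambda^{d} \exp(-\lambda \, T_{\text{total}}),
\qquad
\hat{\lambda} = \frac{d}{T_{\text{total}}}

% With d = 0 events the estimate is \hat{\lambda} = 0, so any ratio
% with this group in the denominator is undefined.
```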

Dividing by zero tends to make software angry.

You need to switch your reference category. My suggestion is to use "Quartile 4" or the other high value of the category, and step down, rather than using Quartile 1 and stepping up. If you were hoping to, for example, show an increase in the HR as you moved up a category, you're now showing the equivalent protective effect from moving down one.
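
To make the re-coding concrete, here is a rough sketch of indicator (dummy) coding with the reference category chosen explicitly. It is plain Python, not SPSS syntax; in SPSS you would use the software's own reference-category setting for the categorical covariate. The function name `dummy_code` and the labels "Q1" to "Q4" are hypothetical.

```python
def dummy_code(levels, reference, categories):
    """Build indicator columns for every category except the reference.

    Each observation becomes a dict of 0/1 indicators; the reference
    category is the one where all indicators are 0.
    """
    kept = [c for c in categories if c != reference]
    return [{f"is_{c}": int(level == c) for c in kept} for level in levels]


categories = ["Q1", "Q2", "Q3", "Q4"]
levels = ["Q1", "Q4", "Q2"]  # three hypothetical patients

# Quartile 4 as the reference: Q1, Q2 and Q3 each get an indicator,
# and their hazard ratios are read as "versus quartile 4".
rows = dummy_code(levels, "Q4", categories)
```

Each coefficient in the model then compares a lower quartile with quartile 4, so the comparison group is one that actually has events.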

I would also suggest taking a moment to consider why you have no events.
It's possible you're simply having a run of "bad luck", in which case there's nothing you can do but increase the study size or follow the population for longer in hopes of accumulating more events. But you should make sure there is no underlying reason why the probability of the outcome in that group really is zero. For cardiac events I can imagine one, but it is always worth stopping to consider when you have zero events in some level of a covariate.
