Original poster: ReneeBK

Dummy variables for linear models with multiple levels

Original post
ReneeBK posted on 2014-4-15 05:09:21

I'm currently working with data that has continuous variables and a hierarchical structure attached to it. Think of measuring the blood pressure, size and weight of different domestic animals (cats, dogs, birds), while also recording their species, family and order.

All data are measured at the level of the individual, so there are no predictors at higher levels (although they could be generated, e.g. by taking the inter-level mean).

Let's say I want to predict the blood pressure (y) from the weight (x1) and the size (x2).

Ignoring the hierarchical information, I could use the linear model y = β0 + x1β1 + x2β2, which might be a very bad idea.


What might be the right approach for dummy variables if there are more than two categories?



Keywords: Variables Multiple Variable Linear Levels multiple levels

2
ReneeBK posted on 2014-4-15 05:13:06
Your first question: Close, but not quite. You need to either leave one of your species out of the model as a reference category or leave out the constant. In the former case each β_{i+2} measures the difference between species i and the reference species, so the parameter of the reference species is necessarily 0, which means that the indicator variable for the reference species drops out of the model. In the latter case each β_{i+2} is the constant for species i, which leaves nothing for the overall constant β0 to do, so it should drop out.
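The two codings described above can be compared in a minimal sketch. The data values, the helper names `ols` and `ind`, and the choice of "cat" as the reference species are all made up for illustration, and only one continuous predictor (weight) is used to keep it short. The solver is a plain normal-equations routine with exact fractions, not a recommendation for real work.

```python
from fractions import Fraction

def ols(X, y):
    """Exact least squares: solve (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y], built with exact rational arithmetic.
    A = [[sum(Fraction(row[i]) * row[j] for row in X) for j in range(k)]
         + [sum(Fraction(row[i]) * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        # Find a row with a nonzero pivot; this fails if the columns of X are collinear.
        p = next(r for r in range(c, k) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [row[k] for row in A]

# Hypothetical toy data: three animals per species.
species = ["cat"] * 3 + ["dog"] * 3 + ["bird"] * 3
weight = [4, 5, 6, 20, 25, 33, 1, 2, 4]
y = [120, 126, 130, 110, 118, 125, 150, 149, 160]

def ind(label):
    """0/1 indicator column for one species."""
    return [1 if s == label else 0 for s in species]

cat, dog, bird = ind("cat"), ind("dog"), ind("bird")

# Coding 1: constant + weight + indicators for all species except the reference (cat).
X_ref = [[1, w, d, b] for w, d, b in zip(weight, dog, bird)]
# Coding 2: no constant, weight + one indicator per species.
X_noconst = [[w, c, d, b] for w, c, d, b in zip(weight, cat, dog, bird)]

b_ref = ols(X_ref, y)      # [constant, slope, dog - cat, bird - cat]
b_nc = ols(X_noconst, y)   # [slope, cat constant, dog constant, bird constant]

print("reference coding:  ", [str(v) for v in b_ref])
print("no-constant coding:", [str(v) for v in b_nc])
```

Both codings describe the same fitted model: the slope is identical, the constant in the reference coding equals the cat constant in the no-constant coding, and each species constant equals the overall constant plus that species' difference. Putting the constant and all three indicators into one design makes X'X singular, so the pivot search in this sketch would fail.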

Your second question: No. All the information is already captured by the species indicator variables (a term I prefer over dummy variables), so there is nothing left for the family indicator variables to explain, and they will be automatically dropped from the model due to perfect multicollinearity.
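The collinearity can be checked directly by looking at the rank of the design matrix. The following is a small sketch with invented labels (species `s1` to `s4`, families `fA` and `fB`) and a hand-written `rank` helper; none of these names come from the original question.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via exact Gaussian elimination."""
    M = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        # Look for a pivot in column c at or below row r.
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# Hypothetical hierarchy: every species belongs to exactly one family.
species = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]
family_of = {"s1": "fA", "s2": "fA", "s3": "fB", "s4": "fB"}
family = [family_of[s] for s in species]

# Constant plus species indicators, with s1 as the reference category.
X_species = [[1] + [1 if s == lev else 0 for lev in ("s2", "s3", "s4")]
             for s in species]
# Same design with a family indicator appended (fA as the reference).
X_both = [row + [1 if f == "fB" else 0] for row, f in zip(X_species, family)]

print("columns:", len(X_species[0]), "rank:", rank(X_species))
print("columns:", len(X_both[0]), "rank:", rank(X_both))
```

The family column is the sum of the `s3` and `s4` columns, so adding it increases the number of columns but not the rank. That is the situation in which regression software typically drops the redundant variable.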

3
ReneeBK posted on 2014-4-15 05:13:34
So species indicator variables can be used to reflect a 2-level structure, but they do not reflect the additional structure of a 3- or 4-level model - am I reading this correctly? –  Roland

4
ReneeBK posted on 2014-4-15 05:13:57
I don't understand that comment. I see only two levels: species and families. Where are the 3rd and 4th levels coming from? As long as the levels are hierarchical, indicator variables at the lowest level (e.g. species) will absorb all the variance of all the higher levels (e.g. families). –  Maarten Buis

5
ReneeBK posted on 2014-4-15 05:14:29
       
Oh, I might be using terminology incorrectly. With 2 levels I mean the individual pet level (level 1), and the species level (level 2): I'd call the simple linear model without species indicators a 1-level model, and as soon as we have species information, we would have a 2-level model. The third level would be one which includes families as well. A fourth level (order) was not written down in the models, but mentioned in the introduction. –  Roland

6
ReneeBK posted on 2014-4-15 05:14:50
So family and order indicator variables will add nothing once you have included the species indicator variables. As a consequence, they will be perfectly collinear with the species indicators and will be dropped. –  Maarten Buis

7
ReneeBK posted on 2014-4-15 05:15:14
You could go the other way around, though: first add only order, then add only family (which implicitly also includes order, so leave that out), then add only species (which implicitly also includes family and order, so leave those out). –  Maarten Buis
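That sequence of models can be sketched as follows. To keep it short, the sketch assumes each model contains only the indicators for one level (no weight or size), in which case the fitted value for each animal is simply the mean of its group. The labels, the data and the helper `rss_group_means` are invented for illustration.

```python
from collections import defaultdict

def rss_group_means(y, groups):
    """Residual sum of squares of an indicator-only model (fitted value = group mean)."""
    by_group = defaultdict(list)
    for yi, g in zip(y, groups):
        by_group[g].append(yi)
    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    return sum((yi - means[g]) ** 2 for yi, g in zip(y, groups))

# Hypothetical nested hierarchy: species within family within order.
species = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4", "s5", "s5"]
family_of = {"s1": "fA", "s2": "fA", "s3": "fB", "s4": "fC", "s5": "fC"}
order_of = {"fA": "oX", "fB": "oX", "fC": "oY"}
y = [120, 124, 130, 128, 110, 114, 150, 156, 160, 158]

family = [family_of[s] for s in species]
order = [order_of[f] for f in family]

# Three separate models, each using the indicators of one level only.
rss_order = rss_group_means(y, order)
rss_family = rss_group_means(y, family)
rss_species = rss_group_means(y, species)

print("order only:  ", rss_order)
print("family only: ", rss_family)
print("species only:", rss_species)
```

Because the groupings are nested, each finer level can reproduce the fit of the coarser one, so the residual sum of squares cannot increase when moving from order to family to species. Comparing the three fits shows how much each additional level of detail contributes.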
