Original poster: ReneeBK

Mixed Model using SPSS Syntax

Original post
ReneeBK posted on 2014-5-2 09:45:24

I wanted to ask about an SPSS problem I am trying to solve. I have repeated measures of a blood test for each patient (2-15 measurements per patient), and I want to find out the effect of independent variables such as age, gender and ethnicity on the decline of this blood test over time. I am therefore trying to fit a random intercept and random slope model to the repeated measurements using the MIXED procedure in SPSS. However, the program gives an error saying that there is insufficient memory to estimate the model parameters. In total I have 135,823 rows (cases). When I ran the code on a smaller subset of cases, it worked. What can I do to solve this? Is the problem due to computer memory or SPSS memory? How can I make the code work?

My code is something like:

MIXED Blood_test_Value BY Gender Ethnicity WITH Age
  /CRITERIA=CIN(95) MXITER(100) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE)
    LCONVERGE(0, ABSOLUTE) PCONVERGE(0.000001, ABSOLUTE)
  /FIXED=Gender Ethnicity Age | SSTYPE(3)
  /METHOD=ML
  /PRINT=G SOLUTION TESTCOV
  /RANDOM=INTERCEPT | SUBJECT(ID) COVTYPE(ID)
  /REPEATED=Time_point | SUBJECT(ID) COVTYPE(UNR).

Thank you.

Reply #1
ReneeBK posted on 2014-5-2 09:46:35
Typically, I think a random coefficient model will look like:

MIXED Blood_test_Value BY Gender Ethnicity WITH Age
  /FIXED = Gender Ethnicity Age | SSTYPE(3)
  /METHOD = REML
  /RANDOM = INTERCEPT Age | SUBJECT(ID) COVTYPE(UNR).

See, for example, Model 5 at http://publib.boulder.ibm.com/infocenter/spssstat/v20r0m0/topic/com.ibm.spss.statistics.help/syn_mixed_examples.htm.  I don't think the REPEATED statement belongs; the random slope should be modeled as part of the RANDOM subcommand.

Alex

Reply #2
ReneeBK posted on 2014-5-2 09:49:38
I believe Alex's parameterization is headed in the desired direction
of the OP. I might take issue with using a "UNR" structure instead of
a "UN" structure to start but I'll save my reasoning for another time.
Nevertheless, more needs to be done with respect to the fixed effects
component of the model to answer the original research question,
"...effect of each independent variables such as age, gender and
ethnicity on the decline of this blood test over time." In order to do
so, the FIXED statement needs to be modified as follows:

/FIXED = Gender Ethnicity Age Time_point Gender*Time_point
Ethnicity*Time_point Age*Time_point | SSTYPE(3)

Whether the OP wants to incorporate a REPEATED statement (which is
known as a "residual change model") versus a RANDOM statement with an
intercept and slope (which is known as a growth curve model) depends
on whether the OP is interested in estimating differences between
individual linear trajectories over time. If so, the
random-coefficient regression model (the model with the RANDOM
statement with an intercept and slope) is the way to go.
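
As a rough sketch only, a random-coefficient model that uses the expanded FIXED statement above might look like the following. It assumes that Time_point is entered as a continuous covariate (listed after WITH), that the random slope is on Time_point, and that a UN structure is used for the random effects. None of these choices is confirmed in the thread; they are made here purely for illustration.

```spss
* Sketch only: Time_point is treated as a continuous covariate (assumption).
MIXED Blood_test_Value BY Gender Ethnicity WITH Age Time_point
  /FIXED = Gender Ethnicity Age Time_point Gender*Time_point
    Ethnicity*Time_point Age*Time_point | SSTYPE(3)
  /METHOD = REML
  /PRINT = G SOLUTION TESTCOV
  /RANDOM = INTERCEPT Time_point | SUBJECT(ID) COVTYPE(UN).
```

In this form the Gender*Time_point, Ethnicity*Time_point and Age*Time_point terms are the ones that speak to differences in the rate of change over time.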

BTW, ignoring the fixed effects component for the moment, there is
absolutely nothing wrong with a combination of the RANDOM and REPEATED
statements posted originally by the OP. These statements combined in
the same model indicate that conditional upon the subject-specific
intercepts, there is residual correlation among observations obtained
from the same subject along with time-invariant residual variances. The
problem is that a random intercept coupled with an UNstructured
residual correlation matrix is inconsistent with what the OP
described. Furthermore, the fixed effects component did not include
Time_point or its interaction with the other variables, which are both
necessary in order to assess the effect of the independent variables
on the temporal linear change in the expected value of y.

Finally, no explanation was provided as to why the number of
measurements per subject varied between 2 and 15. I would be
interested in understanding why this is the case. This leads me to
wonder about various other aspects of the variable Time_point but I'll
refrain from going any further for now.

HTH,

Ryan

Reply #3
ReneeBK posted on 2014-5-2 09:51:33
I will explain the research question in more detail. I have a General
Practice (GP) data set in which each patient has repeated measurements of a
blood test. The measurement dates are not the same for each patient, and the
intervals between measurements are not the same either. Time_point
indicates the order of the measurement (measurement 1, measurement 2,
measurement 3, etc.). There are up to 15 measurements per patient; the
number of measurements varies between subjects because of missing data. The
level 1 variable is the measurement taken within an individual and the
level 2 variable is the patient. The repeatedly measured blood test is our
dependent variable because the diagnosis of the particular disease is based
on its value. I am trying to investigate the effect of each independent
variable (age, gender, ethnicity, ...) on the dependent variable, and also
its effect on the decline of the dependent variable over time. For example:
does a person aged 30 have a greater decline from measurement time point 1
to 2 than a person aged 60? In that case, would code like this be
appropriate?


MIXED Blood_test_Value BY Gender Ethnicity Hypertension_diagnosis
    Diabetes_Diagnosis IHD_Diagnosis Anaemia_Diagnosis Obesity_Diagnosis
    Time_point WITH Age
  /CRITERIA = CIN(95) MXITER(150) MXSTEP(5) SCORING(1)
    SINGULAR(0.000000000001) HCONVERGE(0, ABSOLUTE) LCONVERGE(0, ABSOLUTE)
    PCONVERGE(0.000001, ABSOLUTE)
  /FIXED = Gender Ethnicity Age Hypertension_diagnosis Diabetes_Diagnosis
    IHD_Diagnosis Anaemia_Diagnosis Obesity_Diagnosis Time_point
    Gender*Time_point Ethnicity*Time_point Hypertension_diagnosis*Time_point
    Diabetes_Diagnosis*Time_point IHD_Diagnosis*Time_point
    Anaemia_Diagnosis*Time_point Obesity_Diagnosis*Time_point Age*Time_point
    | SSTYPE(3)
  /METHOD = ML
  /PRINT = G R SOLUTION TESTCOV
  /RANDOM = INTERCEPT | SUBJECT(ID) COVTYPE(ID)
  /REPEATED = Time_point | SUBJECT(ID) COVTYPE(AR1).

Thank you for your help again.

Regards,

Zalihe.

Reply #4
ReneeBK posted on 2014-5-2 09:54:34
If the intervals between measurements are not equal (or not nearly equal), then employing an autoregressive residual structure is invalid. In fact, I suggest that you forget about the REPEATED statement. Technically, an unstructured residual matrix can never be wrong, but it's likely too complex given the way in which the measurements were collected (unequal time intervals between and within patients). It is worth noting that the MIXED procedure in SAS offers a variety of spatial covariance structures which can handle unequal intervals while accounting for decaying residual correlations as observations become more distant in time, but I'll stick within the confines of SPSS for this post. With that stated, using the MIXED procedure in SPSS, a random coefficient model seems like your best option.
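
To put a number on "too complex": assuming up to t = 15 measurement occasions per patient, as described earlier in the thread, the count of covariance parameters for an unstructured residual matrix compares with that of a random intercept and slope model (unstructured 2 x 2 random-effects matrix G plus a single residual variance) as follows.

```latex
\[
\underbrace{\frac{t(t+1)}{2}}_{\text{unstructured } t \times t \text{ residual matrix}}
  = \frac{15 \cdot 16}{2} = 120,
\qquad
\underbrace{\frac{2 \cdot 3}{2}}_{\text{unstructured } 2 \times 2 \text{ matrix } G}
  + \underbrace{1}_{\text{residual variance}} = 4 .
\]
```

The same total of 120 applies to the UNR form (15 variances plus 105 correlations), so the random coefficient model is far cheaper to estimate.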

This is not easy to explain over email. Moreover, I'm quite distracted by other pressing work. Having said that, I'm going to try to help get you started. In order to make any movement, I need to make some assumptions:

(1) You have the date on which each measurement was taken for each subject.
(2) The first measurement was taken shortly before diagnosis.
(3) Patients you are tracking are getting equivalent forms of treatment that started shortly after diagnosis.

If yes to all 3 assumptions, then create a Time variable that reflects the number of days since baseline. The first measurement on each patient will be considered baseline and should be coded as 0, and subsequent measurements will reflect the number of days since the first measurement/baseline. Concretely, if patient 1 was measured three times (baseline, 5 days post-baseline and 25 days post-baseline), then the dataset should look like this:

Patient_ID  Time
1             0
1             5
1            25
2
2
.
.
.

Needless to say, if patients are measured more frequently (e.g., multiple times in a single day), then you should make the measurement unit number of hours or minutes since baseline.
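
One way the Time variable could be built in SPSS syntax is sketched below. It assumes the file contains a date-format variable holding the date of each measurement; the names Measurement_Date and Baseline_Date are hypothetical and would need to be replaced with whatever the dataset actually uses.

```spss
* Hypothetical names: Measurement_Date (date format) and Baseline_Date.
* Add each patient's earliest measurement date to every row of that patient.
AGGREGATE
  /OUTFILE=* MODE=ADDVARIABLES
  /BREAK=Patient_ID
  /Baseline_Date=MIN(Measurement_Date).
* Days elapsed since that patient's first measurement (0 at baseline).
COMPUTE Time = DATEDIFF(Measurement_Date, Baseline_Date, "days").
EXECUTE.
```

For a finer time unit, the third argument of DATEDIFF can be changed to "hours" or "minutes", provided the variable stores a date and time rather than a date alone.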

With that said, I'd parameterize the model as follows:

MIXED Y BY <categorical predictors> WITH Age Time
  /FIXED = <categorical predictors> Age Time <two-way interactions between each predictor and Time> | SSTYPE(3)
  /METHOD = REML
  /PRINT = G SOLUTION TESTCOV
  /RANDOM = INTERCEPT Time | SUBJECT(Patient_ID) COVTYPE(UN).

I am assuming that there is a linear relationship between time and the dependent variable. You can certainly consider exploring other types of relationships. Same goes for Age.

At any rate, with the model proposed above you should be able to answer all sorts of research questions using the TEST sub-command (e.g., is the estimated mean on day X since baseline for males significantly different from that for females; is the slope for males significantly different from the slope for females). Examining the estimates from the random effects covariance (G) matrix could prove useful as well, but no time to discuss this right now.
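
As an illustration of what such TEST sub-commands might look like, here is a minimal sketch for a reduced model with a single two-level factor. The variable names (Y, Gender, Age, Time, Patient_ID) and the choice of day 30 are assumptions for the example. The coefficients follow the order of the Gender categories, so each contrast is "first level minus second level".

```spss
* Sketch only: Gender is assumed to have exactly two levels.
MIXED Y BY Gender WITH Age Time
  /FIXED = Gender Age Time Gender*Time Age*Time | SSTYPE(3)
  /METHOD = REML
  /PRINT = G SOLUTION TESTCOV
  /RANDOM = INTERCEPT Time | SUBJECT(Patient_ID) COVTYPE(UN)
  /TEST = 'Slope difference between Gender levels' Gender*Time 1 -1
  /TEST = 'Mean difference between Gender levels at day 30'
    Gender 1 -1 Gender*Time 30 -30.
```

The second contrast combines the Gender main effect with 30 times the difference in slopes, which is the model-implied difference in expected values at Time = 30 for two patients of the same age.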

Write back if you have additional questions and I'll try to respond when time permits.

HTH,

Ryan

Reply #5
ReneeBK posted on 2014-5-2 09:55:37
A few very brief comments:

(1) The REPEATED statement [as currently written] in the 3rd model is likely overkill AND inappropriate since the levels of "observation_point" do not mean the same thing for each subject. The RANDOM statement is accounting for differences between subjects with respect to intercepts at t=0 and linear trajectories (aka slopes). Let me be more direct...Within subject correlation due to repeated measures (multiple observations per subject) is being accounted for by the RANDOM statement.

(2) I cannot pinpoint from over here why model 2 cannot be handled by the MIXED procedure in SPSS on your machine. If you have access to SAS, then you might consider the HPMIXED procedure. The HPMIXED procedure is a fairly recent procedure in SAS which is designed to handle complex models and large datasets. OTOH, the fixed effects component of your second model does seem to be getting quite large. You might need to remind yourself about what the primary research question is and whether you really need all of those fixed effects in order to answer it. Don't lose the forest for the trees.

(3) The default estimation method is REML. Why do you keep changing it to ML? Is it to conduct likelihood ratio tests to compare nested models? If not, you might consider going back to the default estimation method; it is generally the preferred choice.
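
For reference, the likelihood ratio test mentioned in point (3) compares two nested models through their -2 log-likelihood values. The subscripts "reduced" and "full" are labels chosen here for the smaller and larger model.

```latex
\[
\chi^2_{\mathrm{LR}}
  = \bigl(-2\ell_{\text{reduced}}\bigr) - \bigl(-2\ell_{\text{full}}\bigr),
\qquad
\mathrm{df} = p_{\text{full}} - p_{\text{reduced}},
\]
```

where \(\ell\) is the maximized log-likelihood and \(p\) is the number of estimated parameters in each model. When the two models differ in their fixed effects, both should be fitted with ML. REML likelihoods are comparable only between models with identical fixed effects.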

Ryan
