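enter() takes the time at which a subject comes under observation; origin() takes the time at which the subject becomes at risk. Coming under observation is not necessarily the same as becoming at risk. It is like tracking a woman's fertile window: the days on which conception is likely (the risk period) are only a few, but the observation period may run to ten days or more. Sometimes the two time points do coincide. There are also less commonly used options such as time0(), so set all of these carefully.
For your specific case, what you want to examine is how long after a company is founded internationalization becomes likely (in survival analysis this is called the hazard rate). So time = year (the variable that tracks continuous observation time) - internationalyear, and int_dummy is set to 1 in the year of internationalization:
stset time, failure(stinternationalized10) id(ID) enter(time year) origin(time ipoyear) time0(time-1)
or
stset time, failure(stinternationalized10) id(ID) origin(time ipoyear) time0(time-1)
In time0(), time-1 is only a placeholder: create your own variable for it and give it whatever name you like.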
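If it helps to see the enter/origin distinction in numbers, here is a rough Python sketch. It is not Stata code and not how stset is implemented internally; the function name analysis_span and the example years are made up for illustration. It assumes a record covering the calendar interval (t0, t1] only counts for the part that lies after both the origin and the entry time, and that analysis time is measured from the origin.

```python
from typing import Optional, Tuple


def analysis_span(t0: float, t1: float, origin: float,
                  enter: float) -> Optional[Tuple[float, float]]:
    """Analysis-time span of a record covering (t0, t1] in calendar time.

    origin: when the subject becomes at risk (analysis time 0).
    enter:  when the subject comes under observation.
    Returns None when the record ends before the subject is both
    at risk and under observation.
    """
    start = max(t0, origin, enter)
    if t1 <= start:
        return None
    return (start - origin, t1 - origin)


# Hypothetical firm: at risk from 2000, observed only from 2002 onward.
early = analysis_span(2001, 2002, origin=2000, enter=2002)
later = analysis_span(2004, 2005, origin=2000, enter=2002)
print(early, later)
```

The record for 2002 drops out because it ends before observation starts, while the record for 2005 is kept and placed at years 4 to 5 on the at-risk clock. When origin and enter are the same year, nothing is lost to late entry.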
You can also refer to:
http://www.stata.com/support/faqs/statistics/stset-spell-type-data/
stsetting the data
There are several ways to stset our data. The above dataset was stset in one of these possible ways. The proper stset syntax for the data, however, depends on the study design and assumptions. In what follows we provide guidance for selecting the appropriate stset command syntax. This is only a guide, and idiosyncrasies in your particular data may require more modifications or options.
There are two main questions that need to be answered to stset our data.
Question 1: When does the clock begin ticking?
If you want the “clock” to begin at time zero, then what we did above is correct. For calendar data, t=0 at 1/1/1960, but for the above data, t=0 at 0. The command we used was
. stset End, failure(Employed) time0(Begin) id(ID) exit(time .)
If we want the “clock” to start ticking for each individual when the subject first enters unemployment, 10 for ID==102 and 0 for the others, then we need to specify origin().
. stset End, failure(Employed) time0(Begin) id(ID) exit(time .) origin(Begin)
Stata will use as the time origin the earliest entry time per subject.
When origin() is not specified, Stata automatically sets the origin to zero and treats records with entry times greater than zero as left-truncated or delayed-entry observations. That is what we obtained with our original syntax.
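The effect of origin() on a subject who enters late can be checked with a little arithmetic. The following Python sketch is only an illustration of that arithmetic, not Stata's implementation; the function names are hypothetical, the End value of 15 is invented, and only the entry time of 10 for ID==102 comes from the text above.

```python
from typing import Tuple


def first_record_span(begin: float, end: float,
                      use_origin: bool) -> Tuple[float, float]:
    """Analysis-time span (t0, t) of a subject's first record.

    Assumes the origin is 0 when origin() is omitted and the
    record's own Begin when origin(Begin) is given.
    """
    origin = begin if use_origin else 0
    return (begin - origin, end - origin)


def is_delayed_entry(span: Tuple[float, float]) -> bool:
    """A record is a delayed entry when it starts after analysis time 0."""
    return span[0] > 0


# ID 102 enters unemployment at time 10; End = 15 is a made-up value.
without_origin = first_record_span(10, 15, use_origin=False)
with_origin = first_record_span(10, 15, use_origin=True)
print(without_origin, is_delayed_entry(without_origin))
print(with_origin, is_delayed_entry(with_origin))
```

Without origin() the record sits at 10 to 15 on the clock and counts as a delayed entry; with origin(Begin) the same record sits at 0 to 5 and the subject is at risk from the start of its own clock.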
Question 2: How do we want to handle each subject’s second, third, etc., observations?
If we want the clock to continue ticking for each individual from the first observation forward, then we can use the syntax we used in our example
. stset End, failure(Employed) time0(Begin) id(ID) exit(time .)
or, depending on the answer to question 1,
. stset End, failure(Employed) time0(Begin) id(ID) exit(time .) origin(Begin)
If, on the other hand, we want to reset the clock to zero or the origin() for every observation, then we stset the data without specifying id(). The ID variable can be used later in the analysis to cluster the data and to produce a robust standard error.
. stset End, failure(Employed) time0(Begin) exit(time .)
or depending on the answer to question 1,
. stset End, failure(Employed) time0(Begin) exit(time .) origin(Begin)
To summarize,
- If we want time to start at 0 and continue ticking for subsequent observations, we use
  . stset End, failure(Employed) time0(Begin) id(ID) exit(time .)
- If we want time to start at 0 and to be reset to zero for every observation, then we do not specify id().
  . stset End, failure(Employed) time0(Begin) exit(time .)
- If we want time to start at the first entry time for each subject and continue ticking for subsequent observations, then we specify origin().
  . stset End, failure(Employed) time0(Begin) id(ID) exit(time .) origin(Begin)
- If we want to reset the clock to begin at the entry time of each record, then we specify origin(), but not id().
  . stset End, failure(Employed) time0(Begin) exit(time .) origin(Begin)
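To compare where the four choices above place each record on the clock, a small Python sketch can stand in for the arithmetic. It is a simplification under stated assumptions, not Stata's code: the function stset_spans and the four spell records are hypothetical, the origin is taken as 0 when origin() is omitted, as the earliest Begin per subject when both origin(Begin) and id() are given, and as each record's own Begin when origin(Begin) is given without id().

```python
from typing import Dict, List, Tuple

Record = Tuple[int, float, float]  # (subject id, Begin, End)


def stset_spans(records: List[Record], use_id: bool,
                use_origin: bool) -> List[Tuple[int, float, float]]:
    """Return (subject id, t0, t) in analysis time for every record."""
    # Earliest Begin per subject, used as the origin when id() is given.
    earliest: Dict[int, float] = {}
    for sid, begin, _end in records:
        earliest[sid] = min(begin, earliest.get(sid, begin))

    spans = []
    for sid, begin, end in records:
        if not use_origin:
            origin = 0.0            # clock starts at time zero
        elif use_id:
            origin = earliest[sid]  # clock starts at subject's first entry
        else:
            origin = begin          # each record is treated as its own subject
        spans.append((sid, begin - origin, end - origin))
    return spans


# Made-up spell data: two spells each for subjects 101 and 102.
data = [(101, 0, 5), (101, 8, 12), (102, 10, 15), (102, 20, 25)]

for use_id in (True, False):
    for use_origin in (False, True):
        print(use_id, use_origin, stset_spans(data, use_id, use_origin))
```

In this sketch id() does not change the times when origin() is omitted; it only decides whether records are grouped into one subject or treated separately, which matters for how later spells are counted and for clustering.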