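Going by the experience shared on this forum, an interaction term in Stata should be built by generating a new variable first and then putting it into the model, for example:
For the interaction of x and y: gen xy=x*y, then run the regression (for example: reg/glm/logistic z x y xy).
But I found that x#y works without generating a new variable. Taking a log-linear model (glm) as an example:
1. Commands: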
gen iner=row*colu
glm freq i.row i.colu iner, family(poisson)
Partial output when the new variable is generated:
Generalized linear models              No. of obs      = 9
Optimization     : ML                  Residual df     = 3
                                       Scale parameter = 1
Deviance         = 103.613914          (1/df) Deviance = 34.53797
Pearson          = 129.5374276         (1/df) Pearson  = 43.17914
Variance function: V(u) = u            [Poisson]
Link function    : g(u) = ln(u)        [Log]
                                       AIC             = 20.46444
Log likelihood   = -86.0899988         BIC             = 97.02224
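A residual df of 3 with 9 observations means 6 parameters were estimated, which fits a constant, two row indicators, two column indicators, and one slope for iner. In other words, the generated product enters as a single continuous term. The sketch below writes the same model in factor-variable notation. It assumes row and colu are numerically coded with three levels each, which is inferred from the output rather than stated in the post; the names m_gen and m_cc are only illustrative.

```stata
* Assumes freq, row, colu and iner (= row*colu) are already in memory, as in the post
glm freq i.row i.colu iner, family(poisson)
estimates store m_gen

* c. marks both variables as continuous, so c.row#c.colu is the same single product term
glm freq i.row i.colu c.row#c.colu, family(poisson)
estimates store m_cc

* Show the two fits side by side; the log likelihoods should agree
estimates table m_gen m_cc, stats(N ll)
```

If the two columns match, the hand-made variable and c.row#c.colu are the same model, and the difference from command 2 comes from i. versus c., not from generating the variable by hand.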
2. Command:
glm freq i.row i.colu i.row#i.colu, family(poisson)
Partial output:
Generalized linear models              No. of obs      = 9
Optimization     : ML                  Residual df     = 0
                                       Scale parameter = 1
Deviance         = 4.52284e-13         (1/df) Deviance = .
Pearson          = 1.30799e-14         (1/df) Pearson  = .
Variance function: V(u) = u            [Poisson]
Link function    : g(u) = ln(u)        [Log]
                                       AIC             = 9.618454
Log likelihood   = -34.2830418         BIC             = 4.52e-13
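Judging by the output, the second approach looks better than the first (the second seems to be essentially the saturated model (╯‵□′)╯︵┻━┻). But what is the difference between the two, and what is each one actually fitting?
Any pointers from the experts would be much appreciated~~~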
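One way to see what command 2 fits is to count parameters. With i. on both variables, row#colu expands into a product of indicators, giving (3-1)*(3-1) = 4 interaction terms on top of the constant and the 2 + 2 main effects. That makes 9 parameters for 9 cells, hence a residual df of 0. A minimal sketch for checking this, again assuming three levels per variable (inferred from the output); the variable name fitted is hypothetical:

```stata
* ## is shorthand for the main effects plus the categorical interaction,
* so this is the same model as: glm freq i.row i.colu i.row#i.colu, family(poisson)
glm freq i.row##i.colu, family(poisson)

* Predicted means; for a saturated model these should reproduce freq cell by cell
predict fitted, mu
list row colu freq fitted
```

A deviance that is zero up to rounding is what a saturated model gives by construction, so on its own it does not show that the second specification is the better one.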

