1. The paper you mean is probably this one:
"Chi2: Feature Selection and Discretization of Numeric Attributes"
Huan Liu, Rudy Setiono
C++ source code is available for it:
CHIMERGE.tar
Chi2.tar
http://www.public.asu.edu/~huanliu/FSBOOK/TOOLS/DISCRETIZER/
2. As for doing this in R,
take the iris data (sepal length, 150 observations) as an example.
Let alpha = 0.05 and class = 3 (the number of classes).
Then threshold = qchisq(1-alpha, class-1) = 5.991465
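As a cross-check on that threshold outside R, here is a minimal Python sketch. It relies on the fact that with 2 degrees of freedom the chi-square quantile has the closed form -2 * ln(alpha), so no statistics library is needed. The helper name `chi2_threshold_df2` is made up for this illustration and only covers the 3-class case.

```python
import math

def chi2_threshold_df2(alpha):
    """Upper-alpha quantile of the chi-square distribution with 2 degrees of freedom.

    With 2 degrees of freedom the CDF is 1 - exp(-x / 2), so solving
    1 - exp(-x / 2) = 1 - alpha gives x = -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)

alpha = 0.05
n_classes = 3  # df = n_classes - 1 = 2, the only case this helper handles
threshold = chi2_threshold_df2(alpha)
print(round(threshold, 6))
```

For any other number of classes you would need a general chi-square quantile function, such as qchisq in R.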
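After sorting, defining the intervals, and so on, you get the data below.
Compute chi-square for each pair of adjacent intervals, merge the pair with the lowest value, and recompute,
until every value is greater than the threshold.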
interval          class 1  class 2  class 3
4.30+ thru 4.35      1        0        0
4.35+ thru 4.45      3        0        0
4.45+ thru 4.55      1        0        0
4.55+ thru 4.65      4        0        0
...
7.65+ thru 7.80      0        0        4
7.80+ thru 7.90      0        0        1
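The table above appears to use one interval per distinct value, with boundaries halfway between neighbouring distinct values. A minimal Python sketch of that construction follows, under that assumption. The function name `initial_intervals` and the tiny data set are made up for the illustration. The nine values are the counts from the first four table rows, so the last interval here ends at the toy maximum 4.6 rather than at 4.65.

```python
from collections import Counter

def initial_intervals(values, labels, classes):
    """Build one interval per distinct value, with per-class counts.

    Returns a list of (lower, upper, counts) tuples. Interior boundaries
    are midpoints between adjacent distinct values; the outer boundaries
    are the minimum and maximum of the data.
    """
    distinct = sorted(set(values))
    counts = {v: Counter() for v in distinct}
    for v, y in zip(values, labels):
        counts[v][y] += 1
    rows = []
    for i, v in enumerate(distinct):
        lower = distinct[0] if i == 0 else (distinct[i - 1] + v) / 2
        upper = distinct[-1] if i == len(distinct) - 1 else (v + distinct[i + 1]) / 2
        rows.append((lower, upper, [counts[v][c] for c in classes]))
    return rows

# Toy subset: the smallest sepal lengths, all belonging to class 1.
values = [4.3, 4.4, 4.4, 4.4, 4.5, 4.6, 4.6, 4.6, 4.6]
labels = [1] * len(values)
rows = initial_intervals(values, labels, classes=[1, 2, 3])
for lower, upper, c in rows:
    print(f"{lower:.2f}+ thru {upper:.2f}", *c)
```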
The result for the iris example is as follows:
chi-square=[30.90553 17.84705 9.07365]
cutpoint=[5.45 5.75 7.05]
midpoint=[4.30 5.45 5.75 7.05 7.90]
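The merging procedure described above (chi-square on each pair of adjacent intervals, merge the lowest, repeat until everything exceeds the threshold) can be sketched in Python roughly as below. This is an illustration, not the code in the tar files. The names `pair_chi2` and `merge_intervals` and the six-row toy table are made up, and skipping cells whose expected count is zero is a simplifying assumption that real implementations may handle differently.

```python
def pair_chi2(a, b):
    """Chi-square statistic for the 2 x k table formed by two adjacent intervals."""
    n = sum(a) + sum(b)
    col_totals = [x + y for x, y in zip(a, b)]
    stat = 0.0
    for row in (a, b):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            if expected > 0:  # assumption: cells with zero expected count are skipped
                stat += (observed - expected) ** 2 / expected
    return stat

def merge_intervals(rows, threshold):
    """Merge the adjacent pair with the lowest chi-square until all exceed threshold.

    rows is a list of (lower, upper, counts) tuples, sorted by lower bound.
    """
    rows = [(lo, hi, list(c)) for lo, hi, c in rows]
    while len(rows) > 1:
        stats = [pair_chi2(rows[i][2], rows[i + 1][2]) for i in range(len(rows) - 1)]
        i = min(range(len(stats)), key=stats.__getitem__)
        if stats[i] > threshold:
            break  # every adjacent pair is now above the threshold
        lo, _, a = rows[i]
        _, hi, b = rows[i + 1]
        rows[i:i + 2] = [(lo, hi, [x + y for x, y in zip(a, b)])]
    return rows

# Hypothetical toy table with 3 classes (not the iris counts).
toy = [
    (0, 1, [5, 0, 0]),
    (1, 2, [4, 1, 0]),
    (2, 3, [0, 6, 1]),
    (3, 4, [0, 5, 0]),
    (4, 5, [0, 0, 6]),
    (5, 6, [0, 1, 5]),
]
merged = merge_intervals(toy, threshold=5.991465)
for lo, hi, c in merged:
    print(lo, hi, c)
```

On this toy table the loop stops with three intervals, and the interior boundaries that survive play the role of the cut points.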
[This post was edited by the author on 2007-10-29 14:43:17]