Quoting SPSSCHEN's post of 2005-12-22 10:54:53:

Hi everyone, I am looking for different ways to test the stability of clusters after they have been generated (e.g., using holdout cases or Monte Carlo simulations). I appreciate that the type of test will vary across clustering algorithms, but I would appreciate some generic reference material (preferably internet accessible) on different ways to assess cluster stability, along with a list of criteria that could be applied in the assessment process. Regards, Paul
One approach when using a single clustering method is the so-called "split sample" method. The steps are:

- Divide the sample into two, and perform a cluster analysis on one of the samples, using a fixed rule for the number of clusters.
- Determine the centroids of the clusters, compute proximities between the objects in the second sample and those centroids, and classify each object into its nearest cluster.
- Cluster the second sample using the same method as before, and compare these two alternative clusterings of the second sample. Close agreement between the two suggests a stable cluster solution.

References for the split sample method include:

McIntyre, R.M. and Blashfield, R.K. (1980), A nearest-centroid technique for evaluating the minimum variance clustering procedure. Multivariate Behavioral Research, 15, 225-238.

Breckenridge, J.N. (1989), Replicating cluster analysis: Method, consistency and validity. Multivariate Behavioral Research, 24, 147-161.

Both of these papers are referenced in Cluster Analysis, 4th edition, by Everitt, Landau, and Leese.

When comparing two alternative clusterings, you can use the adjusted Rand index, which was introduced by Hubert and Arabie:

Hubert, L.J. and Arabie, P. (1985), Comparing partitions. Journal of Classification, 2, 193-218.

When using different methods, you can synthesize the results using *consensus clustering*.

Cheng and Milligan have written about assessing the influence of individual points:

Cheng, R. and Milligan, G.W. (1996), Measuring the influence of individual data points in a cluster analysis. Journal of Classification, 13, 315-335.

Anthony Babinec
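For anyone who wants to try this, the split sample steps above, scored with the adjusted Rand index, can be sketched in Python roughly as follows. This is only a minimal illustration under some assumptions: a small k-means with a farthest-point start stands in for whatever clustering method you actually use (McIntyre and Blashfield worked with minimum variance clustering), the data are two synthetic blobs, and all function names (`adjusted_rand_index`, `kmeans`, `split_sample_ari`) are made up for this example and do not come from any package.

```python
import random
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index (Hubert and Arabie, 1985) for two partitions."""
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("both labelings must cover the same objects")
    if n < 2:
        return 1.0
    # Count pairs of objects placed together: in both partitions, in a, in b.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:  # degenerate case, e.g. everything in one cluster
        return 1.0
    return (together_both - expected) / (maximum - expected)


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to point (nearest-centroid rule)."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def kmeans(points, k, iters=100):
    """Stand-in clustering method (an assumption): k-means, farthest-point start."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p, centroids) for p in points]


def split_sample_ari(points, k, seed=0):
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    first, second = shuffled[:half], shuffled[half:]
    # Step 1: cluster the first half with a fixed number of clusters.
    centroids_first, _ = kmeans(first, k)
    # Step 2: classify second-half objects by their nearest first-half centroid.
    assigned = [nearest(p, centroids_first) for p in second]
    # Step 3: cluster the second half directly and compare the two labelings.
    _, clustered = kmeans(second, k)
    return adjusted_rand_index(assigned, clustered)


# Synthetic demo data (hypothetical): two well-separated 2-D blobs.
rng = random.Random(42)
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in [(0.0, 0.0), (10.0, 10.0)] for _ in range(40)]
score = split_sample_ari(points, k=2, seed=1)
print(f"split-sample adjusted Rand index: {score:.3f}")
```

An index near 1 means the two clusterings of the second half agree closely, while a value near 0 is about what chance agreement would give. In practice you would probably repeat the split several times with different seeds and look at the spread of the index rather than rely on a single split.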