I found the following passage in the rpart package's documentation:
A variable may appear in the tree many times, either as a primary or a surrogate
variable. An overall measure of variable importance is the sum of the goodness of split
measures for each split for which it was the primary variable, plus goodness * (adjusted
agreement) for all splits in which it was a surrogate. In the printout these are scaled to sum
to 100 and the rounded values are shown, omitting any variable whose proportion is less
than 1%. Imagine two variables which were essentially duplicates of each other; if we did
not count surrogates they would split the importance with neither showing up as strongly
as it should.
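
To make the arithmetic concrete, here is a minimal sketch in Python of the calculation the passage describes. It is not rpart's code: the tree, the variable names `x1`..`x4`, and all the numbers are made up for illustration, and I am assuming that the "goodness" credited to a surrogate is the improvement of the primary split at the same node.

```python
from collections import defaultdict

# Hypothetical tree: one entry per internal node.
# "improve" is the goodness of the primary split at that node;
# "surrogates" lists (variable, adjusted agreement) pairs for that node.
nodes = [
    {"primary": "x1", "improve": 10.0, "surrogates": [("x2", 0.75)]},
    {"primary": "x3", "improve": 4.0, "surrogates": []},
    {"primary": "x1", "improve": 2.0, "surrogates": [("x2", 0.5), ("x4", 0.0625)]},
]


def raw_importance(nodes):
    """Sum goodness for primary splits plus goodness * adjusted agreement
    for surrogate splits (assumption: goodness = primary improvement)."""
    totals = defaultdict(float)
    for node in nodes:
        totals[node["primary"]] += node["improve"]
        for var, adj_agreement in node["surrogates"]:
            totals[var] += node["improve"] * adj_agreement
    return dict(totals)


def printed_importance(raw, cutoff_percent=1.0):
    """Scale to sum to 100, drop variables below the cutoff, round the rest."""
    total = sum(raw.values())
    percent = {var: 100.0 * value / total for var, value in raw.items()}
    return {var: round(p) for var, p in percent.items() if p >= cutoff_percent}


raw = raw_importance(nodes)
shown = printed_importance(raw)
print(raw)
print(shown)
```

In this toy tree `x2` is never a primary variable, yet it still picks up substantial importance because it is a good surrogate for `x1`. That is the "duplicate variables" situation the passage mentions. `x4` falls below 1% of the total, so it is left out of the scaled printout.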