What tree-growing, or "splitting," criteria can CART provide?
CART includes seven single-variable splitting criteria (Gini, Symgini, twoing, ordered twoing, and class probability for classification trees; least squares and least absolute deviation for regression trees) and one multi-variable splitting criterion, the linear combinations method. The default Gini method typically performs best, but in specific circumstances other methods can generate more accurate models. CART's unique "twoing" procedure, for example, is tuned for classification problems with many classes, such as modeling which of 170 products a given consumer will choose.
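To make the contrast concrete, here is a minimal sketch of the two impurity measures mentioned above, following the standard definitions from Breiman et al.'s CART monograph (the function names and the toy labels are illustrative, not CART's internal API):

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def twoing(left, right):
    """Twoing value for a candidate split, per Breiman et al.:
    (pL * pR / 4) * (sum over classes of |p(c|left) - p(c|right)|) ** 2
    Higher is better: the split groups the many classes into two
    well-separated 'super-classes', which is why twoing suits
    targets with many levels."""
    n = len(left) + len(right)
    pL, pR = len(left) / n, len(right) / n
    cL, cR = Counter(left), Counter(right)
    diff = sum(abs(cL[c] / len(left) - cR[c] / len(right))
               for c in set(cL) | set(cR))
    return (pL * pR / 4.0) * diff ** 2

# A split that cleanly separates the classes scores higher than a mixed one:
clean = twoing(["a", "a"], ["b", "b"])   # 0.25
mixed = twoing(["a", "b"], ["a", "b"])   # 0.0
```

Note the difference in orientation: Gini measures impurity within a node (lower is better), while the twoing value rewards splits whose two children have very different class profiles (higher is better).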
Other splitting criteria are available for inherently difficult problems in which even the best models are expected to have relatively low accuracy. Demographics, for example, are often weak predictors of attitude- and preference-based segments. Special CART tree-growing options can dramatically increase the predictive accuracy of such demographic-based models. Additional tree-growing criteria are available for problems involving unequal misclassification costs, ordered target variables, and continuous dependent variables.

To deal more effectively with select data patterns, CART also offers splits on linear combinations of continuous predictor variables. For this option, CART looks for weighted averages of predictor variables to use as splitters; these weighted averages can reveal important database structure and can uncover new critical measures.
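The idea behind a linear-combination splitter can be sketched as follows. This is only an illustration of how such a split would be scored once a weight vector and threshold are chosen; CART's own search for good weights is more involved, and the `weights` and `threshold` values here are hypothetical inputs:

```python
import numpy as np

def linear_combination_split_impurity(X, y, weights, threshold):
    """Score a split on a weighted combination of continuous predictors.

    Rows where X @ weights <= threshold go to the left child, the rest
    to the right. Returns the size-weighted Gini impurity of the two
    children (lower is better). `weights` and `threshold` are assumed
    given; a real CART implementation searches for them.
    """
    scores = X @ weights
    left, right = y[scores <= threshold], y[scores > threshold]

    def gini(labels):
        if len(labels) == 0:
            return 0.0
        _, counts = np.unique(labels, return_counts=True)
        p = counts / len(labels)
        return 1.0 - float(np.sum(p ** 2))

    n = len(y)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Toy data: neither predictor alone separates the classes as cleanly
# as their average does.
X = np.array([[1.0, 1.0], [2.0, 2.0], [8.0, 8.0], [9.0, 9.0]])
y = np.array([0, 0, 1, 1])
impurity = linear_combination_split_impurity(X, y, np.array([0.5, 0.5]), 5.0)
```

Because the splitter is an oblique hyperplane rather than an axis-parallel cut, a single linear-combination split can sometimes replace a deep cascade of single-variable splits.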