Too Many Levels in a Categorical Predictor


CART searches over all possible subsets of a categorical predictor's levels only when the number of levels is limited. A predictor with K levels admits 2^(K-1) - 1 distinct binary splits, so the search grows exponentially with K. Beyond a threshold set by computational feasibility, CART will simply reject the problem. You can control this limit with the BOPTION NCLASSES = m command, but be aware that for m larger than 15, computation times increase dramatically.
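To give a feel for why the limit exists, the following Python sketch counts the distinct two-way partitions of a K-level categorical variable. It is a back-of-the-envelope illustration, not part of CART: the function name binary_split_count is hypothetical, and the count assumes an exhaustive search with no shortcuts.

```python
def binary_split_count(levels: int) -> int:
    """Number of distinct ways to divide `levels` categories into two
    non-empty groups (left child versus right child of a binary split)."""
    if levels < 2:
        return 0  # nothing to split on
    # Each level goes left or right (2**levels assignments); halve to ignore
    # mirror images, then drop the one assignment that leaves a side empty.
    return 2 ** (levels - 1) - 1


if __name__ == "__main__":
    for k in (5, 10, 15, 20, 50):
        print(f"{k:>2} levels: {binary_split_count(k):,} candidate splits")
```

At 15 levels the count is 16,383. Each additional level roughly doubles it, which is consistent with the steep growth in computation time noted above.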


SOLUTION: Convert The Variable Into Dummies

The ideal solution is to work with a supercomputer implementation of Salford Systems CART, since this will provide the optimal tree. The alternatives are compromises that might not yield satisfactory results. One such compromise is to break the categorical variable into a vector of dummies. For example, a 50-level occupation variable could be coded into 50 separate indicators. A split on any one indicator separates a single level from all the others, so a grouping of several levels can only be built up over successive splits.
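As a rough illustration of the dummy coding described above, here is a minimal Python sketch that would be run as a preprocessing step before the data reach CART. The helper to_dummies and the sample occupation values are hypothetical, and the sketch assumes the column fits in memory as a plain list.

```python
def to_dummies(values):
    """Turn one categorical column into a dict of 0/1 indicator columns,
    one per distinct level, keyed by the level name."""
    levels = sorted(set(values))
    return {
        level: [1 if value == level else 0 for value in values]
        for level in levels
    }


# Hypothetical sample data: four records, three distinct occupations.
occupation = ["clerk", "nurse", "clerk", "pilot"]
dummies = to_dummies(occupation)

if __name__ == "__main__":
    for level, column in dummies.items():
        print(f"occupation_{level}: {column}")
```

Each record has exactly one indicator set to 1, so a 50-level variable would yield 50 such columns, each of which can be treated as an ordinary two-valued predictor.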


Steinberg, Dan and Phillip Colla. CART--Classification and Regression Trees. San Diego, CA: Salford Systems, 1997.
© Copyright 2003-2004 Salford Systems