Decision trees, and CART® in particular, were originally designed to split data using a single predictor. For continuous predictors the split is of the form:
if X <= c then go left
else go right
For categorical predictors the split is of the form:
if X is in {value1, value2, ...} then go left
else go right
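
As a minimal sketch, the two rule forms can be written as small routing functions. This is illustrative Python only, not CART's implementation; the function names and the example values are hypothetical.

```python
def goes_left_continuous(x, c):
    """Continuous predictor: go left when X <= c, otherwise go right."""
    return x <= c


def goes_left_categorical(x, left_values):
    """Categorical predictor: go left when X is in the chosen set of values."""
    return x in left_values


# Hypothetical example records routed by each kind of rule.
age_goes_left = goes_left_continuous(34.0, 40.0)                 # 34 <= 40
region_goes_left = goes_left_categorical("west", {"north", "east"})  # not in set
```

A terminal node is then described by the conjunction of the rules met on the path from the root to that node.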
The advantage of such splitting rules lies principally in their simplicity and understandability. The terminal nodes of the tree are conjunctions of such splitters, and these conjunctions define rules (or data segments) that may also be easy to understand. For continuous variables, however, the advantages go far beyond comprehensibility: the split is essentially unchanged when the predictor in question is transformed via common transforms such as log(X) or SQRT(X). Technically, so long as the transform does not change the rank ordering of the values of the predictor (or simply inverts the order), the splitting rule is essentially unchanged: the same records go to each side, and only the threshold is expressed on the transformed scale. Further, extreme values of the predictor should not affect the selection of the best splitter, because the split point generally lies in the interior of the split variable's distribution. These characteristics go a long way toward making CART such an effective analytical tool even in the presence of flawed data.
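
The invariance described above can be checked with a small experiment. The sketch below is not CART itself: it assumes a two-class target, Gini impurity as the splitting criterion, and candidate thresholds placed at midpoints between adjacent distinct values. The data and all names are made up for illustration.

```python
import math


def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(x, y):
    """Search all splits X <= c; return (threshold, set of row indices going left)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = x[order[k - 1]], x[order[k]]
        if lo == hi:
            continue  # no threshold can separate tied values
        left, right = order[:k], order[k:]
        # Weighted impurity of the two child nodes (lower is better).
        score = (len(left) * gini([y[i] for i in left])
                 + len(right) * gini([y[i] for i in right])) / len(x)
        if best is None or score < best[0]:
            best = (score, (lo + hi) / 2.0, frozenset(left))
    return best[1], best[2]


# Made-up data: a positive, skewed predictor and a binary target.
x = [1.0, 2.0, 3.0, 10.0, 20.0, 50.0, 100.0, 1000.0]
y = [0, 0, 0, 0, 1, 1, 1, 1]

c_raw, left_raw = best_split(x, y)
c_log, left_log = best_split([math.log(v) for v in x], y)
c_sqrt, left_sqrt = best_split([math.sqrt(v) for v in x], y)

# Make the largest value far more extreme and search again.
x_outlier = x[:-1] + [1e9]
c_out, left_out = best_split(x_outlier, y)

print(left_raw == left_log == left_sqrt)  # same rows go left in all three cases
print(c_raw == c_out)                     # the outlier does not move the threshold
```

The threshold found on log(X) or SQRT(X) is a different number because it lives on the transformed scale, but it separates exactly the same records. An order-reversing transform such as -X would yield the same two groups with left and right swapped.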