Frequently Asked Questions for CART®
CART® is the classification tree that revolutionized the field of advanced analytics and inaugurated the current era of data mining. Continually improved, CART remains one of the most important tools in modern data mining. Designed for both non-technical and technical users, CART can quickly reveal important data relationships that could remain hidden when using other analytical tools.
CART is based on landmark mathematical theory introduced in 1984 by four world-renowned statisticians at Stanford University and the University of California at Berkeley. Salford Systems' implementation of CART is the only decision tree software embodying the original proprietary code. The CART creators continue to collaborate with Salford Systems to enhance CART with proprietary advances.
Cross-validation is a method for estimating what the error rate of a sub-tree (of the maximal tree) would be if you had test data. Regardless of the value you set for V in V-fold cross-validation, CART grows the same maximal tree, because the maximal tree is always grown on the full learning sample; V affects only the error estimates. The original CART monograph provides evidence that a V of 10 to 20 gives better results than a smaller number, but each value of V can produce a slightly different error estimate. The optimal tree, which is derived from the maximal tree by pruning, can therefore differ from one V to another: each cross-validation run produces slightly different estimates of the sub-tree error rates, so the runs may disagree on which sub-tree is best.
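The effect of V can be illustrated with a small, self-contained sketch. This is not Salford Systems' CART code: it is a toy one-variable tree grower written for illustration, and it uses a depth limit as a stand-in for CART's cost-complexity pruning sequence. The synthetic data, the function names (`grow`, `predict`, `cv_error`) and the fixed random seeds are all assumptions made for this example. It shows that the maximal tree is grown once on all the data, independent of V, while the cross-validated error estimates, and possibly the tree size they favor, can change with V.

```python
import random

def majority(labels):
    # Most common class label; ties go to the smaller label.
    return max(sorted(set(labels)), key=labels.count)

def node_errors(labels):
    # Misclassifications if the node predicts its majority class.
    return len(labels) - max(labels.count(c) for c in set(labels))

def grow(points, max_depth):
    """Greedily grow a toy 1-D classification tree on (x, label) pairs."""
    labels = [y for _, y in points]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    xs = sorted(set(x for x, _ in points))
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2  # candidate threshold between neighboring values
        left = [y for x, y in points if x <= t]
        right = [y for x, y in points if x > t]
        errors = node_errors(left) + node_errors(right)
        if best is None or errors < best[0]:
            best = (errors, t)
    if best is None:  # all x values identical, so no split is possible
        return majority(labels)
    t = best[1]
    return (t,
            grow([p for p in points if p[0] <= t], max_depth - 1),
            grow([p for p in points if p[0] > t], max_depth - 1))

def predict(tree, x):
    # Internal nodes are (threshold, left, right) tuples; leaves are labels.
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x <= t else right
    return tree

def cv_error(data, v, max_depth, seed=0):
    """V-fold cross-validated error rate for trees limited to max_depth."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::v] for i in range(v)]
    wrong = 0
    for i, held_out in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        tree = grow(train, max_depth)
        wrong += sum(predict(tree, x) != y for x, y in held_out)
    return wrong / len(data)

# Synthetic data (an assumption for this sketch): class 1 when x > 0.5,
# with roughly 15% of the labels flipped as noise.
rng = random.Random(42)
data = []
for _ in range(120):
    x = rng.random()
    y = int(x > 0.5)
    if rng.random() < 0.15:
        y = 1 - y
    data.append((x, y))

# The maximal tree uses all the data and never looks at V.
maximal = grow(data, max_depth=len(data))

for v in (5, 10, 20):
    estimates = {d: cv_error(data, v, d) for d in range(1, 5)}
    best_depth = min(estimates, key=estimates.get)
    print(f"V={v}: estimates by depth = {estimates}, preferred depth = {best_depth}")
```

Comparing the printed lines for different V shows how much the estimates move on this data set. In real CART the candidates are the pruned sub-trees of the maximal tree, matched across folds by the complexity parameter, rather than depth-limited trees, but V plays the same role: it changes the estimates, not the maximal tree.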