Download Now! Free 30 Day Trial of Salford System's Predictive Modeling Suite

Upcoming Tradeshows

  • JSM
    July 28, 2012 - August 02, 2012
    San Diego, CA, Booth TBA
  • KDD
    August 12, 2012 - August 16, 2012
    Beijing, China, Booth TBA
  • Statistical Learning and Data Mining III
    October 01, 2012
    Boston, MA
  • DMA
    October 13, 2012 - October 19, 2012
    Las Vegas, NV
  • INFORMS
    October 14, 2012 - October 16, 2012
    Phoenix, AZ
View full calendar
Home Support FAQs CART What are CART's "automatic self-validation procedures"?

What are CART's "automatic self-validation procedures"?

CART uses two test procedures to select the “optimal” tree, which is the tree with the lowest overall misclassification cost, thus the highest accuracy. Both test disciplines, one for small datasets and one for large, are entirely automated, ensuring that the optimal tree model will accurately classify existing data and predict results.

For smaller datasets and cases when an analyst does not wish to set aside a portion of the data for test purposes, CART automatically employs cross validation. While this frequently occurs in medical research, a shortage of training data can occur in the study of any rare event, such as specific types of fraud. In cross validation, ten different trees are typically grown, each built from a different ten percent of the total sample. When the results of the ten trees are put together, a highly reliable determination of the optimal tree size is obtained. For large datasets, CART automatically selects test data or uses pre-defined test records or test files to self-validate results.