Later, and especially once TreeNet and other ensembles became available, practitioners observed that train and test results sometimes agreed rather closely. This was especially true for methods such as TreeNet, which were constructed to resist overfitting. The question then became whether a divergence between train and test results should itself be treated as an indication of a problem with the model.
Our practical approach in model development and selection is to prefer TreeNet models that show good agreement between train and test results and to distrust models exhibiting substantial train/test disagreement. We do not offer a formal statistical test of this difference, relying instead on judgment. In practice, when the train/test divergence seems too large to ignore, we attempt to refine our models in one or more of the following ways:
- Using a slower learn rate,
- Growing smaller trees, or
- Removing some potentially strong predictors from the model.
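The selection rule described above can be sketched in a few lines of Python. This is only an illustration, not part of TreeNet: the `Run` record, the `gap` and `pick_run` helpers, the 0.05 tolerance, and all of the scores are hypothetical, and the tolerance stands in for what is a matter of judgment in practice rather than a formal test.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One fitted model: its settings plus train and test performance."""
    name: str
    learn_rate: float
    max_nodes: int
    train_score: float  # e.g. ROC AUC on the training data (higher is better)
    test_score: float   # the same metric on the test data


def gap(run: Run) -> float:
    """Train minus test performance; a large positive value is a warning sign."""
    return run.train_score - run.test_score


def pick_run(runs, tolerance=0.05):
    """Prefer runs whose train/test gap is within `tolerance`, then the best test score.

    `tolerance` is an arbitrary illustrative cutoff, not a statistical test.
    If no run meets it, fall back to the run with the smallest gap.
    """
    agreeing = [r for r in runs if abs(gap(r)) <= tolerance]
    if agreeing:
        return max(agreeing, key=lambda r: r.test_score)
    return min(runs, key=lambda r: abs(gap(r)))


# Made-up numbers, for illustration only.
runs = [
    Run("baseline", learn_rate=0.10, max_nodes=6, train_score=0.95, test_score=0.80),
    Run("slower learn rate", learn_rate=0.01, max_nodes=6, train_score=0.86, test_score=0.82),
    Run("smaller trees", learn_rate=0.10, max_nodes=2, train_score=0.84, test_score=0.81),
]
best = pick_run(runs)
print(best.name, round(gap(best), 3))
```

Here the baseline run has the highest train score but is set aside because its train/test gap is large; among the remaining runs, the one with the best test score is kept.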
We offer these answers to the following specific questions:
Are large differences between train and test results common? Is this in and of itself a problem?
- Large divergences between train and test performance in TreeNet models are not an everyday occurrence, but they are not rare either.
- Large train/test performance differences in TreeNet models are not necessarily a problem, but we take them as an indication that the model is probably suboptimal and can be improved by adjusting the TreeNet control parameters and the predictors used.


