MARS® (Multivariate Adaptive Regression Splines), introduced by world-renowned Stanford University physicist and statistician Professor Jerome H. Friedman in a 1988 technical report and published in 1991, is one of the landmarks in the evolution of regression methods. For the first time, analysts could use a search procedure designed to discover nonlinearity and interactions automatically within the framework of classical regression.
MARS is an innovative, flexible modeling tool that automates the building of accurate predictive models for continuous and binary dependent variables.
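The nonlinearity and interactions mentioned above are usually expressed in a MARS-style model through hinge functions of the form max(0, x - t) and max(0, t - x), where t is a knot, and through products of such functions. The following is a minimal sketch of that idea in Python with NumPy. The helper name `hinge`, the knots, and the coefficients are all hypothetical and chosen only for illustration; none of them come from a fitted model.

```python
import numpy as np

def hinge(x, knot, direction=+1):
    """Return max(0, x - knot) for direction=+1, or max(0, knot - x) for direction=-1."""
    return np.maximum(0.0, direction * (x - knot))

# Two illustrative predictors (made-up values).
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
x2 = np.array([5.0, 4.0, 3.0, 2.0, 1.0])

bf1 = hinge(x1, knot=2.0, direction=+1)         # active only where x1 > 2
bf2 = hinge(x1, knot=2.0, direction=-1)         # active only where x1 < 2
bf3 = bf1 * hinge(x2, knot=1.5, direction=+1)   # interaction: needs x1 > 2 and x2 > 1.5

# The model is an ordinary linear regression on the basis functions.
# The coefficients below are hypothetical.
prediction = 10.0 + 1.5 * bf1 - 0.8 * bf2 + 0.3 * bf3
print(prediction)
```

Because each basis function is zero on one side of its knot, the slope of the prediction can change at the knot, which is how a model that is linear in its coefficients can still follow a nonlinear relationship.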
The major advantage of MARS is that it automates aspects of regression modeling that are difficult and time-consuming, such as finding nonlinear relationships, detecting interactions among predictors, and handling missing values.
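MARS is not a black box: the fitted model is an explicit regression equation whose terms can be inspected one by one. MARS models are typically faster to build and easier to interpret than neural nets, and they are often more accurate as well.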
Almost all modeling technologies can track training data accurately; the harder question is whether a model holds up on new data. MARS protects users from misleading results through its two-stage modeling process. MARS deliberately overfits its model in the first stage and then prunes away all components that would not hold up with new data. MARS assesses each candidate model with one of two built-in testing regimens: cross-validation or reference to independent test data. Using these tests, MARS determines the degree of accuracy that can be expected from the best predictive model.
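The two-stage process described above can be sketched for a single predictor as follows. This is a simplified illustration under stated assumptions, not the MARS product's actual search: the function names (`basis`, `forward_pass`, `backward_pass`), the candidate knots taken from quantiles, the limit of six knots, and the synthetic data are all hypothetical. Pruning here is scored against independent test data, one of the two regimens mentioned above.

```python
import numpy as np

def basis(x, knots):
    """Intercept plus a mirrored hinge pair, max(0, x - t) and max(0, t - x), per knot t."""
    cols = [np.ones_like(x)]
    for t in knots:
        cols += [np.maximum(0.0, x - t), np.maximum(0.0, t - x)]
    return np.column_stack(cols)

def fit(B, y):
    """Least-squares coefficients (lstsq tolerates linearly dependent columns)."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return coef

def forward_pass(x, y, max_knots=6):
    """Stage 1: greedily add the knot that most reduces training error (overfits on purpose)."""
    knots = []
    candidates = list(np.quantile(x, np.linspace(0.05, 0.95, 19)))
    for _ in range(max_knots):
        def sse(t):
            B = basis(x, knots + [t])
            return np.sum((y - B @ fit(B, y)) ** 2)
        knots.append(min((t for t in candidates if t not in knots), key=sse))
    return knots

def backward_pass(x_tr, y_tr, x_te, y_te, knots):
    """Stage 2: delete terms one at a time; keep the subset that scores best on test data."""
    B_tr, B_te = basis(x_tr, knots), basis(x_te, knots)

    def train_sse(cols):
        return np.sum((y_tr - B_tr[:, cols] @ fit(B_tr[:, cols], y_tr)) ** 2)

    def test_mse(cols):
        return np.mean((y_te - B_te[:, cols] @ fit(B_tr[:, cols], y_tr)) ** 2)

    active = list(range(B_tr.shape[1]))
    best_cols, best_mse = list(active), test_mse(active)
    while len(active) > 1:
        # Drop the non-intercept term whose removal hurts the training fit least.
        active = min(([k for k in active if k != j] for j in active[1:]), key=train_sse)
        mse = test_mse(active)
        if mse <= best_mse:
            best_cols, best_mse = list(active), mse
    return best_cols, best_mse

# Synthetic data: flat up to x = 5, then rising, plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 400)
y = np.where(x < 5, 2.0, 2.0 + 1.5 * (x - 5)) + rng.normal(0, 0.5, 400)
x_tr, y_tr, x_te, y_te = x[:300], y[:300], x[300:], y[300:]

knots = forward_pass(x_tr, y_tr)
full_size = 1 + 2 * len(knots)
kept, mse = backward_pass(x_tr, y_tr, x_te, y_te, knots)
print(f"forward pass built {full_size} terms; pruning kept {len(kept)} (test MSE {mse:.3f})")
```

The forward pass is allowed to build more terms than the data can support; the backward pass then decides how many survive, using only data that played no part in fitting the coefficients.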
Where a regression tree assigns the same score to every record that falls in the same terminal node, MARS is capable of predicting with much higher resolution, typically producing unique scores for every record in a database. In this way, MARS expands on the capabilities of decision trees for regression.
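The difference in resolution can be illustrated with a small sketch. Both models below are hypothetical and hand-written rather than fitted: a two-leaf regression tree that returns one value per leaf, and a hinge-based model whose score varies continuously with the predictor.

```python
import numpy as np

x = np.linspace(0.0, 10.0, 11)  # eleven records with x = 0, 1, ..., 10

# Hypothetical two-leaf regression tree: every record in a leaf gets the same score.
tree_scores = np.where(x < 5.0, 2.0, 6.0)

# Hypothetical hinge model: the score changes with every change in x.
mars_scores = 2.0 + 0.8 * np.maximum(0.0, x - 5.0) - 0.3 * np.maximum(0.0, 5.0 - x)

print(len(np.unique(tree_scores)), len(np.unique(mars_scores)))  # prints: 2 11
```

A deeper tree would produce more distinct scores, but the number is still bounded by the number of leaves, whereas the hinge model distinguishes every record with a distinct predictor value.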
A MARS predictive model can be implemented in two ways. First, new databases can be scored directly by identifying the MARS model and the data to be scored. MARS will perform all the required data transformations and calculations automatically and output the predicted scores. Second, the MARS predictive equation can be exported as ready-to-run C and SAS®-compatible code that can be deployed in the user's application framework.
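As a rough illustration of the second route, the sketch below turns a small hinge-based model into C source text and also scores a record directly in Python. The model representation (a list of coefficient and hinge tuples), the function names `to_c` and `score`, and the layout of the generated C are assumptions made for this example; they do not reproduce the export format of the MARS software.

```python
def hinge_expr(var, knot, sign):
    """C expression for max(0, var - knot) when sign=+1, or max(0, knot - var) when sign=-1."""
    inner = f"{var} - {knot!r}" if sign > 0 else f"{knot!r} - {var}"
    return f"fmax(0.0, {inner})"

def to_c(intercept, terms, variables, name="mars_score"):
    """Render the model as a standalone C function (hypothetical layout)."""
    args = ", ".join(f"double {v}" for v in variables)
    lines = ["#include <math.h>", "",
             f"double {name}({args}) {{",
             f"    double score = {intercept!r};"]
    for coef, hinges in terms:
        product = " * ".join(hinge_expr(*h) for h in hinges)
        lines.append(f"    score += {coef!r} * {product};")
    lines += ["    return score;", "}"]
    return "\n".join(lines)

def score(intercept, terms, row):
    """Score one record (a dict of variable name -> value) with the same model."""
    total = intercept
    for coef, hinges in terms:
        value = coef
        for var, knot, sign in hinges:
            value *= max(0.0, sign * (row[var] - knot))
        total += value
    return total

# Hypothetical model: each term is (coefficient, [(variable, knot, sign), ...]).
INTERCEPT = 10.0
TERMS = [
    (1.5, [("x1", 2.0, +1)]),
    (-0.8, [("x1", 2.0, -1)]),
    (0.3, [("x1", 2.0, +1), ("x2", 1.5, +1)]),
]

c_source = to_c(INTERCEPT, TERMS, variables=["x1", "x2"])
print(c_source)
print(score(INTERCEPT, TERMS, {"x1": 3.0, "x2": 2.0}))
```

Because the model is a closed-form equation, the generated function needs nothing beyond the C math library, which is what makes this kind of export straightforward to deploy.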
For each predictor with missing values, MARS automatically creates a missing value indicator, a dummy variable that becomes one of the available predictors. These dummy variables record whether data are absent or present for the predictor in question.
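A missing value indicator of this kind can be sketched in a few lines. The function name `add_missing_indicator`, the fill value of 0.0, and the sample data are assumptions for illustration; how MARS itself uses the indicator inside its basis functions is not shown here.

```python
import numpy as np

def add_missing_indicator(x, fill_value=0.0):
    """Return the predictor with NaNs filled, plus a 0/1 dummy marking missing entries."""
    missing = np.isnan(x)
    filled = np.where(missing, fill_value, x)
    indicator = missing.astype(float)  # 1.0 where the value was absent, 0.0 where present
    return filled, indicator

# Hypothetical predictor with two missing entries.
income = np.array([52.0, np.nan, 38.5, np.nan])
income_filled, income_missing = add_missing_indicator(income)

print(income_filled)
print(income_missing)
```

The indicator column can then be offered to the model like any other predictor, so that "value was missing" can itself carry predictive information.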