Saving MARS® Regression Spline Basis Functions to a New Dataset

MARS® (Multivariate Adaptive Regression Splines), introduced by Stanford University data mining guru Professor Jerome H. Friedman in 1988, is one of the landmarks in the evolution of regression methods. For the first time, analysts could leverage a search mechanism designed to discover nonlinearity and interactions automatically in the context of classical regression. The MARS procedure involves a forward stepwise model-building stage followed by a backwards elimination of unneeded predictors, arriving at surprisingly high-performance models, all automatically. At the heart of the MARS algorithm is the search for "knots," or breaks in the range of a predictor, which allow a regression model containing that predictor to have a different slope in each region. Breaking predictors into regions permits nonlinearity, and interactions constructed from regions of predictors can reveal effects that hold only in specific parts of the data.
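To make the idea of a knot concrete, here is a minimal Python sketch of the mirrored pair of hinge functions that a single knot produces. The knot value (6.5 on the RM predictor), the intercept, and the two slopes are invented for illustration; they are not taken from any fitted model.

```python
def hinge_pair(x, knot):
    """Return the two mirrored hinge functions for one knot.

    The first is nonzero only above the knot, the second only below it.
    """
    return max(0.0, x - knot), max(0.0, knot - x)


def piecewise_fit(x, knot, intercept, slope_above, slope_below):
    """A one-predictor regression with a different slope on each side of the knot."""
    above, below = hinge_pair(x, knot)
    return intercept + slope_above * above + slope_below * below


# Hypothetical knot at RM = 6.5 with made-up coefficients.
for rm in (5.0, 6.0, 6.5, 7.0, 8.0):
    print(rm, hinge_pair(rm, 6.5), piecewise_fit(rm, 6.5, 20.0, 9.0, -1.0))
```

Each hinge is zero on one side of the knot, so its coefficient affects the fit only in its own region. That is what lets the regression line change slope at the knot.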

In Friedman's terminology, a sub-region of a predictor is called a "basis function," and in the MARS® model construction procedure no predictor is ever used in a model without first being transformed into one or more basis functions. A MARS® model always looks like a conventional multiple regression model, but the predictors in the model are selected from those that the MARS procedure has constructed automatically during its forward stepwise basis function construction phase. The predictors that make it into the final MARS model are those that have survived the automatic backwards "pruning" phase. Readers familiar with CART® will recognize the strategy; Friedman, the co-creator of CART® who wrote all of that procedure's source code, modeled MARS explicitly on the CART® methodology. And just as with CART®, the final set of nested, progressively smaller models that results from the backwards stepping is available to users who want to make their own choice about which model in the sequence best suits their needs.
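As a rough illustration of "a conventional regression on constructed predictors," the Python sketch below expands a raw record into basis function columns and then applies an ordinary linear formula to them. Every knot, coefficient, and basis function name (BF1 to BF4) is hypothetical; only the predictor names RM and LSTAT come from the BOSTON data used later in this article.

```python
# Hypothetical basis functions: each maps a raw record to one new column.
BASIS_FUNCTIONS = {
    "BF1": lambda r: max(0.0, r["RM"] - 6.5),
    "BF2": lambda r: max(0.0, 6.5 - r["RM"]),
    "BF3": lambda r: max(0.0, r["LSTAT"] - 6.0),
    # An interaction is the product of basis functions on different predictors.
    "BF4": lambda r: max(0.0, r["RM"] - 6.5) * max(0.0, r["LSTAT"] - 6.0),
}

# Made-up regression coefficients on the basis functions.
INTERCEPT = 22.0
COEFFICIENTS = {"BF1": 9.0, "BF2": -1.0, "BF3": -0.5, "BF4": -0.8}


def add_basis_columns(row):
    """Return a copy of the record with one extra column per basis function."""
    expanded = dict(row)
    for name, fn in BASIS_FUNCTIONS.items():
        expanded[name] = fn(row)
    return expanded


def predict(row):
    """Linear model whose only inputs are the basis function columns."""
    expanded = add_basis_columns(row)
    return INTERCEPT + sum(c * expanded[n] for n, c in COEFFICIENTS.items())


record = {"RM": 7.5, "LSTAT": 8.0}
print(add_basis_columns(record))
print(predict(record))
```

The raw predictors never enter the formula directly. Dropping a term from `COEFFICIENTS` corresponds to what the backwards pruning step does to a basis function.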

While this flexibility is highly desirable, there will be times when a modeler wants even more control over the construction and shaping of the final model. Because MARS prints the programming code that defines each basis function, users have always had the option of using that code to create for themselves the inputs available to the MARS® model. In SPM® 8.2, however, users comfortable with the command interface can also have SPM® create the basis functions and save them to a new data set. From there, modelers can run their own models, including using the GPS/Generalized Lasso as a modern alternative to backwards stepwise basis function elimination for selecting the final model.

Here is how to do it, using the BOSTON housing data set. Open a new SPM® Notepad using CTRL-N or from the menus File...New Notepad. There, enter these commands (you may well need to alter the file names and use fully qualified pathnames):

USE BOSTON.CSV
PARTITION CROSS=10
MODEL MV
KEEP AGE, B, CHAS, CRIM, DIS, INDUS, LSTAT, NOX, PT, RAD, RM, TAX, ZN
GPS SAVE="BOSTON_MARS_BF.CSV"
MARS GPS=YES GO

The first new item above is the GPS command. Here its only function is to name the file to which the MARS basis functions will be saved. The second new item is the GPS=YES option on the MARS GO command. Running this script will build the conventional MARS model and save the basis function data set. Now you are free to run your own experiments!
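One possible experiment outside SPM® is sketched below in Python. It fits a plain lasso by coordinate descent, which is a simpler stand-in for the GPS/Generalized Lasso and not the same algorithm. The layout of BOSTON_MARS_BF.CSV is an assumption here: the sketch supposes a header row, the target in a column named MV, and every other column holding a numeric basis function. Check the saved file before relying on that. The function names are hypothetical helpers, not part of any SPM® interface.

```python
import csv


def load_basis_functions(path, target="MV"):
    """Read a CSV, treating every non-target column as a numeric predictor (assumed layout)."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    names = [c for c in rows[0] if c != target]
    X = [[float(r[c]) for c in names] for r in rows]
    y = [float(r[target]) for r in rows]
    return names, X, y


def standardize(X):
    """Center each column and scale it to unit variance (constant columns become zeros)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    sds = [(sum((row[j] - means[j]) ** 2 for row in X) / n) ** 0.5 or 1.0
           for j in range(p)]
    return [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(X, y, lam, sweeps=200):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1 on centered X."""
    n, p = len(X), len(X[0])
    b0 = sum(y) / n                      # optimal intercept when columns are centered
    beta = [0.0] * p
    resid = [yi - b0 for yi in y]
    columns = [[row[j] for row in X] for j in range(p)]
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            denom = sum(c * c for c in col) / n
            if denom == 0.0:             # constant column carries no information
                continue
            # Covariance of column j with the residual that excludes its own effect.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / denom
            delta = new - beta[j]
            if delta:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return b0, beta


# Tiny synthetic check: y depends on the first column only.
X_demo = standardize([[1, 1], [2, -1], [3, 0], [4, 0], [5, -1], [6, 1]])
y_demo = [2.0 * v for v in (1, 2, 3, 4, 5, 6)]
print(lasso(X_demo, y_demo, lam=0.5))
```

To try it on the saved file, call `names, X, y = load_basis_functions("BOSTON_MARS_BF.CSV")` and pass `standardize(X)` and `y` to `lasso` with a few values of `lam`. Basis functions whose coefficient is exactly zero have been dropped, which plays the role that backwards elimination plays in classic MARS.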

COMING SOON: As you might expect, the next release of SPM® 8 will run the GPS/Generalized Lasso for you and display comparative results for classic MARS versus GPS post-processed MARS.

[J#393:1707]

Tags: MARS, Regression, SPM
