How to access data in relational databases via ODBC

*************
SPM 6.6 (TreeNet TN 6.4) or greater supports data access to Microsoft SQL Server, Oracle, MySQL, and other RDBMSs via the ODBC interface.

Since SQL queries cannot be entered in the standard Windows ODBC data source selection dialog, one has to use the command line to open data directly from SQL Server.

*************

How To Unlock The 30-Day Free Evaluation of Salford Predictive Modeler

The SPM® software suite must be downloaded with Administrator rights, and read/write and modify permissions MUST be applied to the /bin directory PRIOR to proceeding. If you need help with SPM installation (Administrator rights and ensuring proper permissions),
please contact us or email Support (at) salford-systems (dot) com
 
Once the above instructions have been completed, you can now request your Unlock Key.
To unlock the SPM software for your 30–day free evaluation, please FILL OUT THIS FORM
or e–mail the following information to Unlock (at) salford-systems (dot) com

Introduction to Tree-Based Machine Learning

The following videos cover the underlying methods in the SPM® 8.2 Software Suite and provide demonstrations of each of the modeling engines.

Software Featured in the Videos:

  • SPM® 8.2 Software Suite
  • CART® Software
  • RandomForests® Software
  • TreeNet® Software
  • MARS® Software
  • RuleLearner™ Software
  • ISLE© Software
  • GeneralizedPathSeeker™ Software

MARS® - Multivariate Adaptive Regression Splines

Automatic Non-Linear Regression

The MARS® modeling engine is ideal for users who prefer results in a form similar to traditional regression while capturing essential nonlinearities and interactions. The MARS methodology’s approach to regression modeling effectively uncovers important data patterns and relationships that are difficult, if not impossible, for other regression methods to reveal. The MARS modeling engine builds its model by piecing together a series of straight lines with each allowed its own slope. This permits the MARS modeling engine to trace out any pattern detected in the data.
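
The idea of "straight lines, each with its own slope" can be illustrated with hinge functions, the basis functions described in the published MARS method. The sketch below is only an illustration and not the SPM implementation: the function names `hinge` and `mars_predict`, the knot locations, and the coefficients are all hypothetical values chosen for the example.

```python
def hinge(x, knot, direction=1):
    """Return max(0, x - knot) for direction +1, or max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate a one-variable piecewise-linear model.

    terms is a list of (coefficient, knot, direction) tuples. Each term
    changes the slope of the fitted line on one side of its knot.
    """
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


# Hypothetical model: flat below 10, slope 2.0 between 10 and 20,
# and slope 0.5 (2.0 - 1.5) above 20.
terms = [(2.0, 10.0, 1), (-1.5, 20.0, 1)]
for x in (5.0, 15.0, 25.0):
    print(x, mars_predict(x, 1.0, terms))
```

In a real model the knots and coefficients are found by the search procedure; here they are fixed by hand purely to show how the pieces combine.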

High-Quality Regression and Classification

The MARS Model is designed to predict numeric outcomes such as the average monthly bill of a mobile phone customer or the amount that a shopper is expected to spend in a web site visit. The MARS engine is also capable of producing high quality classification models for a yes/no outcome. The MARS engine performs variable selection, variable transformation, interaction detection, and self-testing, all automatically and at high speed.

High-Performance Results

Areas where the MARS engine has exhibited very high-performance results include forecasting electricity demand for power generating companies, relating customer satisfaction scores to the engineering specifications of products, and presence/absence modeling in geographical information systems (GIS).

Memory Requirements for the Salford Predictive Modeler® software suite

A user's license sets a limit on the amount of learn sample data that can be analyzed. The learn sample is the data used to build the model, and the limit is counted in data values: rows by columns, that is, the observations and variables used to build the model. Variables not used in the model do not count. Observations reserved for testing, or excluded for other reasons, do not count. Note that there is no limit on the number of test sample data points that may be analyzed.

Reading MySQL tables with SPM®

SPM® for Windows has long had the ability to read tables in relational databases through the ODBC interface. This capability was also recently added to the command line version on Windows and is planned for UNIX platforms (including MacOS X). The purpose of this article is to describe how to access MySQL databases specifically, but the same principles apply to accessing data stored in other relational database systems. Most likely, the only thing that will differ is the driver used.
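
To make the point about the driver concrete, here is a small sketch of how a DSN-less ODBC connection string is commonly assembled from keyword=value pairs. It does not show SPM command syntax, which is not given here. The helper name `odbc_connection_string`, the server, database, and credentials are hypothetical, and the driver names are assumptions about what is installed on the machine. Accepted keywords also vary by driver, so the driver's own documentation should be checked.

```python
def odbc_connection_string(driver, server, database, user, password, port=None):
    """Build a DSN-less ODBC connection string from keyword=value pairs."""
    parts = [
        ("DRIVER", "{" + driver + "}"),  # driver name as registered with ODBC
        ("SERVER", server),
        ("DATABASE", database),
        ("UID", user),
        ("PWD", password),
    ]
    if port is not None:
        parts.append(("PORT", str(port)))
    return ";".join(f"{key}={value}" for key, value in parts) + ";"


# Hypothetical MySQL example; the driver name must match an installed driver.
print(odbc_connection_string(
    "MySQL ODBC 8.0 Unicode Driver", "localhost", "sales", "analyst", "secret", 3306
))

# Switching databases mostly means switching the driver name (again an assumption).
print(odbc_connection_string(
    "ODBC Driver 17 for SQL Server", "localhost", "sales", "analyst", "secret"
))
```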

Saving MARS® Regression Spline Basis Functions to a New Dataset

MARS® (Multivariate Adaptive Regression Splines), introduced by Stanford University data mining guru Professor Jerome H. Friedman in 1988, is one of the landmarks in the evolution of regression methods. For the first time analysts could leverage a search mechanism intended to automatically discover nonlinearity and interactions in the context of classical regression.

Software Demonstrations

The videos contain demonstrations of the techniques using the SPM® Software Suite. Software Featured in the Videos: SPM® Software Suite, CART® Software, Random Forests® Software, TreeNet® Software, MARS® Software, RuleLearner® Software, ISLE© Software, Generalized PathSeeker™ Software.

SPM® 8.2 Software Suite Demonstrations

Introduction to SPM® 8.2 Software & Exploring Data

 

A Fast Introduction to RandomForests® Software

 

CART® Software For Regression: Part I

 
This video provides an introduction to CART® software using the SPM® 8.2 Software Suite.

Introduction to MARS® Software for Regression

 

Introduction to TreeNet® Software for Binary Classification

 

Scoring New Data (Generate Predictions)

 
Table of Contents: click the button to the left of the full screen button (hover your mouse over the lower right hand corner of the video)

SPM® Scalability

A user's license sets a limit on the amount of learn sample data that can be analyzed. The learn sample is the data used to build the model, and the limit is counted in data values: rows by columns, that is, the observations and variables used to build the model. Variables not used in the model do not count. Observations reserved for testing, or excluded for other reasons, do not count. Note that there is no limit on the number of test sample data points that may be analyzed.

For example, consider our 32 MB version, which sets a learn sample limit of 8 MB. Each data point occupies 4 bytes, so an 8 MB capacity license allows up to 8 * 1024 * 1024 / 4 = 2,097,152 learn sample data points to be analyzed. A data point is a single value: 1 variable by 1 observation (1 row by 1 column).
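
The arithmetic above can be written as a short sketch. The 4 bytes per value and the 1,048,576 bytes per MB come from the text; the function names and the 50-variable example are hypothetical.

```python
BYTES_PER_VALUE = 4          # each data point occupies 4 bytes (from the text)
BYTES_PER_MB = 1024 * 1024   # 1 MB = 1,048,576 bytes


def learn_sample_capacity(limit_mb):
    """Number of learn sample values allowed by a data limit given in MB."""
    return limit_mb * BYTES_PER_MB // BYTES_PER_VALUE


def max_rows(limit_mb, n_model_columns):
    """Largest number of learn observations for a given number of model variables."""
    return learn_sample_capacity(limit_mb) // n_model_columns


print(learn_sample_capacity(8))  # 2097152
print(max_rows(8, 50))           # 41943
```

Only variables used in the model and observations in the learn sample enter `n_model_columns` and the row count, since unused variables and test observations do not count against the limit.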

The following is a table that describes the current set of "sizes" available. Please note that the minimum required RAM is **not** the same as the learn sample limitation.

Size (minimum required    Data limit (licensed learn    Data limit (licensed # of
physical memory, RAM,     sample data size in MB;       learn sample values,
in MB)                    1 MB = 1,048,576 bytes)       rows by columns)

32                        8                             2,097,152
64                        18                            4,718,592
128                       45                            11,796,480
256                       100                           26,214,400
512                       200                           52,428,800
1024                      400                           104,857,600
2048                      800                           209,715,200 (64-bit only)
3072                      1200                          314,572,800 (64-bit only)
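
As a rough aid for reading the table, the sketch below picks the smallest listed size whose learn sample limit covers a given dataset. The (RAM, data limit) pairs are copied from the table above and the 4 bytes per value comes from the text; the function name `smallest_size` and the example dataset are hypothetical, and actual licensing options should be confirmed with the vendor.

```python
# (minimum RAM in MB, licensed learn sample limit in MB), copied from the table
SIZES = [
    (32, 8), (64, 18), (128, 45), (256, 100),
    (512, 200), (1024, 400), (2048, 800), (3072, 1200),
]


def smallest_size(n_learn_rows, n_model_columns):
    """Return the smallest size (RAM in MB) that fits the learn sample, or None."""
    needed_values = n_learn_rows * n_model_columns
    for ram_mb, limit_mb in SIZES:
        allowed_values = limit_mb * 1024 * 1024 // 4  # 4 bytes per value
        if needed_values <= allowed_values:
            return ram_mb
    return None  # larger than any listed GUI size


# Hypothetical dataset: 100,000 learn observations by 50 model variables.
print(smallest_size(100_000, 50))  # 128
```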

Additional larger capacity is available under 64-bit operating systems using our non-GUI (command-line) builds. The non-GUI builds are very flexible and can be licensed for large data limits not currently available in the GUI product line. The current maximum data capacity for our non-GUI builds is 8 GB.

Survival Analysis with CART®, MARS®, and TreeNet®

CART®, MARS®, and TreeNet® were originally developed to analyze cross-sectional data, where each observation or record in the data is independent of all other records and no explicit accommodation is made for either time or censoring. Fortunately, research in statistics has shown us how to adapt our tools, as well as classical statistical tools such as logistic regression, to the analysis of time series cross-sectional and survival analysis data. This brief note outlines the topic, sometimes known as "discrete time survival analysis," showing you how to set up your data to estimate survival or failure time models. The methods discussed here also apply to the analysis of web logs and other sequentially-structured data. A collection of useful references is provided below.
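
One common way to set up data for discrete time survival analysis is the "person-period" layout, in which each subject contributes one row per time period at risk and a binary target marks the period in which the event occurred. The sketch below assumes that layout; it may differ in detail from the setup described in the full note. The function name, field names, and sample records are hypothetical.

```python
def to_person_period(records):
    """Expand (subject_id, duration, event_observed) into one row per period.

    Each subject contributes a row for every period it was at risk. The
    target is 1 only in the final period of a subject whose event was
    observed; censored subjects have target 0 in every row.
    """
    rows = []
    for subject_id, duration, event_observed in records:
        for period in range(1, duration + 1):
            is_last = period == duration
            rows.append({
                "id": subject_id,
                "period": period,
                "event": int(is_last and event_observed),
            })
    return rows


# Hypothetical data: A fails in period 3, B is censored after period 2.
records = [("A", 3, True), ("B", 2, False)]
for row in to_person_period(records):
    print(row)
```

The expanded rows can then be modeled as an ordinary binary classification problem, with `period` and any covariates as predictors.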

The Evolution of Regression Modeling: from Classical Linear Regression to Modern Ensembles

Webinar Title: The Evolution of Regression Modeling: from Classical Linear Regression to Modern Ensembles

Date/Time: Fridays, March 1, 15, and 29 and April 12, 2013, 10am-11am PST


Course Description:
Regression is one of the most popular modeling methods, but the classical approach has significant problems. This webinar series addresses these problems. Are you working with larger datasets? Is your data challenging? Does your data include missing values, nonlinear relationships, local patterns, and interactions? This webinar series is for you! We will cover improvements to conventional and logistic regression, and will include a discussion of classical, regularized, and nonlinear regression, as well as modern ensemble and data mining approaches. This series will be of value to any classically trained statistician or modeler.

Part 1: Regression methods discussed (download slides)

  • Classical Regression
  • Logistic Regression
  • Regularized Regression: GPS Generalized Path Seeker
  • Nonlinear Regression: MARS Regression Splines

The Evolution of Regression Modeling

 

Get In Touch With Us

Request online support

Ph: 619-543-8880
9685 Via Excelencia, Suite 208, San Diego, CA 92126