Like many programs, the Salford Predictive Modeler™ reads, writes, and otherwise manages temporary files in the course of its work. These are written to a particular directory on your computer called a “scratch directory”. SPM also writes a command log to the scratch directory. The GUI version of SPM allows the location of this directory to be set as an option (with a sensible default), but non-GUI versions determine where to write temporary files by means of environment variables. Presently, SPM searches for the following environment variables and uses the value of the first one defined as its scratch directory:
If no environment variable identifying the scratch directory is defined, or if the user does not have permission to write to the identified scratch directory, non-GUI SPM will issue an error message, like the following, and then terminate:
***ERROR***
CART was unable to update the command log.
Is your temporary file directory accessible?
=================================
Errors and warnings for this job:
ERROR : 20060
Message: CART was unable to update the command log.
Is your temporary file directory accessible?
The proper remedy in such cases is to define one of the above environment variables as pointing to the desired directory. Our usual recommendation is to use TMPDIR on UNIX and UNIX-like systems (such as Linux). The TEMP environment variable will normally be defined by default on Microsoft Windows systems.
To define an environment variable in a terminal session, one can use a command like the following in the C-Shell (csh):
% setenv TMPDIR /tmp
Most other UNIX shells, such as the Korn Shell (ksh), the Z-Shell (zsh), or the GNU Bourne Again Shell (bash), are derived from the Bourne Shell (sh) and share its syntax for this purpose. In such shells, environment variables are defined as follows:
$ TMPDIR=/tmp
$ export TMPDIR
Syntax like the following will also work in the Korn Shell and most other Bourne Shell derivatives, but not in the Bourne Shell itself:
$ export TMPDIR=/tmp
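Because the error shown earlier is triggered both by an undefined variable and by a directory that cannot be written to, it can be useful to check the directory before starting a non-GUI SPM job. The following is only a sketch in Bourne Shell syntax: the function name check_scratch_dir is hypothetical (it is not part of SPM), and /tmp is assumed as the fallback location.

```shell
# Hypothetical helper (not part of SPM): succeeds only if the given
# directory is named, exists, and is writable by the current user.
check_scratch_dir() {
    dir=${1:-}
    if [ -z "$dir" ]; then
        echo "no scratch directory given" >&2
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "$dir does not exist or is not a directory" >&2
        return 1
    fi
    if [ ! -w "$dir" ]; then
        echo "$dir is not writable by this user" >&2
        return 1
    fi
    return 0
}

# Assumption: fall back to /tmp when TMPDIR is not already defined.
TMPDIR=${TMPDIR:-/tmp}
export TMPDIR

check_scratch_dir "$TMPDIR" && echo "scratch directory OK: $TMPDIR"
```

If the check fails, the message on standard error indicates which of the three conditions was not met.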
As it is rather inconvenient to define environment variables every time one opens a new terminal session, UNIX shells generally provide for a start-up file that is executed automatically whenever a user logs on. The one used by the C-Shell is named .login and resides in the user's home directory ($HOME). If the command “setenv TMPDIR /tmp” is placed in that file, the environment variable TMPDIR will automatically be defined as /tmp for that particular user. Likewise, the Bourne Shell and derivatives use a file named .profile (also residing in the user's home directory) for the same purpose.
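For Bourne Shell derivatives, the definition can be appended to the start-up file by hand or with a small helper. The sketch below assumes a hypothetical function named ensure_line, which adds a line to a file only when that exact line is not already present, so running it repeatedly does not create duplicates.

```shell
# Hypothetical helper: append LINE to FILE unless FILE already
# contains that exact line.
ensure_line() {
    file=$1
    line=$2
    # -q: quiet, -x: match whole lines, -F: treat the pattern as a fixed string
    if ! grep -qxF "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

A typical call would be ensure_line "$HOME/.profile" 'TMPDIR=/tmp; export TMPDIR'. The change takes effect at the next login, or immediately in the current session after running ". $HOME/.profile".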
A system administrator can define environment variables for all users by placing the definitions in the system-wide startup files for the appropriate shells. The one for the C-Shell is /etc/csh.login on most UNIX-like systems, but on some (like Solaris) it is /etc/.login. The corresponding file for the Bourne Shell and derivatives is /etc/profile. In some UNIX-like systems (particularly Linux distributions), there is a directory /etc/profile.d containing scripts which are run after the system-wide shell startup scripts named above. The ones for the C-Shell have names with the .csh extension, while those for the Bourne Shell and derivatives have the .sh extension.
See your shell's documentation for details.
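As an illustration, a script placed in /etc/profile.d for the Bourne Shell and derivatives might look like the sketch below. The file name spm-tmpdir.sh and the choice of /tmp are assumptions made for the example; the script leaves TMPDIR alone if a value has already been defined.

```shell
# /etc/profile.d/spm-tmpdir.sh  (hypothetical file name)
# Define TMPDIR for all users unless it is already set.
if [ -z "${TMPDIR:-}" ]; then
    TMPDIR=/tmp    # assumption: /tmp is the desired scratch directory
    export TMPDIR
fi
```

A C-Shell counterpart would go in a file with the .csh extension and use setenv instead.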
It is also possible to define environment variables when the system starts up, but the locations of the files containing such definitions vary widely and are often not well documented. Also, changes in such definitions will not take effect until the system restarts.
The Salford Predictive Modeler™ suite (SPM) includes a number of automated tools to assist in the process of feature selection under the BATTERY mechanism. For example,
BATTERY KEEP
This option selects a subset of features at random and builds a model from that random subset only. The GUI will guide you in how to use this option, but from the command line you would issue something like:
BATTERY KEEP=100, 15
This command requests 100 models, each of which includes 15 randomly selected predictors. If certain variables must be included in every such model, the command would look like:
BATTERY KEEP=100, 15 CORE= X1, X2, X3, X4, X5
MIAMI – Salford Systems, the authority in data mining and predictive analytics software, unveiled its new Salford Predictive Modeler (SPM)™ software suite at NCDM 2010 here today. SPM provides businesses, institutions and government agencies with a highly accurate, ultra-fast platform for developing predictive, descriptive and analytical models from large and complex databases. SPM technology dramatically accelerates accurate, robust model generation by automatically sifting through such databases to isolate significant patterns and relationships. Yet the program is easy to use for both technical and nontechnical users.
Salford Predictive Modeling Suite (SPM) includes CART, MARS, TreeNet, and RandomForests, and powerful new automation and modeling capabilities not found elsewhere.
POWERFUL analytics you can trust
Find out how you can use SPM technology in ways that are core and critical to your analytics challenges.
SAN DIEGO – Data mining technology allows sports teams to find new indicators to measure player performance while helping them gain insight into athletes’ future success, asserted Mikhail Golovnya, Salford Systems’ senior scientist, during his presentation at the MIT Sloan Sports Analytics Conference in Boston last week.
SAN DIEGO – Dr. Falk Huettmann, a wildlife ecologist and professor at the University of Alaska-Fairbanks, has written a report entitled Future of Alaska in which he forecasts how climate change, human activities, natural disasters and cataclysmic events might affect Alaska’s ecosystem over the next 100 years.
SAN DIEGO – Salford Systems CEO Dan Steinberg and Salford product user Felipe Fernandez will share with KDD 2011 attendees how broad scale predictive modeling and marketing optimization can be used to improve retail sales. The presentation will be included in the conference’s inaugural Industrial Practice Expo on Tuesday, Aug. 23.
SAN DIEGO – Salford Systems announces its 2012 Analytics and Data Mining Conference with the launch of its new conference website. The conference will be held in San Diego, Calif., May 24-25, 2012.
MIAMI – For the first time since its release, Salford Systems will train analysts on the advanced and novel features of its Predictive Modeling Suite. An Introduction to SPM is one of the featured computer training workshops included at the 2011 Joint Statistical Meetings in Miami Beach, Fla.
SAN DIEGO – A recent study confirms that a 17-gene genomic biomarker, identified by Salford Systems’ data mining algorithm TreeNet®, enables the Epidermal Genetic Information Retrieval (EGIR) method to detect melanoma accurately.