Thursday, December 29 2011 10:44

Working With A Large Number of Variables In SPM

The Salford Systems Predictive Modeler (SPM), which includes CART®, MARS®, TreeNet®, and RandomForests®, can handle any number of variables you care to work with. By default, the software launches prepared to work with up to 32,768 variables, which is sufficient for many users. If you need to work with more, you just need to tell the software at the time the application is launched.

If you are working with a non-GUI version, you use command line arguments to inform the application of your preferences. The command line syntax is:

     SPM.EXE    -v< N >      Specifies max N variables for the session.
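
As a rough illustration of the syntax above, the following sketch builds the argument only when the requested count exceeds the 32,768 default. The function name `spm_var_flag` is hypothetical and not part of SPM; it only formats a string and does not launch anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of SPM): print the -v argument for N variables.
# Prints nothing when N fits within the documented default of 32,768.
spm_var_flag() {
    n=$1
    case $n in
        ''|*[!0-9]*)
            echo "expected a non-negative integer, got '$n'" >&2
            return 1
            ;;
    esac
    if [ "$n" -gt 32768 ]; then
        printf '%s\n' "-v$n"
    fi
}
```

Under these assumptions, `spm_var_flag 50000` prints `-v50000`, which could then be appended to the SPM.EXE command line.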

With the GUI version you do essentially the same thing, adding the command line argument by modifying the properties of the application.

For example, to tell SPM you expect to work with up to 50,000 variables, follow these steps:

  1. Right-click the program group icon and select “Properties.”
  2. In the Properties dialog, select the “Shortcut” tab.
  3. On the Shortcut tab, append the parameter “-V50000” to the “Target” path.

    The value used for this parameter is the maximum number of variables allowed in the application. For example, if you need to use 75,000 variables, set this parameter to -V75000.

  4. Click the [Apply] button.
  5. Click [OK] to close the shortcut properties dialog.
  6. Use your program group icon to start SPM or any other individual Salford Systems product.

Like many programs, the Salford Predictive Modeler™ reads, writes, and otherwise manages temporary files in the course of its work. These are written to a particular directory on your computer called a “scratch directory”. SPM also writes a command log to the scratch directory. The GUI version of SPM allows the location of this directory to be set as an option (with a sensible default), but non-GUI versions determine where to write temporary files by means of environment variables. Presently, SPM searches for the following environment variables and uses the value of the first one defined as its scratch directory:

  • CARTTEMP
  • SALFORDTEMP
  • TMPDIR
  • TEMP
  • TMP
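
The lookup described above can be approximated with a small shell function. This is only a sketch: the function name `spm_scratch_dir` is hypothetical, it assumes the variables are checked in the order listed, and it treats "defined" as "set to a non-empty value", which may differ from what SPM does internally.

```shell
#!/bin/sh
# Hypothetical helper (not part of SPM): print the value of the first
# non-empty variable from the documented list, checked in listed order.
spm_scratch_dir() {
    for name in CARTTEMP SALFORDTEMP TMPDIR TEMP TMP; do
        # Indirect lookup: read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -n "$value" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done
    return 1   # none of the variables is set
}
```

Running `spm_scratch_dir` before starting a non-GUI session gives a quick indication of which directory is likely to be picked up.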

If no environment variable identifying the scratch directory is defined, or if the user does not have permission to write to the identified scratch directory, non-GUI SPM will issue an error message, like the following, and then terminate:

***ERROR***
CART was unable to update the command log.
Is your temporary file directory accessible?

=================================
Errors and warnings for this job:

ERROR :  20060
Message: CART was unable to update the command log.
Is your temporary file directory accessible?

The proper remedy in such cases is to define one of the above environment variables as pointing to the desired directory. Our usual recommendation is to use TMPDIR on UNIX and UNIX-like systems (such as Linux). The TEMP environment variable will normally be defined by default on Microsoft Windows systems.
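
One way to apply this remedy in a launch script is to verify the directory before exporting the variable. The sketch below assumes a Bourne-style shell; the function name `ensure_scratch_dir` and the `/tmp` fallback are illustrative choices, not anything SPM provides.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of SPM): export TMPDIR only if the
# given directory exists and is writable by the current user.
ensure_scratch_dir() {
    dir=${1:-/tmp}
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        TMPDIR=$dir
        export TMPDIR
        return 0
    fi
    echo "scratch directory '$dir' is missing or not writable" >&2
    return 1
}
```

Calling `ensure_scratch_dir /tmp` before starting SPM catches a missing or read-only directory early, rather than after the application has already terminated with error 20060.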

Defining Environment Variables on UNIX-Like Systems

To define an environment variable in a terminal session, use a command like the following in the C-Shell (csh):

% setenv TMPDIR /tmp

Most other UNIX shells, such as the Korn Shell (ksh), the Z-Shell (zsh), or the GNU Bourne Again Shell (bash), are derived from the Bourne Shell (sh) and use similar syntax. In such shells, environment variables are defined as follows:

$ TMPDIR=/tmp
$ export TMPDIR

Syntax like the following will also work in the Korn Shell and most other Bourne Shell derivatives, but not in the Bourne Shell itself:

$ export TMPDIR=/tmp
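
The export step matters because a plain assignment creates only a shell variable, which child processes (such as a non-GUI SPM session) do not inherit. The following sketch demonstrates the difference using a throwaway variable named `SPM_DEMO_DIR`, a name chosen purely for illustration.

```shell
#!/bin/sh
# Illustration only: SPM_DEMO_DIR is a made-up variable name.
unset SPM_DEMO_DIR
SPM_DEMO_DIR=/tmp                      # shell variable, not yet exported

# A child shell does not see the unexported variable.
before=$(sh -c 'echo "${SPM_DEMO_DIR:-unset}"')

export SPM_DEMO_DIR                    # now part of the environment

# The same child command now inherits the value.
after=$(sh -c 'echo "${SPM_DEMO_DIR:-unset}"')

echo "before export: $before"
echo "after export:  $after"
```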

As it is rather inconvenient to define environment variables every time one opens a new terminal session, UNIX shells generally provide for a start-up file that is executed automatically whenever a user logs on. The one used by the C-Shell is named .login and resides in the user's home directory ($HOME). If the command “setenv TMPDIR /tmp” is placed in that file, the environment variable TMPDIR will automatically be defined as /tmp for that particular user. Likewise, the Bourne Shell and derivatives use a file named .profile (also residing in the user's home directory) for the same purpose.
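
For Bourne-style shells, the definition can be added to the start-up file with a small helper that avoids writing it twice. This is a sketch under stated assumptions: the function name `add_tmpdir_to_profile` is hypothetical, the directory path is assumed to contain no spaces or shell metacharacters, and the duplicate check only looks for a line reading exactly `export TMPDIR`.

```shell
#!/bin/sh
# Hypothetical helper: append a TMPDIR definition to a Bourne-style start-up
# file, unless an "export TMPDIR" line is already present.
add_tmpdir_to_profile() {
    profile=${1:-$HOME/.profile}
    dir=${2:-/tmp}
    if [ -f "$profile" ] && grep -q '^export TMPDIR$' "$profile"; then
        return 0   # already defined; leave the file unchanged
    fi
    printf 'TMPDIR=%s\nexport TMPDIR\n' "$dir" >> "$profile"
}
```

Calling `add_tmpdir_to_profile` with no arguments targets `$HOME/.profile` and `/tmp`; the change takes effect at the next login.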

A system administrator can define environment variables for all users by placing the definitions in the system-wide startup files for the appropriate shells. The one for the C-Shell is /etc/csh.login on most UNIX-like systems, but on some (like Solaris) it is /etc/.login. The corresponding file for the Bourne Shell and derivatives is /etc/profile. On some UNIX-like systems (particularly Linux distributions), there is a directory /etc/profile.d containing scripts that are run after the system-wide shell startup scripts named above. The ones for the C-Shell have names with the .csh extension, while those for the Bourne Shell and derivatives have the .sh extension.

See your shell's documentation for details.
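
On systems that have /etc/profile.d, a system-wide definition for Bourne-style shells might look like the fragment below. The filename is hypothetical, and leaving an existing TMPDIR untouched is a design choice of this sketch rather than a requirement of SPM.

```shell
# /etc/profile.d/spm-tmpdir.sh (hypothetical filename)
# Define TMPDIR for Bourne-style login shells, unless the user's
# environment already provides a non-empty value.
if [ -z "${TMPDIR:-}" ]; then
    TMPDIR=/tmp
    export TMPDIR
fi
```

Guarding the assignment lets individual users keep their own scratch directory by setting TMPDIR before this script runs.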

It is also possible to define environment variables when the system starts up, but the locations of the files containing such definitions vary widely and are often not well documented. Also, changes in such definitions will not take effect until the system restarts.
