Scalability
To accommodate different dataset sizes, RandomForests® is available in several memory sizes. The standard memory version of RandomForests for Windows is compiled for a machine with at least 64 MB of memory (RAM) and can analyze more than 4.5 million learn sample data values (observations times variables). The table below shows the approximate number of learn sample values that can be used in an analysis for each RandomForests version size.

A user's license sets a limit on the amount of learn sample data that can be analyzed. The learn sample is the data used to grow the maximal tree. Note that there is no limit to the number of test sample data points that may be analyzed.

For example, suppose our 32 MB version sets a learn sample limit of 8 MB. Each data point occupies 4 bytes. Therefore, an 8 MB license allows up to 8 * 1024 * 1024 / 4 = 2,097,152 learn sample data points to be analyzed. A data point is the value of one variable for one observation, that is, a single cell (one row by one column) of the data.
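The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation described in the text; the function names are hypothetical and are not part of RandomForests.

```python
BYTES_PER_MB = 1024 * 1024   # 1 MB = 1,048,576 bytes
BYTES_PER_VALUE = 4          # each data point occupies 4 bytes


def max_learn_values(data_limit_mb: int) -> int:
    """Number of learn sample data points a given data limit (in MB) allows."""
    return data_limit_mb * BYTES_PER_MB // BYTES_PER_VALUE


def max_rows(data_limit_mb: int, n_columns: int) -> int:
    """Largest number of complete observations (rows) for a given column count."""
    return max_learn_values(data_limit_mb) // n_columns


print(max_learn_values(8))   # 2097152
print(max_rows(8, 100))      # 20971 rows when the learn sample has 100 variables
```

Because the limit counts rows times columns, a wider learn sample leaves room for proportionally fewer observations under the same license.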

In general, we feel that the analysis workspace provided to build the tree will be adequate for most modeling scenarios. However, if the user models a large number of high-level categorical predictors (predictors with many distinct categories), or uses a high-level categorical target, they may encounter workspace limitations that prevent the entire learn sample from being used. In these special cases the user will have to upgrade to a larger memory version.

The following is a table that describes the current set of "sizes" available. Please note that the minimum required RAM is not the same as the learn sample limitation. If you have any questions regarding the following information, please contact a sales representative.
  • Size = minimum recommended physical memory (RAM) in MB.
  • Data Limit MB = Licensed learn sample data size in MB (1 MB = 1,048,576 bytes)
  • Data Limit # of values = Licensed # of learn sample values (rows by columns)
  • SP cells = max number of 4 byte (Single Precision) workspace elements the program can use.
Single precision workspace may involve virtual memory when a run uses the maximum or near-maximum amount of workspace.

Size (MB)   Data Limit (MB)   Data Limit (# of values)   SP Cells (4-byte)
32          8                 2,097,152                  10,000,000
64          18                4,718,592                  13,500,000
128         45                11,796,480                 33,750,000
256         100               26,214,400                 75,000,000
512         200               52,428,800                 150,000,000
1024        400               104,857,600                250,000,000
2048        800               209,715,200                356,000,000


* Custom compiles of up to 32 GB are available.
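As a rough planning aid, the size table can be turned into a lookup that finds the smallest standard size whose data limit covers a given learn sample. The sketch below is an assumption-laden illustration: the names `LICENSE_TIERS` and `smallest_size_for` are hypothetical, and it checks only the licensed data limit, not the SP workspace, which may become the binding constraint when many high-level categorical variables are used.

```python
# (size in MB, licensed learn sample data limit in MB), copied from the size table
LICENSE_TIERS = [
    (32, 8),
    (64, 18),
    (128, 45),
    (256, 100),
    (512, 200),
    (1024, 400),
    (2048, 800),
]

BYTES_PER_MB = 1024 * 1024
BYTES_PER_VALUE = 4  # each learn sample value occupies 4 bytes


def smallest_size_for(n_rows: int, n_columns: int):
    """Return the smallest standard size (MB) whose data limit fits the
    learn sample, or None if no standard size is large enough."""
    needed_values = n_rows * n_columns
    for size_mb, limit_mb in LICENSE_TIERS:
        allowed_values = limit_mb * BYTES_PER_MB // BYTES_PER_VALUE
        if needed_values <= allowed_values:
            return size_mb
    return None  # larger than every standard size; a custom compile would be needed


print(smallest_size_for(20_000, 100))   # 32  (2,000,000 values)
print(smallest_size_for(100_000, 50))   # 128 (5,000,000 values exceeds the 64 MB tier)
```

A result of `None` means the learn sample exceeds the 2048 MB tier's data limit. Test sample data does not count toward the limit, so only learn sample rows and columns should be passed in.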