The 64-bit World is Coming! TRIPLE your Training Data Set Sizes!
One of the most important new developments for predictive modelers is the growing availability of 64-bit computing environments for everyday computation. Newer notebooks are coming out with 64-bit processors and versions of Windows; 64-bit servers have been available for some time. This note briefly discusses the significance of this development and explains what it means for both the modeler and the user of Salford Systems tools.
For users of CART, MARS, TreeNet, RandomForests or the Salford Predictive Modeling suite, the best news is that you can increase your training data capacity dramatically and still operate with the Salford GUI software. If all you want to learn is how to accomplish this now, skip down to the section on Large Data Capabilities near the end of this note.
64-bit versus 32-bit
I suspect that the vast majority of computer users don't really know the difference between 32-bit and 64-bit computing environments and would never be able to tell the difference if they were switched from one environment to the other. For power users, three important differences can impact daily computational activity:
A 64-bit system allows you to work effectively with vast amounts of RAM. Typical 64-bit systems now come equipped with 32 GB of RAM. Because CART, MARS, TreeNet, and RandomForests prefer to work with plentiful RAM, 64-bit environments are ideal for these tools.
A 64-bit system allows you to effectively work with vast files.
64-bit systems can potentially perform computations considerably faster than 32-bit systems that are otherwise identical.
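The RAM difference in the first point comes down to address width. As a rough, back-of-the-envelope sketch (the function name is just for illustration, and real operating systems make less than the theoretical maximum available to any one program), the number of bytes an n-bit address can distinguish is 2 to the power n:

```python
# Theoretical address space for a given address width.
# Actual per-process limits imposed by an operating system are lower.

GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(pointer_bits: int) -> int:
    """Number of distinct byte addresses an n-bit pointer can express."""
    return 2 ** pointer_bits


for bits in (32, 64):
    size = addressable_bytes(bits)
    print(f"{bits}-bit: {size:,} bytes = {size // GIB:,} GiB")
```

A 32-bit address tops out at 4 GiB in total, which is why adding more RAM only pays off once both the hardware and the operating system are 64-bit.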
64-bit systems are constructed from both 64-bit hardware and 64-bit software, starting with the operating system. If your hardware is 64-bit but your operating system software is not, then you will essentially be working in a 32-bit system. Naturally, 64-bit software will not run on 32-bit hardware, so you will never encounter that combination in your computing environment. However, much 32-bit software will run in 64-bit environments; I talk about this in detail below.
For most of us, the 64-bit CPU will appear to be something new, and for all practical purposes it is, but the 64-bit CPU has actually been around for some time. Wikipedia reports that IBM developed a 64-bit processor as early as 1961. Digital Equipment Corporation (DEC) introduced its Alpha line of minicomputers in 1992, and Salford Systems started using these fantastic machines in 1996. Game players have benefited from 64-bit processors since the Nintendo 64 console arrived in 1996.
From the advent of Windows 7 you can assume that almost all new computers, desktop or notebook, are 64-bit.
Until recently, most of us have been working with 32-bit operating systems such as Windows XP or some flavor of Linux or UNIX, but if you have a relatively new computer you might be running 64-bit without knowing it.
The 64-bit OS most likely to be found on your computer is a 64-bit edition of Windows Vista or Windows Server 2008. A 64-bit version of XP has been available since 2001, and Windows 7 also comes in a 64-bit edition.
Large Data Capabilities
The current versions of our Graphical User Interface (GUI) software are all 32-bit, although 64-bit versions are in the works. The good news is that you can run this 32-bit software on a 64-bit platform and access two to three times as much data as is possible on a 32-bit machine.
To take advantage of this larger capacity, all you have to do is:
Verify that you are running a 64-bit version of Windows. This could be Windows Server, 64-bit XP, 64-bit Vista, or 64-bit Windows 7.
Equip your machine with at least 4 GB RAM. Technically, you can work with less but you will get better performance with 4 GB.
Contact Salford Systems to obtain the "64-bit aware" version of your software upgraded to permit larger training data sets. You ought to be able to work with 2-3 GB of training data, versus the 1 GB that is possible now.
That's all there is to it!
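If you would rather check the first step from a script than from the Windows system properties dialog, the following is a minimal sketch using only the Python standard library. The helper name looks_64bit and the list of architecture strings are assumptions made for illustration; the list is not exhaustive, and none of this is part of any Salford tool.

```python
import platform
import struct

# Architecture names commonly reported by 64-bit systems.
# Assumption for illustration: this list is not exhaustive.
KNOWN_64BIT = {"amd64", "x86_64", "ia64", "arm64", "aarch64"}


def looks_64bit(machine: str) -> bool:
    """Return True if an architecture string names a known 64-bit platform."""
    return machine.strip().lower() in KNOWN_64BIT


# Architecture string reported for the host machine.
machine = platform.machine()
# Pointer width of the running Python process itself, in bits.
process_bits = struct.calcsize("P") * 8

print(f"Machine reports {machine!r}; recognized as 64-bit: {looks_64bit(machine)}")
print(f"This Python interpreter is a {process_bits}-bit process")
```

The two numbers can differ: a 32-bit program can run on a 64-bit machine, which is exactly the arrangement described above for the 32-bit GUI software.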
Technical Notes for Geeks
Our GUI apps (most of what we offer) are 32-bit Windows apps. We do not currently offer a 64-bit GUI, but in the interim we offer people who have 64-bit machines special versions of our 32-bit software that can run with considerably larger workspaces. We call these "64-bit aware" versions. Users ought to be able to work with training data sets that are two to three times larger than the capacity of our 1 GB versions.
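To get a feel for what that extra capacity means in rows of data, here is a rough estimate. It assumes every value occupies 8 bytes with no overhead, which is purely an assumption for illustration and not a description of how the Salford tools actually store data; the function max_rows is likewise hypothetical.

```python
# Rough row-count estimate for a given training data capacity.
# Assumption: each value takes 8 bytes and there is no storage overhead.

BYTES_PER_VALUE = 8
GB = 10 ** 9


def max_rows(capacity_gb: float, n_columns: int,
             bytes_per_value: int = BYTES_PER_VALUE) -> int:
    """Largest number of rows that fits in capacity_gb under the assumptions above."""
    return int(capacity_gb * GB) // (n_columns * bytes_per_value)


for capacity in (1, 2, 3):
    rows = max_rows(capacity, n_columns=100)
    print(f"{capacity} GB, 100 columns: about {rows:,} rows")
```

Under these assumptions, moving from 1 GB to 3 GB triples the number of rows you can train on at a fixed number of columns. Your actual limits will depend on your data and software version.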
These versions are provisional early releases, and may be subject to premium license pricing. Full 64-bit versions of our Windows apps will probably first become available in June 2010. Salford Systems has no problem making them, but we rely on some third-party software for report windows, grid controls, and graphs that is not yet 64-bit ready.
We also offer 64-bit versions of our software for Windows and selected varieties of UNIX (including Linux). These are text-based, command-oriented programs, primarily intended for batch-mode processing. Although these versions do not support graphical displays on their host Linux machines, if you download the grove files to a Windows machine you can access all the graphical displays in the Windows GUI. So long as you are comfortable using a Windows PC for graphics and a Linux/UNIX server for number crunching, you can enjoy the best of both worlds: huge training databases and superb graphical displays.