By default, R uses only one core for its computations. This can slow things down considerably when you run extensive analyses (e.g. bootstrapping over many groups).
However, you can set R up for parallel computing and boost speed considerably. Several R packages enable R to use multiple cores, but if you are new to the field, choosing the right one can be painful. Here, I show how a parallel R analysis can be set up in five easy steps.
The following code uses 'socket' mode, which works on all stand-alone operating systems. If you want to set it up on Windows, make sure your firewall does not block R. Your analysis must follow a split-apply-combine strategy; this simply means it must include some form of iteration, e.g. a model fitted for each region, species, etc.
Five steps to set up parallel R computation
1. Install and load packages into your workspace
doParallel and foreach register the cores and set up parallel computing; plyr provides the split-apply-combine functions that perform the analysis.
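A minimal sketch of this step, using the three packages named above:

```r
# install.packages(c("doParallel", "foreach", "plyr"))  # run once, if not yet installed

library(doParallel)  # parallel backend for foreach
library(foreach)     # looping construct used under the hood
library(plyr)        # split-apply-combine functions (e.g. dlply)
```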
2. Load your data
In my case, I am using Edgar Anderson's iris dataset. You might want to import your dataset via functions like read.table.
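For the iris example this is a single line; the read.table call below uses a hypothetical file name as a placeholder for your own data:

```r
data(iris)  # Edgar Anderson's iris data, shipped with base R
str(iris)   # 150 observations of 5 variables, 3 species

# For your own data, something like:
# mydata <- read.table("mydata.txt", header = TRUE)
```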
3. Specify the number of clusters to be used
detectCores() is not obligatory; it checks how many cores your PC/notebook/server has (in case you don't know). makeCluster() and registerDoParallel() set up the cores (also called registering). I am using six out of eight cores, keeping two free for my remaining applications, such as an e-mail client.
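A sketch of this step; the six-of-eight split follows the text, so adjust the number to your own hardware:

```r
library(doParallel)

detectCores()         # how many cores are available (not obligatory)
cl <- makeCluster(6)  # start six worker processes ('socket' cluster)
registerDoParallel(cl)
getDoParWorkers()     # confirms how many workers are registered
```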
4. Start your analysis
Here, I am using the iris dataset as input to produce a list with model parameters for each species. Via the .paropts argument, you pass all necessary data (and packages) into the workspace of each core. In my case, I pass only the iris dataset; no special packages are needed. Wrapping the call in proc.time() records how long the process runs (not obligatory).
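A sketch of the analysis step with plyr's dlply. The model formula is my own illustrative choice, not prescribed by the text, and the setup lines repeat steps 1 and 3 so the snippet runs on its own (with two cores for brevity):

```r
library(doParallel)
library(plyr)

cl <- makeCluster(2)          # cluster setup from step 3
registerDoParallel(cl)

ptm <- proc.time()            # start the timer (not obligatory)
models <- dlply(iris, .(Species), function(d) {
  lm(Sepal.Length ~ Petal.Length, data = d)   # one model per species
}, .parallel = TRUE,
   .paropts = list(.export = "iris"))          # pass needed objects to each core
proc.time() - ptm             # elapsed time of the parallel run

ldply(models, coef)           # combine: one row of coefficients per species
```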
5. Unregister your clusters
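Assuming the cluster object `cl` from step 3 (recreated here so the snippet is self-contained):

```r
library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)   # (from step 3)

stopCluster(cl)   # shut down the worker processes
registerDoSEQ()   # fall back to sequential execution so foreach keeps working
```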
I am a plant ecologist and post-doctoral fellow at Masaryk University.