No Format
module load R/3.4.0-gu (awoonga)
module load R/3.4.0 (gowonda)

Old Packages:
=============
module load R/3.3.3
OR
module load R/3.2.3
OR
module load R/3.2.0
module load R/3.4.0
There are other older versions as well.
module avail R

Griffith Uni R Users site

Griffith Uni R Users site: http://guru.forumotion.co.nz/


Accessing R on the gowonda cluster

...

No Format
#!/bin/bash
######################################################################
#### This section defines options for the pbs batching system
######################################################################
#PBS -m abe
#PBS -M YourEmail@griffith.edu.au
#PBS -q workq
#PBS -l select=1:ncpus=1:mem=2gb
#PBS -l walltime=60:00:00
#PBS -N vivaxIBMS3
#
###Load the pgi compilers.
module load compilers/pgi-32bit-12.4
###Load the R 
module load R/24.130.03


#####################################################################
#### This section is for my debugging purposes (not required)
######################################################################
echo Running on host `hostname`
echo Time is `date`
######################################################################
#### This section is setting up and running your executable or script
######################################################################
cd "$PBS_O_WORKDIR"
###cd /export/home/s12345/pbs/R/
echo Directory is `pwd`

source $HOME/.bashrc
R CMD BATCH '--args a=1 b=c(2,5,6)' test.R test.out
# R CMD BATCH runs an R program immediately
# instead of presenting an interactive prompt.
# '--args ...' is passed to the script, which reads it with commandArgs(TRUE)
# test.R is the R program
# test.out receives the R program output
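
The '--args' mechanism above also makes it easy to run the same script with several parameter values. The following is a minimal sketch, not part of the original job script: the function name batch_cmds and the output-file naming scheme are assumptions made for illustration, while test.R and the a=/b= arguments come from the example above. It only prints the command lines, so they can be reviewed before being pasted into a PBS script.

```shell
#!/bin/bash
# Hypothetical helper: print one R CMD BATCH command line per value of `a`.
# test.R and the a=/b= argument names come from the example above;
# the function name and the <script>_a<value>.out naming are assumptions.
batch_cmds() {
    local script="$1"
    shift
    local a
    for a in "$@"; do
        # The single quotes in the printed command stop the shell from
        # interpreting the parentheses in c(2,5,6).
        printf "R CMD BATCH '--args a=%s b=c(2,5,6)' %s %s_a%s.out\n" \
            "$a" "$script" "${script%.R}" "$a"
    done
}

# Print the commands for a = 1, 2 and 3
batch_cmds test.R 1 2 3
```

Each printed line writes to its own output file, so the runs do not overwrite one another.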

...

If the gputools package is installed, it will provide R interfaces to a handful of common statistical algorithms. These algorithms are implemented in parallel using a mixture of Nvidia's CUDA language, Nvidia's CUBLAS library, and EM Photonics' CULA libraries. On a computer equipped with an Nvidia GPU, some of these functions may be substantially more efficient than native R routines.

Sample

...

R script for test (test.R)

Source: https://www.r-bloggers.com/2007/08/including-arguments-in-r-cmd-batch-mode/

No Format
##First read in the arguments listed at the command line
args=(commandArgs(TRUE))
##args is now a character vector, one element per argument
## First check to see if arguments are passed.
## Then cycle through each element of the list and evaluate the expressions.
if(length(args)==0){
    print("No arguments supplied.")
    ##supply default values
    a = 1
    b = c(1,1,1)
}else{
    for(i in 1:length(args)){
         eval(parse(text=args[[i]]))
    }
}
print(a*2)
print(b*3)


Sample script to run on flashlite cluster


This sample will only work on the flashlite (UQ) cluster. If you are using flashlite, you may modify this template.

No Format
#! /bin/bash
# #PBS -m e
#PBS -N BigMemRjob
#PBS  -l  walltime=40:00:00
#PBS -q BigMemory
###Request 1.3TB of memory
#PBS -l nodes=1:ppn=24,mem=1331200mb,vmem=1331200mb
#PBS  -A qris-gu
                             
source $HOME/.bashrc
module  load R/3.2.3
cd  /home/userNameonFlashLite/pbs
export ROutfile=ROutfile001.txt

## RUN ##
echo "=== start R-3.2.3 ==="
Rscript --no-save myRscript.R &> $ROutfile
echo "Completed"


List all R packages

No Format
length(.packages(all.available=TRUE))
ip <- as.data.frame(installed.packages()[, c(1, 3:4)])
rownames(ip) <- NULL
ip <- ip[is.na(ip$Priority), 1:2, drop=FALSE]
print(ip, row.names=FALSE)

For example:


> length(.packages(all.available=TRUE))
[1] 200
> ip <- as.data.frame(installed.packages()[, c(1, 3:4)])
> rownames(ip) <- NULL
> ip <- ip[is.na(ip$Priority), 1:2, drop=FALSE]
> print(ip, row.names=FALSE)
       Package     Version
         abind       1.4-5
       acepack       1.4.1
       anchors       3.0-8
       askpass         1.1


How to run R code in parallel

Parallelisation using plyr and doParallel

The plyr and doParallel packages are available in R/4.0.3.

Threads vs. cores

There is often confusion between CPU threads and cores. A CPU core is the actual computation unit. Threads are a way of multitasking that lets multiple simultaneous tasks share the same CPU core. Multiple threads do not substitute for multiple cores, so for compute-intensive workloads (such as R jobs) what matters is the number of CPU cores available, not the number of threads. (Ref: https://jstaf.github.io/hpc-r/parallel/)

No Format
Example:
module load R/4.0.3

> library(plyr)
> library(doParallel)
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
> cores <- detectCores()
> cores
[1] 72
> registerDoParallel(cores=12)
> fake_func <- function(x) {
+   Sys.sleep(0.1)
+   return(x)
+ }
> library(microbenchmark)
> microbenchmark(
+   serial = llply(1:24, fake_func),
+   parallel = llply(1:24, fake_func, .parallel = TRUE),
+   times = 1
+ )
Unit: milliseconds
     expr       min        lq      mean    median        uq       max neval
   serial 2424.3580 2424.3580 2424.3580 2424.3580 2424.3580 2424.3580     1
 parallel  226.2199  226.2199  226.2199  226.2199  226.2199  226.2199     1
>


Sample interactive pbs run


No Format
qsub -I -l select=1:ncpus=8:mem=12gb,walltime=1:11:00 -q small
Once you are on the compute node:
module load R/4.0.3
R
library(plyr)
library(doParallel)
cores <- detectCores()
cores
[1] 12
registerDoParallel(cores=8)
fake_func <- function(x) {
Sys.sleep(0.1)
return(x)
}
library(microbenchmark)
microbenchmark(
serial = llply(1:24, fake_func),
parallel = llply(1:24, fake_func, .parallel = TRUE),
times = 1
)


Unit: milliseconds
     expr       min        lq      mean    median        uq       max neval
   serial 2402.7702 2402.7702 2402.7702 2402.7702 2402.7702 2402.7702     1
 parallel  320.0491  320.0491  320.0491  320.0491  320.0491  320.0491     1
> 
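
In the interactive run above, detectCores() reports 12 (every core on the node) while the job only requested ncpus=8, so the worker count had to be typed in by hand. The sketch below shows one way to take it from the job allocation instead. It assumes the scheduler exports NCPUS inside a job, as PBS Pro normally does; if the variable is missing or not a positive number it falls back to nproc. The names worker_count and R_WORKERS are hypothetical.

```shell
#!/bin/bash
# Pick a worker count for registerDoParallel() from the PBS allocation.
# Assumption: the scheduler exports NCPUS inside a job (PBS Pro normally
# does for a select=...:ncpus=N request). Outside a job, or if NCPUS is
# not a positive integer, fall back to the core count reported by nproc.
worker_count() {
    local n="${NCPUS:-}"
    case "$n" in
        ''|0|*[!0-9]*) nproc ;;      # unset, zero or non-numeric
        *)             echo "$n" ;;  # use the allocation
    esac
}

# R_WORKERS is a hypothetical variable name for the R script to read
R_WORKERS="$(worker_count)"
export R_WORKERS
echo "Using $R_WORKERS worker(s)"
```

Inside R the value can then be read with Sys.getenv("R_WORKERS") and passed to registerDoParallel(cores=...) after converting it with as.integer().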


Create an R conda environment (recommended way to run R on the new cluster)

Another easy way to install R and R packages is to use conda. Here are the steps:

No Format
source /usr/local/bin/s3proxy.sh
module load anaconda3/2024.06
conda create --name myRenv R=4.3.1
source activate myRenv

To install packages such as "tidyverse", "lubridate", "ggplot2", "tmap", "ggmap", "sf", "ggsci", "remotes", "raster", "readxl", "terra",
search the web for the package name plus "conda", e.g. tidyverse conda.
The results give the install command for each package:

conda install r::r-lubridate
conda install r::r-tidyverse
conda install conda-forge::r-ggplot2
conda install conda-forge::r-tmap
conda install conda-forge::r-ggmap
conda install conda-forge::r-sf
conda install conda-forge::r-ggsci
conda install conda-forge::r-remotes
conda install conda-forge::r-raster
conda install conda-forge::r-readxl
conda install conda-forge::r-terra

Or simply:
conda create --name myRenv r=4.3.1 r-tidyverse r-ggplot2 r-tmap r-ggmap r-sf r-ggsci r-remotes r-raster r-readxl r-terra -c conda-forge 
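
The conda commands above follow a naming pattern: each R package appears as r-<name> in lowercase. The sketch below builds the install command from a plain list of R package names. It assumes this pattern holds for every package requested, which is true for the ones listed here but is not guaranteed in general; conda_r_names is a hypothetical helper name. The command is printed rather than executed so it can be checked first.

```shell
#!/bin/bash
# Turn R package names into conda package names.
# Assumption: the channel publishes each package as r-<lowercase name>;
# this matches the packages listed above but may not hold for all packages.
conda_r_names() {
    local pkg
    for pkg in "$@"; do
        printf 'r-%s\n' "$(printf '%s' "$pkg" | tr '[:upper:]' '[:lower:]')"
    done
}

pkgs="tidyverse lubridate ggplot2 tmap ggmap sf ggsci remotes raster readxl terra"

# Print the command instead of running it, so it can be reviewed first.
# $pkgs is left unquoted on purpose so that it splits into separate names.
echo conda install -c conda-forge $(conda_r_names $pkgs)
```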

Here is a sample pbs script:
#!/bin/bash
#PBS -m abe
#PBS -M XXXX@griffithuni.edu.au
#PBS -N MyTest
#PBS -q workq
#PBS -l select=1:ncpus=1:mem=4gb,walltime=01:00:00
module load anaconda3/2024.06
source activate myRenv
module list
cd $PBS_O_WORKDIR
R --save < MAIN_CODEmodel1.R
##sleep 22
echo "Done with job"

R script can start with loading the packages like this:
package_names <- c("tidyverse","lubridate","ggplot2", "tmap", "ggmap", "sf", "ggsci", "remotes", "raster", "readxl", "terra")

# Loop to load the packages, installing any that are missing
for (package_name in package_names) {
  if (!requireNamespace(package_name, quietly = TRUE)) {
    message(paste("Installing and loading", package_name))
    install.packages(package_name, dependencies = TRUE)
  }
  library(package_name, character.only = TRUE)
}

<snip>



Reference

1. http://yusung.blogspot.com.au/2009/01/install-jags-and-rjags-in-fedora.html

...