...

No Format
module load R/3.4.0-gu    # on awoonga
module load R/3.4.0       # on gowonda

Old Packages:
=============
module load R/3.3.3
OR
module load R/3.2.3
OR
module load R/3.2.0
There are other older versions as well. To list every available version:
module avail R

Griffith Uni R Users site

Griffith Uni R Users site: http://guru.forumotion.co.nz/


Accessing R on the gowonda cluster

...

No Format
Installing the akima package:

R
> install.packages('akima')
akima is now installed in R's library subfolder, along with its dependencies. Load it with:

> library(akima)

Alternatively, name the repository explicitly and request all dependencies:
> install.packages('akima', dependencies=TRUE, repos='http://cran.rstudio.com/')

To install several packages in one call:
> install.packages(c("EIAdata", "gdata", "ggmap", "ggplot2"))

 

Manually download the package

...

No Format
vi  /sw/R/2.13.0/lib64/R/bin/R

R_HOME_DIR=/sw/R/2.13.0/lib64/R
#R_HOME_DIR=/usr/local/src/apps/R-2.13.0/lib64/R

Additional packages

mvtnorm

...

You may need to add the following line to ~/.R/Makevars for some packages to compile successfully:

CXX14 = g++ -std=c++1y -Wno-unused-variable -Wno-unused-function -fPIC
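The Makevars line above can also be added from the shell. The following is a minimal sketch, assuming bash; the MAKEVARS and CXX14_LINE variables are introduced here only for illustration. It appends the line once and leaves the file unchanged on later runs.

```shell
#!/bin/bash
# Path to the per-user Makevars file (MAKEVARS is an illustrative variable, not an R setting).
MAKEVARS="${MAKEVARS:-$HOME/.R/Makevars}"
CXX14_LINE='CXX14 = g++ -std=c++1y -Wno-unused-variable -Wno-unused-function -fPIC'

# Create ~/.R and the file if they do not exist yet.
mkdir -p "$(dirname "$MAKEVARS")"
touch "$MAKEVARS"

# Append the line only if no CXX14 setting is present yet.
if ! grep -q '^CXX14[[:space:]]*=' "$MAKEVARS"; then
    printf '%s\n' "$CXX14_LINE" >> "$MAKEVARS"
fi
```

If the file already defines CXX14 with different flags, the sketch leaves it alone, so check the existing value by hand in that case.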
mvtnorm
No Format
> install.packages("mvtnorm", repos="http://R-Forge.R-project.org")
trying URL 'http://R-Forge.R-project.org/src/contrib/mvtnorm_0.9-9992.tar.gz'
Content type 'application/x-gzip' length 315853 bytes (308 Kb)
opened URL
==================================================
downloaded 308 Kb

* installing *source* package ‘mvtnorm’ ...
** libs
gcc -std=gnu99 -I/sw/R/2.13.0/lib64/R/include  -I/usr/local/include    -fpic  -g -O2 -c miwa.c -o miwa.o
gfortran   -fpic  -g -O2 -c mvt.f -o mvt.o
gcc -std=gnu99 -I/sw/R/2.13.0/lib64/R/include  -I/usr/local/include    -fpic  -g -O2 -c randomF77.c -o randomF77.o
gfortran   -fpic  -g -O2 -c tvpack.f -o tvpack.o
gcc -std=gnu99 -I/sw/R/2.13.0/lib64/R/include  -I/usr/local/include    -fpic  -g -O2 -c tvpackAux.c -o tvpackAux.o
gcc -std=gnu99 -shared -L/usr/local/lib64 -o mvtnorm.so miwa.o mvt.o randomF77.o tvpack.o tvpackAux.o -lgfortran -lm -L/sw/R/2.13.0/lib64/R/lib -lR
installing to /sw/R/2.13.0/lib64/R/library/mvtnorm/libs
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded

* DONE (mvtnorm)

The downloaded packages are in
        ‘/tmp/RtmpVkCqG5/downloaded_packages’
Updating HTML index of packages in '.Library'
Making packages.html  ... done

...

No Format
module load mpi/openMPI/1.8.5-gcc4.9.0
R CMD INSTALL --configure-args="--with-Rmpi-include=/sw/openMPI/1.8.5-gcc4.9.0/include --with-Rmpi-libpath=/sw/openMPI/1.8.5-gcc4.9.0/lib/openmpi --with-Rmpi-type=OPENMPI" /tmp/Rmpi_0.6-5.tar.gz

...


gpuR

No Format
module load cuda/7.5.18
install.packages('gpuR', dependencies=TRUE, repos='http://cran.rstudio.com/')

...

No Format
install.packages("devtools")
library(devtools)
install_github("enriquea/feseR")

...


R-3.3.3 Installation

 


Pre-Requisite

No Format
Pre-Requisite
============
>>>>>>>>>>>Install xz to take care of lzma dependency >>>>>>>>>>>>>>>>>>>>>>>>>>
xz-5.3.3
tar -zxvf xz-5.2.3.tar.gz
cd xz-5.2.3
./configure --prefix=/sw/library/xz/5.3.3
make -j3
make install
>>>>>>>>>>>>pcre>>>>>>>>>>>>>>>
pcre-8.39
bunzip2 pcre-8.39.tar.bz2
tar -xvf pcre-8.39.tar
cd pcre-8.39
./configure --enable-utf8 --prefix=/sw/pcre/8.39
make -j3
make install
>>>>>>>>curl>>>
wget https://curl.haxx.se/download/curl-7.53.1.tar.gz
tar -zxvf curl-7.53.1.tar.gz
cd curl-7.53.1
./configure --prefix=/sw/misc/curl/7.53.1
make -j3
make install

>>>>>>bzip2>>>>>>>>>
Got an error related to bzip2:
"/sw/library/bzip2/1.0.6/lib/libbz2.a: could not read symbols: Bad value"

>>>>>>bzip2 fix>>>>>>>>>
Rebuild bzip2 with -fPIC added to CFLAGS in the Makefile:
cd /sw/library/bzip2/src/bzip2-1.0.6fpic
vi Makefile
CFLAGS=-Wall -Winline -O2 -g $(BIGFILES) -fPIC
make  2>&1 | tee makeLog.txt
make install PREFIX=/sw/library/bzip2/1.0.6fpic  2>&1 | tee makeInstallLog.txt
>>>>>>>

...

No Format
module purge
module load library/zlib/1.2.8
module load library/bzip2/1.0.6fpic
module load library/xz/5.3.3
module load pcre/8.39
module load misc/curl/7.53.1
module load ATLAS/3.9.39
module load jdk/1.8.0_66

./configure --prefix=/sw/R/3.3.3 '--with-cairo'  '--with-jpeglib' '--with-readline' '--with-tcltk'  '--with-blas' '--with-lapack' '--enable-R-profiling'  '--enable-R-shlib'  '--enable-memory-profiling' --enable-R-shlib  --enable-BLAS-shlib  LIBnn=lib64 JAVA_HOME=/sw/sdev/jdk/jdk1.8.0_66/jre    2>&1 | tee log.configure_R.txt

make  2>&1 | tee makeLog.txt

make install 2>&1 | tee makeInstallLog.txt

 


Ref: http://pj.freefaculty.org/blog/?p=315

...

No Format
module purge
module load library/zlib/1.2.8
module load library/bzip2/1.0.6fpic
module load library/xz/5.3.3
module load pcre/8.39
module load misc/curl/7.53.1
module load ATLAS/3.9.39
module load jdk/1.8.0_66
module load gcc/4.9.3
module load misc/openssl/1.0.2

 ./configure --prefix=/sw/R/3.4.0 '--with-cairo'  '--with-jpeglib' '--with-readline' '--with-tcltk'  '--with-blas' '--with-lapack' '--enable-R-profiling'  '--enable-R-shlib'  '--enable-memory-profiling' --enable-R-shlib  --enable-BLAS-shlib  LIBnn=lib64 JAVA_HOME=/sw/sdev/jdk/jdk1.8.0_66/jre  LDFLAGS=" -L/sw/pcre/8.39/lib"  CPPFLAGS="-I/sw/pcre/8.39/include"  2>&1 | tee log.configure_R.txt

make  2>&1 | tee makeLog.txt

make install 2>&1 | tee makeInstallLog.txt

 


Useful Commands to install additional packages in R

...


No Format
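# Install a package from the default repository (CRAN)
install.packages("foobarbaz")

# List the packages available from the configured repositories
ap <- available.packages()

# Choose the repositories to use; phyloseq is a Bioconductor package,
# so select the BioC repositories before installing it this way
setRepositories()
install.packages("phyloseq")

# Install from a source-hosting site with devtools
library(devtools)
install_github("packageauthor/foobarbaz")
install_bitbucket("packageauthor/foobarbaz")
install_gitorious("packageauthor/foobarbaz")

# Install from a specific repository
install.packages("Rbbg", repos = "http://r.findata.org")
install.packages("lubridate", dependencies=TRUE, repos='http://cran.rstudio.com/')

# Install a Bioconductor package (biocLite applies to R versions before 3.5;
# later versions use BiocManager::install)
source("https://bioconductor.org/biocLite.R")
biocLite("preprocessCore")

# Install an archived package from its tarball URL (devtools)
install_url("http://cran.r-project.org/src/contrib/Archive/sentiment/sentiment_0.2.tar.gz")

# From the shell rather than the R prompt: install a downloaded tarball
R CMD INSTALL /tmp/RcppParallel_4.3.20.tar.gz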

 


Sample PBS jobs

Serial jobs

No Format
#!/bin/bash
######################################################################
#### This section defines options for the pbs batching system
######################################################################
#PBS -m abe
#PBS -M YourEmail@griffith.edu.au
#PBS -q workq
#PBS -l select=1:ncpus=1:mem=2gb
#PBS -l walltime=60:00:00
#PBS -N vivaxIBMS3
#
###Load the pgi compilers.
module load compilers/pgi-32bit-12.4
###Load R
module load R/4.0.3


#####################################################################
#### This section is for my debugging purposes (not required)
######################################################################
echo Running on host `hostname`
echo Time is `date`
######################################################################
#### This section is setting up and running your executable or script
######################################################################
cd
cd  $PBS_O_WORKDIR
###cd /export/home/s12345/pbs/R/
echo Directory is `pwd`

source $HOME/.bashrc
R CMD BATCH vivaxIBMS3.r resultsIBMS3.Rout
# To pass arguments to the R program:
# R CMD BATCH '--args a=1 b=c(2,5,6)' test.R test.out
# R CMD BATCH runs the R program immediately
# instead of presenting an interactive prompt
#R.out is the screen output
#results.Rout is the R program output
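When several R programs need the same kind of serial job, the job file can be generated rather than copied by hand. The following is a minimal sketch, assuming bash; make_r_job and its arguments are hypothetical names, and the queue, resource request and R/4.0.3 module are taken from the examples on this page. It only writes the file and does not submit it.

```shell
#!/bin/bash
# Write a serial PBS job script for one R program.
# make_r_job and its arguments are illustrative names, not part of PBS or R.
make_r_job() {
    local job_name="$1" r_script="$2" walltime="${3:-01:00:00}"
    local out_file="${job_name}.pbs"
    # \$PBS_O_WORKDIR is escaped so it is expanded when the job runs, not now.
    cat > "$out_file" <<EOF
#!/bin/bash
#PBS -N ${job_name}
#PBS -q workq
#PBS -l select=1:ncpus=1:mem=2gb
#PBS -l walltime=${walltime}
module load R/4.0.3
cd "\$PBS_O_WORKDIR"
R CMD BATCH ${r_script} ${job_name}.Rout
EOF
    echo "$out_file"
}

# Example: generate (but do not submit) a job file for vivaxIBMS3.r
job_file="$(make_r_job vivaxIBMS3 vivaxIBMS3.r 60:00:00)"
echo "Wrote $job_file; submit it with: qsub $job_file"
```

Review the generated file before submitting it, and adjust the queue and memory request to suit the cluster you are on.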

...

If the gputools package is installed, it provides R interfaces to a handful of common statistical algorithms. These algorithms are implemented in parallel using a mixture of Nvidia's CUDA language, Nvidia's CUBLAS library, and EM Photonics' CULA libraries. On a computer equipped with an Nvidia GPU, some of these functions may be substantially more efficient than native R routines.

Sample

...


R script for test (test.R)

Source: https://www.r-bloggers.com/2007/08/including-arguments-in-r-cmd-batch-mode/

No Format
## First read in the arguments listed at the command line
args <- commandArgs(TRUE)
## args is now a character vector, one element per argument
## First check to see if arguments are passed.
## Then cycle through each element and evaluate the expression.
if (length(args) == 0) {
    print("No arguments supplied.")
    ## supply default values
    a <- 1
    b <- c(1, 1, 1)
} else {
    for (i in seq_along(args)) {
        eval(parse(text = args[[i]]))
    }
}
print(a * 2)
print(b * 3)


Sample script to run on flashlite cluster


This sample works only on the flashlite (UQ) cluster. If you use flashlite, you can adapt this template.

No Format
#! /bin/bash
# #PBS -m e
#PBS -N BigMemRjob
#PBS  -l  walltime=40:00:00
#PBS -q BigMemory
###Request 1.3TB of memory
#PBS -l nodes=1:ppn=24,mem=1331200mb,vmem=1331200mb
#PBS  -A qris-gu
                             
source $HOME/.bashrc
module  load R/3.2.3
cd  /home/userNameonFlashLite/pbs
export ROutfile=ROutfile001.txt

## RUN ##
echo "=== start R-3.2.3 ==="
Rscript --no-save myRscript.R &> $ROutfile
echo "Completed"
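The mem and vmem values in the template are given in megabytes. As a quick check of the arithmetic, assuming 1 GB is counted as 1024 MB (an assumption worth confirming for the scheduler in use), 1331200 MB corresponds to 1300 GB. The variable names below are illustrative.

```shell
#!/bin/bash
# Convert a memory request in GB to the MB figure used in "#PBS -l mem=...mb".
# Assumes 1 GB = 1024 MB; mem_gb and mem_mb are illustrative names.
mem_gb=1300
mem_mb=$(( mem_gb * 1024 ))
echo "#PBS -l nodes=1:ppn=24,mem=${mem_mb}mb,vmem=${mem_mb}mb"
```

Changing mem_gb gives the matching directive for a smaller or larger request.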


List all R packages

No Format
length(.packages(all.available=TRUE))
ip <- as.data.frame(installed.packages()[, c(1, 3:4)])
rownames(ip) <- NULL
ip <- ip[is.na(ip$Priority), 1:2, drop=FALSE]
print(ip, row.names=FALSE)

For example:


> length(.packages(all.available=TRUE))
[1] 200
> ip <- as.data.frame(installed.packages()[, c(1, 3:4)])
> rownames(ip) <- NULL
> ip <- ip[is.na(ip$Priority), 1:2, drop=FALSE]
> print(ip, row.names=FALSE)
       Package     Version
         abind       1.4-5
       acepack       1.4.1
       anchors       3.0-8
       askpass         1.1


How to run R code in parallel

Parallelisation using plyr and doParallel

plyr and doParallel are available in R/4.0.3.

Threads vs. cores

CPU threads and cores are often confused. A CPU core is the actual computation unit. Threads are a way of multitasking that lets several simultaneous tasks share the same CPU core. Multiple threads do not substitute for multiple cores, so for compute-intensive workloads (such as R jobs) what matters is the number of CPU cores available, not the number of threads. (Ref: https://jstaf.github.io/hpc-r/parallel/)

No Format
Example:
module load R/4.0.3

> library(plyr)
> library(doParallel)
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
> cores <- detectCores()
> cores
[1] 72
> registerDoParallel(cores=12)
> fake_func <- function(x) {
+   Sys.sleep(0.1)
+   return(x)
+ }
> library(microbenchmark)
> microbenchmark(
+   serial = llply(1:24, fake_func),
+   parallel = llply(1:24, fake_func, .parallel = TRUE),
+   times = 1
+ )
Unit: milliseconds
     expr       min        lq      mean    median        uq       max neval
   serial 2424.3580 2424.3580 2424.3580 2424.3580 2424.3580 2424.3580     1
 parallel  226.2199  226.2199  226.2199  226.2199  226.2199  226.2199     1

> 


Sample interactive pbs run


No Format
qsub -I -l select=1:ncpus=8:mem=12gb,walltime=1:11:00 -q small
Once you are on the compute node:
module load R/4.0.3
R
library(plyr)
library(doParallel)
cores <- detectCores()
cores
[1] 12
registerDoParallel(cores=8)
fake_func <- function(x) {
Sys.sleep(0.1)
return(x)
}
library(microbenchmark)
microbenchmark(
serial = llply(1:24, fake_func),
parallel = llply(1:24, fake_func, .parallel = TRUE),
times = 1
)


Unit: milliseconds
     expr       min        lq      mean    median        uq       max neval
   serial 2402.7702 2402.7702 2402.7702 2402.7702 2402.7702 2402.7702     1
 parallel  320.0491  320.0491  320.0491  320.0491  320.0491  320.0491     1
> 
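The timings in the two benchmark runs are close to what a simple model predicts: 24 tasks of 0.1 s each are processed in rounds, one task per worker per round. The following rough sketch, assuming bash and ignoring all start-up and communication overhead, computes that ideal figure; estimate_ms is a hypothetical helper, not part of R or doParallel.

```shell
#!/bin/bash
# Estimate the ideal run time: tasks are processed in rounds of "workers"
# tasks, and each task takes task_ms milliseconds.
# estimate_ms is an illustrative helper, not part of R or doParallel.
estimate_ms() {
    local tasks="$1" workers="$2" task_ms="$3"
    local rounds=$(( (tasks + workers - 1) / workers ))   # ceiling division
    echo $(( rounds * task_ms ))
}

echo "serial:     $(estimate_ms 24 1 100) ms"
echo "12 workers: $(estimate_ms 24 12 100) ms"
echo "8 workers:  $(estimate_ms 24 8 100) ms"
```

With 12 workers the model gives two rounds (200 ms, against about 226 ms measured), and with 8 workers three rounds (300 ms, against about 320 ms measured). Registering 8 workers also matches the ncpus=8 requested by the interactive job, even though detectCores() reports 12 on that node.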


Create an R conda environment (recommended way to run R on the new cluster)

Another easy way to install R and R packages is to use conda. Here are the steps:

No Format
source /usr/local/bin/s3proxy.sh
module load anaconda3/2024.06
conda create --name myRenv R=4.3.1
source activate myRenv

To install packages such as "tidyverse","lubridate","ggplot2", "tmap", "ggmap", "sf", "ggsci", "remotes", "raster", "readxl", "terra":
Search the web for the package name together with "conda", e.g. tidyverse conda
The results give the install command for each package:

conda install r::r-lubridate
conda install r::r-tidyverse
conda install conda-forge::r-ggplot2
conda install conda-forge::r-tmap
conda install conda-forge::r-ggmap
conda install conda-forge::r-sf
conda install conda-forge::r-ggsci
conda install conda-forge::r-remotes
conda install conda-forge::r-raster
conda install conda-forge::r-readxl
conda install conda-forge::r-terra

Or simply:
conda create --name myRenv r=4.3.1 r-tidyverse r-ggplot2 r-tmap r-ggmap r-sf r-ggsci r-remotes r-raster r-readxl r-terra -c conda-forge 

Here is a sample pbs script:
#!/bin/bash
#PBS -m abe
#PBS -M XXXX@griffithuni.edu.au
#PBS -N MyTest
#PBS -q workq
#PBS -l select=1:ncpus=1:mem=4gb,walltime=01:00:00
module load anaconda3/2024.06
source activate myRenv
module list
cd $PBS_O_WORKDIR
R '--save' <MAIN_CODEmodel1.R
##sleep 22
echo "Done with job"
>>>>>

R script can start with loading the packages like this:
package_names <- c("tidyverse","lubridate","ggplot2", "tmap", "ggmap", "sf", "ggsci", "remotes", "raster", "readxl", "terra")

# Loop to load the packages
for (package_name in package_names) {
  if (!requireNamespace(package_name, quietly = TRUE)) {
    message(paste("Installing and loading", package_name))
    install.packages(package_name, dependencies = TRUE)
  }
  library(package_name, character.only = TRUE)
}
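The single-command form of the conda environment shown earlier can be assembled from the same package list that the R script loads. The following is a minimal sketch, assuming bash and the usual conda-forge convention that a CRAN package "foo" is packaged as "r-foo" in lowercase, which should be checked for each package. The variable names are illustrative, and the command is printed for review rather than run.

```shell
#!/bin/bash
# Build a single "conda create" command from a list of CRAN package names.
# Assumes each CRAN package "foo" is available on conda-forge as "r-foo";
# env_name, r_version and cran_packages are illustrative names.
env_name="myRenv"
r_version="4.3.1"
cran_packages=(tidyverse lubridate ggplot2 tmap ggmap sf ggsci remotes raster readxl terra)

conda_specs=()
for pkg in "${cran_packages[@]}"; do
    # Lowercase the name and add the r- prefix.
    conda_specs+=("r-$(printf '%s' "$pkg" | tr '[:upper:]' '[:lower:]')")
done

# Print the command for review instead of running it.
cmd="conda create --name ${env_name} r=${r_version} ${conda_specs[*]} -c conda-forge"
echo "$cmd"
```

Packages installed this way are already present when the job starts, so the install.packages() fallback in the loading loop should not be triggered on the compute node.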

<snip>



Reference

1. http://yusung.blogspot.com.au/2009/01/install-jags-and-rjags-in-fedora.html

...