...
No Format |
---|
module load R/3.4.0-gu (awoonga)
module load R/3.4.0 (gowonda)
Old Packages:
=============
module load R/3.3.3
OR
module load R/3.2.3
OR
module load R/3.2.0
module load R/3.4.0
There are other older versions as well.
module avail R |
Griffith Uni R Users site
Griffith Uni R Users site: http://guru.forumotion.co.nz/
Accessing R on the gowonda cluster
...
No Format |
---|
Installing the akima package:
R
> install.packages('akima')
akima is now installed in R's library subfolder, with all its dependencies intact. You may load it using
library(akima)
OR use:
install.packages('akima', dependencies=TRUE, repos='http://cran.rstudio.com/')
install.packages(c("EIAdata", "gdata", "ggmap", "ggplot2")) |
Manually download the package
...
No Format |
---|
vi /sw/R/2.13.0/lib64/R/bin/R
R_HOME_DIR=/sw/R/2.13.0/lib64/R
#R_HOME_DIR=/usr/local/src/apps/R-2.13.0/lib64/R |
Additional packages
mvtnorm
...
You may need to add the following line to ~/.R/Makevars to compile some packages successfully:
CXX14 = g++ -std=c++1y -Wno-unused-variable -Wno-unused-function -fPIC
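The line can also be appended from the shell instead of editing the file by hand. The following is a minimal sketch, not a site-provided tool: the function name add_cxx14_flags is hypothetical, and it assumes the g++ on your PATH accepts -std=c++1y.

```shell
#!/bin/bash
# Append the CXX14 setting to an R Makevars file unless it is already present.
# The default target is the per-user file ~/.R/Makevars; a different path can
# be passed as the first argument (hypothetical helper, for illustration only).
add_cxx14_flags() {
    local makevars="${1:-$HOME/.R/Makevars}"
    local line='CXX14 = g++ -std=c++1y -Wno-unused-variable -Wno-unused-function -fPIC'
    mkdir -p "$(dirname "$makevars")"
    touch "$makevars"
    # -F fixed string, -x whole-line match, -q quiet: add the line only once
    if ! grep -Fxq -- "$line" "$makevars"; then
        printf '%s\n' "$line" >> "$makevars"
    fi
}
```

Calling add_cxx14_flags with no argument updates ~/.R/Makevars. Running it a second time leaves the file unchanged.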
mvtnorm
No Format |
---|
> install.packages("mvtnorm", repos="http://R-Forge.R-project.org")
trying URL 'http://R-Forge.R-project.org/src/contrib/mvtnorm_0.9-9992.tar.gz'
Content type 'application/x-gzip' length 315853 bytes (308 Kb)
opened URL
==================================================
downloaded 308 Kb
* installing *source* package ‘mvtnorm’ ...
** libs
gcc -std=gnu99 -I/sw/R/2.13.0/lib64/R/include -I/usr/local/include -fpic -g -O2 -c miwa.c -o miwa.o
gfortran -fpic -g -O2 -c mvt.f -o mvt.o
gcc -std=gnu99 -I/sw/R/2.13.0/lib64/R/include -I/usr/local/include -fpic -g -O2 -c randomF77.c -o randomF77.o
gfortran -fpic -g -O2 -c tvpack.f -o tvpack.o
gcc -std=gnu99 -I/sw/R/2.13.0/lib64/R/include -I/usr/local/include -fpic -g -O2 -c tvpackAux.c -o tvpackAux.o
gcc -std=gnu99 -shared -L/usr/local/lib64 -o mvtnorm.so miwa.o mvt.o randomF77.o tvpack.o tvpackAux.o -lgfortran -lm -L/sw/R/2.13.0/lib64/R/lib -lR
installing to /sw/R/2.13.0/lib64/R/library/mvtnorm/libs
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded
* DONE (mvtnorm)
The downloaded packages are in
‘/tmp/RtmpVkCqG5/downloaded_packages’
Updating HTML index of packages in '.Library'
Making packages.html ... done
|
...
No Format |
---|
#!/bin/bash
######################################################################
#### This section defines options for the pbs batching system
######################################################################
#PBS -m abe
#PBS -M YourEmail@griffith.edu.au
#PBS -q workq
#PBS -l select=1:ncpus=1:mem=2gb
#PBS -l walltime=60:00:00
#PBS -N vivaxIBMS3
#
###Load the pgi compilers.
module load compilers/pgi-32bit-12.4
###Load the R module
module load R/4.0.3
#####################################################################
#### This section is for my debugging purposes (not required)
######################################################################
echo Running on host `hostname`
echo Time is `date`
######################################################################
#### This section is setting up and running your executable or script
######################################################################
cd $PBS_O_WORKDIR
###cd /export/home/s12345/pbs/R/
echo Directory is `pwd`
source $HOME/.bashrc
R CMD BATCH '--args a=1 b=c(2,5,6)' test.R test.out
#CMD BATCH tells R to run an R program immediately
#instead of presenting an interactive prompt
#test.out is the R program output |
...
If the gputools package is installed, it provides R interfaces to a handful of common statistical algorithms. These algorithms are implemented in parallel using a mixture of Nvidia's CUDA language, Nvidia's CUBLAS library, and EM Photonics' CULA libraries. On a computer equipped with an Nvidia GPU, some of these functions may be substantially more efficient than native R routines.
Sample
...
R script for test (test.R)
Source: https://www.r-bloggers.com/2007/08/including-arguments-in-r-cmd-batch-mode/
No Format |
---|
##First read in the arguments listed at the command line
args=(commandArgs(TRUE))
##args is now a character vector, one element per argument
## First check to see if arguments are passed.
## Then cycle through each element of the list and evaluate the expressions.
if(length(args)==0){
print("No arguments supplied.")
##supply default values
a = 1
b = c(1,1,1)
}else{
for(i in 1:length(args)){
eval(parse(text=args[[i]]))
}
}
print(a*2)
print(b*3) |
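The script above evaluates each command-line argument as an R expression, so every argument must be a complete assignment such as a=1, and all of them travel inside one quoted --args string. The following is a small sketch of how that command line is put together; the helper name r_batch_cmd is hypothetical, and it only prints the command so that it can be checked before it goes into a job script.

```shell
#!/bin/bash
# Build the R CMD BATCH command line for a script, an output file and any
# number of name=value arguments (hypothetical helper, for illustration).
# The command is printed, not executed.
r_batch_cmd() {
    local script="$1" outfile="$2"
    shift 2
    # "$*" joins the remaining arguments with spaces into one --args string
    printf "R CMD BATCH '--args %s' %s %s\n" "$*" "$script" "$outfile"
}

# Quote arguments that contain parentheses so the shell passes them unchanged.
r_batch_cmd test.R test.out a=1 'b=c(2,5,6)'
# prints: R CMD BATCH '--args a=1 b=c(2,5,6)' test.R test.out
```

If no arguments are supplied to R, test.R falls back to its default values for a and b.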
Sample script to run on flashlite cluster
This sample works only on the flashlite (UQ) cluster. If you are using flashlite, you may adapt this template:
No Format |
---|
#! /bin/bash
# #PBS -m e
#PBS -N BigMemRjob
#PBS -l walltime=40:00:00
#PBS -q BigMemory
###Request 1.3TB of memory
#PBS -l nodes=1:ppn=24,mem=1331200mb,vmem=1331200mb
#PBS -A qris-gu
source $HOME/.bashrc
module load R/3.2.3
cd /home/userNameonFlashLite/pbs
export ROutfile=ROutfile001.txt
## RUN ##
echo "=== start R-3.2.3 ==="
Rscript --no-save myRscript.R &> $ROutfile
echo "Completed"
|
List all R packages
No Format |
---|
length(.packages(all.available=TRUE))
ip <- as.data.frame(installed.packages()[, c(1, 3:4)])
rownames(ip) <- NULL
ip <- ip[is.na(ip$Priority), 1:2, drop=FALSE]
print(ip, row.names=FALSE)
For example:
> length(.packages(all.available=TRUE))
[1] 200
> ip <- as.data.frame(installed.packages()[, c(1, 3:4)])
> rownames(ip) <- NULL
> ip <- ip[is.na(ip$Priority), 1:2, drop=FALSE]
> print(ip, row.names=FALSE)
Package Version
abind 1.4-5
acepack 1.4.1
anchors 3.0-8
askpass 1.1
|
How to run R code in parallel: Parallelisation using plyr and doParallel
The plyr and doParallel packages are installed in R/4.0.3.
Threads vs. cores
There is often confusion between CPU threads and cores. A CPU core is the actual computation unit. Threads are a way of multi-tasking that lets multiple simultaneous tasks share the same CPU core. Multiple threads do not substitute for multiple cores. Because of this, for compute-intensive workloads (such as R jobs) what typically matters is the number of CPU cores available, not the number of threads. (Ref: https://jstaf.github.io/hpc-r/parallel/)
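The distinction can be checked from the shell on a compute node. The following is a sketch under stated assumptions: the function names are hypothetical, lscpu is assumed to be the util-linux tool, and NCPUS is assumed to be set by the PBS scheduler inside a job to the number of CPUs requested. If your site does not set NCPUS, pass the number you requested with ncpus= explicitly.

```shell
#!/bin/bash
# Logical CPUs (hardware threads) visible to this process.
logical_cpus() {
    nproc
}

# Physical cores: count the unique (core, socket) pairs reported by lscpu.
# Falls back to nproc, which assumes one thread per core.
physical_cores() {
    if command -v lscpu >/dev/null 2>&1; then
        # lscpu -p prints comment lines starting with '#'; drop them
        lscpu -p=Core,Socket | grep -v '^#' | sort -u | wc -l
    else
        nproc
    fi
}

# Worker count to pass to registerDoParallel(cores = ...):
# prefer the PBS allocation (NCPUS, an assumption about the scheduler),
# otherwise use the physical core count of the machine.
worker_count() {
    echo "${NCPUS:-$(physical_cores)}"
}
```

Inside a job, worker_count should return the size of your allocation, not the size of the whole node. This matters because detectCores() in R reports every CPU on the node, including those allocated to other users.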
No Format |
---|
Example:
module load R/4.0.3
> library(plyr)
> library(doParallel)
Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
> cores <- detectCores()
> cores
[1] 72
> registerDoParallel(cores=12)
fake_func <- function(x) {
Sys.sleep(0.1)
return(x)
}
library(microbenchmark)
microbenchmark(
serial = llply(1:24, fake_func),
parallel = llply(1:24, fake_func, .parallel = TRUE),
times = 1
)
Unit: milliseconds
expr min lq mean median uq max neval
serial 2424.3580 2424.3580 2424.3580 2424.3580 2424.3580 2424.3580 1
parallel 226.2199 226.2199 226.2199 226.2199 226.2199 226.2199 1
>
|
Sample interactive pbs run
No Format |
---|
qsub -I -l select=1:ncpus=8:mem=12gb,walltime=1:11:00 -q small
Once you are on the compute node:
module load R/4.0.3
R
library(plyr)
library(doParallel)
cores <- detectCores()
cores
[1] 12
registerDoParallel(cores=8)
fake_func <- function(x) {
Sys.sleep(0.1)
return(x)
}
library(microbenchmark)
microbenchmark(
serial = llply(1:24, fake_func),
parallel = llply(1:24, fake_func, .parallel = TRUE),
times = 1
)
Unit: milliseconds
expr min lq mean median uq max neval
serial 2402.7702 2402.7702 2402.7702 2402.7702 2402.7702 2402.7702 1
parallel 320.0491 320.0491 320.0491 320.0491 320.0491 320.0491 1
>
|
Create an R conda environment (the recommended way to run R on the new cluster)
Another easy way to install R and R packages is to use conda. Here are the steps:
No Format |
---|
source /usr/local/bin/s3proxy.sh
module load anaconda3/2024.06
conda create --name myRenv R=4.3.1
source activate myRenv
To install packages such as "tidyverse", "lubridate", "ggplot2", "tmap", "ggmap", "sf", "ggsci", "remotes", "raster", "readxl", "terra":
Search the web for the package name plus "conda", e.g. tidyverse conda
The results give the install command for each package:
conda install r::r-lubridate
conda install r::r-tidyverse
conda install conda-forge::r-ggplot2
conda install conda-forge::r-tmap
conda install conda-forge::r-ggmap
conda install conda-forge::r-sf
conda install conda-forge::r-ggsci
conda install conda-forge::r-remotes
conda install conda-forge::r-raster
conda install conda-forge::r-readxl
conda install conda-forge::r-terra
Or simply:
conda create --name myRenv r=4.3.1 r-tidyverse r-ggplot2 r-tmap r-ggmap r-sf r-ggsci r-remotes r-raster r-readxl r-terra -c conda-forge
Here is a sample pbs script:
#!/bin/bash
#PBS -m abe
#PBS -M XXXX@griffithuni.edu.au
#PBS -N MyTest
#PBS -q workq
#PBS -l select=1:ncpus=1:mem=4gb,walltime=01:00:00
module load anaconda3/2024.06
source activate myRenv
module list
cd $PBS_O_WORKDIR
R --save < MAIN_CODEmodel1.R
##sleep 22
echo "Done with job"
>>>>>
R script can start with loading the packages like this:
package_names <- c("tidyverse","lubridate","ggplot2", "tmap", "ggmap", "sf", "ggsci", "remotes", "raster", "readxl", "terra")
# Loop to load the packages
for (package_name in package_names) {
if (!requireNamespace(package_name, quietly = TRUE)) {
message(paste("Installing and loading", package_name))
install.packages(package_name, dependencies = TRUE, repos = "https://cloud.r-project.org")
}
library(package_name, character.only = TRUE)
}
<snip> |
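The conda package names used above all follow one pattern: the prefix r- followed by the R package name in lower case. The following sketch turns a list of R package names into conda names and prints the matching install command. The function name conda_r_names is hypothetical, and the naming rule is an assumption that holds for the packages listed on this page; not every R package is available from conda, so confirm each one by searching for it as described above.

```shell
#!/bin/bash
# Map R package names to their likely conda names: "r-" plus the lower-cased
# package name (hypothetical helper; the rule is a convention, not a guarantee).
conda_r_names() {
    local pkg
    for pkg in "$@"; do
        printf 'r-%s\n' "$(printf '%s' "$pkg" | tr '[:upper:]' '[:lower:]')"
    done
}

# Print the install command for review; it is not executed here.
echo conda install -c conda-forge $(conda_r_names tidyverse lubridate sf)
# prints: conda install -c conda-forge r-tidyverse r-lubridate r-sf
```

Installing packages through conda keeps compiled dependencies consistent within the environment. The install.packages fallback in the R script is then only needed for packages that conda does not provide.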
Reference
1. http://yusung.blogspot.com.au/2009/01/install-jags-and-rjags-in-fedora.html
...