Griffith HPC User Guide
Contents
1. Introduction
Gowonda is a 792-core HPC cluster consisting of a mixture of SGI Altix XE and SGI® Rackable™ C2114-4TY14 servers. It is managed by the eResearch Services unit in Information Services. Gowonda was made possible by a grant from the Queensland Cyber Infrastructure Foundation (QCIF) and funding from the University Electronic Infrastructure Capital Fund.
Gowonda will be used to run computations that require large amounts of computing resources. All Griffith University researchers and researchers from QCIF-affiliated institutions will be able to get access to Gowonda. Similarly, Griffith researchers can call upon additional resources from QCIF-affiliated institutions if required. Gowonda is a major increase in capability for Griffith University and its partners.
It formally came into operation on Aug 1st 2011.
There is no plan to charge for legitimate research usage of gowonda. However, we will need to meet expectations of the stakeholders. Consequently, we will need to be able to account for all usage on the cluster to satisfy our stakeholders. Information is provided below about how you can assist in this.
This document is loosely modeled after "A Beginner's Guide to the Barrine Linux HPC Cluster" written by Dr. David Green (HPC manager, UQ), the ICE Cluster User Guide written by Bryan Hughes, and the wiki pages of the City University of New York (CUNY) HPC Center (see references for further details).
2. Gowonda Overview
2.1 Hardware
2.1.1 Hardware (2024 upgrade)
2.1.2 Hardware (2019 upgrade)
Node Names | Total Number | Mem per node | Cores per node | Processor Type | GPU card
---|---|---|---|---|---
gc-prd-hpcn001 to gc-prd-hpcn006 | 6 | 192 GB | 72 | 2 x Intel Xeon 6140 | -
n061 (gpu node) | 1 | 500 GB | 96 | 2 x AMD EPYC 7413 24-Core Processor | 5 x NVIDIA A100 80GB PCIe
n060 (gpu node) | 1 | 380 GB | 72 | Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz | 8 x NVIDIA V100-PCIE-32GB
2.2 Old Hardware
Node Type | Total Number of Cores | Total Amount of Memory (GB) | Compute Nodes | Cores Per node | Mem per Node (GB) | Memory per Core | Processor Type |
---|---|---|---|---|---|---|---|
Small Memory Nodes | 48 | 48 | 4 (n001-n004) | 12 | 12 | 1 | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz |
Medium Memory Nodes | 108 | 216 | 9 (n005-n009,n010-n012,n019) | 12 | 24 | 2 | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz |
Large Memory Nodes | 72 | 288 | 6 (n013-n018) | 12 | 48 | 4 | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz |
Extra Large Memory Nodes with GPU (see table below for more details about GPU) | 48 | 384 | 4 (n020-n023) | 12 | 96 | 8 | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz |
Extra Large Memory Nodes (no GPU) | 96 | 768 | 8 (n031-n038) | 12 | 96 | 8 | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz |
Special Nodes (no GPU) | 64 | 128 | 4 (n039-n042) | 16 | 32 | 2 | Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz |
Special Large Memory Nodes (no GPU) | 64 | 1024 | 4 (n044-n047) | 16 | 256 | 16 | Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz |
Special Large Memory Nodes (no GPU) | 192 | 1536 | 12 (aspen01-aspen12) | 16 | 128 | 8 | Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz |
Please note that each of the Extra Large nodes (n020, n021, n022 and n023) has two NVIDIA Tesla C2050 GPU cards.
Node Type | Programming Model | Total Number of CUDA Cores | Total Amount of Memory (GB) | Compute Nodes | CUDA Cores Per node | CUDA cards per node | Mem per Node (GB) | Memory per Core | Processor Type |
---|---|---|---|---|---|---|---|---|---|
Extra Large Memory Nodes with GPU | GPU | 4X2X448 | 384 | 4 (n020-n023) | 2X448 | 2 | 96 | 8 | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz |
Special Administrative Nodes (Not used for computing purposes)
Node Type | Node Name | Total Number of Cores | Total Amount of Memory (GB) | Mem per Node (GB) | Memory per Core | Processor Type |
---|---|---|---|---|---|---|
File Servers | n024, n025 | 24 | 96 | 48 | 4 | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz |
Test Node | testhpc | 12 | 24 | 24 | 2 | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz |
Login Node | gowonda | 12 | 48 | 48 | 4 | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz |
Admin Node | n030 (admin) | 12 | 24 | 24 | 2 | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz |
More information about the Intel(R) Xeon(R) CPU X5650 processor can be obtained here.
In addition to the above, there is a special Windows HPC node.
Node type | Total Number of Cores | Total Amount of Memory (GB) | Compute Nodes | Cores Per node | Mem per Node (GB) | Memory per Core | Processor Type | OS |
---|---|---|---|---|---|---|---|---|
Windows 2008 Large Memory Node | 12 | 48 | 1 n029 | 12 | 48 | 4 | Intel(R) Xeon(R) CPU X5650 @ 2.67GHz | Windows 2008 R2 with windows HPC pack |
Instructions for using the Windows HPC node are given in a separate user guide.
2.3 Software
The operating system on gowonda is RedHat Enterprise Linux (RHEL) 6.1, updated with the SGI Foundation 2.4 suite and the Accelerate 1.2 support package. The queuing system used is PBS Pro 11. The exception is the Windows HPC node, which runs Windows 2008 R2 with the Windows HPC Pack and the Windows HPCS job scheduler. Gowonda has the following compilers and parallel library software. Much more detail on each can be found below.
- GNU C, C++ and Fortran compilers;
- Portland Group, Inc. optimizing C, C++, and Fortran compilers (currently awaiting installation);
- The Intel Cluster Studio including the Intel C, C++ and Fortran compilers, Math and Kernel Library;
- Intel MPI, SGI's proprietary MPT, and OpenMPI;
- Oracle Solaris Studio Compiler (formerly called Sun Studio Compiler)
The following third party applications are currently installed or will be installed shortly. The Gowonda HPC Center staff will be happy to work with any user interested in installing additional applications, subject to meeting that application's license requirements.
Software | Version | Usage | Status |
---|---|---|---|
AutoDOCK | 4.2.3 | module load autodock/autodock423 autodock/autodockvina112 | Installed |
Bioperl | TBI (To be Installed) | ||
Blast | Installed | ||
CUDA | 4.0 | module load cuda/4.0 | Installed |
Gaussian03 | module load gaussian/g03 | Installed | |
Gaussian09 | module load gaussian/g09 | Installed | |
Gromacs | Installed | ||
gromos | 1.0.0 | module load gromos/1.0.0 | Installed |
MATLAB | 2009b, 2011a | module load matlab/2009b or module load matlab/2011a | Installed |
MrBayes | TBI (To be Installed) | ||
NAMD | module load NAMD/NAMD28b1 | Installed | |
numpy | 1.5.1 | module load python/2.7.1 | Installed |
PyCogent | - | module load python/2.7.1 | Installed |
qiime | | | TBI (To be Installed) |
R | module load R/2.13.0 | Installed | |
SciPy | 0.9.0 | module load python/2.7.1 | Installed |
VASP | - | - | TBI |
The following graphics, IO, and scientific libraries are also supported.
Software | Version | Usage | Status |
---|---|---|---|
Atlas | 3.9.39 | module load ATLAS/3.9.39 | Installed |
FFTW | 3.2.2, 3.3a | module load fftw/3.3-alpha-intel | Installed |
GSL | 1.09, 1.15 | module load gsl/gsl-1.15 | Installed |
LAPACK | 3.3.0 | - | - |
NetCDF | 3.6.2, 3.6.3, 4.0, 4.1.1, 4.1.2 | e.g. module load NetCDF/4.1.2 | Installed |
3 Support
3.1 Hours of Operation
The morning of the fourth Thursday of each month, from 8:00 AM to 12:00 PM, is normally reserved (but not always used) for scheduled maintenance. Please plan accordingly. Unplanned maintenance to remedy system-related problems may be scheduled as needed. Reasonable attempts will be made to inform users running on those systems when these needs arise.
3.2 User Support
Users are encouraged to read this Wiki carefully. In particular, the sections on compiling and running parallel programs, and the section on the PBS Pro batch queuing system will give you the essential knowledge needed to use the gowonda cluster.
Gowonda cluster staff, along with outside vendors, also offer courses to the Griffith HPC community in parallel programming techniques, HPC computing architecture, and the essentials of using our systems. Please follow our mailings on the subject and feel free to inquire about such courses. We can schedule training visits and classes at the various Griffith campuses. Please let us know if such a training visit is of interest.
Users with further questions or requiring immediate assistance in use of the systems should submit a ticket here
Support staff may be contacted via:
Griffith Library and IT help 3735 5555 or X55555
email support: Submit form
You can log cases on service desk (category: eResearch services.HPC)
eResearch Services, Griffith University
Phone: +61 7 3735 6649 (GMT +10 hours)
Email: Submit a Question to the HPC Support Team
Web: griffith.edu.au/eresearch-services
3.3 Service Alerts
Any major disruptions/events will be communicated through emails. These emails will be logged on the following web pages as well.
Major Service Alerts
https://conf-ers.griffith.edu.au/display/GHCD/Service+Alerts
Minor Service Alerts
https://conf-ers.griffith.edu.au/display/GHCD/Service+Alerts
Mail outs
https://conf-ers.griffith.edu.au/display/GHCD/Service+Alerts
3.4 Data storage, retention/deletion, and back-ups
Home Directories
Each user account, upon creation, is provided a home directory (/export/home/staffNumber) with a default 100 GB storage ceiling on each system. A user may request an increase in the size of their home directory if there is a special need. The HPC staff will endeavor to satisfy reasonable requests, but storage is not unlimited. Please regularly remove unwanted files and directories to minimize this burden.
Currently, user home directories are not backed up. It is your responsibility to back up your data in /export/home/<snumber> and /scratch to an external source.
In addition to the home directory, users have access to a /scratch directory. A soft link to this scratch directory is also provided in the home directory. Files on system temporary and scratch directories are not backed up. All files older than 15 days will be deleted from the /scratch/<snumber> directory.
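Since home directories are not backed up and /scratch is purged automatically, it is worth scripting both tasks. A minimal sketch (the staff number s1234567 and the backup destination are placeholders; the find demonstration uses a throwaway directory rather than the real /scratch):

```shell
# Back up your home directory to an off-cluster machine (run from your desktop;
# s1234567 and ~/gowonda-backup/ are placeholders):
#   rsync -av s1234567@gc-prd-hpclogin1.rcs.griffith.edu.au:/export/home/s1234567/ ~/gowonda-backup/

# Files in /scratch/<snumber> older than 15 days are deleted automatically.
# The same age test, demonstrated on a throwaway directory:
mkdir -p /tmp/scratch_demo
touch -d "20 days ago" /tmp/scratch_demo/old_results.dat
touch /tmp/scratch_demo/new_results.dat
find /tmp/scratch_demo -type f -mtime +15   # lists only old_results.dat
```

Running the find command periodically against your own /scratch/<snumber> area shows which files are at risk before the automatic cleanup removes them.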
4 Access
4.1 Request an account on gowonda
Please fill out this form:
https://conf-ers.griffith.edu.au/display/GHCD/Griffith+HPC+Support+Request+Form?src=contextnavpagetreemode
A staff member from the gowonda HPC cluster team will contact you to provide you with login details.
4.2 Login
To log in to the cluster, ssh to
gc-prd-hpclogin1.rcs.griffith.edu.au
You will need to be connected to the Griffith network (either on campus or through VPN from home).
Please check the VPN installation instructions here:
https://intranet.secure.griffith.edu.au/computing/remote-access/virtual-private-network
ssh on Windows platform
To use X11 port forwarding, install Xming X Server for Windows first. See instructions here.
If X11 forwarding is not needed (true in most cases), do not install it.
To install an ssh client, e.g. PuTTY, please follow these instructions.
ssh on Linux and Mac platforms
ssh -Y gc-prd-hpclogin1.rcs.griffith.edu.au
Once you are on the system, have a look around. Your home directory is stored in:
/export/home/<snumber>
where you have 100GB of allocated space.
Work space is available at:
/scratch/<snumber>
which is a "scratch" area for short lived data. All data older than 15 days is deleted from this folder.
Submitted jobs should not read or write directly in your home directory; they should always use the "/scratch/<snumber>" filesystem.
4.3 File Transfers
Please check these instructions for transferring files between your desktop and the gowonda cluster.
Another method is to mount your gowonda home directory, as follows.
Mounting gowonda on Windows clients
Use win-sshfs to mount gowonda on windows:
http://code.google.com/p/win-sshfs/
First download the win-sshfs and install it on your windows machine.
After the installation, start the win-sshfs manager:
Start - All Programs - win-sshfs --> sshfs manager
Click on the Add button and fill in the details:
After saving the connection, click the Mount button. The remote filesystem appears as a drive on your desktop and can be used to transfer files between gowonda and your local desktop.
Mounting gowonda on linux based clients:
The sshfs application is a FUSE program which will mount filesystems from SFTP servers.
FUSE (Filesystem in Userspace) is an interface which allows user programs, such as sshfs, to export a virtual filesystem to the Linux kernel. In this way, such filesystems can be attached to the Linux file hierarchy. This is particularly useful if you want to use Linux commands, scripts or programs to access your files on the remote server.
If you have access to FUSE on your Linux workstation or laptop, you can use sshfs to mount any of our gateway fileservers.
To install fuse-sshfs
sudo apt-get install sshfs     (Ubuntu)
yum install fuse-sshfs         (Fedora)

On CentOS or RHEL, FUSE and fuse-sshfs are available from the RPMforge repository. Go to https://rpmrepo.org/RPMforge/Using to see how to add RPMforge as a yum repository and then install sshfs as above.

On Fedora, RHEL and CentOS, the user must belong to group fuse in order to mount remote filesystems with the sshfs command. To add yourself to group fuse, edit file /etc/group (as the root user, or with sudo). Find the line:

fuse:x:474:

and add your username to the end, separated by a comma from other usernames, if present. (The group number may differ.)
First, you need to create a local directory as the mount point for the remote filesystem. Then give the sshfs command to mount your files from the fileserver. The format of the command is:
mkdir ~/f     (create a local directory mount point)
sshfs user@host: mountpoint

e.g.:

sshfs s12345@server.griffith.edu.au: ~/f

sshfs will prompt you for your password; use your standard University password. You will then see your remote files under the mount point:

ls ~/f
cgi-bin  E.G.User  index.htm  TestScriptSchoolsComscZ.pdf

To unmount (disconnect) the fileserver, use the fusermount -u command:

fusermount -u ~/f
5 Software Modules
Gowonda uses the Modules package to allow users to quickly load various environments. The module command is used to load and unload these environments.
module list    <=== This will list all the currently loaded environments
No Modulefiles Currently Loaded.

module avail   <=== This will list all the available environments on gowonda

/usr/share/Modules/modulefiles:
MPInside/3.1        chkfeature          compiler/gcc-4.4.4   dot             module-cvs
module-info         modules             mpi/intel-4.0        mpiplace/1.01   mpt/2.04
null                perfboost           perfcatcher          scotch/5.1.11   sgi-upc/1.04
sgi-upc-devel/1.04  use.own

/etc/modulefiles:
openmpi-x86_64

/sw/com/modulefiles:
ATLAS/3.8.3                       ATLAS/3.9.39                   autodock/autodock423
autodock/autodockvina112          cuda/4.0                       fftw/3.2.2-gnu
fftw/3.2.2-intel                  fftw/3.3-alpha-intel           gaussian/g03
gaussian/g09                      gromos/1.0.0(default)          gromos/1.0.0++
gsl/gsl-1.09                      gsl/gsl-1.15                   intel-cc-10/10.1.018
intel-cc-11/11.1.072              intel-cmkl-11/11.1.072(default)  intel-cmkl-11/11.2.137
intel-fc-10/10.1.018              intel-fc-11/11.1.072(default)    intel-fc-11/12.0.2.137
intel-itac/8.0.0.011(default)     intel-itac/8.0.1.009           intel-mpi/3.2.2.006(default)
intel-mpi/4.0.0.027               intel-mpi/4.0.1.007            intel-ptu/3.2.001
intel-tools-11/11.1.072(default)  intel-tools-11/11.2.137        matlab/2009b
matlab/2011a                      mpt/2.00                       mpt/2.02
NAMD/NAMD28b1                     NetCDF/3.6.2-shared            NetCDF/3.6.3
NetCDF/4.0                        NetCDF/4.1.1                   NetCDF/4.1.2
netCDF/3.6.2-shared               netCDF/3.6.3                   netCDF/4.0
netCDF/4.1.1                      netCDF/4.1.2                   nose/1.0.0
python/2.7.1(default)             python/3.1.4                   R/2.13.0
module load matlab/2009b
module list
Currently Loaded Modulefiles:
  1) matlab/2009b

module swap matlab/2009b matlab/2011a
module list
Currently Loaded Modulefiles:
  1) matlab/2011a

module unload matlab/2011a
module list
No Modulefiles Currently Loaded.

module load intel-fc-11/12.0.2.137
module load intel-mpi/4.0.1.007
module load mpi/intel-4.0

OR, equivalently:

module load intel-fc-11/12.0.2.137 intel-mpi/4.0.1.007 mpi/intel-4.0

module list
Currently Loaded Modulefiles:
  1) intel-fc-11/12.0.2.137   2) intel-mpi/4.0.1.007   3) mpi/intel-4.0

module unload intel-fc-11/12.0.2.137 intel-mpi/4.0.1.007 mpi/intel-4.0
module list
No Modulefiles Currently Loaded.
To use specific software (e.g. MATLAB, Intel compilers, etc.), its environment must be loaded first. You will also need to load it in the PBS script (more about this later in this manual). Software-specific documentation is also available here.
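For example, a job that runs MATLAB must load the matlab module inside the PBS script before invoking it. A minimal sketch (the job name and the MATLAB script myscript.m are placeholders):

```shell
#!/bin/bash
#PBS -N matlab_test
#PBS -l walltime=1:00:00
cd $PBS_O_WORKDIR
# load the same environment inside the job that you would use on the login node
module load matlab/2011a
matlab -nodisplay -r "myscript; exit"
```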
5.1 cuda/4.0
module load cuda/4.0
module list
Currently Loaded Modulefiles:
  1) cuda/4.0
The cuda libraries and binaries are loaded with this. The GPUs are installed only on n020,n021,n022 and n023.
More information can be found here.
https://conf-ers.griffith.edu.au/display/GHCD/cuda
5.2 intel-mpi/4.0.0.027
This module provides access to Intel MPI commands. This is the recommended means of running MPI programs on Gowonda. Once this module has been loaded, commands such as mpicc will default to the Intel version. There is also an SGI proprietary MPI called MPT that can be used on Gowonda.
More information about mpi on gowonda can be found here.
https://conf-ers.griffith.edu.au/display/GHCD/mpi
5.3 intel-cc-11/11.1.072, intel-cmkl-11/11.2.137, intel-fc-11/12.0.2.137
These modules provide access to the Intel C compiler, the Intel Cluster Math Kernel Library, and the Intel Fortran compiler. Once the intel-cc module is loaded, the Intel C compiler can be accessed by running icc; once the intel-fc module is loaded, the Intel Fortran compiler can be accessed by running ifort. If a program is compiled with the Intel compilers, the corresponding modules should also be loaded in the running environment.
6 Program Compilation
6.1 Serial Program Compilation
The Intel Compiler Suite
Intel's Cluster Studio (ICS) compilers, debuggers, profilers, and libraries are available on Gowonda.
module load intel-cc-11/11.1.072
icc -V

Compiling a C program:

icc -O3 -unroll mycode.c
The line above invokes Intel's C compiler (also used by Intel mpicc). It requests level 3 optimization and that loops be unrolled for performance. To find out more about 'icc', type 'man icc'.
Similarly for Intel Fortran and C++.
Compiling a Fortran program:
ifort -O3 -unroll mycode.f90
Compiling a C++ program:
icpc -O3 -unroll mycode.C
The Portland Group Compiler Suite
This is currently not installed on gowonda but is expected to be made available in the near future.
The GNU Compiler Suite
The GNU compilers, debuggers, profilers, and libraries are available on Gowonda
To check for the default version installed:
gcc -v
Compiling a C program:
gcc -O3 -funroll-loops mycode.c
The line above invokes GNU's C compiler (also used by GNU mpicc). It requests level 3 optimization and that loops be unrolled for performance. To find out more about 'gcc', type man gcc.
Similarly for Fortran and C++.
Compiling a Fortran program:
gfortran -O3 -funroll-loops mycode.f90
Compiling a C++ program (uses g++):
g++ -O3 -funroll-loops mycode.C
6.2 OpenMP SMP-Parallel Program Compilation
All the compute nodes on Gowonda include at least 2 sockets and multiple cores. These multicore SMP compute nodes offer the Gowonda user community the option of creating parallel programs using the OpenMP Symmetric Multi-Processing (SMP) parallel programming model. SMP parallel programming with the OpenMP model (and other SMP models) has been around for a long time because early parallel HPC systems were built only with shared memories.
In the SMP model, multiple processors work within a single program image and the same memory space. This eliminates the need to copy data from one program (process) image to another (required by MPI) and simplifies the parallel run-time environment significantly. As such, writing parallel programs to the OpenMP standard is generally easier and requires fewer lines of code. However, the size of the problem that can be addressed using OpenMP is limited by the amount of memory on a single compute node, and the parallel performance improvement to be gained is limited by the number of processors (cores) within the node that can address that same memory space. At present in Gowonda, OpenMP applications can run with a maximum of 12 cores (24 if hyperthreading is used).
Here, a simple OpenMP parallel version of the standard C "Hello, World!" program is set to run on 8 cores:
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
#define NPROCS 8

int main(int argc, char *argv[])
{
    int nthreads, num_threads = NPROCS, tid;

    /* Set the number of threads */
    omp_set_num_threads(num_threads);

    /* Fork a team of threads giving them their own copies of variables */
    #pragma omp parallel private(nthreads, tid)
    {
        /* Each thread obtains its thread number */
        tid = omp_get_thread_num();

        /* Each thread executes this print */
        printf("Hello World from thread = %d\n", tid);

        /* Only the master thread does this */
        if (tid == 0) {
            nthreads = omp_get_num_threads();
            printf("Total number of threads = %d\n", nthreads);
        }
    } /* All threads join master thread and disband */
    return 0;
}
An excellent and comprehensive tutorial on OpenMP with examples can be found at the Lawrence Livermore National Laboratory web site: https://computing.llnl.gov/tutorials/openMP
Compiling This OpenMP Program Using the Intel Compiler Suite
The Intel C compiler requires the '-openmp' option, as follows:
icc -o hello_world.exe -openmp hello_world.c
When run this program produces the following output:
./hello_world.exe
Hello World from thread = 0
Hello World from thread = 6
Hello World from thread = 3
Hello World from thread = 5
Hello World from thread = 1
Hello World from thread = 2
Total number of threads = 8
Hello World from thread = 4
Hello World from thread = 7
OpenMP is supported in both Intel's Fortran and C++ compilers as well.
Compiling This OpenMP Program Using the GNU Compiler Suite
The GNU C compiler requires the '-fopenmp' option, as follows:
gcc -o hello_world.exe -fopenmp hello_world.c
The program produces the same output, although the order of the print statements cannot be predicted and will not be the same over repeated runs. OpenMP is supported in both GNU's Fortran and C++ compilers as well.
6.3 MPI Parallel Program Compilation
The Message Passing Interface (MPI) is a hardware-independent parallel programming and communications library callable from C, C++, or Fortran. Quoting from the MPI standard:
MPI is a message-passing application programmer interface (API), together with protocol and semantic specifications for how its features must behave in any implementation.
MPI has become the de facto standard approach for parallel programming in HPC. MPI is a collection of well-defined library calls (an Applications Program Interface, or API) for transferring data (packaged as messages) between completely independent processes within independent address spaces. These processes could be running within a single physical node or across distributed nodes connected by an interconnect such as GigaBit Ethernet or InfiniBand. MPI communication is generally two-sided with both the sender and receiver of the data actively participating in the communication events. Both point-to-point and collective communication is supported. MPI's goals are high performance, scalability, and portability. MPI remains the dominant parallel programming model used in high-performance computing today.
The original MPI-1 release was not designed with any special features to support traditional shared memory or distributed-shared memory parallel architectures, and MPI-2 provides only limited distributed, shared-memory support with some one-sided, remote direct memory access (RDMA) routines. Nonetheless, MPI programs are regularly run on shared memory computers because the model is parallel architecture paradigm neutral. Writing parallel programs using the MPI model (as opposed to shared-memory models such as OpenMP) requires the careful partitioning of program data among the communicating processes to reduce the number of communication events that can sap the performance of parallel applications when they are run at larger scale (with more processors).
There are several implementations of MPI, including proprietary versions from Intel and SGI. We ask Gowonda users to use Intel MPI by default.
Parallel implementations of the "Hello world!" program in C and Fortran are presented here to give the reader a feel for the look of MPI code.
hello_mpi
cat hello_mpi.c
/* The Parallel Hello World Program */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int node;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &node);
    printf("Hello World from Node %d\n", node);
    MPI_Finalize();
    return 0;
}
To compile this:
module load intel-fc-11/12.0.2.137
module load intel-mpi/4.0.1.007
module load mpi/intel-4.0
mpicc -o hello_mpi hello_mpi.c -lmpi
Sample PBS script
cat pbs.run7
#!/bin/bash -l
#PBS -m abe
### Mail to user
#PBS -M <yourEmail>@griffith.edu.au
### Job name
#PBS -N mpi
#PBS -l walltime=60:00:00
## Please note the walltime above. This value *must* be set so that if the
## MPI program runs in an infinite loop, or something similar, it will be
## killed after the given wall time.
### Number of nodes:Number of CPUs:Number of MPI processes per node
#PBS -l select=2:ncpus=12:mpiprocs=7
## The number of nodes is given by the select=<NUM> above
NODES=2
## $PBS_NODEFILE is a node-list file created from the select and mpiprocs options by PBS
### The number of MPI processes available is mpiprocs * nodes
NPROCS=14
# This job's working directory
echo "Working directory is $PBS_O_WORKDIR"
cd $PBS_O_WORKDIR
source $HOME/.bashrc
module load intel-fc-11/12.0.2.137
module load intel-mpi/4.0.1.007
module load mpi/intel-4.0
echo "Starting job"
echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
#echo This job runs on the following processors:
echo `cat $PBS_NODEFILE`
##mpirun -machinefile $PBS_NODEFILE /export/home/s2594054/pbs/mpi/p1.out
mpirun -f $PBS_NODEFILE -n "$NODES" -r ssh -n "$NPROCS" /export/home/s2594054/pbs/mpi/2/hello_mpi
echo "Done with job"
qsub pbs.run7
2659.pbsserver

qstat
2659.pbsserver   mpi   s2594054   00:00:00   R   workq

cat mpi.o2659
Working directory is /export/home/SNUMBER/pbs/mpi/2
Starting job
Running on host n010
Time is Wed Jul 27 08:37:14 EST 2011
Directory is /export/home/SNUMBER/pbs/mpi/2
n010 n010 n010 n010 n010 n010 n010 n020 n020 n020 n020 n020 n020 n020
Hello World from Node 0
Hello World from Node 2
Hello World from Node 4
Hello World from Node 6
Hello World from Node 3
Hello World from Node 1
Hello World from Node 5
Hello World from Node 9
Hello World from Node 11
Hello World from Node 7
Hello World from Node 13
Hello World from Node 8
Hello World from Node 10
Hello World from Node 12
Done with job
qsub -I pbs.run7
cd $PBS_O_WORKDIR
module load intel-fc-11/12.0.2.137
module load intel-mpi/4.0.1.007
module load mpi/intel-4.0
mpirun -f $PBS_NODEFILE -n 2 -r ssh -n 14 /export/home/SNUMBER/pbs/mpi/2/hello_mpi
Hello World from Node 0
Hello World from Node 2
Hello World from Node 1
Hello World from Node 4
Hello World from Node 6
Hello World from Node 3
Hello World from Node 8
Hello World from Node 5
Hello World from Node 13
Hello World from Node 7
Hello World from Node 12
Hello World from Node 10
Hello World from Node 11
Hello World from Node 9

OR:

mpirun -r ssh -f $PBS_NODEFILE --totalnum=$NPROCS --verbose -l -machinefile $PBS_NODEFILE -np $(wc -l $PBS_NODEFILE | gawk '{print $1}') /export/home/s2594054/pbs/mpi/2/hello_mpi
running mpdallexit on n016
LAUNCHED mpd on n016 via
RUNNING: mpd on n016
LAUNCHED mpd on n020 via n016
RUNNING: mpd on n020
3: Hello World from Node 3
2: Hello World from Node 2
0: Hello World from Node 0
6: Hello World from Node 6
4: Hello World from Node 4
1: Hello World from Node 1
5: Hello World from Node 5
7: Hello World from Node 7
11: Hello World from Node 11
12: Hello World from Node 12
10: Hello World from Node 10
13: Hello World from Node 13
8: Hello World from Node 8
9: Hello World from Node 9
The following could work as well
mpirun -r ssh -f $PBS_NODEFILE --totalnum=$NPROCS --verbose -l -machinefile $PBS_NODEFILE -np 8 /export/home/SNUMBER/pbs/mpi/p1.out

mpirun -r ssh -f $PBS_NODEFILE --totalnum=$NPROCS --verbose -l -machinefile $PBS_NODEFILE -np $(wc -l $PBS_NODEFILE | gawk '{print $1}') /export/home/SNUMBER/pbs/mpi/p1.out

--totalnum          specifies the total number of mpds to start
-n <n> or -np <n>   number of processes to start

mpirun -f $PBS_NODEFILE -n $(cat $PBS_NODEFILE | gawk '{print $1}' | sort | uniq | wc -l) -r ssh -n $(wc -l $PBS_NODEFILE | gawk '{print $1}') /export/home/SNUMBER/pbs/mpi/p1.out

To check if mpd is working well:

mpdcheck -f $PBS_NODEFILE -v
Another version of MPI that will be supported is OpenMPI. OpenMPI (completely different from and not to be confused with OpenMP) is a project combining technologies and resources from several previous MPI projects (FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI) with the stated aim of building the best freely available MPI library. OpenMPI represents the merger between three well-known MPI implementations:
- FT-MPI from the University of Tennessee
- LA-MPI from Los Alamos National Laboratory
- LAM/MPI from Indiana University
with contributions from the PACX-MPI team at the University of Stuttgart. These four institutions comprise the founding members of the OpenMPI development team.
These MPI implementations were selected because OpenMPI developers thought that each excelled in one or more areas. The stated driving motivation behind OpenMPI is to bring the best ideas and technologies from the individual projects and create one world-class open source MPI implementation that excels in all areas. The OpenMPI project names several top-level goals:
- Create a free, open source software, peer-reviewed, production-quality complete MPI-2 implementation.
- Provide extremely high, competitive performance (low latency or high bandwidth).
- Directly involve the high-performance computing community with external development and feedback (vendors, 3rd party researchers, users, etc.).
- Provide a stable platform for 3rd party research and commercial development.
- Help prevent the "forking problem" common to other MPI projects.
- Support a wide variety of high-performance computing platforms and environments.
OpenMPI may be used to run jobs compiled with the Intel, PGI, or GNU compilers. Two simple MPI programs, one written in C and one in Fortran, are shown below as examples. A good online tutorial on MPI is available from LLNL, along with a general tutorial on parallel programming.
Example hello.c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                /* starts MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* get current process id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* get number of processes */
    printf("Hello world from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
Example hello.f90
program hello
    include 'mpif.h'
    integer rank, size, ierror, tag, status(MPI_STATUS_SIZE)

    call MPI_INIT(ierror)
    call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierror)
    call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)
    print*, 'node', rank, ': Hello world'
    call MPI_FINALIZE(ierror)
end
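The two programs above can be built and launched with the MPI compiler wrappers. A minimal sketch follows; the wrapper names and process count are typical defaults and may differ with the compiler suite and modules loaded on gowonda:

```shell
# Build the C and Fortran examples with the MPI compiler wrappers
# (mpicc/mpif90 are the usual wrapper names; the Intel suite provides
# mpiicc/mpiifort instead)
mpicc  hello.c   -o hello_c
mpif90 hello.f90 -o hello_f90

# Launch 4 processes of the C example
mpirun -np 4 ./hello_c
```

Each of the 4 processes prints its own rank, so the output lines may appear in any order.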
6.4 GPU Parallel Programming with Intel Compilers and NVIDIA's CUDA Fortran Programming Model
7 Requesting Resources
Warning: Do not run jobs on the login node "gc-prd-hpclogin1.rcs.griffith.edu.au". Please use it for compilation and small debugging runs only.
Resources are requested through qsub.
A walltime is a required part of every resource request. By default, jobs are given a walltime of 0 minutes. You can specify a walltime for your job with the -l option to qsub, in the form -l walltime=1:00 for one minute or -l walltime=1:00:00 for one hour. There is currently no upper limit on walltime; the requirement exists to help the scheduler plan jobs.
7.1 Submit Jobs
qsub <pbscriptFile>
A simple PBS script file is as follows:
#!/bin/sh
#PBS -m abe
#PBS -M emailID@griffith.edu.au
#PBS -N Simple_Test
#PBS -q workq
#####PBS -W group_list=gaussian
#PBS -l walltime=00:15:00
#PBS -l select=1:ncpus=1:mpiprocs=1
###source $HOME/.bashrc
echo "Hello"
echo "Starting job"
module load matlab/2011a
module list
sleep 228
echo "test Done"
echo "Done with job"
7.2 Monitor Jobs
qstat -1an
7.3 Delete Jobs
qdel <jobID>
7.4 Check the current status of the cluster
pbsnodes
pbsjobs
pbsqueues
To view the current cluster status, use the elinks text browser on the login node:
pbsnodes (elinks http://localhost:3000/nodes)
pbsjobs (elinks http://localhost:3000/jobs)
pbsqueues (elinks http://localhost:3000/queues)
(Press "Q" to quit the text-based browser.)
8 Examples
8.1 Intel MPI and Compiler with Interactive Session
qsub -I -l walltime=00:15:00 -l select=2:ncpus=2:mpiprocs=2
This starts an interactive session with two chunks of 2 mpiprocs each (4 cores in total).
8.2 Intel MPI and Compiler with Batch Session
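A minimal batch-session sketch follows, combining the resource request from section 8.1 with the PBS script layout from section 7.1. The module name (intel-mpi) and the executable name are assumptions; run "module avail" to confirm the exact module installed on gowonda:

```shell
#!/bin/sh
#PBS -N IntelMPI_Test
#PBS -q workq
#PBS -l walltime=00:15:00
#PBS -l select=2:ncpus=2:mpiprocs=2
module load intel-mpi        # assumed module name; check "module avail"
cd $PBS_O_WORKDIR            # run from the directory the job was submitted from
mpirun -np 4 ./hello_c       # 2 chunks x 2 mpiprocs = 4 processes
```

Submit it with qsub as shown in section 7.1; the -l select line requests the same two chunks of 2 mpiprocs as the interactive example above.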
9 Windows HPC node User Guide
Please see the following link for Windows HPC user guide.
http://confluence.rcs.griffith.edu.au:8080/display/GHPC/Windows+HPC+User+Guide
Update: This facility has been removed and is no longer available on gowonda
10 Acknowledgements
We encourage you to acknowledge significant support you have received from the Griffith University eResearch Service & Specialised Platforms Team in your publications.
The following text is suggested as a starting point. (Please feel free to augment or modify as you see fit, in particular naming individuals who have been of assistance. If you wish to cite particular references please contact the relevant individuals):
"We gratefully acknowledge the support of the Griffith University eResearch Service & Specialised Platforms Team and the use of the High Performance Computing Cluster "Gowonda" to complete this research."
If you need to give hardware specifics so that readers can reproduce results, they are:
Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
EDR InfiniBand Interconnect
For more technical information, see the hardware overview in section 2.1.
If you need to give hardware specifics of the GPU node so that readers can reproduce results, they are:
Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
EDR InfiniBand Interconnect
GPU card: Tesla V100-PCIE-32GB
HPE ProLiant XL270d Gen10 Node CTO server
Please advise us of this acknowledgment by emailing eresearch-services@griffith.edu.au, for our record keeping.
References
1. Green, David. A Beginner's Guide to the Barrine Linux HPC Cluster. http://www.qcif.edu.au/Facilities
2. Hughes, Bryan. “TCC's ICE Cluster User Guide”. http://infohost.nmt.edu/~khan/ice-user.pdf.
3. HPC Facility, CUNY. http://wiki.csi.cuny.edu/cunyhpc/index.php/Main_Page
4. Monash University e-Research Centre http://www.monash.edu.au/eresearch/about/services.html
5. CPU information : http://www.cpu-world.com/CPUs/Xeon/Intel-Xeon%20X5650%20-%20AT80614004320AD%20(BX80614X5650).html