HPC cluster on the NeCTAR Research Cloud
By
Indy Siva, QCIF eResearch Analyst/Software Engineer
Dr. Matthew Parks, Research Fellow, Environmental Futures Centre, Griffith University
Introduction
Australian universities and various levels of Australian government have invested significantly in research infrastructure. The Queensland Cyber Infrastructure Foundation (QCIF) is an example of such investment by the Queensland government; in partnership with Queensland universities, it provides research infrastructure to researchers. QCIF has partnered with NeCTAR and RdSI to offer significant cloud-based research infrastructure to Queensland researchers. This article explores how the Environmental Futures Centre at Griffith University used this service to create an HPC cluster on the cloud. We hope it will help other researchers in Australia set up their own HPC clusters on the cloud and enhance collaboration between researchers within and outside Australia.
At the end, we illustrate the setup with a case study of how it is used in laboratory facilities dedicated to both ancient and modern DNA research at Griffith University. The labs specialize in ancient DNA research and bioinformatics, including the basic process of assembling a genome from DNA sequence data.
Reasons for this implementation
Pros
1. University resources are shared, which means they are often in high demand and cannot be reserved for a specific project or research group. Because the resource is shared, jobs may wait in a queue for a long time. With this setup, a dedicated cluster is always available to the research group.
2. Collaborator accounts (from anywhere in the world) can be created almost instantaneously. To use university resources, collaborators need in-house accounts, and depending on the university, creating one can take significant time.
3. The cloud resource can be accessed directly, without going through an institutional VPN.
4. Researchers have an administrator account on the cloud, so they can install and maintain software themselves and do not have to wait for system administrators to schedule downtime for maintenance.
5. There are no charges for the infrastructure.
6. There are no charges for data transfer at the cloud end.
Cons
1. University-wide licensed software cannot be run on the cloud. Hence, this setup is only suitable for open-source software.
2. Depending on the underlying cloud infrastructure, performance can be a problem (e.g. due to disk I/O issues).
3. If the underlying infrastructure fails, one is left at the mercy of the maintainers of the cloud to fix the problem.
Installation
Pre-requisites
Pre-requisite |
---|
Virtual machines, e.g. provided by QCloud/NeCTAR |
Common storage for home directories (e.g. provided by RdSI), mounted on all servers |
Resource manager software, e.g. Torque |
Open-source application software |
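To make the role of the resource manager more concrete, the following is a minimal sketch in Python of how a small Torque job script might be generated. The helper name `render_pbs_script` and its default values are hypothetical and not part of the setup described in this article; the `#PBS` directives and the `PBS_O_WORKDIR` variable are standard Torque/PBS conventions.

```python
def render_pbs_script(job_name, command, nodes=1, ppn=1, walltime="01:00:00"):
    """Return the text of a minimal Torque/PBS job script.

    Hypothetical helper for illustration; the defaults are assumptions.
    """
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",               # job name shown in the queue
        f"#PBS -l nodes={nodes}:ppn={ppn}",  # nodes and processors per node
        f"#PBS -l walltime={walltime}",      # maximum run time (HH:MM:SS)
        "#PBS -j oe",                        # merge stderr into stdout
        'cd "$PBS_O_WORKDIR"',               # directory the job was submitted from
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a script that runs `hostname` on one node with two processors.
    print(render_pbs_script("hello", "hostname", nodes=1, ppn=2), end="")
```

The generated text could be saved to a file and submitted with `qsub`. Because the script changes into the submission directory on the compute node, that directory must exist there, which is one reason the home directories need to sit on storage mounted on all servers.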
Procedure
Testing
Conclusion
Reference
1. http://www.adaptivecomputing.com/products/open-source/torque/