PBS

Attachments (all modified Dec 10, 2015 by Indy Siva):

PBSProProgramGuide11.2.pdf (PDF)
mpi-integration.txt (text)
PBSAdminTraining_2011.pptx (PowerPoint)
MPIandPBS.pptx (PowerPoint)
PBS_Attributes.JPG (JPEG)

Special Queue: gpu

#!/bin/bash

#PBS -q gpu
#PBS -N cuda
#PBS -l nodes=1:ppn=1

echo "Hello from $HOSTNAME: date = `date`"
nvcc --version
echo "Finished at `date`"

usage

qsub -I -l nodes=1:ppn=8,feature=gpu -l walltime=12:00:00

qsub -I run.pbs

qsub -I -l nodes=1:ppn=8,ngpus=2,nodetype=xl -l walltime=12:00:00 run.pbs1

Special case

A user running interactive jobs can simply ask for exclusive access to their nodes:

qsub -I -lselect=2:ncpus=2 -lplace=excl

This requests 2 chunks of 2 CPUs each, so PBS could fulfill it with one node or two. Adding "excl" tells PBS not to let the remaining cores on the selected node(s) be used by a subsequent job.

qsub -I -lselect=2:ncpus=4

would request two chunks of 4 CPUs each, which on 4-core nodes allows you to select two whole nodes.
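
If the chunks must land on separate physical nodes regardless of node size, a scatter placement can be combined with exclusivity (a sketch using standard PBS Pro place syntax; adjust the chunk sizes to your nodes):

qsub -I -lselect=2:ncpus=4 -lplace=scatter:excl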

usage

qmgr -c 'print queue workq'
qmgr -c 'list queue workq'
qmgr -c 'list queue'

qmgr -c "print queue @default"
qmgr -c "list node n001"

active queue workq
active queue workq,gpu
Example: To make all queues at the default server active:
Qmgr: active queue @default

To create a queue named Q1 at the active server:
Qmgr: create queue Q1

qmgr -c "list node n023 resources_available.host"
qmgr -c "list queue workq resources_assigned.mpiprocs"

Set node offline
set node state = "offline"
Qmgr: set node n002 state=offline
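
To take a node out of and return it to service from the command line, pbsnodes offers the usual shorthand:

pbsnodes -o n002    # mark the node offline
pbsnodes -r n002    # clear OFFLINE, returning the node to service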

gpu

basic

Example of Configuring PBS for Basic GPU Scheduling
In this example, there are two execution hosts, HostA and HostB, and each execution host has 4 GPU devices.
1. Stop the server and scheduler. On the server's host, type:
/etc/init.d/pbs stop
2. Edit PBS_HOME/server_priv/resourcedef, and add the following line:
ngpus type=long flag=nh
3. Edit PBS_HOME/sched_priv/sched_config to add ngpus to the list of scheduling resources:
resources: "ncpus, mem, arch, host, vnode, ngpus"
4. Restart the server and scheduler. On the server's host, type:
/etc/init.d/pbs start
5. Add the number of GPU devices available to each execution host in the cluster via qmgr:
Qmgr: set node HostA resources_available.ngpus=4
Qmgr: set node HostB resources_available.ngpus=4

vi /var/spool/PBS/server_priv/resourcedef
vi /var/spool/PBS/sched_priv/sched_config
/etc/init.d/pbs stop
/etc/init.d/pbs start

qmgr
Max open servers: 49
Qmgr: set node n020 resources_available.ngpus=2
Qmgr: set node n021 resources_available.ngpus=2
Qmgr: set node n022 resources_available.ngpus=2
Qmgr: set node n023 resources_available.ngpus=2
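
A quick way to confirm the setting took effect (the output line is illustrative):

pbsnodes n020 | grep ngpus
     resources_available.ngpus = 2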

ngpus

http://www.beowulf.org/archive/2010-March/027640.html

1. Create a custom resource called "ngpus" in the resourcedef file, as in:

ngpus type=long flag=nh

2. This resource should then be explicitly set on each node that includes a GPU, to the number it includes:


set node compute-0-5 resources_available.ncpus = 8 
set node compute-0-5 resources_available.ngpus = 2 


Here I have set the number of cpus per node (8) explicitly, to defeat hyper-threading, and the actual number of gpus per node (2). On the other nodes you might have:



set node compute-0-6 resources_available.ncpus = 8
set node compute-0-6 resources_available.ngpus = 0


indicating that there are no gpus to allocate.


3. You would then use the '-l select' option in your job file as follows: 


#PBS -l select=4:ncpus=2:ngpus=2 


This requests 4 PBS resource chunks, each with 2 cpus and 2 gpus. Because the resource request is "chunked", each 2-cpu x 2-gpu chunk is placed together on one physical node. Because you marked some nodes as having 2 gpus and some as having 0 gpus, only those that have them will get allocated. ngpus is a consumable resource, so as soon as 2 are allocated on a node, that node's total available drops to 0. In total you would have asked for 4 chunks distributed to 4 physical nodes (because only one of these chunks can fit on a single node). This also ensures a 1:1 mapping of cpus to gpus, although it does nothing about tying each cpu to a different socket; you would have to do that in the script, probably with numactl.
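
As a rough illustration of that last point, a per-rank wrapper can pin each local MPI rank to its own socket. A minimal sketch, assuming Open MPI (which exports OMPI_COMM_WORLD_LOCAL_RANK for each rank) and a two-socket node; the script name is hypothetical:

#!/bin/bash
# bind-socket.sh (hypothetical): pin each local MPI rank to one socket
# Open MPI exports OMPI_COMM_WORLD_LOCAL_RANK per launched process
SOCKET=$(( ${OMPI_COMM_WORLD_LOCAL_RANK:-0} % 2 ))
exec numactl --cpunodebind=$SOCKET --membind=$SOCKET "$@"

It would then be invoked as, e.g., mpirun -np 8 ./bind-socket.sh ./my_gpu_app.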


There are other ways to approach this, such as tying physical nodes to queues, which you might wish to do to set up a dedicated slice for GPU development. You may also be able to do this in PBS using the vnode abstraction. There might be some reason to have two production routing queues that map to slightly different parts of the system.




memory

1. Create a custom static integer memcount resource that will be tracked at the server and queue:
a. In PBS_HOME/server_priv/resourcedef, add the line:
memcount type=long flag=q
b. Add the resource to the resources: line in PBS_HOME/sched_priv/sched_config:
resources: "[...], memcount"
2. Set limits at BigMem and SmallMem so that they accept the correct jobs:
Qmgr: set queue BigMem resources_min.mem = 8gb
Qmgr: set queue SmallMem resources_max.mem = 8gb
3. Set the order of the destinations in the routing queue so that BigMem is tested first, so that jobs requesting exactly 8GB go into BigMem:
Qmgr: set queue RouteQueue route_destinations = "BigMem, SmallMem"
4. Set the available resource at BigMem using qmgr. If you want a maximum of 6 jobs from BigMem to use MemNode:
Qmgr: set queue BigMem resources_available.memcount = 6
5. Set the default value for the counting resource at BigMem, so that jobs inherit the value:
Qmgr: set queue BigMem resources_default.memcount = 1
6. Associate the vnode with large memory with the BigMem queue:
Qmgr: set node MemNode queue = BigMem
The scheduler will only schedule up to 6 jobs from BigMem at a time on the vnode with large memory.
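
The steps above assume the RouteQueue, BigMem and SmallMem queues already exist; if not, they could be created along these lines (a sketch, using the same queue names):

qmgr -c "create queue RouteQueue queue_type=route"
qmgr -c "set queue RouteQueue enabled=true, started=true"
qmgr -c "create queue BigMem queue_type=execution"
qmgr -c "set queue BigMem enabled=true, started=true"
qmgr -c "create queue SmallMem queue_type=execution"
qmgr -c "set queue SmallMem enabled=true, started=true"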


5.7.3 Setting Values for String Arrays
A string array that is defined on vnodes can be set to a different set of strings on each vnode.
Example of defining and setting a string array:
• Define a new resource:
foo_arr type=string_array flag=h
• Setting via qmgr:
Qmgr: set node n4 resources_available.foo_arr="f1, f3, f5"
• Vnode n4 has 3 values of foo_arr: f1, f3, and f5. We add f7:
Qmgr: set node n4 resources_available.foo_arr+=f7
• Vnode n4 now has 4 values of foo_arr: f1, f3, f5 and f7.
• We remove f1:
Qmgr: set node n4 resources_available.foo_arr-=f1
• Vnode n4 now has 3 values of foo_arr: f3, f5, and f7.
• Submission:
qsub -l select=1:ncpus=1:foo_arr=f3



vi  /var/spool/PBS/server_priv/resourcedef
nodetype type=string_array flag=h

vi /var/spool/PBS/sched_priv/sched_config
resources: "ncpus, mem, arch, host, vnode, netwins, aoe ngpus memcount nodetype"

/etc/init.d/pbs stop
sleep 10
/etc/init.d/pbs start
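
After the restart, a quick sanity check that the server is back up (the values shown are illustrative):

qstat -B
Server           Max Tot Que Run Hld Wat Trn Ext Status
---------------- --- --- --- --- --- --- --- --- ------
pbsserver          0   4   1   3   0   0   0   0 Active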

Qmgr: 
set node n020 resources_available.nodetype="xl"
set node n021 resources_available.nodetype="xl"
set node n022 resources_available.nodetype="xl"
set node n023 resources_available.nodetype="xl"
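
With nodetype set, jobs can request it per chunk, e.g.:

qsub -I -l select=1:ncpus=1:nodetype=xl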

creating queues

xl queue

create queue xl
set queue xl queue_type = Execution
set queue xl Priority = 50
set queue xl resources_min.nodetype = xl
set queue xl resources_default.nodetype = xl
set queue xl max_run = [u:PBS_GENERIC=1]
set queue xl enabled = True
set queue xl started = True
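
These lines are qmgr input: they can be typed at the Qmgr: prompt or fed from a file (the filename here is arbitrary). Note that max_run = [u:PBS_GENERIC=1] limits each individual user to one running job in this queue.

qmgr < xl_queue.qmgr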

gpu

create queue gpu
set queue gpu queue_type = Execution
set queue gpu Priority = 50
set queue gpu resources_min.nodetype = xl
set queue gpu resources_default.nodetype = xl
set queue gpu max_run = [u:PBS_GENERIC=1]
set queue gpu enabled = True
set queue gpu started = True

sample pbs script to run jobs on the gpu queue

see: http://confluence.rcs.griffith.edu.au:8080/display/GHPC/BigDFT
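Pending the linked example, a minimal sketch of a gpu-queue job script (application name and resource counts are placeholders):

#!/bin/bash
#PBS -q gpu
#PBS -N gpujob
#PBS -l select=1:ncpus=1:ngpus=1
#PBS -l walltime=01:00:00

cd $PBS_O_WORKDIR
nvidia-smi        # show the GPU(s) visible to the job
./my_cuda_app     # hypothetical CUDA executable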

pbs attributes

Special group to run restricted software

The following groups exist:

DHI
stormsurge
nimrodusers
vasp
aerc
msr
genomics
gccm
glycomics
dbres

You could add this line to the PBS script:

#PBS -W group_list=nimrodusers

or simply use it in qsub as follows:

qsub -q gpu -W group_list=nimrodusers -l walltime=01:00:00
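
To check which of these groups your account already belongs to, standard Linux tooling is enough:

id -Gn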