Running Production Jobs
Overview of the MOAB Batch System
All production jobs on the kronos cluster are run using the MOAB batch system.
Generally, each user is allowed to submit a maximum of 2 jobs to MOAB.
If you need a higher job limit, please contact us
with your request. You can run interactive jobs (MOAB queue "gdebug") to
debug your program before submitting it to the cluster. Unlike interactive jobs,
batch jobs are controlled via scripts. These scripts tell the system which resources
a job will require and how long they will be needed, and are then submitted to the MOAB
queue manager to be processed. This table lists all the queues available in MOAB.
Example Usage:
The command msub "script" will submit the given script for processing. You must write a script containing the information MOAB needs to allocate the resources your job requires, to handle standard I/O streams, and to run the job. Please see the example scripts below. On submission, MOAB will return the job id.
[user@kronos]:>msub test.job
3676
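In a wrapper script it is often handy to keep the job id that msub prints, so it can be passed to checkjob or canceljob later. The following is a minimal sketch, assuming msub writes the numeric job id to standard output, possibly surrounded by blank lines, as in the example above. The helper name extract_jobid is hypothetical and not part of MOAB.

```shell
#!/bin/bash
# Hypothetical helper: read msub's output on stdin and print the first
# token that consists only of digits (assumed to be the job id).
extract_jobid() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^[0-9]+$/) { print $i; exit } }'
}

# Usage on the cluster (commented out so the sketch runs anywhere):
#   jobid=$(msub test.job | extract_jobid)
#   checkjob "$jobid"
```

If msub prints an error message instead of a job id, the helper prints nothing, so the caller can test for an empty result before continuing.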
The commands mshow/showq will show all jobs currently running or queued on the system.
[user@kronos ~]$ showq
active jobs------------------------
JOBID USERNAME STATE PROCS REMAINING STARTTIME
67986 shong Running 1 20:16:25 Thu Aug 6 09:51:46
67760 mnswain Running 24 1:19:25:49 Tue Aug 4 22:51:10
67542 xchen Running 1 1:23:36:47 Sun Aug 2 22:57:08
68154 murthy Running 8 2:05:48:33 Wed Aug 5 19:23:54
68155 murthy Running 8 2:05:50:15 Wed Aug 5 19:25:36
67572 sschurer Running 32 3:21:45:28 Mon Aug 3 11:20:49
67630 akumar Running 18 5:00:37:42 Tue Aug 4 14:13:03
67729 akumar Running 6 5:02:54:51 Tue Aug 4 16:30:12
67999 sschurer Running 32 6:02:01:28 Wed Aug 5 15:36:49
68157 akumar Running 18 6:08:11:57 Wed Aug 5 21:47:18
10 active jobs 148 of 240 processors in use by local jobs (61.67%)
21 of 30 nodes active (70.00%)
eligible jobs----------------------
JOBID USERNAME STATE PROCS WCLIMIT QUEUETIME
68008 him Idle 144 23:50:00 Wed Aug 5 15:45:02
1 eligible job
blocked jobs-----------------------
JOBID USERNAME STATE PROCS WCLIMIT QUEUETIME
0 blocked jobs
Total jobs: 11
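When the queue is busy, it can help to pick out only your own jobs from the showq listing. Here is a minimal sketch, assuming the column layout shown above (job id in column 1, user name in column 2, state in column 3). The function name jobs_for_user is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read showq output on stdin and print the job id and
# state of every job owned by the given user.
# Assumes column 1 is JOBID, column 2 is USERNAME, column 3 is STATE.
jobs_for_user() {
    awk -v user="$1" '$1 ~ /^[0-9]+$/ && $2 == user { print $1, $3 }'
}

# Usage on the cluster (commented out so the sketch runs anywhere):
#   showq | jobs_for_user "$USER"
```

Header and summary lines are skipped because their first field is not a job id or their second field is not the user name.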
For details about a particular job, issue the command checkjob job id,
where job id is obtained from the "JOBID" field of the showq output. To cancel a job, issue the command
canceljob job id, where job id is again obtained from the "JOBID" field of the
showq output. This command will remove the job from the queue and terminate the job
if it is running.
[user@kronos]:>canceljob 3676
job '3676' cancelled
An example script for a serial Job
#!/bin/bash
#
#MOAB -l nodes=1:ppn=1
#MOAB -l walltime=2:00:00
#MOAB -l mem=1GB
#MOAB -o output_filename
#MOAB -j oe
#MOAB -m bea
#MOAB -M uid@miami.edu
#MOAB -V
#MOAB -q gsmall
cd ${HOME}/sample_prog
sample_prog a b c
Here is a line-by-line breakdown of the keywords and their assigned values listed in this script:
#!/bin/bash
Specifies the shell to be used when executing the command portion of the script. The default is korn shell.
#MOAB -l nodes=1:ppn=1
Specifies a resource requirement of 1 compute node and 1 processor per node.
#MOAB -l walltime=2:00:00
Specifies a resource requirement of 2 hours of wall clock time to run the job.
#MOAB -l mem=1GB
Specifies a resource requirement of at least 1 GB of memory to run the job.
#MOAB -o output_filename
Specifies the name of the file where job output is to be saved. If omitted, a default filename with the job id appended is generated.
#MOAB -j oe
Specifies that job output and error messages are to be joined in one file.
#MOAB -m bea
Specifies that MOAB send email notification when the job begins (b), ends (e), or aborts (a).
#MOAB -M uid@miami.edu
Specifies the email address where MOAB notification is to be sent.
#MOAB -V
Specifies that all environment variables are to be exported to the batch job.
#MOAB -q gsmall
The -q directive specifies a queue for job submission; here the job is submitted to the gsmall queue.
MOAB stops reading directives at the first executable line (i.e. a line that is non-blank and does not begin with #). The last two lines simply change to the directory ${HOME}/sample_prog and then run the executable sample_prog with arguments a b c.
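The rule that directives are read only up to the first executable line can be checked before a script is submitted. The sketch below is a simplified approximation of that rule, not MOAB's actual parser. The function name list_directives is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the #MOAB directive lines that appear before
# the first executable line of the job script given as $1.
list_directives() {
    awk '
        /^[[:space:]]*$/ { next }    # skip blank lines
        !/^#/            { exit }    # first executable line: stop reading
        /^#MOAB/         { print }   # directive line: print it
    ' "$1"
}

# Usage:
#   list_directives test.job
```

A directive placed after a command such as cd would not appear in the output, which is a quick way to spot a misplaced line.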
An example script for an MPI Job
#!/bin/bash
#MOAB -l nodes=8:ppn=2
#MOAB -l walltime=2:00:00
#MOAB -l mem=1GB
#MOAB -o output_filename
#MOAB -j oe
#MOAB -m bea
#MOAB -M uid@miami.edu
#MOAB -V
#MOAB -q gsmall
#! Full path to executable + executable name
executable="<executable>"
#! Run options for the application
options="<options>"
#! Work directory
workdir="<work dir>"
###############################################################
### You should not have to change anything below this line ####
###############################################################
#! change the working directory (default is home directory)
cd $workdir
echo Running on host `hostname`
echo Time is `date`
echo Directory is `pwd`
echo MOAB job ID is $PBS_JOBID
echo This job runs on the following machines:
echo `cat $PBS_NODEFILE | uniq`
#! Create a machine file for MPI
cat $PBS_NODEFILE | uniq > machine.file.$PBS_JOBID
numnodes=`wc $PBS_NODEFILE | awk '{ print $1 }'`
#! Run the parallel MPI executable (nodes*cores/node)
echo "Running $executable -procs $numnodes -hostfile machine.file.$PBS_JOBID $options"
mpirun $executable -procs $numnodes -hostfile machine.file.$PBS_JOBID $options
Here is a line-by-line breakdown of the keywords and their assigned values listed in this script:
#!/bin/bash
Specifies the shell to be used when executing the command portion of the script. The default is korn shell.
#MOAB -l nodes=8:ppn=2
Specifies a resource requirement of 8 compute nodes and 2 cores per node.
#MOAB -l walltime=2:00:00
Specifies a resource requirement of 2 hours of wall clock time to run the job.
#MOAB -l mem=1GB
Specifies a resource requirement of at least 1 GB of memory per task to run the job.
#MOAB -o output_filename
Specifies the name of the file where job output is to be saved. If omitted, a default filename with the job id appended is generated.
#MOAB -j oe
Specifies that job output and error messages are to be joined in one file.
#MOAB -m bea
Specifies that MOAB send email notification when the job begins (b), ends (e), or aborts (a).
#MOAB -M uid@miami.edu
Specifies the email address where MOAB notification is to be sent.
#MOAB -V
Specifies that all environment variables are to be exported to the batch job.
#MOAB -q gsmall
The -q directive specifies a queue for job submission; here the job is submitted to the gsmall queue.
At the end of the MOAB portion of the script, a machine file is built in the work directory from the $PBS_NODEFILE variable. This variable is supplied by the queueing system and points to a file containing the names of the nodes reserved for the particular job. The machine file is then passed as an argument to the mpirun command to launch the job.
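To see what the machine file step produces, the same idea can be tried on a sample node file. Here is a minimal sketch, assuming the node file lists one host name per allocated core (so a request of nodes=2:ppn=2 would yield four lines). That layout and the function names are assumptions, not taken from the text above.

```shell
#!/bin/bash
# Hypothetical helper: number of MPI tasks, one per line of the node file.
count_tasks() {
    wc -l < "$1" | tr -d '[:space:]'
}

# Hypothetical helper: machine file contents, each host listed once.
# sort -u is used instead of uniq so that duplicates are removed even if
# the same host does not appear on adjacent lines.
unique_hosts() {
    sort -u "$1"
}

# Usage inside a job script (commented out so the sketch runs anywhere):
#   unique_hosts "$PBS_NODEFILE" > "machine.file.$PBS_JOBID"
#   numtasks=$(count_tasks "$PBS_NODEFILE")
```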
An example script for an OpenMP Job
#!/bin/bash
#MOAB -l nodes=1:ppn=8
#MOAB -l walltime=2:00:00
#MOAB -l mem=1GB
#MOAB -o output_filename
#MOAB -j oe
#MOAB -m bea
#MOAB -M uid@miami.edu
#MOAB -V
#MOAB -q gsmall
cd ${HOME}/sample_OpenMP_prog
export OMP_NUM_THREADS=8
sample_OpenMP_prog a b c
Here is a line-by-line breakdown of the keywords and their assigned values listed in this script:
#!/bin/bash
Specifies the shell to be used when executing the command portion of the script. The default is korn shell.
#MOAB -l nodes=1:ppn=8
Specifies a resource requirement of 1 compute node and 8 cores on that node.
#MOAB -l walltime=2:00:00
Specifies a resource requirement of 2 hours of wall clock time to run the job.
#MOAB -l mem=1GB
Specifies a resource requirement of at least 1 GB of memory per task to run the job.
#MOAB -o output_filename
Specifies the name of the file where job output is to be saved. If omitted, a default filename with the job id appended is generated.
#MOAB -j oe
Specifies that job output and error messages are to be joined in one file.
#MOAB -m bea
Specifies that MOAB send email notification when the job begins (b), ends (e), or aborts (a).
#MOAB -M uid@miami.edu
Specifies the email address where MOAB notification is to be sent.
#MOAB -V
Specifies that all environment variables are to be exported to the batch job.
#MOAB -q gsmall
The -q directive specifies a queue for job submission; here the job is submitted to the gsmall queue.
At the end of the MOAB portion of the script, the environment variable OMP_NUM_THREADS is set to 8, matching the ppn=8 request. As a result, the job runs with 8 threads.
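Hard-coding the thread count means it has to be kept in step with the ppn value by hand. Here is a minimal sketch of deriving it instead, assuming that for a single-node job the node file supplied by the queueing system ($PBS_NODEFILE) contains one line per allocated core. The function name set_omp_threads is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: export OMP_NUM_THREADS as the number of lines in the
# node file given as $1, falling back to 1 thread if the file is missing.
set_omp_threads() {
    if [ -n "$1" ] && [ -r "$1" ]; then
        OMP_NUM_THREADS=$(wc -l < "$1" | tr -d '[:space:]')
    else
        OMP_NUM_THREADS=1
    fi
    export OMP_NUM_THREADS
}

# Usage inside a job script (commented out so the sketch runs anywhere):
#   set_omp_threads "$PBS_NODEFILE"
#   sample_OpenMP_prog a b c
```

With this approach, changing ppn in the directive is enough. The fallback to 1 thread is a conservative choice for runs outside the batch system.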

