Writing Batch Scripts for Commodity Clusters
Batch scripts are used to run a set of commands on a cluster’s compute partition. A batch script is simply a shell script containing options to the batch scheduler software (e.g., PBS) followed by commands to be interpreted by a shell. The script is submitted to the batch scheduler, where it is parsed. Based on the parsed data, PBS places the script in the queue as a batch job. Once the batch job makes its way through the queue, the script is executed on the primary compute node of the allocated resources.
Components of a Batch Script
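Batch scripts are parsed into the following (3) sections: the interpreter line, the PBS submission options, and the shell commands.
Interpreter Line
The first line of a script can be used to specify the script’s interpreter; this line is optional. If not used, the submitter’s default shell will be used. The line uses the hash-bang syntax, i.e., #! followed by the path to the shell.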
PBS Submission Options
The PBS submission options are preceded by the string #PBS, making them appear as comments to a shell. PBS will look for #PBS options in a batch script from the script’s first line through the first non-comment line. A comment line begins with #. #PBS options entered after the first non-comment line will not be read by PBS.
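The scanning rule above can be sketched in a few lines of shell. This is only an illustration of the rule as described here, not how PBS itself is implemented: the function name list_pbs_directives is hypothetical, and the sketch assumes that empty lines before the first executable line are skipped rather than ending the scan.

```shell
# Hypothetical helper, not part of PBS: print the "#PBS" lines that appear
# before the first executable line of the script named in $1.
list_pbs_directives() {
    while IFS= read -r pbs_line || [ -n "$pbs_line" ]; do
        case "$pbs_line" in
            '#PBS'*) printf '%s\n' "$pbs_line" ;;  # directive in the header: report it
            '#'*|'') ;;                            # other comment, hash-bang, or empty line (assumed skipped)
            *) break ;;                            # first executable line: stop reading
        esac
    done < "$1"
}

# Demo script with one directive placed after an executable line.
demo_script=$(mktemp)
cat > "$demo_script" <<'EOF'
#!/bin/bash
#PBS -N test
# an ordinary comment
#PBS -j oe

date
#PBS -l nodes=2
EOF

list_pbs_directives "$demo_script"
# prints:
#   #PBS -N test
#   #PBS -j oe
rm -f "$demo_script"
```

The final #PBS line in the demo is not reported because it follows the date command, which matches the behavior described above for options entered after the first non-comment line.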
Shell Commands
The shell commands follow the last #PBS option and represent the executable content of the batch job. If any #PBS lines follow executable statements, they will be treated as comments only. The exception to this rule is shell specification on the first line of the script.
The execution section of a script will be interpreted by a shell and can contain multiple lines of executables, shell commands, and comments. Commands within this section will be executed on the batch job’s primary compute node after the job has been allocated. During normal execution, the batch script will end and exit the queue after the last line of the script.
Example Batch Script
1: #!/bin/bash
2: #PBS -A XXXYYY
3: #PBS -N test
4: #PBS -j oe
5: #PBS -l walltime=1:00:00,nodes=2
6:
7: cd $PBS_O_WORKDIR
8: date
9: mpirun -n 8 ./a.out
This batch script can be broken down into the following sections:
1: This line is optional and can be used to specify a shell to interpret the script.
2: The job will be charged to the “XXXYYY” project.
3: The job will be named “test”.
4: The job’s standard output and error will be combined into one file.
5: The job will request (2) nodes for (1) hour.
6: This line is left blank, so it will be ignored.
7: This command will change the current directory to the directory from where the script was submitted.
8: This command will run the date command.
9: This command will run the executable a.out on (8) cores via MPI.
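To make the resource request on line 5 concrete, the following sketch splits the comma-separated list into its key=value pairs and converts the walltime to seconds. The helper names resource_value and walltime_to_seconds are hypothetical, and the conversion assumes the H:MM:SS form used in the example; other walltime spellings are not handled.

```shell
# Resource list copied from the "#PBS -l" line of the example script.
resources="walltime=1:00:00,nodes=2"

# Hypothetical helper: print the value of one key from a
# comma-separated key=value list.
resource_value() {
    printf '%s\n' "$2" | tr ',' '\n' | awk -F= -v key="$1" '$1 == key { print $2 }'
}

# Hypothetical helper: convert an H:MM:SS walltime to seconds
# (assumes exactly this three-field form).
walltime_to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

nodes=$(resource_value nodes "$resources")
walltime=$(resource_value walltime "$resources")

echo "nodes requested: $nodes"
echo "walltime in seconds: $(walltime_to_seconds "$walltime")"
# prints:
#   nodes requested: 2
#   walltime in seconds: 3600
```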
Batch scripts can be submitted for execution using the qsub command, which takes the name of the batch script as its argument.
If successfully submitted, a PBS job ID will be returned. This ID can be used to track the job. It is also helpful in troubleshooting a failed job; make a note of the job ID for each of your jobs in case you must contact the OLCF User Assistance Center for support.
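One way to keep a note of every job ID is to wrap the submission in a small function that records what qsub prints. The sketch below is an illustration only: submit_and_log, the log format, and the file names in the usage comment are hypothetical, and it assumes qsub writes the job ID to standard output on success.

```shell
# Hypothetical wrapper: submit the batch script named in $1 and append
# "timestamp job-ID script-name" to the log file named in $2.
submit_and_log() {
    sal_job_id=$(qsub "$1") || return 1   # stop here if the submission fails
    printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$sal_job_id" "$1" >> "$2"
    printf '%s\n' "$sal_job_id"           # echo the ID back to the caller
}

# Usage (script and log names are placeholders):
#   submit_and_log my_job.pbs "$HOME/pbs_jobs.log"
```

Because the function returns before writing anything when qsub fails, the log only contains jobs that were actually accepted by the scheduler.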