In High Performance Computing (HPC), computational work is performed by jobs. Individual jobs produce data that provide insight into grand challenges in science and engineering. As such, the timely, efficient execution of jobs is the primary concern in the operation of any HPC system.
A job on a commodity cluster typically comprises a few different components:
- A batch submission script.
- A binary executable.
- A set of input files for the executable.
- A set of output files created by the executable.
And the process for running a job, in general, is to:
- Prepare executables and input files.
- Write a batch script.
- Submit the batch script to the batch scheduler.
- Optionally monitor the job before and during execution.
The following sections describe in detail how to create, submit, and manage jobs for execution on commodity clusters.
Login vs Compute Nodes on Commodity Clusters
When you log into an OLCF cluster, you are placed on a login node. Login node resources are shared by all users of the system. Because of this, users should be mindful when performing tasks on a login node.
Login nodes should be used for basic tasks such as file editing, code compilation, data backup, and job submission. Login nodes should not be used for memory- or compute-intensive tasks. Users should also limit the number of simultaneous tasks performed on the login resources. For example, a user should not run (10) simultaneous tar processes on a login node.
Rhea contains 521 compute nodes separated into two partitions:
|Partition||Node Count||Memory||GPU||CPU|
|rhea (default)||512||128GB||–||[2x] Intel® Xeon® E5-2650 @ 2.0 GHz – 8 cores, 16 HT (for a total of 16 cores, 32 HT per node)|
|gpu||9||1TB||[2x] NVIDIA® K80||[2x] Intel® Xeon® E5-2695 @ 2.3 GHz – 14 cores, 28 HT (for a total of 28 cores, 56 HT per node)|
The first 512 nodes make up the rhea partition, where each node contains two 8-core 2.0 GHz Intel Xeon processors with Intel’s Hyper-Threading (HT) Technology and 128GB of main memory. Each CPU in this partition features 8 physical cores, for a total of 16 physical cores per node. With Intel® Hyper-Threading Technology enabled, each node has 32 logical cores capable of executing 32 hardware threads for increased parallelism.
Rhea also has nine large memory/GPU nodes, which make up the gpu partition. These nodes each have 1TB of main memory and two NVIDIA K80 GPUs in addition to two 14-core 2.30 GHz Intel Xeon processors with HT Technology. Each CPU in this partition features 14 physical cores, for a total of 28 physical cores per node. With Hyper-Threading enabled, these nodes have 56 logical cores that can execute 56 hardware threads for increased parallelism.
Writing Batch Scripts for Commodity Clusters
Batch scripts, or job submission scripts, are the mechanism by which a user configures and submits a job for execution. A batch script is simply a shell script that also includes commands to be interpreted by the batch scheduling software (e.g. PBS).
Batch scripts are submitted to the batch scheduler, where they are then parsed for the scheduling configuration options. The batch scheduler then places the script in the appropriate queue, where it is designated as a batch job. Once the batch job makes its way through the queue, the script will be executed on the primary compute node of the allocated resources.
Components of a Batch Script
Batch scripts are parsed into the following (3) sections:
Interpreter Line
The first line of a script can be used to specify the script's interpreter; this line is optional. If not used, the submitter's default shell will be used. The line uses the hash-bang syntax, e.g., #!/bin/bash.
PBS Submission Options
The PBS submission options are preceded by the string #PBS, making them appear as comments to a shell. PBS will look for #PBS options in a batch script from the script's first line through the first non-comment line. A comment line begins with #. #PBS options entered after the first non-comment line will not be read by PBS.
Shell Commands
The shell commands follow the last #PBS option and represent the executable content of the batch job. If any #PBS lines follow executable statements, they will be treated as comments only.
The execution section of a script will be interpreted by a shell and can contain multiple lines of executables, shell commands, and comments. When the job’s queue wait time is finished, commands within this section will be executed on the primary compute node of the job’s allocated resources. Under normal circumstances, the batch job will exit the queue after the last line of the script is executed.
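As an illustration, the execution section of a batch script might look like the following (a minimal sketch; the executable name and rank count are placeholders):
# Move to the directory from which the job was submitted
cd $PBS_O_WORKDIR
# Record the job's start time
date
# Launch the parallel executable on the allocated nodes
mpirun -n 16 ./a.out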
Example Batch Script
1: #!/bin/bash
2: #PBS -A XXXYYY
3: #PBS -N test
4: #PBS -j oe
5: #PBS -l walltime=1:00:00,nodes=2
6:
7: cd $PBS_O_WORKDIR
8: date
9: mpirun -n 8 ./a.out
This batch script shows examples of the three sections outlined above:
1: This line is optional and can be used to specify a shell to interpret the script. In this example, the bash shell will be used.
2: The job will be charged to the “XXXYYY” project.
3: The job will be named test.
4: The job’s standard output and error will be combined into one file.
5: The job will request (2) nodes for (1) hour.
6: This line is left blank, so it will be ignored.
7: This command will change the current directory to the directory from where the script was submitted.
8: This command will run the date command, which prints the current date and time.
9: This command will run (8) MPI instances of the executable a.out on the compute nodes allocated by the batch system.
Batch scripts can be submitted for execution using the qsub command. For example, the following will submit the batch script named test.pbs:
$ qsub test.pbs
If successfully submitted, a PBS job ID will be returned. This ID can be used to track the job. It is also helpful in troubleshooting a failed job; make a note of the job ID for each of your jobs in case you must contact the OLCF User Assistance Center for support.
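For example, a successful submission might look like the following (the job ID and server name shown are illustrative; the exact format varies by system):
$ qsub test.pbs
123456.rhea-batch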
Interactive Batch Jobs on Commodity Clusters
Batch scripts are useful when one has a pre-determined group of commands to execute, the results of which can be viewed at a later time. However, it is often necessary to run tasks on compute resources interactively.
Users are not allowed to access cluster compute nodes directly from a login node. Instead, users must use an interactive batch job to allocate and gain access to compute resources. This is done by using the -I option to qsub. Other PBS options are passed to qsub on the command line as well:
$ qsub -I -A abc123 -q qname -V -l nodes=4 -l walltime=00:30:00
This request will:
|-I||Start an interactive session|
|-A abc123||Charge to the abc123 project|
|-q qname||Run in the qname queue|
|-V||Export the user's shell environment to the job's environment|
|-l nodes=4||Request (4) nodes...|
|-l walltime=00:30:00||...for (30) minutes|
After running this command, the job will wait until enough compute nodes are available, just as any other batch job must. However, once the job starts, the user will be given an interactive prompt on the primary compute node within the allocated resource pool. Commands may then be executed directly (instead of through a batch script).
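For example, once the interactive prompt appears, work can proceed directly on the compute resources (a sketch; the project area and executable name are placeholders):
$ cd $MEMBERWORK/abc123
$ mpirun -n 8 ./a.out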
Using Interactive Batch Jobs to Debug
A common use of interactive batch jobs is to aid in debugging efforts. Interactive access to compute resources allows a user to run a process up to the point of failure; unlike with a batch job, the process can then be restarted after brief changes are made, without losing the compute resource pool, thus speeding up the debugging effort.
Choosing a Job Size
Because interactive jobs must sit in the queue until enough resources become available to allocate, it is useful to choose a job size based on the number of currently unallocated nodes (to shorten the queue wait time). Use the showbf command (i.e. "show backfill") to see resource limits that would allow your job to be immediately backfilled (and thus started) by the scheduler. For example, the snapshot below shows that (8) nodes are currently free.
$ showbf
Partition   Tasks  Nodes  StartOffset   Duration  StartDate
---------   -----  -----  ------------  --------  --------------
rhea        4744   8      INFINITY      00:00:00  HH:MM:SS_MM/DD
See the output of the showbf --help command for additional options.
Common Batch Options to PBS
The following table summarizes frequently-used options to PBS:
|Option||Description|
|-A <account>||Causes the job time to be charged to <account>.|
|-l nodes=<n>||Maximum number of compute nodes. Jobs cannot request partial nodes.|
|-l walltime=<time>||Maximum wall-clock time. <time> is in the format HH:MM:SS.|
|-l partition=<name>||Allocates resources on specified partition.|
|-o <filename>||Writes standard output to <filename>.|
|-e <filename>||Writes standard error to <filename>.|
|-j eo||Combines standard output and standard error into the standard error file (-j oe combines them into the standard output file).|
|-m a||Sends email to the submitter when the job aborts.|
|-m b||Sends email to the submitter when the job begins.|
|-m e||Sends email to the submitter when the job ends.|
|-M <address>||Specifies email address to use for -m options.|
|-N <name>||Sets the job name to <name>.|
|-S <shell>||Sets the shell to interpret the job script.|
|-q <queue>||Directs the job to the specified queue. This option is not required to run in the default queue on any given system.|
|-V||Exports all environment variables from the submitting shell into the batch job shell. Since the login nodes differ from the service nodes, using the '-V' option is not recommended. Users should create the needed environment within the batch job.|
|-X||Enables X11 forwarding. The -X PBS option should be used to tunnel a GUI from an interactive batch job.|
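As an illustration, several of these options might be combined in a single script header as follows (a sketch; the project ID, job name, file names, and email address are placeholders):
#!/bin/bash
#PBS -A ABC123
#PBS -N myjob
#PBS -o myjob.out
#PBS -e myjob.err
#PBS -m e
#PBS -M user@example.com
#PBS -l nodes=2,walltime=01:00:00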
Further details and other PBS options may be found through the qsub man page.
Batch Environment Variables
PBS sets multiple environment variables at submission time. The following PBS variables are useful within batch scripts:
|Variable||Description|
|$PBS_O_WORKDIR||The directory from which the batch job was submitted. By default, a new job starts in your home directory. You can get back to the directory of job submission with cd $PBS_O_WORKDIR.|
|$PBS_JOBID||The job's full identifier. A common use for PBS_JOBID is to append the job's ID to the standard output and error files.|
|$PBS_NUM_NODES||The number of nodes requested.|
|$PBS_JOBNAME||The job name supplied by the user.|
|$PBS_NODEFILE||The name of the file containing the list of nodes assigned to the job. Used sometimes on non-Cray clusters.|
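For example, a batch script might use these variables to return to the submission directory and to tag output files with the job's identifier (a sketch; the log file name and rank count are illustrative):
cd $PBS_O_WORKDIR
mpirun -n 16 ./a.out > run.$PBS_JOBID.log 2>&1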
Modifying Batch Jobs
The batch scheduler provides a number of utility commands for managing submitted jobs. See each utility's man page for more information.
Removing and Holding Jobs
Jobs in the queue in any state can be stopped and removed from the queue using the qdel command. For example, to remove a job with ID 1234:
$ qdel 1234
Jobs in the queue in a non-running state may be placed on hold using the qhold command. Jobs placed on hold will not be removed from the queue, but they will not be eligible for execution:
$ qhold 1234
Once on hold, the job will not be eligible to run until it is released to return to a queued state. The qrls command can be used to remove a job from the held state:
$ qrls 1234
Modifying Job Attributes
Non-running jobs in the queue can be modified with the PBS qalter command. The qalter utility can be used to do the following (among other things):
Modify the job’s name:
$ qalter -N newname 130494
Modify the number of requested nodes:
$ qalter -l nodes=12 130494
Modify the job’s walltime:
$ qalter -l walltime=01:00:00 130494
Monitoring Batch Jobs
PBS and Moab provide multiple tools to view queue, system, and job status. Below are the most common and useful of these tools.
Job Monitoring Commands
The Moab utility showq can be used to view a more detailed description of the queue. The utility will display jobs in the queue in the following states:
|Active||These jobs are currently running.|
|Eligible||These jobs are currently queued awaiting resources. Eligible jobs are shown in the order in which the scheduler will consider them for allocation.|
|Blocked||These jobs are currently queued but are not eligible to run. A job may be in this state because the user has more jobs that are “eligible to run” than the system’s queue policy allows.|
To see all jobs currently in the queue:
$ showq
To see all jobs owned by userA currently in the queue:
$ showq -u userA
To see all jobs submitted to partitionA:
$ showq -p partitionA
To see all completed jobs:
$ showq -c
NOTE: The following information has been cached by the remote server and may be slightly out of date.
The Moab utility checkjob can be used to view details of a job in the queue. For example, if job 736 is currently in the queue in a blocked state, the following can be used to view why:
$ checkjob 736
The return may contain a line similar to the following:
BlockMsg: job 736 violates idle HARD MAXJOB limit of X for user (Req: 1 InUse: X)
This line indicates the job is in the blocked state because the owning user has reached the limit for jobs in the “eligible to run” state.
The PBS utility qstat will poll PBS (Torque) for job information. However, qstat does not know about Moab's blocked and eligible states. Because of this, the Moab utility showq (see above) will provide a more accurate batch queue state. To show all queued jobs:
$ qstat -a
To show details about job 1234:
$ qstat -f 1234
To show all currently queued jobs owned by userA:
$ qstat -u userA
Batch Queues on Rhea
Rhea Partition Policy (default)
Jobs that do not specify a partition will run in the 512 node rhea partition.
|Bin||Node Count||Duration||Policy|
|A||1 – 16 Nodes||0 – 48 hr||max 4 jobs running and 4 jobs eligible in bins A, B, and C|
|B||17 – 64 Nodes||0 – 36 hr||
|C||65 – 384 Nodes||0 – 3 hr||
GPU Partition Policy
To access the 9 node gpu partition, batch job submissions should request -l partition=gpu.
|Node Count||Duration||Policy|
|1-2 Nodes||0 – 48 hrs||max 1 job running|
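For example, the header of a batch script targeting the gpu partition might look like the following (a sketch based on the -l partition option described above; the project ID and walltime are placeholders):
#!/bin/bash
#PBS -A ABC123
#PBS -l nodes=1,walltime=04:00:00
#PBS -l partition=gpu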
Users wishing to submit jobs that fall outside the queue structure are encouraged to request a reservation via the Special Request Form.
Allocation Overuse Policy
Projects that overrun their allocation are still allowed to run on OLCF systems, although at a reduced priority. Like the adjustment for the number of processors requested above, this is an adjustment to the apparent submit time of the job. However, this adjustment has the effect of making jobs appear much younger than jobs submitted under projects that have not exceeded their allocation. In addition to the priority change, these jobs are also limited in the amount of wall time that can be used.
For example, consider that job1 is submitted at the same time as job2. The project associated with job1 is over its allocation, while the project for job2 is not. The batch system will consider job2 to have been waiting for a longer time than job1. In addition, projects that are over 125% of their allocated time will be limited to only one running job at a time. The adjustment to the apparent submit time depends upon the percentage by which the project is over its allocation, as shown in the table below:
|% of Allocation Used||Priority Reduction||Number Eligible-to-Run||Number Running|
|< 100%||0 days||4 jobs||unlimited jobs|
|100% to 125%||30 days||4 jobs||unlimited jobs|
|> 125%||365 days||4 jobs||1 job|
Job Execution on Commodity Clusters
Once resources have been allocated through the batch system, users have the option of running commands on the allocated resources’ primary compute node (a serial job) and/or running an MPI/OpenMP executable across all the resources in the allocated resource pool simultaneously (a parallel job).
Serial Job Execution on Commodity Clusters
The executable portion of batch scripts is interpreted by the shell specified on the first line of the script. If a shell is not specified, the submitting user’s default shell will be used.
The serial portion of the batch script may contain comments, shell commands, executable scripts, and compiled executables. These can be used in combination to, for example, navigate file systems, set up job execution, run serial executables, and even submit other batch jobs.
Parallel Job Execution on Commodity Clusters
By default, commands will be executed on the job's primary compute node, sometimes referred to as the job's head node. The mpirun command is used to execute an MPI executable on one or more compute nodes in parallel. mpirun accepts the following common options:
|--npernode <n>||Number of ranks per node|
|-n <n>||Total number of MPI ranks|
|--bind-to none||Allow code to control thread affinity|
|--map-by ppr:N:node:pe=T||Place N tasks per node leaving space for T threads|
|--map-by ppr:N:socket:pe=T||Place N tasks per socket leaving space for T threads|
|--map-by ppr:N:socket||Assign tasks by socket placing N tasks on each socket|
|--report-bindings||Have MPI explain which ranks have been assigned to which nodes / physical cores|
If the number of MPI ranks is not specified via -n, the system will default to all available cores allocated to the job.
MPI Task Layout
Each compute node on Rhea contains two sockets each with 8 cores. Depending on your job, it may be useful to control task layout within and across nodes.
Default Layout: Sequential
The following will run two copies of a.out on two cores of the same node:
$ mpirun -np 2 ./a.out
4 cores, 2 cores per socket, 1 node
The following will run a.out on 4 cores, 2 cores per socket, 1 node:
$ mpirun -np 4 --map-by ppr:2:socket ./a.out
4 cores, 1 core per socket, 2 nodes
The following will run a.out on 4 cores, 1 core per socket, 2 nodes. This can be useful if you need to spread your batch job over multiple nodes to allow each task access to more memory.
$ mpirun -np 4 --map-by ppr:1:socket ./a.out
The --report-bindings flag can be used to report task layout:
$ mpirun -np 4 --map-by ppr:1:socket --report-bindings hostname
[rhea2:47176] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]: [BB/../../../../../../..][../../../../../../../..]
[rhea2:47176] MCW rank 1 bound to socket 1[core 8[hwt 0-1]]: [../../../../../../../..][BB/../../../../../../..]
[rhea4:104150] MCW rank 2 bound to socket 0[core 0[hwt 0-1]]: [BB/../../../../../../..][../../../../../../../..]
[rhea4:104150] MCW rank 3 bound to socket 1[core 8[hwt 0-1]]: [../../../../../../../..][BB/../../../../../../..]
$
2 MPI tasks, 1 task per node, 16 threads per task, 2 nodes
$ setenv OMP_NUM_THREADS 16
$ mpirun -np 2 --map-by ppr:1:node:pe=16 ./a.out
2 MPI tasks, 1 task per socket, 4 threads per task, 1 node
$ setenv OMP_NUM_THREADS 4
$ mpirun -np 2 --map-by ppr:1:socket:pe=4 ./a.out
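The setenv syntax above is for csh-style shells; under bash (the interpreter used in the example batch script earlier), the equivalent would be:
$ export OMP_NUM_THREADS=4
$ mpirun -np 2 --map-by ppr:1:socket:pe=4 ./a.out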
Resource Sharing on Commodity Clusters
Jobs on OLCF clusters are scheduled in full node increments; a node's cores cannot be allocated to multiple jobs. Because the OLCF charges based on what a job makes unavailable to other users, a job is charged for an entire node even if it uses only one core on a node. To simplify the process, users are allocated multiples of entire nodes through PBS.
In general, the cluster may move MPI tasks between cores within a node. To help prevent a job's tasks from being moved between cores each idle cycle, the mpi_yield_when_idle Open MPI option may be used. For example:
$ mpirun -n 8 -mca mpi_yield_when_idle 0 a.out
This will help prevent the core from being given to other waiting tasks. This only affects MPI processes when they are blocking in MPI library calls.
By default, Open MPI will set this variable based on whether it believes the node is over-allocated or under-allocated. If over-allocated, mpi_yield_when_idle will be set to a nonzero value, allowing the core to be given to other waiting tasks when idle. If under-allocated, mpi_yield_when_idle will be set to (0). If more tasks are running on a node than there are cores, the OS will swap all tasks between cores on the node. The mpi_yield_when_idle option only helps to slow this down; it will not fully prevent the swaps.
Job Accounting on Rhea
Jobs on Rhea are scheduled in full node increments; a node's cores cannot be allocated to multiple jobs. Because the OLCF charges based on what a job makes unavailable to other users, a job is charged for an entire node even if it uses only one core on a node. To simplify the process, users are allocated multiples of entire nodes through PBS.
Viewing Allocation Utilization
Projects are allocated time on Rhea in units of node-hours. This is separate from a project’s Titan or Eos allocation, and usage of Rhea does not count against that allocation. This page describes how such units are calculated, and how users can access more detailed information on their relevant allocations.
The node-hour charge for each batch job will be calculated as follows:
node-hours = nodes requested * ( batch job endtime - batch job starttime )
Where batch job starttime is the time the job moves into a running state, and batch job endtime is the time the job exits a running state.
A batch job's usage is calculated solely on requested nodes and the batch job's start and end time. The number of cores actually used within any particular node is not part of the calculation. For example, if a job requests (6) nodes through the batch script, runs for (1) hour, and uses only (2) CPU cores per node, the job will still be charged for 6 nodes * 1 hour = 6 node-hours.
Utilization is calculated daily using batch jobs which complete between 00:00 and 23:59 of the previous day. For example, if a job moves into a run state on Tuesday and completes Wednesday, the job’s utilization will be recorded Thursday. Only batch jobs which write an end record are used to calculate utilization. Batch jobs which do not write end records due to system failure or other reasons are not used when calculating utilization.
Each user may view usage for projects on which they are members using the command-line tool showusage and via the My OLCF site.
On the Command Line via showusage
The showusage utility can be used to view your usage from January 01 through midnight of the previous day. For example:
$ showusage
Usage:
                               Project Totals
    Project      Allocation      Usage      Remaining      Usage
    _____________|______________|___________|____________|______________
    abc123       |    20000     |   126.3   |   19873.7  |    1560.80
The showusage -h option will list more usage details.
On the Web via My OLCF
More detailed metrics may be found on each project’s usage section of the My OLCF site.
The following information is available for each project:
- YTD usage by system, subproject, and project member
- Monthly usage by system, subproject, and project member
- YTD usage by job size groupings for each system, subproject, and project member
- Weekly usage by job size groupings for each system, and subproject
- Batch system priorities by project and subproject
- Project members
The My OLCF site is provided to aid in the utilization and management of OLCF allocations. If you have any questions or have a request for additional data, please contact the OLCF User Assistance Center.
Enabling Workflows through Cross-System Batch Submission
The OLCF now supports submitting jobs between OLCF systems via batch scripts. This can be useful for automatically triggering analysis and storage of large data sets after a successful simulation job has ended, or for launching a simulation job automatically once the input deck has been retrieved from HPSS and pre-processed.
The key to remote job submission is the command qsub -q host script.pbs, which will submit the file script.pbs to the batch queue on the specified host. This command can be inserted at the end of an existing batch script in order to automatically trigger work on another OLCF resource. This feature is supported on the following hosts:
|Host||Remote Submission Command|
|Rhea||qsub -q rhea script.pbs|
|Titan||qsub -q titan script.pbs|
|Data Transfer Nodes (DTNs)||qsub -q dtn script.pbs|
Example Workflow 1: Automatic Post-Processing
The simplest example of a remote submission workflow would be automatically triggering an analysis task on Rhea at the completion of a compute job on Titan. This workflow would require two batch scripts, one to be submitted on Titan, and a second to be submitted automatically to Rhea. Visually, this workflow may look something like the following:
Batch-script-1.pbs (submitted on Titan):
#PBS -l walltime=0:30:00
#PBS -l nodes=4096
#PBS -A PRJ123
#PBS -l gres=atlas1%atlas2

# run compute job on titan
cd $MEMBERWORK/prj123
aprun -n 65536 ./run_simulation.exe

# Submit visualization processing job to Rhea
qsub -q rhea Batch-script-2.pbs
Batch-script-2.pbs (runs on Rhea):
#PBS -l walltime=2:00:00
#PBS -l nodes=10
#PBS -A PRJ123
#PBS -l gres=atlas1%atlas2

# Launch executable
cd $MEMBERWORK/prj123
mpirun -n 10 ./post_process_job.exe
The key to this workflow is the qsub -q batch@rhea-batch Batch-script-2.pbs command, which tells qsub to submit the file Batch-script-2.pbs to the batch queue on Rhea.
Initializing the Workflow
We can initialize this workflow in one of two ways:
- Log into Titan and run qsub Batch-script-1.pbs
- From Titan or Rhea, run qsub -q titan Batch-script-1.pbs
Example Workflow 2: Data Staging, Compute, and Archival
Now we give another example of a linear workflow. This example shows how to use the Data Transfer Nodes (DTNs) to retrieve data from HPSS and stage it to your project's scratch area before the computation begins. Once the computation is done, we will automatically archive the output.
Batch-script-1.pbs (runs on the DTNs):
#PBS -l walltime=0:30:00
#PBS -l nodes=1
#PBS -A PRJ123
#PBS -l gres=atlas1%atlas2

# Retrieve Data from HPSS
cd $MEMBERWORK/prj123
htar -xf /proj/prj123/input_data.htar input_data/

# Launch compute job
qsub -q titan Batch-script-2.pbs
Batch-script-2.pbs (runs on Titan):
#PBS -l walltime=6:00:00
#PBS -l nodes=4096
#PBS -A PRJ123
#PBS -l gres=atlas1%atlas2

# Launch executable
cd $MEMBERWORK/prj123
aprun -n 65536 ./analysis-task.exe

# Submit data archival job to DTNs
qsub -q dtn Batch-script-3.pbs
Batch-script-3.pbs (runs on the DTNs):
#PBS -l walltime=0:30:00
#PBS -l nodes=1
#PBS -A PRJ123
#PBS -l gres=atlas1%atlas2

# Archive output in HPSS
cd $MEMBERWORK/prj123
htar -cf /proj/prj123/viz_output.htar viz_output/
htar -cf /proj/prj123/compute_data.htar compute_data/
Initializing the Workflow
We can initialize this workflow in one of two ways:
- Log into the DTNs and run qsub Batch-script-1.pbs
- From Titan or Rhea, run qsub -q dtn Batch-script-1.pbs
Example Workflow 3: Data Staging, Compute, Visualization, and Archival
This is an example of a “branching” workflow. What we will do is first use Rhea to prepare a mesh for our simulation on Titan. We will then launch the compute task on Titan, and once this has completed, our workflow will branch into two separate paths: one to archive the simulation output data, and one to visualize it. After the visualizations have finished, we will transfer them to a remote institution.
Step-1.prepare-data.pbs (runs on Rhea):
#PBS -l walltime=0:30:00
#PBS -l nodes=10
#PBS -A PRJ123
#PBS -l gres=atlas1%atlas2

# Prepare Mesh for Simulation
mpirun -n 160 ./prepare-mesh.exe

# Launch compute job
qsub -q titan Step-2.compute.pbs
Step-2.compute.pbs (runs on Titan):
#PBS -l walltime=6:00:00
#PBS -l nodes=4096
#PBS -A PRJ123
#PBS -l gres=atlas1%atlas2

# Launch executable
cd $MEMBERWORK/prj123
aprun -n 65536 ./analysis-task.exe

# Workflow branches at this stage, launching 2 separate jobs
# - Launch Archival task on DTNs
qsub -q dtn@dtn-batch Step-3.archive-compute-data.pbs
# - Launch Visualization task on Rhea
qsub -q rhea Step-4.visualize-compute-data.pbs
Step-3.archive-compute-data.pbs (runs on the DTNs):
#PBS -l walltime=0:30:00
#PBS -l nodes=1
#PBS -A PRJ123
#PBS -l gres=atlas1%atlas2

# Archive compute data in HPSS
cd $MEMBERWORK/prj123
htar -cf /proj/prj123/compute_data.htar compute_data/
Step-4.visualize-compute-data.pbs (runs on Rhea):
#PBS -l walltime=2:00:00
#PBS -l nodes=64
#PBS -A PRJ123
#PBS -l gres=atlas1%atlas2

# Visualize Compute data
cd $MEMBERWORK/prj123
mpirun -n 768 ./visualization-task.py

# Launch transfer task
qsub -q dtn Step-5.transfer-visualizations-to-campus.pbs
Step-5.transfer-visualizations-to-campus.pbs (runs on the DTNs):
#PBS -l walltime=2:00:00
#PBS -l nodes=1
#PBS -A PRJ123
#PBS -l gres=atlas1%atlas2

# Transfer visualizations to storage area at home institution
cd $MEMBERWORK/prj123
SOURCE=gsiftp://dtn03.ccs.ornl.gov/$MEMBERWORK/visualization.mpg
DEST=gsiftp://dtn.university-name.edu/userid/visualization.mpg
globus-url-copy -tcp-bs 12M -bs 12M -p 4 $SOURCE $DEST
Initializing the Workflow
We can initialize this workflow in one of two ways:
- Log into Rhea and run qsub Step-1.prepare-data.pbs
- From Titan or the DTNs, run qsub -q rhea Step-1.prepare-data.pbs
Checking Job Status
To check the status of a job submitted to a remote queue, the following commands may be used:
|Host||Remote qstat||Remote showq|
|Rhea||qstat @rhea-batch||showq --host=rhea-batch|
|Data Transfer Nodes (DTNs)||qstat @dtn-batch||showq --host=dtn-batch|
Deleting Remote Jobs
In order to delete a job (say, job number 18688) from a remote queue, specify the remote batch server as part of the job identifier:
|Host||Remote qdel|
|Rhea||qdel 18688@rhea-batch|
|Data Transfer Nodes (DTNs)||qdel 18688@dtn-batch|
The OLCF advises users to keep their remote submission workflows simple, short, and mostly linear. Workflows that contain many layers of branches, or that trigger many jobs at once, may prove difficult to maintain and debug. Workflows that contain loops or recursion (jobs that can submit themselves again) may inadvertently waste allocation hours if a suitable exit condition is not reached.
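For example, a self-resubmitting script can guard against runaway recursion by carrying an iteration counter and refusing to resubmit past a fixed limit (a hedged sketch; the script name, counter variable, limit, and executable are illustrative):
#!/bin/bash
#PBS -A <PROJECTID>
#PBS -l nodes=1,walltime=01:00:00

cd $PBS_O_WORKDIR
./simulation-step.exe

# Resubmit only while the explicit exit condition has not been reached
ITER=${ITER:-0}
if [ "$ITER" -lt 10 ]; then
    qsub -v ITER=$((ITER+1)) resubmit.pbs
fi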
Note: Ensure the #PBS -A <PROJECTID> field is set to the correct project prior to submission. This will ensure that resource usage is associated with the intended project.