Interactive Batch Jobs on Commodity Clusters
Batch scripts are useful when one has a pre-determined group of commands to execute, the results of which can be viewed at a later time. However, it is often necessary to run tasks on compute resources interactively.
Users are not allowed to access cluster compute nodes directly from a login node. Instead, users must use an interactive batch job to allocate and gain access to compute resources. This is done by passing the -I option to qsub. Other PBS options are passed to qsub on the command line as well:
$ qsub -I -A abc123 -q qname -V -l nodes=4 -l walltime=00:30:00
This request will:
|-I|Start an interactive session|
|-A abc123|Charge to the abc123 project|
|-q qname|Run in the qname queue|
|-V|Export the user’s shell environment to the job’s environment|
|-l nodes=4|Request (4) nodes…|
|-l walltime=00:30:00|…for (30) minutes|
After running this command, the job will wait until enough compute nodes are available, just as any other batch job must. However, once the job starts, the user will be given an interactive prompt on the primary compute node within the allocated resource pool. Commands may then be executed directly (instead of through a batch script).
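Once the prompt appears, it can be helpful to confirm how many nodes were actually allocated. The sketch below assumes a PBS/Torque-style environment in which the scheduler sets PBS_NODEFILE to a file listing one hostname per line (possibly repeated once per core); count_nodes is a hypothetical helper name, not part of PBS.

```shell
# Hypothetical helper: count the distinct hosts listed in a PBS node file.
# Assumes one hostname per line, possibly repeated once per core.
count_nodes() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Inside an interactive job, PBS_NODEFILE is assumed to be set by the scheduler.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "Allocated nodes: $(count_nodes "$PBS_NODEFILE")"
fi
```

On a login node, where PBS_NODEFILE is normally unset, the check simply prints nothing.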
Using Interactive Batch Jobs to Debug
A common use of interactive batch jobs is to aid debugging. Interactive access to compute resources makes it possible to run a process to the point of failure. Unlike in a batch job, the process can then be restarted after brief changes are made without losing the compute resource pool, which speeds up the debugging effort.
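One way to picture this edit-and-rerun cycle is a small wrapper that reports how each attempt ended and then returns to the prompt, so the allocation stays in place. This is only a sketch: run_and_report is a hypothetical helper, and the launcher and program names in the comments (mpirun, ./my_app) are placeholders rather than anything this system is known to provide.

```shell
# Hypothetical helper: run a command and report how it ended, leaving the
# interactive shell (and therefore the node allocation) in place.
run_and_report() {
    "$@"
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "succeeded: $*"
    else
        echo "failed with exit status $rc: $*"
    fi
    return "$rc"
}

# Typical loop at the interactive prompt (launcher and program are placeholders):
#   run_and_report mpirun -np 4 ./my_app   # runs to the point of failure
#   ...edit the source, rebuild...
#   run_and_report mpirun -np 4 ./my_app   # rerun on the same allocation
```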
Choosing a Job Size
Because interactive jobs must sit in the queue until enough resources become available to allocate, it is useful to base core selection on the number of currently unallocated cores (to shorten the queue wait time).
Use the showbf command (i.e. “show backfill”) to see resource limits that would allow your job to be immediately backfilled (and thus started) by the scheduler. For example, the snapshot below shows that (8) nodes are currently free.
$ showbf
Partition  Tasks  Nodes  StartOffset   Duration   StartDate
---------  -----  -----  ------------  ---------  --------------
lens       4744   8      INFINITY      00:00:00   HH:MM:SS_MM/DD
See the output of the showbf --help command for additional options.
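The node count in a snapshot like the one above can also be pulled out with a short filter and fed into the qsub request. The sketch below assumes the column layout shown in that snapshot (Partition, Tasks, Nodes, ...), which may differ on other systems; free_nodes is a hypothetical helper, and abc123 and qname are the same placeholders used earlier.

```shell
# Hypothetical helper: read showbf-style output on stdin and print the
# value in the Nodes column (third field) for the given partition.
# Assumes the column layout shown in the snapshot above.
free_nodes() {
    awk -v part="$1" '$1 == part { print $3; exit }'
}

# Example use on a system where showbf is available (the partition name
# "lens" comes from the snapshot; abc123 and qname are placeholders):
#   nodes=$(showbf | free_nodes lens)
#   qsub -I -A abc123 -q qname -V -l nodes="$nodes" -l walltime=00:30:00
```

If the partition is not listed, the helper prints nothing, so it is worth checking that the result is non-empty before building the request.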