Interactive Batch Jobs

See this article in context within the following user guides: Eos | Titan

Batch scripts are useful for submitting a group of commands, letting them run through the queue, and viewing the results later. However, it is sometimes necessary to run tasks within a job interactively.

Users are not permitted to access compute nodes or run aprun directly from login nodes. Instead, users must submit an interactive batch job to allocate and access compute resources. This is done by passing the -I option to qsub.

Interactive Batch Example

For interactive batch jobs, PBS options are passed through qsub on the command line.

$ qsub -I -A pjt000 -q debug -X -l nodes=3,walltime=30:00

This request will:

-I Start an interactive session
-A Charge to the “pjt000” project
-q debug Run in the debug queue
-X Enable X11 forwarding (the DISPLAY environment variable must be set)
-l nodes=3,walltime=30:00 Request 3 compute nodes for 30 minutes (all cores on each node are allocated to you)

After running this command, you will have to wait until enough compute nodes are available, just as with any other batch job. Once the job starts, however, you will be given an interactive prompt on the head node of your allocated resource. From here, commands may be executed directly instead of through a batch script.
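The qsub request shown earlier can be wrapped in a small helper that only asks for X11 forwarding when DISPLAY is actually set. This is a minimal sketch, not an OLCF-provided tool: the function name build_interactive_qsub and its argument order are hypothetical, and it only prints the command instead of submitting it.

```shell
#!/bin/sh
# Hypothetical helper (not part of PBS or OLCF tooling): print the qsub
# command line for an interactive job.
# Usage: build_interactive_qsub PROJECT QUEUE NODES WALLTIME
build_interactive_qsub() {
    if [ "$#" -ne 4 ]; then
        echo "usage: build_interactive_qsub PROJECT QUEUE NODES WALLTIME" >&2
        return 1
    fi
    project=$1
    queue=$2
    nodes=$3
    walltime=$4

    # Add -X only when DISPLAY is set, since X11 forwarding needs it.
    xflag=""
    if [ -n "${DISPLAY:-}" ]; then
        xflag=" -X"
    fi

    echo "qsub -I -A $project -q $queue$xflag -l nodes=$nodes,walltime=$walltime"
}
```

For example, build_interactive_qsub pjt000 debug 3 30:00 prints the request from the example above when DISPLAY is set, and the same request without -X when it is not. Printing rather than submitting makes it easy to inspect the command before running it.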

Debugging via Interactive Jobs

A common use of interactive batch is to aid in debugging efforts. Interactive access to compute resources makes it possible to run a process to the point of failure; unlike in a batch job, however, the process can be restarted after brief changes are made without losing the compute resource allocation. This can speed up debugging because a user does not have to wait in the queue between run attempts.
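When a run is repeated several times inside one interactive session, it can help to keep the output of each attempt. The following is a minimal sketch under stated assumptions: the helper name run_attempt and the attempt_N.log file names are hypothetical, and the helper simply runs whatever command it is given.

```shell
#!/bin/sh
# Hypothetical helper: run a command, save its output to a numbered log
# file in the current directory, and report the exit status.
attempt=0
run_attempt() {
    attempt=$((attempt + 1))
    log="attempt_${attempt}.log"
    # Capture both stdout and stderr of the command being debugged.
    "$@" > "$log" 2>&1
    status=$?
    echo "attempt $attempt: exit status $status (output in $log)"
    return "$status"
}
```

Inside the interactive session, this might be used as run_attempt aprun -n 3 -N 1 ./a.out, where ./a.out stands in for your own executable and the aprun options place one task on each of the three nodes. After a failure, edit and rebuild, then call run_attempt again; the allocation stays in place between attempts.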

Note: To tunnel a GUI from an interactive batch job, the -X PBS option should be used to enable X11 forwarding.

Choosing an Interactive Job’s nodes Value

Because interactive jobs must sit in the queue until enough resources become available, it is useful to base the nodes value on the number of currently unallocated nodes; this shortens the queue wait. Use the showbf command (i.e., “show backfill”) to see resource limits that would allow your job to be back-filled immediately (and thus started) by the scheduler. For example, the snapshot below shows that 802 nodes are currently free.

$ showbf
Partition   Tasks   Nodes   StartOffset    Duration       StartDate
---------   -----   -----   ------------   ------------   --------------
ALL         4744    802     INFINITY       00:00:00       HH:MM:SS_MM/DD

See showbf --help for additional options.
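The free-node count can also be pulled out of the showbf output and used to cap a node request. This is a rough sketch that assumes the output has the column layout shown in the snapshot above; the helper names free_nodes and pick_nodes are hypothetical. Real showbf output may contain several rows, and this sketch only reads the first row whose partition is ALL.

```shell
#!/bin/sh
# Hypothetical helper: read showbf output on stdin and print the Nodes
# column (third field) of the first row for the "ALL" partition.
free_nodes() {
    awk '$1 == "ALL" { print $3; exit }'
}

# Hypothetical helper: cap a desired node count at the number of free nodes.
# Usage: pick_nodes WANTED FREE
pick_nodes() {
    wanted=$1
    free=$2
    case $free in
        ''|*[!0-9]*)
            echo "pick_nodes: free node count is not a number" >&2
            return 1
            ;;
    esac
    if [ "$wanted" -le "$free" ]; then
        echo "$wanted"
    else
        echo "$free"
    fi
}
```

A possible use is nodes=$(pick_nodes 1000 "$(showbf | free_nodes)"), followed by passing that value in the -l nodes=... request to qsub. The requested walltime also affects whether a job can be back-filled, which this sketch does not take into account.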