Job Resource Accounting
The hybrid nature of Titan’s accelerated XK7 nodes mandated a new approach to node allocation and job charge units. For resource accounting, each Titan XK7 node is defined as having 30 total cores (i.e., 16 CPU cores + 14 GPU core equivalents). Jobs consume charge units in “Titan core-hours”, and each Titan node consumes 30 such units per hour.
As in years past, jobs on the Titan system will be scheduled in full node increments; a node’s cores cannot be allocated to multiple jobs. Because the OLCF charges based on what a job makes unavailable to other users, a job is charged for an entire node even if it uses only one core on a node. To simplify the process, users are required to request an entire node through PBS.
Notably, codes that do not take advantage of GPUs will have only 16 CPU cores available per node; however, allocation requests, and the units charged, will still be based on 30 cores per node.
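As a rough illustration of what this means for CPU-only codes, the sketch below computes how many Titan core-hours are charged for each CPU core-hour a job actually uses on a single node. The constant and function names are hypothetical and chosen only for this example; the 16, 14, and 30 figures come from the node definition above.

```python
# Illustrative sketch only; names are hypothetical, not part of any OLCF tool.
CPU_CORES_PER_NODE = 16
GPU_CORE_EQUIVALENTS_PER_NODE = 14
CHARGED_CORES_PER_NODE = CPU_CORES_PER_NODE + GPU_CORE_EQUIVALENTS_PER_NODE  # 30


def charge_per_cpu_core_hour(cpu_cores_used: int) -> float:
    """Titan core-hours charged per CPU core-hour actually used on one node."""
    if not 1 <= cpu_cores_used <= CPU_CORES_PER_NODE:
        raise ValueError("cpu_cores_used must be between 1 and 16")
    # The whole node is charged (30 units per hour) regardless of cores used.
    return CHARGED_CORES_PER_NODE / cpu_cores_used


print(charge_per_cpu_core_hour(16))  # 1.875
print(charge_per_cpu_core_hour(1))   # 30.0
```

A CPU-only job that keeps all 16 cores busy is therefore charged 1.875 Titan core-hours per CPU core-hour of work, and the ratio grows as fewer cores per node are used.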
Viewing Allocation Utilization
Projects are allocated time on Titan in units of “Titan core-hours”. Other OLCF systems are allocated in units of “core-hours”. This page describes how such units are calculated, and how users can access more detailed information on their relevant allocations.
Titan Core-Hour Calculation
The Titan core-hour charge for each batch job will be calculated as follows:
Titan core-hours = nodes requested * 30 * ( batch job endtime - batch job starttime )
Where batch job starttime is the time the job moves into a running state, and batch job endtime is the time the job exits a running state.
A batch job’s usage is calculated solely on requested nodes and the batch job’s start and end time. The number of cores actually used within any particular node within the batch job is not used in the calculation. For example, if a job requests 64 nodes through the batch script, runs for an hour, uses only 2 CPU cores per node, and uses no GPU cores, the job will still be charged for 64 * 30 * 1 = 1,920 Titan core-hours.
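The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the OLCF accounting code: the function name and the example timestamps are assumptions, and the elapsed time is assumed to be measured in hours.

```python
from datetime import datetime

# 16 CPU cores + 14 GPU core equivalents, as defined for each Titan XK7 node.
CHARGE_UNITS_PER_NODE_HOUR = 30


def titan_core_hours(nodes_requested: int, start: datetime, end: datetime) -> float:
    """Charge for one batch job, based only on requested nodes and run time.

    `start` is when the job enters a running state and `end` is when it
    leaves that state. Cores actually used do not enter the calculation.
    """
    if nodes_requested < 1:
        raise ValueError("nodes_requested must be at least 1")
    if end < start:
        raise ValueError("end must not be earlier than start")
    hours = (end - start).total_seconds() / 3600.0
    return nodes_requested * CHARGE_UNITS_PER_NODE_HOUR * hours


# The example from the text: 64 nodes for one hour (timestamps are made up).
start = datetime(2015, 3, 3, 12, 0)
end = datetime(2015, 3, 3, 13, 0)
print(titan_core_hours(64, start, end))  # 1920.0
```

Note that the function takes no argument for cores used, which mirrors the rule that a job is charged for every requested node in full.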
Utilization is calculated daily using batch jobs which complete between 00:00 and 23:59 of the previous day. For example, if a job moves into a run state on Tuesday and completes Wednesday, the job’s utilization will be recorded Thursday. Only batch jobs which write an end record are used to calculate utilization. Batch jobs which do not write end records due to system failure or other reasons are not used when calculating utilization.
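The recording rule can be expressed as a small sketch. It assumes, for illustration only, that a job's end record is represented as a Python datetime and that a job without an end record is represented as None; the function name is hypothetical.

```python
from datetime import date, datetime, timedelta
from typing import Optional


def utilization_recorded_on(job_end: Optional[datetime]) -> Optional[date]:
    """Return the day on which a job's utilization is recorded.

    Jobs completing between 00:00 and 23:59 of a given day are picked up by
    the next day's calculation. Jobs with no end record are never counted.
    """
    if job_end is None:
        return None
    return job_end.date() + timedelta(days=1)


# A job that starts Tuesday and completes Wednesday is recorded Thursday.
# Only the end time matters; the date below is an arbitrary Wednesday.
job_end = datetime(2015, 3, 4, 9, 30)
print(utilization_recorded_on(job_end).strftime("%A"))  # Thursday
print(utilization_recorded_on(None))                    # None
```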
Each user may view usage for the projects of which they are a member using the showusage command-line tool and the My OLCF site.
On the Command Line via showusage
The showusage utility can be used to view your usage from January 01 through midnight of the previous day. For example:
    $ showusage
    Usage on titan:
                                      Project Totals          <userid>
     Project      Allocation        Usage    Remaining          Usage
    _________________________|___________________________|_____________
     <YourProj>    2000000   |   123456.78   1876543.22  |     1560.80
The -h option will list more usage details.
On the Web via My OLCF
More detailed metrics may be found on each project’s usage section of the My OLCF site.
The following information is available for each project:
- YTD usage by system, subproject, and project member
- Monthly usage by system, subproject, and project member
- YTD usage by job size groupings for each system, subproject, and project member
- Weekly usage by job size groupings for each system and subproject
- Batch system priorities by project and subproject
- Project members
The My OLCF site is provided to aid in the utilization and management of OLCF allocations. If you have any questions or have a request for additional data, please contact the OLCF User Assistance Center.