Transferring Data with HSI and HTAR

The commands hsi and htar provide users with easy-to-use interfaces to their User Archive and Project Archive spaces on the OLCF’s HPSS-based archival storage system.

HSI Overview

The hsi utility allows automatic authentication and provides a user-friendly command line and interactive interface to HPSS. HSI is the preferred method of accessing HPSS. Features of HSI include:

  • Password security: passwords are not transmitted in clear text over the network.
  • Usability in pipelines and shell scripts and in batch jobs.
  • Support for command stacking (multiple commands per line).
  • Support for interactive or one-liner (i.e., command-line-only) modes.
  • Support for abbreviations for most commands and keywords.
  • Support for recursion for many common commands (such as those for storing, retrieving, and listing files).
  • Extensive online full-screen help.

Using HSI

Issuing the command hsi will start HSI in interactive mode. Alternatively, you can use:

  hsi [options] command(s)

…to execute a set of HSI commands and then return. Note that you may need to add /opt/public/bin to your search path to find the HSI executable.
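If the shell cannot find hsi, the search path can be extended in a startup file. The following is a minimal sketch for Bourne-style shells; the helper name add_to_path is illustrative and not part of HSI.

```shell
# Append a directory to PATH only if it is not already present.
# add_to_path is a hypothetical helper name used for illustration.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already on the path; nothing to do
    *) PATH="$PATH:$1" ;;
  esac
  export PATH
}

add_to_path /opt/public/bin

# Report whether the shell can now locate hsi.
if command -v hsi >/dev/null 2>&1; then
  echo "hsi found at: $(command -v hsi)"
else
  echo "hsi not found on PATH"
fi
```

Calling the helper more than once is harmless, so it can safely live in a file that is sourced repeatedly.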

hsi commands are similar to ftp commands. For example, hsi get and hsi put retrieve and store individual files, and hsi mget and hsi mput retrieve and store multiple files.

To send a file to HPSS, you might use:

  hsi put a.out

To retrieve one, you might use:

  hsi get /proj/projectid/a.out
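Command stacking lets one hsi invocation do several things. The sketch below prints the command lines instead of running them unless DRY_RUN=0 is set. The wrapper run_hsi is a hypothetical helper, projectid is a placeholder, and both the cd command and the "local : remote" argument form are assumptions based on HSI's ftp-like command set; confirm them with hsi help before relying on them.

```shell
# Sketch only: prints the hsi command lines by default.
# Set DRY_RUN=0 to execute them (requires hsi on PATH and an HPSS account).
DRY_RUN="${DRY_RUN:-1}"

# run_hsi is a hypothetical wrapper, not part of HSI.
# It hands all HSI commands to hsi as one quoted string.
run_hsi() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'hsi "%s"\n' "$1"
  else
    hsi "$1"
  fi
}

PROJECT_DIR=/proj/projectid    # placeholder; substitute your project ID

# Store a.out in the Project Archive: two stacked commands separated by ";".
run_hsi "cd $PROJECT_DIR; put a.out"

# Retrieve it under a different local name (assumed "local : remote" form).
run_hsi "get a.out.restored : $PROJECT_DIR/a.out"
```

Quoting the whole command string keeps the local shell from interpreting the semicolon itself.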

Using HTAR

The htar command provides a command-line interface very similar to that of the traditional tar command found on UNIX systems. For example, to store all files in the directory dir1 to a file named allfiles.tar on HPSS, use a command similar to:

  htar -cvf allfiles.tar dir1/*

Storage Locations

Users are provided with a User Archive directory on HPSS that is located at /home/userid (where userid is your User ID). Additionally, each project is given a Project Archive directory located at /proj/projectid (where projectid is the six-character project ID).

Documentation

There is interactive documentation on the hsi command available by running:

  hsi help

Additionally, documentation can be found at the Gleicher Enterprises website, including an HSI User Guide and man pages for HSI and HTAR.

Transfer File Sizes

HPSS is optimized for larger files, so if you have multiple files that are smaller than 2GB, you should combine them and store a single, larger file. In most cases, this will provide a faster transfer and it will allow HPSS to store the data more efficiently. The HTAR command is very useful for doing this, and is often faster than using the conventional tar command and then transferring via HSI.
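One way to act on this advice is to find the files in a directory that fall below the 2GB mark and bundle them with a single htar call. The sketch below makes several assumptions: "2GB" is read as 2 × 2^30 bytes, the helper names and the archive name small_files.tar are hypothetical, and file names contain no spaces. It only prints the htar command; nothing is transferred.

```shell
# Assumption: "2GB" is read as 2 * 2^30 bytes.
THRESHOLD=$((2 * 1024 * 1024 * 1024))

# list_small_files DIR -> print regular files directly inside DIR that are
# smaller than THRESHOLD bytes, one per line.
list_small_files() {
  for f in "$1"/*; do
    [ -f "$f" ] || continue
    size=$(( $(wc -c <"$f") ))   # arithmetic expansion strips any padding
    if [ "$size" -lt "$THRESHOLD" ]; then
      printf '%s\n' "$f"
    fi
  done
}

# bundle_command ARCHIVE DIR -> print an htar command that would store the
# small files in DIR as ARCHIVE on HPSS. Assumes names without spaces.
bundle_command() {
  files=$(list_small_files "$2" | tr '\n' ' ')
  if [ -n "$files" ]; then
    printf 'htar -cvf %s %s\n' "$1" "${files% }"
  fi
}

# Example: show the command for the files in dir1 (prints nothing if
# dir1 does not exist or holds no small files).
bundle_command small_files.tar dir1
```

Reviewing the printed command before running it is a cheap way to check that the bundle will contain what you expect.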

Ideal HSI and HTAR transfer limits are as follows:

  File Size      Logins        Puts          Concurrent Sessions
  2GB – 256GB    < 500 a day   < 500 a day   < 3

Warning: Use of hsi in excess of the guidelines listed in the table above may result in HPSS account termination or delays.

Setting the Number of Copies

Our HPSS supports up to (2) tape copies for files. By default, files are written with only (1) copy. For critical files that have no backup elsewhere and cannot easily be re-created, you may want to store (2) copies.

Warning: By default, files in HPSS are written with only (1) copy. If you wish to write files to HPSS employing (2) copies, you must explicitly request such behavior.

You can specify (2) copies by issuing the copies=2 command before any put statements in hsi or as part of your htar command line, as shown below.

For non-interactive HSI (i.e., in a batch script):

  <batch script>
  ...
  hsi "copies=2; put test.file"
  ...
  $ <run batch script>
    put 'test.file' : '/home/username/test.file' ( 56420696 bytes, 30566.2 KBS (cos=6003))
  $

For interactive HSI (i.e., within HSI interactive mode):

  $ hsi
  O:[/home/username]: copies=2
  O:[/home/username]: put test.file
    put 'test.file' : '/home//test.file' ( 56420696 bytes, 58627.9 KBS (cos=6003))
  O:[/home/username]: exit
  $ 

With the htar command:

  $ htar -H copies=2 -cf test.tar .
    HTAR: HTAR SUCCESSFUL
  $

Verifying the Number of Copies

Copies in HPSS are controlled through Classes of Service (COS). Each COS is either a (1)-copy COS or a (2)-copy COS. You can check the COS of a file by issuing the ls -UH command:

  O:[/home/username]: ls -UH test.file.tar
  Mode        Links  Owner  Group  COS   Acct    Where  Size         DateTime     Entry
  -rw-r--r--  1      $USER  users  6057  act001  DISK   50064568320  May 16 2008  test.file.tar

The above example shows that the test.file.tar file is in COS 6057. Typically, even-numbered COS are (1)-copy and odd-numbered COS are (2)-copy. You can verify whether a COS is (1)-copy or (2)-copy by issuing the lscos command within HSI:

  O:[/home/]: lscos

  10 HPSS Classes of Service defined
  COS   Name                 Excl.  Copies   Subsys        Min Size -        Max Size + 1
  ID                         Flags           IDs   
  ---------------------------------------------------------------------------------------
  5081  Disk X-Small                2        ALL                  0 -             131,072
  5081  Disk X-Small                1        ALL                  0 -             131,072
  6001  Disk Small 9840             2        ALL            131,072 -          16,777,216
  6001  Disk Small 9840             1        ALL            131,072 -          16,777,216
  6002  Disk Medium 9840            1        ALL         16,777,216 -         536,870,912
  6003  Disk Medium 2-Copy          2        ALL         16,777,216 -         536,870,912
  6054  Disk Large_T 1-Copy         1        ALL        536,870,912 -       8,589,934,592
  6055  Disk Large_T 2-Copy         2        ALL        536,870,912 -       8,589,934,592
  6056  Disk X-Large_T              1        ALL      8,589,934,592 - 281,474,976,710,656
  6057  Disk X-Large_T 2-Copy       2        ALL      8,589,934,592 - 281,474,976,710,656
  ---------------------------------------------------------------------------------------
  Flags: U/G/A - unavailable to current uid/gid/account   N - no auto assignment

From the above output, we can tell that COS 6057 is a (2)-copy COS; the copies column has a (2) in it.
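Reading the Copies column by eye can be scripted. The sketch below assumes the lscos layout shown above, in particular that the Subsys column reads ALL and sits immediately after the Copies column; cos_copies is a hypothetical helper name.

```shell
# cos_copies COS_ID -> read lscos output on stdin and print the Copies value
# of the first row whose COS ID matches.
# Assumption: the Subsys column reads "ALL" and directly follows Copies.
cos_copies() {
  awk -v id="$1" '
    $1 == id {
      for (i = 2; i <= NF; i++)
        if ($i == "ALL") { print $(i - 1); exit }
    }'
}

# Possible usage on a system with hsi installed (left commented out here).
# If the pipe receives no input, hsi may be writing the listing to standard
# error, in which case adding 2>&1 before the pipe is worth trying.
#   hsi lscos | cos_copies 6057
```

Searching for the Subsys value rather than a fixed field number keeps the lookup working for COS names of different lengths, such as "Disk X-Large_T" and "Disk X-Large_T 2-Copy".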

Direct Transfers Between HPSS and Remote Systems

Because HSI is a third-party package, clients may be available for remote systems (e.g., your personal workstation). However, the OLCF currently supports access to HPSS only through HSI clients on the OLCF HPC systems, so direct transfers between HPSS and a remote system are not supported. To move data between the OLCF’s HPSS and a remote system, you will need to use an OLCF resource as a staging system.

For example, to transfer data from your directory on HPSS to a system outside the OLCF, you will need to copy the data in reasonable chunks to an OLCF system using the HSI utility. Once a portion of the data is on an OLCF system, you can use a utility such as BBCP or SFTP/SCP to move the data to the system outside the OLCF.
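The staging workflow can be sketched as a loop that handles one archive at a time. Everything specific here is an assumption: the destination user@remote.example.org:/data/incoming, the archive names, the helper names, and the "local : remote" form of get. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
DRY_RUN="${DRY_RUN:-1}"

# run: print the command in dry-run mode, otherwise execute it.
# The printed form drops the quotes around the hsi command string.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# stage_and_send REMOTE_DEST HPSS_PATH...
# For each HPSS file: fetch it into the current directory on the OLCF
# system, copy it out with scp, then delete the staged copy.
stage_and_send() {
  dest=$1
  shift
  for hpss_path in "$@"; do
    name=$(basename "$hpss_path")
    run hsi "get $name : $hpss_path" || return 1
    run scp "$name" "$dest" || return 1
    run rm -f "$name"
  done
}

# Hypothetical destination and archive names.
stage_and_send user@remote.example.org:/data/incoming \
  /proj/projectid/part1.tar /proj/projectid/part2.tar
```

Removing each staged file after it is sent keeps local disk usage bounded by the size of one chunk, and stopping at the first failure leaves the remaining archives untouched on HPSS.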