The OLCF provides a comprehensive suite of hardware and software resources for the creation, manipulation, and retention of scientific data. A summary of the OLCF’s Data Management Policy is presented below.
Data Storage Areas
The OLCF provides an array of data storage areas, each designed with a particular purpose in mind. Storage areas are broadly divided into two categories: those intended for user data and those intended for project data. Within each of the two categories, we provide different sub-areas, each with an intended purpose:
| Purpose | Storage Area |
|---------|--------------|
| Long-term data for routine access that is unrelated to a project | User Home |
| Long-term data for archival access that is unrelated to a project | User Archive |
| Long-term project data for routine access that’s shared with other project members | Project Home |
| Short-term project data for fast, batch-job access that you don’t want to share | Member Work |
| Short-term project data for fast, batch-job access that’s shared with other project members | Project Work |
| Short-term project data for fast, batch-job access that’s shared with those outside your project | World Work |
| Long-term project data for archival access that’s shared with other project members | Project Archive |
Data Retention, Purge, & Quota Summary
Users must agree to the full Data Management Policy as part of their account application. Its “Data Retention, Purge, & Quotas” section is of particular importance and is summarized below.
User-Centric Storage Areas

| Area | Type | Permissions | Quota | Backups | Purged | Retention |
|------|------|-------------|-------|---------|--------|-----------|
| User Home | NFS | User-controlled | 10 GB | Yes | No | 90 days |
| User Archive | HPSS | User-controlled | 2 TB | No | No | 90 days |

Project-Centric Storage Areas

| Area | Type | Permissions | Quota | Backups | Purged | Retention |
|------|------|-------------|-------|---------|--------|-----------|
| Project Home | NFS | 770 | 50 GB | Yes | No | 90 days |
| Member Work | Lustre® | 700 | 10 TB | No | 14 days | 14 days |
| Project Work | Lustre® | 770 | 100 TB | No | 90 days | 90 days |
| World Work | Lustre® | 775 | 10 TB | No | 90 days | 90 days |
| Project Archive | HPSS | 770 | 100 TB | No | No | 90 days |
| Column | Description |
|--------|-------------|
| Area | The general name of the storage area. |
| Path | The path (symlink) to the storage area’s directory. |
| Type | The underlying software technology supporting the storage area. |
| Permissions | UNIX permissions enforced on the storage area’s top-level directory. |
| Quota | The limits placed on the total number of bytes and/or files in the storage area. |
| Backups | States whether the data is automatically duplicated for disaster-recovery purposes. |
| Purged | Period of time, post-file-creation, after which a file will be marked as eligible for permanent deletion. |
| Retention | Period of time, post-account-deactivation or post-project-end, after which data will be marked as eligible for permanent deletion. |
 In addition, there is a quota/limit of 2,000 files on this directory.
 Permissions on Member Work directories can be controlled to an extent by project members. By default, only the project member has access, but access can be granted to other project members by setting group permissions accordingly on the Member Work directory. The parent directory of the Member Work directory prevents access by “UNIX-others” and cannot be changed, for security reasons.
 In addition, there is a quota/limit of 100,000 files on this directory.
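To make the distinction between the purge and retention windows concrete, the following minimal sketch (a hypothetical helper, not an OLCF-provided tool) computes when a file becomes eligible for permanent deletion under a given purge policy:

```python
from datetime import datetime, timedelta

def purge_eligible_on(created: datetime, purge_days: int) -> datetime:
    """Date after which a file is marked eligible for permanent deletion.

    Per the policy above, the purge window is measured from file creation,
    while the retention window is measured from account deactivation or
    project end.
    """
    return created + timedelta(days=purge_days)

# A file created in a Member Work directory (14-day purge):
created = datetime(2015, 3, 1)
print(purge_eligible_on(created, 14).date())  # 2015-03-15
```

A file in a Project Work or World Work directory would instead use the 90-day purge window shown in the table above.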
The OLCF provides a number of hardware resources for the physical storage and management of large amounts of scientific data.
The OLCF’s center-wide Lustre® file system, called Spider, is the operational work file system for most OLCF computational resources. An extremely high-performance system, Spider serves over 26,000 clients, provides 32 petabytes of disk space, and can move data at more than 1 TB/s. Spider is accessible from nearly all of the OLCF’s computational resources, including Titan and its 18,000+ compute nodes, Rhea, Eos, and the Data Transfer Nodes. For more information on the Spider filesystem, see the Spider Knowledgebase article.
HPSS is the archival mass-storage resource at ORNL and consists of robotic tape and disk storage components, Linux servers, and associated software. Incoming data is written to disk and later migrated to tape for long-term archival. As storage, network, and computing technologies continue to change, ORNL’s storage system evolves to take advantage of new equipment that is both more capable and more cost-effective. For more information on HPSS allocations and use, see the Understanding HPSS Storage Allocations page or the Transferring Data with HSI and HTAR page within our support documentation.
The OLCF provides an array of software tools for managing large amounts of scientific data. All relevant software packages are listed under the Software: Data Management section of our support documentation.
Highlighted below are some OLCF-developed data management tools of particular interest:
The Adaptable I/O System (ADIOS) library, developed at OLCF, is a flexible method for applications to transparently manage data I/O. Using ADIOS, applications can change their I/O patterns and methods (such as MPI-IO or POSIX I/O) by simply editing a configuration file. This allows applications to optimize their I/O workload for different computing systems utilizing different I/O subsystems and filesystems. For more information on ADIOS, see the ADIOS project page or the ADIOS software page within our support documentation.
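As a sketch of the configuration-file approach described above, the fragment below follows the ADIOS 1.x XML format; the group name, variable names, and buffer size are illustrative assumptions, not values from the original text:

```xml
<?xml version="1.0"?>
<adios-config host-language="C">
  <adios-group name="restart" coordination-communicator="comm">
    <var name="NX" type="integer"/>
    <var name="temperature" type="double" dimensions="NX"/>
  </adios-group>
  <!-- Changing POSIX to MPI here switches the I/O method
       without touching application code -->
  <method group="restart" method="POSIX"/>
  <buffer size-MB="40" allocate-time="now"/>
</adios-config>
```

Because the method is selected in the `<method>` element rather than in source code, the same application binary can be retargeted to a different I/O subsystem by editing this file alone.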
The Lustre User Toolkit (LUT), developed at OLCF, is a library that provides an API for interacting with Lustre filesystems. LUT can also provide I/O timing information, allowing application authors to optimize their use of Lustre filesystems. OLCF has also made available Lustre-optimized user utilities built with LUT. For more information on LibLUT, see the LibLUT software page within our support documentation.
LustreDU was developed at the OLCF to give end users the ability to view the last recorded size of a directory on Spider. The advantage of LustreDU is that the query does not hit the Lustre metadata servers as a normal `du` command would; rather, the query pulls from a separate database and quickly returns the last known size of the directory. For more information, see the LustreDU software page in our support documentation.