py-h5py Overview

HDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5. HDF5 is a versatile, mature scientific software library designed for the fast, flexible storage of enormous amounts of data.

From a Python programmer's perspective, HDF5 provides a robust way to store data, organized by name in a tree-like fashion. You can create datasets (arrays on disk) hundreds of gigabytes in size and perform random-access I/O on selected sections. Datasets are organized in a filesystem-like hierarchy using containers called "groups", and are accessed using the traditional POSIX /path/to/resource syntax.
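As a minimal sketch of the hierarchy and random-access behavior described above, the following example stores a dataset under nested groups and then reads back only a small slice. The file name, the group names (experiment, run1), and the dataset name (temperature) are hypothetical and chosen purely for illustration; the file is written to a temporary directory.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file name, placed in a temporary directory for illustration
path = os.path.join(tempfile.mkdtemp(), "hierarchy_example.hdf5")

with h5py.File(path, "w") as f:
    # Intermediate groups ("experiment", "run1") are created automatically
    dset = f.create_dataset("/experiment/run1/temperature", (1000,), dtype="f8")
    dset[...] = np.linspace(0.0, 1.0, 1000)

with h5py.File(path, "r") as f:
    # Slicing a dataset reads only the requested elements from disk
    window = f["/experiment/run1/temperature"][100:110]
    # Groups behave like dictionaries keyed by member name
    group_names = list(f["/experiment"].keys())

print(window.shape)
print(group_names)
```

Opening the file in a with block ensures it is closed, and its contents flushed, even if an error occurs partway through.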

Support

Usage

Load the following modules into your environment:
$ module load python
$ module load python_h5py
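Using h5py from Python is straightforward. Import the h5py and numpy modules. h5py datasets support much of the NumPy array interface (shape, dtype, slicing), so the two modules work well together.
import h5py
import numpy as np

# Create a new HDF5 file ("w" overwrites any existing file of the same name)
f = h5py.File("testfile.hdf5", "w")
# Create a 100-element integer dataset and fill it with the values 0..99
dset = f.create_dataset("dset1", (100,), dtype='i')
dset[...] = np.arange(100)
# Close the file so that all data is flushed to disk
f.close()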
Run the h5py script the same way as any other Python script. To verify that it worked, load the HDF5 module and use h5dump to display the contents of the newly created HDF5 file.
$ module load cray-hdf5
$ h5dump testfile.hdf5
HDF5 "testfile.hdf5" {
GROUP "/" {
   DATASET "dset1" {
      DATATYPE H5T_STD_I32LE
      DATASPACE SIMPLE { ( 100 ) / ( 100 ) }
      DATA {
      (0): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
      (19): 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
      (35): 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
      (51): 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66,
      (67): 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82,
      (83): 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98,
      (99): 99
      }
   }
}
}
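The same check can also be done from Python without h5dump. The sketch below recreates the example file and then reopens it read-only to inspect the dataset. It writes to a temporary directory (an assumption made here so the example does not overwrite an existing testfile.hdf5); the dataset name dset1 matches the example above.

```python
import os
import tempfile

import h5py
import numpy as np

# Temporary location (an assumption for this sketch) to avoid clobbering files
path = os.path.join(tempfile.mkdtemp(), "testfile.hdf5")

# Write the same dataset as in the example above
with h5py.File(path, "w") as f:
    dset = f.create_dataset("dset1", (100,), dtype="i")
    dset[...] = np.arange(100)

# Reopen read-only and inspect the dataset
with h5py.File(path, "r") as f:
    dset = f["dset1"]
    shape = dset.shape    # dataset dimensions
    dtype = dset.dtype    # NumPy dtype of the stored elements
    values = dset[...]    # read the full dataset into a NumPy array

print(shape, dtype)
print(values[:5])
```

The values are copied into a NumPy array before the file is closed, so they remain usable after the with block ends.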
For a more in-depth tutorial, see http://docs.h5py.org/en/latest/quick.html
Builds

SUMMIT

  • py-h5py@2.6.0%gcc@4.8.5+mpi
  • py-h5py@2.6.0%gcc@4.8.5~mpi ^hdf5~mpi
  • py-h5py@2.6.0%gcc@4.8.5~mpi ^python@3.5.2 ^hdf5~mpi

RHEA

  • 2.5.0
  • 2.2.0

TITAN

  • 2.4.0
  • 2.5.0
  • 2.6.0
  • 2.6.0_parallel