HDF5-based file IO for PyMVPA objects.

Based on the h5py package, this module provides two core functions, obj2hdf() and hdf2obj(), as well as the convenience wrappers h5save() and h5load(), to store (in principle) arbitrary Python objects in HDF5 groups, and to convert such HDF5 groups back into Python object instances.

Similar to pickle, a Python object is disassembled into its pieces, but instead of serializing it into a byte stream it is stored in chunks whose types can be natively represented in HDF5. That means essentially everything that can be stored in a NumPy array.

If an object is not readily storable, its __reduce__() method is called to disassemble it into basic pieces. The default implementation of object.__reduce__() is typically sufficient, hence for any new-style Python class there is, in general, no need to implement __reduce__(). However, custom implementations might allow for leaner HDF5 representations and smaller files. Basic types, such as list and dict, whose __reduce__() methods do not help with disassembling, are also handled.
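The disassembly that __reduce__() performs can be seen with plain Python, independent of HDF5. The sketch below (using a hypothetical Point class for illustration) shows how the default __reduce__() splits an instance into a reconstructor callable, its arguments, and a state dict of basic, storable pieces, and how the instance can be rebuilt from them:

```python
class Point:
    """A simple new-style class with two storable attributes."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)

# The default __reduce__() yields a reconstructor callable, its
# arguments, and the instance state (here: the __dict__).
reconstructor, args, state = p.__reduce__()

# state == {'x': 1, 'y': 2} -- only basic pieces, no byte stream.
# Rebuild the object from those pieces:
q = reconstructor(*args)
q.__dict__.update(state)
```

A storage backend such as this module only has to persist the pieces (class reference, arguments, state) rather than the object itself.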


Although storage and reconstruction of arbitrary object types is possible in principle, it may not yet be implemented for all types. The current focus lies on storage of PyMVPA datasets and their attributes (e.g. Mappers).


asobjarray(x) Generates a numpy.ndarray with dtype object from an iterable.
h5load(filename[, name]) Loads the content of an HDF5 file that has been stored by h5save().
h5save(filename, data[, name, mode, mkdir]) Stores arbitrary data in an HDF5 file.
hdf2obj(hdf[, memo]) Convert an HDF5 group definition into an object instance.
obj2hdf(hdf, obj[, name, memo, noid]) Store an object instance in an HDF5 group.
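To illustrate what a helper like asobjarray() must do, here is a minimal, hypothetical reimplementation (a sketch for illustration, not the PyMVPA source). Passing nested sequences straight to numpy.array() can broadcast them into a multi-dimensional array or fail for ragged input, so each element is assigned explicitly into a pre-allocated object array:

```python
import numpy as np

def asobjarray_sketch(x):
    """Build an ndarray with dtype=object whose elements are exactly
    the items of the iterable ``x`` (illustrative sketch only)."""
    items = list(x)
    # Pre-allocate an empty object array; per-element assignment keeps
    # NumPy from interpreting nested sequences as extra dimensions.
    arr = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        arr[i] = item
    return arr

# Ragged input stays a flat 1-d array of list objects:
a = asobjarray_sketch([[1, 2], [3, 4, 5]])
```

Such object arrays are useful when heterogeneous attribute values must travel together through NumPy-based storage code.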