Installing h5py

h5py is a Python module that lets you easily use the HDF5 format from Python. HDF5 is a powerful format which supports compression and parallel I/O. Installing h5py from source was not a piece of cake for me. I show here the easy and fast way. As a souvenir, I also show the manual way, which is best avoided.

The easy way

Use pip, a package manager for Python! The issues described below, such as a custom MPI path, can be solved with:

$ CFLAGS="-I/usr/lib/openmpi/include/" pip install h5py

If you have HDF5 in a non-standard location, for example in /HDF5_folder, use instead:

$ CFLAGS="-I/usr/lib/openmpi/include/ -I/HDF5_folder/include" LDFLAGS="-L/HDF5_folder/lib" pip install h5py
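
If you drive the installation from a script, the same flags can be assembled programmatically. The following is only a minimal sketch: the helper name pip_build_env is made up for this post, and the directories are the example paths used above. Note that it replaces any CFLAGS or LDFLAGS already set in the environment.

```python
import os
import shlex


def pip_build_env(include_dirs, lib_dirs=(), base_env=None):
    """Return a copy of the environment with CFLAGS/LDFLAGS set for the build.

    Hypothetical helper: it overwrites existing CFLAGS/LDFLAGS values.
    """
    env = dict(os.environ if base_env is None else base_env)
    # One -I flag per header directory (MPI, HDF5, ...).
    env["CFLAGS"] = " ".join("-I" + d for d in include_dirs)
    if lib_dirs:
        # One -L flag per library directory.
        env["LDFLAGS"] = " ".join("-L" + d for d in lib_dirs)
    return env


# Example paths from this post; base_env={} keeps the printed line short.
env = pip_build_env(
    ["/usr/lib/openmpi/include/", "/HDF5_folder/include"],
    ["/HDF5_folder/lib"],
    base_env={},
)
command = ["pip", "install", "h5py"]

# Print the equivalent shell line instead of running the installation.
prefix = " ".join(f"{key}={shlex.quote(value)}" for key, value in sorted(env.items()))
print(prefix, " ".join(command))
```

To actually run the installation, call pip_build_env without base_env (so that PATH and friends are kept) and pass the result to subprocess.run(command, env=env, check=True).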

To install HDF5, see this post.

The painful way

Okay, let’s download the source for h5py and compile it. HDF5 is already installed, so it should be as easy as python setup.py install, right?

$ python setup.py build
/usr/local/include/H5public.h:61:20: fatal error: mpi.h: 
No such file or directory  

It cannot find the MPI header files. No problem, we just need to update the compiler include flags. Turns out, it is not so easy:

$ python setup.py build --include-dirs=/usr/lib/openmpi/include/
error: option --include-dirs not recognized  

After searching the Internet for ten minutes, I finally find it: --include-dirs is an option of the build_ext command, not of build:

$ python setup.py build_ext --include-dirs=/usr/lib/openmpi/include/ 
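
The include directory above is the one from my machine. If you do not know where mpi.h lives, one option is to ask the MPI compiler wrapper. The sketch below assumes Open MPI, whose mpicc accepts --showme:compile (other MPI implementations use different options), and the function names are my own. The sample flag string is made up for illustration.

```python
import shlex
import subprocess


def include_dirs_from_flags(flags):
    """Extract the directories given by -I options in a compiler flag string."""
    dirs = []
    tokens = shlex.split(flags)
    for i, tok in enumerate(tokens):
        if tok == "-I" and i + 1 < len(tokens):
            dirs.append(tokens[i + 1])  # "-I dir" form
        elif tok.startswith("-I") and len(tok) > 2:
            dirs.append(tok[2:])  # "-Idir" form
    return dirs


def mpi_include_dirs(wrapper="mpicc"):
    """Ask the compiler wrapper for its compile flags (assumes Open MPI).

    Returns an empty list if the wrapper is missing or rejects the option.
    """
    try:
        result = subprocess.run(
            [wrapper, "--showme:compile"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return include_dirs_from_flags(result.stdout)


# Made-up flag string, similar to what a wrapper might print.
sample = "-I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi -pthread"
print(include_dirs_from_flags(sample))
```

Each directory returned by mpi_include_dirs() can then be passed to build_ext with --include-dirs.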

Some compilation later, the module fails to link:

/usr/bin/ld: /usr/local/lib/libhdf5.a(H5.o): relocation 
R_X86_64_32 against `.rodata.str1.1' can not be used when
making a shared object; recompile with -fPIC
libhdf5.a: could not read symbols: Bad value

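Now, what does this even mean? The linker is saying that the static library libhdf5.a was compiled without -fPIC, so it cannot be linked into a shared object such as a Python extension module. After looking into the installation folder, I realize that only the static libraries of HDF5 have been installed. Let’s try to reinstall HDF5 with both shared and static libraries. This should solve the issue: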

$ ./configure --enable-fortran --enable-parallel --prefix=/usr/local --enable-shared --enable-static

It is a very bad idea: the compiler spits a lot of not-so-nice messages at me. Now, let’s try to build only the shared version of HDF5:

$ ./configure --enable-fortran --enable-parallel --prefix=/usr/local --enable-shared 
$ make

It compiles! Now, we just remove the old files and install HDF5 properly. Even better, h5py is happy this time and installs itself properly with:

$ python setup.py build_ext --include-dirs=/usr/lib/openmpi/include/
$ python setup.py install
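
In hindsight, the linker error above could have been anticipated by looking at which libhdf5 files sit in the installation prefix. Here is a rough sketch of such a check. The function name is hypothetical and the demonstration uses a throwaway directory with made-up file names; on a real system you would pass something like /usr/local/lib.

```python
import os
import tempfile


def classify_hdf5_libs(lib_dir):
    """Split the libhdf5 files found in lib_dir into static and shared ones."""
    static, shared = [], []
    for name in sorted(os.listdir(lib_dir)):
        if not name.startswith("libhdf5"):
            continue  # not an HDF5 library
        if name.endswith(".a"):
            static.append(name)  # static archive
        elif ".so" in name:
            shared.append(name)  # shared object, possibly versioned
    return static, shared


# Demonstration on a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("libhdf5.a", "libhdf5.so", "libhdf5.so.8", "libz.so"):
        open(os.path.join(demo_dir, name), "w").close()
    static, shared = classify_hdf5_libs(demo_dir)

print("static:", static)
print("shared:", shared)
```

If the shared list comes back empty, h5py would have to link against a static archive, which only works when that archive was built with -fPIC.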

Finally, we can try loading the module in Python:

    >>> import h5py
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python2.7/dist-packages/h5py-2.3.0-py2.7-linux-x86_64.egg/h5py/__init__.py", line 10, in <module>
        from h5py import _errors
    ImportError: libhdf5.so.8: cannot open shared object file: No such file or directory

The dynamic loader does not find the library, so let us add its directory to LD_LIBRARY_PATH:

$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib 

And now it works! It was not that straightforward though.
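
To see why the export helps, here is a simplified sketch of the lookup: it walks a colon-separated list of directories and reports the first one containing the requested file. The real dynamic loader also consults its cache and default directories, so treat this as an illustration only. The function name is my own, and a temporary directory stands in for /usr/local/lib.

```python
import os
import tempfile


def find_on_library_path(soname, library_path):
    """Return the first directory of a colon-separated path that contains soname."""
    for directory in library_path.split(":"):
        # Empty entries are skipped in this simplified version.
        if directory and os.path.isfile(os.path.join(directory, soname)):
            return directory
    return None


# A temporary directory plays the role of /usr/local/lib.
with tempfile.TemporaryDirectory() as fake_lib:
    open(os.path.join(fake_lib, "libhdf5.so.8"), "w").close()
    # Before the export: the directory is not on the search path.
    before = find_on_library_path("libhdf5.so.8", "/nonexistent/dir")
    # After the export: the directory has been appended to the path.
    after = find_on_library_path("libhdf5.so.8", "/nonexistent/dir:" + fake_lib)
    found_in_fake_lib = after == fake_lib

print(before, found_in_fake_lib)
```

On a real machine you could call find_on_library_path("libhdf5.so.8", os.environ.get("LD_LIBRARY_PATH", "")) before starting Python.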

Another possible error

Installing it on a Fedora machine, I got:

    >>> import h5py
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "h5py/__init__.py", line 10, in <module>
        from h5py import _errors
    ImportError: cannot import name _errors

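This happens when you try to import h5py from the source directory: Python picks up the h5py folder of the source tree, which lacks the compiled extension modules, instead of the installed package. Change to another directory and try again.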
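
A quick way to check whether you are in this situation is to look for a package folder of the same name in the current directory. The sketch below uses a hypothetical helper and two temporary directories, one mimicking the h5py source tree and one mimicking any other location.

```python
import os
import tempfile


def shadowed_by_directory(package, directory):
    """True if `directory` holds a folder that would be imported as `package`."""
    return os.path.isfile(os.path.join(directory, package, "__init__.py"))


with tempfile.TemporaryDirectory() as source_tree, \
        tempfile.TemporaryDirectory() as elsewhere:
    # Mimic the layout of the source tree: h5py/__init__.py
    os.mkdir(os.path.join(source_tree, "h5py"))
    open(os.path.join(source_tree, "h5py", "__init__.py"), "w").close()
    in_source = shadowed_by_directory("h5py", source_tree)
    outside = shadowed_by_directory("h5py", elsewhere)

print(in_source, outside)
```

In practice, shadowed_by_directory("h5py", os.getcwd()) returning True is a hint that the interpreter will pick up the local folder first.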