I am trying to run MPI and CUDA code on a cluster. The code works fine on a single machine, but when I run it on the cluster I get this error:
error while loading shared libraries: libcudart.so.4: cannot open shared object file: No such file or directory
I checked my PATH and LD_LIBRARY_PATH and everything looks fine. My .bashrc file contains the following entries:
export PATH=$PATH:/usr/local/lib/:/usr/local/lib/openmpi:/usr/local/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib:/usr/local/lib/openmpi/:/usr/local/cuda/lib
All machines have the same installation of CUDA and OpenMPI.
I also have /usr/local/cuda/lib in /etc/ld.so.conf.
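For reference, here is a rough sketch of the kind of check that can be run on each node to see whether the library is reachable through LD_LIBRARY_PATH. The helper name find_lib is made up for illustration and is not part of CUDA or OpenMPI. It only looks at the directories listed in the variable and does not consult the ld.so cache.

```shell
# find_lib NAME PATHLIST (hypothetical helper): print the first directory in
# the colon-separated PATHLIST that contains a file called NAME.
# Returns 1 if no listed directory contains it.
find_lib() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $2; do
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Report what the current shell on this node sees.
find_lib libcudart.so.4 "${LD_LIBRARY_PATH:-}" ||
    echo "libcudart.so.4 not found via LD_LIBRARY_PATH on $(uname -n)"
```

Running this through a non-interactive ssh command on each node, rather than in a login session, may be more telling, since a remotely launched process does not necessarily get the same environment as an interactive shell. The ld.so cache can be inspected separately with ldconfig -p.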
Can anyone help me with this? This problem is really annoying.
Thanks.