ReplicaExchange: Could not find MPI. Using Serial Replica Exchange
Hi,
I'm trying to run the exosome example with MPI but can't seem to get it to work. I installed binaries both on Fedora 19 and Ubuntu Trusty. I have openmpi on both and mpich on Fedora as well. mpirun is in my path.
Do I need to build with MPI myself for this to work, or is this supposed to work with the binaries out of the box?
Thanks, Julian
On 08/30/2016 09:00 AM, Julian wrote:
> I'm trying to run the exosome example with MPI but can't seem to get it
> to work. I installed binaries both on Fedora 19 and Ubuntu Trusty. I
> have openmpi on both and mpich on Fedora as well. mpirun is in my path.
>
> Do I need to build with MPI myself for this to work or is this supposed
> to work with binaries out of the box?
You need the IMP.mpi module to use MPI (and you have to run IMP via mpirun). We don't include this in the regular binary package because we don't want to force everybody to install an MPI library just to use IMP, and there are multiple MPI libraries to choose from. But on Fedora you can simply install the IMP-mpich RPM which includes the IMP.mpi module built against MPICH, and then it should work. We don't currently include a similar package for Ubuntu, but it probably wouldn't be that hard to add one for future releases.
Ben
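As a rough illustration of the point above, the following Python sketch checks whether the IMP.mpi module can be imported in the current environment, which is one way to tell whether a given install is likely to fall back to serial replica exchange. The helper names `has_module` and `replica_exchange_mode` are hypothetical and are not part of IMP; the only assumption about IMP itself is that the MPI support lives in a module named `IMP.mpi`, as stated in the reply.

```python
import importlib


def has_module(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(name)
    except ImportError:
        # Also covers ModuleNotFoundError, e.g. when IMP itself is absent.
        return False
    return True


def replica_exchange_mode():
    """Hypothetical helper: report which mode this environment can support."""
    return "mpi" if has_module("IMP.mpi") else "serial"


if __name__ == "__main__":
    print("IMP.mpi importable:", has_module("IMP.mpi"))
    print("Expected replica exchange mode:", replica_exchange_mode())
```

Running this with the same Python interpreter (and the same setup_environment.sh wrapper, if one is used) as the modeling script shows whether that environment can see the module at all. It does not verify that the script was launched via mpirun.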
Thanks for the quick reply. I was doing what you described but no luck. I then compiled from source against openmpi, used the magic python import lines from the IMP.mpi docs and that turned out to be fairly straightforward. Everywhere along the way, things seemed to be working fine, no errors. But then when I ran it with
mpirun -np 10 ~/IMP/imp-2.6.2_release/setup_environment.sh python exosome.modeling.py --test
it doesn't appear to be any faster than a single process. I started it and then immediately started a single proc on the same 24-core box in a different dir
~/IMP/imp-2.6.2_release/setup_environment.sh python exosome.modeling.py --test
Looking at the number of frames it's outputting, it seems like the single proc is going faster. So it appears I'm getting all the downside from the MPI overhead without any of the benefits. Any ideas? Thanks much
On Tue, Aug 30, 2016 at 1:27 PM, Ben Webb ben@salilab.org wrote:
> You need the IMP.mpi module to use MPI (and you have to run IMP via
> mpirun). We don't include this in the regular binary package because we
> don't want to force everybody to install an MPI library just to use IMP,
> and there are multiple MPI libraries to choose from. But on Fedora you can
> simply install the IMP-mpich RPM which includes the IMP.mpi module built
> against MPICH, and then it should work. We don't currently include a
> similar package for Ubuntu, but it probably wouldn't be that hard to add
> one for future releases.
>
> Ben
> --
> ben@salilab.org https://salilab.org/~ben/
> "It is a capital mistake to theorize before one has data."
> - Sir Arthur Conan Doyle
> _______________________________________________
> IMP-users mailing list
> IMP-users@salilab.org
> https://salilab.org/mailman/listinfo/imp-users
On 8/31/16 9:35 AM, Julian wrote:
> But then when I ran it with
>
> mpirun -np 10 ~/IMP/imp-2.6.2_release/setup_environment.sh python exosome.modeling.py --test
>
> it doesn't appear to be any faster than a single process.
It won't be any faster than a single process. It's doing replica exchange, so rather than sampling N frames with a single replica, it samples N frames with 10 replicas. So you'll get 10x the sampling in the same amount of walltime. There's almost no communications overhead with replica exchange, so I'd be enormously surprised to see any slowdown.
Ben
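The arithmetic behind this reply can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each replica runs on its own core, every replica produces frames at the same rate, and communication overhead is treated as zero. The function name `sampling_summary` and the per-frame time of 2.0 seconds are made up for the example and are not measurements from the exosome script.

```python
def sampling_summary(n_replicas, frames_per_replica, seconds_per_frame):
    """Return (walltime_seconds, total_frames) for a replica exchange run.

    Assumes one core per replica and negligible communication overhead,
    so walltime depends only on the frames each replica has to produce.
    """
    walltime = frames_per_replica * seconds_per_frame
    total_frames = n_replicas * frames_per_replica
    return walltime, total_frames


if __name__ == "__main__":
    # Hypothetical numbers: 100 frames per replica at 2.0 s per frame.
    for n in (1, 10):
        walltime, total = sampling_summary(n, 100, 2.0)
        print(f"{n:2d} replica(s): walltime {walltime:.0f} s, {total} frames sampled")
```

Under these assumptions the walltime is identical for 1 and 10 replicas, while the total number of sampled frames grows tenfold, which is why counting frames written per unit time by a single replica does not show a speedup.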
Got it. Thanks for the help.
On Wed, Aug 31, 2016 at 1:34 PM, Ben Webb ben@salilab.org wrote:
> It won't be any faster than a single process. It's doing replica exchange,
> so rather than sampling N frames with a single replica, it samples N frames
> with 10 replicas. So you'll get 10x the sampling in the same amount of
> walltime. There's almost no communications overhead with replica exchange,
> so I'd be enormously surprised to see any slowdown.
>
> Ben
participants (2)
- Ben Webb
- Julian