Re: [modeller_usage] parallel modeller
I ran into the same problem. I searched the wiki but couldn't find a solution. From the description provided in the thread, it is not obvious how to proceed. Is there a working script that could be used as a starting point to run parallel jobs on a cluster using PBS?
I managed to submit jobs using independent modeller.py scripts that differ only in the start and end model parameters:

in file modeller01.py: start=1, end=1
in file modeller02.py: start=2, end=2

Each of these runs on a different node (jobs submitted with qsub), but the two resulting models, 1s58.B99990001.pdb and 1s58.B99990002.pdb, are exactly the same. This brings me to my second question: is there a way to specify the initial random seed used by each modeller script?
Thanks!
Starr Hazard wrote:
> The references to parallelization seem to point rather strongly to the SGE scheduler...
Not at all - the 'job' class is simply a bag of 'slave' objects. There is no requirement that you use any particular resource management system. For example, local_slave starts up a slave on the local machine (ideal if you have a multi-core machine). ssh_slave starts up a slave on a machine accessible by ssh, ideal if you have a cluster set up to allow passwordless ssh (or rsh) to individual nodes. The only slave classes which use SGE are sge_pe_slave and sge_qsub_slave. I wrote those because we happen to have an SGE cluster. But there's no reason why you couldn't write your own slaves to use PBS mechanisms.
> Can any of the commands, e.g.
> sge_qsub_job(options, maxslave, seq=(), modeller_path=None, host=None)
> work with the PBS scheduler?
Well, obviously not sge_qsub_job, as the name would suggest, since that is a convenience class for SGE. Just use the regular job base class. If you then have a traditional ssh-to-any-node setup, all you then need to do is loop over the nodes in your machine file and make an ssh_slave for each one. Alternatively, it would be pretty simple to write a class that used the PBS TM mechanism via something like mpiexec. I suggest you put the result into the Modeller wiki, so that other PBS users can use or modify it.
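The "loop over the nodes in your machine file" suggestion could be sketched roughly as follows. This is only an illustration under assumptions, not a tested PBS recipe: it assumes the node list comes from the file named by the PBS_NODEFILE environment variable, that passwordless ssh to each node works, and that the job and ssh_slave classes are importable from modeller.parallel under the names used in this thread. The helper names read_pbs_nodes and build_job are made up for the example.

```python
import os


def read_pbs_nodes(nodefile_path):
    """Return the unique host names listed in a node file, in file order.

    PBS usually lists a host once per allocated core, so duplicates are
    dropped here to get one entry per node.
    """
    nodes = []
    with open(nodefile_path) as fh:
        for line in fh:
            host = line.strip()
            if host and host not in nodes:
                nodes.append(host)
    return nodes


def build_job(nodes):
    """Create a Modeller job holding one ssh_slave per node.

    Assumption: class names as used in this thread (Modeller 9v5).
    """
    # Imported lazily so the node-file parsing works without Modeller.
    from modeller.parallel import job, ssh_slave
    j = job()
    for host in nodes:
        j.append(ssh_slave(host))
    return j


if __name__ == "__main__":
    # Assumption: PBS sets PBS_NODEFILE inside a running batch job.
    nodefile = os.environ.get("PBS_NODEFILE")
    if nodefile:
        nodes = read_pbs_nodes(nodefile)
        j = build_job(nodes)
        print("Started building a job over %d nodes" % len(nodes))
```

Dropping the duplicate check in read_pbs_nodes would give one slave per allocated core instead of one per node, which may be preferable on multi-core nodes.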
On Sun, Dec 7, 2008 at 12:53 PM, Mauricio Carrillo Tripp <trippm@scripps.edu> wrote:
> I ran into the same problem. I searched the wiki but couldn't find a solution. From the description
> provided in the thread is not obvious how to proceed. Is there a working script that could be used
> as a starting point to run parallel jobs in a cluster using PBS?
>
> I managed to submit jobs using independent modeller.py scripts differing in the start and end model
> parameter only:
> in file modeller01.py start=1 end=1
> in file modeller02.py start=2 end=2
> each one of these run in a different node (jobs submitted with qsub), but the two resulting models
> 1s58.B99990001.pdb and 1s58.B99990002.pdb are exactly the same, which brings me to my second question:
>
> Is there a way to specify the initial random seed used by each modeller script?
I was under the (wrong) impression that just using different values for automodel.starting_model and automodel.ending_model in two different modeller scripts (as described above) would produce two different final models. In fact, this just produces the exact same final model under a different name.
I found automodel.rand_method=randomize.xyz and automodel.rand_method=randomize.dihedrals (and I'm guessing automodel.rand_method=None is the default). If I add this declaration to the two modeller scripts from above, would that have the same effect as having only one script with automodel.starting_model=1 and automodel.ending_model=2?
--
Mauricio Carrillo Tripp, PhD
Department of Molecular Biology, TPC6
The Scripps Research Institute
10550 North Torrey Pines Road
La Jolla, California 92037
trippm@scripps.edu
http://www.scripps.edu/~trippm

** Aut tace aut loquere meliora silentio **
Mauricio Carrillo Tripp wrote:
> I ran into the same problem. I searched the wiki but couldn't find a
> solution. From the description provided in the thread is not obvious
> how to proceed. Is there a working script that could be used as a
> starting point to run parallel jobs in a cluster using PBS?
The solution is the same as before - we don't have a PBS cluster, so haven't written a startup mechanism for it (only for SGE). If you do have a PBS cluster and a little bit of Python knowledge, it should be straightforward for you (or others) to write such a PBS class using the existing SGE classes as an example. But nobody has done that yet.
> Is there a way to specify the initial random seed used by each
> modeller script?
Sure - set rand_seed when you create your environ object; see http://salilab.org/modeller/9v5/manual/node105.html
For example:

    env = environ(rand_seed=-2000)
Ben Webb, Modeller Caretaker
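For the one-script-per-model setup described earlier in the thread, one way to apply the rand_seed suggestion is to derive a different seed from each script's model index. The sketch below is an illustration only: the helper names seed_for_model and make_environ are made up, and the allowed seed range (-50000 to -2) is stated from memory of the Modeller manual, so please check it against the manual page linked above.

```python
def seed_for_model(index, base=-2000):
    """Map a 1-based model index to a distinct negative random seed.

    Assumption: Modeller expects rand_seed between -50000 and -2.
    """
    seed = base - index
    if not -50000 <= seed <= -2:
        raise ValueError("rand_seed out of assumed range: %d" % seed)
    return seed


def make_environ(index):
    """Create a Modeller environ whose seed depends on the model index."""
    # Imported lazily so the seed helper can be used without Modeller.
    from modeller import environ
    return environ(rand_seed=seed_for_model(index))
```

With this, modeller01.py would call make_environ(1) and modeller02.py would call make_environ(2), so the two runs start from different seeds (-2001 and -2002 with the default base).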