
Re: embarrassingly parallel modeller



Hello Dave,


Clustor is a relatively pricey job scheduling program and we do not use
it anymore. There are freeware programs for this purpose too.

The normal way to speed up the calculation of many models is to submit
them as separate jobs, either to separate computers or, if you have a
multiprocessor machine, as separate jobs on the same machine.
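
On a single multiprocessor machine, submitting the jobs separately could
be sketched roughly as below. This is only an illustration, not part of
MODELLER: run_jobs is a hypothetical helper, and the command that starts
MODELLER is passed in by the caller because its name depends on the
installed version.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_jobs(command, top_files, max_workers=4):
    """Run `command <top_file>` once per TOP file, several at a time.

    command     -- list of strings that starts the program (an assumption:
                   use whatever launches MODELLER on your system)
    top_files   -- TOP file names, one per model
    max_workers -- how many jobs may run at the same time

    Returns a dict mapping each TOP file name to the job's exit code.
    """
    top_files = [str(name) for name in top_files]

    def run_one(top_file):
        # Each job is an independent process; the jobs share nothing.
        return subprocess.run([*command, top_file]).returncode

    # Threads are enough here: they only wait for the child processes.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        codes = list(pool.map(run_one, top_files))
    return dict(zip(top_files, codes))
```

Setting max_workers to the number of processors keeps the machine busy
without oversubscribing it. A non-zero exit code in the returned dict
points at a job that is worth rerunning.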

In each top file you need to restrict the calculation to a single model,
for example:

SET STARTING_MODEL = 11
SET ENDING_MODEL = 11


(Note that
SET STARTING_MODEL = 11
SET ENDING_MODEL = 12
generates two models, 11 and 12, because the range is inclusive.)

You also need to change the random number seed for each run, otherwise
you will run identical optimizations and calculate identical models:

SET RAND_SEED = 'rand_seed'

where rand_seed stands for a number that you generate externally, a
different one for each run.

If you submit many top files, one for each model, you also need to give
each top file a different name. That way you do not need to worry about
overwriting the log files.
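
Generating the per-model top files and their seeds externally could look
something like the sketch below. Everything named here is an assumption
made for illustration: the @MODEL@ and @SEED@ markers, the
model_NNN.top naming scheme and the helper functions are hypothetical,
and the seed range of -50000 to -2 should be checked against the manual
for your MODELLER version.

```python
import random
from pathlib import Path

# Hypothetical placeholder markers to put into your own template TOP file.
MODEL_MARK = "@MODEL@"
SEED_MARK = "@SEED@"


def distinct_seeds(count, rng, low=-50000, high=-2):
    """Return `count` different integers between low and high (inclusive).

    The default range is an assumption; adjust it to what your
    MODELLER version accepts for RAND_SEED.
    """
    return rng.sample(range(low, high + 1), count)


def write_top_files(template, first, last, out_dir, rng=None):
    """Write one TOP file per model number in first..last (inclusive).

    Each file gets its own name and its own seed, so no two jobs
    repeat the same optimization or share a file name.
    """
    if rng is None:
        rng = random.Random()
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    numbers = range(first, last + 1)
    seeds = distinct_seeds(len(numbers), rng)

    paths = []
    for number, seed in zip(numbers, seeds):
        text = template.replace(MODEL_MARK, str(number))
        text = text.replace(SEED_MARK, str(seed))
        path = out_dir / f"model_{number:03d}.top"
        path.write_text(text)
        paths.append(path)
    return paths
```

For example, with a template whose relevant lines read
SET STARTING_MODEL = @MODEL@, SET ENDING_MODEL = @MODEL@ and
SET RAND_SEED = @SEED@, calling
write_top_files(template, 1, 20, "top_files") writes model_001.top
through model_020.top, each restricted to one model.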


Andras


"David E. Konerding" wrote:
> 
> Hi, while reading the documentation for modeller I noticed a reference
> to a script called 'run_clustor' but I don't think that it comes with
> modeller (I assume this is some internal lab script for a cluster).
> However, this got me to thinking: in most of my usage of modeller, all
> I do is say:
> 
> SET STARTING_MODEL = 1
> SET ENDING_MODEL = 20
> 
> and wait for a while for the model to be built.
> 
> I wonder, would I get the "same" results (or at least similar, due to
> differences in random number generators) if I ran 20 scripts:
> 
> SET STARTING_MODEL = 1
> SET ENDING_MODEL = 2
> 
> SET STARTING_MODEL = 2
> SET ENDING_MODEL = 3
> 
> My worry is mainly that if the jobs are run in the same directory, the
> log files will all overwrite each other.
> 
> Dave