Identical output for different batches
Dear MODELLER,
In order to generate hundreds of PDBs, I submit multiple jobs to the cluster. The input template is the same for all the jobs. Each job (i.e. each batch) generates 10 PDBs. Within a batch, the 10 PDBs have different "Objective Function" values in the PDB remark. But PDBs at the same position in different batches have the same "Objective Function".
For example,
1) the 3rd PDB in Batch 1 has the same "Objective Function" as the 3rd PDB in Batch 5
2) the 6th PDB in Batch 2 has the same "Objective Function" as the 6th PDB in Batch 10
So it seems that all the jobs are started from the same "seed". Is there a way to randomise the seed?
Thank you!
Cheng
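The comparison described above can be automated. Below is a minimal sketch that groups model files from several batch directories by their objective function value and reports any value that appears more than once. It assumes the remark line contains the text "OBJECTIVE FUNCTION:" followed by the value, and that each batch lives in its own directory of *.pdb files; the function names and the directory layout are hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed remark format, e.g.
# "REMARK   6 MODELLER OBJECTIVE FUNCTION:      1234.5678"
OBJFUNC_RE = re.compile(r"OBJECTIVE FUNCTION:\s*(\S+)")


def objective_function(pdb_path):
    """Return the objective function value (as text) from a PDB header, or None."""
    with open(pdb_path) as handle:
        for line in handle:
            if line.startswith("ATOM"):
                break  # header is over, no remark found
            match = OBJFUNC_RE.search(line)
            if match:
                return match.group(1)
    return None


def find_shared_scores(batch_dirs):
    """Map each objective function value seen in more than one file to those files."""
    by_score = defaultdict(list)
    for batch in batch_dirs:
        for pdb in sorted(Path(batch).glob("*.pdb")):
            score = objective_function(pdb)
            if score is not None:
                by_score[score].append(str(pdb))
    return {score: files for score, files in by_score.items() if len(files) > 1}
```

Values are compared as text rather than as floats, since runs started from the same seed are expected to write exactly the same number. An empty result from find_shared_scores suggests the batches did not repeat each other.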
The script I use is:
from modeller import *
from modeller.automodel import *    # Load the automodel class

log.verbose()

class MyModel(automodel):
    def special_patches(self, aln):
        # Rename both chains and renumber the residues in each
        self.rename_segments(segment_ids=['A', 'B', 'C', 'D', 'E', 'F',
                                          'G', 'H', 'I', 'J', 'K', 'L',
                                          'N', 'O', 'M', 'P', 'Q', 'R',
                                          'S', 'U', 'V', '1', '2', '3',
                                          '4', '5', '6', '7', '8', '9'],
                             renumber_residues=[1, 1, 1, 1, 1, 1,
                                                1, 1, 1, 1, 1, 1,
                                                1, 1, 1, 1, 1, 1,
                                                1, 1, 1, 1, 1, 1,
                                                1, 1, 1, 1, 1, 1])
        # Another way to label individual chains:
        self.chains[0].name = 'A'
        self.chains[1].name = 'B'
        self.chains[2].name = 'C'
        self.chains[3].name = 'D'
        self.chains[4].name = 'E'
        self.chains[5].name = 'F'
        self.chains[6].name = 'G'
        self.chains[7].name = 'H'
        self.chains[8].name = 'I'
        self.chains[9].name = 'J'
        self.chains[10].name = 'K'
        self.chains[11].name = 'L'
        self.chains[12].name = 'N'
        self.chains[13].name = 'O'
        self.chains[14].name = 'M'
        self.chains[15].name = 'P'
        self.chains[16].name = 'Q'
        self.chains[17].name = 'R'
        self.chains[18].name = 'S'
        self.chains[19].name = 'U'
        self.chains[20].name = 'V'
        self.chains[21].name = '1'
        self.chains[22].name = '2'
        self.chains[23].name = '3'
        self.chains[24].name = '4'
        self.chains[25].name = '5'
        self.chains[26].name = '6'
        self.chains[27].name = '7'
        self.chains[28].name = '8'
        self.chains[29].name = '9'

env = environ()
# directories for input atom files
env.io.atom_files_directory = ['.', '../atom_files']

# Be sure to use 'MyModel' rather than 'automodel' here!
a = MyModel(env,
            alnfile='../../1_raw/JN254802-5tx1.ali',  # alignment filename
            knowns='5tx1',                            # codes of the templates
            sequence='JN254802')                      # code of the target

a.starting_model = 1                # index of the first model
a.ending_model = 10                 # index of the last model
                                    # (determines how many models to calculate)
a.make()                            # do comparative modeling
On 6/5/20 7:00 PM, ZHANG Cheng via modeller_usage wrote:
> So it seems that all the jobs are started from the same "seed". Is there a
> way to randomise the seed?
Of course - set rand_seed when you create your environ() object: https://salilab.org/modeller/9.24/manual/node117.html
Ben Webb, Modeller Caretaker
Dear Ben,
Thank you very much! Does it mean that I need to change "env = environ()" to e.g. "env = environ(rand_seed=-81)" or "env = environ(rand_seed=-775)", i.e. any value between -2 and -50000?
That is doable. But I have 100 jobs to run. In order to have the same code for different jobs, can I do it like this?
random_integer = random.randint(-50000, -2)
env = environ(rand_seed=random_integer)
Thank you!
Yours sincerely Cheng
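For the random.randint idea above, the chance that at least two jobs draw the same seed can be estimated with the usual birthday-problem product. The sketch below assumes each of the 100 jobs draws independently and uniformly from the integers -50000 to -2; the function name is hypothetical.

```python
def collision_probability(n_jobs, n_seeds):
    """Probability that at least two of n_jobs independent uniform draws coincide."""
    p_all_distinct = 1.0
    for i in range(n_jobs):
        # The (i+1)-th job must avoid the i seeds already taken.
        p_all_distinct *= (n_seeds - i) / n_seeds
    return 1.0 - p_all_distinct


# random.randint(-50000, -2) includes both endpoints.
n_seeds = len(range(-50000, -1))

print(f"{n_seeds} possible seeds")
print(f"P(at least one repeated seed among 100 jobs) = "
      f"{collision_probability(100, n_seeds):.3f}")
```

Under these assumptions the probability comes out at roughly 9% for 100 jobs, and it grows quickly as more jobs are added.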
------------------ Original ------------------ From: "Modeller Caretaker"<modeller-care@salilab.org>; Date: Sat, Jun 6, 2020 12:19 PM To: "ZHANG Cheng"<272699575@qq.com>;"modeller_usage"<modeller_usage@salilab.org>;
Subject: Re: [modeller_usage] Identical output for different batches
On 6/6/20 3:00 AM, ZHANG Cheng wrote:
> Thank you very much! Does it mean that I need to change "env =
> environ()" to e.g. "env = environ(rand_seed=-81)" or "env =
> environ(rand_seed=-775)", i.e. any value between -2 and -50000?
Yes.
> That is doable. But I have 100 jobs to run. In order to have the same
> code for different jobs, can I do it like this?
>
> random_integer = random.randint(-50000, -2)
> env = environ(rand_seed=random_integer)
I wouldn't recommend this, as there's a small but non-zero chance of multiple jobs getting the same seed. Normally we use an environment variable that is guaranteed to be different for each run. For example, if you run your jobs as tasks with something based on Sun Grid Engine (SGE), this environment variable would be $SGE_TASK_ID.
Ben Webb, Modeller Caretaker
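The environment-variable approach can be sketched as follows. This is only an illustration under stated assumptions: the jobs run as an SGE array job so that SGE_TASK_ID holds a distinct positive integer per task, and the mapping from task ID to seed (task 1 gives -2, task 2 gives -3, and so on) is just one possible choice that stays inside the -2 to -50000 range discussed in this thread. The helper names are hypothetical.

```python
import os

SEED_MAX = -2        # range of rand_seed values discussed in this thread
SEED_MIN = -50000


def seed_from_task_id(task_id):
    """Map a 1-based task ID to a distinct seed between -2 and -50000."""
    task_id = int(task_id)
    seed = SEED_MAX - (task_id - 1)   # task 1 -> -2, task 2 -> -3, ...
    if not SEED_MIN <= seed <= SEED_MAX:
        raise ValueError("task ID %d gives a seed outside %d..%d"
                         % (task_id, SEED_MIN, SEED_MAX))
    return seed


def seed_for_this_job(env_vars=os.environ):
    """Read SGE_TASK_ID from the environment and convert it to a seed."""
    return seed_from_task_id(env_vars["SGE_TASK_ID"])


# In the modelling script, the only change would then be:
#     env = environ(rand_seed=seed_for_this_job())
```

Because the mapping is deterministic, every task gets a different seed and any single batch can be reproduced later by rerunning it with the same task ID.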