Hi all, I have a class called Mutant whose constructor generates a random FASTA sequence.
A class method *writeToPDB()* uses modeller to push the FASTA into a template PDB. This is already working very well.
If I build 1000 mutants one after the other, this takes an eternity... so I thought of using a ThreadPoolExecutor.
Modeller runs inside the class method. Concurrent calls should not share memory with other instances of Mutant().
    mutants = []  # Holds a list of Mutant objects
    executor = ThreadPoolExecutor(max_workers=20)  # Thread executor
    for i in range(1000):
        mutants.append(Mutant())
        executor.submit(mutants[i].writeToPDB, templatePDB)
This simply tells the executor to queue a call to writeToPDB() for each mutant, to be run on one of its worker threads.
When I try to do this, it seems that modeller is being executed in the first call and sharing its state with all subsequent calls... so I get like 90 models (not 1000) written in whatever state modeller was when running the other instances.
How can I force a class method to use its own modeller instance?
This is the method:
    env = environ()
    env.libs.topology.read(file='$(LIB)/top_heav.lib')
    env.libs.parameters.read(file='$(LIB)/par.lib')

    aln = alignment(env)
    mdl = model(env, file=code)
    aln.append_model(mdl, atom_files=code, align_codes=code)
    residue = self.fasta[resid-Mutant.compensate]
    sel.mutate(residue_type=Mutant.res1to3[residue])  # mutate

    aln.append_model(mdl, align_codes='mut')
    mdl.clear_topology()
    mdl.generate_topology(aln['mut'])
    mdl.transfer_xyz(aln)
    mdl.build(initialize_xyz=False, build_method='INTERNAL_COORDINATES')

    name = str(self.pdbpath)+str(self.id)+".pdb"
    mdl.write(file=name)
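
A side note on the executor.submit() loop in the message above: submit() returns a Future, and an exception raised inside the submitted call is stored on that Future rather than printed, so discarding the Futures can hide failed calls. Whether that explains the missing models here is not known. The sketch below only shows one way to collect every outcome, and run_all is a hypothetical helper name, not part of Modeller or of the original code.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_all(tasks, max_workers=20):
    """Run zero-argument callables in a thread pool and report every outcome.

    Returns (results, errors); errors maps a task's index to its exception.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Keep each Future so that a failure inside a task is not lost.
        futures = {executor.submit(task): i for i, task in enumerate(tasks)}
        for future in as_completed(futures):
            i = futures[future]
            try:
                results[i] = future.result()  # re-raises the task's exception
            except Exception as exc:
                errors[i] = exc
    return results, errors
```

With the Mutant class from the message, the tasks could be built as `[lambda m=m: m.writeToPDB(templatePDB) for m in mutants]`, and any entry in `errors` would point at a call that did not finish.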
On 5/7/20 10:16 AM, Pedro Guillem wrote:
> When I try to do this, it seems that modeller is being executed in the
> first call and sharing its state with all subsequent calls... so I get
> like 90 models (not 1000) written in whatever state modeller was when
> running the other instances.
What do you mean by "state" in this case?
> How can I force a class method to use its own modeller instance?.
If you want a separate copy of Modeller, you'd need to use processes rather than threads. But I don't know why you'd want to do that. Building a model from sequence is pretty fast so you'd probably lose more in startup overhead if you do that. Most likely the most expensive part is reading top_heav.lib and par.lib, so I would just do that once and then use the same environ object for every sequence.
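
The suggestion to read the libraries once and reuse the same environ object could be sketched roughly as follows. This is only an illustration under assumptions: make_shared_env and write_all are hypothetical helper names, and writeToPDB is assumed to be changed so that it accepts the environ object as a second argument instead of calling environ() itself. The environ() and env.libs calls are the ones from the original method.

```python
def make_shared_env():
    """Create one Modeller environment and read the libraries a single time."""
    # Imported lazily; this function needs a working Modeller installation.
    from modeller import environ
    env = environ()
    env.libs.topology.read(file='$(LIB)/top_heav.lib')
    env.libs.parameters.read(file='$(LIB)/par.lib')
    return env


def write_all(mutants, template_pdb, env):
    """Build every mutant one after the other, reusing the same env.

    Assumes a hypothetical writeToPDB(template_pdb, env) signature that uses
    the env passed in rather than creating its own.
    """
    for mutant in mutants:
        mutant.writeToPDB(template_pdb, env)
    return len(mutants)
```

Usage would then be `env = make_shared_env()` followed by `write_all(mutants, templatePDB, env)`, with no thread pool involved.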
Ben Webb, Modeller Caretaker