Hello,
I just wanted to ask a few questions...I am modelling loops on a GPCR. My current script follows my questions.
1. Will I get better sampling by running 20 different runs (with different random number seeds) that create 25 models each, for a total of 500 models?
2. Would setting dynamic_coulomb = True and relative_dielectric = 80 be the best way to simulate an aqueous environment for the loops?
3. Is it possible to run the loop optimization with explicit hydrogens? I guess my env.io.hydrogen = True does not actually set up the calculation to do explicit hydrogens but only tells Modeller to read the hydrogens from my pdb file.
4. For the refinement level, I am using md_level = refine.very_slow. For what size of system is refine.slow_large used?
5. Would I get better sampling by setting repeat_optimization greater than 1?
Thank you in advance.
Judy Norris
==========================================================================
# Loop modelling of an existing model with the loopmodel class
from modeller.automodel import *    # Load the automodel class

log.verbose()
env = environ(rand_seed=-32601)
env.io.atom_files_directory = './:../atom_files'
env.edat.dynamic_coulomb = True
env.edat.relative_dielectric = 80
env.io.hydrogen = True

class myloop(loopmodel):
    def select_loop_atoms(self):
        # The first segment initializes the selection; later ones are added
        stat = 'INITIALIZE'
        for segs in (('1:', '15:'), ('49:', '54:'), ('84:', '89:'),
                     ('124:', '133:'), ('226:', '233:')):
            self.pick_atoms(selection_segment=segs,
                            selection_search='segment',
                            pick_atoms_set=1, res_types='all',
                            atom_types='all', selection_from='all',
                            selection_status=stat)
            stat = 'ADD'

m = myloop(env,
           inimodel = 'protein_lp01.B99990001.pdb', # initial model
           sequence = 'protein_01')                 # code of the target
m.loop.starting_model = 1           # index of the first loop model
m.loop.ending_model = 25            # index of the last loop model
                                    # (determines how many models to calculate)
m.loop.md_level = refine.very_slow  # thorough MD refinement of the loops

m.make()                            # do loop modelling
=====================================================================
Judy Barnett-Norris wrote:
> I just wanted to ask a few questions...I am modelling loops on a GPCR.
> My current script follows my questions.
> 1. Will I get better sampling by running 20 different runs (with
> different random number seeds) that create 25 models each thus a total
> of 500 models?
This is probably much the same as using a single random seed and just building 500 models in one run. In difficult modeling cases sampling becomes the limiting factor, so the more models you build, the more likely you are to sample the 'true' best model.
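As a rough illustration of why the total number of models matters more than how they are split across runs: suppose each model is an independent draw with some small probability p of landing near the native loop conformation. Both p and the independence are assumptions made here purely for illustration, not properties of Modeller. The chance of getting at least one such model among N is then 1 - (1 - p)^N. A minimal Python sketch:

```python
def chance_of_at_least_one_hit(p_hit, n_models):
    """Probability that at least one of n_models independent models is a 'hit'."""
    return 1.0 - (1.0 - p_hit) ** n_models

# p_hit = 0.01 is a made-up per-model success rate, purely for illustration.
p_hit = 0.01
for n_models in (25, 100, 500):
    print(n_models, round(chance_of_at_least_one_hit(p_hit, n_models), 3))
```

Under this simple picture only the total count enters, so 20 runs of 25 models and one run of 500 models come out the same. In the script above, a single run of 500 would just mean m.loop.ending_model = 500.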
> 2. Would setting dynamic_coulomb = True and a relative_dielectric = 80
> be the best way to simulate an aqueous environment for the loops?
No. The loop modeling potential is a statistical potential, and thus implicitly includes solvation. It is not parameterized to work well with electrostatics, to the best of my knowledge.
> 3. Is it possible to run the loop optimization with explicit
> hydrogens? I guess my env.io.hydrogen = True does not actually set up
> the calculation to do explicit hydrogens
> but only tells Modeller to read the hydrogens from my pdb file.
Correct. To build models with hydrogens, you'd need to load the top_allh.lib topology library rather than top_heav.lib (the default). The loop modeling potential is not parameterized for hydrogens though, so this probably wouldn't work well anyway. You'd probably be better off adding hydrogens to your models after optimization is complete.
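A minimal sketch of adding hydrogens to a finished loop model, assuming the complete_pdb helper from modeller.scripts is available in your Modeller version and that it builds the missing hydrogens once the all-hydrogen topology is loaded. The function name add_hydrogens and the file names are hypothetical; substitute your own loop model files:

```python
def add_hydrogens(in_pdb, out_pdb):
    """Rebuild a finished heavy-atom model with the all-hydrogen topology."""
    # Imported here so the function can be defined without Modeller loaded.
    from modeller import environ
    from modeller.scripts import complete_pdb

    env = environ()
    env.io.hydrogen = True   # keep hydrogens when reading and writing
    # All-hydrogen topology instead of the default top_heav.lib
    env.libs.topology.read(file='$(LIB)/top_allh.lib')
    env.libs.parameters.read(file='$(LIB)/par.lib')

    # Missing atoms (here, the hydrogens) are built from the topology.
    mdl = complete_pdb(env, in_pdb)
    mdl.write(file=out_pdb)
    return out_pdb

# Hypothetical usage on one optimized loop model:
# add_hydrogens('my_loop_model.pdb', 'my_loop_model_allh.pdb')
```

This only places the hydrogens; it does not re-optimize the loops with them.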
> 4. For the refinement level, I am using md_level = refine.very_slow.
> For what size of system is refine.slow_large used?
slow_large uses a larger timestep (10fs rather than 4fs) so you are likely to see more integrator error if you use this instead of very_slow.
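To see why a larger timestep means more integrator error, here is a toy example: a unit harmonic oscillator integrated with velocity Verlet at two step sizes in the same 4:10 ratio. The oscillator, the dimensionless units, and the choice of velocity Verlet are all assumptions for illustration; this is not a reproduction of Modeller's own MD code.

```python
def max_energy_drift(dt, total_time=20.0):
    """Integrate a unit harmonic oscillator (m = k = 1) with velocity Verlet
    and return the largest deviation of the total energy from its start value."""
    x, v = 1.0, 0.0                    # initial position and velocity
    e0 = 0.5 * v * v + 0.5 * x * x     # starting total energy
    worst = 0.0
    for _ in range(int(round(total_time / dt))):
        v_half = v - 0.5 * dt * x      # half-step velocity update, force = -x
        x = x + dt * v_half            # full-step position update
        v = v_half - 0.5 * dt * x      # second half-step using the new force
        e = 0.5 * v * v + 0.5 * x * x
        worst = max(worst, abs(e - e0))
    return worst

small = max_energy_drift(dt=0.10)
large = max_energy_drift(dt=0.25)      # same 4:10 ratio as 4fs vs 10fs
print(small, large)
```

For this integrator the energy error grows roughly with the square of the timestep, so the larger step gives a noticeably larger deviation over the same simulated time.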
> 5. Would I get better sampling by setting repeat_optimization greater
> than 1?
This just repeats the optimization several times, so it might help to avoid local minima.
Ben Webb, Modeller Caretaker