Dear Modellers,
I've read previous posts on the same topic and concluded that it is better
to generate multiple models at a moderate refinement and loop optimization
level, rather than a few with very thorough parameterization. I've also
noticed myself that with thorough parameterization, parts of the
secondary structure get distorted.
After a lot of experimentation I have settled on the optimum alignment,
and would now like to set up a very effective optimization process. However,
I'm not sure about the output files. My code looks like this:
a = MyLoopModel(env, alnfile=alignment,
                knowns=known_templates,
                assess_methods=(assess.DOPEHR, assess.normalized_dope),
                sequence='target')
a.starting_model = 1
a.ending_model = 2
# Normal VTFM model optimization:
a.library_schedule = autosched.normal
a.max_var_iterations = 200        # 200 by default
# Very thorough MD model optimization:
a.md_level = refine.slow
a.repeat_optimization = 1

a.loop.starting_model = 1         # First loop model
a.loop.ending_model = 5           # Last loop model
a.loop.md_level = refine.slow     # Loop model refinement level

which generates the following PDB files:

target.B99990001.pdb  target.B99990002.pdb  target.BL00040002.pdb
target.IL00000001.pdb  target.IL00000002.pdb

I thought the above should perform model refinement twice and, for each
model, write 5 loop-optimized conformations. So my questions are
the following:
1) Can you explain what's happening with the .pdb files?
2) I'd like to ask your opinion about the most effective way to find a
near-native protein conformation at low sequence identity. How should
the parameters shown above be set? I don't care if it runs for a day or so,
as long as I get good results.
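For reference, this is the most thorough parameterization I could put together
from the manual for question 2. The particular choices (slowest schedule, the
iteration count, the molpdf cutoff) are my own guesses at "maximal" settings,
not something I'm sure is right:

```python
# A (possibly over-)thorough setup I considered -- unsure if it actually helps:
a.library_schedule = autosched.slow    # slowest VTFM schedule
a.max_var_iterations = 300             # more optimization iterations
a.md_level = refine.very_slow          # slowest MD refinement
a.repeat_optimization = 2              # repeat the whole optimization cycle
a.max_molpdf = 1e6                     # discard models whose molpdf blows up

a.loop.md_level = refine.very_slow     # slowest loop refinement
```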
3) I also attempted to cluster the models with a.cluster(cluster_cut=1.5),
which generated a representative structure containing the parts of the protein
that remained similar in most of the models, but without the variable parts
(files cluster.ini and cluster.opt). Does it make sense to select the model
that is closest to that consensus structure? If yes, is there a way to do it
with Modeller? I know it can be found with the Maxcluster program. Or,
alternatively, do you reckon it is better to select the best model based on
the normalized DOPE z-score?
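In case it helps to show what I mean in (3), this is roughly how I pick a
model by z-score after a.make(). It's only a sketch: pick_best is my own
helper, and the 'Normalized DOPE score' / 'failure' key names are assumptions
based on what I see in my runs of a.outputs:

```python
# Sketch: choose the model with the lowest (most negative) normalized DOPE.
# 'models' stands in for a.outputs (or a.loop.outputs for loop models);
# the dict keys are assumptions from my own runs.
def pick_best(outputs, key='Normalized DOPE score'):
    ok = [m for m in outputs if m.get('failure') is None]  # skip failed models
    return min(ok, key=lambda m: m[key])

# Mock data in the shape Modeller reports:
models = [
    {'name': 'target.B99990001.pdb', 'failure': None,
     'Normalized DOPE score': -0.87},
    {'name': 'target.B99990002.pdb', 'failure': None,
     'Normalized DOPE score': -1.12},
]
best = pick_best(models)
print(best['name'])
```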
Hope to get some answers on these questions, because I've been struggling to
find the best refinement/optimization protocol for several weeks.
thanks,
Thomas