
[modeller_usage] structure refinement and loop optimization protocol



Dear Modellers,

I've read previous posts on the same topic and concluded that it is better to generate many models with a moderate refinement and loop optimization level, rather than a few with very thorough parameterization. I've also noticed myself that, with thorough parameterization, parts of the secondary structure get distorted.

After a lot of experimentation I have settled on an optimal alignment, and I would now like to set up a very effective optimization process. However, I'm not sure about the output files. My code looks like this:

            from modeller import *
            from modeller.automodel import *   # automodel/loopmodel classes

            a = MyLoopModel(env, alnfile=alignment,
                            knowns=known_templates,
                            assess_methods=(assess.DOPEHR,
                                            assess.normalized_dope),
                            sequence='target')
            a.starting_model = 1
            a.ending_model = 2
            # Normal VTFM model optimization:
            a.library_schedule = autosched.normal
            a.max_var_iterations = 200           # 200 is the default
            # Thorough MD model optimization:
            a.md_level = refine.slow
            a.repeat_optimization = 1

            a.loop.starting_model = 1            # first loop model
            a.loop.ending_model   = 5            # last loop model per base model
            a.loop.md_level       = refine.slow  # loop refinement level

            a.make()
 

This generates the following PDB files:

target.B99990001.pdb  target.B99990002.pdb  target.BL00040002.pdb  target.IL00000001.pdb  target.IL00000002.pdb
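As far as I understand the naming convention (my assumption from reading other posts, not verified against the docs), base models are written as target.B9999&lt;model&gt;.pdb and loop models as target.BL&lt;loop&gt;&lt;model&gt;.pdb, so I was expecting the following set of files:

```python
# Sketch of the file names I expected, assuming base models are named
# target.B9999<NNNN>.pdb and loop models target.BL<LLLL><NNNN>.pdb
# (loop model L built on base model N; both numbers zero-padded to 4 digits).
base_models = range(1, 3)   # a.starting_model .. a.ending_model
loop_models = range(1, 6)   # a.loop.starting_model .. a.loop.ending_model

expected = ["target.B9999%04d.pdb" % n for n in base_models]
expected += ["target.BL%04d%04d.pdb" % (l, n)
             for n in base_models for l in loop_models]
print(expected)   # 2 base models + 10 loop models = 12 files
```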


I thought the above would build 2 models and write 5 different loop conformations (loop optimization) for each, i.e. 10 loop model files in total. So my questions are the following:

1) Can you explain what's happening with the .pdb files? I got only one BL file instead of the 10 I expected.

2) I'd like to ask your opinion about the most effective way to find a near-native protein conformation at low sequence identity. How should the parameters shown above be set? I don't mind if it runs for a day or so, as long as I get good results.
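For what it's worth, these are the more thorough settings I've been considering (an untested guess on my part; this is exactly what I'm asking about):

```python
# Thorough settings I'm considering (untested guess -- please correct me):
a.library_schedule = autosched.slow    # slower VTFM schedule
a.max_var_iterations = 500             # more variable-target iterations
a.md_level = refine.very_slow          # most thorough MD refinement
a.repeat_optimization = 2              # repeat the whole optimization cycle
a.loop.md_level = refine.very_slow     # thorough loop refinement
```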

3) I also attempted to cluster the models with a.cluster(cluster_cut=1.5), which generated a representative structure containing the parts of the protein that remained similar in most of the models, but without the variable parts (files cluster.ini and cluster.opt). Does it make sense to select the model that is closest to that consensus structure? If so, is there a way to do it with Modeller? I know it can be found with the MaxCluster program. Or, alternatively, do you reckon it is better to select the best model based on its normalized DOPE z-score?
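Regarding the z-score option, this is roughly how I plan to pick the best model from the assessment scores (a sketch; the key name 'Normalized DOPE score' in a.outputs is my assumption, and the numbers below are made up):

```python
# Sketch: pick the best model from automodel/loopmodel outputs by
# normalized DOPE z-score (lower is better). The score key name
# 'Normalized DOPE score' is my assumption about what Modeller puts
# in a.outputs; adjust if the actual key differs.
def best_model(outputs, key='Normalized DOPE score'):
    ok = [m for m in outputs if m['failure'] is None]  # skip failed models
    return min(ok, key=lambda m: m[key])

# Usage with made-up numbers (not real results):
fake_outputs = [
    {'name': 'target.BL00010001.pdb', 'failure': None,
     'Normalized DOPE score': -0.8},
    {'name': 'target.BL00020001.pdb', 'failure': None,
     'Normalized DOPE score': -1.3},
    {'name': 'target.BL00030001.pdb', 'failure': 'optimization failed',
     'Normalized DOPE score': None},
]
print(best_model(fake_outputs)['name'])
```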
 
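And to make the consensus idea concrete: outside Modeller, I imagine superposing each model's CA coordinates onto the cluster.opt consensus and taking the lowest RMSD. A numpy sketch (the PDB parsing that produces the (N, 3) CA arrays, restricted to the residues present in the consensus, is omitted):

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate arrays after optimal
    superposition (Kabsch algorithm); both must cover the same atoms."""
    P = P - P.mean(axis=0)                # center both coordinate sets
    Q = Q - Q.mean(axis=0)
    V, S, Wt = np.linalg.svd(P.T @ Q)     # SVD of the covariance matrix
    d = np.sign(np.linalg.det(V @ Wt))    # guard against reflections
    R = V @ np.diag([1.0, 1.0, d]) @ Wt   # optimal rotation: P @ R ~= Q
    diff = P @ R - Q
    return np.sqrt((diff ** 2).sum() / len(P))

def closest_to_consensus(coords_by_model, consensus):
    """Name of the model whose CA coordinates best fit the consensus."""
    return min(coords_by_model,
               key=lambda name: kabsch_rmsd(coords_by_model[name], consensus))
```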

I hope to get some answers to these questions, because I've been struggling to find the best refinement/optimization protocol for several weeks.

thanks,
Thomas