Dear Modeller creators and users,
I wonder which of the scoring functions available in Modeller is preferable for choosing the best models.
A similar question was answered back in 2006: https://salilab.org/archives/modeller_usage/2006/msg00214.html
but it concerned a very different use case (a template of very low identity), and the answer omits the relation to molpdf. The Modeller documentation on the DOPE score and the 2006 DOPE paper say that it was designed to choose the best model. However, this value is not printed by default, nor does it appear in the detailed evaluation of local model problems presented in the tables of restraint violations.
The fact that molpdf alone is selected as the default score of model quality makes DOPE look a little suspicious. I mean, it looks as if there were a reason to hide it. In my case, my models are all good enough to have GA341 scores equal to 1, which makes this value useless. Still, molpdf and DOPE do not correlate, so which one should I consider more important for model choice?
I see from experience that molpdf is rather sensitive to the choice of templates (e.g. the same protein template from X-ray vs. NMR can give very different molpdf values for the models, with many more restraint violations for NMR). Also, when using multiple templates for the same sequence, the molpdf values are closer to those obtained with the single worst template than to those with the best one, while the DOPE values do not change much. These observations count against relying on molpdf; but if molpdf is misleading, then the entire verbose analysis of restraint violations from which molpdf is calculated must be equally misleading.
Could you share your insight and comments?
With regards,
Paweł Kędzierski
On 2/9/19 5:50 AM, Pawel Kedzierski wrote:
> I wonder, which of the scoring functions available in Modeller is
> preferable to choose best models.
There isn't really a debate here - the answer is DOPE or SOAP.
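In practice this means ranking the generated models by their DOPE (or SOAP) assessment rather than by molpdf. As a minimal sketch: when an AutoModel run is configured with `a.assess_methods = (assess.DOPE, assess.GA341)`, each entry of `a.outputs` is a dict holding the model name and its scores. The helper below, using illustrative sample data in that shape (the numbers are made up, not real Modeller output), picks the successful model with the lowest (best) DOPE score:

```python
# Sample data mimicking the shape of AutoModel's a.outputs after a run
# with assess.DOPE enabled. Values are illustrative only; note that the
# lowest-molpdf model and the lowest-DOPE model need not coincide.
outputs = [
    {'name': 'model.B99990001.pdb', 'failure': None,
     'molpdf': 1830.5, 'DOPE score': -11230.7},
    {'name': 'model.B99990002.pdb', 'failure': None,
     'molpdf': 1902.8, 'DOPE score': -11412.3},
    {'name': 'model.B99990003.pdb', 'failure': None,
     'molpdf': 1795.2, 'DOPE score': -11108.9},
]

def best_by_dope(outputs):
    """Return the successful model with the lowest (best) DOPE score."""
    ok = [m for m in outputs if m['failure'] is None]
    return min(ok, key=lambda m: m['DOPE score'])

best = best_by_dope(outputs)
print(best['name'])  # the lowest-DOPE model, not the lowest-molpdf one
```

Here the model with the best molpdf (B99990003) is not the one with the best DOPE (B99990002), which is exactly the situation described in the question; the DOPE ranking is the one to trust for model selection.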
> The fact that molpdf only is selected as the default score of model
> quality makes DOPE at little suspicious.
molpdf is the Modeller scoring function, so it is always output - we can't build models without it. Modeller computes no assessment functions by default. Note that an assessment function and a scoring function are not the same thing, and are designed to solve different problems (assessment shows how like a "normal protein" your model is; scoring shows how like the template it is).
Restraint violations are an indication of problems with your inputs (the alignment, typically). You shouldn't use them to rank models.
Ben Webb, Modeller Caretaker