Dear all,
In the output for salign, there is a raw quality score, a percentage quality score, and many other values. What value should I focus on to make sure that I have a good alignment? Is there a cutoff that I should use? If so, are there references that I could cite?
Thanks,
Brian
On 7/25/11 10:10 PM, Brian Tsui wrote:
> In the output for salign, there is a raw quality score, a percentage
> quality score, and many other values. What value should I focus on to
> make sure that I have a good alignment? Is there a cutoff that I should
> use? If so, are there references that I could cite?
For a structural alignment, there is an objective measure of the "best" alignment - that which maximizes the percentage of residues within the cutoff distance. This is the "quality score", which is roughly similar to the GDT_TS score; see Protein Eng Des Sel 22, 569-574, 2009. Modeller includes an "iterative structural alignment" method which maximizes this score.
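As an illustration of what such a score measures, here is a minimal sketch that computes the percentage of aligned positions lying within a cutoff distance. It is not Modeller's implementation: the function name, the use of plain coordinate tuples, and the choice of denominator (the number of aligned pairs passed in) are all assumptions made for the example, and the structures are assumed to be superposed already.

```python
import math


def percent_within_cutoff(coords_a, coords_b, cutoff=3.5):
    """Percentage of aligned position pairs whose distance is <= cutoff.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples for
    aligned positions (for example CA atoms) of two superposed structures.
    Assumption: the denominator is the number of aligned pairs supplied;
    Modeller's own normalization may differ.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        return 0.0
    close = sum(1 for a, b in zip(coords_a, coords_b)
                if math.dist(a, b) <= cutoff)
    return 100.0 * close / len(coords_a)


# Four aligned positions with distances of 1, 3, 4 and 10 Angstroms.
a = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
b = [(0, 0, 1), (1, 0, 3), (2, 0, 4), (3, 0, 10)]
print(percent_within_cutoff(a, b, cutoff=3.5))
```

With a 3.5 A cutoff, two of the four pairs count as equivalent, so the sketch reports 50%.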
For a sequence alignment, you have a tougher job - it depends what you want to use the alignment for. For comparative modeling, one approach is to score the quality of models built using that alignment. See Nucleic Acids Res 31, 3982-3992, 2003.
Ben Webb, Modeller Caretaker
In the Madhusudhan paper, the cutoff was at least 70% for a SO of 3.5 A. What number does that correspond to in the salign output?
For the quality score, is there an objective number that I can use for every alignment to see whether it is good or not?
Lastly, for "iterative salign," is it recommended to run the regular version first or just jump straight to using the iterative salign? The Madhusudhan paper seems to forgo the use of iterative salign if the alignment is 70% at a SO of 3.5, but if there were enough computing resources, would it be wise to run iterative salign without running the regular version first?
I know it's a lot of questions, but thanks in advance for answering them.
--Brian
________________________________
From: Modeller Caretaker <modeller-care@salilab.org>
To: Brian Tsui <btsui17@yahoo.com>
Cc: "modeller_usage@salilab.org" <modeller_usage@salilab.org>
Sent: Tuesday, July 26, 2011 10:05 AM
Subject: Re: [modeller_usage] Good Quality Score
On 7/26/11 8:33 PM, Brian Tsui wrote:
> In the Madhusudhan paper, the cutoff was at least 70% for a SO of 3.5 A.
Do you mean the SO was 70% for a cutoff of 3.5A?
> What number does that correspond to in the salign output?
The percentage of equivalent (i.e., within the cutoff) positions is the qscorepct member of the SalignData object returned by salign(). Set the cutoff with the rms_cutoff argument, and make sure that the output argument includes QUALITY.
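A minimal sketch of reading that value follows. The wrapper function and its name are hypothetical; only salign(), rms_cutoff, output, QUALITY and qscorepct come from the description above. It assumes aln is a Modeller alignment object that already contains the structures to compare, and that any other salign() arguments you need (feature weights, gap penalties and so on) are passed through unchanged.

```python
def salign_percent_quality(aln, rms_cutoff=3.5, **salign_options):
    """Run aln.salign() and return the percentage quality score.

    aln is assumed to be a Modeller alignment already populated with
    structures. Extra keyword arguments are forwarded to salign() as-is.
    """
    # Make sure QUALITY is requested, whatever the caller asked for.
    output = salign_options.pop('output', 'ALIGNMENT QUALITY')
    if 'QUALITY' not in output.split():
        output += ' QUALITY'
    data = aln.salign(rms_cutoff=rms_cutoff, output=output,
                      **salign_options)
    # qscorepct: percentage of positions within rms_cutoff.
    return data.qscorepct
```

The returned number can then be compared against whatever threshold you settle on, for example the 70% at 3.5 A discussed in this thread.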
> For the quality score, is there an objective number that I can use for
> every alignment to see whether it is good or not?
As I said, only for structure alignments.
> Lastly, for "iterative salign," is it recommended to run the regular
> version first or just jump straight to using the iterative salign? The
> Madhusudhan paper seems to forgo the use of iterative salign if the
> alignment is 70% at a SO of 3.5
I'm assuming you're looking at fig 1 in the paper here. There are three iterative procedures within that flowchart; one of them is skipped if the first procedure (simpler alignment using a subset of features) found a reasonable starting guess for the last procedure (structural alignment). The second procedure is simply an attempt to improve this guess if the first failed.
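The control flow described above can be sketched as follows. This is only an outline of the skip logic, not the published protocol or Modeller code: the four callables and the function name are hypothetical placeholders, and the 70% threshold is taken from the figure quoted earlier in this thread as the test for a "reasonable" starting guess.

```python
def align_with_fallback(initial_alignment, refine_guess,
                        structural_alignment, quality_of,
                        threshold=70.0):
    """Three-stage flow in which the middle stage is optional.

    initial_alignment():     simpler alignment using a subset of features
    refine_guess(guess):     attempt to improve a poor starting guess
    structural_alignment(g): final structural alignment from the guess
    quality_of(guess):       percentage quality score of a guess
    All four are hypothetical callables supplied by the caller.
    """
    steps = ['initial']
    guess = initial_alignment()
    if quality_of(guess) < threshold:
        # The first procedure failed to give a reasonable guess,
        # so try to improve it before the structural alignment.
        guess = refine_guess(guess)
        steps.append('refine')
    final = structural_alignment(guess)
    steps.append('structural')
    return final, steps
```

The returned steps list simply records which procedures ran, which makes it easy to see when the middle one was skipped.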
Ben Webb, Modeller Caretaker