Next: Model building Up: Comparative protein modeling primer Previous: Becoming familiar with the   Contents   Index


Selecting the templates

The new, improved alignment is input to the ID_TABLE or COMPARE_SEQUENCES commands to construct a matrix of pairwise sequence distances. This matrix is then used either to prepare an 'evolutionary' tree for the whole family or to cluster the proteins by the principal components technique available through the PRINCIPAL_COMPONENTS command of MODELLER. For evolutionary trees, the DENDROGRAM command of MODELLER or the PHYLIP program written by Joe Felsenstein can be used (you can get PHYLIP by anonymous FTP from evolution.genetics.washington.edu/pub/phylip) [Felsenstein, 1985].

The clustering is then examined to decide which known structures are suitable templates for model building in the next stage. Usually, all significantly different structures in the cluster that contains the target sequence are used. It is not always best to use all related 3D structures as templates, because the objective function may become too rugged, which sometimes results in sub-optimal solutions (six templates, for example, is already a large number). Nor does it make sense to include two relatively similar templates, one solved at high resolution and the other at low resolution; use only the high-resolution template. Depending on the modeling problem at hand, other factors can be considered in the selection of templates, such as ligands bound to the template and/or target, or whether the template structure was solved in solution or in a crystal.

More experienced users can try a smaller number of templates for mainchain distance restraints and a larger number of templates for sidechain conformation, but that involves editing the TOP scripts for comparative modeling. Templates can also be very short, such as loops from unrelated protein structures that fit on the given framework regions; for example, canonical loops [Chothia & Lesk, 1987] could be used as templates in modeling a complementarity determining region of an immunoglobulin. However, MODELLER's ab initio loop modeling facility is the preferred way to model loops (Section 3.3).
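The selection logic described above can be illustrated with a minimal sketch in plain Python. It does not use MODELLER or reproduce the output of ID_TABLE, COMPARE_SEQUENCES, or DENDROGRAM. It assumes that the distance is 100 minus the percent identity over aligned, ungapped columns, that clustering is simple single linkage at a fixed cutoff, and that two templates closer than a second cutoff count as 'relatively similar'. The function names, both cutoffs, the sequences, and the resolutions are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def percent_distance(a, b, gap="-"):
    """100 minus percent identity over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 100.0
    mismatches = sum(1 for x, y in pairs if x != y)
    return 100.0 * mismatches / len(pairs)


def distance_matrix(alignment):
    """Symmetric pairwise distance table as a dict of dicts, keyed by sequence name."""
    names = list(alignment)
    dist = {n: {n: 0.0} for n in names}
    for p, q in combinations(names, 2):
        dist[p][q] = dist[q][p] = percent_distance(alignment[p], alignment[q])
    return dist


def single_linkage_clusters(dist, cutoff):
    """Merge any two sequences whose distance is <= cutoff (connected components)."""
    parent = {n: n for n in dist}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for p, q in combinations(list(dist), 2):
        if dist[p][q] <= cutoff:
            parent[find(p)] = find(q)
    clusters = {}
    for n in dist:
        clusters.setdefault(find(n), set()).add(n)
    return list(clusters.values())


def select_templates(dist, target, resolution, cluster_cutoff, redundancy_cutoff):
    """Pick templates from the target's cluster, keeping only the best-resolved
    member of any group of near-identical structures."""
    clusters = single_linkage_clusters(dist, cluster_cutoff)
    cluster = next(c for c in clusters if target in c)
    # Smaller resolution values (in angstroms) are better, so they are tried first.
    candidates = sorted(cluster - {target}, key=lambda n: resolution[n])
    chosen = []
    for name in candidates:
        if all(dist[name][kept] > redundancy_cutoff for kept in chosen):
            chosen.append(name)
    return chosen


# Hypothetical toy alignment: one target and four known structures.
alignment = {
    "target": "ACDEFGHIKL",
    "tmplA": "ACDEFGHIKV",
    "tmplB": "ACDEFGHIKV",  # same sequence as tmplA, solved at lower resolution
    "tmplC": "ACDQFGHLKL",
    "tmplD": "WWWWWWWWWW",  # unrelated, falls outside the target's cluster
}
resolution = {"tmplA": 1.8, "tmplB": 2.9, "tmplC": 2.2, "tmplD": 1.5}

dist = distance_matrix(alignment)
chosen = select_templates(dist, "target", resolution,
                          cluster_cutoff=40.0, redundancy_cutoff=5.0)
print(chosen)
```

With this toy data, tmplD is excluded because it does not cluster with the target, and tmplB is dropped in favor of the better-resolved tmplA. In practice the cutoffs would be chosen by inspecting the tree or clustering, and factors such as bound ligands would still need to be weighed by hand.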


Ben Webb 2004-04-20