In principle, MODELLER "copies" the
coordinates of the atoms in the template onto the corresponding residues
(as defined by the alignment) of the target sequence. If multiple
templates are used, it tries to find a compromise between them. As you
have noticed, the amino-acid sequence of your target protein has little
in common with the amino-acid sequences of your templates, so it is
plausible to expect that the native structure of your enzyme will
deviate from the templates.
The following paragraph is taken from the review with PMID
16510277 (its reference 14 is A. Kryshtafovych, C. Venclovas, K. Fidelis and J. Moult, Progress over the first decade of CASP experiments, Proteins 61 (2005) (suppl 7), pp. 225–236):
Two important factors influence the ability to predict accurate models:
the extent of structural conservation between target and template, and
the correctness of alignment [4, 14]. Models based on templates with
more than 50% sequence identity are generally very accurate and can
exhibit 1 Å Cα atom rmsd from the experimental structure. Proteins with 30–50%
sequence identity share at least 80% of their structures; the best CASP
models within this range usually do not exceed 4 Å rmsd (typically
2–3 Å) from the native structure, with errors located mainly in loop
regions. Structural conservation can be as low as 55% for proteins that
display 20–30% sequence identity or even lower when sequence identity
drops below 20%. Whereas alignments are most often near optimal for
targets with more than 30% sequence identity to template structures
(easy targets), below this threshold (mainly difficult targets),
alignment quality sharply decreases and even as many as half of all
residues may be misaligned when sequence identity is less than 20% [14].
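To make the thresholds in the quoted paragraph easier to apply, here is a minimal sketch that maps a percent sequence identity to the expectation the review gives for that range. The function name `expected_accuracy` is hypothetical, and the handling of the exact boundaries (30% and 50% are placed in the 30–50% band, 20% in the 20–30% band) is my assumption; the review does not specify it.

```python
def expected_accuracy(identity_percent):
    """Return the quoted review's rough expectation for a % sequence identity.

    Boundary handling is an assumption, not something the review specifies.
    """
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity must be between 0 and 100")
    if identity_percent > 50.0:
        return "generally very accurate; Calpha rmsd can be about 1 A"
    if identity_percent >= 30.0:
        return ("at least 80% of structure shared; best models typically "
                "2-3 A rmsd, errors mainly in loops")
    if identity_percent >= 20.0:
        return ("structural conservation can be as low as 55%; "
                "alignment quality drops sharply")
    return ("conservation may be lower still; "
            "up to half of residues may be misaligned")


# Illustrative identities only, not taken from any real target
for identity in (65, 40, 25, 15):
    print(identity, "->", expected_accuracy(identity))
```

These are rules of thumb from CASP results, so treat the output as a prior on model quality rather than a measurement of it.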
Sequence Related:
The multiple alignment was only used for the phylogenetic analysis,
whereas the pairwise alignments were used as a starting point to create
the models. This procedure is conceptually wrong, since the alignment
created by PSI-BLAST is not necessarily the best one. In fact, PSI-BLAST
makes a pairwise alignment to find the most similar sequences, not to
find the best alignment between those sequences. Moreover, with such low
sequence similarity (less than 30%), the most reliable way to obtain a
good starting alignment is to build a multiple alignment and, where
possible, to also use predictions of the positions of secondary structure
elements, etc., in order to improve the quality of the alignment as much
as possible (see refs. to the CASP competitions). The alignment between
template(s) and target should then be extracted from the multiple
sequence alignment.
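The last step, pulling the target/template alignment out of a multiple alignment, can be sketched as follows. This is only an illustration under assumptions: the multiple alignment is held as a plain dict of equal-length gapped strings, the sequences are made up, and `pairwise_from_msa` is a hypothetical helper, not a MODELLER function. Columns where both selected rows have a gap are dropped, since they carry no information for the pair.

```python
def pairwise_from_msa(msa, target_id, template_id, gap="-"):
    """Project two rows out of a multiple alignment, dropping all-gap columns."""
    target_row = msa[target_id]
    template_row = msa[template_id]
    if len(target_row) != len(template_row):
        raise ValueError("aligned rows must have the same length")
    # Keep every column in which at least one of the two rows has a residue
    kept = [(a, b) for a, b in zip(target_row, template_row)
            if not (a == gap and b == gap)]
    target_aln = "".join(a for a, _ in kept)
    template_aln = "".join(b for _, b in kept)
    return target_aln, template_aln


# Toy alignment with made-up sequences
msa = {
    "target":   "MK-LV--AGT",
    "template": "MRPLV--A-T",
    "homolog1": "MKPLVGSAGT",
}
target_aln, template_aln = pairwise_from_msa(msa, "target", "template")
print(target_aln)    # MK-LVAGT
print(template_aln)  # MRPLVA-T
```

The resulting pair of gapped strings would still need to be written in whatever alignment format your modelling program expects.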
Homology Definition:
The term "homology" simply indicates the presence of a common evolutionary
origin between two biological entities: therefore, two proteins are
either homologous or not. Somebody said that talking about "X% of homology"
would be more or less the same as talking about a woman who is "X%
pregnant". Instead, it is correct to say that two proteins have X% of
their amino acids identical.
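For completeness, percent identity can be computed from a pairwise alignment roughly like this. Note that several denominators are in common use; this sketch assumes identity is counted over columns where both sequences have a residue, which is my choice and not something stated above. The function name `percent_identity` and the example sequences are made up.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical residues over columns where neither row is a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which either sequence has a gap (assumed convention)
    aligned = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    identical = sum(1 for a, b in aligned if a == b)
    return 100.0 * identical / len(aligned)


# Made-up aligned pair: 6 residue-residue columns, 5 of them identical
print(round(percent_identity("MK-LVAGT", "MRPLVA-T"), 1))  # 83.3
```

Whichever denominator you pick, report it together with the number, because the same alignment can give noticeably different percentages.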
Energy Minimization:
The authors do not have an energy-minimized structure, which is a must
for carrying out interaction analyses; skipping this step leads to
incorrect interpretations. They have assumed that the program MODELLER
provides a properly energy-minimized structure with the 'automodel'
environment. However, published studies that use MODELLER in such an
environment still energy-minimize the structure and only then use it
further.