Zoe Katsimitsoulia wrote:
> 1. How will modeller handle the 10 template sequences (which themselves
> show a significant amount of variation at certain key positions). More
> specifically, a particular area of my seq alignment may read like this:
>
> template 1 xxxxxGxxxxxxxx
> template 2 xxxxxTxxxxxxxx
> template 3 xxxxxGxxxxxxxx
> template 4 xxxxxGxxxxxxxx
> template 5 xxxxxGxxxxxxxx
> template 6 xxxxxGxxxxxxxx
> template 7 xxxxxGxxxxxxxx
> target     xxxxxTxxxxxxxx
>
> In simple terms, would modeller in this case use the coord of T in the
> template 2 sequence because it matches the target at that location, or
> will it use G because it is predominant in the majority of the
> structures?

Check the Modeller papers. Modeller doesn't use the coordinates
directly, but other properties of the templates, and it uses a weighted sum over all templates. For instance, the Ca-Ca distances of the target would in this case be modeled by a sum of gaussians, where the peak
positions correspond to the observed distances in the templates and the weights to the template weights. The templates are weighted by local sequence similarity, which would probably favor the 'T' sequences in
this case (but I can't tell for sure because the neighboring residues are considered too, which you haven't shown). You should look at the .rsr file that Modeller produces to see which restraints it's using (although it's not that easy to read).
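
To make the "sum of gaussians" idea concrete, here is a minimal Python sketch of a single Ca-Ca distance restraint built from three templates. It is only an illustration of the principle described above, not Modeller's actual code: the distances, the template weights, the width sigma and the function names are all made up for the example, and the real functional form and weighting scheme are described in the Modeller papers.

```python
import math

def mixture_pdf(d, peaks, weights, sigma):
    """Weighted sum of Gaussians evaluated at distance d (in angstroms).

    peaks   -- distances observed in the templates (one per template)
    weights -- template weights (normalised here so they sum to 1)
    sigma   -- common standard deviation (an assumption of this sketch)
    """
    total_w = sum(weights)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return sum(
        (w / total_w) * norm * math.exp(-0.5 * ((d - mu) / sigma) ** 2)
        for mu, w in zip(peaks, weights)
    )

def restraint_penalty(d, peaks, weights, sigma):
    """Negative log of the mixture density: lower means better satisfied."""
    # The tiny floor avoids log(0) when d is very far from every peak.
    return -math.log(max(mixture_pdf(d, peaks, weights, sigma), 1e-300))

# Hypothetical numbers: template 2 (the 'T' template) gets the largest
# weight because it is assumed to be locally most similar to the target.
peaks = [5.2, 6.1, 5.3]      # observed Ca-Ca distances in templates 1, 2, 3
weights = [0.2, 0.6, 0.2]    # made-up local-similarity weights
sigma = 0.3

for d in (5.2, 5.7, 6.1):
    print(f"d = {d:.1f} A  penalty = {restraint_penalty(d, peaks, weights, sigma):.3f}")
```

With these made-up weights the penalty is lowest near 6.1 A, the distance seen in the most heavily weighted template, but the other templates still contribute their own peaks. That is the sense in which no single template's coordinates are copied.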

> 2. The 10 template structures I am using are themselves in different
> states, for example, partially closed conformation, open, transition,
> inhibitor bound, etc. Is this beneficial to building my model or more
> detrimental? I am assuming the former based on the logic that the more
> info I am giving to modeller, the better my model should be.
> However, perhaps there is something about the way modeller works which I
> have not grasped that means its actually doing more harm adding the
> structures in different states?

The target will be constrained to look as much like a weighted sum of the templates as possible, so you probably want to have the templates in
the same state as your desired target state.

I would like to add my own limited experience here (please correct me if I am wrong): with a large number of templates there is a large number of restraints, which the optimizer may find difficult to satisfy all at once, so the objective function value may end up high, which is not desirable.

Also, I have observed that results are better when the templates are structurally similar to one another and nearly identical in sequence to the target. Probably, in that case, the best of the templates is chosen for modelling each part of the sequence.