forwarded by the list owner
--------------------------------------------------------
Hi,
For question 1, I think it is normal and expected that a model, even one built on a template with 100% sequence identity, will differ somewhat from the experimental structure. The difference should not, however, go beyond about 0.5 Å, and certainly not beyond 1.0 Å RMSD.
This is within the "experimental error": if the same protein is solved experimentally in different crystal forms, at different resolutions, or at high resolution once by X-ray and once by NMR, you will still see a difference of up to roughly 1 Å RMSD among the structures. So there is nothing unusual about your model not being exactly identical to the experimental one. For a reference, see Figure 6 (and the accompanying text) in chapter 7 (pp. 167-206) of Protein Structure: Determination, Analysis, and Applications for Drug Discovery, ed. D.I. Chasman, Marcel Dekker, 2003.
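To make the RMSD figure concrete, here is a minimal sketch of how a Cα RMSD between a model and an experimental structure could be computed. It assumes the two structures have already been superposed and that the atoms are listed in the same residue order; the function name `ca_rmsd` and the coordinates are made up for illustration and do not come from any real structure.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Return the RMSD (in Angstroms) between two lists of (x, y, z) tuples.

    Assumption: both structures are already superposed and the atoms are
    in the same order. No fitting is performed here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Invented C-alpha coordinates, for illustration only.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
xray = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 0.5, 0.0)]

rmsd = ca_rmsd(model, xray)
print(f"Ca RMSD: {rmsd:.2f} A")
```

A value under about 0.5 Å from such a comparison would fall inside the range described above. In practice the superposition step matters, and a structure-analysis package would normally do the fitting before the RMSD is reported.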
Question 2: the survey you did is very interesting and useful. Unfortunately it is difficult to generalize, because in each modeling case the set of available templates (their sequence identity to the target and their structural variability relative to each other) is different. However, your "assay" is nearly exhaustive within your specific experiment, so you are certainly in a position to make a point. Of course, the best check would be to verify the best "assay" settings against actual experimental structures rather than with PROCHECK or other programs: for example, re-model your protein A without the 100% identical template and explore the same question you did for protein B. In that case you can compare the resulting models with the actual X-ray structure.
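As a rough sketch of that check, suppose each combination of template count and refinement level yields a set of models, and each model's RMSD to the X-ray structure has been measured. The combination with the lowest mean RMSD can then be picked as below. The function name `best_setting`, the dictionary layout, and all numbers are hypothetical and are not results from the original experiment.

```python
from statistics import mean


def best_setting(rmsd_by_setting):
    """Return the setting with the lowest mean RMSD, plus all the means.

    rmsd_by_setting maps (n_templates, refinement_level) to a list of
    RMSD values, one per model built with that setting.
    """
    averages = {
        setting: mean(values) for setting, values in rmsd_by_setting.items()
    }
    best = min(averages, key=averages.get)
    return best, averages


# Invented RMSD values (in Angstroms), for illustration only.
rmsd_by_setting = {
    (1, 1): [1.3, 1.5, 1.4],
    (2, 1): [0.7, 0.9, 0.8],
    (4, 2): [1.0, 1.2, 1.1],
}

best, averages = best_setting(rmsd_by_setting)
print("lowest mean RMSD at (templates, refinement) =", best)
```

Averaging over many models per setting, as in the original 50-model runs, keeps one lucky or unlucky model from deciding the outcome. The spread within each setting is also worth reporting alongside the mean.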
Andras
On Mon, 2003-05-12 at 15:52, Douglas Kojetin wrote:
> please see the message, originally directed towards dr. sali, below.
>
> if anyone has any comments, please send them!
>
> many thanks,
> doug kojetin
>
> Begin forwarded message:
>
> > Dr. Sali:
> >
> > I am a graduate student in the Department of Molecular and Structural
> > Biochemistry at North Carolina State University. I have a question
> > more about modeling process itself rather than the program MODELLER.
> >
> > I have used your program, MODELLER, to create models of a subfamily of
> > proteins our lab and collaborators are interested in (total ~ 30).
> > There are approximately 10 solved structures to the domain of
> > interest. One of these solved structures (structure A) is in the same
> > subfamily within the same species of proteins we are modeling (model
> > A), whereas the other 29 proteins are of unknown solved structure. My
> > question concerning the use of templates in the modeling process.
> >
> > ##############
> > my main question
> > ##############
> >
> > (if this is confusing, please let me know and i will rephrase) ...
> >
> > Would using a solved structure (structure A) to model a protein of
> > exact sequence (model A) which will be used in a comparison of 29
> > other structures with no known structures (and lower 'homology'
> > compared to that of structure A to model A -- which is 100%) bias
> > model A? Overall, we are interested in comparing all 30 structures.
> > This comes mostly from outside comments that our modeled protein does
> > not look 'exactly' like the solved structure. As one would like it to
> > look as close as possible to the solved structure, it is a model after
> > all, and perhaps we just need to be more descriptive in explaining our
> > results, especially pertaining to this specific model.
> >
> > #####################
> > how i modeled the proteins
> > #####################
> >
> > I performed a 'modeling parameter assay' to find the number of
> > templates to use to model a protein (model B), ranging from 1 to ~8
> > templates. In addition, I 'assayed' the amount of refinement to use.
> >
> > Overall, I had an assay 'shaped' like a matrix with, for example,
> > refinement across the top and # of templates going down. I produced 50
> > models for each and ran a variety of analyses on the models (including
> > Ca RMSD to the most homologous protein, ERRAT, PROCHECK, etc) and
> > computed the average 'value' output from the respective analyses.
> >
> > All in all, using four (4) templates and a refinement value of 1
> > produced the 'stereochemically best' models.
> >
> > I applied the same rationale to another protein of interest (model C),
> > and the same trends were extrapolated.
> >
> > question
> > --> is this rationale 'acceptable'? or how would you do something
> > similar?
> >
> > Many thanks for your input, and I'm sorry for the long-winded email.
> >
> > Douglas Kojetin