Dr. Sali:
I am a graduate student in the Department of Molecular and Structural
Biochemistry at North Carolina State University. My question is
more about the modeling process itself than about the program MODELLER.
I have used your program, MODELLER, to create models of a subfamily of
proteins that our lab and collaborators are interested in (~30 in total).
There are approximately 10 solved structures for the domain of
interest. One of these solved structures (structure A) belongs to the
same subfamily and the same species as the proteins we are modeling
(model A), whereas the other 29 proteins have no solved structure. My
question concerns the use of templates in the modeling process.
##############
my main question
##############
(if this is confusing, please let me know and I will rephrase) ...
Would using a solved structure (structure A) as a template for a
protein of identical sequence (model A) bias model A? Model A will be
compared with models of 29 other proteins that have no known structure
and whose 'homology' to the available templates is lower than the 100%
between structure A and model A. Overall, we are interested in
comparing all 30 structures.
This question arises mostly from outside comments that our modeled
protein does not look 'exactly' like the solved structure. Although one
would like it to look as close as possible to the solved structure, it
is a model after all, and perhaps we just need to be more descriptive in
explaining our results, especially those pertaining to this specific
model.
#####################
how I modeled the proteins
#####################
I performed a 'modeling parameter assay' to find the number of
templates to use to model a protein (model B), ranging from 1 to ~8
templates. In addition, I 'assayed' the amount of refinement to use.
Overall, the assay was 'shaped' like a matrix with, for example,
refinement across the top and number of templates going down. I
produced 50 models for each combination and ran a variety of analyses
on the models (including Cα RMSD to the most homologous protein, ERRAT,
PROCHECK, etc.) and computed the average 'value' output by each
analysis.
All in all, using four (4) templates and a refinement value of 1
produced the 'stereochemically best' models.
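
As a rough illustration of the averaging step described above, here is
a minimal Python sketch. It is not taken from MODELLER or from any of
the analysis programs mentioned; the function names, the dictionary
layout, and every number in it are hypothetical placeholders, and it
assumes a score where lower is better (for example, Cα RMSD).

```python
from statistics import mean


def summarize_assay(scores):
    """Average the per-model scores in each cell of the assay matrix.

    `scores` maps (n_templates, refinement) -> list of scores, one per
    generated model. The layout is an assumption for this sketch.
    """
    return {cell: mean(values) for cell, values in scores.items()}


def best_cell(averages):
    """Return the (n_templates, refinement) cell with the lowest average."""
    return min(averages, key=averages.get)


# Made-up placeholder scores (three per cell instead of 50), used only
# to show the shape of the calculation; they are not real results.
example_scores = {
    (1, 1): [2.0, 2.2, 2.4],
    (4, 1): [1.0, 1.2, 1.1],
    (4, 2): [1.5, 1.3, 1.6],
    (8, 1): [1.8, 1.9, 2.0],
}

averages = summarize_assay(example_scores)
best = best_cell(averages)
print("best (templates, refinement):", best)
```

For a score where higher is better, `max` would replace `min` in
`best_cell`. Comparing the spread of each cell, and not only its mean,
would show whether the differences between cells are meaningful.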
I applied the same rationale to another protein of interest (model C),
and the same trends were observed.
question
--> Is this rationale 'acceptable'? If not, how would you approach
something similar?
Many thanks for your input, and I'm sorry for the long-winded email.
Douglas Kojetin