strategies for model refinement

Irene Newhouse

5 Jan 2011

5:24 p.m.

I'm a relatively new user of Modeller & successfully built a fairly nice model of a protein with 258 residues. It forms a homodimer, and has to be modeled as a dimer, because there's a very flexible loop that can easily adopt conformations that prevent dimerization if the 2nd unit isn't present. [I confirmed that by trying it].

Someone on this group suggested the Molprobity server as a way to assess model quality, for which I'm very grateful. He also suggested relaxing the structure with Rosetta, & after epic battles with Python setup & my operating system, I got it up & running, only to learn from the RosettaCommons forum that Rosetta relax doesn't scale well with chain length & isn't useful for proteins with >200 residues.

I managed to get the Molprobity score down to a respectable 2.-something using lots & lots of iterations of Schrodinger Prime loop refinement & side chain minimization. It took about a week of setting up a loop, running it, checking to see if it refined, iterating 'til I could move on, etc. So I'm looking for something a bit less hands-on, if possible. Also, Schrodinger is expensive & my boss may not renew the licenses when this lot expires in March. And finally, Molprobity flagged several C-beta issues in my structure, & Schrodinger just doesn't seem to do much about those.

Therefore, I'm looking for other suggestions, preferably freeware, for model refinement. I have tried some loop refinement with Modeller itself, but wasn't thrilled with the outcome. This could easily be because I couldn't find sample scripts to copy, or an explanation of how to tweak the parameters if the loop doesn't seem to be refining. If anyone can point me in those directions, I'd be grateful, too.

Thanks so much!


Thomas Evangelidis

5 Jan

6:40 p.m.

I am aware that PyRosetta doesn't scale well with chain length, but I'm not sure what you mean by "isn't useful for proteins with >200 residues". Is this the message on the RosettaCommons forum where you got this information?

http://www.rosettacommons.org/node/2240

I use PyRosetta to relax proteins up to 850 aa long. Each job takes 5-6 days on a 1.8 GHz processor, but the results are still impressive. IMO a good model wrt torsion angles, bond lengths and steric clashes is worth the time. For your protein length I would run 10 jobs and select the best model among them. In my experience more than 10 would be unnecessary. Otherwise you can use an MD package, which is much faster, but don't expect to be thrilled with the results.
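A minimal sketch of the "run several jobs and keep the best one" step, in plain Python. The file names and scores below are made up for illustration, and the sketch assumes that a lower score is better (as with an energy or the Molprobity score). How the scores are obtained from each job is left out.

```python
def pick_best_model(scores):
    """Return the (name, score) pair with the lowest score.

    `scores` maps a model file name to its score; lower is assumed better.
    """
    if not scores:
        raise ValueError("no models to choose from")
    return min(scores.items(), key=lambda item: item[1])


# Hypothetical scores from three independent relax jobs.
example_scores = {
    "relax_01.pdb": -512.3,
    "relax_02.pdb": -530.8,
    "relax_03.pdb": -498.1,
}

best_name, best_score = pick_best_model(example_scores)
print(best_name, best_score)
```

If the assessment used is one where higher is better, the scores would need to be negated or `max` used instead.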

With respect to loop refinement, this is the section that describes the default optimization and refinement protocol:

http://www.salilab.org/modeller/manual/node19.html#SECTION:model-changeopt

Details about loop modelling can be found here:

http://www.salilab.org/modeller/manual/node452.html#SECTION:loopmethod

Have you also read this relatively recent message where I state which parameters I use to tweak?

http://salilab.org/archives/modeller_usage/2010/msg00383.html

I personally prefer to run loop modelling after model building so that I can keep the rest of the protein rigid, apply secondary structure restraints, distance restraints, etc. You can find an example script under modeller9v8/examples/automodel/loop.py that does loop modeling from an initial conformation.
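For readers without the examples directory at hand, here is a rough sketch of what loop refinement from an existing model can look like with MODELLER's `loopmodel` class. It is not a copy of the bundled loop.py: the wrapper functions `residue_ids` and `refine_loop`, the file name, the sequence code, and the residue range in the usage comment are all hypothetical, and the choice of `refine.slow` is only one of the available refinement levels.

```python
def residue_ids(first, last, chain=""):
    """Build MODELLER-style 'number:chain' residue identifiers."""
    if first > last:
        raise ValueError("first residue must not come after last")
    return "%d:%s" % (first, chain), "%d:%s" % (last, chain)


def refine_loop(inimodel, sequence, first, last, chain="", n_models=10):
    """Refine one loop of an existing model (hypothetical wrapper)."""
    # Imported here so residue_ids() works without MODELLER installed.
    from modeller import environ, selection
    from modeller.automodel import loopmodel, refine

    start_id, end_id = residue_ids(first, last, chain)

    class MyLoop(loopmodel):
        # Only the atoms returned here are refined.
        def select_loop_atoms(self):
            return selection(self.residue_range(start_id, end_id))

    env = environ()
    env.io.atom_files_directory = ["."]      # where the input PDB lives
    m = MyLoop(env, inimodel=inimodel, sequence=sequence)
    m.loop.starting_model = 1                # index of the first loop model
    m.loop.ending_model = n_models           # index of the last loop model
    m.loop.md_level = refine.slow            # more thorough than very_fast
    m.make()
    return m


# Hypothetical usage, with placeholder file name, code and residues:
# refine_loop("target.B99990001.pdb", "target", 100, 110, chain="A")
```

Building several loop models and comparing them afterwards is one way to judge whether the loop is converging. For a dimer, the chain letter in the residue identifiers selects which copy of the loop is refined.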

This is what I have to suggest, based on my one year of experience with MODELLER and structure prediction in general. If anyone disagrees or wishes to add something to my hints, please feel free to do so.

Thomas


Mensur Dlakic

6 Jan

3:27 p.m.

Irene,

I know firsthand that Rosetta works even for proteins >200 residues, meaning it will make them much better in terms of energy and stereochemistry. Now, if you are expecting that the program will remove absolutely all problems with the model so it gets a perfect Molprobity score, that would be unrealistic. It seems to me that you already have a pretty good model - it doesn't have to be free of all violations to be useful. Even many legitimate protein structures, which are based on experimental maps, have "issues" when analyzed by Molprobity.

Having said all of this, I have good experience with RAPPER when it comes to loop modeling, assuming loops are of reasonable length. The program is here:

http://mordred.bioc.cam.ac.uk/~rapper/

If you get the program to install and work properly, I can offer you a small script that will take a PDB file and a range of residues for refinement, and produce a new model with resampled loops. Alternatively, you can send me your PDB file and a range of residues to be refined and I will send you the model back. If you are happy with it, then you can spend more time installing and optimizing RAPPER.

Good luck,

Mensur


Thomas Evangelidis

8 Jan

6:14 p.m.

On 7 January 2011 01:27, Mensur Dlakic mdlakic@montana.edu wrote:

> Irene,
>
> I know firsthand that Rosetta works even for proteins >200 residues, meaning it will make them much better in terms of energy and stereochemistry. Now, if you are expecting that the program will remove absolutely all problems with the model so it gets a perfect Molprobity score, that would be unrealistic. It seems to me that you already have a pretty good model - it doesn't have to be free of all violations to be useful. Even many legitimate protein structures, which are based on experimental maps, have "issues" when analyzed by Molprobity.

That's very true!

> Having said all of this, I have good experience with RAPPER when it comes to loop modeling, assuming loops are of reasonable length. The program is here:
>
> http://mordred.bioc.cam.ac.uk/~rapper/
>
> If you get the program to install and work properly, I can offer you a small script that will take a PDB file and a range of residues for refinement, and produce a new model resampled loops. Alternatively, you can send me your PDB file and a range of residues to be refined and I will send you the model back. If you are happy with it, then you can spend more time installing and optimizing RAPPER.

Mensur, are you aware of any benchmark analysis that shows RAPPER's superiority on loop modeling over MODELLER? I am aware of one that compares Rosetta, MODELLER and CABS on loops up to 24 aa if I remember correctly, and shows that in short lengths all 3 programs are equivalent, but for longer loops the combination of MODELLER with CABS prevails.

I have several proteins with missing regions of varying length (8-47 aa) which I want to fold. I have tried PyRosetta and MODELLER in the past, but I'm not impressed by the results of loop modeling, especially as the length increases. I guess these routines were designed to model flexible amino acid stretches, which is why the predicted conformations adopt coiled coils. To this end I also tried the I-TASSER server (which employs Monte Carlo with Replica Exchange for regions with no templates and is questionably more accurate in folding long aa stretches) by supplying my initial structure and excluding all homologues, but the meta-threading program (LOMETS) it implements always detects traces of "homology" to some irrelevant structures and uses them as templates. I guess my last resort is MD with Replica Exchange, but I haven't found the time yet to set up my system and figure out a way to keep the rest of the model rigid, namely to fold only the missing regions.

This message is slightly off-topic, but I just wanted to share my experience (and desperation) with other people that might find it helpful (I hope not the desperation).

Thomas

Thomas Evangelidis

6:19 p.m.

> [...] I-TASSER server (which employs Monte Carlo with Replica Exchange for regions with no templates and is questionably more accurate in folding long aa stretches) [...]

* I meant I-TASSER is ARGUABLY more accurate in folding long aa stretches.
