Hi all.
I have some sequences from real bacteria that correspond to one protein in the PDB, 1SIG. Of course, they have lots of mutations. I would like to know what structural changes these mutations would cause.
I used Modeller to answer that question, for the first time. The predicted structures are very similar to the given structure. I am surprised that some helices are not broken or bent even a little.
So I did the following experiment: I replaced in the sequence, by hand, twenty amino acids that correspond to a helix in 1SIG, with 20 amino acids that are already in a loop in the sequence. Then I asked Modeller to calculate the structure and I get almost the same structure again, with helix, even in the part in which I replaced the helix sites.
As far as I know, forming a helix requires several energy states to coincide, so it is unlikely that I would get the helix by chance. So it seems to me the program just copies the structure.
The details are the following: 1SIG is from the pdb. I attach the sequence I would like to model, Ppertucin, and the alignment between them.
I also attach the hand modified sequence, Ppertucintest in which the 23 amino acids DPDDDGLSGTDNAVVPAAAAKPKD (that are present in Ppertucin (46-69) already but have no alignment with 1SIG) replace 23 amino acids that were between positions 97 and 120, matched to a helix in 1SIG, and I keep the same alignment.
The model for Ppertucin does not predict a helix for the 23 amino acids that correspond to a gap in 1SIG. But when the same 23 amino acids appear again in Ppertucintest, aligned to a helix, the model for Ppertucintest has a helix in that fragment.
I can conclude that if I assume the helix is conserved, then it will be; otherwise, not. How can I make the results useful?
Was Modeller not intended for this kind of situation?
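The substitution described above can be sanity-checked in plain Python, without Modeller. The sketch below uses only the first 120 residues of the two attached sequences (the remainder is identical in both); the names `find_all` and `changed_span` are illustrative helpers, not part of any library. It reports where the fragment occurs and which positions actually differ between the two sequences. Note that the fragment as written contains 24 residues, which matches the inclusive range 46-69.

```python
# First 120 residues of each attached sequence; the rest is identical in both.
PPERTUCIN = (
    "RIEEGIREVMAAISMFPGTVDGILTEYQRIAEEGGRLTDIFNGYIDPDDDGLSGTDNAVV"
    "PAAAAKPKDEKKASDDDEEEEEDTEDDTEEETDGGPDPEIARQRFGAVQEQLEKVRKVLK"
)
PPERTUCINTEST = (
    "RIEEGIREVMAAISMFPGTVDGILTEYQRIAEEGGRLTDIFNGYIDPDDDGLSGTDNAVV"
    "PAAAAKPKDEKKASDDDEEEEEDTEDDTEEETDGGPDPDDDGLSGTDNAVVPAAAAKPKD"
)
# The loop fragment that was pasted over the helix region.
FRAGMENT = "DPDDDGLSGTDNAVVPAAAAKPKD"


def find_all(seq, fragment):
    """Return 1-based inclusive (start, end) ranges of every occurrence."""
    ranges = []
    i = seq.find(fragment)
    while i != -1:
        ranges.append((i + 1, i + len(fragment)))
        i = seq.find(fragment, i + 1)
    return ranges


def changed_span(a, b):
    """Return the 1-based inclusive span covering all positions where a and b differ."""
    diffs = [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return (diffs[0], diffs[-1]) if diffs else None


print("fragment length:", len(FRAGMENT))
print("occurrences in Ppertucin:", find_all(PPERTUCIN, FRAGMENT))
print("occurrences in Ppertucintest:", find_all(PPERTUCINTEST, FRAGMENT))
print("span that differs:", changed_span(PPERTUCIN, PPERTUCINTEST))
```

The span that differs is slightly shorter than the pasted fragment because the first two residues of the fragment (DP) were already present at that position in Ppertucin.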
Thanks for any hint,
Jairo Rocha
University of the Balearic Islands
Spain
>Ppertucin
RIEEGIREVMAAISMFPGTVDGILTEYQRIAEEGGRLTDIFNGYIDPDDDGLSGTDNAVV
PAAAAKPKDEKKASDDDEEEEEDTEDDTEEETDGGPDPEIARQRFGAVQEQLEKVRKVLK
AKKGDRSHPDVVAEMETLAQLFMPIKLVPKQYDALVARVRGVQDDIRARERAIMQLCVRD
ARMPRADFLRSFPGNETNEKWIDEVLAKKPAYADALAALQPDILRQQQQLIALEQDAQLT
IAQV

>P1;1SIG.pdb
structureX:1SIG.pdb:113:A:446:A:ferredoxin:Peptococcus aerogenes: 2.00:-1.00
MEGEIDIAKRIEDGINQVQCSVAEYPEAITYLLEQYNRVEAEEARLSDLITGFVD-----
------------DLAPTATHVGSELSQE-------------DLDIDPELAREKFAELRAQ
YVVTRDTIKA------HATAQEEILKLSEVFKQFRLVPKQFDYLVNSMRVMMDRVRTQER
LIMKLCVEQCKMPKKNFITLFTGNETSDTWFNAAIAMNKPWSEKLHDVSEEVHRALQKLQ
QIEEETGLTIEQVKDINRRMSIGEAKARRAKKEMVEANLRLVISIAKKYTNRGLQFLDLI
QEGNIGLMKAVDKFEYRRGYKFSTYATWWIRQAITRSIADQ*
>P1;Ppertucin
sequence:Ppertucin:1 :A:244 :A:ferredoxin:Peptococcus aerogenes: 2.00:-1.00
---------RIEEGIREVMAAISMFPGTVDGILTEYQRIAEEGGRLTDIFNGYIDPDDDG
LSGTDNAVVPAAAAKPKDEKKASDDDEEEEEDTEDDTEEETDGGPDPEIARQRFGAVQEQ
LEKVRKVLKAKKGDRSHPDVVAEMETLAQLFMPIKLVPKQYDALVARVRGVQDDIRARER
AIMQLCVRDARMPRADFLRSFPGNETNEKWIDEVLAKKPAYADALAALQPDILRQQQQLI
ALEQDAQLTIAQV-----------------------------------------------
-----------------------------------------*

>Ppertucintest
RIEEGIREVMAAISMFPGTVDGILTEYQRIAEEGGRLTDIFNGYIDPDDDGLSGTDNAVV
PAAAAKPKDEKKASDDDEEEEEDTEDDTEEETDGGPDPDDDGLSGTDNAVVPAAAAKPKD
AKKGDRSHPDVVAEMETLAQLFMPIKLVPKQYDALVARVRGVQDDIRARERAIMQLCVRD
ARMPRADFLRSFPGNETNEKWIDEVLAKKPAYADALAALQPDILRQQQQLIALEQDAQLT
IAQV

>P1;1SIG.pdb
structureX:1SIG.pdb:113:A:446:A:xxx:xxx: 2.00:-1.00
MEGEIDIAKRIEDGINQVQCSVAEYPEAITYLLEQYNRVEAEEARLSDLITGFVD-----
------------DLAPTATHVGSELSQE-------------DLDIDPELAREKFAELRAQ
YVVTRDTIKA------HATAQEEILKLSEVFKQFRLVPKQFDYLVNSMRVMMDRVRTQER
LIMKLCVEQCKMPKKNFITLFTGNETSDTWFNAAIAMNKPWSEKLHDVSEEVHRALQKLQ
QIEEETGLTIEQVKDINRRMSIGEAKARRAKKEMVEANLRLVISIAKKYTNRGLQFLDLI
QEGNIGLMKAVDKFEYRRGYKFSTYATWWIRQAITRSIADQ*
>P1;Ppertucintest
sequence:Ppertucintest:1 :A:244 :A:xxxx:xxx: 2.00:-1.00
---------RIEEGIREVMAAISMFPGTVDGILTEYQRIAEEGGRLTDIFNGYIDPDDDG
LSGTDNAVVPAAAAKPKDEKKASDDDEEEEEDTEDDTEEETDGGPDPDDDGLSGTDNAVV
PAAAAKPKDAKKGDRSHPDVVAEMETLAQLFMPIKLVPKQYDALVARVRGVQDDIRARER
AIMQLCVRDARMPRADFLRSFPGNETNEKWIDEVLAKKPAYADALAALQPDILRQQQQLI
ALEQDAQLTIAQV-----------------------------------------------
-----------------------------------------*

Jairo Rocha wrote:
> I have some sequences from real bacteria that correspond to one protein
> in the pdb, 1SIG.
> Of course, they have lots of mutations. I would like to know which
> changes the structure would have, due to these mutations.
>
> I used Modeller to answer that question, for the first time. The
> predicted structures are very very similar to the given structure.
> I am surprised that some helices are not broken or bended just a bit.
...
> As far as I know, to get a helix requires several energy states to
> coincide, so it is difficult that by change I get the helix.
> So it seems to me the program just copies the structure.

Modeller is a package for comparative modeling, so by construction your target model is going to look like the template(s). It doesn't exactly "copy" the structure, but certain structural features (e.g. dihedral angles, CA-CA distances, etc.) in aligned templates will be preserved in the model. So if you have a template aligned that has helical structure, it is practically certain that the model Modeller generates will also be helical, regardless of the actual sequence. (Of course, something like poly-P probably wouldn't be helical, but then that's probably a failure of your alignment method if you've aligned poly-P with a helical sequence anyway.)
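As a rough illustration of the point about aligned templates (not of how Modeller works internally), the sketch below counts how many target residues in a given range are paired with a template residue rather than a gap in the posted alignment. Only the first 180 alignment columns are included, with gap runs written as explicit counts; `template_coverage` is a hypothetical helper, and "paired with a template residue" is used here merely as a proxy for "template-derived restraints would apply".

```python
# First 180 columns of the posted 1SIG / Ppertucin alignment.
# Gap runs are written as explicit counts to keep them easy to check.
TEMPLATE = (
    "MEGEIDIAKRIEDGINQVQCSVAEYPEAITYLLEQYNRVEAEEARLSDLITGFVD" + "-" * 17
    + "DLAPTATHVGSELSQE" + "-" * 13
    + "DLDIDPELAREKFAELRAQ"
    + "YVVTRDTIKA" + "-" * 6
    + "HATAQEEILKLSEVFKQFRLVPKQFDYLVNSMRVMMDRVRTQER"
)
TARGET = (
    "-" * 9
    + "RIEEGIREVMAAISMFPGTVDGILTEYQRIAEEGGRLTDIFNGYIDPDDDG"
    + "LSGTDNAVVPAAAAKPKDEKKASDDDEEEEEDTEDDTEEETDGGPDPEIARQRFGAVQEQ"
    + "LEKVRKVLKAKKGDRSHPDVVAEMETLAQLFMPIKLVPKQYDALVARVRGVQDDIRARER"
)


def template_coverage(template_aln, target_aln, start, end):
    """Count target residues start..end (1-based, inclusive) that are
    paired with a template residue, and those paired with a gap.

    Returns a (paired, unpaired) tuple.
    """
    if len(template_aln) != len(target_aln):
        raise ValueError("aligned sequences must have the same length")
    paired = unpaired = 0
    resnum = 0  # running residue number in the target sequence
    for t_char, q_char in zip(template_aln, target_aln):
        if q_char == "-":
            continue  # gap in the target: no target residue in this column
        resnum += 1
        if start <= resnum <= end:
            if t_char == "-":
                unpaired += 1
            else:
                paired += 1
    return paired, unpaired


print("residues 46-69:", template_coverage(TEMPLATE, TARGET, 46, 69))
print("residues 97-120:", template_coverage(TEMPLATE, TARGET, 97, 120))
```

With this alignment, most of the fragment at 46-69 falls opposite a template gap, whereas every residue in 97-120 is paired with a template residue. Because Ppertucintest reuses the same alignment positions, whatever sequence is placed at 97-120 sits opposite the same template residues.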
You probably don't want homology information if you're investigating mutations with structural changes. You can certainly use Modeller for this though - just don't use the homology information (see the mutate model script in the Modeller wiki for an example).
Ben Webb, Modeller Caretaker