---------- Forwarded message ---------
From: Anwesha Mohapatra <anwesha.mohapatra11@gmail.com>
Date: Tue, Mar 31, 2020 at 11:58 PM
Subject: Re: [modeller_usage] reg: issue in importing modeller
To: Modeller Caretaker <modeller-care@salilab.org>
Dear Sir,

I read the Modeller forum posts on multi-chain proteins; however, I am not sure whether I have understood them correctly. I am a novice in structural biology and hence have a lot of queries.

*1) In case of a homomeric protein.*
For example, the protein with ID *1sv6* is my template, and it has 5 chains (A-E). In order to align my target protein with this template, I copied and pasted the sequence of my target 5 times, separated by the delimiter '/'. The alignment of the template (1SV6) with my target gene ABM40476.1 gave the following result.
*Is this the right way to proceed in case of homomeric proteins?*

>P1;1sv6.pdb
structureX:1sv6.pdb: 1 :A:+1305:E:MOL_ID 1; MOLECULE 2-KETO-4-PENTENOATE HYDRATASE; CHAIN A, B, C, D, E; SYNONYM 2-HYDROXYPENTADIENOIC ACID HYDRATASE; EC 4.2.1.-; ENGINEERED YES:MOL_ID 1; ORGANISM_SCIENTIFIC ESCHERICHIA COLI; ORGANISM_TAXID 562; GENE MHPD, B0350; EXPRESSION_SYSTEM ESCHERICHIA COLI; EXPRESSION_SYSTEM_TAXID 562: 2.90: 0.24
MT--KHTLEQLAADLRRAAEQGEAIAPLRDLIGIDNAEAAYAIQHINVQHDVAQGRRVVGRKVGLTHPKVQQQLG
VDQPDFGTLFADMCYGDNEIIPFSRVLQPRIEAEIALVLNRDLPATDITFDELYNAIEWVLPALEVVGSRIRDWS
IQFVDTVADNASCGVYVIGGPAQRPAGLDLKNCAMKMTRNNEEVSSGRGSECLGHPLNAAVWLARKMASLGEPLR
TGDIILTGALGPMVAVNAGDRFEAHIEGIGSVAATFSS/
MT--KHTLEQLAADLRRAAEQGEAIAPLRDLIGIDNAEAAYAIQHINVQHDVAQGRRVVGRKVGLTHPKVQQQLGVD
QPDFGTLFADMCYGDNEIIPFSRVLQPRIEAEIALVLNRDLPATDITFDELYNAIEWVLPALEVVGSRIRDWSIQFVDT
VADNASCGVYVIGGPAQRPAGLDLKNCAMKMTRNNEEVSSGRGSECLGHPLNAAVWLARKMASLGEPLRTGDIILT-G
ALGPMVAVNAGDRFEAHIEGIGSVAATFSS/
MTKHTLEQLAADLRRAAEQGEAIAPLRD
XXXXXXXXXXX repetition of the chain XXXXXXXXXXXXX*
>P1;ABM40476.1_-_Acidovorax_sp._JS42
sequence:ABM40476.1_-_Acidovorax_sp._JS42: : : : ::: 0.00: 0.00
MTMTPALIEQLGDELYQALTQRRMLEPLTNRHADITIDDAYAIQQKMLARRLAAGEKVVGKKIGVTSKAVMDMLG
VFQPDFGWLTDGMVFNEGQAVQANTLIQPKAEGEIAFVLKKTLKGPGITAADVLAATEGVMACFEIVDSRIRDWK
IKIQDTVADNASCGVFVLGDRLVDPRDVDLGTCGMVLEKNGDIVATGAGAA------------------------
--------ALGH-PA-NA-------------------V/
MTMTPALIEQLGDELYQALTQRRMLEPLTNRHADIT
IDDAYAIQQKMLARRLAAGEKVVGKKIGVTSKAVMDMLGVFQPDFGWLTDGMVFNEGQAVQANTLIQPKAEGEIA
FVLKKTLKGPGITAADVLAATEGVMACFEIVDSRIRDWKIKIQDTVADNASCGVFVLGDRLVDPRDVDLGTCGMV
LEKNGDIVATGAGAAALGHPANA----------------------V/
MTMTPALIEQLGDE-------------LYQA-LTQR---RMLEPLTNR------HADIT----IDD---AYAIQQKMLARRLAAGEKVVGK
KIGVTSKAVMDMLGVFQPDFGWLTDGMVFNEGQAVQANTLIQPKAEGEIAFVLKKTLKGPGITAADVLAATEGVM
ACFEIVDSRIRDWKIKIQDTVADNASCGVFVLGDRLVDPRDVDLGTCGMVLEKNGDIVATGAGAAALGHPANA--------------
--------V/
(XXXX and so on )--------*
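The unaligned target entry described in point 1 (the same sequence repeated once per template chain, with '/' between chains and '*' at the end) could be generated rather than pasted by hand. The following is only a minimal sketch in plain Python: the function name `homomer_pir_entry` and the short sequence in the usage line are hypothetical, and the output contains no gaps, so it would still have to be aligned against the template afterwards.

```python
def homomer_pir_entry(code, sequence, n_chains, width=75):
    """Return a PIR 'sequence' entry with `sequence` repeated once per chain."""
    if n_chains < 1:
        raise ValueError("n_chains must be at least 1")
    # Chains are separated by '/', and the whole entry is terminated by '*'.
    chains = "/".join([sequence] * n_chains) + "*"
    # 10 colon-separated header fields; only the first two are filled in here.
    header = "sequence:%s:::::::0.00: 0.00" % code
    # Wrap the residues into fixed-width lines for readability.
    lines = [chains[i:i + width] for i in range(0, len(chains), width)]
    return ">P1;%s\n%s\n%s\n" % (code, header, "\n".join(lines))


# Hypothetical 4-residue sequence, repeated for a 5-chain homomer.
entry = homomer_pir_entry("tgt", "MTMT", 5)
print(entry)
```

Generating the entry this way avoids copy-and-paste slips such as a missing '/' or a truncated repeat.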
*2) To model this protein further, should I use a MyModel class instead of automodel?* *If so, do I need to change this part of the code according to the number of chains in the template (highlighted below)?*
s1 = selection(self.chains['A']).only_atom_types('CA')
s2 = selection(self.chains['B']).only_atom_types('CA')
*#Should the C,D,E chains be appended to self.restraints.symmetry?*
self.restraints.symmetry.append(symmetry(s1, s2, 1.0))
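If all five chains are to be restrained, one possibility (an assumption on my part, not something confirmed in this thread) is to add one symmetry restraint per consecutive pair of chains, A-B, B-C, C-D and D-E, instead of only A-B. The sketch below only computes the pairs in plain Python; the helper name `consecutive_chain_pairs` is hypothetical, and the Modeller calls appear only in comments, copied from the two-chain lines above.

```python
def consecutive_chain_pairs(chain_ids):
    """Pair each chain ID with the next one: 'ABC' -> [('A','B'), ('B','C')]."""
    return list(zip(chain_ids, chain_ids[1:]))


pairs = consecutive_chain_pairs("ABCDE")
for first, second in pairs:
    # Inside special_restraints(), the two-chain lines would be repeated
    # once per pair, for example:
    #   s1 = selection(self.chains[first]).only_atom_types('CA')
    #   s2 = selection(self.chains[second]).only_atom_types('CA')
    #   self.restraints.symmetry.append(symmetry(s1, s2, 1.0))
    print(first, second)
```

Chaining the pairs this way needs four restraints for five chains, rather than one restraint for every possible pair.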
*3) I also have a heteromeric template, 1O7G, which has a large and a small subunit as chains A and B. The target gene that I want to model shows sequence similarity only with chain A. What should my alignment file look like in this case?*
*For example, if the template is*
>P1;1o7g
structure::::
MNOPTHSWYRTY....XXXXXXXXX...../
MRSF...........*
*then should my target gene, which aligns only to chain A, be written as*
>P1;TargetGene
sequence:::::::
MNOPTYIKL--DFT--WANH--XXX--/
-----------------------------------------------*
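An alternative to padding chain B with gaps might be to restrict the template entry to chain A alone, using fields 3-6 of the PIR header (first residue, first chain, last residue, last chain). The sketch below is written under that assumption, and also assumes that the 'FIRST' and 'LAST' residue keywords are accepted in these fields as described in the Modeller manual; whether this is the recommended approach for 1O7G is not confirmed in this thread. The function name `pir_entry` is hypothetical, and the sequences are short placeholders echoing the schematic example above.

```python
def pir_entry(kind, code, first_res, first_chain, last_res, last_chain, sequence):
    """Build a PIR entry whose 10-field header carries a residue/chain range."""
    fields = [kind, code, str(first_res), first_chain,
              str(last_res), last_chain, "", "", "", ""]
    header = ":".join(fields)  # 10 fields, hence 9 colons
    return ">P1;%s\n%s\n%s*\n" % (code, header, sequence)


# Template limited to chain A only; the target header leaves the range empty.
# Both placeholder sequences have the same aligned length (12 positions).
template = pir_entry("structureX", "1o7g", "FIRST", "A", "LAST", "A", "MNOPTHSWYRTY")
target = pir_entry("sequence", "TargetGene", "", "", "", "", "MNOPTYIKL--D")
print(template + target)
```

With the template restricted to one chain, neither entry needs a '/' chain break, and both aligned sequences must have the same length.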
Sir, I request you to please guide me on this issue, as I am stuck and unable to proceed further.
Thanks & Regards
Anwesha
On Sat, Mar 28, 2020 at 6:21 PM Anwesha Mohapatra < anwesha.mohapatra11@gmail.com> wrote:
> Hello Sir,
>
> Could you please suggest how to proceed? I have attached my files and
> their corresponding scenarios in the previous mail.
>
> Kindly guide me.
>
> Thanks & Regards
> Anwesha
>
> On Fri, Mar 27, 2020, 7:25 PM Anwesha Mohapatra <
> anwesha.mohapatra11@gmail.com> wrote:
>
>> Dear Sir,
>>
>> Firstly, thank you; the previous Fortran error was resolved when I used
>> relative paths instead of absolute ones and shorter file names.
>> I have a few queries, which I have listed below:
>> -----------------------------------------------------------------------
>> 1. As per our previous discussion regarding multi-chain proteins, I have
>> used the template 1o7g to align against my query gene sequence.
>> In the alignment I do see a chain break '/' in the sequence for the
>> template; however, the query protein is aligned to the entire template
>> protein (to both chains A and B).
>> But I do know that this same query sequence aligns to chain A of the
>> template as per my Blastp result. I can see lots of gaps added to the
>> query gene when it is aligned to the entire protein. Is this correct?
>> I have attached the alignment files in the '*query.zip*' file. One file
>> shows the alignment of the target with all chains of the template and
>> the other shows it aligned to only chain A.
>>
>> 2. Since the PDB file (1o7g) that I am using as a template has all the
>> HETATM records after the chains, I have added the HETATM information
>> after the chains in the alignment file as well.
>> On modelling, however, I don't see the HETATM records added to the PDB
>> file created. Moreover, I am unable to open the file in PyMOL (*** buffer
>> overflow detected ***: python2.7 terminated).
>> I have attached the file '*ModelError_Test1.zip*', which contains the
>> input alignment file and the model generated in .pdb format. Could you
>> tell me whether I have made any mistakes in the alignment file?
>>
>> 3. Since Modeller only reads ATOM and HETATM records, I tried to rewrite
>> the PDB template file so that the HETATM records come directly after the
>> chain. I have changed the alignment files accordingly.
>> However, in this case Modeller is not able to open the file and says it
>> is a corrupt PDB file. I have attached the PDB, alignment and log files
>> in the attached folder named *"Query3.zip"*.
>>
>> 4. I am unable to understand the concept of missing residues.
>> If some residues are missing in the template, how should I proceed?
>> Do I need to check for missing residues in each template file?
>>
>> Kindly help me with these queries.
>> Thanks & Regards
>> Anwesha
>>
>> On Tue, Mar 24, 2020 at 7:51 AM Anwesha Mohapatra <
>> anwesha.mohapatra11@gmail.com> wrote:
>>
>>> Thank you Sir, I will try with shorter names and relative paths.
>>>
>>> On Tue, Mar 24, 2020, 7:48 AM Modeller Caretaker <
>>> modeller-care@salilab.org> wrote:
>>>
>>>> On 3/21/20 11:15 PM, Anwesha Mohapatra wrote:
>>>> > I am sending the input files and the code I have been using .
>>>>
>>>> I think that what's happening here is that your alignment file names
>>>> are really long, and you're hitting an internal limit inside Modeller
>>>> (your input files work for me, but I can reproduce your problem if I
>>>> put the files inside a directory with a very long name). This will be
>>>> fixed in the next Modeller release. In the meantime you can work around
>>>> it by using shorter file or directory names, or using relative rather
>>>> than absolute paths to your alignment files.
>>>>
>>>> Ben Webb, Modeller Caretaker
>>>> --
>>>> modeller-care@salilab.org https://salilab.org/modeller/
>>>> Modeller mail list: https://salilab.org/mailman/listinfo/modeller_usage