[modeller_usage] Fwd: reg: issue in importing modeller

1 Apr 2020


      ---------- Forwarded message ---------
From: Anwesha Mohapatra anwesha.mohapatra11@gmail.com
Date: Tue, Mar 31, 2020 at 11:58 PM
Subject: Re: [modeller_usage] reg: issue in importing modeller
To: Modeller Caretaker modeller-care@salilab.org
Dear Sir,
I read the Modeller forum posts on multichain proteins; however, I am not sure
whether I have understood them correctly. I am a novice in the area of
structural biology and hence have lots of queries.
*1) In case of a homomeric protein.*
For example, the protein with ID *1sv6* is my template, which has 5 chains (A-E).
In order to align my target protein with this template, I copied and pasted
the sequence of my target 5 times separated by the delimiter '/'.
The alignment of the template (1SV6) with my target gene ABM40476.1 gave
the following result.
*Is this the right way to proceed in the case of homomeric proteins?*
>P1;1sv6.pdb
structureX:1sv6.pdb:   1 :A:+1305:E:MOL_ID  1; MOLECULE
 2-KETO-4-PENTENOATE HYDRATASE; CHAIN  A, B, C, D, E; SYNONYM
 2-HYDROXYPENTADIENOIC ACID HYDRATASE; EC  4.2.1.-; ENGINEERED  YES:MOL_ID
 1; ORGANISM_SCIENTIFIC  ESCHERICHIA COLI; ORGANISM_TAXID  562; GENE  MHPD,
B0350; EXPRESSION_SYSTEM  ESCHERICHIA COLI; EXPRESSION_SYSTEM_TAXID  562:
2.90: 0.24
MT--KHTLEQLAADLRRAAEQGEAIAPLRDLIGIDNAEAAYAIQHINVQHDVAQGRRVVGRKVGLTHPKVQQQLG
VDQPDFGTLFADMCYGDNEIIPFSRVLQPRIEAEIALVLNRDLPATDITFDELYNAIEWVLPALEVVGSRIRDWS
IQFVDTVADNASCGVYVIGGPAQRPAGLDLKNCAMKMTRNNEEVSSGRGSECLGHPLNAAVWLARKMASLGEPLR
TGDIILTGALGPMVAVNAGDRFEAHIEGIGSVAATFSS/
MT--KHTLEQLAADLRRAAEQGEAIAPLRDLIGIDNAEAAYAIQHINVQHDVAQGRRVVGRKVGLTHPKVQQQLGVD
QPDFGTLFADMCYGDNEIIPFSRVLQPRIEAEIALVLNRDLPATDITFDELYNAIEWVLPALEVVGSRIRDWSIQFVDT
VADNASCGVYVIGGPAQRPAGLDLKNCAMKMTRNNEEVSSGRGSECLGHPLNAAVWLARKMASLGEPLRTGDIILT-G
ALGPMVAVNAGDRFEAHIEGIGSVAATFSS/
MTKHTLEQLAADLRRAAEQGEAIAPLRD XXXXXXXXXXX repetition of the chain
XXXXXXXXXXXXX*
>P1;ABM40476.1_-_Acidovorax_sp._JS42
sequence:ABM40476.1_-_Acidovorax_sp._JS42:     : :     : ::: 0.00: 0.00
MTMTPALIEQLGDELYQALTQRRMLEPLTNRHADITIDDAYAIQQKMLARRLAAGEKVVGKKIGVTSKAVMDMLG
VFQPDFGWLTDGMVFNEGQAVQANTLIQPKAEGEIAFVLKKTLKGPGITAADVLAATEGVMACFEIVDSRIRDWK
IKIQDTVADNASCGVFVLGDRLVDPRDVDLGTCGMVLEKNGDIVATGAGAA------------------------
--------ALGH-PA-NA-------------------V/
MTMTPALIEQLGDELYQALTQRRMLEPLTNRHADIT
IDDAYAIQQKMLARRLAAGEKVVGKKIGVTSKAVMDMLGVFQPDFGWLTDGMVFNEGQAVQANTLIQPKAEGEIA
FVLKKTLKGPGITAADVLAATEGVMACFEIVDSRIRDWKIKIQDTVADNASCGVFVLGDRLVDPRDVDLGTCGMV
LEKNGDIVATGAGAAALGHPANA----------------------V/
MTMTPALIEQLGDE-------------LYQA-LTQR---RMLEPLTNR------HADIT----IDD---AYAIQQKMLARRLAAGEKVVGK
KIGVTSKAVMDMLGVFQPDFGWLTDGMVFNEGQAVQANTLIQPKAEGEIAFVLKKTLKGPGITAADVLAATEGVM
ACFEIVDSRIRDWKIKIQDTVADNASCGVFVLGDRLVDPRDVDLGTCGMVLEKNGDIVATGAGAAALGHPANA--------------
--------V/
(XXXX and so on )--------*
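The chain-repetition step described above can be sketched in plain Python. This is only a minimal illustration and not part of Modeller: the helper names `repeat_chains`, `chain_lengths` and `same_chain_layout` are hypothetical, and the check assumes that the '/' chain breaks are meant to fall at the same alignment position in the template and target entries.

```python
def repeat_chains(chain_sequence, n_chains):
    """Join n_chains copies of one chain with '/' and end the entry with '*'."""
    return "/".join([chain_sequence] * n_chains) + "*"


def chain_lengths(pir_sequence):
    """Return the aligned length (residues plus gaps) of each chain."""
    # Drop line breaks and spaces, then the terminating '*'.
    body = "".join(pir_sequence.split()).rstrip("*")
    return [len(chain) for chain in body.split("/")]


def same_chain_layout(template_sequence, target_sequence):
    """True if both entries have the same number of chains of equal aligned length."""
    return chain_lengths(template_sequence) == chain_lengths(target_sequence)
```

Running `same_chain_layout` on the template and target blocks of an alignment is a quick sanity check before modelling: a mismatch would suggest that a chain was wrapped or gapped differently in one of the two entries.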
*2.) In order to further model this protein should I be using MyModel
instead of automodel?*
*If so do I need to make changes in this part of the code as per the number
of chains in the template (highlighted below)?*
s1 = selection(self.chains['A']).only_atom_types('CA')
s2 = selection(self.chains['B']).only_atom_types('CA')
# Should the C, D, E chains be appended to self.restraints.symmetry?
self.restraints.symmetry.append(symmetry(s1, s2, 1.0))
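One way to extend the two-chain snippet above to all five chains is to restrain each chain against the next one (A-B, B-C, C-D, D-E). The sketch below is an assumption about how that could look, not a confirmed recipe: `adjacent_chain_pairs` and `add_chain_symmetry` are hypothetical helper names, the chain IDs of the model are assumed to be A to E, and the `selection` and `symmetry` calls are the same ones used in the snippet.

```python
def adjacent_chain_pairs(chain_ids):
    """Pair every chain with the one that follows it."""
    return list(zip(chain_ids, chain_ids[1:]))


def add_chain_symmetry(mdl, chain_ids, weight=1.0):
    """Append one C-alpha symmetry restraint per adjacent chain pair."""
    # Imported here so the pairing helper also works without Modeller installed.
    from modeller import selection, symmetry
    for first, second in adjacent_chain_pairs(chain_ids):
        s1 = selection(mdl.chains[first]).only_atom_types('CA')
        s2 = selection(mdl.chains[second]).only_atom_types('CA')
        mdl.restraints.symmetry.append(symmetry(s1, s2, weight))


# Intended use inside the automodel subclass (assumed layout):
#
#     class MyModel(automodel):
#         def special_restraints(self, aln):
#             add_chain_symmetry(self, ['A', 'B', 'C', 'D', 'E'])
```

Restraining only adjacent pairs keeps the number of restraints small; whether that is enough for a particular pentamer is something to check in the symmetry report after each model is built.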
*3). I also have a heteromeric protein such as 1O7G which has a large
and small subunit as chain A and B. The target gene that I want to
model shows sequence similarity only with chain A. How should my
alignment file be in this case?*
*for example if the template is*
>P1;1o7g
structure::::
MNOPTHSWYRTY....XXXXXXXXX...../
MRSF...........*
*then should my target gene which has aligned to only chainA be depicted as*
>P1;TargetGene
sequence:::::::
MNOPTYIKL--DFT--WANH--XXX--/
-----------------------------------------------*
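If only chain A of the template is to be used, one option is to restrict the template entry to that chain, so that the target does not need an all-gap second chain. The sketch below only formats the header line, which in Modeller's PIR format has ten colon-separated fields. `pir_header` is a hypothetical helper, and the `FIRST`/`LAST` residue keywords are an assumption that should be checked against the manual for the Modeller version in use.

```python
def pir_header(entry_type, code, start_res="", start_chain="",
               end_res="", end_chain="", name="", source="",
               resolution="", r_factor=""):
    """Build the 10-field, colon-separated second line of a PIR entry."""
    fields = [entry_type, code, start_res, start_chain, end_res, end_chain,
              name, source, resolution, r_factor]
    return ":".join(str(field) for field in fields)


# Template limited to chain A only (the residue keywords are an assumption).
template_header = pir_header("structureX", "1o7g", "FIRST", "A", "LAST", "A")
# Target sequence: only the entry type and the code need to be filled in.
target_header = pir_header("sequence", "TargetGene")
```

With this layout the template and target entries would each hold a single chain, so neither sequence needs a '/' chain break.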
Sir, I request you to please guide me on this issue, as I am stuck and
unable to proceed further.
Thanks & Regards
Anwesha
On Sat, Mar 28, 2020 at 6:21 PM Anwesha Mohapatra <
anwesha.mohapatra11@gmail.com> wrote:
> Hello Sir,
>
> Could you please suggest how to proceed? I have attached my files and
> their corresponding scenarios in the previous mail.
>
> Kindly guide me.
>
> Thanks & Regards
> Anwesha
>
> On Fri, Mar 27, 2020, 7:25 PM Anwesha Mohapatra <
> anwesha.mohapatra11@gmail.com> wrote:
>
>> Dear Sir,
>>
>> Firstly thank you, the previous fortran error got resolved when I used
>> relative path instead of absolute and a shorter name for files.
>> I have a few queries which I have mentioned below:
>> -----------------------------------------------------------------------
>> 1. As per our previous discussion regarding multi-chain proteins, I have
>> used the template 1o7g to align against my query gene sequence.
>> In the alignment I do see a chain break '/' in the sequence for the
>> template; however, the query protein is aligned to the entire template
>> protein (to both chains A and B).
>> But I do know that this same query sequence aligns to chain A of the
>> template as per my Blastp result. I can see lots of gaps added to the query
>> gene when aligned to the entire protein. Is this correct?
>> I have attached the alignment files in the '*query.zip*' file. One file
>> shows the alignment of the target with all chains of the template and the
>> other the alignment to only chain A.
>>
>> 2. Since the PDB file (1o7g) that I am using as a template has all the
>> HETATM records after the chains, I have added the HETATM information after
>> the chains in the alignment file as well.
>> On modelling, however, I don't see the HETATM records added to the PDB file
>> created. Moreover, I am unable to open the file in PyMOL (*** buffer
>> overflow detected ***: python2.7 terminated).
>> I have attached the file '*ModelError_Test1.zip*', which contains the
>> input alignment file and the model generated in .pdb format. Could you tell
>> me whether I have made any mistakes in the alignment file?
>>
>> 3. Since Modeller only reads ATOM and HETATM records, I tried to rewrite
>> the PDB template file so that the HETATM records come directly after the
>> chain. I have changed the alignment files accordingly.
>> However, in this case Modeller is not able to open the file and says it's
>> a corrupt PDB file. I have attached the PDB, alignment and log files in
>> the attached folder named *"Query3.zip"*.
>>
>> 4. I am unable to understand the concept of missing residues.
>> If some residues are missing in the template, how should I proceed?
>> Do I need to check for the presence of missing residues for each template
>> file?
>>
>> Kindly help me with these queries.
>> Thanks & Regards
>> Anwesha
>>
>> On Tue, Mar 24, 2020 at 7:51 AM Anwesha Mohapatra <
>> anwesha.mohapatra11@gmail.com> wrote:
>>
>>> Thank you, Sir. I will try with shorter names and relative paths.
>>>
>>> On Tue, Mar 24, 2020, 7:48 AM Modeller Caretaker <
>>> modeller-care@salilab.org> wrote:
>>>
>>>> On 3/21/20 11:15 PM, Anwesha Mohapatra wrote:
>>>> > I am sending the input files and the code I have been using .
>>>>
>>>> I think that what's happening here is that your alignment file names are
>>>> really long, and you're hitting an internal limit inside Modeller (your
>>>> input files work for me, but I can reproduce your problem if I put the
>>>> files inside a directory with a very long name). This will be fixed in
>>>> the next Modeller release. In the meantime you can work around it by
>>>> using shorter file or directory names, or using relative rather than
>>>> absolute paths to your alignment files.
>>>>
>>>>         Ben Webb, Modeller Caretaker
>>>> --
>>>> modeller-care@salilab.org             https://salilab.org/modeller/
>>>> Modeller mail list: https://salilab.org/mailman/listinfo/modeller_usage
>>>>
>>>