Missing residues modeling in Modeller

3 Nov 2024


Dear all,
I am trying to model the missing residues in a beta-barrel protein whose original PDB file lacks 12 residues. Following the 1qg8 tutorial, I first obtained the amino acid sequence of the protein from the PDB file using:
from modeller import *
# Get the sequence of the PDB file (here 'pdbid'), and write to an alignment file
code = 'pdbid'
e = Environ()
m = Model(e, file=code)
aln = Alignment(e)
aln.append_model(m, align_codes=code)
aln.write(file=code+'.seq')
This gives me an output file named pdbid.seq.
The actual protein sequence contains 324 amino acid residues, of which 12 are missing from the pdbid.pdb file.
This is the alignment.ali file I prepared for adding the missing residues:
>P1;pdbid
structureX:pdbid:1:A:+312:A:MOL_ID  1;
ASDQRGYKP------------GGHVGTSVEYEDKVTRGFNNTDKKEKTITNEVFNFFYNNPQWNFMGFYSFKIENREQKEPGYYENEDGIKQLFSLNKGHDLGNGWATGLIYELEYTRSKVYSPDVSGLRKNLAEHSIRPYLTYWNNDYNMGFYSNLEYLLSKEDRNAWGKRQEQGYSALFKPYKRFGNWEVGVEFYYQIKTNDEKQPDGTINEKSDFNERYIEPIVQYSFDDAGTLYTRVRVGKNETKNTDRSGGGNAGINYFKDIRKATVGYEQSIGESWVAKAEYEYANEVEKKSRLSGWEARNKSELTQHTFYAQALYRF*
>P1;pdbid_fill
sequence:::::::::
ASDQRGYKPEDVAFDESFFSFGGHVGTSVEYEDKVTRGFNNTDKKEKTITNEVFNFFYNNPQWNFMGFYSFKIENREQKEPGYYENEDGIKQLFSLNKGHDLGNGWATGLIYELEYTRSKVYSPDVSGLRKNLAEHSIRPYLTYWNNDYNMGFYSNLEYLLSKEDRNAWGKRQEQGYSALFKPYKRFGNWEVGVEFYYQIKTNDEKQPDGTINEKSDFNERYIEPIVQYSFDDAGTLYTRVRVGKNETKNTDRSGGGNAGINYFKDIRKATVGYEQSIGESWVAKAEYEYANEVEKKSRLSGWEARNKSELTQHTFYAQALYRF*
However, the template header gives the residue count as 312 instead of 324. I copied this entry directly from the pdbid.seq file that Modeller wrote in the first step described above. Because 12 residues are missing, Modeller reports 312 residues in the second line of the template entry in alignment.ali: structureX:pdbid:1:A:+312:A:MOL_ID  1;
In some other examples in the Modeller tutorial, however, the template header appears to give the total number of residues, including the missing ones, so I could instead have written structureX:pdbid:1:A:+324:A:MOL_ID  1; with 324 as the total. My question is: does it matter whether I write structureX:pdbid:1:A:+312:A:MOL_ID  1; or structureX:pdbid:1:A:+324:A:MOL_ID  1;?
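To make the two numbers concrete, here is a minimal sketch in plain Python (no Modeller needed) that splits a PIR header into its colon-separated fields and counts the residues actually present in an aligned row. The field labels in FIELD_NAMES are my own names for the ten header fields, and parse_pir_header and count_residues are hypothetical helpers, not Modeller functions. The sequence below is only the first 29 alignment columns of my template row, truncated for readability.

```python
# Labels chosen for this sketch; they are not Modeller identifiers.
FIELD_NAMES = [
    "type", "file_code", "first_residue", "first_chain",
    "last_residue", "last_chain", "protein_name", "source",
    "resolution", "r_factor",
]

def parse_pir_header(line):
    """Map the colon-separated header fields to the labels above."""
    parts = [p.strip() for p in line.split(":")]
    # Pad short headers so that every label gets a (possibly empty) value.
    parts += [""] * (len(FIELD_NAMES) - len(parts))
    return dict(zip(FIELD_NAMES, parts))

def count_residues(aligned_row):
    """Count residues, skipping gaps ('-'), the final '*' and whitespace."""
    return sum(1 for ch in aligned_row if ch not in "-*" and not ch.isspace())

header = parse_pir_header("structureX:pdbid:1:A:+312:A:MOL_ID  1;")
print(header["first_residue"], header["last_residue"])

# Truncated template row: first 29 alignment columns only.
template_prefix = "ASDQRGYKP------------GGHVGTSV"
present = count_residues(template_prefix)
gaps = template_prefix.count("-")
print(present, "residues present,", gaps, "gap positions")
```

Running the same count on the full template row would show whether the number of residues present matches the value in the fifth header field.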
I generated a new pdb with all missing residues modelled by Modeller using:
from modeller import *
from modeller.automodel import *    # Load the AutoModel class
log.verbose()
env = Environ()
# directories for input atom files
env.io.atom_files_directory = ['.', '../atom_files']
class MyModel(AutoModel):
    def select_atoms(self):
        return Selection(self.residue_range('10:A', '21:A'))
#a = MyModel(env, alnfile = 'alignment.ali',
#              knowns = 'pdbid', sequence = 'pdbid_fill')
a = AutoModel(env, alnfile = 'alignment.ali',
              knowns = 'pdbid', sequence = 'pdbid_fill')
a.starting_model= 1
a.ending_model  = 5
a.make()
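The residue range 10 to 21 used in select_atoms above comes from the position of the gap in the template row. A small sketch of how that range can be derived from the alignment, in plain Python: gap_ranges is a hypothetical helper, not part of Modeller, and it assumes the model is numbered sequentially from 1. Only the first 29 alignment columns of each row are used here.

```python
def gap_ranges(template_row, target_row):
    """Return 1-based inclusive (start, end) ranges, in target numbering,
    of target residues that are aligned to gaps in the template."""
    template_row = template_row.rstrip("*")
    target_row = target_row.rstrip("*")
    if len(template_row) != len(target_row):
        raise ValueError("aligned rows must have the same length")
    ranges = []
    start = None
    pos = 0  # index of the current target residue, counted from 1
    for t_ch, s_ch in zip(template_row, target_row):
        if s_ch == "-":
            continue  # no target residue in this column
        pos += 1
        if t_ch == "-":
            if start is None:
                start = pos  # a run of missing residues begins here
        elif start is not None:
            ranges.append((start, pos - 1))  # the run ended one residue earlier
            start = None
    if start is not None:
        ranges.append((start, pos))  # the run extends to the C-terminus
    return ranges

# Truncated rows: first 29 alignment columns only.
template_prefix = "ASDQRGYKP------------GGHVGTSV"
target_prefix = "ASDQRGYKPEDVAFDESFFSFGGHVGTSV"
print(gap_ranges(template_prefix, target_prefix))
```

Each returned pair can then be passed to self.residue_range() as 'start:A' and 'end:A', if only the filled-in residues are to be selected.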
With either structureX:pdbid:1:A:+312:A:MOL_ID  1; or structureX:pdbid:1:A:+324:A:MOL_ID  1; as the template header in alignment.ali, I obtain very similar, more or less identical, modeled structures. Can anybody tell me whether the residue count in the template header of alignment.ali matters, or whether only the template sequence and its gaps matter? Any help would be much appreciated. Thank you.

bbmpresi＠gmail.com

Modeller Caretaker
