Missing Residues at the Start, As Well As in the Middle, of the Sequence of a Chain

28 May 2024


Dear Modeller Discussion Forum Members,
I am trying to repair chain B of RCSB PDB entry 5BS8, the structure of
DNA gyrase from Mycobacterium tuberculosis. For guidance I used the example
scripts for filling in missing residues given on the Modeller Wiki
(https://salilab.org/modeller/wiki/Missing_residues), the "basic-example"
tutorial on the main Modeller website, and a YouTube tutorial video.
Chain B is missing 2 residues at the start of the sequence associated with
the chain in the PDB file: S423 and N424. It then contains the sequence
"A(425)LVRRK(430)" (with atom records/coordinates), followed by a stretch of
6 missing residues, "S(431)ATDIG(436)". I used the first script given at the
above URL to generate a sequence file extracted from the PDB. I then used the
following as my alignment file, taking the target sequence from the NCBI
RefSeq entry (NP_214519.2) for Mycobacterium tuberculosis gyrB (DNA gyrase
subunit B):
>P1;5bs8
structure:5bs8.pdb:FIRST:B:LAST:B:DNA Gyrase:::
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ALVRRK------GLPGKLADCRSTDPRKSELYVVEGDSAGGSAKSGRDSMFQAILPLRGKIINVEKARIDRVLKNTEVQAIITALGTGIHDEFDIGKLRYHKIVLMADADVDGQHISTLLLTLLFRFMRPLIENGHVFLAQPPLYKLKWQRSDPEFAYSDRERDGLLEAGLKAGKKINKEDGIQRYKGLGEMDAKELWETTMDPSVRVLRQVTLDDAAAADELFSILMGEDVDARRSFITRNAKDVRFLDV*
>P1;5bs8B_fill
sequence:::::::::
MAAQKKKAQDEYGAASITILEGLEAVRKRPGMYIGSTGERGLHHLIWEVVDNAVDEAMAGYATTVNVVLLEDGGVEVADDGRGIPVATHASGIPTVDVVMTQLHAGGKFDSDAYAISGGLHGVGVSVVNALSTRLEVEIKRDGYEWSQVYEKSEPLGLKQGAPTKKTGSTVRFWADPAVFETTEYDFETVARRLQEMAFLNKGLTINLTDERVTQDEVVDEVVSDVAEAPKSASERAAESTAPHKVKSRTFHYPGGLVDFVKHINRTKNAIHSSIVDFSGKGTGHEVEIAMQWNAGYSESVHTFANTINTHEGGTHEEGFRSALTSVVNKYAKDRKLLKDKDPNLTGDDIREGLAAVISVKVSEPQFEGQTKTKLGNTEVKSFVQKVCNEQLTHWFEANPTDAKVVVNKAVSSAQARIAARKARELVRRKSATDIGGLPGKLADCRSTDPRKSELYVVEGDSAGGSAKSGRDSMFQAILPLRGKIINVEKARIDRVLKNTEVQAIITALGTGIHDEFDIGKLRYHKIVLMADADVDGQHISTLLLTLLFRFMRPLIENGHVFLAQPPLYKLKWQRSDPEFAYSDRERDGLLEAGLKAGKKINKEDGIQRYKGLGEMDAKELWETTMDPSVRVLRQVTLDDAAAADELFSILMGEDVDARRSFITRNAKDVRFLDV*
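As a sanity check on an alignment like the one above, a small stand-alone script (plain Python, no Modeller needed) can confirm that the two rows have the same number of columns and that every template residue sits under an identical target residue, which is what one would normally expect when a structure is aligned against its own full sequence for gap filling. This is only a sketch: `check_pir_pair` is a hypothetical helper name, and the rows below are a toy fragment built from residues mentioned in this post, not the full alignment.

```python
def check_pir_pair(template_row: str, target_row: str) -> list[int]:
    """Return the 1-based alignment columns where the template has a gap.

    Raises ValueError if the rows differ in length, or if a template
    residue is aligned with a different target residue (assumption: the
    target is the template's own sequence, so aligned letters must match).
    """
    template = template_row.rstrip("*")
    target = target_row.rstrip("*")
    if len(template) != len(target):
        raise ValueError(
            f"row lengths differ: template {len(template)}, target {len(target)}"
        )
    gap_columns = []
    for column, (t, s) in enumerate(zip(template, target), start=1):
        if t == "-":
            gap_columns.append(column)
        elif t != s:
            raise ValueError(f"column {column}: template {t!r} vs target {s!r}")
    return gap_columns


# Toy fragment: S423/N424 missing, ALVRRK present, SATDIG missing, GLPG present.
toy_template = "--ALVRRK------GLPG*"
toy_target = "SNALVRRKSATDIGGLPG*"
gap_columns = check_pir_pair(toy_template, toy_target)
print(gap_columns)
```

Run against the real rows, a check of this kind would report the first column at which the structure row and the target row disagree, if any.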
I used the following script to run AutoModel, modelling only the selected
residues:
from modeller import *
from modeller.automodel import *    # Load the AutoModel class
log.verbose()
env = Environ()
# directories for input atom files
env.io.atom_files_directory = ['.', '../atom_files']
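class MyModel(AutoModel):
    # Restrict optimisation to the residues returned here
    def select_atoms(self):
        return Selection(self.residue_range('431:B', '436:B'))
a = MyModel(env, alnfile = '5bs8_B-alignment.ali',
            knowns = '5bs8', sequence = '5bs8B_fill')
a.starting_model = 1
a.ending_model = 1
a.make()    # build the model; select_atoms() is called during this step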
This then raised the following error:
    return Selection(self.residue_range('431:B', '436:B'))
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files (x86)\Modeller10.5\modlib\modeller\coordinates.py", line 385, in residue_range
    start = self.residues[start]._num
            ~~~~~~~~~~~~~^^^^^^^
  File "C:\Program Files (x86)\Modeller10.5\modlib\modeller\coordinates.py", line 302, in __getitem__
    ret = modutil.handle_seq_indx(self, indx, self.mdl._indxres,
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files (x86)\Modeller10.5\modlib\modeller\util\modutil.py", line 24, in handle_seq_indx
    int_indx = lookup_func(*args)
               ^^^^^^^^^^^^^^^^^^
  File "C:\Program Files (x86)\Modeller10.5\modlib\modeller\coordinates.py", line 379, in _indxres
    self._report_bad_index(indx, suffix, "residue", 0)
  File "C:\Program Files (x86)\Modeller10.5\modlib\modeller\coordinates.py", line 372, in _report_bad_index
    raise KeyError("No such %s: %s" % (indxtyp, indx))
KeyError: 'No such residue: 431:B'
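One thing that may be worth checking here, stated as an assumption rather than a confirmed diagnosis: the Modeller documentation notes that the residue numbers and chain IDs given to `residue_range` inside `select_atoms` refer to the built model, not to the template PDB, and a model is by default numbered sequentially from 1. The sketch below is plain Python and does not call Modeller; `missing_ranges` is a hypothetical helper, the chain ID 'A' is an assumed default for the model, and the rows are a toy fragment. It derives model-numbered ranges for the stretches where the template row has gaps.

```python
def missing_ranges(template_row: str, target_row: str) -> list[tuple[int, int]]:
    """Return (start, end) ranges, in 1-based target/model numbering,
    of target residues that are aligned with gaps in the template."""
    template = template_row.rstrip("*")
    target = target_row.rstrip("*")
    if len(template) != len(target):
        raise ValueError("alignment rows must have the same length")

    ranges = []
    model_pos = 0      # running residue count in the target (model) sequence
    start = None       # start of the currently open missing stretch, if any
    for t, s in zip(template, target):
        if s == "-":
            # No model residue in this column; close any open stretch.
            if start is not None:
                ranges.append((start, model_pos))
                start = None
            continue
        model_pos += 1
        if t == "-":
            if start is None:
                start = model_pos
        elif start is not None:
            ranges.append((start, model_pos - 1))
            start = None
    if start is not None:
        ranges.append((start, model_pos))
    return ranges


# Toy fragment: S423/N424 missing, ALVRRK present, SATDIG missing, GLPG present.
toy_template = "--ALVRRK------GLPG*"
toy_target = "SNALVRRKSATDIGGLPG*"
model_chain = "A"  # assumption: chain ID assigned to the built model

ranges = missing_ranges(toy_template, toy_target)
selectors = [(f"{a}:{model_chain}", f"{b}:{model_chain}") for a, b in ranges]
print(selectors)
```

In this toy case the SATDIG stretch comes out as residues 9 to 14 of the model rather than 431 to 436. Inspecting the residue numbering and chain ID in a model built without `select_atoms` is probably the more reliable way to confirm what to pass.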
Next, I tried running it again after deleting the 424 "-"s that preceded the
sequence in the structure entry of the alignment file (>P1;5bs8,
structure:5bs8.pdb:FIRST:B:LAST:B:DNA Gyrase:::) and replacing them with 2
"-"s for S423 and N424, and then once more without these 2 leading "-"s. Both
times I got the same error:
(...... KeyError: 'No such residue: 431:B')
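When leading gap columns are removed from the structure row, the target row generally needs the same columns removed so that both rows keep the same length. A minimal sketch of trimming both rows together, assuming the aim is to model only the crystallised construct plus a fixed number of leading missing residues: `trim_alignment` and its `n_lead` parameter are hypothetical names, the flanking letters in the toy target are made up for illustration, and the script is plain Python that does not call Modeller.

```python
def trim_alignment(template_row: str, target_row: str, n_lead: int = 0) -> tuple[str, str]:
    """Cut both rows to the span covered by template residues,
    keeping up to n_lead extra columns before the first template residue."""
    template = template_row.rstrip("*")
    target = target_row.rstrip("*")
    if len(template) != len(target):
        raise ValueError("alignment rows must have the same length")

    residue_columns = [i for i, c in enumerate(template) if c != "-"]
    if not residue_columns:
        raise ValueError("template row contains no residues")

    start = max(0, residue_columns[0] - n_lead)
    end = residue_columns[-1] + 1
    # Slice both rows with the same bounds so they stay aligned.
    return template[start:end] + "*", target[start:end] + "*"


# Toy rows: 5 leading and 3 trailing columns with no template coordinates.
toy_template = "-----ALVRRK------GLPG---*"
toy_target = "AQKSNALVRRKSATDIGGLPGKLA*"

trimmed_template, trimmed_target = trim_alignment(toy_template, toy_target, n_lead=2)
print(trimmed_template)
print(trimmed_target)
```

Trailing columns after the last template residue are dropped here as a design choice; a different cutoff could be used if C-terminal residues should also be built.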
Please advise me on how to fill in missing residues for a chain that (a)
has coordinates only for a middle portion/domain of the full-length protein
sequence (because, say, only that portion/domain was crystallised and
subjected to X-ray crystallography) and (b) has missing residues at the start
of the chain (due to high B-factors, say) relative to the sequence associated
with the solved structure of that chain, as can be seen in PDB viewer software
such as UCSF Chimera (e.g. chain B of RCSB PDB 5BS8).
Thanks, and regards,
Siddhartha A. Barua, Ph.D.