
[modeller_usage] Building in missing residues from a known structure



I've looked through the mailing list archives and online documentation, but I'm having trouble working out the proper way to accomplish the following. I have two PDB files of a protein in different conformations. In one of them a short stretch of residues is missing, while in the other the complete coordinates for that region are known. Both PDBs have different missing residues throughout their chains, but for now I'm only interested in modeling the short stretch that is present in one and absent from the other.

My alignment file looks something like:

>P1;A
structureX:A:   1 :A:149 :A:::-1.00:-1.00
---------FIRIMVFAIYVALPIGV--------------------AGKSLPWYAVGASLIAANISAEQFIGGMS
SAYSIGLAIASYRWMSALTLIIVGKYFLPIFIEKGIYTIPEFVRKRSNKKLLTILAVFWISLYIFVNLTSVLYL*

>P1;B
structureX:B:1   :A:149  :A:::-1.00:-1.00
--------SFIRIMVFAIYVALPIGVGLWV-------------------SLPWYAVGASLIAANISAEQFIGGMS
SAYSIGLAIASYRWMSALTLIIVGKYFLPIFIEKGIYTIPEFVRKRS----LTILAVFWISLYIFVNLTSVLYL*


In B the sequence `NKKL` is missing. Because the two structures are in different conformations, I want to use the PDB for B as the initial model.
In each PDB file, the residue numbers follow the positions in the above sequence alignment, and the files only contain coordinates for the residues that are not marked with a `-`.


My script looks like:
-----------------------------------------------------
# Modeling using a provided initial structure file (inifile)
from modeller import *
from modeller.automodel import *    # Load the automodel class

log.verbose()
env = environ()

# directories for input atom files
env.io.atom_files_directory = ['.', '../structures']

class MyModel(automodel):
    def select_atoms(self):
        return selection(self.residue_range('123', '126'))


a = MyModel(env,
            alnfile='../sequences/alignments.ali',  # alignment filename
            knowns='A',                             # codes of the templates
            sequence='B',                           # code of the target
            inifile='B.pdb')                        # use 'my' initial structure
a.starting_model = 1                # index of the first model
a.ending_model = 1                  # index of the last model
a.make()                            # do homology modeling
-----------------------------------------------------

If I specify in my alignment file that each structure starts at residue 1, I get an error:

No atoms were read from the specified input PDB file, since the starting residue number and/or chain id in MODEL_SEGMENT (or the alignment file header) was not found; requested starting position: residue number " 1", chain " A"; atom file name:  ../structures/A.pdb
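
One quick way to see what this error is complaining about is to list the lowest and highest residue number actually present for each chain in the PDB file and compare them with the alignment header. The following is a minimal sketch in plain Python, not part of Modeller; `residue_span` is a hypothetical helper name, and it assumes standard fixed-column ATOM records (chain ID in column 22, residue number in columns 23-26).

```python
def residue_span(pdb_lines):
    """Return {chain_id: (lowest_resnum, highest_resnum)} from ATOM records."""
    spans = {}
    for line in pdb_lines:
        # Only ATOM records are considered; skip anything too short to parse.
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        chain = line[21]               # column 22: chain identifier
        resnum = int(line[22:26])      # columns 23-26: residue sequence number
        low, high = spans.get(chain, (resnum, resnum))
        spans[chain] = (min(low, resnum), max(high, resnum))
    return spans
```

For example, `residue_span(open('../structures/A.pdb'))` would report the range for chain A; if the lowest number is not 1, a header that requests residue 1 cannot be matched.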

However, if I change the starting residue to 10 in A and 9 in B, Modeller runs but builds a different set of residues than I intended. I think this is because it numbers the residues sequentially in the order they are read from the PDBs, ignoring the missing residues (I determined this by running the small script in FAQ #17).
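
To see how far the two numbering schemes drift apart, the alignment rows themselves can be used. Below is a minimal sketch in plain Python (no Modeller calls; `numbering_table` is a hypothetical helper name) that walks a gapped alignment row and pairs each residue's alignment column with its sequential index, counting only non-gap characters. It assumes, as suspected above, that the PDB numbering follows the alignment columns while the sequential numbering counts residues in the order they are read.

```python
def numbering_table(aligned_row):
    """Pair each residue in a gapped alignment row with both numberings.

    Returns a list of (column, sequential_index, residue) tuples, where
    column is the 1-based alignment column and sequential_index is the
    1-based count of non-gap residues seen so far.
    """
    table = []
    column = 0
    sequential = 0
    for char in aligned_row:
        if char == "*" or char.isspace():   # skip PIR terminator and whitespace
            continue
        column += 1
        if char == "-":                     # a gap uses a column but no residue
            continue
        sequential += 1
        table.append((column, sequential, char))
    return table


# Row A from the alignment above, joined across its two lines.
ROW_A = (
    "---------FIRIMVFAIYVALPIGV--------------------AGKSLPWYAVGASLIAANISAEQFIGGMS"
    "SAYSIGLAIASYRWMSALTLIIVGKYFLPIFIEKGIYTIPEFVRKRSNKKLLTILAVFWISLYIFVNLTSVLYL*"
)

table = numbering_table(ROW_A)
residues = "".join(res for _, _, res in table)
start = residues.index("NKKL")              # 0-based position in the ungapped sequence
for column, sequential, res in table[start:start + 4]:
    print(f"{res}: alignment column {column}, sequential index {sequential}")
```

If the model is indeed numbered sequentially, the second number printed for each residue is the one to compare against the range passed to `residue_range`.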

I've seen advice on renumbering residues on the mailing list, but I'm not sure if I'm missing something more basic. 

Any suggestions or pointers would be most appreciated.

Josh