I'm trying to model a protein that looks like a member of CIS_IPPS, with reasonable homology over a major portion of the sequence but with a significant insertion region that has weak homology to part of a different protein. I tried building it with the .py and .ali files listed below. The CIS_IPPS regions of the protein were built just fine, but the insertion region came out as nothing but a random coil. When I extracted the insertion region and modeled it on the corresponding extracted portion of the other protein, I got something reasonable. I could use a GUI to meld the two models together, but I'd rather have recommendations on how to get Modeller to do the build for me; I think it would be a much better starting structure than the result of a hand-meld. Thanks in advance!
Irene Newhouse
.py file:

from modeller import *              # Load standard Modeller classes
from modeller.automodel import *    # Load the automodel class

log.verbose()    # request verbose output

# Override the 'special_restraints' and 'user_after_single_model' methods:
class MyModel(automodel):
    def special_restraints(self, aln):
        # Constrain the A and B chains to be identical (but only restrain
        # the C-alpha atoms, to reduce the number of interatomic distances
        # that need to be calculated):
        s1 = selection(self.chains['A']).only_atom_types('CA')
        s2 = selection(self.chains['B']).only_atom_types('CA')
        self.restraints.symmetry.append(symmetry(s1, s2, 1.0))

    def user_after_single_model(self):
        # Report on symmetry violations greater than 1A after building
        # each model:
        self.restraints.symmetry.report(1.0)

env = environ()  # create a new MODELLER environment to build this model in

# directories for input atom files
env.io.atom_files_directory = ['.', '/home/newhoir/rubber/']

# read HETATMS in template
env.io.hetatm = True

# Be sure to use 'MyModel' rather than 'automodel' here!
a = MyModel(env,
            alnfile  = 'rub4b.ali',         # alignment filename
            knowns   = ('2x06', '2erx'),    # codes of the templates
            sequence = '1cp4',              # code of the target
            assess_methods=assess.GA341)    # request model assessment

a.starting_model = 1    # index of the first model
a.ending_model = 20     # index of the last model
                        # (determines how many models to calculate)
a.library_schedule = autosched.slow    # thorough VTFM opt
a.max_var_iterations = 300
a.md_level = refine.very_slow

a.make()    # do the actual homology modeling
The .ali file:

>P1;2x06
structureX:2x06:12:A:+474:B:PDB::0.00:0.00
-------------------------KLPAHG--CRHVAIIMDGNGRWAKKQGKIRAFGHKAGAKSVRRAVSFAANNGIEALTLYAFSSENWNRPAQEVSALMELFVWALDSEVKS---LHRHNVRLRIIGDTSRFNSRLQERIRKSEALTAGNTGLTLNIAANYGGRWDIVQGVR--------------------------------------------------------------QLAEKVQQGNLQPDQIDEEMLN-----------------------------------------------QHVCMHELAPVDLVIRTGGEHRISNFLLWQIAYAELYFTDVLWPDFDEQDFEGALNAFANRE----------.../
-------------------------KLPAHG--CRHVAIIMDGNGRWAKKQGKIRAFGHKAGAKSVRRAVSFAANNGIEALTLYAFSSENWNRPAQEVSALMELFVWALDSEVKS---LHRHNVRLRIIGDTSRFNSRLQERIRKSEALTAGNTGLTLNIAANYGGRWDIVQGVR--------------------------------------------------------------QLAEKVQQGNLQPDQIDEEMLN-----------------------------------------------QHVCMHELAPVDLVIRTGGEHRISNFLLWQIAYAELYFTDVLWPDFDEQDFEGALNAFANRE----------...*
>P1;2erx
structureX:2erx:16:A:+66:B:PDB::0.00:0.00
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------GGVGKSSLVLRFVKGTFRESYIPTVEDTYRQVI------------------------------------------------------------------------------------------------------/
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------GGVGKSSLVLRFVKGTFRESYIPTVEDTYRQVI------------------------------------------------------------------------------------------------------*
>P1;1cp4
sequence:1cp4:1:A:+758:B:::0.00:0.00
MGKHSSSRVSELFGNLGSFIRACIFRVLSMGPIPNHFAFIMDGNRRYAKKENMKKGAGHRAGFLALISILKYCYELGVKYVTIYAFSIDNFKRNPDEVKDLMDLMLEKIEELLRDESIVNQYGIRVYFIGNLKLLSEPVRIAAEKVMRATAKNTNCTLLICIAYTSRDEIVHAVQGSCKNKREDILPLSFCKANNGDIEEVEDDKKVHGVSPFVFSESQKDEAGESQATIASVTCSCLARGVEGGGNKNSMVVRAVRGSYEDKW-----DNYQAVMENRTGSGVTPSEENKNMQGECSIVKLVDIEKQMYMAVAPEPDILIRSSGESRLSNFLLWQSSECLLYSPDALWPEIGLWHLVWAVLNFQRNHSYLERKKHQL.../MGKHSSSRVSELFGNLGSFIRACIFRVLSMGPIPNHFAFIMDGNRRYAKKENMKKGAGHRAGFLALISILKYCYELGVKYVTIYAFSIDNFKRNPDEVKDLMDLMLEKIEELLRDESIVNQYGIRVYFIGNLKLLSEPVRIAAEKVMRATAKNTNCTLLICIAYTSRDEIVHAVQGSCKNKREDILPLSFCKANNGDIEEVEDDKKVHGVSPFVFSESQKDEAGESQATIASVTCSCLARGVEGGGNKNSMVVRAVRGSYEDKW-----DNYQAVMENRTGSGVTPSEENKNMQGECSIVKLVDIEKQMYMAVAPEPDILIRSSGESRLSNFLLWQSSECLLYSPDALWPEIGLWHLVWAVLNFQRNHSYLERKKHQL...*
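
With a two-template alignment like the one above, it can help to confirm which target residues each template actually covers before building. The following is a minimal sketch in plain Python. It is not part of Modeller: the helper names `parse_pir` and `covered_ranges` are hypothetical, and the toy alignment (`tmplA`, `tmplB`, `target`) is a made-up stand-in for rub4b.ali.

```python
# Toy PIR alignment (hypothetical data, standing in for the real file).
TOY_ALIGNMENT = """\
>P1;tmplA
structureX:tmplA:1:A:+6:A:::0.00:0.00
ACDE----GH*
>P1;tmplB
structureX:tmplB:1:A:+4:A:::0.00:0.00
----KLMN--*
>P1;target
sequence:target:1:A:+10:A:::0.00:0.00
ACDEKLMNGH*
"""


def parse_pir(text):
    """Return {code: aligned sequence} from PIR-format alignment text."""
    entries = {}
    code = None
    description_seen = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">P1;"):
            code = line[4:]
            entries[code] = ""
            description_seen = False
        elif code is not None and not description_seen:
            # The line after '>P1;code' is the structureX:/sequence: record.
            description_seen = True
        elif code is not None:
            # Sequence lines; the terminating '*' is dropped.
            entries[code] += line.rstrip("*")
    return entries


def covered_ranges(entries, target, template):
    """Return 1-based (start, end) target residue ranges aligned to a
    residue (not a gap) in the given template."""
    ranges = []
    resnum = 0      # running target residue number across all chains
    start = None    # start of the currently open covered range, if any
    for t_char, k_char in zip(entries[target], entries[template]):
        if t_char == "/":
            # Chain break: close any open range.
            if start is not None:
                ranges.append((start, resnum))
                start = None
            continue
        if t_char == "-":
            continue  # gap in the target, no residue to count
        resnum += 1
        if k_char not in "-/":
            if start is None:
                start = resnum
        elif start is not None:
            ranges.append((start, resnum - 1))
            start = None
    if start is not None:
        ranges.append((start, resnum))
    return ranges


entries = parse_pir(TOY_ALIGNMENT)
for template in ("tmplA", "tmplB"):
    print(template, covered_ranges(entries, "target", template))
```

To run it on a real file, pass the contents of the alignment (for example `open('rub4b.ali').read()`) to `parse_pir` and use the real codes, such as '1cp4' with '2x06' and '2erx'. Residue numbers are counted continuously across chains, so they will not match PDB numbering.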