Chain designation in completing a structure.
Hello modeller
I am attempting to "complete" a structure by inserting a missing loop and then doing loop refinement. I think the problem I am having is in setting up "alignment.ali" correctly. The protein has two chains, A and B. I am using the "alignment.ali" file shown below: chain C starts with residue 1, and chain D starts with residue 2 and ends with residue 1505, so chain D has 1504 residues. The Python script used to run Modeller is shown below the alignment.ali file.
>P1;2PPB_CD
structureX:2PPB_CD: 1 :C:1504:D:undefined:undefined:-1.00:-1.00
MEIKRFGRIREVIPLPPLTEIQVESYRRALQADVPPEKRENVGIQAAFRETFPIEEEDKGKGGLVLDFLEYRLGE
PPFPQDECREKDLTYQAPLYARLQLIHKDTGLIKEDEVFLGHIPLMTEDGSFIINGADRVIVSQIHRSPGVYFTP
DPARPGRYIASIIPLPKRGPWIDLEVEPNGVVSMKVNKRKFPLVLLLRVLGYDQETLARELGAYGELVQGLMDES
VFAMRPEEALIRLFTLLRPGDPPKRDKAVAYVYGLIADPRRYDLGEAGRYKAEEKLGIRLSGRTLARFEDGEFKD
EVFLPTLRYLFALTAGVPGHEVDDIDHLGNRRIRTVGELMTDQFRVGLARLARGVRERMLMGSEDSLTPAKLVNS
RPLEAAIREFFSRSQLSQFKDETNPLSSLRHKRRISALGPGGLTRERAGFDVRDVHRTHYGRICPVETPEGANIG
LITSLAAYARVDELGFIRTPYRRVVGGVVTDEVVYMTATEEDRYTIAQANTPLEGNRIAAERVVARRKGEPVIVS
PEEVEFMDVSPKQVFSVNTNLIPFLEHDDANRALMGSNMQTQAVPLIRAQAPVVMTGLEERVVRDSLAALYAEED
GEVAKVDGNRIVVRYEDGRLVEYPLRRFYRSNQGTALDQRPRVVVGQRVRKGDLLADGPASENGFLALGQNVLVA
IMPFDGYNFEDAIVISEELLKRDFYTSIHIERYEIEARDTKLGPERITRDIPHLSEAALRDLDEEGVVRIGAEVK
PGDILVGRTSFKGESEPTPEERLLRSIFGEKARDVKDTSLRVPPGEGGIVVRTVRLRRGDPGVELKPGVREVVRV
YVAQKRKLQVGDKLANRHGNKGVVAKILPVEDMPHLPDGTPVDVILNPLGVPSRMNLGQILETHLGLAGYFLGQR
YISPIFDGAKEPEIKELLAQAFEVYFGKRKGEGFGVDKREVEVLRRAEKLGLVTPGKTPEEQLKELFLQGKVVLY
DGRTGEPIEGPIVVGQMFIMKLYHMVEDKMHARSTGPYSLITQQPLGGKAQFGGQRFGEMEVWALEAYGAAHTLQ
EMLTLKSDDIEGRNAAYEAIIKGEDVPEPSVPESFRVLVKELQALALDVQTLDEKDNPVDIFEGLASKRKKEVR
KVRIALASPEKIRSWSYGEVEKPETINYRTLKPERDGLFDERIFGPIKDYECACGKYKRQRFEGKVCERCGVEVT
KSIVRRYRMGHIELATPAAHIWFVKDVPSKIGTLLDLSATELEQVLYFSKYIVLDPKGAILNGVPVEKRQLLTDE
EYRELRYGKQETYPLPPGVDALVKDGEEVVKGQELAPGVVSRLDGVALYRFASILVVKARVYPFEDDVEVSTGDR
VAPGDVLADGGKVKSDVYGRVEVDLVRNVVRVVESYDIDARMGAEAIQQLLKELDLEALEKELLEEMKHPSRARR
AKARKRLEVVRAFLDSGNRPEWMILEAVPVLPPDLRPMVQVDGGRFATSDLNDLYRRLINRNNRLKKLLAQGAPE
IIIRNEKRMLQEAVDALLDNGRRGAPVTNPGSDRPLRSLTDILSGKQGRFRQNLLGKRVDYSGRSVIVVGPQLKL
HQCGLPKRMALELFKPFLLKKMEEKGIAPNVKAARRMLERQRDIKDEVWDALEEVIHGKVVLLNRAPTLHRLGIQ
AFQPVLVEGQSIQLHPLVCEAFNADFDGDQMAVHVPLSSFAQAEARIQMLSAHNLLSPASGEPLAKPSRDIILGL
YYITQVRKEKKGAGLEFATPEEALAAHERGEVALNAPIKVAGRETSVGRLKYVFANPDEALLAVAHGIVDLQDVV
TVRYMGKRLETSPGRILFARIVAEAVEDEKVAWELIQLDVPQEKNSLKDLVYQAFLRLGMEKTARLLDALKYYGF
TFSTTSGITIGIDDAVIPEEKKQYLEEADRKLLQIEQAYEMGFLTDRERYDQILQLWTETTEKVTQAVFKNFEEN
YPFNPLYVMAQSGARGNPQQIRQLCGLRGLMQKPSGETFEVPVRSSFREGLTVLEYFISSHGARKGGADTALRTA
DSGYLTRKLVDVTHEIVVREADCGTTNYISVPLFQPDEVTRSLRLRKRADIEAGLYGRVLAREVEVLGVRLEEGR
YLSMDDVHLLIKAAEAGEIQEVPVRSPLTCQTRYGVCQKCYGYDLSMARPVSIGEAVGIVAAQSIGEPGTQLTMR
TFHT-------DITQGLPRVIELFEARRPKAKAVISEIDGVVRIEETEEKLSVFVESEGFSKEYKLPKEARLLVK
DGDYVEAGQPLTRGAIDPHQLLEAKGPEAVERYLVEEIQKVYRAQGVKLHDKHIEIVVRQMMKYVEVTDPGDSRL
LEGQVLEKWDVEALNERLIAEGKTPVAWKPLLMGVTKSALSTKSWLSAASFQNTTHVLTEAAIAGKKDELIGLKE
NVILGRLIPAGTGSDFVRFTQVVDQKTLKAIEEARKEAVEA*
>P1;2PPB_CD_fill
>Sequence:::::::::
MEIKRFGRIREVIPLPPLTEIQVESYRRALQADVPPEKRENVGIQAAFRETFPIEEEDKGKGGLVLDFLEYRLGE
PPFPQDECREKDLTYQAPLYARLQLIHKDTGLIKEDEVFLGHIPLMTEDGSFIINGADRVIVSQIHRSPGVYFTP
DPARPGRYIASIIPLPKRGPWIDLEVEPNGVVSMKVNKRKFPLVLLLRVLGYDQETLARELGAYGELVQGLMDES
VFAMRPEEALIRLFTLLRPGDPPKRDKAVAYVYGLIADPRRYDLGEAGRYKAEEKLGIRLSGRTLARFEDGEFKD
EVFLPTLRYLFALTAGVPGHEVDDIDHLGNRRIRTVGELMTDQFRVGLARLARGVRERMLMGSEDSLTPAKLVNS
RPLEAAIREFFSRSQLSQFKDETNPLSSLRHKRRISALGPGGLTRERAGFDVRDVHRTHYGRICPVETPEGANIG
LITSLAAYARVDELGFIRTPYRRVVGGVVTDEVVYMTATEEDRYTIAQANTPLEGNRIAAERVVARRKGEPVIVS
PEEVEFMDVSPKQVFSVNTNLIPFLEHDDANRALMGSNMQTQAVPLIRAQAPVVMTGLEERVVRDSLAALYAEED
GEVAKVDGNRIVVRYEDGRLVEYPLRRFYRSNQGTALDQRPRVVVGQRVRKGDLLADGPASENGFLALGQNVLVA
IMPFDGYNFEDAIVISEELLKRDFYTSIHIERYEIEARDTKLGPERITRDIPHLSEAALRDLDEEGVVRIGAEVK
PGDILVGRTSFKGESEPTPEERLLRSIFGEKARDVKDTSLRVPPGEGGIVVRTVRLRRGDPGVELKPGVREVVRV
YVAQKRKLQVGDKLANRHGNKGVVAKILPVEDMPHLPDGTPVDVILNPLGVPSRMNLGQILETHLGLAGYFLGQR
YISPIFDGAKEPEIKELLAQAFEVYFGKRKGEGFGVDKREVEVLRRAEKLGLVTPGKTPEEQLKELFLQGKVVLY
DGRTGEPIEGPIVVGQMFIMKLYHMVEDKMHARSTGPYSLITQQPLGGKAQFGGQRFGEMEVWALEAYGAAHTLQ
EMLTLKSDDIEGRNAAYEAIIKGEDVPEPSVPESFRVLVKELQALALDVQTLDEKDNPVDIFEGLASKRKKEVR
KVRIALASPEKIRSWSYGEVEKPETINYRTLKPERDGLFDERIFGPIKDYECACGKYKRQRFEGKVCERCGVEVT
KSIVRRYRMGHIELATPAAHIWFVKDVPSKIGTLLDLSATELEQVLYFSKYIVLDPKGAILNGVPVEKRQLLTDE
EYRELRYGKQETYPLPPGVDALVKDGEEVVKGQELAPGVVSRLDGVALYRFASILVVKARVYPFEDDVEVSTGDR
VAPGDVLADGGKVKSDVYGRVEVDLVRNVVRVVESYDIDARMGAEAIQQLLKELDLEALEKELLEEMKHPSRARR
AKARKRLEVVRAFLDSGNRPEWMILEAVPVLPPDLRPMVQVDGGRFATSDLNDLYRRLINRNNRLKKLLAQGAPE
IIIRNEKRMLQEAVDALLDNGRRGAPVTNPGSDRPLRSLTDILSGKQGRFRQNLLGKRVDYSGRSVIVVGPQLKL
HQCGLPKRMALELFKPFLLKKMEEKGIAPNVKAARRMLERQRDIKDEVWDALEEVIHGKVVLLNRAPTLHRLGIQ
AFQPVLVEGQSIQLHPLVCEAFNADFDGDQMAVHVPLSSFAQAEARIQMLSAHNLLSPASGEPLAKPSRDIILGL
YYITQVRKEKKGAGLEFATPEEALAAHERGEVALNAPIKVAGRETSVGRLKYVFANPDEALLAVAHGIVDLQDVV
TVRYMGKRLETSPGRILFARIVAEAVEDEKVAWELIQLDVPQEKNSLKDLVYQAFLRLGMEKTARLLDALKYYGF
TFSTTSGITIGIDDAVIPEEKKQYLEEADRKLLQIEQAYEMGFLTDRERYDQILQLWTETTEKVTQAVFKNFEEN
YPFNPLYVMAQSGARGNPQQIRQLCGLRGLMQKPSGETFEVPVRSSFREGLTVLEYFISSHGARKGGADTALRTA
DSGYLTRKLVDVTHEIVVREADCGTTNYISVPLFQPDEVTRSLRLRKRADIEAGLYGRVLAREVEVLGVRLEEGR
YLSMDDVHLLIKAAEAGEIQEVPVRSPLTCQTRYGVCQKCYGYDLSMARPVSIGEAVGIVAAQSIGEPGTQLTMR
TFHTGGVAGAADITQGLPRVIELFEARRPKAKAVISEIDGVVRIEETEEKLSVFVESEGFSKEYKLPKEARLLVK
DGDYVEAGQPLTRGAIDPHQLLEAKGPEAVERYLVEEIQKVYRAQGVKLHDKHIEIVVRQMMKYVEVTDPGDSRL
LEGQVLEKWDVEALNERLIAEGKTPVAWKPLLMGVTKSALSTKSWLSAASFQNTTHVLTEAAIAGKKDELIGLKE
NVILGRLIPAGTGSDFVRFTQVVDQKTLKAIEEARKEAVEA*
----------------------------------------------------------------------------
The python script is:
from modeller import *
from modeller.automodel import *    # Load the automodel class

log.verbose()
env = environ()

# directories for input atom files
env.io.atom_files_directory = './:../atom_files'

class MyModel(automodel):
    def select_atoms(self):
        return selection(self.residue_range('1244:D', '1250:D'))

a = MyModel(env, alnfile = 'alignment.ali',
            knowns = '2PPB_CD', sequence = '2PPB_CD_fill')
a.starting_model = 1
a.ending_model = 1
#a.loop.starting_model = 1
#a.loop.ending_model = 1
#a.loop.md_level = refine.fast
a.make()
----------------------------------------------------------------------------
This procedure fails with the error printed to the screen:
........
.........
KeyError: 'No such residue: 1244:D'
Am I designating the residue and chain correctly in the Python script, or is it the alignment.ali file that is causing the problem? I have tried several things and searched the archives, but have not found anything directly related, though I may have missed it.
Any help would be greatly appreciated.
Thanks, Steve
On 12/10/09 10:56 AM, Steve Seibold wrote:
> I am attempting to “complete” a structure by placing in a loop and doing
> loop refinement. The problem I am having is setting up the
> “alignment.ali” correctly…I think? The protein has two chains, A and B.
> I am using the “alignment.ali” file shown below: Chain C starts with
> “1”, Chain D starts with “2”, and ends with 1505. So, Chain D has 1504
> residues. The “python script” used for running Modeller is shown below
> the alignment.ali file.
>
>>P1;2PPB_CD
>
> structureX:2PPB_CD: 1 :C:1504:D:undefined:undefined:-1.00:-1.00
...
> return selection(self.residue_range('1244:D', '1250:D'))
When you make a selection, it uses the numbering of the target (model), not the template. For consistency, Modeller always numbers the target residues starting from 1, and the chains starting at A. So while your template may have chains C and D, the model will have chains A and B, which is why '1244:D' is not found. You need to modify that selection to use the model numbering or, alternatively, renumber the model residues to your own preference (see http://salilab.org/archives/modeller_usage/2009/msg00245.html).
Either of these changes is made only in the Python script, though; your alignment file looks fine to me.
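One way to find the loop in model numbering is to count positions along the target sequence of the alignment. The sketch below is plain Python and does not call Modeller. The function name gap_ranges_in_model_numbering is hypothetical, and it assumes the model residues are numbered sequentially from 1 along the target sequence, with no chain-break character in the alignment.

```python
def gap_ranges_in_model_numbering(template_aln, target_aln):
    """Return 1-based (start, end) target positions for each run of
    target residues that are aligned to gaps ('-') in the template.

    Assumption: the model is numbered 1, 2, 3, ... along the target
    sequence, so a position in the target is its model residue number.
    """
    if len(template_aln) != len(target_aln):
        raise ValueError("aligned sequences must have the same length")

    ranges = []
    pos = 0            # residues of the target seen so far
    run_start = None   # start of the current run of template gaps

    for t_char, m_char in zip(template_aln, target_aln):
        if m_char == '-':
            # No target residue in this column, so no model residue either.
            continue
        pos += 1
        if t_char == '-':
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            ranges.append((run_start, pos - 1))
            run_start = None

    if run_start is not None:
        ranges.append((run_start, pos))
    return ranges


# Toy fragment around the missing loop in the alignment above.
print(gap_ranges_in_model_numbering("TFHT-------DITQ", "TFHTGGVAGAADITQ"))
```

Run on the two full aligned sequences (with line breaks and the trailing '*' removed), this would give the residue numbers to pass to residue_range in place of '1244:D' and '1250:D', together with whatever chain ID the model actually carries.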
Ben Webb, Modeller Caretaker