Building multi-chain models
Good day,
I would like to build a GABA A receptor with two alpha2 subunits (chains A and D), two beta2 subunits (chains B and E) and one gamma subunit (chain C). I created an .ali file with the template (6HUG) and the target sequences, and received the error: 'There should be 10 fields separated by colons, : This line actually contains 15 fields.' I don't understand how to change the line correctly when I have 5 chains. Could you please advise how to write this 10-field line? I have already read the manual at https://salilab.org/modeller/manual/node501.html#alignmentformat, but it only has information about 2 chains, and I still cannot work out how to write the line for 5 chains.
The alignment for my template looks like this:

>P1;6HUG
structureX:6HUG:FIRST:A 437:A 473:B 495:C 437:D 473:E:::3.5:-1.00
------------------------------DYKDDD----DKQPSLQDEL---------K
DNTTVFTRILDRLLDGYDNRL---------------RPG----LGERVTEVKTD-IFVTS
FGPVSDHDMEYTIDVFFRQSWKDERLKFKGPMTVLRLNNLMASKIWTPDTFFHNGKKSVA
HNMTMPNKLLRITEDGTLLYTMRLTV----RAECPMHLEDFPMDAHACPLKFGSYAYTRA
EVVYEWTREPARSVVVAEDGSRLNQYDLLGQTVDSGIVQSSTGEYVVMTTHFHLKRKIGY
FVIQTYLPC---------------IMTVILSQVSFWLNRE-SVPARTV-FGVTTVLTMTT
LSISA----RNSLPKVAYATAMDWFIAVCYAFVFSALIEFATVNYFTKRGYAWDGKSVVP
EKPKKVKDPLIKKNN---TY--------------APTA--------TSYT----------
-----------------------------PNLARGD------------------------
---------------PGLATIAKSATIEPKEVK----------------------PETKP
---------PEPKKTFNSVSKIDRLSRIAFPLLFGIFNLVYWAT-----YLNREPQLKAP
TPHQ----------/
-----MCSGLLEL-------------LLPIWLSWTLGTRGSEPRSV-----------NDP
GNMSFVKETVDKLLKGYDIRL---------------RPD----FGGPPVCVGMN-IDIAS
IDMVSEVNMDYTLTMYFQQYWRDKRLAYSGIPLNLTLDNRVADQLWVPDTYFLNDKKSFV
HGVTVKNRMIRLHPDGTVLYGLRITT----TAACMMDLRRYPLDEQNCTLEIESYGYTTD
DIEFYWRGGDKAV--TGVERIELPQFSIVEHRLVSRNVVFATGAYPRLSLSFRLKRNIGY
FILQTYMPS---------------ILITILSWVSFWINYD-ASAARVA-LGITTVLTMTT
INTHL----RETLPKIPYVKAIDMYLMGCFVFVFLALLEYAFVNYIFFGRGPQRQKKLAE
KTAK-AKNDRSKSES--------------------------------------NRVDAHG
NILLTSL--E-VHNE---------MNE--VSGGIGD------------------------
---------------TRNSAISFDNSGI-QYRK-----QSMPREGHGRFLGDRSLPHKKT
HLRRRSSQLKIKIPDLTDVNAIDRWSRIVFPFTFSLFNLVYWLY-----YVN--------
--------------/
MSSPNIWSTGSSVYSTPVFSQKMTVWILLLLSLYPGFTSQKSDDDYEDYASNKTWVLTPK
VPEGDVTVILNNLLEGYDNKL---------------RPD----IGVKPTLIHTD-MYVNS
IGPVNAINMEYTIDIFFAQTWYDRRLKFNSTIKVLRLNSNMVGKIWIPDTFFRNSKKADA
HWITTPNRMLRIWNDGRVLYTLRLTI----DAECQLQLHNFPMDEHSCPLEFSSYGYPRE
EIVYQWKRSSVEV--GDTRSWRLYQFSFVGLRNTTEVVKTTSGDYVVMSVYFDLSRRMGY
FTIQTYIPC---------------TLIVVLSWVSFWINKD-AVPARTS-LGITTVLTMTT
LSTIA----RKSLPKVSYVTAMDLFVSVCFIFVFSALVEYGTLHYFVSNRKPS------K
DKDKKKKNPLLRMFS---FK--------------APTI--------D-I-----------
-----------------------------------R------------------------
---------------PRSATIQMNNATHLQERDEEYGYECLDGKDCASFFCCFEDCRTGA
---------WRHGRIHIRIAKMDSYARIFFPTAFCLFNLVYWVS-----YLYLGGSGGSG
GSGKTETSQVAPA-/
------------------------------DYKDDD----DKQPSLQDEL---------K
DNTTVFTRILDRLLDGYDNRL---------------RPG----LGERVTEVKTD-IFVTS
FGPVSDHDMEYTIDVFFRQSWKDERLKFKGPMTVLRLNNLMASKIWTPDTFFHNGKKSVA
HNMTMPNKLLRITEDGTLLYTMRLTV----RAECPMHLEDFPMDAHACPLKFGSYAYTRA
EVVYEWTREPARSVVVAEDGSRLNQYDLLGQTVDSGIVQSSTGEYVVMTTHFHLKRKIGY
FVIQTYLPC---------------IMTVILSQVSFWLNRE-SVPARTV-FGVTTVLTMTT
LSISA----RNSLPKVAYATAMDWFIAVCYAFVFSALIEFATVNYFTKRGYAWDGKSVVP
EKPKKVKDPLIKKNN---TY--------------APTA--------TSYT----------
-----------------------------PNLARGD------------------------
---------------PGLATIAKSATIEPKEVK----------------------PETKP
---------PEPKKTFNSVSKIDRLSRIAFPLLFGIFNLVYWAT-----YLNREPQLKAP
TPHQ----------/
-----MCSGLLEL-------------LLPIWLSWTLGTRGSEPRSV-----------NDP
GNMSFVKETVDKLLKGYDIRL---------------RPD----FGGPPVCVGMN-IDIAS
IDMVSEVNMDYTLTMYFQQYWRDKRLAYSGIPLNLTLDNRVADQLWVPDTYFLNDKKSFV
HGVTVKNRMIRLHPDGTVLYGLRITT----TAACMMDLRRYPLDEQNCTLEIESYGYTTD
DIEFYWRGGDKAV--TGVERIELPQFSIVEHRLVSRNVVFATGAYPRLSLSFRLKRNIGY
FILQTYMPS---------------ILITILSWVSFWINYD-ASAARVA-LGITTVLTMTT
INTHL----RETLPKIPYVKAIDMYLMGCFVFVFLALLEYAFVNYIFFGRGPQRQKKLAE
KTAK-AKNDRSKSES--------------------------------------NRVDAHG
NILLTSL--E-VHNE---------MNE--VSGGIGD------------------------
---------------TRNSAISFDNSGI-QYRK-----QSMPREGHGRFLGDRSLPHKKT
HLRRRSSQLKIKIPDLTDVNAIDRWSRIVFPFTFSLFNLVYWLY-----YVN--------
--------------*
Thank you for your help.
On 6/6/24 4:56 AM, alina.s123567--- via modeller_usage wrote:
> Could you please advise, how I should correctly write this 10 field
> line. I have already read the manual
> https://salilab.org/modeller/manual/node501.html#alignmentformat, but
> there is information about 2 chains. I still can not understand how
> correctly I should write it for 5 chains.
...
> The alignment for my template is look like:
> >P1;6HUG
> structureX:6HUG:FIRST:A 437:A 473:B 495:C 437:D 473:E:::3.5:-1.00
Modeller is quite simplistic here and can only read a single range of residues from the file. It reads the PDB or mmCIF file line by line, starting with the first residue you specify and ending with the last one. So you can't give it multiple ranges as you have here, i.e. FIRST:A 437:A 473:B 495:C 437:D 473:E
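To make the 10-field rule concrete, here is a minimal sketch in plain Python (not part of Modeller) that splits a header line on colons and checks the field count before the file is handed to Modeller. The function name parse_pir_header and the field labels are hypothetical names chosen for illustration; the field order is assumed to follow the alignment format section of the manual.

```python
# Hypothetical labels for the 10 colon-separated header fields,
# assumed to follow the order given in the Modeller manual.
PIR_FIELDS = (
    "type", "code", "first_residue", "first_chain",
    "last_residue", "last_chain", "protein_name", "source",
    "resolution", "r_factor",
)


def parse_pir_header(line):
    """Split a PIR header line on colons; raise if it is not 10 fields."""
    fields = [f.strip() for f in line.strip().split(":")]
    if len(fields) != len(PIR_FIELDS):
        raise ValueError(
            f"expected {len(PIR_FIELDS)} fields, got {len(fields)}"
        )
    return dict(zip(PIR_FIELDS, fields))


# A header with one start (residue, chain) pair and one end pair.
header = parse_pir_header("structureX:6HUG:FIRST:A:473:E:::3.5:-1.00")
print(header["first_residue"], header["first_chain"],
      header["last_residue"], header["last_chain"])

# A header listing one range per chain has too many fields and is rejected.
try:
    parse_pir_header(
        "structureX:6HUG:FIRST:A 437:A 473:B 495:C 437:D 473:E:::3.5:-1.00"
    )
except ValueError as err:
    print("rejected:", err)
```

The check only counts fields. It does not verify that the residues or chains exist in the structure file.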
What might work instead, if the chains are organized alphabetically in the file, would be something like
structureX:6HUG:FIRST:A:473:E:::3.5:-1.00
i.e. read starting from the first residue in chain A until residue 473 in chain E. This won't work though if there are C-terminal or N-terminal residues in the intervening chains that you don't want in the model - there is no way to tell Modeller not to read them from the PDB file. In this case, either include the extra residues in your template sequence in the alignment file (and align them with gaps in the model, so they don't get used) or edit the PDB file and delete the residues you don't want.
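Both checks suggested above (chain order in the file, and deleting unwanted residues) can be scripted. The following is a rough sketch in plain Python, assuming a PDB-format file (not mmCIF) with the chain ID in column 22 and the residue number in columns 23-26; insertion codes are ignored. The function names chain_order and drop_residues are hypothetical, and the toy records and residue numbers are invented for illustration, not taken from 6HUG.

```python
def chain_order(pdb_lines):
    """Return chain IDs in order of first appearance in ATOM/HETATM records."""
    seen = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            chain = line[21]  # PDB column 22
            if chain not in seen:
                seen.append(chain)
    return seen


def drop_residues(pdb_lines, unwanted):
    """Remove ATOM/HETATM records whose (chain, residue number) is unwanted."""
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            key = (line[21], int(line[22:26]))  # PDB columns 22 and 23-26
            if key in unwanted:
                continue
        kept.append(line)
    return kept


def toy_atom(serial, chain, resseq):
    """Build a minimal ATOM record with correct column positions (toy data)."""
    return f"ATOM  {serial:5d}  CA  ALA {chain}{resseq:4d}"


# Invented example: two chains, with a trailing residue in chain A to delete.
lines = [
    toy_atom(1, "A", 1),
    toy_atom(2, "A", 2),
    toy_atom(3, "A", 3),
    toy_atom(4, "B", 1),
    toy_atom(5, "B", 2),
]

print(chain_order(lines))
trimmed = drop_residues(lines, unwanted={("A", 3)})
print(len(lines), "->", len(trimmed), "records")
```

With a real file, the lines would come from open(path).read().splitlines(), and the trimmed lines would be written to a new PDB file that the alignment header then refers to. The template sequence in the alignment must match whatever residues remain in that file.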
Ben Webb, Modeller Caretaker