Hello,
I am modelling a pentameric protein based on a template with missing residues. The template's FASTA sequence looks like aaaaabbbbbdddddeeeee: the original protein has a long disordered region, "ccccc", between "b" and "d", but it is not modelled, so the .pdb file and 3D model of the protein show "missing residues" between "b" and "d".
The original sequence of the target protein is: fffffggggghhhhhiiiiijjjjj except "iiiii" is a disordered region which aligns with "ccccc" in the template. If I remove "iiiii" from the sequence before aligning with the above template sequence (without the disordered region), "h" and "j" will be interpreted as continuous in the .pdb file of the target generated, affecting the structure of the protein. How would I make PDB interpret a gap between "h" and "j" that aligns with the gap between "b" and "d" in the template?
With thanks and kind regards, Amanda
Another note: both the template and target sequences are single chains.
On 3/12/24 11:14 AM, hmad3--- via modeller_usage wrote:
> If I remove "iiiii" from the sequence before aligning with the above
> template sequence (without the disordered region), "h" and "j" will
> be interpreted as continuous in the .pdb file of the target
> generated, affecting the structure of the protein. How would I make
> PDB interpret a gap between "h" and "j" that aligns with the gap
> between "b" and "d" in the template?
I think by "gap" here you mean "chain break", i.e. you don't want a peptide bond between the last residue in "h" and the first residue in "j". This is straightforward: add a chain break character ("/") between h and j in your target sequence. This simply tells Modeller to not construct that peptide bond. Normally one would align a chain break in the target with a chain break or a gap ("-") in the template.
> Another note: both the template and target sequences are single chains.
Perhaps you mean all residues have the same chain ID. The target will be two chains if you add a chain break. (Modeller will by default label them A and B, but you can call them both A if you like.)
Ben Webb, Modeller Caretaker
Thank you very much for your reply.
>I think by "gap" here you mean "chain break", i.e. you don't want a
>peptide bond between the last residue in "h" and the first residue in
>"j". This is straightforward: add a chain break character ("/") between
>h and j in your target sequence. This simply tells Modeller to not
>construct that peptide bond. Normally one would align a chain break in
>the target with a chain break or a gap ("-") in the template.
Yes, that is correct. But would I also add a chain break character between "b" and "d" in the template?
>Perhaps you mean all residues have the same chain ID. The target will be
>two chains if you add a chain break. (Modeller will by default label
>them A and B, but you can call them both A if you like.)
I am modelling a homopentamer. "aaaaabbbbb[ccccc]dddddeeeee" is the sequence for the monomer making up the template, and "fffffggggghhhhh[iiiii]jjjjj" is for a monomer of the target (which is also a pentamer). The .ali file is currently of the form:
>P1;target
sequence:target::A::E::::
fffffggggghhhhhjjjjj/fffffggggghhhhhjjjjj/fffffggggghhhhhjjjjj/fffffggggghhhhhjjjjj/fffffggggghhhhhjjjjj*
>P1;template
structureX:Mus-5HT3RA-2-clean::A::E::::
aaaaabbbbbdddddeeeee/aaaaabbbbbdddddeeeee/aaaaabbbbbdddddeeeee/aaaaabbbbbdddddeeeee/aaaaabbbbbdddddeeeee*
Another note: the template monomer sequence is also missing the first few N-terminal residues of the precursor protein - i.e. the full sequence is zzzzzaaaaabbbbbcccccdddddeeeee, with zzzzz and ccccc not being included in the .pdb file, so that the N-terminal-most "a" is marked as residue 6, not 1. Likewise, the full target monomer sequence is yyyyyfffffggggghhhhhiiiiijjjjj, with yyyyy aligning with zzzzz. I deleted yyyyy, and in the .pdb file for the target, the N-terminal-most "f" is marked as residue 1. Does it matter - should I try to make the first "f" marked as residue 6?
I look forward to your advice.
On 3/12/24 3:33 PM, hmad3--- via modeller_usage wrote:
>> add a chain break character ("/") between h and j in your target
>> sequence. This simply tells Modeller to not construct that peptide
>> bond. Normally one would align a chain break in the target with a
>> chain break or a gap ("-") in the template.
>
> Yes, that is correct. But would I also add a chain break character
> between "b" and "d" in the template?
Yes, or a gap, it makes no difference.
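As a rough illustration of that rule, the sketch below checks two aligned rows and reports any column where a target chain break is paired with a residue rather than with "/" or "-" in the template. The function name misaligned_breaks and the short toy rows are hypothetical, made up for this example; it is plain Python, not part of Modeller, and it only flags columns that differ from the usual arrangement described above.

```python
def misaligned_breaks(target_row, template_row):
    """Return the 0-based columns where a target chain break ("/") is
    aligned with something other than "/" or "-" in the template."""
    if len(target_row) != len(template_row):
        raise ValueError("aligned rows must have the same length")
    return [
        col
        for col, (t, s) in enumerate(zip(target_row, template_row))
        if t == "/" and s not in ("/", "-")
    ]


# Toy rows only; the letters are placeholders, as in the thread.
print(misaligned_breaks("hhhhh/jjjjj", "bbbbb/ddddd"))  # break vs. break -> []
print(misaligned_breaks("hhhhh/jjjjj", "bbbbb-ddddd"))  # break vs. gap   -> []
print(misaligned_breaks("hhhhh/jjjjj", "bbbbbdddddd"))  # break vs. "d"   -> [5]
```

An empty list means every chain break in the target row sits over a chain break or a gap in the template row.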
> Another note: the template monomer sequence is also missing the first
> few N-terminal residues of the precursor protein - i.e. the full
> sequence is zzzzzaaaaabbbbbcccccdddddeeeee, with zzzzz and ccccc not
> being included in the .pdb file, so that the N-terminal-most "a" is
> marked as residue 6, not 1. Likewise, the full target monomer
> sequence is yyyyyfffffggggghhhhhiiiiijjjjj, with yyyyy aligning with
> zzzzz. I deleted yyyyy, and in the .pdb file for the target, the
> N-terminal-most "f" is marked as residue 1. Does it matter - should I
> try to make the first "f" marked as residue 6?
You can model whatever target sequence you like. If it's aligned, Modeller will use information from the template; otherwise, it will be modeled using just the CHARMM forcefield. Modeller also doesn't care about the residue numbering; that is just for the human's benefit. If you want a different numbering, use rename_segments() as per https://salilab.org/modeller/10.5/manual/node30.html
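If a different numbering is wanted, a small helper can work out one chain label and one starting residue number per segment. This is only a sketch: segment_numbering is a hypothetical name, the starts 6 and 26 come from the placeholder lengths in the question (yyyyy would be 1-5 and iiiii 21-25), and it assumes each piece separated by "/" counts as its own segment. As I read the manual page linked above, the two lists would then go to rename_segments(segment_ids=..., renumber_residues=...) from a special_patches override in an AutoModel subclass.

```python
def segment_numbering(n_subunits, segment_starts,
                      chain_ids="ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    """Build per-segment chain labels and first-residue numbers.

    Every subunit gets one chain label, shared by all of its segments,
    and each segment restarts numbering at the given start value.
    """
    segment_ids = []
    renumber_residues = []
    for subunit in range(n_subunits):
        for start in segment_starts:
            segment_ids.append(chain_ids[subunit])
            renumber_residues.append(start)
    return segment_ids, renumber_residues


# Pentamer with one internal chain break per monomer: 10 segments in total.
# 6 and 26 are derived from the placeholder sequences, not from real data.
ids, starts = segment_numbering(5, [6, 26])
print(ids)     # ['A', 'A', 'B', 'B', 'C', 'C', 'D', 'D', 'E', 'E']
print(starts)  # [6, 26, 6, 26, 6, 26, 6, 26, 6, 26]
```

Giving both segments of a monomer the same label follows the earlier remark that the two pieces can both be called A.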
Ben Webb, Modeller Caretaker