Mismatch between residues of alignment and pdb
I downloaded the pdb_95.pir files, updated on May 14th 2020, from here https://salilab.org/modeller/downloads/pdb_95.pir.gz. I used the tutorial https://salilab.org/modeller/tutorial/basic.html to model one chain of a protein. I downloaded the pdb structure from RCSB-PDB. No changes done from my side in either files. Firstly, I was receiving an error " No atoms were read from the specified input PDB file, since the starting residue number and/or chain id in MODEL_SEGMENT (or the alignment file header) was not found; requested starting position: residue number " 1", chain " "; atom file name: 1r42A.pdb".
I manually changed the chain ID from blank to A, and another error appeared stating that the initial residue did not match, so I changed the initial residue from 1 to 19 and that error was resolved. Now there is another error that I cannot resolve. After running model_single.py, I get the following error: "_modeller.ModellerError: read_te_290E> Number of residues in the alignment and pdb files are different: 596 655 For alignment entry: 1 1r42A". Ideally the pdb_95.pir file should have the same info as pdbs, yet this error appears. I am attaching the alignment section for the structure in question; the PDB is 1r42. Could anyone help me work out what exactly the issue is? I have been working on this for days. The alignment file has 596 residues, and the PDB file should have the same number because it has not been tampered with. Where is it going wrong? How can this be resolved?
The alignment section of stated structure -
>P1;1r42A
structure:1r42A: 19:A: 596: :::-1.00:-1.00
------------------STIEEQAKTFLDKFNHEAEDLFYQSSLASWNYNTNITEENVQNMNNAGDKWSAFLKE
QSTLAQMYPLQEIQNLTVKLQLQALQQNGSSVLSEDKSKRLNTILNTMSTIYSTGKVCNPDNPQECLLLEPGLNE
IMANSLDYNERLWAWESWRSEVGKQLRPLYEEYVVLKNEMARANHYEDYGDYWRGDYEVNGVDGYDYSRGQLIED
VEHTFEEIKPLYEHLHAYVRAKLMNAYPSYISPIGCLPAHLLGDMWGRFWTNLYSLTVPFGQKPNIDVTDAMVDQ
AWDAQRIFKEAEKFFVSVGLPNMTQGFWENSMLTDPGNVQKAVCHPTAWDLGKGDFRILMCTKVTMDDFLTAHHE
MGHIQYDMAYAAQPFLLRNGANEGFHEAVGEIMSLSAATPKHLKSIGLLSPDFQEDNETEINFLLKQALTIVGTL
PFTYMLEKWRWMVFKGEIPKDQWMKKWWEMKREIVGVVEPVPHDETYCDPASLFHVSNDYSFIRYYTRTLYQFQF
QEALCQAAKHEGPLHKCDISNSTEAGQKLFNMLRLGKSEPWTLALENVVGAKNMNVRPLLNYFEPLFTWLKDQNK
NSFVGWSTDWSPYA-------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
---------------------------------------------------------------------------
------------------------------------------------------------------------*
Kindly help.
Thanks and Warm Regards Sunidhi
Center for Computational Biology
Indraprastha Institute of Information Technology (IIIT https://www.iiitd.ac.in/)
New Delhi 110020 India
On 10/30/20 1:01 AM, Sunidhi wrote:
> I downloaded the pdb_95.pir files, updated on May 14th 2020, from here
> https://salilab.org/modeller/downloads/pdb_95.pir.gz. I used the
> tutorial https://salilab.org/modeller/tutorial/basic.html to model one
> chain of a protein. I downloaded the pdb structure from RCSB-PDB. No
> changes done from my side in either files. Firstly, I was receiving an
> error " No atoms were read from the specified input PDB file, since the
> starting residue number and/or chain id in MODEL_SEGMENT (or the
> alignment file header) was not found; requested starting position:
> residue number " 1", chain " "; atom file name: 1r42A.pdb".
I can't see how this could be possible because pdb_95.pir contains the following header for 1r42A:
>P1;1r42A
structureX:1r42:19:A:615:A:[long name]:[long source]: 2.20: 0.23
This will instruct Modeller to read ATOM/HETATM records from the 1r42 structure file, starting at residue 19 in chain A and ending at residue 615 in chain A. The error message you show could only happen if the header were modified to read something like
>P1;1r42A
structureX:1r42A:1::[res]:[chain]:[long name]:[long source]: 2.20: 0.23
That won't match of course because there is no residue 1 in a chain with no ID in that file.
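As a rough illustration of how the ten colon-separated fields in the second header line are laid out, here is a minimal Python sketch. It is not Modeller code: `parse_pir_header` and the field names are made up for this example, and it assumes the name and source fields contain no colons.

```python
from collections import namedtuple

# Field names are illustrative labels, not Modeller identifiers.
PirHeader = namedtuple(
    "PirHeader",
    "kind code start_res start_chain end_res end_chain "
    "name source resolution rfactor",
)


def parse_pir_header(line):
    """Split the second line of a PIR entry into its ten fields."""
    fields = [field.strip() for field in line.split(":")]
    if len(fields) != 10:
        raise ValueError("expected 10 fields, got %d" % len(fields))
    return PirHeader(*fields)


# The header posted in the question: the end chain field is blank.
posted = parse_pir_header("structure:1r42A: 19:A: 596: :::-1.00:-1.00")
print(posted.start_res, repr(posted.start_chain))
print(posted.end_res, repr(posted.end_chain))
```

Printing the fields this way makes a blank chain ID or an unexpected residue number easy to spot before the header is handed to Modeller.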
> Ideally the pdb_95.pir file should have the same info as pdbs
It does - in fact it is generated from the PDB files. You can even run the scripts yourself; see https://salilab.org/modeller/wiki/Rebuilding%20sequence%20databases
> The alignment section of stated structure -
>
> >P1;1r42A
> structure:1r42A: 19:A: 596: :::-1.00:-1.00
This is also not what's in the original file. It tells Modeller to read residues starting at 19 in chain A, and ending at residue 596 in a chain with no ID. Since there is no such residue, it will continue to the end of the file, reading the B through E chains, and that won't match your alignment sequence, of course.
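To check a header against a structure file before running Modeller, something like the following sketch may help. It is a simplified stand-in for the behavior described above, not Modeller's actual reader: `residues_in_order` and `select_segment` are hypothetical helpers, residues are taken from ATOM/HETATM records using the standard fixed PDB columns, and the fallback to the end of the file when the end residue is missing is modeled only on the description above.

```python
def residues_in_order(pdb_lines):
    """Return (chain, residue number) keys in file order.

    Consecutive atoms of the same residue are collapsed into one key.
    Column 22 holds the chain ID; columns 23-27 hold the residue number
    plus insertion code.
    """
    residues = []
    last = None
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            key = (line[21], line[22:27].strip())
            if key != last:
                residues.append(key)
                last = key
    return residues


def select_segment(residues, start, end):
    """Return the residues from start to end, both inclusive.

    Assumption: if the end residue is not found, the segment runs to
    the end of the file. A missing start residue selects nothing.
    """
    try:
        first = residues.index(start)
    except ValueError:
        return []
    try:
        last = residues.index(end, first)
    except ValueError:
        last = len(residues) - 1
    return residues[first:last + 1]
```

With `residues = residues_in_order(open("1r42.pdb"))`, comparing `len(select_segment(residues, ("A", "19"), ("A", "615")))` against `len(select_segment(residues, ("A", "19"), (" ", "596")))` shows how far the modified header overshoots. The exact counts may differ from Modeller's depending on which record types it is configured to read.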
Ben Webb, Modeller Caretaker