Incorrect Number of residues/Starting position
To Whom It May Concern:
I'm attempting to use Modeller with two templates to model one short sequence. I'm also creating the alignment file (PIR format) on my own, and I'm having some trouble with one of my sequences. My alignment file is as follows:
>P1;15c8
structure:15c8:1:H:217:H::::
EVQLQQSGAELVKPGASVKLSCTASGFNIKDTYMHWVKQKPEQGLEWIAQI
DPANGNTKYDPKFQGKATITADTSSNTAYLHLSSLTSEDSAVYY-----CA
ADPPYYG--------------HGDYWGQGTTLTVSSAKTTPPSVYPLAPGS
AAQTNSMVTLGCLVKGYFPEPVTVTWNSGSLSSGVHTFPAVLQSDLYTLSS
SVTVPSSTWPSETVTCNVAHPASSTKVDKKIV-----*
>P1;1fns
structure:1fns:1:H:225:H::::
QVQLKESGPGLVAPSQSLSITCTVSGFSLTDYGVDWVRQPPGKGLEWLGMI
WGD-GSTDYNSALKSRLSITKDNSKSQVFLKMNSLQTDDTARYYCVRDP--
-------ADYGNYDYALDYWG------QGTSVTVSSAKTTPPSVYPLAPGS
AAQTNSMVTLGCLVKGYFPEPVTVTWNSGSLSSGVHTFPAVLQSDLYTLSS
SVTVPSSTWPSETVTCNVAHPASSTKVDKKIVPRDCG*
>P1;1f11
sequence:1f11:1::15:::::
---------------------------------------------------
--------------------------------------------CANDY--
---GSTY--------GFAYWG------------------------------
---------------------------------------------------
-------------------------------------*
The script that was run (quite simple):
from modeller.automodel import *

log.verbose()
env = environ()
a = automodel(env, alnfile='modAln.ali', knowns=('15c8','1fns'), sequence='1f11')
a.starting_model = 1
a.ending_model = 5
a.make()
And then the log file with the error:
openf5__224_> Open 11 OLD SEQUENTIAL $(LIB)/restyp.lib
openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL8v1}/modlib/resdih.lib
rdrdih__263_> Number of dihedral angle types : 9
              Maximal number of dihedral angle optima: 3
              Dihedral angle names : Alph Phi Psi Omeg chi1 chi2 chi3 chi4 chi5
openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL8v1}/modlib/radii.lib
openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL8v1}/modlib/radii14.lib
openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL8v1}/modlib/af_mnchdef.lib
rdwilmo_274_> Mainchain residue conformation classes: APBLE
openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL8v1}/modlib/mnch.lib
rdclass_257_> Number of classes: 5
openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL8v1}/modlib/mnch1.lib
openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL8v1}/modlib/mnch2.lib
openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL8v1}/modlib/mnch3.lib
openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL8v1}/modlib/xs4.mat
rdrrwgh_268_> Number of residue types: 21
runcmd______> alignment.append(align_codes=['15c8', '1fns', '1f11'], atom_files=[], file='modAln.ali', (def)remove_gaps=True, (def)alignment_format='PIR', add_sequence=True, (def)rewind_file=False, (def)close_file=True)
openf___224_> Open modAln.ali
Dynamically allocated memory at amaxalignment [B,kB,MB]: 2124923 2075.120 2.026
Dynamically allocated memory at amaxalignment [B,kB,MB]: 2126623 2076.780 2.028
Dynamically allocated memory at amaxalignment [B,kB,MB]: 2130023 2080.101 2.031
Dynamically allocated memory at amaxalignment [B,kB,MB]: 2136823 2086.741 2.038
Dynamically allocated memory at amaxalignment [B,kB,MB]: 2457301 2399.708 2.343
Read the alignment from file : modAln.ali
Total number of alignment positions: 241

  # Code   #_Res #_Segm PDB_code    Name
-------------------------------------------------------------------------------
  1 15c8     217      1 15c8
  2 1fns     225      1 pdb1fns.ent
  3 1f11      15      1 1f11
runcmd______> alignment.check()
check_a_343_> >> BEGINNING OF COMMAND
openf5__224_> Open 11 OLD SEQUENTIAL pdb15c8.ent
Dynamically allocated memory at amaxstructure [B,kB,MB]: 2579199 2518.749 2.460
openf5__224_> Open 11 OLD SEQUENTIAL pdb15c8.ent
openf5__224_> Open 11 OLD SEQUENTIAL pdb1fns.ent
rdpdb___303E> No atoms were read from the specified input PDB file, since the
              starting residue number and/or chain id in MODEL_SEGMENT (or
              the alignment file header) was not found;
              requested starting position: 1: H
rdabrk__288W> Protein not accepted: 2
check_a_337E> Structure not read in: 2
I'm running version 8v1 on Fedora Core 3. As best I can decipher, Modeller can't understand the starting position, or perhaps it can't find the chain. The PDB file has exactly the same sequence (minus the gaps, of course) and chain ID H.
Interestingly (or confusingly), I also tried replacing the residue start and end with the words 'FIRST' and 'LAST' while leaving the chain as H, thinking that Modeller could grab the chain without me telling it explicitly where to start and end. The end of the log file was as follows:
Total number of alignment positions: 241
  # Code   #_Res #_Segm PDB_code    Name
-------------------------------------------------------------------------------
  1 15c8     217      1 15c8
  2 1fns     225      1 1fns
  3 1f11      15      1 1f11
runcmd______> alignment.check()
check_a_343_> >> BEGINNING OF COMMAND
openf5__224_> Open 11 OLD SEQUENTIAL pdb15c8.ent
Dynamically allocated memory at amaxstructure [B,kB,MB]: 2579199 2518.749 2.460
openf5__224_> Open 11 OLD SEQUENTIAL pdb15c8.ent
openf5__224_> Open 11 OLD SEQUENTIAL pdb1fns.ent
Dynamically allocated memory at amaxstructure [B,kB,MB]: 2703099 2639.745 2.578
openf5__224_> Open 11 OLD SEQUENTIAL pdb1fns.ent
rdabrk__290E> Number of residues in the alignment and pdb files are different:
              225 220
              For alignment entry: 2
rdabrk__288W> Protein not accepted: 2
check_a_337E> Structure not read in: 2
Okay, so Modeller thinks there are 220 residues in the PDB file? Is it looking at the right chain? The other chains in that file have lengths 217 and 195 (neither of which is 220). It's looking at the correct file, but who knows which chain it's going for. Everything I try here gives me different results in the log file (for example, with FIRST and 225, Modeller tells me it found 11 residues in the PDB file).
Perhaps I'm way off in my thinking and I have some huge error in my alignment file. I've tried modelling with three proteins (in the very same way, just with a different protein in place of this problem one) and I was getting very similar errors. I can't get the second protein to be read in. Any help would be appreciated.
Kevin
Kevin Galens (RIT Student) wrote:
> I'm attempting to use modeller to use two templates to model on short
> sequence. I'm also creating the alignment file (PIR format) on my own
> and I'm having some troubles with one of my sequences. My alignment
> file is as follows:
>
> >P1;15c8
> structure:15c8:1:H:217:H::::
> EVQLQQSGAELVKPGASVKLSCTASGFNIKDTYMHWVKQKPEQGLEWIAQI
> DPANGNTKYDPKFQGKATITADTSSNTAYLHLSSLTSEDSAVYY-----CA
> ADPPYYG--------------HGDYWGQGTTLTVSSAKTTPPSVYPLAPGS
> AAQTNSMVTLGCLVKGYFPEPVTVTWNSGSLSSGVHTFPAVLQSDLYTLSS
> SVTVPSSTWPSETVTCNVAHPASSTKVDKKIV-----*
> >P1;1fns
> structure:1fns:1:H:225:H::::
> QVQLKESGPGLVAPSQSLSITCTVSGFSLTDYGVDWVRQPPGKGLEWLGMI
> WGD-GSTDYNSALKSRLSITKDNSKSQVFLKMNSLQTDDTARYYCVRDP--
> -------ADYGNYDYALDYWG------QGTSVTVSSAKTTPPSVYPLAPGS
> AAQTNSMVTLGCLVKGYFPEPVTVTWNSGSLSSGVHTFPAVLQSDLYTLSS
> SVTVPSSTWPSETVTCNVAHPASSTKVDKKIVPRDCG*
> >P1;1f11
> sequence:1f11:1::15:::::
> ---------------------------------------------------
> --------------------------------------------CANDY--
> ---GSTY--------GFAYWG------------------------------
> ---------------------------------------------------
> -------------------------------------*
...
> openf5__224_> Open 11 OLD SEQUENTIAL pdb1fns.ent
> rdpdb___303E> No atoms were read from the specified input PDB file,
> since the
> starting residue number and/or chain id in MODEL_SEGMENT (or
> the alignment file header) was not found;
> requested starting position: 1: H
> rdabrk__288W> Protein not accepted: 2
> check_a_337E> Structure not read in: 2
>
> I'm running version 8v1 on Fedora core 3. The best that I can decipher
> is that modeller can't understand the starting position. Or perhaps it
> can't find the chain. The pdb file has the same exact sequence (minus
> the gaps of course) and chain id H.
It can't find residue number '1' in chain 'H'. This isn't surprising, because PDB code 1fns has chain H numbered from 215 to 439. You should fix the header for 1fns in your alignment file so that it uses the residue numbers from the PDB file.
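One way to check the numbering of each chain before writing the alignment header is a short script along the following lines. This is only a sketch and not part of Modeller: the function name chain_residue_ranges is made up for illustration, and it assumes a standard fixed-column PDB file, looking only at ATOM records of the first model.

```python
from collections import OrderedDict


def chain_residue_ranges(pdb_lines):
    """Return {chain_id: (first_resnum, last_resnum, n_residues)}.

    Hypothetical helper (not a Modeller function). Assumes fixed-column
    PDB format: chain ID in column 22, residue number in columns 23-26,
    insertion code in column 27. Only ATOM records are counted.
    """
    residues = OrderedDict()  # chain ID -> list of (resnum, icode) in file order
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # only look at the first model of a multi-model file
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        chain = line[21]
        key = (int(line[22:26]), line[26])
        seen = residues.setdefault(chain, [])
        # Each residue has many atoms; record it once, when it first appears.
        if not seen or seen[-1] != key:
            seen.append(key)
    return {c: (r[0][0], r[-1][0], len(r)) for c, r in residues.items()}


# Example call (file name taken from the log above):
#     with open("pdb1fns.ent") as fh:
#         print(chain_residue_ranges(fh))
```

The first and last residue numbers reported for chain H are the values that belong in the start and end residue fields of the 'structure:' line for that template.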
> Interestingly (or confusingly) I also tried replacing the residue start
> and end with the words 'FIRST' and 'LAST' leaving the chain
> H, thinking that Modeller could grab the chain without me telling it
> where to start and end explicitly. The end of the error file was as
> follows:
...
> rdabrk__290E> Number of residues in the alignment and pdb files are
> different:
> 225 220
> For alignment entry: 2
> rdabrk__288W> Protein not accepted: 2
> check_a_337E> Structure not read in: 2
>
>
> Okay, so modeller thinks there are 220 residues in the pdb file? Is it
> looking at the right chain? The other chains in that file are length 217
> and 195 (which isn't 220). It's looking at the correct file, who knows
> what chain it's going for.
There are 220 residues in chain H, since residues 352 through 356 were not located in the experiment (see the REMARK 465 lines). Since there is no structural information for these 5 residues, your alignment should probably have a gap of width 5 at that point.
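Besides reading the REMARK 465 lines, breaks like this can usually be spotted from the residue numbering itself. The sketch below is a heuristic, not a Modeller feature: the names chain_resnums and numbering_gaps are invented for illustration, insertion codes are ignored, and a jump in numbering does not always mean residues are missing, so the result should be compared against REMARK 465.

```python
def chain_resnums(pdb_lines, chain_id):
    """List the residue numbers seen in ATOM records of one chain, in file order.

    Hypothetical helper; assumes fixed-column PDB format and ignores
    insertion codes (residues 100 and 100A count as one number).
    """
    nums = []
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # first model only
        if line.startswith("ATOM") and len(line) >= 26 and line[21] == chain_id:
            n = int(line[22:26])
            if not nums or nums[-1] != n:
                nums.append(n)
    return nums


def numbering_gaps(resnums):
    """Return (first_missing, last_missing) for every jump in the numbering."""
    gaps = []
    for prev, cur in zip(resnums, resnums[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


# Example call (file name taken from the log above):
#     with open("pdb1fns.ent") as fh:
#         print(numbering_gaps(chain_resnums(fh, "H")))
```

A chain numbered 215 to 439 with residues 352 through 356 absent has 225 - 5 = 220 residues with coordinates, which is the count Modeller reports.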
Incidentally, there is a similar problem with your other structure (15c8); residues 131, 132, 155, and 158 through 161 are missing.
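If the template sequences in the alignment were typed in from the full sequence, the missing residues can be turned into gaps mechanically. The following is a minimal sketch under stated assumptions, not a Modeller function: gap_out_missing is a made-up name, and it assumes the aligned sequence holds the full chain numbered consecutively from first_resnum with no insertion codes.

```python
def gap_out_missing(aligned_seq, first_resnum, missing):
    """Replace residues whose number is in `missing` with '-'.

    Hypothetical helper. `aligned_seq` may already contain gaps ('-'),
    the PIR terminator ('*') and line breaks; these are copied through
    unchanged and do not advance the residue counter.
    """
    out = []
    resnum = first_resnum
    for ch in aligned_seq:
        if ch in "-*" or ch.isspace():
            out.append(ch)
            continue
        # A real residue letter: keep it only if coordinates exist for it.
        out.append("-" if resnum in missing else ch)
        resnum += 1
    return "".join(out)


# Example call for a chain numbered from 215 with 352-356 missing
# (full_seq stands for the aligned sequence text from the .ali file):
#     fixed = gap_out_missing(full_seq, 215, set(range(352, 357)))
```

The alignment length is unchanged by this substitution, so the other entries in the file do not need to be re-padded.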
Ben Webb, Modeller Caretaker