Re: [modeller_usage] Alignment errors model not running
Sorry, I forgot to attach the target sequence. It's the FASTA sequence of the 1ATG protein.
On Mon, Apr 26, 2010 at 8:19 PM, Daniel Fernandez dfernan@gmail.com wrote:
> Hi,
>
> Please help me with this error. I have been more than a week trying to solve this issue and I still can't solve it.
>
> Let me explain my approach to use modeller.
>
> 1. First I search for templates against the pdb database
> 2. I select as templates the one with low e-value and reasonable similarity percentage
> 3. I use clustalW (or TCoffee) to align the target and the selected templates.
> 4. I use modeller to model the target based on the TCoffee alignment file and the PDB files.
>
> I do the whole pipeline but modeller works for some sequences but for most of them it gives me the following error and at this point I am clueless on how to solve it. Here I attach my input files to modeller in case someone wants to take a look at them and help me solve this issue.
>
> INPUT:
>   finalseq.pir (as in modeller format, i tried all different formats here, I attach my last approach that was to only save PDB files with the data from a specific chain...)
>   template PDB files (the PDB files with the actual chain)
>   target.fasta
>
> OUTPUT: error:
>
> get_ran_648E> Alignment sequence not found in PDB file: 3 2H5Y_A.pdb
>   (You didn't specify the starting and ending residue numbers and
>   chain IDs in the alignment, so Modeller tried to guess these from
>   the PDB file.)
>   Suggestion: put in the residue numbers and chain IDs (see the
>   manual) and run again for more detailed diagnostics.
>   You could also try running with allow_alternates=True to accept
>   alternate one-letter code matches (e.g. B to N, Z to Q).
> Traceback (most recent call last):
>   File "testclean.py", line 18, in <module>
>     a.make() # do the homollogy modelling
>   File "/n/sw/modeller-9v7/modlib/modeller/automodel/automodel.py", line 98, in make
>     self.homcsr(exit_stage)
>   File "/n/sw/modeller-9v7/modlib/modeller/automodel/automodel.py", line 411, in homcsr
>     aln = self.read_alignment()
>   File "/n/sw/modeller-9v7/modlib/modeller/automodel/automodel.py", line 401, in read_alignment
>     aln.append(file=self.alnfile, align_codes=self.knowns+[self.sequence])
>   File "/n/sw/modeller-9v7/modlib/modeller/alignment.py", line 79, in append
>     allow_alternates)
> _modeller.SequenceMismatchError: get_ran_648E> Alignment sequence not found in PDB file: 3 2H5Y_A.pdb (You didn't specify the starting and ending residue numbers and chain IDs in the alignment, so Modeller tried to guess these from the PDB file.) Suggestion: put in the residue numbers and chain IDs (see the manual) and run again for more detailed diagnostics. You could also try running with allow_alternates=True to accept alternate one-letter code matches (e.g. B to N, Z to Q).
>
> I am completely clueless on where to look the starting and ending residue numbers and chain IDs in the alignment, clustalW does not give me that information at all so not sure where to look that info and if possible with the approach I am using...
>
> Thanks,
>
> Daniel F.
Most likely you did not define the residue range and chain IDs in the header line (the second line of each entry in the PIR alignment file). More detailed information about the alignment format can be found here: http://salilab.org/modeller/manual/node445.html#alignmentformat

This might work:
>P1;1ATG
structureX:1ATG:FIRST:@:END:::::
ELKVVTATNFLGTLEQLAGQFAKQTGHAVVISSGSSGPVYAQIVNGAPYNVFFSADEKSPEKLDNQGFALPGSRFTYAIG
KLVLWSAKPGLVDNQGKVLAGNGWRHIAISNPQIAPYGLAGTQVLTHLGLLDKLTAQERIVEANSVGQAHSQTASGAADL
GFVALAQIIQAAAKIPGSHWFPPANYYEPIVQQAVITKSTAEKANAEQFMSWMKGPKAVAIIKAAGYVLPQ*
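
If you would rather put explicit residue numbers and chain IDs into the header, as the error message suggests, the entry can be generated with a few lines of plain Python. This is only a minimal sketch: the function name pir_entry is made up for illustration, it does not use the MODELLER API, and the residue numbers in the example are placeholders that have to be replaced with the real values from your PDB file.

```python
def pir_entry(code, pdb_code, start, start_chain, end, end_chain, sequence, width=75):
    """Format one template entry of a MODELLER-style PIR alignment file.

    Hypothetical helper for illustration only. The header has ten
    colon-separated fields; the last four (protein name, source,
    resolution, R-factor) are left empty here.
    """
    header = ":".join([
        "structureX", pdb_code,
        str(start), start_chain,
        str(end), end_chain,
        "", "", "", "",
    ])
    # The sequence is terminated by '*' and split into fixed-width lines.
    body = sequence + "*"
    chunks = [body[i:i + width] for i in range(0, len(body), width)]
    return "\n".join([">P1;" + code, header] + chunks)


# Placeholder values: residues 1 to 3 of chain A with the sequence ELK.
print(pir_entry("2H5Y_A", "2H5Y_A", 1, "A", 3, "A", "ELK"))
```

The sequence passed in should be the aligned template row (including gap characters), and the start and end residues must match what is actually present in the ATOM records of the template file.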
Cheers, Thomas
Also make sure that you use the sequence derived from the ATOM records rather than from the SEQRES records. Dunbrack's S2C is an excellent resource for that:
http://dunbrack.fccc.edu/Guoli/s2c/index.php
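
If you want to check this locally, the ATOM-derived sequence and the first and last residue numbers of a chain can be read straight from the fixed columns of the PDB file. The following is a rough sketch and not part of MODELLER or S2C: the names atom_sequence and atom_line are invented for this example, HETATM records (e.g. modified residues) are ignored, and anything that is not one of the 20 standard residues becomes X.

```python
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def atom_sequence(pdb_lines, chain):
    """Return (sequence, first_residue, last_residue) for one chain,
    using only the ATOM records of the first model."""
    residues = []  # (residue number + insertion code, residue name), file order
    seen = set()
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # stop after the first model of a multi-model file
        if len(line) < 27 or not line.startswith("ATOM") or line[21] != chain:
            continue
        key = line[22:27]  # columns 23-27: residue number and insertion code
        if key not in seen:
            seen.add(key)
            residues.append((key.strip(), line[17:20].strip()))
    if not residues:
        raise ValueError("no ATOM records found for chain %r" % chain)
    seq = "".join(THREE_TO_ONE.get(name, "X") for _, name in residues)
    return seq, residues[0][0], residues[-1][0]


def atom_line(serial, name, resname, chain, resnum):
    # Demo helper (hypothetical): builds an ATOM record with dummy coordinates.
    return "ATOM  %5d %-4s %3s %s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, resname, chain, resnum, 0.0, 0.0, 0.0)


demo = [
    atom_line(1, "N", "GLU", "A", 1), atom_line(2, "CA", "GLU", "A", 1),
    atom_line(3, "N", "LEU", "A", 2), atom_line(4, "CA", "LEU", "A", 2),
    atom_line(5, "N", "LYS", "A", 3),
]
print(atom_sequence(demo, "A"))
```

For a real template you would pass open("2H5Y_A.pdb") instead of the demo list, and compare the returned sequence with the template row of your alignment (with the gaps removed).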
If MODELLER still crashes, have a look at the start and end of each chain. It may be that a terminal residue only has coordinates for a single N atom. In that case, delete that ATOM record.
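
Such residues can also be found with a short script instead of reading the file by eye. Again just a sketch under assumptions: incomplete_residues and atom_line are hypothetical names, only ATOM records are read, and "incomplete" here simply means that one of the backbone atoms N, CA, C or O is missing.

```python
BACKBONE = {"N", "CA", "C", "O"}


def incomplete_residues(pdb_lines, chain):
    """List residues of one chain that lack backbone atoms.

    Returns (residue number, residue name, missing atom names) tuples
    in file order.
    """
    atoms = {}  # (residue number, residue name) -> set of atom names seen
    for line in pdb_lines:
        if len(line) < 27 or not line.startswith("ATOM") or line[21] != chain:
            continue
        key = (line[22:27].strip(), line[17:20].strip())
        atoms.setdefault(key, set()).add(line[12:16].strip())
    return [
        (num, name, sorted(BACKBONE - names))
        for (num, name), names in atoms.items()
        if not BACKBONE <= names
    ]


def atom_line(serial, name, resname, chain, resnum):
    # Demo helper (hypothetical): builds an ATOM record with dummy coordinates.
    return "ATOM  %5d %-4s %3s %s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, resname, chain, resnum, 0.0, 0.0, 0.0)


demo = [
    atom_line(1, "N", "GLU", "A", 1), atom_line(2, "CA", "GLU", "A", 1),
    atom_line(3, "C", "GLU", "A", 1), atom_line(4, "O", "GLU", "A", 1),
    atom_line(5, "N", "LEU", "A", 2),  # last residue: only the N atom is present
]
print(incomplete_residues(demo, "A"))
```

Any residue reported at the very start or end of a chain is a candidate for removal, and the residue range in the alignment header then has to be adjusted to match.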
Cheers, Thomas
participants (2): Daniel Fernandez, Thomas Juettemann