Hi,
Thanks for your previous kind reply.
I came across a problem which is quite strange to me:
When running Modeller to model a sequence on the template PDB 4QTA (obtained from a BLAST search against the pdbaa database), I found that the sequence reported for this PDB entry in the BLAST output differs from the sequence in the original PDB file. Please see below the error I get from Modeller because of this issue:
Alignment sequence:
MAAAAAAGAGPEMVRGQVFDVGPRYTNLSYIGEGAYGMVCSAYDNVNKVRVAIKKISPFEHQTYCQRTLREIKIL
LRFRHENIIGINDIIRAPTIEQMKDVYIVQDLMETDLYKLLKTQHLSNDHICYFLYQILRGLKYIHSANVLHRDL
KPSNLLLNTTCDLKICDFGLARVADPDHDHTGFLTEYVATRWYRAPEIMLNSKGYTKSIDIWSVGCILAEMLSNR
PIFPGKHYLDQLNHILGILGSPSQEDLNCIINLKARNYLLSLPHKNKVPWNRLFPNADSKALDLLDKMLTFNPHK
RIEVEQALAHPYLEQYYDPSDEPIAEAPFKFDMELDDLPKEKLKELIFEETARFQPGYRS

PDB sequence matching range provided in alignment:
GPEMVRGQVFDVGPRYTNLSYIGEGAYGMVCSAYDNVNKVRVAIKKISPFEHQTYCQRTLREIKILLRFRHENII
GINDIIRAPTIEQMKDVYIVQDLMETDLYKLLKTQHLSNDHICYFLYQILRGLKYIHSANVLHRDLKPSNLLLNT
TCDLKICDFGLARVADTRWYRAPEIMLNSKGYTKSIDIWSVGCILAEMLSNRPIFPGKHYLDQLNHILGILGSPS
QEDLNCIINLKARNYLLSLPHKNKVPWNRLFPNADSKALDLLDKMLTFNPHKRIEVEQALAHPYLEQYYDPSDEP
IAEAPFKFDMELDDLPKEKLKELIFEETARFQPGY

Traceback (most recent call last):
  File "modified-comp-align.py", line 11, in <module>
    a.auto_align() # get an automatic alignment
  File "/data/deepak-protein-modeling/modlib/modeller/automodel/automodel.py", line 146, in auto_align
    self.sequence, matrix_file, overhang, write_fit)
  File "/data/deepak-protein-modeling/modlib/modeller/scripts/align_strs_seq.py", line 8, in align_strs_seq
    aln = alignment(env, file=segfile, align_codes=knowns)
  File "/data/deepak-protein-modeling/modlib/modeller/alignment.py", line 20, in __init__
    self.append(**vars)
  File "/data/deepak-protein-modeling/modlib/modeller/alignment.py", line 79, in append
    allow_alternates)
_modeller.SequenceMismatchError: get_ran_648E> Alignment sequence does not match that in PDB file: 1 ./4QTA.pdb
(You didn't specify the starting and ending residue numbers and chain IDs in the alignment, so Modeller tried to guess these from the PDB file.)
Suggestion: put in the residue numbers and chain IDs (see the manual) and run again for more detailed diagnostics.
You could also try running with allow_alternates=True to accept alternate one-letter code matches (e.g. B to N, Z to Q).
When I check the alignment file, the alignment (obtained from the BLAST result and converted to PIR format) looks fine:
>P1;NP_002736.3.35214
sequence:NP_002736.3.35214:1 : :360 : :::-1.00:-1.00
MAAAAAAGAGPEMVRGQVFDVGPRYTNLSYIGEGAYGMVCSAYDNVNKVRVAIKKISPFEHQTYCQRTLREIKILLRFRHENIIGINDIIRAPTIEQMKDVYIVQDLMETDLYKLLKTQHLSNDHICYFLYQILRGLKYIHSANVLHRDLKPSNLLLNTTCDLKICDFGLARVADPDHDHTGFLTEYVATRWYRAPEIMLNSKGYTKSIDIWSVGCILAEMLSNRPIFPGKHYLDQLNHILGILGSPSQEDLNCIINLKARNYLLSLPHKNKVPWNRLFPNADSKALDLLDKMLTFNPHKRIEVEQALAHPYLEQYYDPSDEPIAEAPFKFDMELDDLPKEKLKELIFEETARFQPGYRS*
>P1;4QTA
structure:4QTA: :A : :A :::-1.00:-1.00
MAAAAAAGAGPEMVRGQVFDVGPRYTNLSYIGEGAYGMVCSAYDNVNKVRVAIKKISPFEHQTYCQRTLREIKILLRFRHENIIGINDIIRAPTIEQMKDVYIVQDLMETDLYKLLKTQHLSNDHICYFLYQILRGLKYIHSANVLHRDLKPSNLLLNTTCDLKICDFGLARVADPDHDHTGFLTEYVATRWYRAPEIMLNSKGYTKSIDIWSVGCILAEMLSNRPIFPGKHYLDQLNHILGILGSPSQEDLNCIINLKARNYLLSLPHKNKVPWNRLFPNADSKALDLLDKMLTFNPHKRIEVEQALAHPYLEQYYDPSDEPIAEAPFKFDMELDDLPKEKLKELIFEETARFQPGYRS*
The problem arises because the chain A sequence in the original PDB file 4QTA differs from the chain A sequence obtained from BLAST.
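To show exactly where the two sequences diverge, here is a minimal sketch using Python's standard difflib module. It is only illustrative: the function name missing_segments is hypothetical, and the two strings are short fragments copied from the "Alignment sequence" and "PDB sequence" lines of the error output above, not the full sequences.

```python
from difflib import SequenceMatcher


def missing_segments(alignment_seq, pdb_seq):
    """Return (start, segment) pairs for stretches of alignment_seq that
    have no match in pdb_seq. Start positions are 1-based and refer to
    alignment_seq. (Hypothetical helper, for illustration only.)"""
    matcher = SequenceMatcher(None, alignment_seq, pdb_seq, autojunk=False)
    missing = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # 'delete' and 'replace' both mark residues of alignment_seq
        # that are not matched in pdb_seq.
        if tag in ("delete", "replace"):
            missing.append((i1 + 1, alignment_seq[i1:i2]))
    return missing


# Fragments taken from the error output (assumption: a short window
# around the internal mismatch is enough to illustrate the idea).
aln_fragment = "DFGLARVADPDHDHTGFLTEYVATRWYRAPEIM"
pdb_fragment = "DFGLARVADTRWYRAPEIM"

for start, segment in missing_segments(aln_fragment, pdb_fragment):
    print(f"position {start} of the fragment: {segment} is absent from the PDB sequence")
```

Run on the full sequences, the same comparison should also report the residues at the start (MAAAAAAGA) and end (RS) of the alignment sequence that do not appear in the PDB sequence printed by Modeller.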
Could you please let me know why this problem occurs and how it can be solved?
Thanks much, Deepak
On Sun, Oct 30, 2016 at 4:18 AM, Modeller Caretaker < modeller-care@salilab.org> wrote:
> On 10/28/16 9:33 AM, deepak kumar wrote:
>
>> However, does not
>> mentioning the "start" and "end" residue of the PDB file influence the
>> quality of the structure? or influence the model any way?
>>
>
> No.
>
>> Am I right, if I say, that modeller by default takes the residues of
>> chain "B", finds the corresponding residues in chain B and correctly
>> models the sequence as per the alignment.
>>
>
> That's correct.
>
>
> Ben Webb, Modeller Caretaker
> --
> modeller-care@salilab.org https://salilab.org/modeller/
> Modeller mail list: https://salilab.org/mailman/listinfo/modeller_usage
>