I'm trying to run align2d on the PDB file for 3dxn chain A and its sequence as provided by the PDB.
My code is below:
from modeller import *

env = environ()
aln = alignment(env)
mdl = model(env, file="3dxn_nmin")
aln.append_model(mdl, align_codes='structure', atom_files="3dxn_nmin.pdb")
aln_block = len(aln)
aln.append('3dxn.a.fasta', alignment_format='FASTA')
aln.align2d(align_block=aln_block, output_weights_file='2dalign.mtx')
aln.check_sequence_structure()
aln.write(file='tmp.ali', alignment_format='PIR')
In the alignment, I get:

>P1;structure
structureX:3dxn_nmin.pdb: 19 :A:+258 :A:::-1.00:-1.00
------------------LSDRYQRVKKLG--SYGEVLLCKDK-TGAERAIKIIKKSSVTTTSNSGALLDEVAVL
KQLDHPNIMKLYEFFEDKRNYYLVMEVYRGGELFDEIILRQKFSEVDAAVIMKQVLSGTTYLHKHNIVHRDLKPE
NLLLESKS-DALIKIVDFGLSAHFE-------RLGTAYYIAPEVLRKKYDEKCDVWSCGVILYILLCGYPPFGGQ
TDQEILKRVEKGKFSFDPPDWTQVSDEAKQLVKLMLTYEPSKRISAEEALNHPWIVKFCSQK*

>P1;3DXN:A:sequence
sequence:: : : : :::-1.00:-1.00
MHHHHHHSSGRENLYFQGLSDRYQRVKKLGSGAYGEVLLCKDKLTGAERAIKIIKKSSVTTTSNSGALLDEVAVL
KQLDHPNIMKLYEFFEDKRNYYLVMEVYRGGELFDEIILRQKFSEVDAAVIMKQVLSGTTYLHKHNIVHRDLKPE
NLLLESKSRDALIKIVDFGLSAHFEVGGKMKERLGTAYYIAPEVLRKKYDEKCDVWSCGVILYILLCGYPPFGGQ
TDQEILKRVEKGKFSFDPPDWTQVSDEAKQLVKLMLTYEPSKRISAEEALNHPWIVKFCSQK*
If you look at the biggest gap, preceded by "HFE" in the alignment, that E should be at the other end of the gap. Sequence-wise, having it at either end is equivalent (there is an E in both places), but the actual pdb file has "ERL" at the other end and "HF" preceding.
When I run check_sequence_structure(), it agrees, saying:

Implied target CA(i)-CA(i+1) distances longer than 8.0 angstroms:

ALN_POS  TMPL  RID1  RID2  NAM1  NAM2     DIST
----------------------------------------------
    174     1   174   182     F     E   12.004
END OF TABLE
I've placed the files on GitHub in case they are of any help. They can be cloned with:

git clone git://gist.github.com/114942.git
Thanks, David
David Hall wrote:
> I'm trying run align2d on 3dxn chain a's pdb and it's sequence
> provided by the pdb.
If I understand you correctly, you want to fill in the missing residues for a PDB file. The best way to do this is to create the alignment manually, using a text editor (although in principle you could write a script to parse the SEQRES and ATOM/HETATM records from the PDB file and use them to construct the alignment by comparison). This is the only real way to get the gaps where they are "supposed" to be. align2d will do a simple sequence-structure alignment, so it places the gaps wherever the dynamic programming decides is best. It does not know that one sequence happens to be the full PDB sequence and that it should therefore put the gaps where breaks in the PDB residue numbering occur.
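The scripted approach mentioned above might look something like the following. This is only a rough sketch, not part of Modeller: the function name gapped_from_numbering and its offset parameter are made up for illustration. It assumes you have already pulled the full sequence from the SEQRES records and a list of (residue number, one-letter code) pairs from the ATOM records, and that the PDB residue numbers map onto SEQRES positions by a constant offset (no insertion codes or renumbering). For 3dxn chain A that appears to hold with offset 0, since the first observed residue is numbered 19 and 18 residues precede it in the full sequence.

```python
def gapped_from_numbering(full_seq, observed, offset=0):
    """Build the gapped structure row of an alignment from residue numbering.

    full_seq -- complete sequence (e.g. from SEQRES), one-letter codes
    observed -- iterable of (residue_number, one_letter_code) pairs for the
                residues that actually have coordinates
    offset   -- hypothetical constant such that residue number
                (1 + offset) corresponds to the first character of full_seq
    """
    gapped = ['-'] * len(full_seq)
    for resnum, aa in observed:
        idx = resnum - 1 - offset
        if not 0 <= idx < len(full_seq):
            raise ValueError("residue %d falls outside the full sequence" % resnum)
        if full_seq[idx] != aa:
            # A mismatch means the constant-offset assumption is wrong
            raise ValueError("residue %d is %s in the structure but %s in the "
                             "full sequence" % (resnum, aa, full_seq[idx]))
        gapped[idx] = aa
    return ''.join(gapped)


# Toy example: residues 5-9 are missing, and E occurs on both sides of the gap
full = "MKHFEVGGKERL"
seen = [(3, 'H'), (4, 'F'), (10, 'E'), (11, 'R'), (12, 'L')]
print(gapped_from_numbering(full, seen))  # --HF-----ERL
```

Because each residue is placed by its number rather than by sequence similarity, the E lands on the correct side of the gap even though the full sequence has an E at both ends. The resulting string can then be pasted into the structure entry of a hand-made PIR file in place of the align2d output.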
Ben Webb, Modeller Caretaker