automatic alignment pdb file with gap to the similar complete sequence
Hi all,
I want to fill in the missing residues in a PDB file. I am following the instructions at: http://salilab.org/modeller/wiki/Missing%20residues
I would just like to automate the alignment step. I tried salign_align2d.py, changing the substitution matrix and playing with gap_penalties, ...:
Alignment file:
---------------------------------------------------------------------------------------------------------
>P1;3DPL
structureX:3DPL.pdb: 401 :C:+465 :R:::-1.00:-1.00
ESKCPEELANYCDMLLRKTPLSKKLTSEEIEAKLKEVLKKLKYVQNKDVFMRYHKAHLTRRLILDISADSEIEEN
MVEWLREVGMPADYVNKLARMFQDIKVSEDLNQAFKEMHKNNALPADSVNIKILNAGAWSRSSEKVFVSLPTELE
DLIPEVEEFYKKNHSGRKLHWHHLMSNGIITFKNEVGQYDLEVTTFQLAVLFAWNQRPREKISFENLKLATELPD
AELRRTLWSLVAFPKLKRQVLLYEPQVNSPKDFTEGTLFSVNQEFSLIKNAKVQKRGKINLIGRLQLTTERMREE
ENEGIVQLRILRTQEAIIQIMKMRKKISNAQLQTELVEILKNMFLPQKKMIKEQIEWLIEHKYIRRDESDINTFI
YMA/KKRFEVKKWNAVALWAWDIVVDNCAICRNHIMDLCIECQANQASAECTVAWGVCNHAFHFHCISRWLKTRQ
VCPLDNREWEFQKYGH*

>P1;3DPL-full
sequence:3DPL-full:::::::2.60: 0.24
GSESKCPEELANYCDMLLRKTPLSKKLTSEEIEAKLKEVLKKLKYVQNKDVFMRYHKAHLTRRLILDISADSEIEEN
MVEWLREVGMPADYVNKLARMFQDIKVSEDLNQAFKEMHKNNKLALPADSVNIKILNAGAWSRSSEKVFVSLPTELE
DLIPEVEEFYKKNHSGRKLHWHHLMSNGIITFKNEVGQYDLEVTTFQLAVLFAWNQRPREKISFENLKLATELPD
AELRRTLWSLVAFPKLKRQVLLYEPQVNSPKDFTEGTLFSVNQEFSLIKNAKVQKRGKINLIGRLQLTTERMREE
ENEGIVQLRILRTQEAIIQIMKMRKKISNAQLQTELVEILKNMFLPQKKMIKEQIEWLIEHKYIRRDESDINTFI
YMA/GSMDVDTPSGTNSGAGKKRFEVKKWNAVALWAWDIVVDNCAICRNHIMDLCIECQANQASATSEECTVAWGVCNHAFHFHCISRWLKTRQ
VCPLDNREWEFQKYGH*
---------------------------------------------------------------------------------------------------------

Missing residues: ('1:A', '2:A'), ('120:A', '121:A'), ('383:B', '398:B'), ('444:B', '446:B'))
Script file:
---------------------------------------------------------------------------------------------------------
# align2d/align using salign

# parameters to be input by the user
# 1. gap_penalties_1d
# 2. gap_penalties_2d
# 3. input alignment file

from modeller import *
log.verbose()
env = environ()
env.io.atom_files_directory = ['../atom_files']

aln = alignment(env, file='align2d_in.ali', align_codes='all')
aln.salign(rr_file='$(LIB)/id.sim.mat',  # Substitution matrix used
           output='',
           max_gap_length=20,
           gap_function=True,              # If False then align2d not done
           feature_weights=(1., 0., 0., 0., 0., 0.),
           gap_penalties_1d=(-200, 0),
           gap_penalties_2d=(3.5, 3.5, 3.5, 0.2, 4.0, 6.5, 2.0, 0.0, 0.0),
           # d.p. score matrix
           #output_weights_file='salign.mtx'
           similarity_flag=True)   # Ensuring that the dynamic programming
                                   # matrix is not scaled to a difference matrix
aln.write(file='align2d.ali', alignment_format='PIR')
aln.write(file='align2d.pap', alignment_format='PAP')
---------------------------------------------------------------------------------------------------------
But the output alignment is not what I need. Are there any parameters I can use to obtain the correct alignment?
Any suggestions or pointers would be much appreciated.
Aurélien
On 3/22/13 3:09 AM, THUREAU Aurelien wrote:
> I want to fill a PDB files with missing residues.
...
> I just want to optimize by doing _automatically_ the alignment.
We generally recommend creating the alignment manually for simple cases like these (it would be easy to script too if you're planning to do this for many cases). Global dynamic programming is going to try to make a 'sensible' alignment, which isn't what you want (since it will minimize the huge insertions that you want). You might be able to make it work by setting the gap penalties to zero and doing a regular sequence alignment (not align2d) but it's probably more trouble than it's worth.
Ben Webb, Modeller Caretaker
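
For anyone who wants to script the manual alignment mentioned above, here is a minimal sketch in plain Python. It is not part of Modeller and is not the method from the wiki page; it only assumes that you already know, for each chain, which positions of the full sequence are missing from the structure. The helper names (gapped_template, pir_entry), the toy sequences, and the PIR header strings are all made up for illustration, and the ranges are counted along the full sequence rather than in PDB residue numbers, so that mapping is left to you.

```python
def gapped_template(full_seq, missing_ranges):
    """Replace missing residues in one chain's full sequence with '-'.

    missing_ranges holds (first, last) pairs, 1-based and inclusive,
    counted along full_seq (not PDB residue numbers).
    """
    residues = list(full_seq)
    for first, last in missing_ranges:
        if not 1 <= first <= last <= len(residues):
            raise ValueError(f"range {first}-{last} is outside the sequence")
        for i in range(first - 1, last):
            residues[i] = "-"
    return "".join(residues)


def pir_entry(code, header, chains, width=75):
    """Format one PIR record: chains joined with '/', terminated by '*'."""
    body = "/".join(chains) + "*"
    lines = [body[i:i + width] for i in range(0, len(body), width)]
    return "\n".join([f">P1;{code}", header] + lines)


# Toy two-chain example (hypothetical data, not the 3DPL sequences).
full_chains = ["GSESKCPEEL", "GSMDVKKRFE"]
missing = [[(1, 2)], [(1, 5)]]  # one list of ranges per chain

# The template row is the full sequence with gaps where the structure
# has no coordinates; the target row is the full sequence unchanged.
template_chains = [gapped_template(s, m) for s, m in zip(full_chains, missing)]

# Placeholder headers: replace with the real residue/chain fields.
print(pir_entry("toy", "structureX:toy.pdb:FIRST:@:END:@::::", template_chains))
print()
print(pir_entry("toy_fill", "sequence:toy_fill::::::::", full_chains))
```

Because the gaps are placed directly from the known missing ranges, no dynamic programming is involved and the long insertions end up exactly where the structure lacks residues. The two records can be written to a single .ali file and used in place of the salign output.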