sequence of salign/align2d

21 Oct 2016


Hi,
I want to build a homology model using the .pdb structure of a template and a target sequence that is very similar to the template.
I have previously used align2d (see script align2d_old.py below) to prepare homology models, but the current sequences are very long (a pentamer with 573 aa per chain) and align2d was taking a very long time (hours), so I wanted to use salign.
Based on what I have read in tutorials and forums, however, I am rather confused about the sequence of scripts/commands to use.
Essentially, I have three questions:
- What is the sequence of scripts/commands to use? Do I first use salign to prepare an alignment and then align2d (or salign emulating align2d)?
- What is the purpose of first using salign and then align2d?
- In my previous align2d scripts the .pdb file was used as an input. In the new scripts the .pdb file does not seem to be used, so how is the known structure of the template taken into consideration for the alignment?
The scripts I have previously used and the ones I now use are below.
Name of template: Glinker (known sequence and structure (pdb))
Name of target: GSAlinker (known sequence)
Thanks,
Evelyne
*************************************************************
# profile-profile alignment using salign
# based on salign_profile_profile.py
# on https://salilab.org/modeller/9v7/manual/node288.html
from modeller import *
log.level(1, 0, 1, 1, 1)
env = environ()
aln = alignment(env, file='HA1onCtermVP1dc63_Glinker_GSAlinker.fasta', alignment_format='FASTA')
aln.salign(rr_file='${LIB}/blosum62.sim.mat',
           gap_penalties_1d=(-500, 0), output='',
           align_block=2,   # no. of seqs. in first MSA
           align_what='PROFILE',
           alignment_type='PAIRWISE',
           comparison_type='PSSM',  # or 'MAT' (Caution: Method NOT benchmarked
                                    # for 'MAT')
           similarity_flag=True,    # The score matrix is not rescaled
           substitution=True,       # The BLOSUM62 substitution values are
                                    # multiplied to the corr. coef.
           #write_weights=True,
           #output_weights_file='test.mtx', # optional, to write weight matrix
           smooth_prof_weight=10.0) # For mixing data with priors
#write out aligned profiles (MSA)
aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker.ali', alignment_format='PIR')
# Make a pairwise alignment of two sequences
aln = alignment(env, file='HA1onCtermVP1dc63_Glinker_GSAlinker.ali', alignment_format='PIR',
                align_codes=('HA1onCtermVP1dc63_Glinker', 'HA1onCtermVP1dc63_GSAlinker'))
aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_pair.ali', alignment_format='PIR')
aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_pair.pap', alignment_format='PAP')
**************************************************************
**************************************************************
# based on example on salign_align2d.py
# on https://salilab.org/modeller/9v7/manual/node288.html
# align2d/align using salign
# parameters to be input by the user
# 1.  gap_penalties_1d
# 2.  gap_penalties_2d
# 3.  input alignment file
from modeller import *
log.verbose()
env = environ()
env.io.atom_files_directory = ['../atom_files']
aln = alignment(env, file='HA1onCtermVP1dc63_Glinker_GSAlinker_pair.ali', align_codes='all')
aln.salign(rr_file='$(LIB)/as1.sim.mat',  # Substitution matrix used
           output='',
           max_gap_length=20,
           gap_function=True,              # If False then align2d not done
           feature_weights=(1., 0., 0., 0., 0., 0.),
           gap_penalties_1d=(-450, -50),
           gap_penalties_2d=(3.5, 3.5, 3.5, 0.2, 4.0, 6.5, 2.0, 0.0, 0.0),
           # d.p. score matrix
           #write_weights=True, output_weights_file='salign.mtx'
           similarity_flag=True)   # Ensuring that the dynamic programming
                                   # matrix is not scaled to a difference matrix
aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_align2d.ali', alignment_format='PIR',
          alignment_features=' INDICES CONSERVATION ACCURACY HELIX BETA')
aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_align2d.pap', alignment_format='PAP',
          alignment_features=' INDICES CONSERVATION ACCURACY HELIX BETA')
**************************************************************
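On the third question: as far as I understand the PIR format described in the Modeller manual, whether a structure is read depends on the first field of each entry's header line (the line after `>P1;code`). Entries marked `structureX` (or `structureN`, `structureM`, `structure`) point to a coordinate file that is looked up in `env.io.atom_files_directory`, whereas `sequence` entries carry no structure. A PIR file written from a FASTA input would presumably contain only `sequence` entries. The sketch below is plain Python and not part of Modeller; the function names `pir_entry_types` and `has_structure` and the short example sequences are made up for illustration.

```python
def pir_entry_types(pir_text):
    """Map each align code in a PIR alignment to the first field of its header line."""
    types = {}
    code = None
    for line in pir_text.splitlines():
        line = line.strip()
        if line.startswith(">P1;"):
            # Start of a new entry; remember its align code
            code = line[4:]
        elif code is not None and line:
            # First non-empty line after '>P1;' is the colon-separated header
            types[code] = line.split(":", 1)[0]
            code = None
    return types


def has_structure(entry_type):
    """True for structure, structureX, structureN and structureM entries."""
    return entry_type.lower().startswith("structure")


# Placeholder entries: the residues are dummies, not the real sequences.
example = """>P1;HA1onCtermVP1dc63_Glinker
structureX:HA1onCtermVP1dc63_Glinker:FIRST:A:LAST:E::::
MKT/MKT*
>P1;HA1onCtermVP1dc63_GSAlinker
sequence:HA1onCtermVP1dc63_GSAlinker::::::::
MKS/MKS*
"""

for align_code, entry_type in pir_entry_types(example).items():
    print(align_code, entry_type, has_structure(entry_type))
```

Running this on the `_pair.ali` file read by the script above (pass the file contents as `pir_text`) would show whether the template entry is flagged as a structure at all.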
**************************************************************
# align2d_old.py: previous approach, align2d() with the template .pdb as input
from modeller import *
log.verbose()
# create environment and new alignment
env = environ()
aln = alignment(env)
# create a model for the template HA1onCtermVP1dc63_Glinker
mdl = model(env, file='HA1onCtermVP1dc63_Glinker', model_segment=('FIRST:A','LAST:E'))
aln.append_model(mdl, align_codes='HA1onCtermVP1dc63_Glinker', atom_files='HA1onCtermVP1dc63_Glinker.pdb')
# append the sequence of the target HA1onCtermVP1dc63_GSAlinker
aln.append(file='HA1onCtermVP1dc63_GSAlinker.ali', align_codes='HA1onCtermVP1dc63_GSAlinker')
aln.align2d()
aln.write(file='HA1onCtermVP1dc63_Glinker_HA1onCtermVP1dc63_GSAlinker.ali', alignment_format='PIR')
aln.write(file='HA1onCtermVP1dc63_Glinker_HA1onCtermVP1dc63_GSAlinker.pap', alignment_format='PAP')
**************************************************************
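Since the template is read as five chains (`FIRST:A` to `LAST:E`), the target entry in the .ali file presumably needs matching chain breaks, which Modeller's PIR format writes as `/`, with `*` terminating the sequence. A small plain-Python check of the chain lengths, assuming a hypothetical helper `chain_lengths` and a dummy poly-alanine sequence in place of the real one:

```python
def chain_lengths(pir_sequence):
    """Return the number of residues in each chain of a PIR sequence string."""
    # Remove whitespace/newlines and the terminating '*'
    seq = "".join(pir_sequence.split()).rstrip("*")
    # '/' separates chains; '-' marks alignment gaps, which are not residues
    return [len(chain.replace("-", "")) for chain in seq.split("/")]


# Dummy pentamer: five chains of 573 alanines (not the real sequence)
dummy = "/".join(["A" * 573] * 5) + "*"
lengths = chain_lengths(dummy)
print(lengths, sum(lengths))
```

For a pentamer with 573 residues per chain this reports five chains and 2865 residues in total, which gives a sense of the size of the alignment problem.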

Evelyne Deplazes

Modeller Caretaker
