Hi
I want to make a homology model using the .pdb structure of a template and a target sequence (that is very similar to the template)
I have previosly used align2d (see script align2d_old.py below) to prepare homology models. but the current sequences are very long (a pentamer with 573 aa per chain) and align2d was taking a very long time (hours). so I wanted to use salign.
based on what I read in tutorials and forums I am however rather confused about the sequence of scritpts/commands to use
Essentially, I have three questions:
- what is the sequence of scripts / commands to use? do I first use salign to prepare an alignment and then align2d (or salign that emulates align2d).
- what is the purpose of first using salign and then align2d ?
- in the previous scripts of align2d the .pdb file was used as an input. in the new scripts that I used the .pdb file does not seem to be used. so how is the known structure of the template taken in to consideration for the alignment?
the scripts I have previously used and the ones I now use are below.
name of template: Glinker (known sequence and structure (pdb))
name of target: GSAlinker (known sequence )
thanks
Evelyne
*************************************************************
# profile-profile alignment using salign
# based on salign_profile_profile.py
# on https://salilab.org/modeller/9v7/manual/node288.html
from modeller import *
log.level(1, 0, 1, 1, 1)
env = environ()
aln = alignment(env, file='HA1onCtermVP1dc63_Glinker_GSAlinker.fasta', alignment_format='FASTA')
aln.salign(rr_file='${LIB}/blosum62.sim.mat',
gap_penalties_1d=(-500, 0), output='',
align_block=2, # no. of seqs. in first MSA
align_what='PROFILE',
alignment_type='PAIRWISE',
comparison_type='PSSM', # or 'MAT' (Caution: Method NOT benchmarked
# for 'MAT')
similarity_flag=True, # The score matrix is not rescaled
substitution=True, # The BLOSUM62 substitution values are
# multiplied to the corr. coef.
#write_weights=True,
#output_weights_file='test.mtx', # optional, to write weight matrix
smooth_prof_weight=10.0) # For mixing data with priors
#write out aligned profiles (MSA)
aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker.ali', alignment_format='PIR')
# Make a pairwise alignment of two sequences
aln = alignment(env, file='HA1onCtermVP1dc63_Glinker_GSAlinker.ali', alignment_format='PIR',
align_codes=('HA1onCtermVP1dc63_Glinker', 'HA1onCtermVP1dc63_GSAlinker'))
aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_pair.ali', alignment_format='PIR')
aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_pair.pap', alignment_format='PAP')
**************************************************************
**************************************************************
**************************************************************
**************************************************************
from modeller import *
log.verbose()
# create evironment and new alignment
env = environ()
aln = alignment(env)
# create a model for the template HA1onCtermVP1dc63_Glinker
mdl = model(env, file='HA1onCtermVP1dc63_Glinker', model_segment=('FIRST:A','LAST:E'))
aln.append_model(mdl, align_codes='HA1onCtermVP1dc63_Glinker', atom_files='HA1onCtermVP1dc63_Glinker.pdb')
# create a model for the target HA1onCtermVP1dc63_GSAlinker
aln.append(file='HA1onCtermVP1dc63_GSAlinker.ali', align_codes='HA1onCtermVP1dc63_GSAlinker')
aln.align2d()
aln.write(file='HA1onCtermVP1dc63_Glinker_HA1onCtermVP1dc63_GSAlinker.ali', alignment_format='PIR')
aln.write(file='HA1onCtermVP1dc63_Glinker_HA1onCtermVP1dc63_GSAlinker.pap', alignment_format='PAP')
**************************************************************