[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[modeller_usage] sequence of salign/align2d



Hi


I want to make a homology model using the .pdb structure of a template and a target sequence (that is very similar to the template) 


I have previosly used align2d (see script align2d_old.py below) to prepare homology models. but the current sequences are very long (a pentamer with 573 aa per chain) and align2d was taking a very long time (hours). so I wanted to use salign.


based on what I read in tutorials and forums I am however rather confused about the sequence of scritpts/commands to use 


Essentially, I have three questions:


- what is the sequence of scripts / commands to use? do I first use salign to prepare an alignment and then align2d (or salign that emulates align2d).

- what is the purpose of first using salign and then align2d ?

- in the previous scripts of align2d the .pdb file was used as an input. in the new scripts that I used the .pdb file does not seem to be used. so how is the known structure of the template taken in to consideration for the alignment? 


the scripts I have previously used and the ones I now use are below. 

name of template: Glinker (known sequence and structure (pdb))

name of target: GSAlinker (known sequence ) 


thanks 

Evelyne 



*************************************************************

# profile-profile alignment using salign

# based on salign_profile_profile.py

# on https://salilab.org/modeller/9v7/manual/node288.html

from modeller import *


log.level(1, 0, 1, 1, 1)

env = environ()


aln = alignment(env, file='HA1onCtermVP1dc63_Glinker_GSAlinker.fasta', alignment_format='FASTA')


aln.salign(rr_file='${LIB}/blosum62.sim.mat',

           gap_penalties_1d=(-500, 0), output='',

           align_block=2,   # no. of seqs. in first MSA

           align_what='PROFILE',

           alignment_type='PAIRWISE',

           comparison_type='PSSM',  # or 'MAT' (Caution: Method NOT benchmarked

                                    # for 'MAT')

           similarity_flag=True,    # The score matrix is not rescaled

           substitution=True,       # The BLOSUM62 substitution values are

                                    # multiplied to the corr. coef.

           #write_weights=True,

           #output_weights_file='test.mtx', # optional, to write weight matrix

           smooth_prof_weight=10.0) # For mixing data with priors


#write out aligned profiles (MSA)

aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker.ali', alignment_format='PIR')


# Make a pairwise alignment of two sequences

aln = alignment(env, file='HA1onCtermVP1dc63_Glinker_GSAlinker.ali', alignment_format='PIR',

                align_codes=('HA1onCtermVP1dc63_Glinker', 'HA1onCtermVP1dc63_GSAlinker'))

aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_pair.ali', alignment_format='PIR')

aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_pair.pap', alignment_format='PAP')

**************************************************************


**************************************************************

# based on example on salign_align2d.py
# on https://salilab.org/modeller/9v7/manual/node288.html
# align2d/align using salign
# parameters to be input by the user
# 1.  gap_penalties_1d
# 2.  gap_penalties_2d
# 3.  input alignment file

from modeller import *
log.verbose()
env = environ()
env.io.atom_files_directory = ['../atom_files']

aln = alignment(env, file='HA1onCtermVP1dc63_Glinker_GSAlinker_pair.ali', align_codes='all')
aln.salign(rr_file='$(LIB)/as1.sim.mat',  # Substitution matrix used
           output='',
           max_gap_length=20,
           gap_function=True,              # If False then align2d not done
           feature_weights=(1., 0., 0., 0., 0., 0.),
           gap_penalties_1d=(-450, -50),
           gap_penalties_2d=(3.5, 3.5, 3.5, 0.2, 4.0, 6.5, 2.0, 0.0, 0.0),
           # d.p. score matrix
           #write_weights=True, output_weights_file='salign.mtx'
           similarity_flag=True)   # Ensuring that the dynamic programming
                                   # matrix is not scaled to a difference matrix
aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_align2d.ali', alignment_format='PIR' , alignment_features=' INDICES CONSERVATION ACCURACY HELIX BETA')
aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_align2d.pap', alignment_format='PAP' , alignment_features=' INDICES CONSERVATION ACCURACY HELIX BETA')

**************************************************************


**************************************************************

from modeller import *


log.verbose()


# create evironment and new alignment 

env = environ()

aln = alignment(env)


# create a model for the template HA1onCtermVP1dc63_Glinker

mdl = model(env, file='HA1onCtermVP1dc63_Glinker', model_segment=('FIRST:A','LAST:E'))

aln.append_model(mdl, align_codes='HA1onCtermVP1dc63_Glinker', atom_files='HA1onCtermVP1dc63_Glinker.pdb')


# create a model for the target HA1onCtermVP1dc63_GSAlinker

aln.append(file='HA1onCtermVP1dc63_GSAlinker.ali', align_codes='HA1onCtermVP1dc63_GSAlinker')


aln.align2d()


aln.write(file='HA1onCtermVP1dc63_Glinker_HA1onCtermVP1dc63_GSAlinker.ali', alignment_format='PIR')

aln.write(file='HA1onCtermVP1dc63_Glinker_HA1onCtermVP1dc63_GSAlinker.pap', alignment_format='PAP')

**************************************************************