I want to make a homology model using the .pdb structure of a template and a target sequence (that is very similar to the template)

I have previosly used align2d (see script align2d_old.py below) to prepare homology models. but the current sequences are very long (a pentamer with 573 aa per chain) and align2d was taking a very long time (hours). so I wanted to use salign.

based on what I read in tutorials and forums I am however rather confused about the sequence of scritpts/commands to use

Essentially, I have three questions:

- what is the sequence of scripts / commands to use? do I first use salign to prepare an alignment and then align2d (or salign that emulates align2d).

- what is the purpose of first using salign and then align2d ?

- in the previous scripts of align2d the .pdb file was used as an input. in the new scripts that I used the .pdb file does not seem to be used. so how is the known structure of the template taken in to consideration for the alignment?

the scripts I have previously used and the ones I now use are below.

name of template: Glinker (known sequence and structure (pdb))

name of target: GSAlinker (known sequence )

thanks

Evelyne

*************************************************************

# profile-profile alignment using salign

# based on salign_profile_profile.py

# on https://salilab.org/modeller/9v7/manual/node288.html

from modeller import *

log.level(1, 0, 1, 1, 1)

env = environ()

aln = alignment(env, file='HA1onCtermVP1dc63_Glinker_GSAlinker.fasta', alignment_format='FASTA')

aln.salign(rr_file='${LIB}/blosum62.sim.mat',

gap_penalties_1d=(-500, 0), output='',

align_block=2, # no. of seqs. in first MSA

align_what='PROFILE',

alignment_type='PAIRWISE',

comparison_type='PSSM', # or 'MAT' (Caution: Method NOT benchmarked

# for 'MAT')

similarity_flag=True, # The score matrix is not rescaled

substitution=True, # The BLOSUM62 substitution values are

# multiplied to the corr. coef.

#write_weights=True,

#output_weights_file='test.mtx', # optional, to write weight matrix

smooth_prof_weight=10.0) # For mixing data with priors

#write out aligned profiles (MSA)

aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker.ali', alignment_format='PIR')

# Make a pairwise alignment of two sequences

aln = alignment(env, file='HA1onCtermVP1dc63_Glinker_GSAlinker.ali', alignment_format='PIR',

align_codes=('HA1onCtermVP1dc63_Glinker', 'HA1onCtermVP1dc63_GSAlinker'))

aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_pair.ali', alignment_format='PIR')

aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_pair.pap', alignment_format='PAP')

**************************************************************

# based on example on salign_align2d.py

# on https://salilab.org/modeller/9v7/manual/node288.html

# align2d/align using salign

# parameters to be input by the user

# 1. gap_penalties_1d

# 2. gap_penalties_2d

# 3. input alignment file

from modeller import *

log.verbose()

env = environ()

env.io.atom_files_directory = ['../atom_files']

aln = alignment(env, file='HA1onCtermVP1dc63_Glinker_GSAlinker_pair.ali', align_codes='all')

aln.salign(rr_file='$(LIB)/as1.sim.mat', # Substitution matrix used

output='',

max_gap_length=20,

gap_function=True, # If False then align2d not done

feature_weights=(1., 0., 0., 0., 0., 0.),

gap_penalties_1d=(-450, -50),

gap_penalties_2d=(3.5, 3.5, 3.5, 0.2, 4.0, 6.5, 2.0, 0.0, 0.0),

# d.p. score matrix

#write_weights=True, output_weights_file='salign.mtx'

similarity_flag=True) # Ensuring that the dynamic programming

# matrix is not scaled to a difference matrix

aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_align2d.ali', alignment_format='PIR' , alignment_features=' INDICES CONSERVATION ACCURACY HELIX BETA')

aln.write(file='HA1onCtermVP1dc63_Glinker_GSAlinker_align2d.pap', alignment_format='PAP' , alignment_features=' INDICES CONSERVATION ACCURACY HELIX BETA')

**************************************************************

from modeller import *

log.verbose()

# create evironment and new alignment

env = environ()

aln = alignment(env)

# create a model for the template HA1onCtermVP1dc63_Glinker

mdl = model(env, file='HA1onCtermVP1dc63_Glinker', model_segment=('FIRST:A','LAST:E'))

aln.append_model(mdl, align_codes='HA1onCtermVP1dc63_Glinker', atom_files='HA1onCtermVP1dc63_Glinker.pdb')

# create a model for the target HA1onCtermVP1dc63_GSAlinker

aln.append(file='HA1onCtermVP1dc63_GSAlinker.ali', align_codes='HA1onCtermVP1dc63_GSAlinker')

aln.align2d()

aln.write(file='HA1onCtermVP1dc63_Glinker_HA1onCtermVP1dc63_GSAlinker.ali', alignment_format='PIR')

aln.write(file='HA1onCtermVP1dc63_Glinker_HA1onCtermVP1dc63_GSAlinker.pap', alignment_format='PAP')

**************************************************************