|io = <io_data>||Options for reading atom files|
|gap_penalties_3d = <float:2>||0.0 1.75||gap creation and extension penalties for structure/structure superposition|
|fit_atoms = <str:1>||'CA'||atom type used for superposition|
|fit = <bool:1>||True||whether to align|
|output = <str:1>||'LONG'||'SHORT' | 'LONG' | 'VERY_LONG'|
|align3d_trf = <bool:1>||False||whether to transform the distances before dynamic programming|
|align3d_repeat = <bool:1>||False||do several starts to maximize number of equivalent positions|
|off_diagonal = <int:1>||100||to speed up the alignment|
|matrix_offset = <float:1>||0.00||substitution matrix offset for local alignment|
|overhang = <int:1>||0||un-penalized overhangs in protein comparisons|
|local_alignment = <bool:1>||False||whether to do local as opposed to global alignment|
This command uses the current alignment as the starting point for an iterative least-squares superposition of two 3D structures. This results in a new pairwise structural alignment. A good initial alignment may be obtained by sequence alignment (alignment.align()). For superpositions, only one atom per residue is used, as specified by fit_atoms.
The alignment algorithm is as follows. First, structure 2 is least-squares fit on structure 1 using all the equivalent residue positions in the initial alignment that have the specified atom type. Next, the residue-residue distance matrix is obtained by calculating Euclidean distances between all pairs of selected atoms from the two structures. The alignment of the two structures is then obtained by the standard dynamic programming optimization based on the residue-residue distance matrix.
gap_penalties_3d is a gap creation penalty (usually 0), and gap_penalties_3d is a gap extension penalty, say 1.75. This procedure identifies pairs of residues as equivalent when they have their selected atoms at most 2 times gap_penalties_3d angstroms apart in the current orientation (this is so when the gap initiation penalty is 0). The reason is that an equivalence costs the distance between the two residues while an alternative, the gap-residue and residue-gap matches, costs twice the gap extension penalty.
From the dynamic programming run, a new alignment is obtained. Thus, structure 2 can be fitted onto structure 1 again, using this new alignment, and the whole cycle is repeated until there is no change in the number of equivalent positions and until the difference in the rotation matrices for the last two superpositions is very small. At the end, the framework, that is the alignment positions without gaps, is written to the log file.
If fit is False, no alignment is done.
If output contains 'SHORT', only the best alignment and its summary are displayed. If output contains 'LONG', summaries are displayed for all initial alignments in each framework cycle. If output contains 'VERY_LONG', all alignments are displayed.
If align3d_trf is True, the weights in the weight matrix are modified distances [Subbiah et al., 1993].
If align3d_repeat is True, three additional initial alignments are tried and the one resulting in the largest number of equivalent positions is selected.
# Example for: alignment.align3d(), selection.superpose() # This will align 3D structures of two proteins: from modeller import * log.verbose() env = environ() env.io.atom_files_directory = '../atom_files' # First example: read sequences from a sequence file: aln = alignment(env) aln.append(file='toxin.ali', align_codes=['1fas', '2ctx']) aln.align(gap_penalties_1d=[-600, -400]) aln.align3d(gap_penalties_3d=[0, 4.0]) aln.write(file='toxin-str.ali') # Second example: read sequences from PDB files to eliminate the # need for the toxin.ali sequence file: mdl = model(env) aln = alignment(env) for code in ['1fas', '2ctx']: mdl.read(file=code) aln.append_model(mdl, align_codes=code, atom_files=code) aln.align(gap_penalties_1d=(-600, -400)) aln.align3d(gap_penalties_3d=(0, 2.0)) aln.write(file='toxin-str.ali') # And now superpose the two structures using current alignment to get # various RMS's: mdl = model(env, file='1fas') atmsel = selection(mdl).only_atom_types('CA') mdl2 = model(env, file='2ctx') atmsel.superpose(mdl2, aln)