This command reads the sequence(s) and/or their alignment from a text file. Only sequences with the specified codes are read in; align_codes = 'all' can be used to read all sequences. The sequences are added to any currently in the alignment.
file can be either a file name or a readable file handle (see modfile.File()).
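The selection by code described above can be pictured with a small stand-alone reader. This is only a sketch, not MODELLER's parser: the function name read_pir, the codes seqA and seqB, and the toy sequences are all made up for illustration, and it assumes the simple PIR layout of a '>P1;code' header, one description line, and a sequence terminated by '*'.

```python
def read_pir(text, align_codes='all'):
    """Return [(code, sequence)] for the PIR entries whose code is selected.

    Hypothetical helper for illustration; it ignores the description line
    and does not read anything from PDB files.
    """
    if isinstance(align_codes, str) and align_codes != 'all':
        align_codes = (align_codes,)      # allow a single code as a string
    entries = []
    lines = iter(text.splitlines())
    for line in lines:
        if not line.startswith('>P1;'):
            continue                      # skip anything outside an entry
        code = line[4:].strip()
        next(lines, '')                   # description line, ignored here
        chunks = []
        for seq_line in lines:            # sequence may span several lines
            chunks.append(seq_line.strip())
            if seq_line.rstrip().endswith('*'):
                break                     # '*' terminates the sequence
        sequence = ''.join(chunks).rstrip('*')
        if align_codes == 'all' or code in align_codes:
            entries.append((code, sequence))
    return entries


# Toy input with two made-up entries.
pir_text = """\
>P1;seqA
sequence:seqA:::::::0.00: 0.00
ACD-EF*
>P1;seqB
sequence:seqB:::::::0.00: 0.00
ACD
GEF*
"""

print(read_pir(pir_text))                        # both entries
print(read_pir(pir_text, align_codes=['seqB']))  # only seqB
```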
There are several alignment formats:
If remove_gaps = True, positions with gaps in all selected sequences are removed from the alignment.
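The effect of remove_gaps can be sketched in plain Python as dropping every alignment column that holds a gap in all of the sequences read in. The helper name remove_all_gap_columns and the toy sequences are assumptions for illustration; this is not MODELLER's implementation.

```python
def remove_all_gap_columns(sequences, gap='-'):
    """Drop the columns in which every sequence has a gap.

    Assumes all sequences are already aligned to the same length.
    """
    if not sequences:
        return []
    # Indices of the columns where at least one sequence has a residue.
    keep = [i for i, column in enumerate(zip(*sequences))
            if any(res != gap for res in column)]
    return [''.join(seq[i] for i in keep) for seq in sequences]


aligned = ['AC--DE',
           'A---DE',
           'AC-GD-']
# Only the third column is a gap in all three sequences, so it is removed.
print(remove_all_gap_columns(aligned))
# -> ['AC-DE', 'A--DE', 'ACGD-']
```

Columns that contain a gap in only some of the sequences are kept, since they still carry alignment information.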
The io argument is required since PIR files can contain empty sequences or ranges; in this case, the sequence or range is read from the corresponding PDB file.
If allow_alternates = True, and a 'PIR' file is being read in which '.' is used to force MODELLER to read the sequence range from the corresponding PDB file (see Section B.1), the search for matches between the alignment sequence and the PDB sequence is made a little more flexible. An exact equivalence of one-letter codes still counts as a match, but so does each residue's alternate (as defined by the STD column in 'modlib/restyp.lib'). For example, B (ASX) in the alignment will be considered a match for N (ASN) in the PDB, while G (GLY) in the alignment will match any non-standard residue in the PDB for which an explicit equivalence has not been defined (the DEFATM behavior in 'modlib/restyp.lib'). The alignment sequence is then modified to match the exact sequence from the PDB. This is useful if the alignment sequence is extracted from a database containing 'cleaned' sequences, e.g. one created by sequence_db.read().
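The relaxed matching can be sketched as follows. This is a minimal illustration under stated assumptions, not MODELLER's code: the tables are hand-written stand-ins for 'modlib/restyp.lib' (only the B to N pair comes from the description above), the one-letter code 'z' for a non-standard residue is invented, and the names residues_match and reconcile are hypothetical.

```python
# Illustrative tables only; MODELLER takes the real equivalences from
# 'modlib/restyp.lib'. Only the B -> N pair is taken from the text above.
STANDARD = set('ACDEFGHIKLMNPQRSTVWY')
ALTERNATES = {'B': {'N'}}
# PDB codes that have an explicit equivalence in the toy table.
EXPLICIT = set().union(*ALTERNATES.values())


def residues_match(aln_res, pdb_res, allow_alternates=False):
    """Compare one alignment residue with one PDB residue."""
    if aln_res == pdb_res:
        return True                       # exact one-letter match
    if not allow_alternates:
        return False
    if pdb_res in ALTERNATES.get(aln_res, ()):
        return True                       # explicit alternate, e.g. B ~ N
    # Fallback: G matches a non-standard residue with no explicit equivalence.
    return (aln_res == 'G' and pdb_res not in STANDARD
            and pdb_res not in EXPLICIT)


def reconcile(aln_seq, pdb_seq, allow_alternates=False):
    """Return the PDB sequence if every position matches, else raise."""
    if len(aln_seq) != len(pdb_seq) or not all(
            residues_match(a, p, allow_alternates)
            for a, p in zip(aln_seq, pdb_seq)):
        raise ValueError('alignment sequence does not match PDB sequence')
    return pdb_seq                        # alignment takes the PDB residues


# 'z' is a made-up code standing for some non-standard PDB residue.
print(reconcile('ABG', 'ANz', allow_alternates=True))
```

With allow_alternates left at False, the same call raises ValueError, which in this sketch stands in for the mismatch error that MODELLER reports.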
For 'PIR' and 'FASTA' files, the end_of_file variable is set to 1 if MODELLER reached the end of the file during the read, or 0 otherwise.
This command can raise a FileFormatError if the alignment file format is invalid, or a SequenceMismatchError if a 'PIR' sequence does not match the sequence read from the PDB file (when an empty range is given).
# Example for: alignment.append(), alignment.write(),
#              alignment.check()

# Read an alignment, write it out in the 'PAP' format, and
# check the alignment of the N-1 structures as well as the
# alignment of the N-th sequence with each of the N-1 structures.

from modeller import *

log.level(output=1, notes=1, warnings=1, errors=1, memory=0)
env = environ()
env.io.atom_files_directory = ['../atom_files']

aln = alignment(env)
aln.append(file='toxin.ali', align_codes='all')
aln.write(file='toxin.pap', alignment_format='PAP')
aln.write(file='toxin.fasta', alignment_format='FASTA')
aln.check()