clean_sequences = <bool:1> | True | whether to clean non-standard residues |
If clean_sequences is set to True, the non-standard residues in the sequences are cleaned before the alignment is converted to the profile format. Specifically, ASX (B) is replaced with ASN (N), GLX (Z) with GLN (Q), and UNK (X) with ALA (A).
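The substitution described above can be illustrated with a small standalone sketch. It is not MODELLER's implementation: the names CLEAN_MAP and clean_sequence are hypothetical and exist only for this illustration, and the sketch assumes sequences are plain strings of one-letter residue codes.

```python
# Illustration only: CLEAN_MAP and clean_sequence are hypothetical names,
# not part of the MODELLER API. The table mirrors the replacements
# described in the text: B -> N, Z -> Q, X -> A.
CLEAN_MAP = {"B": "N", "Z": "Q", "X": "A"}


def clean_sequence(seq: str) -> str:
    """Replace B, Z and X with N, Q and A; leave all other characters unchanged."""
    return seq.translate(str.maketrans(CLEAN_MAP))


if __name__ == "__main__":
    print(clean_sequence("MKBZXLA"))  # MKNQALA
```

Standard residues pass through unchanged, so applying the function to an already clean sequence returns it as is.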
from modeller import *

env = environ()

# Read in the alignment file
aln = alignment(env)
aln.append(file='toxin.ali', alignment_format='PIR', align_codes='ALL')

# Convert the alignment to profile format
prf = aln.to_profile(clean_sequences=True)

# Write out the profile

# in text file
prf.write(file='alntoprof.prf', profile_format='TEXT')

# in binary format
prf.write(file='alntoprof.bin', profile_format='BINARY')