The target_profile_file and all the profiles listed in profile_list_file should be in a format that is understood by profile.read().
The profile_list_file should contain absolute or relative paths to the individual template profiles, one per line.
See documentation under profile.read() for help on profile_format.
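Since profile_list_file is simply a text file with one profile path per line, it can be generated with a short script. The following is a minimal sketch, not part of MODELLER: the helper name write_profile_list, the 'templates' directory and the '*.prf' filename pattern are all assumptions made for illustration.

```python
from pathlib import Path


def write_profile_list(profile_dir, list_path, pattern="*.prf"):
    """Write one absolute profile path per line; return the paths written.

    Hypothetical helper: the '*.prf' pattern is an assumption about how
    the template profile files are named.
    """
    # Sort so that the order of the list is reproducible between runs.
    paths = sorted(Path(profile_dir).glob(pattern))
    lines = [str(p.resolve()) for p in paths]
    Path(list_path).write_text("".join(line + "\n" for line in lines))
    return lines


# Usage (assumes a directory named 'templates' holding the profiles):
# write_profile_list("templates", "profiles.list")
```

Absolute paths are written here so that the list stays valid regardless of the directory the scan is run from; relative paths are equally acceptable.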
rr_file is the residue-residue substitution matrix to use when calculating the position-specific scoring matrix (PSSM). The current implementation is optimized only for the BLOSUM62 matrix.
gap_penalties_1d are the gap penalties to use for the dynamic programming. matrix_offset is the value used to offset the substitution matrix. The optimal values for these parameters are matrix_offset = -200 and gap_penalties_1d = (-1900, -95).
max_aln_evalue sets the E-value threshold. Alignments with E-values better (lower) than the threshold will be written out.
aln_base_filename sets the base filename for the alignments. The output alignment filenames will be of the form ALN_BASE_FILENAME_XXXX.ali, where XXXX is a 4-digit integer (padded with leading zeroes) that is incremented for each alignment. For example, alignment_0001.ali.
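The naming pattern described above can be reproduced in a post-processing script, for example to locate the alignment files after a scan. This is only a sketch: alignment_filename is a hypothetical helper, not a MODELLER function, and which index the first alignment receives is not stated here.

```python
def alignment_filename(aln_base_filename, index):
    """Build a name of the form BASE_XXXX.ali with a zero-padded index."""
    # ':04d' pads the integer with leading zeroes to four digits.
    return f"{aln_base_filename}_{index:04d}.ali"


# Usage: the name of alignment number 12 for the base 'T3lzt-ppscan'.
name = alignment_filename("T3lzt-ppscan", 12)
```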
score_statistics is a flag that triggers the calculation of E-values. If set to OFF, the significance estimates for the alignments will not be calculated. Setting it to OFF can be useful when profile_list_file contains only a very small number of template profiles, too few to calculate reliable statistics. The calculation of alignment significance is similar to that used for profile.build(); also see profile.build().
output_scores is a flag to write out the raw alignment scores, z-scores and E-values for all the comparisons. output_score_file sets the name of the file to which this output is written.
write_summary is a flag to output a summary of all the significant alignments into the file specified by summary_file.
If output_alignments is set to OFF, alignments will not be written out.
# Example for: profile.scan()

from modeller import *

env = environ()

# Read in the target profile
prf = profile(env, file='T3lzt-uniprot90.prf', profile_format='TEXT')

# Scan against all profiles in the 'profiles.list' file
prf.scan(profile_list_file = 'profiles.list',
         matrix_offset     = -200,
         rr_file           = '${LIB}/blosum62.sim.mat',
         gap_penalties_1d  = (-1900, -95),
         score_statistics  = False,
         output_alignments = True,
         output_scores     = False,
         output_score_file = 'T3lzt-ppscan.scores',
         profile_format    = 'TEXT',
         max_aln_evalue    = 1,
         aln_base_filename = 'T3lzt-ppscan',
         pssm_weights_type = 'HH1',
         write_summary     = True,
         summary_file      = 'T3lzt-ppscan.sum')