The format of the profile file (text) is as follows:
# Number of sequences: 4
# Length of profile : 20
# N_PROF_ITERATIONS : 3
# GAP_PENALTIES_1D : -900.0 -50.0
# MATRIX_OFFSET : 0.0
# RR_FILE : ${MODINSTALLCVS}/modlib//as1.sim.mat
1 2ctx X 0 71 1 71 0 0 0 0. 0.0 IRCFITPDITS---KDCPN-
2 2abx X 0 74 1 74 0 0 0 0. 0.0 IVCHTTATIPS-SAVTCPPG
3 2nbt X 0 66 1 66 0 0 0 0. 0.0 RTCLISPSS---TPQTCPNG
4 1fas X 0 61 1 61 0 0 0 0. 0.0 TMCYSHTTTSRAILTNCG--
The first six lines begin with a '#' in the first column and give a few general details of the profile.
The first line gives the number of sequences in the profile. The line should be in the following format: '(24x,i6)'.
The second line gives the number of positions in the profile. This should be in '(24x,i6)' format also.
The third line gives the value of the n_prof_iterations variable. The fourth line gives the value of the gap_penalties_1d variable. The fifth line gives the value of the matrix_offset variable. The sixth line gives the value of the rr_file variable.
The number of sequences in the profile and its length are used to allocate memory for the profile arrays, so they should provide an accurate description of the profile.
The values of the variables described in lines 3 through 6 are not used internally by MODELLER, but the Profile.read() command still expects to find a total of six header lines. These records are informative when the profile was constructed with Profile.build().
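As a rough illustration of the header layout described above, the following Python sketch reads the two values that matter (the number of sequences and the profile length) from the first six lines. It is not part of MODELLER: the function name read_profile_header is hypothetical, and the sketch assumes the file keeps the fixed-width '(24x,i6)' layout, that is, 24 skipped characters followed by a 6-character integer field.

```python
def read_profile_header(lines):
    """Return (n_sequences, profile_length) from a profile file's header.

    `lines` is a list of text lines from the file. Hypothetical helper,
    assuming the fixed-width '(24x,i6)' layout for lines 1 and 2.
    """
    header = lines[:6]
    # Six header lines are expected, each with '#' in the first column.
    if len(header) < 6 or not all(line.startswith("#") for line in header):
        raise ValueError("expected six header lines starting with '#'")

    # '(24x,i6)': skip 24 characters, then read a 6-character integer.
    n_sequences = int(header[0][24:30])
    profile_length = int(header[1][24:30])
    return n_sequences, profile_length
```

Lines 3 through 6 are only checked for presence here, mirroring the statement that their values are not used internally.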
The remaining lines contain the alignment of the sequences in the profile. Each of these lines has the format: '(i5,1x,a40,1x,a1,1x,7(i5,1x),f5.0,1x,g10.2,1x,32767a1)'
The various columns that precede the sequence are:
Many of the fields described above are valid only when the profile that is written out is the result of Profile.build().
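The fixed-width record format '(i5,1x,a40,1x,a1,1x,7(i5,1x),f5.0,1x,g10.2,1x,32767a1)' can be split by character position. The Python sketch below is one way to do that, assuming the file preserves the exact column widths. The function name parse_profile_record and the dictionary keys are hypothetical, chosen only to mirror the format descriptors, and are not MODELLER names.

```python
def parse_profile_record(line):
    """Split one alignment line of a profile file into its fields.

    Hypothetical helper; column offsets follow the format
    '(i5,1x,a40,1x,a1,1x,7(i5,1x),f5.0,1x,g10.2,1x,32767a1)'.
    """
    line = line.rstrip("\n")
    return {
        "index": int(line[0:5]),            # i5
        "code": line[6:46].strip(),         # a40
        "type": line[47],                   # a1
        # 7(i5,1x): seven integers, each 5 wide plus 1 separator
        "int_fields": [int(line[49 + 6 * k:54 + 6 * k]) for k in range(7)],
        "f_field": float(line[91:96]),      # f5.0
        "g_field": float(line[97:107]),     # g10.2
        "sequence": line[108:],             # 32767a1: the aligned sequence
    }
```

Slicing by offset rather than splitting on whitespace keeps the parse correct if a field such as the sequence code is blank or contains spaces.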