The ModPipe configuration file
The configuration file, specified by the --conf_file command line
argument to many ModPipe programs, provides file locations, such as the
location of the template sequence and profile files, and the location of
ModPipe output files. The configuration file also provides a number of run
parameters, such as whether template sequences will be clustered before
building models. The variables in the configuration file are
described here.
Environment variables can be used in the configuration file using familiar
Unix syntax; e.g., $FOO is replaced with the value of the environment
variable FOO.
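The substitution described above behaves much like Python's os.path.expandvars. The sketch below only illustrates that Unix-style expansion and is not ModPipe's own parser; the variable name MODPIPE_DB and the path are hypothetical.

```python
import os

# Hypothetical environment variable, set here only for the demonstration.
os.environ["MODPIPE_DB"] = "/data/modpipe"

# A configuration value that refers to the variable using Unix syntax.
raw_value = "$MODPIPE_DB/PDB95/db/pdb_95.hdf5"

# os.path.expandvars replaces $NAME (and ${NAME}) with the variable's value.
expanded = os.path.expandvars(raw_value)
print(expanded)
```

Note that os.path.expandvars leaves references to undefined variables unchanged; how ModPipe treats an unset variable is not stated here.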
See also Databases used by ModPipe for information on the databases pointed to by this file, and information on setting up these databases to run ModPipe if you are not in the Sali lab.
See also a sample configuration file.
TMPDIR
ModPipe will create a new directory here for every sequence it processes and will use this as scratch space for all calculations. This directory should be local to the machine running ModPipe in order to reduce network traffic.
DATDIR
This is the base directory in which the ModPipe filesystem will be created.
TEMPLATESEQDB
The name of the file of template sequences to use in searches for matches. Unless you have some special need and know what you are doing, you should use a binary (HDF5) database file; for example, /netapp/sali/ModPipe/database/PDB95/db/pdb_95.hdf5.
XPRF_LIST
Name of the file containing the list of template profile (.prf) files, one for each template in the TEMPLATESEQDB database file.
XPRF_PSSMDB
Name of the file containing position-specific scoring matrix data for each template sequence.
PDB_REPOSITORY
The name of the directory containing PDB files.
NRSEQDB
The name of the file of non-redundant sequences to use for construction of profiles by Modeller’s Profile.build(). Unless you have some special need and know what you are doing, you should use a binary (HDF5) database file; for example, /netapp/database/uniprot/sequences/uniprot90.hdf5.
NCBISEQDB
The name of the file of non-redundant sequences to use for construction of profiles by PSI-BLAST.
NRDBTAG
A short name for the non-redundant sequence database, which ModPipe will use as part of the name of profile files (multiple sequence alignments) constructed using that database. Usually it will be uniprot90.
PRFUPDATE
If this flag is set to ON, a new profile will be calculated regardless of whether a profile for the target sequence already exists. If PRFUPDATE is set to OFF, ModPipe will calculate a new target-sequence profile only if one does not already exist.
NUMMODELS
Number of models to calculate for each alignment.
CLUSTERALI
When multiple fold assignment methods are used, they run independently and will typically find a number of templates in common. Setting this variable to ON removes redundant and highly similar hits from the alignment of all templates found.
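Two of the rules above, the per-sequence scratch directory under TMPDIR and the PRFUPDATE flag, can be sketched in Python as follows. This illustrates the documented behaviour only and is not ModPipe's implementation; the function names, the sequence identifier, the profile file name, and the directory naming scheme are all assumptions.

```python
import os
import tempfile


def make_scratch_dir(tmpdir, seq_id):
    """Create a fresh scratch directory under tmpdir for one sequence.

    Using the sequence id as a prefix is an assumed naming scheme.
    """
    return tempfile.mkdtemp(prefix=seq_id + "_", dir=tmpdir)


def needs_new_profile(prfupdate, profile_path):
    """Apply the PRFUPDATE rule: ON always rebuilds, OFF only if missing."""
    if prfupdate == "ON":
        return True
    return not os.path.exists(profile_path)


# Stand-in for the TMPDIR setting; a real run would point at local disk.
tmpdir = tempfile.mkdtemp()
scratch = make_scratch_dir(tmpdir, "seq0001")

# Hypothetical profile file name, used only to exercise the rule.
profile = os.path.join(scratch, "seq0001-uniprot90.prf")

before = needs_new_profile("OFF", profile)     # no profile exists yet
open(profile, "w").close()                     # pretend a profile was built
after_off = needs_new_profile("OFF", profile)  # existing profile is reused
after_on = needs_new_profile("ON", profile)    # ON forces a rebuild
print(before, after_off, after_on)
```

Because tempfile.mkdtemp picks a unique name on each call, two sequences processed at the same time would not share a scratch directory in this sketch.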