The ModPipe configuration file

The configuration file, specified by the --conf_file command-line argument to many ModPipe programs, provides file locations, such as those of the template sequence and profile files and of the ModPipe output files. It also provides a number of run parameters, such as whether template sequences will be clustered before building models. The variables in the configuration file are described here.

Environment variables can be used in the configuration file with the familiar Unix syntax; for example, $FOO is replaced with the value of the environment variable FOO.
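To illustrate the substitution described above, here is a minimal Python sketch. It is not ModPipe's own code: the helper name expand_config_value and the example value are hypothetical, and it relies on the standard library function os.path.expandvars, which leaves undefined variables unchanged. How ModPipe itself treats undefined variables is not stated here.

```python
import os


def expand_config_value(raw_value: str) -> str:
    """Replace $NAME and ${NAME} references with environment values.

    Hypothetical helper for illustration only; undefined variables
    are left in place by os.path.expandvars.
    """
    return os.path.expandvars(raw_value)


# Example value chosen for illustration; FOO is the variable named in the text.
os.environ["FOO"] = "/scratch/modpipe"
print(expand_config_value("$FOO/tmp"))  # prints /scratch/modpipe/tmp
```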

See also Databases used by ModPipe for information on the databases pointed to by this file, and information on setting up these databases to run ModPipe if you are not in the Sali lab.

See also a sample configuration file.

TMPDIR

ModPipe will create a new directory here for every sequence it processes and will use this as scratch space for all calculations. This directory should be local to the machine running ModPipe in order to reduce network traffic.
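The per-sequence scratch directory behaviour can be pictured with the short Python sketch below. It only illustrates the idea and does not reproduce ModPipe's implementation: the function name make_scratch_dir, the sequence identifier, and the directory naming scheme are all assumptions, and the system temporary directory stands in for TMPDIR.

```python
import os
import shutil
import tempfile


def make_scratch_dir(tmpdir: str, seq_id: str) -> str:
    """Create a fresh, uniquely named directory under tmpdir.

    Hypothetical helper: the seq_id prefix is an assumed naming scheme,
    not the one ModPipe actually uses.
    """
    return tempfile.mkdtemp(prefix=f"{seq_id}_", dir=tmpdir)


# Stand-in for TMPDIR; in practice this should be a disk local to the machine.
tmpdir = tempfile.gettempdir()
scratch = make_scratch_dir(tmpdir, "example_seq")
print(os.path.isdir(scratch))  # prints True
shutil.rmtree(scratch)  # remove the scratch space once calculations finish
```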

DATDIR

This is the base directory in which the ModPipe filesystem will be created.

TEMPLATESEQDB

The name of the file of template sequences to use in searches for matches. Unless you have a special need and know what you are doing, you should use a binary (HDF5) database file; for example, /netapp/sali/ModPipe/database/PDB95/db/pdb_95.hdf5.

XPRF_LIST

The name of the file containing a list of template profile (.prf) files, one for each template in the TEMPLATESEQDB database file.

XPRF_PSSMDB

The name of the file containing position-specific scoring matrix data for each template sequence.

PDB_REPOSITORY

The name of the directory containing PDB files.

NRSEQDB

The name of the file of non-redundant sequences to use for construction of profiles by Modeller’s Profile.build(). Unless you have a special need and know what you are doing, you should use a binary (HDF5) database file; for example, /netapp/database/uniprot/sequences/uniprot90.hdf5.

NCBISEQDB

The name of the file of non-redundant sequences to use for construction of profiles by PSI-BLAST.

NRDBTAG

A short name for the non-redundant sequence database. ModPipe uses it as part of the name of profile files (multiple sequence alignments) constructed using that database. Usually it will be uniprot90.

PRFUPDATE

If this flag is set to ON, a new profile is calculated for the target sequence regardless of whether one already exists. If it is set to OFF, a new target-sequence profile is calculated only if one does not already exist.
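The decision rule for PRFUPDATE can be summarised in a few lines of Python. This is a sketch of the logic as described above, not ModPipe's code: the function name needs_new_profile and the example file name are hypothetical, and the flag is assumed to be passed as the literal string ON or OFF.

```python
import os


def needs_new_profile(prfupdate: str, profile_path: str) -> bool:
    """Return True if a target-sequence profile should be calculated.

    Hypothetical helper: ON always recalculates; any other value
    (assumed to be OFF) recalculates only when no profile file exists.
    """
    if prfupdate == "ON":
        return True
    return not os.path.exists(profile_path)


# Illustrative file name only; real profile names depend on NRDBTAG.
print(needs_new_profile("ON", "target-uniprot90.prf"))  # prints True
```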

NUMMODELS

Number of models to calculate for each alignment.

CLUSTERALI

When multiple fold assignment methods are used, they run independently and so will typically find a number of templates in common. Setting this variable to ON removes redundant and highly similar hits from the alignments of all templates found.