Preparing input files

The sample input files in this tutorial can be found in the examples/tutorial-model directory of the MODELLER distribution.

There are three kinds of input files: Protein Data Bank atom files with coordinates for the template structures, the alignment file with the alignment of the template structures with the target sequence, and the MODELLER command or script file that tells MODELLER what to do.

Atom files

Each atom file is named code.atm, where code is a short protein code, preferably the PDB code; for example, Peptococcus aerogenes ferredoxin would be in the file 1fdx.atm. The same code must be used as that protein's identifier throughout the modeling. The template structures do not have to be superposed by the user before comparative modeling is done.
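Before starting a run, it can be useful to confirm that every template code resolves to an atom file. The following Python sketch is not part of MODELLER; the helper name find_atom_file is hypothetical, and it assumes only the code.atm naming convention described above and a colon-separated directory list such as the one given to ATOM_FILES_DIRECTORY in the script file. MODELLER's own file lookup may accept other names or extensions.

```python
import os


def find_atom_file(code, directories="./:../atom_files"):
    """Return the path of the first existing '<code>.atm' file, or None.

    'directories' is a colon-separated list of directories to search,
    in order. Hypothetical helper, not a MODELLER function.
    """
    for directory in directories.split(":"):
        # An empty entry is treated as the current directory.
        candidate = os.path.join(directory or ".", code + ".atm")
        if os.path.isfile(candidate):
            return candidate
    return None


def missing_atom_files(codes, directories="./:../atom_files"):
    """Return the codes for which no '<code>.atm' file was found."""
    return [c for c in codes if find_atom_file(c, directories) is None]
```

For the tutorial, calling missing_atom_files(['5fd1']) from the directory that holds the script would return an empty list if the template atom file can be found.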

Alignment file

One of the formats for the alignment file is related to the PIR database format; this is the preferred format for comparative modeling:

C; A sample alignment in the PIR format; used in tutorial
>P1;5fd1
structureX:5fd1:1    : :106  : :ferredoxin:Azotobacter vinelandii: 1.90: 0.19
AFVVTDNCIKCKYTDCVEVCPVDCFYEGPNFLVIHPDECIDCALCEPECPAQAIFSEDEVPEDMQEFIQLNAELA
EVWPNITEKKDPLPDAEDWDGVKGKLQHLER*
>P1;1fdx
sequence:1fdx:1    : :54   : :ferredoxin:Peptococcus aerogenes: 2.00:-1.00
AYVINDSC--IACGACKPECPVNIIQGS--IYAIDADSCIDCGSCASVCPVGAPNPED-----------------
-------------------------------*

See Section 2.4.1 for a detailed description of the alignment file format and Section 1.7.3 for the meaning of the alignment in MODELLER. The influence of the alignment on the quality of the model cannot be overemphasized. The CHECK_ALIGNMENT command can be used to find some trivial alignment mistakes.
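Some formatting slips in an alignment file can be caught before MODELLER is run at all. The Python sketch below is an independent sanity check, not a reimplementation of CHECK_ALIGNMENT. It assumes the layout shown in the sample above: optional "C;" comment lines, a ">P1;code" line, one colon-separated header line, and then sequence lines ending in "*". The function names parse_pir, check_alignment and residue_count are hypothetical.

```python
def parse_pir(text):
    """Split PIR-like alignment text into (code, header_fields, sequence)."""
    entries = []
    code, header, chunks = None, None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("C;"):
            continue  # skip blank lines and comment lines
        if line.startswith(">P1;"):
            if code is not None:
                entries.append((code, header, "".join(chunks)))
            code, header, chunks = line[4:].strip(), None, []
        elif code is not None and header is None:
            # First line after '>P1;code' is the colon-separated header.
            header = [field.strip() for field in line.split(":")]
        elif code is not None:
            chunks.append(line)
    if code is not None:
        entries.append((code, header, "".join(chunks)))
    return entries


def residue_count(sequence):
    """Count residues, ignoring gap characters and the '*' terminator."""
    return sum(1 for ch in sequence if ch not in "-*")


def check_alignment(entries):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    lengths = {}
    for code, _header, sequence in entries:
        if not sequence.endswith("*"):
            problems.append(code + ": sequence is not terminated by '*'")
        lengths[code] = len(sequence.rstrip("*"))
    if len(set(lengths.values())) > 1:
        problems.append("aligned lengths differ: " + repr(lengths))
    return problems
```

Reading the alignment file into a string and passing parse_pir(text) to check_alignment gives a quick report; residue_count can then be compared by eye with the residue range in each header line. Passing this check says nothing about whether the alignment is biologically sensible.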

Script file

The script file contains commands for MODELLER, written in the TOP language (Chapter 4). The sample script file model-default.top, which produces one model of the sequence 1fdx from the known structure of 5fd1 and the alignment between the two sequences, is:

# Homology modelling by the MODELLER TOP routine 'model'.

INCLUDE                             # Include the predefined TOP routines

SET OUTPUT_CONTROL = 1 1 1 1 1      # produces a large log file (comment out to reduce output)
SET ALNFILE  = 'alignment.ali'      # alignment filename
SET KNOWNS   = '5fd1'               # codes of the templates
SET SEQUENCE = '1fdx'               # code of the target
SET ATOM_FILES_DIRECTORY = './:../atom_files' # directories for input atom files
SET STARTING_MODEL= 1               # index of the first model 
SET ENDING_MODEL  = 1               # index of the last model
                                    # (determines how many models to calculate)

CALL ROUTINE = 'model'              # do homology modelling

See Section 3.2 for information on the model script and its arguments.
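When the same modeling job is repeated for several targets, the script file can be generated rather than edited by hand. The Python sketch below is an illustration only; the function build_top_script is hypothetical. It emits just the TOP keywords that appear in the sample script above, and it assumes that several template codes are written as space-separated quoted strings, in the same way that OUTPUT_CONTROL takes space-separated values.

```python
def build_top_script(alnfile, knowns, sequence, n_models=1,
                     atom_dirs="./:../atom_files"):
    """Return the text of a TOP script modeled on model-default.top.

    knowns   -- list of template codes (assumed space-separated in TOP)
    sequence -- code of the target sequence
    n_models -- number of models; sets ENDING_MODEL with STARTING_MODEL = 1
    """
    if n_models < 1:
        raise ValueError("n_models must be at least 1")
    if not knowns:
        raise ValueError("at least one template code is required")
    quoted_knowns = " ".join("'" + code + "'" for code in knowns)
    lines = [
        "INCLUDE",
        "SET ALNFILE  = '" + alnfile + "'",
        "SET KNOWNS   = " + quoted_knowns,
        "SET SEQUENCE = '" + sequence + "'",
        "SET ATOM_FILES_DIRECTORY = '" + atom_dirs + "'",
        "SET STARTING_MODEL= 1",
        "SET ENDING_MODEL  = " + str(n_models),
        "CALL ROUTINE = 'model'",
    ]
    return "\n".join(lines) + "\n"
```

For the tutorial case, build_top_script('alignment.ali', ['5fd1'], '1fdx') yields the same settings as the sample script, without the comments and the OUTPUT_CONTROL line; the result can be written to a .top file of any name.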

Ben Webb 2004-04-20