#################################################################
PLEASE READ THE WHOLE README FILE BEFORE USING THIS SET OF MODELS
#################################################################

NOTE: this model set was generated by Roberto Sánchez at The Rockefeller
University, New York, USA, in 1997. Some parsing, rearrangement, sorting,
etc. have been carried out by Francisco Melo since then.


GOOD and BAD comparative protein structure models.

In this context, GOOD models are meant to represent models based on
correct templates (correct fold) and approximately correct alignments.
BAD models are meant to represent models based on incorrect templates
(wrong fold) or very bad alignments. More quantitatively, GOOD models must
have at least 30% of their CA atoms within 3.5 Angstrom of the
corresponding CA atoms in the experimental structure (such atoms are
called "equivalent" below). BAD models should have no more than 15%
equivalent CA atoms.
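
The two thresholds above can be written as a small Python check. This is
only a sketch: the function name is hypothetical, and it assumes the
percentage is given on a 0-100 scale (as column 5 of the *.rms files
appears to be). Models between the two thresholds belong to neither set.

```python
def classify_model(percent_equivalent_ca):
    """Classify a model from its percentage of equivalent CA atoms.

    Assumes a 0-100 scale. Returns "GOOD", "BAD", or None for models
    that fall between the two thresholds.
    """
    if percent_equivalent_ca >= 30.0:   # "at least 30%"
        return "GOOD"
    if percent_equivalent_ca <= 15.0:   # "no more than 15%"
        return "BAD"
    return None
```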

--

Three file types can be found:

- The *.list files contain the names of the models.
- The *.dat files contain important model information.
- The *.rms files contain structural information (comparison between target
  and model).

To simplify the calculations and minimize possible errors when working with
these sets, the 'good.list' and 'good.dat' files are sorted by model name
(i.e. each line in one file matches the same line in the other file). The
same holds for the 'bad.list' and 'bad.dat' files. The .rms files, however,
are sorted by model number, so their line positions do not match those of
the .list and .dat files.
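
Because a .list file and its .dat file are line-aligned, they can be read
in parallel. The following Python sketch assumes one model name per line in
the .list file and whitespace-separated columns in the .dat file; the
function name is hypothetical. It refuses to pair files of different
lengths rather than silently truncating.

```python
def pair_list_with_dat(list_lines, dat_lines):
    """Pair each model name with the .dat row on the same line.

    Both arguments are iterables of text lines, e.g. open file objects.
    Blank lines are skipped (an assumption, in case of trailing newlines).
    """
    names = [line.strip() for line in list_lines if line.strip()]
    rows = [line.split() for line in dat_lines if line.strip()]
    if len(names) != len(rows):
        raise ValueError(
            "line counts differ: %d names vs %d rows" % (len(names), len(rows))
        )
    return list(zip(names, rows))
```

With the files on disk this could be called as, for example,
pair_list_with_dat(open("bad.list"), open("bad.dat")).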

--

The model names can be mapped to the data in the *.dat files
in the following way:

1bw3_2.B99990001 is the second model for 1bw3

In bad.dat, the corresponding entry is found by matching column 3 (target
PDB code, here 1bw3) and column 2 (model number, here 2):

170 2 1bw3 125 41 106 66 2cba 120 180 28 18.9 0.7960940 -0.377119380303828 0.11

The first two columns in bad.dat or good.dat correspond to the first two
columns in the *.rms files.
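
The name-to-entry mapping can be sketched in Python as follows. The
function names are hypothetical, and the name layout
<target>_<model number>.<suffix> is inferred from the single example above,
so it is an assumption rather than a documented format.

```python
def parse_model_name(name):
    """Split a model name such as '1bw3_2.B99990001' into (target, number).

    Assumes the layout <target>_<model number>.<suffix>.
    """
    stem = name.split(".", 1)[0]             # '1bw3_2'
    target, _, number = stem.rpartition("_")  # split at the last underscore
    return target, int(number)


def find_dat_row(dat_lines, name):
    """Return the .dat columns for a model name, or None if not found."""
    target, number = parse_model_name(name)
    for line in dat_lines:
        fields = line.split()
        # Column 3 is the target PDB code, column 2 the model number.
        if len(fields) >= 3 and fields[2] == target and int(fields[1]) == number:
            return fields
    return None
```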

--

In good.dat and bad.dat the meaning of the columns is the following:

column 1 : target number
column 2 : model number
column 3 : target PDB code (corresponds to one PDB chain)
column 4 : target length
column 5 : starting residue of modeled segment
column 6 : ending residue of modeled segment
column 7 : size of model
column 8 : PDB template
column 9 : starting residue of template region used to build the model
column 10 : ending residue of template region used to build the model
column 11 : target - template sequence identity
column 12 : target - template alignment significance (in nats)
column 13 : ignore this column
column 14 : normalized ProsaII Z-Score
column 15 : pG ( see pG server at http://guitar.rockefeller.edu/pg/ )
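
As an illustration, a .dat line can be parsed into named fields with
Python. The field names below are hypothetical labels for the fifteen
columns listed above. The sketch assumes whitespace-separated columns and
integer residue numbers; column 13 is kept as raw text since it is to be
ignored.

```python
from collections import namedtuple

DatRecord = namedtuple("DatRecord", [
    "target_number", "model_number", "target_code", "target_length",
    "model_start", "model_end", "model_size", "template_code",
    "template_start", "template_end", "sequence_identity",
    "alignment_significance", "ignored", "prosa_zscore", "pg",
])


def parse_dat_line(line):
    """Parse one line of good.dat or bad.dat into a DatRecord."""
    f = line.split()
    if len(f) != 15:
        raise ValueError("expected 15 columns, found %d" % len(f))
    return DatRecord(
        int(f[0]), int(f[1]), f[2], int(f[3]),
        int(f[4]), int(f[5]), int(f[6]), f[7],
        int(f[8]), int(f[9]), float(f[10]),
        float(f[11]), f[12], float(f[13]), float(f[14]),
    )
```

For the bad.dat example entry for 1bw3, model_size is 66, which agrees with
model_end - model_start + 1 = 106 - 41 + 1.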

--

The *.rms files contain data on the comparison of each model with its
corresponding experimental structure. The meaning of the columns is the
following:

column 1 : target number
column 2 : model number
column 3 : RMSD cutoff (always 3.5 Angstrom)
column 4 : RMSD (CA only)
column 5 : % of equivalent CA atoms
column 6 : RMSD (All heavy atoms)
column 7 : % of equivalent heavy atoms
column 8 : distance RMS (CA only)
column 9 : % of equivalent CA - CA distances
column 10 : distance RMS (All heavy atoms)
column 11 : % of equivalent distances
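
A line of a *.rms file can be parsed in the same spirit. Again, the field
names are hypothetical labels for the eleven columns above, and
whitespace-separated numeric columns are assumed.

```python
from collections import namedtuple

RmsRecord = namedtuple("RmsRecord", [
    "target_number", "model_number", "cutoff",
    "rmsd_ca", "pct_equiv_ca",
    "rmsd_heavy", "pct_equiv_heavy",
    "drms_ca", "pct_equiv_ca_dist",
    "drms_heavy", "pct_equiv_dist",
])


def parse_rms_line(line):
    """Parse one line of a *.rms file into an RmsRecord."""
    f = line.split()
    if len(f) != 11:
        raise ValueError("expected 11 columns, found %d" % len(f))
    # Columns 1-2 are integer identifiers; columns 3-11 are real numbers.
    return RmsRecord(int(f[0]), int(f[1]), *[float(x) for x in f[2:]])
```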

--

The two *.rms files contain data for more models than the set of good and bad
models (this is because many of the models were eliminated based on this data).
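
Since the *.rms files are not line-aligned with the *.dat files and contain
extra models, the two must be joined on the key (target number, model
number), i.e. columns 1 and 2 of both file types. A minimal Python sketch,
with hypothetical function names and assuming whitespace-separated columns:

```python
def index_rms(rms_lines):
    """Map (target number, model number) to the columns of each .rms line."""
    index = {}
    for line in rms_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or truncated lines
        index[(int(fields[0]), int(fields[1]))] = fields
    return index


def rms_for_dat(dat_lines, rms_index):
    """Yield (key, rms columns or None) for every entry of a .dat file."""
    for line in dat_lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        key = (int(fields[0]), int(fields[1]))
        yield key, rms_index.get(key)
```

Entries of the .rms file whose key never appears in good.dat or bad.dat
are simply left unused.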

--

WARNING: It is possible that a good and a bad model share the same name. Thus,
good and bad models must be written into different directories.
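
One way to keep the two sets apart is to derive every output path from the
set the model belongs to. The sketch below is only an illustration: the
function name and the directory names "good" and "bad" are assumptions.

```python
from pathlib import Path


def model_path(root, quality, model_name):
    """Return the path for a model file inside its own set's directory."""
    if quality not in ("good", "bad"):
        raise ValueError("quality must be 'good' or 'bad'")
    # A good and a bad model with the same name get different paths.
    return Path(root) / quality / model_name
```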

--