$$
\mathrm{GA341} = 1 - \left[\cos(\text{sequence\_identity})\right]^{(\text{compactness} + \text{sequence\_identity})/\exp(\text{z-score})}
$$
Sequence identity is the fraction of positions with identical residues in the target-
template alignment. Structural compactness is the ratio between the sum of the
standard volumes of the amino acid residues in the protein and the volume of the
sphere with the diameter equal to the largest dimension of the model. The Z-score
is calculated for the combined statistical potential energy of a model, using the
mean and standard deviation of the energies of 200 random sequences with the same
composition and structure as the model (Melo et al., 2002). The combined statistical
potential energy of a model is the sum of the solvent accessibility terms for all Cβ atoms
and the distance-dependent terms for all pairs of Cα and Cβ atoms. The solvent
accessibility term for a Cβ atom depends on its residue type and the number of other Cβ
atoms within 10 Å; the distance-dependent (non-bonded) terms depend on the atom and residue types
spanning the distance, the distance itself, and the number of residues separating
the distance-spanning atoms in sequence. These potential terms reflect the statistical
preferences observed in 760 non-redundant proteins of known structure. The
GA341 scoring function was evolved by a genetic algorithm that explored many
combinations of mathematical functions and model features to optimize the
discrimination between good and bad models in a training set of models.
The GA341 score ranges from 0 for models that tend to have an incorrect fold to 1
for models that tend to be comparable to at least low-resolution X-ray structures.
GA341 scores greater than 0.7 indicate a correct fold with more than 35% of the
backbone atoms superposable to better than 3.5 Å.
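
The three ingredients of the score can be illustrated with a minimal Python sketch. This is not the MODELLER implementation; the function names (`z_score`, `compactness`, `ga341`) are hypothetical. The sketch assumes that sequence identity is given as a fraction between 0 and 1 and passed to the cosine in radians, that the exponent is (compactness + sequence identity) / exp(z-score), and that the sample standard deviation is used for the Z-score. The residue volumes and the reference energies of the random sequences are taken as inputs, since computing them requires the statistical potential itself.

```python
import math
from statistics import mean, stdev


def z_score(model_energy, random_energies):
    """Z-score of the model energy against energies of random sequences.

    Assumption: sample standard deviation; the text does not say which
    estimator is used.
    """
    sigma = stdev(random_energies)
    if sigma == 0:
        raise ValueError("reference energies have zero spread")
    return (model_energy - mean(random_energies)) / sigma


def compactness(residue_volumes, largest_dimension):
    """Sum of residue volumes divided by the volume of a sphere whose
    diameter equals the largest dimension of the model."""
    if largest_dimension <= 0:
        raise ValueError("largest_dimension must be positive")
    sphere_volume = (4.0 / 3.0) * math.pi * (largest_dimension / 2.0) ** 3
    return sum(residue_volumes) / sphere_volume


def ga341(sequence_identity, compactness_value, z):
    """GA341 = 1 - cos(identity) ** ((compactness + identity) / exp(z)).

    Assumption: sequence_identity is a fraction in [0, 1], in radians
    when passed to cos.
    """
    if not 0.0 <= sequence_identity <= 1.0:
        raise ValueError("sequence_identity must be a fraction in [0, 1]")
    exponent = (compactness_value + sequence_identity) / math.exp(z)
    return 1.0 - math.cos(sequence_identity) ** exponent
```

Because the base of the power, cos(sequence_identity), is at most 1, a more negative Z-score enlarges the exponent and pushes the score toward 1, whereas a sequence identity of 0 gives a score of exactly 0 regardless of the other two terms.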
Eswar.