ModBase database of comparative protein structure models
Roberto Sanchez and Andrej Sali
ModBase is a queryable database of annotated comparative protein structure models. Each model consists of coordinates for all non-hydrogen atoms in the modeled part of a protein. The models are derived by an automated modeling pipeline that relies mainly on the program MODELLER. The database currently contains 3D models for substantial segments of 15-23% of the proteins in the genomes of M. genitalium, M. jannaschii, E. coli, S. cerevisiae, and C. elegans; in total, there are models for 3,732 proteins. The database also includes the fold assignments and alignments on which the models were based. In addition, special care is taken to assess both the overall quality of each model and its accuracy at the residue level. In the future, ModBase will grow to reflect (i) the growth of the sequence databases, (ii) the growth of the database of known protein structures, and (iii) improvements in the software for calculating the models. The Swiss-Prot+TrEMBL protein sequence database is expected to be processed by the end of 1999. ModBase is introduced in R. Sanchez & A. Sali, Proc. Natl. Acad. Sci. USA 95, 13597-13602, 1998.