Supplementary tables for the paper

"Alignment of protein sequences by their profiles."


Marc A. Marti-Renom, M.S. Madhusudhan, and Andrej Sali

Testing and training reference sets. The reference alignments are pairwise, structure-based alignments extracted from DBAli, our comprehensive database of pairwise structure-based alignments. The alignments in DBAli were calculated with the program CE by superposing all pairs of proteins of known structure in the Protein Data Bank (Feb. 1999) that are classified into the same H class in the CATH database. There are 33,920 such alignments with a Z-score higher than 3.8. They cover the entire spectrum of sequence and structure similarities.
First, 387 alignments were extracted from DBAli by requiring at most 40% sequence identity, at least 100 aligned residues, at least 50% of the residues aligned, and at least 90% of one chain spanned by the first and last aligned residues. Second, structure pairs that did not have at least 50% of the residues in the shorter chain aligned by MAMMOTH were eliminated, resulting in the final set of 300 reference alignments. These 300 alignments were randomly divided into a training set of 100 alignments and a testing set of 200 alignments. The training set was used to optimize the gap initiation and gap extension penalties for all of our alignment protocols, as well as the parameter s for the two posterior substitution matrix protocols; the testing set was used to assess the performance of all examined alignment methods. The PDB chain identifiers, chain lengths, percentage sequence identities, RMSDs for the aligned Cα atoms, average percentages of the aligned Cα atoms, and percentages of structurally equivalent residues are listed separately for the training and testing alignments.
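As an illustration only, the two-stage selection and the random split described above could be sketched in Python as follows. The record fields, the function names, and the fixed random seed are assumptions made for this sketch; they do not come from DBAli or from the authors' own scripts, and the MAMMOTH coverage is taken as a precomputed number rather than computed here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Hypothetical record for one pairwise structure-based alignment."""
    chain_a: str
    chain_b: str
    seq_identity: float      # percentage sequence identity
    aligned_residues: int    # number of aligned residues
    percent_aligned: float   # percentage of residues aligned
    span_coverage: float     # percentage of one chain spanned by first/last aligned residue
    mammoth_coverage: float  # percentage of the shorter chain aligned by MAMMOTH


def passes_filters(a: Alignment) -> bool:
    """Apply the selection criteria stated in the text."""
    first_stage = (
        a.seq_identity <= 40.0
        and a.aligned_residues >= 100
        and a.percent_aligned >= 50.0
        and a.span_coverage >= 90.0
    )
    second_stage = a.mammoth_coverage >= 50.0
    return first_stage and second_stage


def split_reference_set(alignments, n_train, seed=0):
    """Filter the alignments, shuffle them, and return (training, testing)."""
    kept = [a for a in alignments if passes_filters(a)]
    rng = random.Random(seed)  # assumed seed, only to make the sketch reproducible
    rng.shuffle(kept)
    return kept[:n_train], kept[n_train:]
```

With 300 alignments surviving the filters and n_train=100, this sketch would return sets of 100 and 200 alignments, matching the sizes reported above.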
Testing set of pairs of structures
Chain A  Chain B  % Seq. Id.  RMSD (Å)
1qfeA    1rlcL          4.38      3.79
3adk     1nstA          4.74      3.34
1barA    1xyfA          4.84      1.84
1f6yA    1reqA          6.20      3.81
1dioA    1qfeA          6.25      2.95
1ez3A    1fewA          6.61      2.39
1dioA    1d9eA          6.67      3.30
1qfeA    1dxeA          6.76      3.15
1qfeA    1ald           6.91      3.84
1nsj     1reqA          7.32      3.16
1qfeA    1nsj           7.35      2.91
1dxeA    1f61A          7.59      3.17
2asr     1occC          7.81      2.73
1fioA    1dn1B          7.82      3.23
1cb8A    1qazA          7.97      4.01
1qfeA    1f6yA          8.02      2.76
1nal1    1qtwA          8.07      3.72
1rpxA    1dioA          8.17      2.83
1dtyA    1b9hA          8.17      3.38
1czfA    1tsp           8.43      2.98
1bs2A    1a8h           8.50      2.94
1dorB    1nal1          8.52      3.07
1pii     1dxeA          8.70      2.84
1aw1B    1dosA          8.73      3.86
1fq0A    1qrqA          8.78      3.22
1nsj     2reqB          8.78      3.11
5rubB    1thfD          8.98      2.97
1cqqA    1cu1A          9.09      2.82
1dxeA    1rpxA          9.13      2.97
1wkd     1qfeA          9.21      3.52
2mnr     1dorB          9.35      4.00
1ak1     1qgoA          9.38      3.13
1de5A    1a0cA          9.47      3.22
1fq0A    1pscA          9.50      3.74
1dae     1nipA          9.59      3.01
1a80     1fq0A          9.76      3.24
1nal1    1fq0A          9.80      2.61
1qhtA    1noyA         10.00      3.11
1nal1    1ezwA         10.09      3.17
1dxeA    1fq0A         10.11      3.12
2ercA    1eizA         10.12      3.13
1nal1    1aw1B         10.60      3.20
1mpf     1a0tP         10.73      3.29
2thiA    1anf          10.86      3.63
1aw1B    1f6yA         11.06      3.44
1cu1A    7lprA         11.11      2.52
1fohB    1qq2A         11.11      3.12
1dfoA    1b9hA         11.20      3.12
1vjs     1bvzA         11.23      3.39
1fq0A    2tpsA         11.28      2.62
1igs     1ad1B         11.28      2.69
1udrB    1qo0D         11.30      2.39
1b9hA    1qgnA         11.36      2.92
4kbpB    1qhwA         11.37      2.89
2tpsA    1nal1         11.37      2.67
1nksF    1nipA         11.40      3.10
1ad1B    1qr7B         11.52      3.08
1noyA    1tgoA         11.54      3.27
1qgxA    1fpkA         11.58      3.01
1cj0A    1b9hA         11.89      2.93
1fts     1dae          12.09      2.74
1eokA    1d2kA         12.10      2.59
1dfoA    1bs0A         12.21      2.88
1dfoA    1elqA         12.29      2.81
1qgnA    1dfoA         12.32      2.84
1dqyA    1broA         12.36      2.98
1fofA    1skf          12.45      3.35
1d2fB    1cj0A         12.50      3.84
1nzyA    1tyfB         12.50      2.48
1qlwA    1auoA         12.50      2.89
1aw1B    1fq0A         12.56      3.04
1edt     1ctn          12.60      3.05
1chmA    1xgmA         12.61      2.29
1b5l     1evsA         12.75      2.52
1c3qA    1bx4A         12.76      3.35
1aw1B    1ald          12.86      3.10
1c0aA    12asA         12.97      2.57
1ad1B    1dioA         13.22      3.59
1b9hA    1bs0A         13.29      2.75
1b3uA    1ibrB         13.40      4.00
1jud     1fezA         13.43      3.36
1a9nA    1d0bA         13.55      3.49
1d9eA    1fq0A         13.57      3.07
1igs     1fq0A         13.72      2.65
1amoA    4nll          13.77      2.85
1plq     1dmlA         13.83      3.65
1ad1B    2tpsA         13.88      2.55
1qj2C    1fiqB         13.88      2.49
1gal     3cox          14.03      3.20
1cj0A    1qgnA         14.04      2.96
2hrvA    1cu1A         14.29      2.49
1b54     1qu4D         14.35      3.08
1ahuA    1f0xA         14.37      3.82
1eut     1czvA         14.48      2.38
1dubA    1tyfB         14.54      3.18
1rkd     1c3qA         14.63      3.03
1ez0C    1ad3A         14.72      2.71
1vpt     1eizA         14.86      2.74
2tmdA    1f6mA         15.03      3.42
1gtxA    2gsaA         15.29      2.17
1cqqA    2hrvA         15.38      2.47
1d9eA    1de5A         15.38      3.90
1ad1B    1qfeA         15.48      2.84
1qu4D    1sftB         15.49      3.81
1aj6     1b62A         15.62      2.64
1qrrA    1a4uA         15.88      2.54
1itg     1bco          15.94      2.56
3adk     1gky          16.13      2.94
1havA    2hrvA         16.15      2.69
1qu0A    1c4rE         16.18      1.90
3minA    1mioB         16.27      2.80
1bci     2isdA         16.39      2.44
1taq     1a76          16.41      3.62
1ldcA    1dorB         16.42      3.07
1eizA    2admA         16.48      3.20
2pueA    1bykA         16.60      1.76
1chmA    1mat          16.88      1.97
1ciu     1vjs          17.09      3.06
1dcnB    1c3cA         17.21      2.63
1rpt     1ihp          17.59      3.15
12asA    1b8aA         17.67      2.63
1smpA    1kuh          17.69      2.62
1xgmA    1a16          17.70      1.85
1d9eA    1nsj          17.73      3.15
1dm0L    1tcs          17.83      2.65
1qr7B    1d9eA         17.93      2.35
1gnwA    1eemA         18.04      1.96
1dpe     1rkm          18.09      3.51
1ar1B    1fftB         18.61      1.82
1ttpB    1oasA         18.67      2.29
1nal1    1thfD         18.72      2.64
1ag8A    1ez0C         18.74      2.14
1qgiA    1chkA         18.94      2.53
1cqxA    1qfjA         19.46      3.26
1dil     1eut          19.60      2.27
1ad1B    1d9eA         19.67      3.06
1d0bA    1yrgA         19.79      2.82
1c3cA    1fuoA         19.85      2.88
1dfjI    1yrgA         19.88      2.46
1tdj     1oasA         19.93      1.89
1eemA    1gsdB         20.00      2.95
1havA    1cqqA         20.11      2.27
1uok     1ciu          20.12      2.59
1lrv     1b3uA         20.46      3.40
1dgd     1gtxA         20.47      2.05
1bw9A    1ch6A         20.48      2.36
1occC    1fftC         20.65      1.65
1xel     1db3A         20.73      2.31
1bqg     1fhuA         20.85      2.75
1mat     1bn5          20.97      1.92
1ahn     1amoA         21.05      2.44
1fbl     1kuh          21.54      2.66
1dfjI    1d0bA         21.88      2.59
1daaA    1et0A         22.05      1.90
1imaA    1qgxA         22.05      2.00
1ciu     2aaa          22.51      2.18
1db3A    1bxkA         22.84      2.12
1whsB    1ivyA         22.92      1.00
1ovaA    1sek          23.58      1.82
1eur     1sll          23.58      1.99
7ahlA    1pvl          23.90      1.75
1gpmA    1qdlB         24.10      1.94
1whsB    1cpy          24.31      2.04
1ipsB    1dcs          24.91      2.60
1b5fA    1fknA         25.00      1.92
1dar     1dpfA         25.00      2.29
2reb     1cr2A         25.00      2.75
1tlfC    2pueA         25.37      1.72
1dxy     1psdA         25.63      2.29
1psdA    1gdhA         25.63      1.99
4tf4B    1nbcA         25.87      2.10
1lyaD    1fknA         26.25      2.18
1gdoB    1ct9A         26.84      2.00
1fiqC    1alo          27.42      1.87
1ac5     1cpy          27.83      2.40
2shpA    1yptA         27.91      2.53
1iov     1ehiA         28.71      1.86
4mhtA    1dctA         28.94      1.95
1reqA    1cb7A         29.13      1.62
1cpy     1ivyA         29.64      1.86
2plc     2isdA         29.77      4.05
1d2kA    1ctn          29.95      1.50
1xgmA    1mat          30.29      1.53
1ag8A    1euhA         30.56      1.51
1whsA    1cpy          30.68      2.37
1ahsB    1bvp4         30.95      1.19
1au1B    1b5l          31.13      2.19
1agrE    1emuA         32.00      1.54
1atiA    1qf6A         32.14      2.16
1ahuA    1diiA         32.68      1.33
1hrdA    1gtmA         33.49      1.89
1ciy     1dlc          33.62      1.92
1kobB    1a06          34.50      2.05
1froA    1fa6B         36.00      1.78
1etu     1dar          37.21      2.33
1tcs     1apgA         37.70      1.35
1larA    1pty          38.69      1.68
1alo     1qj2A         38.85      1.13
1alo     1fiqA         38.93      1.17
1eepA    1dorB         39.29      2.74
Training set of pairs of structures
Chain A  Chain B  % Seq. Id.  RMSD (Å)
2mnr     1wkd           5.80      4.02
1a80     1bf6A          5.86      4.05
12asA    1qf6A          7.31      2.89
1evkA    12asA          7.34      3.09
5rubB    1qfeA          7.60      3.58
1wkd     1dxeA          7.79      3.11
1cb7B    1qfeA          8.06      3.00
1dioA    1f6yA          8.33      3.46
1plq     1b77A          8.48      3.59
1fq0A    1f6yA          8.54      3.11
1cm5A    1b8bA          8.55      3.03
1bmfD    1cbuC          8.72      2.88
1wba     1af9           9.41      2.49
1tyfB    1ef9A          9.58      2.49
1han     1cjxA          9.74      3.30
1dob     1f8rA          9.87      3.93
1b9hA    1d2fB          9.97      3.55
1cb7B    1d9eA         10.00      3.29
2hrvA    7lprA         10.45      2.57
1exfA    1cqqA         10.46      2.65
1zen     1ttpA         10.59      2.91
1dfoA    1b8gA         10.74      3.80
1noyA    1qqcA         11.02      3.28
1cb7B    1f6yA         11.16      3.00
1cqqA    1jxpA         11.19      2.75
1czvA    1gog          11.19      2.47
1c3d     1dceB         11.23      3.68
1aw1B    1nsj          11.28      2.96
5xinB    1de5A         11.33      3.17
1b78A    1ex2A         11.54      2.93
1nal1    1qfeA         12.22      2.99
1qprA    1b54          12.28      3.19
1qr7B    1f6yA         12.35      3.14
1pda     1gr2A         12.44      3.59
1d9vA    2thiA         12.70      2.98
1d9eA    1f6yA         13.28      3.40
1elqA    1cj0A         13.33      3.09
1qfeA    1d9eA         13.49      2.91
2pth     1cfzA         13.51      2.90
1jxpA    2hrvA         13.53      2.47
1dae     1ffh          13.79      2.42
1cj0A    1bs0A         13.95      2.89
1ebmA    1mpgA         14.91      3.19
1qqtA    1ile          15.01      3.10
1mat     1a16          15.02      3.31
1bg9     1bf2          15.14      2.84
1bvzA    1bg9          15.14      2.75
1qrrA    1db3A         15.20      2.43
2hvm     1e15A         15.38      3.13
1ad1B    1cb7B         15.77      3.07
1bg9     1ciu          16.42      2.74
1aw1B    1d9eA         16.46      3.32
1ad1B    1fq0A         16.58      2.58
1d9eA    1d3gA         16.60      3.78
1ez3A    1fioA         16.67      1.45
1dorB    1d3gA         16.72      3.48
1cy9A    1d6mA         16.88      3.58
1bavC    1qmuA         17.57      1.83
1ciu     1bplB         17.61      2.67
1b37A    1f8rA         17.66      2.95
1bmtA    1cb7A         18.11      1.81
1d0bA    1ds9A         18.18      3.68
1bw9A    1aup          18.21      2.47
1bvzA    2aaa          18.44      2.41
1tcs     1qi7A         18.93      2.23
1dtyA    1gtxA         18.96      2.63
1ck7A    1hxn          19.59      2.48
1ac5     1whsB         19.86      2.00
1jud     1cr6A         19.88      3.32
2pgi     1dqrA         19.91      2.03
1sll     1dil          20.86      2.76
1rkd     1bx4A         21.21      2.26
1dcqA    1myo          21.24      2.34
1a8h     1qqtA         21.64      2.24
1rnl     1srrC         21.67      2.14
3pte     1ei5A         22.12      2.65
1eepA    1dosA         22.67      3.05
1ekjA    1ddzA         22.68      2.38
1c0mA    1itg          22.73      2.07
1uok     1bvzA         23.96      2.21
1udrB    1rnl          24.14      2.12
2isdA    1ptd          24.45      3.83
1kbcA    1smpA         24.54      2.54
1sebH    1eu4A         25.67      1.94
1aup     1ch6A         25.73      1.87
1fknA    1psaB         26.40      2.20
1gph4    1gdoB         26.55      2.43
1ihp     1qfxA         27.39      2.34
1dubA    1nzyA         27.78      2.19
1gdoB    1ecfB         28.00      1.81
1ft1B    1dceB         30.10      1.92
1c8oA    3caaA         30.67      1.88
1atiA    1evkA         31.63      2.20
1a06     1tkiA         32.42      1.68
2pgd     1pgjA         32.62      1.54
1bx4A    1dgyA         34.34      1.35
1apyB    9gaaA         34.75      1.11
1dvpA    1elkA         35.04      1.79
1whsA    1ivyA         35.57      1.70
2fcbA    1f2qA         39.63      2.05