Hi Min-yi Shen, and thanks for sharing your experience.
As a matter of fact, I was considering a scheme like the one you describe (RMSD + linkage). I didn't realize there could be a symmetry problem with the RMSD distance, but I guess guiding this computation with the multiple alignment should alleviate the issue.
--Ben
On 24 Oct 2011, at 19:38, Min-yi Shen wrote:
> My own workaround long time ago (2005) was
> (1) generate N models
> (2) filter out some bad models with objective values
> (3) compute all-against-all RMSDs as a N-by-N matrix (just write a
> double loop in py)
> (4) Do hierarchical agglomerative clustering in R.
>
> The problem with (4) is that 1. the distance measure from step 3 may
> or may not obey the triangular inequality, and 2. it varies with
> cluster linkage methods.
>
> so, I finally opted to use some graph-based clustering methods like
> MCL http://micans.org/mcl.
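
For reference, here is a rough sketch of what steps (3) and (4) might look like in plain Python. It is not the script from 2005, only an illustration under some assumptions: the models are already superposed in a common frame and restricted to the same alignment-equivalent atoms, and each model is just a list of (x, y, z) tuples. The function names (rmsd_matrix, triangle_violations, agglomerate), the toy coordinates, and the cutoff are all made up for the example. The triangle check is there to diagnose matrices that come from some other RMSD routine.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """RMSD between two equal-length coordinate lists (no superposition done here)."""
    if len(a) != len(b):
        raise ValueError("models must have the same number of atoms")
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def rmsd_matrix(models):
    """Step (3): all-against-all RMSDs as an N-by-N matrix, via a double loop."""
    n = len(models)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Fill both halves so the matrix is symmetric by construction.
            d[i][j] = d[j][i] = rmsd(models[i], models[j])
    return d


def triangle_violations(d, tol=1e-9):
    """List triples (i, j, k) where d[i][j] > d[i][k] + d[k][j]."""
    n = len(d)
    return [(i, j, k)
            for i in range(n) for j in range(n) for k in range(n)
            if len({i, j, k}) == 3 and d[i][j] > d[i][k] + d[k][j] + tol]


def agglomerate(d, cutoff, linkage=max):
    """Step (4): naive agglomerative clustering, stopped at a distance cutoff.

    linkage=max gives complete linkage, linkage=min gives single linkage.
    """
    clusters = [[i] for i in range(len(d))]

    def dist(c1, c2):
        return linkage(d[i][j] for i in c1 for j in c2)

    while len(clusters) > 1:
        # Find the closest pair of clusters under the chosen linkage.
        x, y = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        if dist(clusters[x], clusters[y]) > cutoff:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # x < y, so index x is unaffected
    return clusters


# Toy data: four two-atom "models" forming two obvious groups.
models = [
    [(0, 0, 0), (1.0, 0, 0)],
    [(0, 0, 0), (1.1, 0, 0)],
    [(5, 0, 0), (6.0, 0, 0)],
    [(5, 0, 0), (6.2, 0, 0)],
]
d = rmsd_matrix(models)
print(triangle_violations(d))
print(agglomerate(d, cutoff=1.0, linkage=max))
print(agglomerate(d, cutoff=1.0, linkage=min))
```

Running the clustering with both linkage=max and linkage=min on the same matrix is a cheap way to see how much the result depends on the linkage method, which is the second problem mentioned above.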