good points. Here is the proposed interface.
class Clusterer{
// use explicit vector embedding to initialize the data Clusterer(Embedding data, params);
// use implicit conversion of data from e.g. FloatsList to data Clusterer(EmbeddingAdaptor data, params);
// run the clustering on the stored data virtual PartitioningClusteringResults do_clustering() = 0; };
class XXXXIncrementalClusterer : Clusterer{ // use explicit vector embedding to initialize the data XXXXIncrementalClusterer(Embedding data, params);
// use implicit conversion of data from e.g. FloatsList to data XXXXIncremenralClusterer(EmbeddingAdaptor data, params);
// PartitioningClusteringResults add_data(XXXX data); };
On Thu, Jun 14, 2012 at 1:34 PM, Daniel Russel drussel@gmail.com wrote:
> Sorry, I didn't see this before. I think my previous comments still stand. > As a couple additional ones: > - having "execute" style methods on objects isn't a very nice practice. It > destroys type safety, since for class A you really have two different types > of A, the pre-execute A and the post-execute A and some contexts require > one and some the other (and the compiler can't check). And doesn't really > give you anything that you can't get from producing a new object as the > result. Ideally, classes should have what I have seen called the "no > protocols" property: you can call any function of the class in any order. > > - I kind of prefer Clusterer to Clustering when referring to the algorithm > as the latter very much means the result of running a clustering algorithm > to me as opposed to something that does clustering. But that may be my > eccentricity (but if you google "clusterer" the results seem consistent > with that usage). In any case, we need to make sure that distinct terms are > used for distinct things. > > > On Thu, Jun 14, 2012 at 12:38 PM, Barak Raveh barak.raveh@gmail.comwrote: > >> Now the full version... >> Daniel and I discussed a little bit consolidation of clustering things in >> statistics and kmeans modules. Please tell me if it is agreed that things >> will work with the following interface: >> * The Embedding family of classes (used to embed data in vector form) >> will remain as is >> * There will be a "Clustering" class from which all clustering algorithms >> will derive, with a constructor that takes either Embedding class, or >> EmbeddingAdaptor for implicit conversions from e.g., FloatsList >> * The Clustering classes will also have a *void ::execute()* method and a >> * ::get_clustering results() *method, that will return the clustering >> results (using the exisiting PartitioningClustering class, perhaps we can >> change its name to PartitioningClusteringResults). >> >> So bottom line, if you will want to cluster some data, you will do >> something like >> *FloatsList data; // or create an Embedding object* >> *KMeansClustering kmeans(data, params);* >> *kmeans.execute();* >> *PartitioningClustering clustering_results = >> kmeans.get_clustering_results()* >> >> Makes sense? For backward compatibility, I will add a DEPRECATED warning >> to existing clustering methods, so they will be removed within a few months >> completely. >> >> Barak >> >> _______________________________________________ >> IMP-dev mailing list >> IMP-dev@salilab.org >> https://salilab.org/mailman/listinfo/imp-dev >> >> >