On Dec 9, 2008, at 4:58 PM, Keren Lasker wrote:
> looks reasonable.
> Q1: can you indicate the relationship between the attributes (and/or)?

"And", as that is the only relationship that can be meaningfully accelerated using a hash map. You can write accelerating structures for other sorts of queries as needed, but I doubt we could accelerate complicated boolean queries beyond linear search, perhaps paired with some hash maps.
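A minimal sketch of why "and" fits a hash map, written in Python for brevity: the values of all indexed attributes form one composite key, so a conjunctive query is a single lookup. The `Particle` class, `build_and_index`, and the attribute names are hypothetical stand-ins for illustration and are not part of IMP.

```python
from collections import defaultdict


class Particle:
    """Hypothetical stand-in for a particle: a bag of named attributes."""

    def __init__(self, **attributes):
        self.attributes = dict(attributes)


def build_and_index(particles, keys):
    """Map each tuple of attribute values to the particles that carry it.

    Particles missing any of the named attributes are skipped.
    """
    index = defaultdict(list)
    for p in particles:
        if all(k in p.attributes for k in keys):
            index[tuple(p.attributes[k] for k in keys)].append(p)
    return index


particles = [
    Particle(name="A", residue_index=10),
    Particle(name="A", residue_index=11),
    Particle(name="B", residue_index=10),
    Particle(name="C"),  # lacks residue_index, so it is not indexed
]
index = build_and_index(particles, ("name", "residue_index"))

# name == "A" AND residue_index == 10 is one hash lookup.
matches = index.get(("A", 10), [])
```

An "or" over the same attributes has no single key to look up; it would need one index per attribute plus a union of the results, which is why only the conjunctive case is accelerated here.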
>
> Q2: can it support ranges (all residues in a range, for example)?

One-D ranges would be trivial to add if I use a std::map internally. Two-D or higher ranges obviously require completely different internal structures, which are expensive to build and so should be explicitly chosen. I can easily write arbitrary-dimensional range searches using CGAL.
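A minimal sketch of the one-dimensional case, in Python: a sorted list searched with `bisect` plays the role an ordered map such as std::map would play in C++. The `RangeIndex` class and the half-open `lb <= key < ub` convention are assumptions made for this example only.

```python
import bisect


class RangeIndex:
    """Sorted index over one numeric key (stand-in for an ordered map)."""

    def __init__(self, items, key):
        # Sort once up front; each query is then two binary searches.
        pairs = sorted(((key(item), item) for item in items),
                       key=lambda kv: kv[0])
        self._keys = [k for k, _ in pairs]
        self._items = [item for _, item in pairs]

    def in_range(self, lb, ub):
        """Return the items whose key satisfies lb <= key < ub."""
        lo = bisect.bisect_left(self._keys, lb)
        hi = bisect.bisect_left(self._keys, ub)
        return self._items[lo:hi]


residues = [{"index": i} for i in (30, 5, 12, 18, 7)]
by_index = RangeIndex(residues, key=lambda r: r["index"])
hits = [r["index"] for r in by_index.in_range(7, 18)]
```

Building the index costs one sort, and each query costs two binary searches plus the size of the answer. Nothing comparably simple exists for two or more dimensions, which is where a dedicated spatial structure comes in.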
>
> Q3: how do you support hierarchy? should the container hold all the
> leaves?

The Map takes a SingletonContainer. The container can do all sorts of fancy things if you want (you could, for example, write a container that holds all the leaves of the particles you gave it, but I haven't yet).
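A rough sketch of what such a leaf-collecting container could look like, in Python. `Node`, `iter_leaves`, and `LeavesContainer` are hypothetical names for illustration; they do not correspond to existing IMP classes, and the real SingletonContainer interface may differ.

```python
class Node:
    """Hypothetical hierarchy node; a node with no children is a leaf."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def iter_leaves(node):
    """Yield the leaves below node, depth first, left to right."""
    if not node.children:
        yield node
        return
    for child in node.children:
        yield from iter_leaves(child)


class LeavesContainer:
    """Container that exposes the leaves of the roots it was given."""

    def __init__(self, roots):
        self._roots = list(roots)

    def get_particles(self):
        # Recomputed on each call so later changes to the hierarchy are seen.
        return [leaf for root in self._roots for leaf in iter_leaves(root)]


protein = Node("protein", [
    Node("residue 1", [Node("N"), Node("CA")]),
    Node("residue 2", [Node("C")]),
])
names = [p.name for p in LeavesContainer([protein]).get_particles()]
```

With a container like this, the map itself never needs to know about the hierarchy: it indexes whatever the container returns.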
> On Tue, 9 Dec 2008, Daniel Russel wrote:
>
>> So does something like this look to be along the lines of what you
>> all have in mind?
>>
>> //! Create a map from attributes to particles
>> /** The map can be built with up to 4 different attributes of type
>>     Int or String. Particles missing the named attributes are skipped.
>> */
>> template <class TypeA, class TypeB=NullType, class TypeC=NullType,
>>           class TypeD=NullType>
>> class MapScoreState: public RefCountedObject {
>> public:
>>   typedef fancy_crap Key;
>>   typedef more_fancy_crap Value;
>>   //! Initialize the table to index on the attributes in at
>>   /** \param[in] pcs The container of particles to index in the table
>>       \param[in] at The set of attributes to index on */
>>   MapScoreState(ParticleContainer *pcs, const Attributes &at);
>>   //! Get the particles matching v
>>   Particles get_particles(const Values &v) const;
>>   //! Get the particle matching v
>>   /** \throw InvalidStateException if more than one particle matches
>>   */
>>   Particle *get_particle(const Values &v) const;
>>
>>   //! Also would have the score state update method to force an update
>> };
>>
>> typedef MapScoreState<Int> AtomIndexMap;
>>
>> AtomIndexMap *myindexmap= new AtomIndexMap(myparticles,
>>     AtomIndexMap::Key(AtomDecorator::get_index_key()));
>> myindexmap->before_evaluate(-1); // we might want to add an update
>>                                  // method to ScoreState
>>
>> Particle *atom10 = myindexmap->get_particle(AtomIndexMap::Value(10));
>>
>> For python, we would have to create a bunch of the maps with pre-chosen
>> types (i.e. IntMapScoreState, IntStringMapScoreState).
>>
>> On Dec 9, 2008, at 11:08 AM, Dina Schneidman wrote:
>>
>>> but you don't always want to update your indexing before/after
>>> scoring
>>>
>>> On Tue, Dec 9, 2008 at 11:06 AM, Daniel Russel drussel@gmail.com wrote:
>>>> Score states don't have anything to do with scoring either :-)
>>>> They are just updated before scoring since that is when things can
>>>> change during optimization. They used to just be called States, which
>>>> is perhaps clearer.
>>>>
>>>> On Dec 9, 2008, at 10:56 AM, "Dina Schneidman" duhovka@gmail.com wrote:
>>>>> maybe it's a simple solution in order to have it in a model, but
>>>>> conceptually this indexing has nothing to do with scoring
>>>>>
>>>>> On Tue, Dec 9, 2008 at 10:42 AM, Daniel Russel drussel@gmail.com wrote:
>>>>>> From what I understand, what you want is a way of specifying what
>>>>>> indexes we want to build (not a way of specifying queries). We could
>>>>>> easily provide ScoreStates for indexes based on:
>>>>>> - set of discrete valued attributes
>>>>>> - D-dimensional interval queries on float values
>>>>>>
>>>>>> On Dec 9, 2008, at 9:55 AM, Keren Lasker wrote:
>>>>>>> Daniel - we all know how to run for loops ;)
>>>>>>> I just thought it makes sense to have something more efficient :)
>>>>>>>
>>>>>>> On Tue, 9 Dec 2008, Daniel Russel wrote:
>>>>>>>> On Dec 9, 2008, at 9:29 AM, Keren Lasker wrote:
>>>>>>>>> For the 26S project we currently do:
>>>>>>>>> get particle by name
>>>>>>>> How could we beat something like:
>>>>>>>> [x for x in myparticles if x.has_attribute(name) and
>>>>>>>>  x.get_value(name) == "myname"]
>>>>>>>> with an SQL query?
>>>>>>>>> get a set of particles within a residue number range
>>>>>>>> again, some variant on:
>>>>>>>> [x for x in molecular_hierarchy_get_by_type(root,
>>>>>>>>      MolecularHierarchyDecorator.RESIDUE)
>>>>>>>>  if ResidueDecorator(x).get_index() > lb and
>>>>>>>>     ResidueDecorator(x).get_index() < ub]
>>>>>>>> or C++, something like
>>>>>>>> BOOST_FOREACH(Particle *p,
>>>>>>>>               molecular_hierarchy_get_by_type(root,
>>>>>>>>                   MolecularHierarchyDecorator::RESIDUE)) {
>>>>>>>>   if (ResidueDecorator(p).get_index() > lb and
>>>>>>>>       ResidueDecorator(p).get_index() < ub) {
>>>>>>>>     // do something
>>>>>>>>   }
>>>>>>>> }
>>>>>>>>> the solution of a function that gets (key,value,container) seems
>>>>>>>>> like a good solution.
>>>>>>>>> However - it can be more complicated:
>>>>>>>>> 1. it can interact with the hierarchy - give me the residue range
>>>>>>>>> within this protein for example - so we should probably also allow
>>>>>>>>> for a hierarchy handle in the interface.
>>>>>>>>> 2. we might want to ask for a residue range + some other property,
>>>>>>>>> such as having structural coverage or not. Therefore I think that
>>>>>>>>> an SQL-type string can be more general than a list of attributes -
>>>>>>>>> because you do not know how they are related.
>>>>>>>> But the added complication is why I would suggest sticking with C++
>>>>>>>> or python. Lambda functions or list comprehensions support very
>>>>>>>> general logic (more so than SQL) and allow you to leverage existing
>>>>>>>> code. SQL would make it really hard to use any of the existing
>>>>>>>> functionality and require lots of things be exposed in another
>>>>>>>> language. For example, try to find all particles close to a point in
>>>>>>>> SQL? It is kind of ugly.
>>>>>>>>> On Tue, 9 Dec 2008, Daniel Russel wrote:
>>>>>>>>>> In general, I think having queries on the whole collection of
>>>>>>>>>> particles in the model is not a good idea (since other people's
>>>>>>>>>> code, restraints or states can add particles to the model and you
>>>>>>>>>> can never be sure what those look like).
>>>>>>>>>>
>>>>>>>>>> There is already functionality to search a Hierarchy (although it
>>>>>>>>>> is more aimed at C++; we could use a python interface which takes
>>>>>>>>>> a python lambda function to make it more convenient to use in
>>>>>>>>>> python). And python has all sorts of features for searching a list
>>>>>>>>>> (and C++ has a few). It is not clear to me that we could provide
>>>>>>>>>> an interface that is general and much more concise.
>>>>>>>>>>
>>>>>>>>>> As a slight simplification for python users, we could provide a
>>>>>>>>>> function which takes a list of key, value pairs (with keys of
>>>>>>>>>> arbitrary type) and a container. It is a bit messier to provide
>>>>>>>>>> this interface in C++ as we would have to have a separate list per
>>>>>>>>>> type.
>>>>>>>>>>
>>>>>>>>>> Another thing to simplify such searches would be a
>>>>>>>>>> "DefaultValuesDecorator" which wraps a particle and pretends it
>>>>>>>>>> has all attributes, just providing default values for missing
>>>>>>>>>> ones. This would obviate the need to check for an attribute before
>>>>>>>>>> matching against it.
>>>>>>>>>>
>>>>>>>>>> What sort of queries do you all do?
>>>>>>>>>>
>>>>>>>>>> On Dec 9, 2008, at 8:43 AM, Dina Schneidman wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>> I need this too (surprisingly). Usually I do it with a mapping
>>>>>>>>>>> between the particle and the attribute.
>>>>>>>>>>> It is simple.
>>>>>>>>>>> However, it is unclear where we should put such a mapping.
>>>>>>>>>>> Putting it in a model could be the best, however not everyone
>>>>>>>>>>> needs it. So it means somewhere else or extending the Model to
>>>>>>>>>>> ProteinModel?
>>>>>>>>>>> Dina
>>>>>>>>>>> P.S. skype me, we can talk about it
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Dec 9, 2008 at 7:03 AM, Keren Lasker kerenl@salilab.org wrote:
>>>>>>>>>>>> hi all,
>>>>>>>>>>>> Frido and I often find ourselves needing to query particles
>>>>>>>>>>>> based on attribute values.
>>>>>>>>>>>> A few such examples: a protein with a specific name, particles
>>>>>>>>>>>> with a specific residue range.
>>>>>>>>>>>> I think that it would be very useful to have something similar
>>>>>>>>>>>> to SQL queries on the particles DB.
>>>>>>>>>>>> Bret might have had something similar implemented - but it is
>>>>>>>>>>>> probably obsolete.
>>>>>>>>>>>> IMP.Atom will probably need such functionality as well.
>>>>>>>>>>>> has anyone taken a look at that before?
>>>>>>>>>>>> thank you,
>>>>>>>>>>>> Keren.
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> IMP-dev mailing list
>>>>>>>>>>>> IMP-dev@salilab.org
>>>>>>>>>>>> https://salilab.org/mailman/listinfo/imp-dev