hi all,
Frido and I often find ourselves needing to query particles based on attribute values. A few examples: a protein with a specific name, or particles within a specific residue range.
I think it would be very useful to have something similar to SQL queries on the particles DB. Bret might have had something similar implemented, but it is probably obsolete. IMP.Atom will probably need such functionality as well.
has anyone taken a look at that before?
thank you, Keren.
i trust that the work on the cross-linking restraints code in IMP is based on the generalized description / classification of such restraints in the recent NCDIR proposal?
thanks, andrej
--
Andrej Sali, Ph.D.
Professor and Vice Chair, Department of Biopharmaceutical Sciences
Department of Pharmaceutical Chemistry
California Institute for Quantitative Biosciences
University of California at San Francisco
UCSF MC 2552
Byers Hall Room 503B
1700 4th Street
San Francisco, CA 94158-2330, USA
Tel +1 (415) 514-4227; Fax +1 (415) 514-4231
Assistant, Ms. Karin Asensio, Tel +1 (415)514-4228; Lab +1 (415) 514-4232, 4233, 4258
Email sali@salilab.org; Web http://salilab.org
we use cross-linking data as simple distance restraints - nothing fancy :)
On Dec 9, 2008, at 4:18 PM, Andrej Sali wrote:
> i trust that the work on the cross-linking restraints code in IMP is
> based on the generalized description / classification of such
> restraints in the recent NCDIR proposal?
>
> thanks, andrej
Hi,
I need this too (surprisingly). Usually I do it with a mapping between the particle and the attribute. It is simple; however, it is unclear where we should put such a mapping. Putting it in the Model might be best, but not everyone needs it. So does that mean putting it somewhere else, or extending Model to a ProteinModel?
Dina
P.S. skype me, we can talk about it
On Tue, Dec 9, 2008 at 7:03 AM, Keren Lasker kerenl@salilab.org wrote:
> hi all,
>
> Frido and I find ourselves many times need to query particles based on
> attribute values.
> Few such examples: a protein with a specific name, particles with a specific
> residue range.
>
> I think that it would be very useful to have something similar to SQL
> queries on the particles DB.
> Bret might had something similar implemented - but it is probably obsolete.
> IMP.Atom will probably need such functionality as well.
>
> has anyone took a look at that before ?
>
> thank you,
> Keren.
In general, I think having queries on the whole collection of particles in the model is not a good idea (since other people's code, restraints or states can add particles to the model and you can never be sure what those look like).
There is already functionality to search a Hierarchy (although it is aimed more at C++; we could use a python interface which takes a python lambda function to make it more convenient to use in python). And python has all sorts of features for searching a list (and C++ has a few). It is not clear to me that we could provide an interface that is general and much more concise.
As a slight simplification for python users, we could provide a function which takes a list of key, value pairs (with keys of arbitrary type) and a container. It is a bit messier to provide this interface in C++ as we would have to have a separate list per type.
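A minimal sketch of what such a (key, value) search function could look like in python. The `Particle` class here is a hypothetical stand-in that only mimics the `has_attribute` / `get_value` calls used elsewhere in this thread, and `find_particles` is an illustrative name, not an existing IMP function.

```python
class Particle:
    """Hypothetical stand-in for an IMP particle; not the real IMP class."""

    def __init__(self, **attributes):
        self._attributes = dict(attributes)

    def has_attribute(self, key):
        return key in self._attributes

    def get_value(self, key):
        return self._attributes[key]


def find_particles(container, criteria):
    """Return the particles in `container` matching every (key, value) pair.

    A particle that lacks one of the keys never matches.
    """
    return [
        p for p in container
        if all(p.has_attribute(k) and p.get_value(k) == v for k, v in criteria)
    ]


particles = [
    Particle(name="myname", chain="A"),
    Particle(name="othername", chain="A"),
    Particle(radius=1.5),  # e.g. added by someone else's code; has no name
]
matches = find_particles(particles, [("name", "myname"), ("chain", "A")])
```

Because python keys and values can be of any type, one list of pairs is enough here; a C++ version would need one list per attribute type, as noted above.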
Another thing that would simplify such searches would be a "DefaultValuesDecorator" which wraps a particle and pretends it has all attributes, just providing default values for missing ones. This would obviate the need to check for an attribute before matching against it.
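The idea can be sketched roughly as follows. This is only an illustration under assumptions: `Particle` is a hypothetical stand-in rather than the real IMP class, and the `defaults` and `fallback` constructor arguments are invented for the example.

```python
class Particle:
    """Hypothetical stand-in for an IMP particle; not the real IMP class."""

    def __init__(self, **attributes):
        self._attributes = dict(attributes)

    def has_attribute(self, key):
        return key in self._attributes

    def get_value(self, key):
        return self._attributes[key]


class DefaultValuesDecorator:
    """Wraps a particle and answers every attribute lookup (sketch only)."""

    def __init__(self, particle, defaults=None, fallback=None):
        self._particle = particle
        self._defaults = dict(defaults or {})
        self._fallback = fallback

    def has_attribute(self, key):
        return True  # pretends every attribute is present

    def get_value(self, key):
        if self._particle.has_attribute(key):
            return self._particle.get_value(key)
        # Missing attribute: use the per-key default, else the global fallback.
        return self._defaults.get(key, self._fallback)


particles = [Particle(name="myname"), Particle(radius=1.5)]
wrapped = [DefaultValuesDecorator(p, defaults={"name": ""}) for p in particles]
# No has_attribute() check is needed before comparing.
named = [w for w in wrapped if w.get_value("name") == "myname"]
```

The default has to be chosen so that it cannot be mistaken for a real value, otherwise a missing attribute would silently match a query.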
What sort of queries do you all do?
On Dec 9, 2008, at 8:43 AM, Dina Schneidman wrote:
> Hi,
>
> I need this too (surprisingly). Usually I do it with mapping between
> the particle and the attribute.
> It is simple. however it is unclear where should we put such a
> mapping. Putting it in a model
> could be the best, however not everyone needs it. So it means
> somewhere else or extending the Model to ProteinModel?
>
> Dina
>
> P.S. skype me, we can talk about it
For the 26S project we currently do:
  get particle by name
  get a set of particles within a residue number range
A function that takes (key, value, container) seems like a good solution. However, it can get more complicated:
1. It can interact with the hierarchy (give me the residue range within this protein, for example), so we should probably also allow for a hierarchy handle in the interface.
2. We might want to ask for a residue range plus some other property, such as whether or not there is structural coverage.
Therefore I think that an SQL-type string can be more general than a list of attributes, because you do not know how they are related.
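For comparison, the two complications above (a hierarchy handle, and a residue range combined with another property) can also be expressed with a search function that takes a root node and an arbitrary predicate. The sketch below is only an illustration: `Node`, `get_by_predicate`, and the `has_structure` attribute are hypothetical names, not part of IMP.

```python
class Node:
    """Hypothetical stand-in for a particle in a molecular hierarchy."""

    def __init__(self, kind, children=(), **attributes):
        self.kind = kind
        self.children = list(children)
        self.attributes = dict(attributes)


def get_by_predicate(root, predicate):
    """Depth-first search from `root`; return nodes where predicate(node) is true."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        if predicate(node):
            found.append(node)
        # Reversed so that children are visited in their original order.
        stack.extend(reversed(node.children))
    return found


def residue_in_range(lb, ub):
    """Predicate for residues with lb < index < ub (strict bounds)."""
    return lambda n: n.kind == "residue" and lb < n.attributes["index"] < ub


def has_structure(n):
    return n.attributes.get("has_structure", False)


# One protein with residues 1..10; even residues have structural coverage.
protein = Node("protein", name="myname", children=[
    Node("residue", index=i, has_structure=(i % 2 == 0)) for i in range(1, 11)
])
in_range = residue_in_range(2, 8)
hits = get_by_predicate(protein, lambda n: in_range(n) and has_structure(n))
indexes = [n.attributes["index"] for n in hits]
```

The relationship between the conditions (and, or, not) is spelled out in the predicate itself, which is the part a flat list of attributes cannot express.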
On Tue, 9 Dec 2008, Daniel Russel wrote:
> As a slight simplification for python users, we could provide a function
> which takes a list of key, value pairs (with keys of arbitrary type) and a
> container. It is a bit messier to provide this interface in C++ as we would
> have to have a separate list per type.
>
> What sort of queries do you all do?
On Dec 9, 2008, at 9:29 AM, Keren Lasker wrote:
> For the 26S project we currently do:
> get particle by name

How could we beat something like:
  [x for x in myparticles if x.has_attribute(name) and x.get_value(name) == "myname"]
with an SQL query?

> get a set of particles within a residue number range

again, some variant on:
  [x for x in molecular_hierarchy_get_by_type(root, MolecularHierarchyDecorator.RESIDUE)
   if ResidueDecorator(x).get_index() > lb and ResidueDecorator(x).get_index() < ub]

or C++, something like
  BOOST_FOREACH(Particle *p, molecular_hierarchy_get_by_type(root, MolecularHierarchyDecorator::RESIDUE)) {
    if (ResidueDecorator(p).get_index() > lb && ResidueDecorator(p).get_index() < ub) {
      // do something
    }
  }

> the solution of a function that gets (key,value,container) seems
> like a good solution.
> However - it can be more complicated :
> 1. it can interact with the hierarchy - give me the residues range
> within this protein for example - so we should probably also allow
> for a hierarchy handle in the interface.
> 2. we might want to ask residue range + some other property such as
> have structural coverage or do not. Therefore I think that a sql
> type string can be more general than a list of attributes - because
> you do not know how they are related.

But the added complication is why I would suggest sticking with C++ or python. Lambda functions or list comprehensions support very general logic (more so than SQL) and allow you to leverage existing code. SQL would make it really hard to use any of the existing functionality and would require lots of things to be exposed in another language. For example, try to find all particles close to a point in SQL. It is kind of ugly.
Daniel - we all know how to run for loops ;) I just thought it would make sense to have something more efficient :)
On Tue, 9 Dec 2008, Daniel Russel wrote:
> How could we beat something like:
> [x for x in myparticles if x.has_attribute(name) and x.get_value(name) ==
> "myname"]
> with an SQL query?
On Dec 9, 2008, at 9:55 AM, Keren Lasker wrote:
> Daniel - we all know how to run for loops ;)
> I just thought it make sense to have something more efficient :)

Ahhh, I thought you were aiming for conciseness :-) My bad. Trying to pay attention to too many things at once...
How about a ScoreState which builds an index using an arbitrary tuple of IntKeys (or an arbitrary tuple of keys)? It would be trivial to write (using templates in C++; perhaps an alternate python version would be good).
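A rough python sketch of such an index, under stated assumptions: `Particle` is a hypothetical stand-in for the IMP class, `AttributeIndex` is an invented name, and its `update()` method only imitates the point at which a ScoreState would refresh itself. It is not wired into any model.

```python
from collections import defaultdict


class Particle:
    """Hypothetical stand-in for an IMP particle; not the real IMP class."""

    def __init__(self, **attributes):
        self._attributes = dict(attributes)

    def has_attribute(self, key):
        return key in self._attributes

    def get_value(self, key):
        return self._attributes[key]


class AttributeIndex:
    """Maps a tuple of attribute values to the particles that carry them."""

    def __init__(self, particles, keys):
        self._particles = particles
        self._keys = tuple(keys)
        self._index = {}
        self.update()

    def update(self):
        """Rebuild the index; a ScoreState would do this when it is updated."""
        index = defaultdict(list)
        for p in self._particles:
            # Particles missing any of the keys are left out of the index.
            if all(p.has_attribute(k) for k in self._keys):
                index[tuple(p.get_value(k) for k in self._keys)].append(p)
        self._index = dict(index)

    def find(self, *values):
        """Return the particles whose key tuple equals `values`."""
        return self._index.get(tuple(values), [])


particles = [
    Particle(chain=0, residue=10),
    Particle(chain=0, residue=11),
    Particle(chain=1, residue=10),
    Particle(radius=1.5),  # has neither key, so it is not indexed
]
index = AttributeIndex(particles, keys=("chain", "residue"))
hit = index.find(0, 11)
```

Each lookup is a single dictionary access instead of a loop over all particles; the cost moves into `update()`, which has to run whenever the indexed attributes change.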
From what I understand, what you want is a way of specifying which indexes we want to build (not a way of specifying queries). We could easily provide ScoreStates for indexes based on:
- a set of discrete-valued attributes
- D-dimensional interval queries on float values
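For the interval case, a one-dimensional version can be sketched with a sorted list and binary search. This covers only D = 1; a real D-dimensional index would need a different structure (a k-d tree or range tree, for instance). `IntervalIndex` and the dictionary-based residues are hypothetical names used only for the example, and the bounds here are inclusive.

```python
import bisect


class IntervalIndex:
    """Sorted index over one numeric attribute, answering range queries."""

    def __init__(self, items, value_of):
        # Sort (value, position) pairs so the items themselves are never compared.
        pairs = sorted((value_of(item), i) for i, item in enumerate(items))
        self._values = [v for v, _ in pairs]
        self._items = [items[i] for _, i in pairs]

    def query(self, lb, ub):
        """Return items with lb <= value <= ub, in increasing order of value."""
        lo = bisect.bisect_left(self._values, lb)
        hi = bisect.bisect_right(self._values, ub)
        return self._items[lo:hi]


residues = [{"index": i, "name": "res%d" % i} for i in (7, 3, 12, 5, 9)]
index = IntervalIndex(residues, value_of=lambda r: r["index"])
found = [r["index"] for r in index.query(4, 9)]
```

Building the index costs one sort; each query then takes two binary searches plus the size of the result, rather than a pass over every residue.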
On Dec 9, 2008, at 9:55 AM, Keren Lasker wrote:
> Daniel - we all know how to run for loops ;)
> I just thought it make sense to have something more efficient :)
maybe it's a simple way to get it into the model, but conceptually this indexing has nothing to do with scoring
On Tue, Dec 9, 2008 at 10:42 AM, Daniel Russel drussel@gmail.com wrote:
> From what I understand, what you want is a way of specifying what indexes we
> want build (not a way of specifying queries). We could easily provide
> ScoreStates for indexes based on:
> - set of discrete valued attributes
> - D-dimensional interval queries on float values
>
> On Dec 9, 2008, at 9:55 AM, Keren Lasker wrote:
>
>> Daniel - we all know how to run for loops ;)
>> I just thought it make sense to have something more efficient :)
>>
>> On Tue, 9 Dec 2008, Daniel Russel wrote:
>>
>>> On Dec 9, 2008, at 9:29 AM, Keren Lasker wrote:
>>>
>>>> For the 26S project we currently do:
>>>> get particle by name
>>>
>>> How could we beat something like:
>>>   [x for x in myparticles if x.has_attribute(name) and x.get_value(name) ==
>>>    "myname"]
>>> with an SQL query?
>>>
>>>> get a set of particles within a residue number range
>>>
>>> again, some variant on:
>>>   [x for x in molecular_hierarchy_get_by_type(root,
>>>        MolecularHierarchyDecorator.RESIDUE)
>>>    if ResidueDecorator(x).get_index() > lb and
>>>       ResidueDecorator(x).get_index() < ub]
>>>
>>> or C++, something like
>>>   BOOST_FOREACH(Particle *p, molecular_hierarchy_get_by_type(root,
>>>                 MolecularHierarchyDecorator.RESIDUE)) {
>>>     if (ResidueDecorator(p).get_index() > lb and
>>>         ResidueDecorator(p).get_index() < ub) {
>>>       // do something
>>>     }
>>>   }
>>>
>>>> the solution of a function that gets (key,value,container) seems like a
>>>> good solution.
>>>> However - it can be more complicated :
>>>> 1. it can interact with the hierarchy - give me the residues range
>>>> within this protein for example - so we should probably also allow for a
>>>> hierarchy handle in the interface.
>>>> 2. we might want to ask residue range + some other property such as have
>>>> structural coverage or do not. Therefore I think that a sql type string can
>>>> be more general than a list of attributes - because you do not know how they
>>>> are related.
>>>
>>> But the added complication is why I would suggest sticking with C++ or
>>> python. Lambda functions or list comprehensions support very general logic
>>> (more so than SQL) and allow you to leverage existing code. SQL would make
>>> it really hard to use any of the existing functionality and require lots of
>>> things be exposed in another language. For example, try to find all
>>> particles close to a point in SQL? It is kind of ugly.
>>>
>>>> On Tue, 9 Dec 2008, Daniel Russel wrote:
>>>>>
>>>>> In general, I think having queries on the whole collection of particles
>>>>> in the model is not a good idea (since other people's code, restraints or
>>>>> states can add particles to the model and you can never be sure what those
>>>>> look like).
>>>>> There is already functionality to search a Hierarchy (although it is
>>>>> more aimed at C++-- we could use a python interface which takes takes a
>>>>> python lambda function to make it more convenient to use in python). And
>>>>> python has all sorts of features for searching a list (and C++ has a few).
>>>>> It is not clear to me that we could provide an interface that is
>>>>> general and much more concise.
>>>>> As a slight simplification for python users, we could provide a
>>>>> function which takes a list of key, value pairs (with keys of arbitrary
>>>>> type) and a container. It is a bit messier to provide this interface in C++
>>>>> as we would have to have a separate list per type.
>>>>> Another thing to simplify such search would be a
>>>>> "DefaultValuesDecorator" which wraps a particle and pretend it has all
>>>>> attributes, just providing default values for missing ones. This would
>>>>> obviate the need to check for an attribute before matching against it.
>>>>> What sort of queries do you all do?
>>>>>
>>>>> On Dec 9, 2008, at 8:43 AM, Dina Schneidman wrote:
>>>>>>
>>>>>> Hi,
>>>>>> I need this too (surprisingly). Usually I do it with mapping between
>>>>>> the particle and the attribute.
>>>>>> It is simple. however it is unclear where should we put such a
>>>>>> mapping. Putting it in a model
>>>>>> could be the best, however not everyone needs it. So it means
>>>>>> somewhere else or extending the Model to ProteinModel?
>>>>>> Dina
>>>>>> P.S. skype me, we can talk about it
>>>>>>
>>>>>> On Tue, Dec 9, 2008 at 7:03 AM, Keren Lasker kerenl@salilab.org wrote:
>>>>>>>
>>>>>>> hi all,
>>>>>>> Frido and I find ourselves many times need to query particles based on
>>>>>>> attribute values.
>>>>>>> Few such examples: a protein with a specific name, particles with a
>>>>>>> specific residue range.
>>>>>>> I think that it would be very useful to have something similar to SQL
>>>>>>> queries on the particles DB.
>>>>>>> Bret might had something similar implemented - but it is probably
>>>>>>> obsolete.
>>>>>>> IMP.Atom will probably need such functionality as well.
>>>>>>> has anyone took a look at that before ?
>>>>>>> thank you,
>>>>>>> Keren.
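A minimal sketch of the two kinds of index listed in the quoted 10:42 message: a lookup on a discrete-valued attribute and an interval query on a numeric one. It is plain Python and does not use IMP. Particles are modelled as dictionaries, and `DiscreteIndex`, `IntervalIndex` and the sample names are hypothetical. Only the 1-D interval case is shown, not the D-dimensional one.

```python
import bisect
from collections import defaultdict


class DiscreteIndex:
    """Maps each value of one attribute to the items that carry it."""

    def __init__(self, items, key):
        self._by_value = defaultdict(list)
        for item in items:
            if key in item:  # items lacking the attribute are skipped
                self._by_value[item[key]].append(item)

    def lookup(self, value):
        return list(self._by_value.get(value, []))


class IntervalIndex:
    """Answers 1-D range queries lb < value < ub on a numeric attribute."""

    def __init__(self, items, key):
        pairs = sorted((item[key], i) for i, item in enumerate(items) if key in item)
        self._values = [v for v, _ in pairs]
        self._items = [items[i] for _, i in pairs]

    def query(self, lb, ub):
        lo = bisect.bisect_right(self._values, lb)  # first value > lb
        hi = bisect.bisect_left(self._values, ub)   # first value >= ub
        return self._items[lo:hi]


# Made-up sample data; the last particle has neither attribute.
particles = [
    {"name": "protA", "residue_index": 5},
    {"name": "protB", "residue_index": 12},
    {"name": "protA", "residue_index": 20},
    {"radius": 1.5},
]
by_name = DiscreteIndex(particles, "name")
by_residue = IntervalIndex(particles, "residue_index")
```

Both indexes are built once; a lookup is then a dictionary access or two binary searches rather than a pass over every particle. They go stale as soon as an indexed attribute changes, so something has to rebuild them.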
Score states don't have anything to do with scoring either :-) They are just updated before scoring, since that is when things can change during optimization. They used to be called just States, which is perhaps clearer.
On Dec 9, 2008, at 10:56 AM, "Dina Schneidman" duhovka@gmail.com wrote:
> maybe it's a simple way to get it into a model, but conceptually this
> indexing has nothing to do with scoring
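To illustrate the point that score states are simply things that get updated before scoring, here is a toy sketch. It assumes a model that calls `update()` on each registered state and then sums its restraints; `ToyModel` and `NameIndexState` are invented for illustration and are not IMP classes.

```python
class ToyModel:
    """Stand-in for a model: refreshes every state, then scores."""

    def __init__(self):
        self.states = []
        self.restraints = []

    def evaluate(self):
        for state in self.states:  # states run first so derived data is fresh
            state.update()
        return sum(restraint() for restraint in self.restraints)


class NameIndexState:
    """A state that rebuilds a name -> particles map on each update."""

    def __init__(self, particles):
        self.particles = particles
        self.by_name = {}
        self.updates = 0

    def update(self):
        self.by_name = {}
        for p in self.particles:
            if "name" in p:
                self.by_name.setdefault(p["name"], []).append(p)
        self.updates += 1


particles = [{"name": "protA"}, {"name": "protB"}]
index = NameIndexState(particles)
model = ToyModel()
model.states.append(index)
model.restraints.append(lambda: 0.0)  # placeholder restraint

model.evaluate()
particles[1]["name"] = "protA"  # an attribute changes between evaluations
model.evaluate()                # the index is rebuilt before scoring
```

The index itself contributes nothing to the score; it only uses the pre-scoring hook as a convenient moment to refresh.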
but you don't always want to update your indexing before/after scoring
On Tue, Dec 9, 2008 at 11:06 AM, Daniel Russel drussel@gmail.com wrote:
> Score states don't have anything to do with scoring either :-) They are just
> updated before scoring, since that is when things can change during
> optimization. They used to be called just States, which is perhaps clearer.
True, but it is our only current hook for keeping things up to date. One alternative would be to add hooks to monitor attribute changes, but it would be tricky to make that inexpensive. You don't have to add states to a model though, so you could update it manually if you want.
On Dec 9, 2008, at 11:08 AM, "Dina Schneidman" duhovka@gmail.com wrote:
> but you don't always want to update your indexing before/after scoring
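A rough sketch of the attribute-change hook alternative and why it can get expensive. This is plain Python under assumed names (`WatchedParticle`, `HookedNameIndex`, `add_listener`); nothing here is an existing IMP interface.

```python
class WatchedParticle:
    """Particle-like object whose attribute writes notify listeners."""

    def __init__(self, **attrs):
        self._attrs = dict(attrs)
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def get_value(self, key):
        return self._attrs[key]

    def set_value(self, key, value):
        old = self._attrs.get(key)
        self._attrs[key] = value
        for callback in self._listeners:  # paid on every single write
            callback(self, key, old, value)


class HookedNameIndex:
    """Keeps a name -> particles map current via change notifications."""

    def __init__(self, particles):
        self.by_name = {}
        self.notifications = 0
        for p in particles:
            self.by_name.setdefault(p.get_value("name"), []).append(p)
            p.add_listener(self._on_change)

    def _on_change(self, particle, key, old, new):
        self.notifications += 1
        if key != "name":
            return  # called even for attributes the index ignores
        self.by_name[old].remove(particle)
        if not self.by_name[old]:
            del self.by_name[old]
        self.by_name.setdefault(new, []).append(particle)


ps = [WatchedParticle(name="protA", x=0.0), WatchedParticle(name="protB", x=1.0)]
index = HookedNameIndex(ps)
for step in range(100):
    ps[0].set_value("x", float(step))  # coordinates move during optimization
ps[1].set_value("name", "protA")       # the one change the index cares about
```

The index is always current, but it is notified on all 101 writes although only one touches the indexed attribute. Updating manually avoids that overhead, at the price of leaving it to the caller to refresh at the right moments.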
Daniel - sorry, I was away for a few hours - but in the end I could not really follow from this thread what types of queries would be possible.
On Tue, 9 Dec 2008, Daniel Russel wrote:
> True, but it is our only current hook for keeping things up to date. One
> alternative would be to add hooks to monitor attribute changes, but it
> would be tricky to make that inexpensive. You don't have to add states to a
> model though, so you could update it manually if you want.
So does something like this look to be along the lines of what you all have in mind?

//! Create a map from attributes to particles
/** The map can be built with up to 4 different attributes of type Int or
    String. Particles missing the named attributes are skipped.
*/
template <class TypeA, class TypeB=NullType, class TypeC=NullType,
          class TypeD=NullType>
class MapScoreState: public RefCountedObject {
public:
  typedef fancy_crap Key;
  typedef more_fancy_crap Value;
  //! Initialize the table to index on the attributes in at
  /** \param[in] pcs The container of particles to index in the table
      \param[in] at The set of attributes to index on */
  MapScoreState(ParticleContainer *pcs, const Key &at);
  //! Get the particles matching v
  Particles get_particles(const Value &v) const;
  //! Get the particle matching v
  /** \throw InvalidStateException if more than one particle matches */
  Particle *get_particle(const Value &v) const;

  //! Also would have the score state update method to force an update
};

typedef MapScoreState<Int> AtomIndexMap;

AtomIndexMap *myindexmap= new AtomIndexMap(myparticles,
                              AtomIndexMap::Key(AtomDecorator::get_index_key()));
myindexmap->before_evaluate(-1); // we might want to add an update method to ScoreState

Particle *atom10 = myindexmap->get_particle(AtomIndexMap::Value(10));

For python, we would have to create a bunch of the maps with pre-chosen types (e.g. IntMapScoreState, IntStringMapScoreState).
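To make the idea concrete, here is a minimal Python sketch of the kind of index described above. It is not IMP code: plain dicts stand in for particles, and the class name `AttributeIndex` and its methods are hypothetical, chosen only to mirror the proposed get_particles/get_particle pair and the explicit update step.

```python
from collections import defaultdict


class AttributeIndex:
    """Hypothetical stand-in for the proposed MapScoreState (not an IMP class)."""

    def __init__(self, particles, keys):
        self._particles = particles   # container of dict-like "particles"
        self._keys = tuple(keys)      # attribute names to index on
        self._table = {}
        self.update()

    def update(self):
        """Rebuild the table; plays the role of the score state update."""
        table = defaultdict(list)
        for p in self._particles:
            # Particles missing any of the named attributes are skipped.
            if all(k in p for k in self._keys):
                table[tuple(p[k] for k in self._keys)].append(p)
        self._table = dict(table)

    def get_particles(self, *values):
        """Return all particles whose indexed attributes equal values."""
        return list(self._table.get(tuple(values), []))

    def get_particle(self, *values):
        """Return the single match, or None; raise if the match is ambiguous."""
        matches = self.get_particles(*values)
        if len(matches) > 1:
            raise ValueError("more than one particle matches %r" % (values,))
        return matches[0] if matches else None


# Assumed toy data: 20 atoms with an index, plus one particle without one.
atoms = [{"atom_index": i, "name": "CA"} for i in range(1, 21)]
atoms.append({"name": "no index"})

index = AttributeIndex(atoms, ["atom_index"])
atom10 = index.get_particle(10)
```

Because the table is only rebuilt in `update()`, the caller decides when the index is refreshed, which is the point raised about not always wanting to update before or after scoring.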
On Dec 9, 2008, at 11:08 AM, Dina Schneidman wrote:
> but you don't always want to update you indexing before/after scoring
>
> On Tue, Dec 9, 2008 at 11:06 AM, Daniel Russel drussel@gmail.com wrote:
>> Score states don't have anything to do with scoring either :-) they are just
>> updated before scoring since that is when things can change during
>> optimization. They used to just be called States which is perhaps clearer.
>>
>> On Dec 9, 2008, at 10:56 AM, "Dina Schneidman" duhovka@gmail.com wrote:
>>> maybe it's a simple solution in order to have it in a model, but
>>> conceptually this indexing has nothing to do with scoring
>>>
>>> On Tue, Dec 9, 2008 at 10:42 AM, Daniel Russel drussel@gmail.com wrote:
>>>> From what I understand, what you want is a way of specifying what indexes we
>>>> want build (not a way of specifying queries). We could easily provide
>>>> ScoreStates for indexes based on:
>>>> - set of discrete valued attributes
>>>> - D-dimensional interval queries on float values
>>>>
>>>> On Dec 9, 2008, at 9:55 AM, Keren Lasker wrote:
>>>>> Daniel - we all know how to run for loops ;)
>>>>> I just thought it make sense to have something more efficient :)
>>>>>
>>>>> On Tue, 9 Dec 2008, Daniel Russel wrote:
>>>>>> On Dec 9, 2008, at 9:29 AM, Keren Lasker wrote:
>>>>>>> For the 26S project we currently do:
>>>>>>> get particle by name
>>>>>>
>>>>>> How could we beat something like:
>>>>>> [x for x in myparticles if x.has_attribute(name) and x.get_value(name) == "myname"]
>>>>>> with an SQL query?
>>>>>>
>>>>>>> get a set of particles within a residue number range
>>>>>>
>>>>>> again, some variant on:
>>>>>> [x for x in molecular_hierarchy_get_by_type(root, MolecularHierarchyDecorator.RESIDUE)
>>>>>>  if ResidueDecorator(x).get_index() > lb and ResidueDecorator(x).get_index() < ub]
>>>>>>
>>>>>> or C++, something like
>>>>>> BOOST_FOREACH(Particle *p, molecular_hierarchy_get_by_type(root,
>>>>>>               MolecularHierarchyDecorator.RESIDUE)) {
>>>>>>   if (ResidueDecorator(p).get_index() > lb and
>>>>>>       ResidueDecorator(p).get_index() < ub) {
>>>>>>     // do something
>>>>>>   }
>>>>>> }
>>>>>>
>>>>>>> the solution of a function that gets (key,value,container) seems like a
>>>>>>> good solution.
>>>>>>> However - it can be more complicated :
>>>>>>> 1. it can interact with the hierarchy - give me the residues range
>>>>>>> within this protein for example - so we should probably also allow for a
>>>>>>> hierarchy handle in the interface.
>>>>>>> 2. we might want to ask residue range + some other property such as have
>>>>>>> structural coverage or do not. Therefore I think that a sql type string can
>>>>>>> be more general than a list of attributes - because you do not know how they
>>>>>>> are related.
>>>>>>
>>>>>> But the added complication is why I would suggest sticking with C++ or
>>>>>> python. Lambda functions or list comprehensions support very general logic
>>>>>> (more so than SQL) and allow you to leverage existing code. SQL would make
>>>>>> it really hard to use any of the existing functionality and require lots of
>>>>>> things be exposed in another language. For example, try to find all
>>>>>> particles close to a point in SQL? It is kind of ugly.
>>>>>>
>>>>>>> On Tue, 9 Dec 2008, Daniel Russel wrote:
>>>>>>>> In general, I think having queries on the whole collection of particles
>>>>>>>> in the model is not a good idea (since other people's code, restraints or
>>>>>>>> states can add particles to the model and you can never be sure what those
>>>>>>>> look like).
>>>>>>>> There is already functionality to search a Hierarchy (although it is
>>>>>>>> more aimed at C++-- we could use a python interface which takes takes a
>>>>>>>> python lambda function to make it more convenient to use in python). And
>>>>>>>> python has all sorts of features for searching a list (and C++ has a few).
>>>>>>>> It is not clear to me that we could provide an interface that is
>>>>>>>> general and much more concise.
>>>>>>>> As a slight simplification for python users, we could provide a
>>>>>>>> function which takes a list of key, value pairs (with keys of arbitrary
>>>>>>>> type) and a container. It is a bit messier to provide this interface in C++
>>>>>>>> as we would have to have a separate list per type.
>>>>>>>> Another thing to simplify such search would be a
>>>>>>>> "DefaultValuesDecorator" which wraps a particle and pretend it has all
>>>>>>>> attributes, just providing default values for missing ones. This would
>>>>>>>> obviate the need to check for an attribute before matching against it.
>>>>>>>> What sort of queries do you all do?
>>>>>>>>
>>>>>>>> On Dec 9, 2008, at 8:43 AM, Dina Schneidman wrote:
>>>>>>>>> Hi,
>>>>>>>>> I need this too (surprisingly). Usually I do it with mapping between
>>>>>>>>> the particle and the attribute.
>>>>>>>>> It is simple. however it is unclear where should we put such a
>>>>>>>>> mapping. Putting it in a model
>>>>>>>>> could be the best, however not everyone needs it. So it means
>>>>>>>>> somewhere else or extending the Model to ProteinModel?
>>>>>>>>> Dina
>>>>>>>>> P.S. skype me, we can talk about it
>>>>>>>>>
>>>>>>>>> On Tue, Dec 9, 2008 at 7:03 AM, Keren Lasker <kerenl@salilab.org> wrote:
>>>>>>>>>> hi all,
>>>>>>>>>> Frido and I find ourselves many times need to query particles based on
>>>>>>>>>> attribute values.
>>>>>>>>>> Few such examples: a protein with a specific name, particles with a
>>>>>>>>>> specific residue range.
>>>>>>>>>> I think that it would be very useful to have something similar to SQL
>>>>>>>>>> queries on the particles DB.
>>>>>>>>>> Bret might had something similar implemented - but it is probably
>>>>>>>>>> obsolete.
>>>>>>>>>> IMP.Atom will probably need such functionality as well.
>>>>>>>>>> has anyone took a look at that before ?
>>>>>>>>>> thank you,
>>>>>>>>>> Keren.
looks reasonable.
Q1: can you indicate the relationship between the attributes (and/or)?
Q2: can it support ranges (all residues in a range, for example)?
Q3: how do you support hierarchy? should the container hold all the leaves?
On Dec 9, 2008, at 4:58 PM, Keren Lasker wrote:
> looks reasonable.
> Q1: can you indicate the relationship between the attributes (and/or)?

"And", as that is the only thing that can be meaningfully accelerated using a hash map. You can write accelerating structures for other sorts of queries as needed, but I doubt we could accelerate complicated boolean queries beyond linear search, perhaps paired with some hash maps.
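As a rough illustration of "linear search paired with a hash map", the sketch below uses a hash lookup on one attribute to narrow the candidates and then applies an arbitrary boolean predicate to the survivors. Everything here is an assumption for illustration: dicts stand in for particles, and the attribute names `chain`, `residue` and `covered` are made up.

```python
from collections import defaultdict

# Hypothetical particles: plain dicts stand in for IMP particles.
particles = [
    {"chain": "A", "residue": 1, "covered": False},
    {"chain": "A", "residue": 5, "covered": True},
    {"chain": "A", "residue": 8, "covered": False},
    {"chain": "B", "residue": 2, "covered": True},
]

# Hash map on one discrete attribute: this is the part that is accelerated.
by_chain = defaultdict(list)
for p in particles:
    by_chain[p["chain"]].append(p)


def query(chain, predicate):
    """Hash lookup on chain, then a linear scan with an arbitrary predicate."""
    return [p for p in by_chain.get(chain, []) if predicate(p)]


# "chain == A and (residue < 3 or covered)": the 'or' part is not indexed.
hits = query("A", lambda p: p["residue"] < 3 or p["covered"])
hit_residues = [p["residue"] for p in hits]
```

The conjunction with the indexed attribute is cheap; the remaining boolean logic still costs a scan, but only over the particles that share the hashed value.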
> Q2: can it support range ( all residues in a range for example)

One-D ranges would be trivial to add if I use a std::map internally. Two-D or higher ranges obviously require completely different internal structures, which are expensive to build and so should be explicitly chosen. I can easily write arbitrary-dimensional range searches using CGAL.
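A one-dimensional range lookup over an ordered structure can be sketched in Python with a sorted list and the standard `bisect` module, which plays the role that an ordered `std::map` would play in C++. The class name `SortedIndex` and the dict-based particles are assumptions for illustration only; the bounds are exclusive to match the `> lb and < ub` test used earlier in the thread.

```python
import bisect


class SortedIndex:
    """Hypothetical 1-D range index over one integer attribute."""

    def __init__(self, particles, key):
        # Sort (value, position) pairs; particles lacking the key are skipped.
        pairs = sorted((p[key], i) for i, p in enumerate(particles) if key in p)
        self._values = [v for v, _ in pairs]
        self._particles = [particles[i] for _, i in pairs]

    def get_in_range(self, lb, ub):
        """Return particles with lb < value < ub, in increasing value order."""
        lo = bisect.bisect_right(self._values, lb)  # first value > lb
        hi = bisect.bisect_left(self._values, ub)   # first value >= ub
        return self._particles[lo:hi]


# Assumed toy data: residues 10 down to 1, deliberately unsorted.
residues = [{"residue_index": i} for i in range(10, 0, -1)]
index = SortedIndex(residues, "residue_index")
in_range = [p["residue_index"] for p in index.get_in_range(3, 7)]
```

Each query costs two binary searches plus the size of the result, instead of a scan over every residue.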
> Q3: how do you support hierarchy? should the container hold all the leaves ?

The Map takes a SingletonContainer. The container can do all sorts of fancy things if you want (you could, for example, write a container that contains all the leaves of the particles you give it, but I haven't yet).
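The "container of all the leaves" idea might look roughly like the following. This is only a sketch under stated assumptions: the hierarchy is modelled as nested dicts with a hypothetical `children` list, not as an IMP hierarchy decorator, and `get_leaves` is an invented helper name.

```python
def get_leaves(node):
    """Collect the leaf nodes below node, left to right (node itself if childless)."""
    children = node.get("children", [])
    if not children:
        return [node]
    leaves = []
    for child in children:
        leaves.extend(get_leaves(child))
    return leaves


# Assumed toy hierarchy: protein -> residues -> atoms.
root = {
    "name": "protein",
    "children": [
        {"name": "res1", "children": [{"name": "N"}, {"name": "CA"}]},
        {"name": "res2", "children": [{"name": "CA"}]},
    ],
}

leaf_names = [leaf["name"] for leaf in get_leaves(root)]
```

An index built over `get_leaves(root)` rather than over the whole model would then only see the particles under the hierarchy handle the caller passed in.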
I think such functionality would be a reasonable candidate for a new module, especially as core is getting very large and cluttered.
Perhaps IMP.indexes (or .indices if people prefer that spelling).
On Dec 9, 2008, at 4:58 PM, Keren Lasker wrote:
> looks reasonable.
> Q1: can you indicate the relationship between the attributes ( and/or) ?
> Q2: can it support range ( all residues in a range for example)
> Q3: how do you support hierarchy? should the container hold all the
> leaves ?
>
> On Tue, 9 Dec 2008, Daniel Russel wrote:
>
>> So does something like this look to be along the lines of what you
>> all have in mind?
>> //! Create a map from attributes to particles
>> /** The map can be built with up to 4 different attributes of type
>>     Int or String. Particles missing the named attributes are skipped.
>> */
>> template <class TypeA, class TypeB=NullType, class TypeC=NullType,
>>           class TypeD=NullType>
>> class MapScoreState: public RefCountedObject {
>> public:
>>   typedef fancy_crap Key;
>>   typedef more_fancy_crap Value;
>>   //! Initialize the table to index on the attributes in at
>>   /** \param[in] pcs The container of particles to index in the table
>>       \param[in] at The set of attributes to index on*/
>>   MapScoreState(ParticleContainer *pcs, const Attributes &at);
>>   //! Get the particles matching v
>>   Particles get_particles(const Values &v) const;
>>   //! Get the particle matching v
>>   /** \throw InvalidStateException if more than one particle matches
>>   */
>>   Particle *get_particle(const Values &v) const;
>>
>>   //! Also would have the score state update method to force an update
>> };
>>
>> typedef MapScoreState<Int> AtomIndexMap;
>>
>> AtomIndexMap *myindexmap= new AtomIndexMap(myparticles,
>>     AtomIndexMap::Key(AtomDecorator::get_index_key()));
>> // we might want to add an update method to ScoreState
>> myindexmap->before_evaluate(-1);
>>
>> Particle *atom10 = myindexmap->get_particle(AtomIndexMap::Value(10));
>>
>> For python, we would have to create a bunch of the maps with
>> pre-chosen types (i.e. IntMapScoreState, IntStringMapScoreState).
>>
>> On Dec 9, 2008, at 11:08 AM, Dina Schneidman wrote:
>>
>>> but you don't always want to update you indexing before/after
>>> scoring
>>>
>>> On Tue, Dec 9, 2008 at 11:06 AM, Daniel Russel drussel@gmail.com
>>> wrote:
>>>> Score states don't have anything to do with scoring either :-)
>>>> they are just updated before scoring since that is when things can
>>>> change during optimization. They used to just be called States
>>>> which is perhaps clearer.
>>>>
>>>> On Dec 9, 2008, at 10:56 AM, "Dina Schneidman"
>>>> duhovka@gmail.com wrote:
>>>>> maybe it's a simple solution in order to have it in a model, but
>>>>> conceptually this indexing has nothing to do with scoring
>>>>>
>>>>> On Tue, Dec 9, 2008 at 10:42 AM, Daniel Russel
>>>>> drussel@gmail.com wrote:
>>>>>> From what I understand, what you want is a way of specifying
>>>>>> what indexes we want build (not a way of specifying queries).
>>>>>> We could easily provide ScoreStates for indexes based on:
>>>>>> - set of discrete valued attributes
>>>>>> - D-dimensional interval queries on float values
>>>>>>
>>>>>> On Dec 9, 2008, at 9:55 AM, Keren Lasker wrote:
>>>>>>> Daniel - we all know how to run for loops ;)
>>>>>>> I just thought it make sense to have something more efficient :)
>>>>>>>
>>>>>>> On Tue, 9 Dec 2008, Daniel Russel wrote:
>>>>>>>> On Dec 9, 2008, at 9:29 AM, Keren Lasker wrote:
>>>>>>>>> For the 26S project we currently do:
>>>>>>>>> get particle by name
>>>>>>>> How could we beat something like:
>>>>>>>> [x for x in myparticles if x.has_attribute(name) and
>>>>>>>>  x.get_value(name) == "myname"]
>>>>>>>> with an SQL query?
>>>>>>>>> get a set of particles within a residue number range
>>>>>>>> again, some variant on:
>>>>>>>> [x for x in molecular_hierarchy_get_by_type(root,
>>>>>>>>      MolecularHierarchyDecorator.RESIDUE) if
>>>>>>>>  ResidueDecorator(x).get_index() > lb and
>>>>>>>>  ResidueDecorator(x).get_index() < ub]
>>>>>>>> or C++, something like
>>>>>>>> BOOST_FOREACH(Particle *p,
>>>>>>>>     molecular_hierarchy_get_by_type(root,
>>>>>>>>         MolecularHierarchyDecorator::RESIDUE)) {
>>>>>>>>   if (ResidueDecorator(p).get_index() > lb and
>>>>>>>>       ResidueDecorator(p).get_index() < ub) {
>>>>>>>>     // do something
>>>>>>>>   }
>>>>>>>> }
>>>>>>>>> the solution of a function that gets (key,value,container)
>>>>>>>>> seems like a good solution.
>>>>>>>>> However - it can be more complicated :
>>>>>>>>> 1. it can interact with the hierarchy - give me the
>>>>>>>>> residues range within this protein for example - so we should
>>>>>>>>> probably also allow for a hierarchy handle in the interface.
>>>>>>>>> 2. we might want to ask residue range + some other property
>>>>>>>>> such as have structural coverage or do not. Therefore I think
>>>>>>>>> that a sql type string can be more general than a list of
>>>>>>>>> attributes - because you do not know how they are related.
>>>>>>>> But the added complication is why I would suggest sticking
>>>>>>>> with C++ or python. Lambda functions or list comprehensions
>>>>>>>> support very general logic (more so than SQL) and allow you to
>>>>>>>> leverage existing code. SQL would make it really hard to use
>>>>>>>> any of the existing functionality and require lots of things
>>>>>>>> be exposed in another language. For example, try to find all
>>>>>>>> particles close to a point in SQL? It is kind of ugly.
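For anyone who wants to try the MapScoreState idea quoted above without the C++ machinery, here is a minimal Python sketch. It is an illustration under assumptions, not IMP code: Particle and AttributeIndex are hypothetical stand-ins, attributes are plain dictionary entries, and update() has to be called by hand where the proposal would rely on the score state update.

```python
from collections import defaultdict


class Particle:
    """Hypothetical stand-in for an IMP particle: a bag of named attributes."""

    def __init__(self, **attributes):
        self.attributes = dict(attributes)


class AttributeIndex:
    """Map tuples of attribute values to the particles that carry them."""

    def __init__(self, particles, keys):
        self.particles = particles  # the container being indexed
        self.keys = tuple(keys)     # attribute names to index on
        self.table = defaultdict(list)
        self.update()

    def update(self):
        """Rebuild the table; call after the container or attributes change."""
        self.table.clear()
        for p in self.particles:
            # Particles missing any of the named attributes are skipped.
            if all(k in p.attributes for k in self.keys):
                value = tuple(p.attributes[k] for k in self.keys)
                self.table[value].append(p)

    def get_particles(self, *values):
        """Return every particle whose indexed attributes equal values."""
        return list(self.table.get(tuple(values), []))

    def get_particle(self, *values):
        """Return the single match, or None; raise if the match is ambiguous."""
        matches = self.get_particles(*values)
        if len(matches) > 1:
            raise ValueError("more than one particle matches %r" % (values,))
        return matches[0] if matches else None


# Made-up data: two atoms with an index and one particle without one.
atoms = [Particle(name="CA", index=10), Particle(name="CB", index=11),
         Particle(radius=1.5)]
index_map = AttributeIndex(atoms, ["index"])
atom10 = index_map.get_particle(10)
```

Indexing on several attributes only changes the keys argument, for example AttributeIndex(atoms, ["name", "index"]) followed by get_particle("CA", 10). Whether the rebuild should happen on every evaluation or only on demand is exactly the question raised in the quoted exchange, so the sketch leaves that choice to the caller.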
I think IndexModel is the best solution, but in C++. Having Python and C++ is more than enough; we don't want to support something else. I suggested extending the Model so that the index can be kept up to date: in that case you can update the indexing whenever you add or remove particles.
On Tue, Dec 9, 2008 at 9:29 AM, Keren Lasker kerenl@salilab.org wrote:
> For the 26S project we currently do:
> get particle by name
> get a set of particles within a residue number range
>
> the solution of a function that gets (key,value,container) seems like a good
> solution.
> However - it can be more complicated :
> 1. it can interact with the hierarchy - give me the residues range within
> this protein for example - so we should probably also allow for a hierarchy
> handle in the interface.
> 2. we might want to ask residue range + some other property such as have
> structural coverage or do not. Therefore I think that a sql type string can
> be more general than a list of attributes - because you do not know how they
> are related.
>
> On Tue, 9 Dec 2008, Daniel Russel wrote:
>
>> In general, I think having queries on the whole collection of particles in
>> the model is not a good idea (since other people's code, restraints or
>> states can add particles to the model and you can never be sure what those
>> look like).
>>
>> There is already functionality to search a Hierarchy (although it is more
>> aimed at C++-- we could use a python interface which takes a python
>> lambda function to make it more convenient to use in python). And python has
>> all sorts of features for searching a list (and C++ has a few).
>> It is not clear to me that we could provide an interface that is general
>> and much more concise.
>>
>> As a slight simplification for python users, we could provide a function
>> which takes a list of key, value pairs (with keys of arbitrary type) and a
>> container. It is a bit messier to provide this interface in C++ as we would
>> have to have a separate list per type.
>>
>> Another thing to simplify such search would be a "DefaultValuesDecorator"
>> which wraps a particle and pretends it has all attributes, just providing
>> default values for missing ones. This would obviate the need to check for an
>> attribute before matching against it.
>>
>> What sort of queries do you all do?
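The "DefaultValuesDecorator" idea quoted above can be sketched in a few lines of Python. This is a guess at the intent rather than an existing IMP class: Particle is a hypothetical stand-in that only borrows the has_attribute/get_value method names used elsewhere in this thread, and DefaultValues is an invented name.

```python
class Particle:
    """Hypothetical stand-in for an IMP particle (not the real class)."""

    def __init__(self, **attributes):
        self.attributes = dict(attributes)

    def has_attribute(self, key):
        return key in self.attributes

    def get_value(self, key):
        return self.attributes[key]


class DefaultValues:
    """Wrap a particle so that missing attributes read as a default value."""

    def __init__(self, particle, default=None):
        self.particle = particle
        self.default = default

    def get_value(self, key):
        # Fall back to the default instead of failing on a missing attribute.
        if self.particle.has_attribute(key):
            return self.particle.get_value(key)
        return self.default


# Made-up data: the second particle has no "name" attribute at all.
particles = [Particle(name="myname"), Particle(radius=2.0), Particle(name="other")]

# No has_attribute() check is needed before comparing.
matches = [p for p in particles if DefaultValues(p).get_value("name") == "myname"]
```

The default has to be a value that can never match a real query (None here); otherwise particles lacking the attribute would be returned by mistake.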
On Dec 9, 2008, at 9:53 AM, Dina Schneidman wrote:
> I think IndexModel is the best solution, but in C++. having python and
> c++ is more than enough, we don't want to support something else. I
> suggested extending the Model because of the update reason, in this
> case you can update the indexing when you add/remove particles.

I don't understand what you mean.
I mean having an index that lets you find the residue you want in O(1) rather than looping.
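To illustrate the difference between looping and an index, here is a small Python sketch, assuming residue numbers are unique within the indexed set. ResidueIndex and the dictionary-based residues are hypothetical and not part of IMP. A dict gives the average O(1) lookup by residue number, and a sorted list with bisect answers the residue-range use case without scanning every residue.

```python
from bisect import bisect_left, bisect_right


class ResidueIndex:
    """Index residues by number (assumes residue numbers are unique)."""

    def __init__(self, residues, get_index):
        pairs = sorted(((get_index(r), r) for r in residues),
                       key=lambda pair: pair[0])
        self.numbers = [number for number, _ in pairs]     # sorted numbers
        self.residues = [residue for _, residue in pairs]  # same order
        self.by_number = dict(pairs)                       # number -> residue

    def get(self, number):
        """Average O(1) lookup of one residue, or None if absent."""
        return self.by_number.get(number)

    def get_range(self, lb, ub):
        """Residues with lb <= number <= ub, located by binary search."""
        lo = bisect_left(self.numbers, lb)
        hi = bisect_right(self.numbers, ub)
        return self.residues[lo:hi]


# Made-up residues, deliberately out of order.
residues = [{"index": i, "name": "ALA"} for i in (5, 1, 3, 8)]
index = ResidueIndex(residues, get_index=lambda r: r["index"])

residue3 = index.get(3)
in_range = [r["index"] for r in index.get_range(2, 8)]
```

The range here is inclusive at both ends, which is a choice made for the sketch; the list comprehension quoted earlier in the thread uses strict bounds. As with any index, it has to be rebuilt or patched when residues are added or removed.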
we do the same - but this is an ad hoc solution. I think it should not be part of Model, but rather an external IndexModel class or maybe an IndexDecorator. I suggested SQL because your query can be complex, often containing more than one attribute. Ben - you will need it in IMP.Atom - have you already started working on that?
On Tue, 9 Dec 2008, Dina Schneidman wrote:
> Hi,
>
> I need this too (surprisingly). Usually I do it with mapping between
> the particle and the attribute.
> It is simple. however it is unclear where should we put such a
> mapping. Putting it in a model
> could be the best, however not everyone needs it. So it means
> somewhere else or extending the Model to ProteinModel?
>
> Dina
>
> P.S. skype me, we can talk about it
Why use SQL over python or C++ logic?
On Dec 9, 2008, at 9:15 AM, Keren Lasker wrote:
> we do the same - but this is an ad hoc solution.

Agreed, but it might be one of those things (like storing the whole model, restraints and all) where there is unlikely to be a general solution that adds any benefit as the general case is very complex and the ad hoc solution is quite simple.

> It should not be part of Model I think but an external IndexModel
> class maybe or a IndexDecorator.

Again, I think having it act on a whole model is a really bad idea (the IMP equivalent of using global variables :-)

> I suggested SQL because your query can be complex containing often
> more than one attribute.

Why not just use python or C++ rather than add another language to learn? There are a few operations which are more concise in SQL, but I don't know that we want to support them or that they make sense in our case (where we would presumably have to return whole particles).

> Ben - you will need it in IMP.Atom - have you already started
> working on that?

How come? I think I am missing some use case :-)
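As a rough illustration of the "just use python" argument, queries over several attributes, including and/or relationships and ranges, can be written as small composable predicates. This is only a sketch under assumptions: particles are modelled as plain dicts, and has_value, in_range, all_of, any_of and find are hypothetical helper names, not IMP functions.

```python
def has_value(key, value):
    """Predicate: the attribute exists and equals value."""
    return lambda attrs: key in attrs and attrs[key] == value


def in_range(key, lb, ub):
    """Predicate: the attribute exists and lies in [lb, ub]."""
    return lambda attrs: key in attrs and lb <= attrs[key] <= ub


def all_of(*predicates):
    """Logical AND of several predicates."""
    return lambda attrs: all(p(attrs) for p in predicates)


def any_of(*predicates):
    """Logical OR of several predicates."""
    return lambda attrs: any(p(attrs) for p in predicates)


def find(container, predicate):
    """Return the items of container that satisfy predicate."""
    return [item for item in container if predicate(item)]


# Made-up residues: protein name, residue index, structural coverage flag.
residues = [
    {"protein": "A", "index": 5, "covered": True},
    {"protein": "A", "index": 12, "covered": False},
    {"protein": "B", "index": 7, "covered": True},
    {"protein": "B", "index": 40},
]

# Residues 1..10 of protein A that have structural coverage.
query = all_of(has_value("protein", "A"), in_range("index", 1, 10),
               has_value("covered", True))
hits = find(residues, query)

# Anything in protein B, or any residue numbered 10..20.
either = find(residues, any_of(has_value("protein", "B"),
                               in_range("index", 10, 20)))
```

Because the predicates are ordinary functions, any existing code (a distance test, a hierarchy lookup) can be dropped in as one more predicate, which is harder to do from an SQL string. The cost is that every query is a linear scan unless it is backed by an index.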
replied to this question in the email I have just sent :)
On Tue, 9 Dec 2008, Daniel Russel wrote:
> Why use SQL over python or C++ logic?
Daniel - I think that the selection class in modeller is a good example of the types of queries.
On Tue, 9 Dec 2008, Daniel Russel wrote:
> Why use SQL over python or C++ logic?
Keren Lasker wrote:
> Frido and I find ourselves many times need to query particles based on
> attribute values.
> Few such examples: a protein with a specific name, particles with a
> specific residue range.
>
> I think that it would be very useful to have something similar to SQL
> queries on the particles DB.
> Bret might had something similar implemented - but it is probably
> obsolete.
IMP as originally designed was little more than a storage layer. It is true that Bret's Restrainer code was SQL based, but this was MySQL/PHP code that ran on top of IMP, not IMP itself.
> IMP.Atom will probably need such functionality as well.
I can't think of any obvious places in IMP.atom where I'd need that. But it sounds like it would be useful functionality for you. As you saw, Daniel has added an IMP.search module which could help here - I suggest you play with that once he has it working and let him know if it does or doesn't do what you need. ;)
Ben
participants (5)
- Andrej Sali
- Ben Webb
- Daniel Russel
- Dina Schneidman
- Keren Lasker