Proposal: replace BasicScoreFuncParams with PairPotential
I propose to replace BasicScoreFuncParams with classes derived from PairPotential (of course the name is also up for discussion if you hate that one). This would then be used by restraints that need a ScoreFunc generator (Connectivity, ExclusionVolume, PairConnectivity, Proximity). PairPotential classes are simple generators that create new ScoreFunc objects, given two Particles:
class PairPotential { public: virtual ScoreFunc* create_score_func(Particle *p1, Particle *p2) = 0; };
class UpperBoundPairPotential : public PairPotential { public: UpperBoundPairPotential(Float mean, Float stdev); UpperBoundPairPotential(FloatKey radius, Float stdev); ScoreFunc* create_score_func(Particle*, Particle*) { return new HarmonicUpperBound(some_mean, some_stdev); } };
Restraints such as ProximityRestraint then take a pointer to a PairPotential, and own it. The advantage is that ProximityRestraint's two constructors (one using a fixed mean, and the other a fixed particle radius attribute) can now be collapsed into one; the 'add two radii together to get a mean distance' logic is now in the PairPotential. The other advantage is that pair potentials which do not use mean/stdev at all are straightforward. For example, a DOPE potential might look something like:
class DOPEPotential : public PairPotential { public: DOPEPotential(); ScoreFunc* create_score_func(Particle* p1, Particle* p2) { return new CubicSpline(p1->get_value("atom_type"), p2->get_value("atom_type")); } };
Any thoughts/problems? I can of course clarify if any of this is unclear.
Ben
Well, it is not a potential, it generates scoring function terms. Perhaps something like ScoreFunctionCreator would be better.
More substantially, it is not clear to me that we shouldn't have it directly evaluate the score from the particles and skip the function object creation. Creating the ScoreFunc doesn't cache anything useful and just adds another layer of indirection. Internally they can use ScoreFuncs if they want.
On Jan 2, 2008, at 6:19 PM, Ben Webb wrote:
> I propose to replace BasicScoreFuncParams with classes derived from > PairPotential (of course the name is also up for discussion if you > hate > that one). This would then be used by restraints that need a ScoreFunc > generator (Connectivity, ExclusionVolume, PairConnectivity, > Proximity). > PairPotential classes are simple generators that create new ScoreFunc > objects, given two Particles: > > class PairPotential > { > public: > virtual ScoreFunc* create_score_func(Particle *p1, Particle *p2) = > 0; > }; > > class UpperBoundPairPotential : public PairPotential > { > public: > UpperBoundPairPotential(Float mean, Float stdev); > UpperBoundPairPotential(FloatKey radius, Float stdev); > ScoreFunc* create_score_func(Particle*, Particle*) { > return new HarmonicUpperBound(some_mean, some_stdev); > } > }; > > Restraints such as ProximityRestraint then take a pointer to a > PairPotential, and own it. The advantage is that ProximityRestraint's > two constructors (one using a fixed mean, and the other a fixed > particle > radius attribute) can now be collapsed into one; the 'add two radii > together to get a mean distance' logic is now in the PairPotential. > The > other advantage is that pair potentials which do not use mean/stdev at > all are straightforward. For example, a DOPE potential might look > something like: > > class DOPEPotential : public PairPotential > { > public: > DOPEPotential(); > ScoreFunc* create_score_func(Particle* p1, Particle* p2) { > return new CubicSpline(p1->get_value("atom_type"), > p2->get_value("atom_type")); > } > }; > > Any thoughts/problems? I can of course clarify if any of this is > unclear. > > Ben > -- > ben@salilab.org http://salilab.org/~ben/ > "It is a capital mistake to theorize before one has data." > - Sir Arthur Conan Doyle > _______________________________________________ > IMP-dev mailing list > IMP-dev@salilab.org > https://salilab.org/mailman/listinfo/imp-dev
Daniel Russel wrote: > Well, it is not a potential, it generates scoring function terms. > Perhaps something like ScoreFunctionCreator would be better.
I said you'd hate it. ;) But no matter. On closer inspection I suspect it wouldn't work anyway without rather close coupling to the restraints (e.g. ProximityRestraint). The clone & translate approach we discussed before is probably more applicable to Bret's restraints.
> More substantially, it is not clear to me that we shouldn't have it > directly evaluate the score from the particles and skip the function > object creation. Creating the ScoreFunc doesn't cache anything useful > and just adds another layer of indirection. Internally they can use > ScoreFuncs if they want.
I'm not sure what you mean here - could you expand a little? It seems to me that they wouldn't be able to "directly evaluate the score from the particles" because that's the restraint's job. And a harmonic scoring function shouldn't care about the particles anyway - it could be acting on a distance (two particles), a dihedral (four particles) etc. But perhaps I'm missing part of your idea here.
Ben
>> Lets step back for a bit. There are several types of choices that we want to be able to parameterize (pairwise) restraints by: - The choice of pairs (all pairs, bipartite pairs, nearby pairs etc.) This is clearly the job of the restraint and each type should probably be a different type of restraint. - Choice of the attributes being scored for the pairs. Is it the L2 distance? The L1? The X distance? What else? Direction of difference vector? This probably should also be in the restraint, but perhaps not. - Choice of the shape of the scoring function. This is mostly orthogonal to the other choices and so should probably be a class that is a parameter to the Restraint. In shape, I would include steepness of a harmonic. - Choice of the minimum for the scoring function. This is only a valid choice for some types of ScoreFunc (statistical potentials for example, tend to have it already in the function). As a result either we need two sorts of restraints, or some ScoreFuncGenerators need to ignore this (ick). I suspect that any ScoreFunc for which shifting based on the particles makes sense, we are shifting the minimum.
My suggestion would be: - A set of SingleVariableFunctions which are created some way or another and take a float and produce the score and the derivative. These include - Harmonic - Linear - HarmonicLowerBound - etc. - A set of ParticlePairScoreFunctions. These take a pair of particles and a DerivativeAccumulator and compute a score and the derivatives. These include - DistanceScoreFunction which takes a SingleVariableFunction and computes the L2 distance and computes its energy - SphereDistanceScoreFunction which takes a SingleVariableFunction and a name for a radius attribute and computes the energy (reducing the distance by the radius before passing it to the ScoreFunction - A set of restraints which take lists of particles. In general, there is no reason to do as the restraints currently do and create a large number of sub restraints. These only make the code longer and take more memory. Instead, the Restraints can simply call the ScoreFunc or ParticlePairScoreFunction as needed. The restraints include: - AllPairsRestraint which is paramaterized by a ParticlePairScoreFunction and calls it for all pairs - NearbyPairsRestraint which uses the mythical non-bonded list and a ParticlePairScoreFunction - BondedPairsRestraint which uses the mythical bonded list and directly uses the Harmonic SingleVariableFunction. - SphericalRestraint (or whatever I called it) which takes the SingleVariableFunction directly and computes distance to a fixed point
It think those three classes of classes allow us to cover everything with only a few types and keep everything fairly simple. We don't actually need to shift the scoring functions as you can always shift the distance or whatever you are scoring. Loosening or tightening of restraints can occur by changing the object stored in the restraint since there is always only one. For example, if you want to loosen the exclusion volume restraint, you change the HarmonicLowerBound you passed to the DistanceScoreFunction you passed to the NearbyPairsRestraint. Since you set them all up, and they are never duplicated, you can easily modify them.
Each of the Restraints listed is about 15 lines of broiler plate and 15-20 lines of interesting code (judging from my ExclusionVolume and BondRestraints).
The ParticlePairScoreFunctions would, among other things, subsume the non-trivial code from the current DistanceRestraint making it more easily usable.
SingleVariableFunctions would be as simple as ScoreFuncs (a line or two of interesting code).
>> More substantially, it is not clear to me that we shouldn't have it >> directly evaluate the score from the particles and skip the function >> object creation. Creating the ScoreFunc doesn't cache anything useful >> and just adds another layer of indirection. Internally they can use >> ScoreFuncs if they want. > > I'm not sure what you mean here - could you expand a little? It > seems to > me that they wouldn't be able to "directly evaluate the score from the > particles" because that's the restraint's job. And a harmonic scoring > function shouldn't care about the particles anyway - it could be > acting > on a distance (two particles), a dihedral (four particles) etc. But > perhaps I'm missing part of your idea here. In the proposal you have, from what I can tell, you create a generator which takes (perhaps implicitly) the name of the attribute(s) you are restraining. The restraint then uses the generator to create a bunch of ScoreFuncs one per pair of particles. I just realized that there are two options for what happens next:
What you probably actually meant was that the Generator processes the pair of particles and then returns a ScoreFunc (as it currently is) which is at some later time passed the distance. I don't really like that as the knowledge of what exactly is being restrained is spread in two places--part in the score function generation (or the choice of ScoreFunc generator) and part in the Restraint which has to generate the distance. I don't like this and don't see that we end up with a very general mechanism
What I had assumed you intended to have happen was, creating the ScoreFunc just involves copying data from the generator into a new object and storing the Particle*s. The created scorefunc then doesn't take any arguments other than the DerivativeAccumulator on evaluate. If this is what was intended, I would suggest that instead, the (now no longer a) generator could take the two particles and simply return the energy (and set the derivatives). You have the same amount of external control, and fewer objects are created. We then can easily create Restraints based around various types of pairs of particles (nearby ones, far away ones, bipartite ones etc.) Not sure if it is interesting but at least it is a clear separation of what goes in the Restraint (choice of pairs) and what goes in the PairScoreFunc (score an a pair of particles).
A couple comments:
I suspect that we ultimately will not want the compound restraints to create a (large) set of DistanceRestraints rather than just evaluate things on the fly. So decoupling the ScoreFunc generation and evaluation is not necessarily useful.
In addition, despite Bret's lengthy restraints, new Restraints which do various things over all pairs or all nearby pairs of points should only take a few lines of code (my ExclusionVolume and Bond restraints are each <15 lines of interesting code and another 15 or so of broiler plate), so I don't think a complicated and brittle mechanism to make more general in some way is a good use of time. Something simple like scaling and shifting of ScoreFuncs should be enough.
Daniel Russel wrote: > My suggestion would be: > - A set of SingleVariableFunctions which are created some way or > another and take a float and produce the score and the derivative. > These include > - Harmonic > - Linear > - HarmonicLowerBound > - etc.
How is this different from the current ScoreFuncs? (Or Modeller math forms, for that matter.) It seems like this is just a rename, yes?
> - A set of ParticlePairScoreFunctions. These take a pair of particles > and a DerivativeAccumulator and compute a score and the derivatives.
And these would be functors? Taking a SingleVariableFunction as a parameter to the constructor, and two Particle* in the operator() ?
> - A set of restraints which take lists of particles. In general, there > is no reason to do as the restraints currently do and create a large > number of sub restraints. These only make the code longer and take > more memory. Instead, the Restraints can simply call the ScoreFunc or > ParticlePairScoreFunction as needed.
Of course, and only some of the existing restraints work that way right now - my understanding is that Bret wrote them that way because 1) he didn't want to write a nonbonded list and 2) he liked to do everything at constructor time. There's no reason you couldn't do that at evaluate time instead. And if somebody wants to use sub restraints, we should certainly let them. (As I understand it, your proposal does not prohibit this.)
To clarify, are you suggesting that DistanceRestraint would simply instantiate and call a DistanceScoreFunction?
> The ParticlePairScoreFunctions would, among other things, subsume the > non-trivial code from the current DistanceRestraint making it more > easily usable.
Can you clarify what you mean here? It seems like you'd just be moving the code wholesale from one class to the other. How does it simplify it?
Ben
On Jan 3, 2008, at 12:45 AM, Ben Webb wrote: > > How is this different from the current ScoreFuncs? (Or Modeller math > forms, for that matter.) It seems like this is just a rename, yes? Yes, no changes there, just a rename to make them descriptive and decouple them from the idea that they are anything more than just a function.
>> - A set of ParticlePairScoreFunctions. These take a pair of particles >> and a DerivativeAccumulator and compute a score and the derivatives. > > And these would be functors? Taking a SingleVariableFunction as a > parameter to the constructor, and two Particle* in the operator() ? The constructor is not part of the ParticlePairScoreFunction interface, but many of the constructors for various PairScoreFunctions would probably take a SingleVariableFunction. And operator() or evaluate() would take two particles and a DA.
>> > Of course, and only some of the existing restraints work that way > right > now - my understanding is that Bret wrote them that way because 1) he > didn't want to write a nonbonded list and 2) he liked to do everything > at constructor time. There's no reason you couldn't do that at > evaluate > time instead. And if somebody wants to use sub restraints, we should > certainly let them. (As I understand it, your proposal does not > prohibit > this.) I am would not prohibit it, but I would not try to make it easy. Much of the complication with the current design is from trying to make it easy. Since - I can't think of a case where creating trivial sub restraints would simplify the code - we need to support generating the sub restraints on the fly since we will eventually have the non-bonded list, - and the existing restraints are an awful mess, I would like switch the gestalt.
> To clarify, are you suggesting that DistanceRestraint would simply > instantiate and call a DistanceScoreFunction? Yes. The main simplification here is that we would forget about translating ScoreFuncs since that would be handled by the restraint (not a DistanceRestraint) translating the distance appropriately ahead of time rather than forcing everything to be a plain vanilla DistanceRestraint.
If for some reason you want to create sub-restraints, you would write a DistanceRestraint-like class which doesn't delete its PairScoreFunction on destruction and create a mess of those.
> > >> The ParticlePairScoreFunctions would, among other things, subsume the >> non-trivial code from the current DistanceRestraint making it more >> easily usable. > > Can you clarify what you mean here? It seems like you'd just be moving > the code wholesale from one class to the other. How does it simplify > it? Two ways. First, it is nice to be able to use the code without explicitly creating a new Restraint per particle pair. This would allow such usage. My NonbondedRestraint would just call the evaluate of a ParticlepairScoreFunction for each pair passed by the nonbonded list rather than having to create and destroy a DistanceRestraint.
Second I remove having one class of objects create instances of another class. A restraint either takes a PairScoreFunction which it keeps internally or a SingleVariableFunction which it likewise keeps internally. Neither of those create other objects on the heap and we don't provide any easy mechanism to do so.
Since my emails are long, here in brief are the ideas (which can be chosen independently): 1) forget about function generators and just make the restraints shift the distance value itself before calling the ScoreFunc. Then each restraint just has one ScoreFunc and so modifications to it (such as reducing its stiffness) are easy. 2) separate the choice of pairs of particles from the actual computation on the pair. So we have a set of restraints each of which applies a computation to a differently generated set of pairs and a set of computation functions each of which simply computes on a pair.
On Jan 3, 2008, at 12:45 AM, Ben Webb wrote:
> Daniel Russel wrote: >> My suggestion would be: >> - A set of SingleVariableFunctions which are created some way or >> another and take a float and produce the score and the derivative. >> These include >> - Harmonic >> - Linear >> - HarmonicLowerBound >> - etc. > > How is this different from the current ScoreFuncs? (Or Modeller math > forms, for that matter.) It seems like this is just a rename, yes? > >> - A set of ParticlePairScoreFunctions. These take a pair of particles >> and a DerivativeAccumulator and compute a score and the derivatives. > > And these would be functors? Taking a SingleVariableFunction as a > parameter to the constructor, and two Particle* in the operator() ? > >> - A set of restraints which take lists of particles. In general, >> there >> is no reason to do as the restraints currently do and create a large >> number of sub restraints. These only make the code longer and take >> more memory. Instead, the Restraints can simply call the ScoreFunc or >> ParticlePairScoreFunction as needed. > > Of course, and only some of the existing restraints work that way > right > now - my understanding is that Bret wrote them that way because 1) he > didn't want to write a nonbonded list and 2) he liked to do everything > at constructor time. There's no reason you couldn't do that at > evaluate > time instead. And if somebody wants to use sub restraints, we should > certainly let them. (As I understand it, your proposal does not > prohibit > this.) > > To clarify, are you suggesting that DistanceRestraint would simply > instantiate and call a DistanceScoreFunction? > >> The ParticlePairScoreFunctions would, among other things, subsume the >> non-trivial code from the current DistanceRestraint making it more >> easily usable. > > Can you clarify what you mean here? It seems like you'd just be moving > the code wholesale from one class to the other. How does it simplify > it? > > Ben > -- > ben@salilab.org http://salilab.org/~ben/ > "It is a capital mistake to theorize before one has data." > - Sir Arthur Conan Doyle > _______________________________________________ > IMP-dev mailing list > IMP-dev@salilab.org > https://salilab.org/mailman/listinfo/imp-dev
Daniel Russel wrote: > Since my emails are long, here in brief are the ideas (which can be > chosen independently): > 1) forget about function generators and just make the restraints shift > the distance value itself before calling the ScoreFunc. Then each > restraint just has one ScoreFunc and so modifications to it (such as > reducing its stiffness) are easy. > 2) separate the choice of pairs of particles from the actual > computation on the pair. So we have a set of restraints each of which > applies a computation to a differently generated set of pairs and a > set of computation functions each of which simply computes on a pair.
I certainly agree that reducing the number of objects is a good thing. But I'm not sure what you're proposing for (1). Let's say you want an LJ restraint (but let's pretend we can model it with a harmonic term, since we have that already). So this would need to iterate over all pairs of atoms (potentially via a nonbonded list). This you propose should be in the restraint (as it is now, but at evaluate rather than constructor time). Yes?
But what happens next? As I understand it, you propose that this restraint would have a single harmonic ScoreFunc and a single Pair function. Restraint.evaluate calls Pair.something() for each pair, which calculates the distance and then the score. But it's this last step which I'm unclear on. Where do the mean and the stdev (or width or stiffness) live? They are different for every atom pair, so they surely can't live in the ScoreFunc if you have only one. Maybe some pseudocode would elucidate this.
Ben
> Maybe some pseudocode > would elucidate this. So lets look at evaluating an excluded volume potential for all nearby pairs. So we would have class NearbyPairsRestraint { NearbyPairsRestraint(PairScoreFunction *f); evaluate(da) { foreach nearby pair: f->evaluate(p0,p1, da) } }
Then we use a very general class which applies a ScoreFunc to the distance between two spheres: class SphereDistanceScore { SphereDistanceScore(FloatKey radius, ScoreFunc *f); evaluate(p0, p1) { radius= p0->radius()+p1->radius(); dist = distance(p0,p1); return f->evaluate(dist-radius); } } sf= IMP.SphereDistnaceScore(FloatKey("radius"), IMP.Harmonic(0, .1)) r= IMP.NearbyPairsRestraint(sf)
If we decided we want to move to a realish Leonard Jones potential, then we would replace the SphereDistanceScore since LJ isn't a simple function of distance. We could write a little class (ignoring the disassociation energy)
class LeonardJonesPairScore { evaluate(p0,p1) { radius = p0.get_radius()+p1.get_radius(); dist= distance(p0,p1) rat=radius/dist return rat^12-rat^6; } } Then you would just do r= IMP.NearbyPairsRestraint(particles, IMP.LeonardJonesPairScore(IMP.FloatKey("radius")))
Doing this would take a lot of work in the current framework as you would have to rewrite all the involved restraints.
Basically, by writing either a trivial PairScore or a trivial ScoreFunc you could easily generate complicated custrom restraints. Or you could choose whether you want the center distance or the sphere distance or the x distance or something else and what potential you want and compose things that way and then simply apply them to all nearby pairs or all pairs from a list or whatever.
Daniel Russel wrote: >> Maybe some pseudocode >> would elucidate this. > So lets look at evaluating an excluded volume potential for all nearby > pairs. ...
OK, looks reasonable, but could you show it with derivatives as well? I want to see how you propose combining these.
> If we decided we want to move to a realish Leonard Jones potential
The reason I chose a harmonic term for the example is because both the mean and the standard deviation would have to be tweaked, and I don't see how you propose to do that with your proposal. I don't doubt that you can write a 6-12 term, but you cheated and changed the example. ;) (And it's odd that you say LJ isn't a simple function of distance, because Modeller does have a 6-12 math form, so there's nothing to stop you restraining an angle with a 6-12 form if you were sufficiently inclined.)
Ben
On Jan 4, 2008, at 11:59 AM, Ben Webb wrote:
> Daniel Russel wrote: >>> Maybe some pseudocode >>> would elucidate this. >> So lets look at evaluating an excluded volume potential for all >> nearby >> pairs. > ... > > OK, looks reasonable, but could you show it with derivatives as > well? I > want to see how you propose combining these. Ummm, sure. I don't see how that complicates anything. Separate email.
> > >> If we decided we want to move to a realish Leonard Jones potential > > The reason I chose a harmonic term for the example is because both the > mean and the standard deviation would have to be tweaked, and I don't > see how you propose to do that with your proposal. I skipped changing the mean because you don't really have to. If the case if a repulsive potential, what you really want is a harmonic lower bound on the distance between the two spheres (not between their centers). So I passed the distance between the two spheres to the ScoreFunc instead of changing the mean.
My proposal has to be made more complicated if you want to change the standard deviation on a per-pair basis (rather than on a per-restraint basis), but I think if you are doing that you have moved beyond a general ScoreFunc into something more specific. We can always add subclasses of ScoreFunc which take extra parameters if needed (and the replacement for SphereDistanceScore could get the values for these parameters from the particles as it does with the radius).
> (And it's odd that you say LJ isn't a simple function of distance, > because Modeller does have a 6-12 math form, so there's nothing to > stop > you restraining an angle with a 6-12 form if you were sufficiently > inclined.) What I meant is that the shape changes as you change the radius of the particles, it is not just a shift of the minimum. So you can't just shift the origin for computations as you can in the harmonic case. You could have a 6-12 ScoreFunc and pass it the ratio of the radii to the distance, but this would require (in my proposal) a different PairScoreFunction, or in the current model, a whole new restraint.
Daniel Russel wrote: > The previous example with derivatives and a bit more detail.
Looks good to me overall.
> Note that I haven't really looked at Ben's vector3 and haven't tied it > to XYZDecorator (maybe Ben has), but it I expect all the needed > operations will be there eventually.
Sure, it's just a simple triple of coordinates, so it's not hard to add new methods. XYZDecorator currently has a single method get_vector_to() which returns a vector between two particles.
Assuming others don't suddenly pop up with problems, this seems like a solid route to me. I propose:
1. We settle on names for the two base classes. I suggest SingleVariableFunction and ParticlePairScorer. Unless I'm still not understanding your proposal correctly, most non-bonded and/or dynamic restraints will use a ParticlePairScorer underneath (which may in turn use a SingleVariableFunction) while some static restraints may use a SingleVariableFunction directly.
2. I will do the rename of ScoreFunc to SingleVariableFunction immediately after (1) is decided.
3. Let's leave the evil Restrainer restraints alone for now, at least until I get a chance to write some proper tests for them (you may recall that one of the tests passed just fine without any particles in the system). This means leaving BasicScoreFuncParams for the time being (don't worry - it'll die - just read on).
4. Whoever ends up needing a ParticlePairScorer first can write the code. ;) New restraints can use the new framework.
5. Once we have a nice collection of scorers, somebody (probably me, unless others are desperate for this thankless task) can fix the Restrainer restraints and get rid of BasicScoreFuncParams.
Ben
participants (2)
-
Ben Webb
-
Daniel Russel