I think the three alternatives for weights are 1) Hide it in the restraint class. To make this clean, we should hide the ModelData* and mirror its functions in the Restraint base class (strictly speaking we only need to mirror add_to_deriv, but for symmetry, we should do all if we do one). For the return value, I would suggest adding a function weighted_eval which is called by the Model or RestraintSet instead of eval, and which calls eval internally. Then, anyone who wants to override weighted_eval directly will know they need to deal with the weights themselves. Note that this proposal breaks if we move to a "Particles as Objects" framework, since adding to the deriv involves an arbitrary function call.
2) Keep it external to the restraint. To do this we would give each restraint a separate ModelData which has a weight in it and scales add_to_deriv calls accordingly. The external user would be responsible for weighting the output. I don't like this as much as 1.
3) Store the weight in the Restraint base and hope the Restraints scale everything themselves. It is pretty easy to write a test that checks whether a restraint is doing the right thing, and such a test could be put in the model and run on each added restraint if debugging checks are enabled. This is the only approach of the three that will work if we go to particles as objects. We can provide macros that make the code a bit simpler too.
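Option 1 might look roughly like the following Python sketch (the real code would be C++; ModelData, weighted_evaluate and add_to_derivative echo the names in this thread, and the toy HarmonicRestraint is invented purely for illustration):

```python
class ModelData:
    """Toy stand-in: holds derivative accumulators for float attributes."""
    def __init__(self):
        self.derivs = {}

    def add_to_deriv(self, key, value):
        self.derivs[key] = self.derivs.get(key, 0.0) + value


class Restraint:
    def __init__(self, model_data, weight=1.0):
        # Name-mangled, so subclasses cannot reach it directly --
        # loosely mimicking a private ModelData* in C++.
        self.__model_data = model_data
        self.weight = weight

    def weighted_evaluate(self, calc_deriv):
        # The Model/RestraintSet calls this instead of evaluate()
        return self.weight * self.evaluate(calc_deriv)

    def add_to_derivative(self, key, value):
        # Mirrored ModelData function: the weight is applied for free
        self.__model_data.add_to_deriv(key, self.weight * value)


class HarmonicRestraint(Restraint):
    """Toy restraint scoring 0.5*k*x**2 on a single attribute."""
    def __init__(self, model_data, key, x, k=1.0, weight=1.0):
        super().__init__(model_data, weight)
        self.key, self.x, self.k = key, x, k

    def evaluate(self, calc_deriv):
        if calc_deriv:
            self.add_to_derivative(self.key, self.k * self.x)
        return 0.5 * self.k * self.x ** 2
```

With weight 0.5, x = 2 and k = 1, weighted_evaluate returns 1.0 and the accumulated derivative is 0.5 * k * x = 1.0; the restraint implementer never touches the weight.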
If we don't change to the new particle design, I think the first is the best. If we do, we are stuck with the third (and had better write a bunch of macros and test code).
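The scaling test mentioned for option 3 could be as simple as this sketch (all names hypothetical): evaluate a restraint with weight 1 and with weight w, and check that the scores differ by exactly the factor w.

```python
def check_restraint_scaling(restraint, w=2.0, tol=1e-9):
    """Option 3 debug check: a restraint that honors its weight must
    return w times its unweighted score when the weight is set to w."""
    old = restraint.weight
    try:
        restraint.weight = 1.0
        unweighted = restraint.evaluate(False)
        restraint.weight = w
        weighted = restraint.evaluate(False)
    finally:
        restraint.weight = old
    return abs(weighted - w * unweighted) <= tol


class GoodRestraint:
    """Toy restraint that correctly applies its own weight."""
    weight = 1.0
    def evaluate(self, calc_deriv):
        return self.weight * 3.0


class BadRestraint:
    """Toy restraint that forgets to scale -- the check catches it."""
    weight = 1.0
    def evaluate(self, calc_deriv):
        return 3.0
```

Run on every added restraint when debugging checks are enabled, this catches restraints that silently ignore their weight.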
As for changing the distance on a distance restraint, it seems like you should just add code to set the constant in the appropriate restraint class. Seems like a useful thing to have.
On Nov 5, 2007, at 1:36 PM, Friedrich Foerster wrote:
> hi all,
>
> i think we urgently need a solution for integrating different weights of terms into the scoring function to get real applications going.
>
> Schedules in optimization
>
> Old fashioned - but working - optimization strategies in modeling often use variable target functions. That means the scoring function is gradually changed in the course of optimization. In MODELLER, Andrej probably spent significant time of his thesis figuring out a reasonable way of varying the different terms of the scoring function to eventually obtain the best (=lowest scoring) results. Although the eventual relative weight of all terms is 1, finding the global minimum can be greatly facilitated by 'guiding' the optimization, removing high barriers that make the optimization get stuck in its early stages.
>
> In particular, it is beneficial to vary the term penalizing steric clashes S_vol (soft-sphere overlap) during optimization, ie have it practically zero in the beginning of the optimization to allow the particles to rearrange drastically, and then gradually increase the weight for this term. In MODELLER a 'schedule' was used for this purpose. A schedule was a list of the relative weights of the scoring terms.
>
> IMP currently does not have any way of easily adjusting the different weights of the terms constituting the score. The only way of changing the restraints is to generate new ones, eg for volume exclusion one would need to regenerate these restraints each time, specifying a different standard deviation every time, which is awkward in my eyes.
>
> I would be eager to have some kind of schedule in IMP, ie an easy way to alter the relative contribution of terms to the score.
> Specifically, one could envision the following things being useful during an optimization run for the volume exclusion restraint:
>
> * scale S_vol differently at different stages of optimization
> * vary radii (=minimum distance in restraint) at different stages of optimization
> Technically, the weight could either be the same for a given kind of restraint, or possibly even a property of an individual restraint. In the latter case, specific atom pairs could be penalized less than others for clashes, which might make sense if the accuracy of the representation (eg a rigid body) is lower/higher for specific particles.
>
> I do not favor any particular way of integrating a weight. One option - and currently a parameter in the EM score - is to call the restraint with a given weight. The term (and derivative) will be added to the overall functions using this weight.
>
> Any ideas/suggestions/preferences?
>
> frido
>
> also found at: https://salilab.org/internal/wiki/IMP/schedule
>
> --
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> % Friedrich Foerster                                            %
> % Andrej Sali Lab                                               %
> % University of California at San Francisco                     %
> % MC 2552                                                       %
> % Byers Hall Room 501                                           %
> % 1700 4th Street                                               %
> % San Francisco, CA 94158-2330, USA                             %
> %                                                               %
> % phone: +1 (415) 514-4258                                      %
> % fax: +1 (415) 514-4231                                        %
> %                                                               %
> % www.salilab.org/~frido                                        %
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
anything that works reasonably effective and is understandable even to a physicist with rudimentary programming skills is welcome. some comments below.
frido
Daniel Russel wrote:
> I think the three alternatives for weights are
> 1) Hide it in the restraint class. To make this clean, we should hide the ModelData* and mirror its functions in the Restraint base class (strictly speaking we only need to mirror add_to_deriv, but for symmetry, we should do all if we do one). For the return value, I would suggest adding a function weighted_eval which is called by the Model or RestraintSet instead of eval and calls eval internally. Then, anyone who wants to override that directly will know they need to deal with the weights themselves. Note that this proposal breaks if we move to a "Particles as Objects" framework since adding to the deriv involves an arbitrary function call.

?! i probably need a translation for dummies... making a scaling factor part of a restraint seems logical to me - although i do not fully understand why it should be hidden. scaling is not an evil thing, no reason to be ashamed of it.
> 2) Keep it external to the restraint. To do this we would give each restraint a separate ModelData which has a weight in it and scales add_to_deriv calls accordingly. The external user would be responsible for weighting the output. I don't like this as much as 1.

1 sounds better to me, although again i only grasp a vague idea.

> 3) Store the weight in the Restraint base and hope the Restraints scale everything themselves. [...] This is the only approach of the three that will work if we go to particles as objects. We can provide macros that make the code a bit simpler too.

i am not quite sure i understand it. maybe you mean that all restraints are fulfilled in the 'correct' model, as frank always reasons. that is unfortunately difficult with EM data, which are never completely fulfilled (a correlation of one is never achieved with noisy experimental data). the same holds for X-ray crystallography patterns or SAXS spectra and probably many other biophysical sources of information.

> If we don't change to the new particle design, I think the first is the best. If we do, we are stuck with the third (and had better write a bunch of macros and test code).

i would always prefer the simple solutions. dinosaurs like me already barely grasp the intricate structure of imp, and further macros etc won't improve the situation.

> As for changing the distance on a distance restraint, it seems like you should just add code to set the constant in the appropriate restraint class. Seems like a useful thing to have.

> On Nov 5, 2007, at 1:36 PM, Friedrich Foerster wrote:
>> hi all,
>>
>> i think we urgently need a solution for integrating different weights of terms into the scoring function to get real applications going.
>> I think the three alternatives for weights are 1) Hide it in the restraint class. [...]

> ?! i probably need a translation for dummies... making a scaling factor part of a restraint seems logical to me - although i do not fully understand why it should be hidden. scaling is not an evil thing, no reason to be ashamed of it.

The reason to hide it is so that implementers of restraints don't have to remember to scale everything properly; it just happens for free. The idea would be that:
- you implement a MyRestraint::evaluate function
- the value it returns is weighted properly by the Restraint::weighted_evaluate function which called evaluate
- MyRestraint::evaluate adds to the derivative by calling Restraint::add_to_derivative, which scales the value appropriately before passing it to the ModelData.

In order to make sure that you do this and don't slip up and accidentally call ModelData::add_to_deriv directly, all the ModelData access and set functions that MyRestraint needs are implemented in Restraint, and the Restraint's ModelData pointer is declared private.
So the changes from the current system are only that instead of calling Restraint::get_model_data()->get_float() you leave out the "get_model_data()->" part (and likewise for add_to_deriv). So it would just simplify things.
>> 2) Keep it external to the restraint. To do this we would give each restraint a separate ModelData which has a weight in it and scales add_to_deriv calls accordingly. [...]

> 1 sounds better to me although again i only grasp a vague idea.

Me too (to the second part). I think in practice option 2 might be a bit annoying and complicated to implement; I haven't thought about it too much. I am pretty sure the first one would work out to be simpler in the long run; it would just require some simple changes to Restraints, whereas this way would not.
>> 3) Store the weight in the Restraint base and hope the Restraints scale everything themselves. [...]

> i am not quite sure i understand it. maybe you mean that all restraints are fulfilled in the 'correct' model as frank always reasons.

I don't see where I wrote that, so I can't tell you what I meant :-) Anyway, there need be no such assumption.
Daniel Russel wrote:
> I think the three alternatives for weights are
> 1) Hide it in the restraint class. To make this clean, we should hide the ModelData* and mirror its functions in the Restraint base class (strictly speaking we only need to mirror add_to_deriv, but for symmetry, we should do all if we do one). For the return value, I would suggest adding a function weighted_eval which is called by the Model or RestraintSet instead of eval and calls eval internally. Then, anyone who wants to override that directly will know they need to deal with the weights themselves. Note that this proposal breaks if we move to a "Particles as Objects" framework since adding to the deriv involves an arbitrary function call.
I like this option, because I think it's a very bad idea to rely on every restraint to do the scaling itself. But I don't think it'll work for restraints which contain other restraints - for example, I may want a restraint set which contains a bunch of other restraints, and would then expect each restraint to be scaled twice - once by its own scale factor and once by the set's. It seems like option (2) would be able to handle this more easily - either that, or we use option (1) together with something like the DerivativeAccumulator that Keren proposed. Why don't you like option 2?
Ben
Ben Webb wrote:
> I like this option, because I think it's a very bad idea to rely on every restraint to do the scaling itself. But I don't think it'll work for restraints which contain other restraints - for example I may want a restraint set which contains a bunch of other restraints, and would then expect each restraint to be scaled twice - once by its own scale factor and once by the set's.

Good point, I didn't think about the nested case. It could be done by having the outer restraint make sure that the inner restraints' weights are the product of the outer weight and the inner weight. This does make writing nested restraints more complicated than non-nested ones.
> It seems like option (2) would be able to handle this more easily - either that or we use option (1) together with something like the DerivativeAccumulator that Keren proposed. Why don't you like option 2?

Option 2 basically is the DerivativeAccumulator. On reflection, I do like it.
- The restraint has a single evaluate function which takes a pointer to a DerivativeAccumulator (which is NULL if no derivs are to be accumulated). No more bool. Can we do this with swig? I would assume None gets translated into NULL. This has the added advantage that you can't set derivs if you are not supposed to.
- To evaluate a restraint (either in the Model or as an outer nested restraint) you get the weight, w, and tell the DA to multiply its weight by w (push it onto a stack of weights). Then you call evaluate, multiply the return value by w, and tell the DA to pop the last weight.
- The weights are stored separately from the restraints, since they are no longer handled by the restraint.
- The add_to_deriv functions on ModelData are hidden.
This still has problems with Particles as Objects, but I like it otherwise and change my vote :-)
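The stack-of-weights scheme described here might look like this in Python (hypothetical names; the real DerivativeAccumulator would be C++): callers push a weight before evaluating a nested restraint and pop it afterwards, and the effective weight is the product of everything pushed.

```python
class DerivativeAccumulator:
    """Sketch of the stack-of-weights scheme (names are assumptions,
    not actual IMP API)."""
    def __init__(self):
        self.derivs = {}
        self._weights = [1.0]  # stores running products, so lookup is O(1)

    def push_weight(self, w):
        self._weights.append(self._weights[-1] * w)

    def pop_weight(self):
        self._weights.pop()

    def add_to_deriv(self, key, value):
        # Scale by the product of all currently-pushed weights
        self.derivs[key] = self.derivs.get(key, 0.0) + self._weights[-1] * value
```

A restraint set nested inside another set pushes its own weight on top of its parent's, so derivatives automatically pick up the product of the two.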
BTW: the more observant may have noticed that this is now a "proper" mailing list. So please address further emails to imp-dev@salilab.org, and direct interested people in the lab to the IMP wiki page (which links to the mailing list archive and the sign-up page). imp@salilab will redirect to imp-dev for a while, but you'll get a 'message has implicit destination' error which I have to override. So use imp-dev. (It's imp-dev rather than imp because at some point there will be an imp-users as well, and perhaps an imp-commits if people want it, for notification of SVN commits.)
Daniel Russel wrote: > Ben Webb wrote: >> It seems like option (2) would be able to handle this more easily - >> either that or we use option (1) together with something like the >> DerivativeAccumulator that Keren proposed. Why don't you like option 2? > Option 2 basically is the DerivativeAccumulator. On reflection, I do > like it. > - The restraint has a single evaluate function which takes a pointer to > a DerivativeAccumulator (which is NULL if no derivs are to be > accumulated). No more bool. Can we do this with swig? I would assume > NULL gets translated into null. This has the added advantage that you > can't set derivs if you are not supposed to.
Yes, that should be fine. Python 'None' should go through as a C++ null pointer. If it doesn't, then it would be easy to write a typemap for.
> - To evaluate a restraint (either in the Model or as an outer nested > restraint) you get the weight, w, and tell the DA to multiply its weight > by w (push it on to a stack of weights). Then you call evaluate, > multiply the return value by w and tell the DA to pop the last weight. > - the weights are stored separately from the restraints since they are > no longer handled by the restraint
I don't like stacks like this, for many reasons (threading and exception safety spring to mind). And I'm not sure where you'd store the weight, if not in the restraint. Why not just in the Restraint base class?
My alternative to stacks: restraints which contain other restraints would use a copy-like constructor to make a cloned DerivativeAccumulator, multiplied by their own weight, and then pass that to their child restraints' evaluate methods.
Ben
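That clone-based alternative could be sketched roughly as follows (Python for brevity; all names are invented for illustration):

```python
class DerivativeAccumulator:
    """Clone-based variant: instead of a mutable stack, a nested
    restraint set makes a weighted copy that shares the same storage
    and hands that to its children -- so there is no pop to forget on
    the way out, and no shared mutable state for threads or exceptions
    to trip over."""
    def __init__(self, derivs=None, weight=1.0):
        self.derivs = {} if derivs is None else derivs
        self.weight = weight

    def scaled(self, w):
        # Copy-like constructor: same storage, multiplied weight
        return DerivativeAccumulator(self.derivs, self.weight * w)

    def add_to_deriv(self, key, value):
        self.derivs[key] = self.derivs.get(key, 0.0) + self.weight * value
```

Each level of nesting calls scaled() once, so a child two sets deep accumulates with the product of both set weights, while the top-level accumulator sees every contribution.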
> And I'm not sure where you'd store the weight, > if not in the restraint. Why not just in the Restraint base class? > The Restraint never actually uses its weight. It is a property of how the thing calling the restraint combines the weights, not of the restraint itself. And so should go with the thing calling the restraint. Plus, this way we can reuse a restraint with different weights in different places if we should so desire. > My alternative to stacks: restraints which contain other restraints > would use a copy-like constructor to make a cloned > DerivativeAccumulator, multiplied by their own weight, and then pass > that to their child restraints' evaluate methods. > I like it.
Daniel Russel wrote: >> And I'm not sure where you'd store the weight, >> if not in the restraint. Why not just in the Restraint base class? >> > The Restraint never actually uses its weight. It is a property of how > the thing calling the restraint combines the weights, not of the > restraint itself. And so should go with the thing calling the restraint.
But "the thing calling the restraint" is the model, which doesn't know what the weights are. So here's an example: Frido's system contains 1000 restraints, 999 of which are MM forcefield terms (bonds, angles, exclusion volumes, etc.) and the last one of which is an EM restraint. He wants to scale down just the EM restraint to 0.1, so he does (Python syntax):
myemrestraint.set_scale(0.1)
model.evaluate()
Are you suggesting that instead he should do:
model.evaluate(scale_factors=[1.0] * 999 + [0.1])
? Or am I misunderstanding what you're saying?
Another analogy: suppose instead he wanted to tweak the standard deviation of the restraint. Surely he would say:
myemrestraint.set_standard_deviation(0.1)
model.evaluate()
rather than passing the stdevs to the model. Of course, I am arguing that a scale factor and a stdev should be treated similarly here.
> Plus, this way we can reuse a restraint with different weights in > different places if we should so desire.
Surely the easiest way to do that is to put the restraint into two different restraint sets, each with its own weight.
Ben
> But "the thing calling the restraint" is the model, which doesn't know what the weights are.

Whoever we tell the weights to knows :-)
> Are you suggesting that instead he should do:
> model.evaluate(scale_factors=[1.0] * 999 + [0.1])

or model.set_weight(myrestraint_index, .5)
or model.set_weight(my_restraint_pointer, .5);
or restraintset.set_weight(r, .5);
> rather than passing the stdevs to the model. Of course, I am arguing that a scale factor and a stdev should be treated similarly here.

The stddev is an attribute of a particular type of Restraint. As I see it, weight is a transformation applied to the output of any restraint - similar to adding up the logs of the restraint values instead of their actual values.
>> Plus, this way we can reuse a restraint with different weights in different places if we should so desire.
>
> Surely the easiest way to do that is to put the restraint into two different restraint sets, each with its own weight.

Sure. But that requires copying the object and keeping the copies synchronized. Not that I see any particular application of this :-)
Daniel Russel wrote: >> Are you suggesting that instead he should do: >> model.evaluate(scale_factors=[1.0] * 999 + [0.1]) > or model.set_weight(myrestraint_index, .5) or > model.set_weight(my_restraint_pointer, .5); > or restraintset.set_weight(r, .5);
I don't like the second because it requires the model to keep a second bunch of restraint pointers hanging around, and then you have to keep them synchronized with the 'real' list of restraints (what happens if you remove a restraint from a model, or you delete a restraint and then create a new one which happens to have the same address?) I _really_ don't like the first one because 1) it requires the list of restraints to remain ordered and 2) it wouldn't work with restraints that live inside other restraints.
If you like the third, why not compromise and say just that any RestraintSet can scale its children. People are likely to want to scale a whole RestraintSet at a time anyway, and can always stick a single Restraint into a RestraintSet if they really want to individually scale it. We can punt on model.set_weight for the time being, at least until you can convince me that it's a good idea. ;)
>> rather than passing the stdevs to the model. Of course, I am arguing >> that a scale factor and a stdev should be treated similarly here. > The stddev is an attribute of a particular type of Restraint. As I see > it, weight is a transformation applied to the output of any restraint.
Yes, I figured you'd say that. ;) On the other hand, adding a scale member to the Restraint base class adds a per-object overhead, and maybe it's better to only have that for RestraintSets. So we can agree not to put it in Restraint, but for different reasons...
>>> Plus, this way we can reuse a restraint with different weights in >>> different places if we should so desire. >> Surely the easiest way to do that is the put the restraint into two >> different restraint sets, each with its own weight. > Sure. But that requires copying the object and keeping them > synchronized. Not that I see any particular application of this :-)
Maybe I'm missing something here, but a RestraintSet (currently) just keeps a vector of pointers to Restraints. So there's no reason why the same Restraint object couldn't be in two sets.
Ben
>> or model.set_weight(myrestraint_index, .5) or
>> model.set_weight(my_restraint_pointer, .5);
>> or restraintset.set_weight(r, .5);
>
> I don't like the second because it requires the model to keep a second bunch of restraint pointers hanging around [...] I _really_ don't like the first one because 1) it requires the list of restraints to remain ordered and 2) it wouldn't work with restraints that live inside other restraints.

They just require that the model/RestraintSet (for the first) store the Restraint pointers in a vector of std::pairs with weights and return the index (just like the Particles), or, for the second, store them in a map or any other container and go and find the correct restraint when the weight is set. Simple enough. No duplication and no significant overhead.
And the user has to keep around the identity of the restraints he is interested in somehow anyway.
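That bookkeeping might be sketched as follows (hypothetical Python; set_weight-by-index following the Particles pattern mentioned above, with a toy restraint invented for the example):

```python
class Model:
    """Stores (restraint, weight) pairs and returns an index on add,
    the way the discussion says Particles are already handled."""
    def __init__(self):
        self._restraints = []

    def add_restraint(self, r, weight=1.0):
        self._restraints.append([r, weight])
        return len(self._restraints) - 1

    def set_weight(self, index, w):
        self._restraints[index][1] = w

    def evaluate(self):
        # Weighted sum over all restraints (derivatives omitted here)
        return sum(w * r.evaluate(False) for r, w in self._restraints)


class ConstRestraint:
    """Toy restraint with a fixed unweighted score."""
    def __init__(self, score):
        self.score = score

    def evaluate(self, calc_deriv):
        return self.score
```

The user keeps only the integer index around, not a second set of pointers, so nothing needs to be kept synchronized with the 'real' restraint list.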
> If you like the third, why not compromise and say just that any RestraintSet can scale its children.

The third is the same as the first, just that both models and restraint sets store lists of restraints.
> Yes, I figured you'd say that. ;) On the other hand, adding a scale member to the Restraint base class adds a per-object overhead, and maybe it's better to only have that for RestraintSets. So we can agree not to put it in Restraint, but for different reasons...

Well, one word is pretty negligible. Especially with that mysterious list of particles there already :-)
>> Sure. But that requires copying the object and keeping them synchronized. Not that I see any particular application of this :-)
>
> Maybe I'm missing something here, but a RestraintSet (currently) just keeps a vector of pointers to Restraints. So there's no reason why the same Restraint object couldn't be in two sets.

Yes, but if we store the weight in the Restraint, it can't really be in two sets any more, because the weights are linked.
just my two cents to the conversation:
- i would greatly appreciate it if the usage is as simple and intuitive as possible. the end users will ideally be biologists if imp is to really have significant impact. for that purpose i would rather not see all kinds of pointers floating around on the python level. from that point of view i vastly preferred option 1 of ben's original proposals.
- from a philosophical standpoint weights just influence the output, but realistically a user will also want to influence the restraint itself during the optimization. an example: volume exclusions are often modelled as an r^-12 potential (eg in modeller, i think). if i intend to use such an expression, it will be much more sensible to change the radius during optimization - and that's what our great hero frank actually did. so it would make sense to access them the same way, i argue.
- by the way, it might be useful to add some more tags to the restraints to account for hierarchies: either some fixed number of hierarchies (frank had 14, i think) or something fancier to model the universe.
frido
Friedrich Foerster wrote:
> just my two cents to the conversation:
> - i would greatly appreciate it if the usage is as simple and intuitive
> as possible. the end users will ideally be biologists if imp is really
> to have significant impact. for that purpose i would rather not see all
> kinds of pointers floating around on the python level. from that point
> of view i vastly prefer option 1 of ben's original proposals.
>
You have to keep the objects around either way: in Ben's case because you need the object to set the weight on it, and in my case because you need the object to identify the restraint to the model when setting the weight. It is just a matter of which object you call the function on. The main simplification Ben's method brings is that you don't need to worry about which container holds a given Restraint. That might be enough to tip the scales.
> an example: volume exclusions are often modelled as an r^-12 potential
> (eg in modeller, i think). if i intend to use such an expression, it
> will be much more sensible to change the radius during optimization -
> and that's what our great hero frank actually did. so it would make
> sense to access them the same way, i argue.
>
OK, but then they should be used in the same way: it becomes the restraint's responsibility to have a weight field (or not) and handle it properly, and the framework can't really touch it. The nice thing about treating weight differently from things like radius is that weight can then be handled in a very standard manner, and people who write restraints don't have to worry about handling it properly.
Anyway, it seems like the DerivativeAccumulator is worth having. For now we can punt on weights and say that the first person who wants them writes a WeightedRestraintSet class which stores a weight and a set of restraints and manipulates its restraints.
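Daniel's punt might look something like this — a hedged sketch, not the class that was eventually committed; the `Restraint` base, the `evaluate(bool)` signature, and `ConstantRestraint` are all placeholders invented for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Placeholder base class standing in for IMP's Restraint.
class Restraint {
public:
  virtual ~Restraint() {}
  virtual double evaluate(bool calc_derivs) = 0;
};

// Trivial concrete restraint, just for demonstration.
class ConstantRestraint : public Restraint {
  double value_;
public:
  explicit ConstantRestraint(double v) : value_(v) {}
  virtual double evaluate(bool) { return value_; }
};

// Hypothetical WeightedRestraintSet: stores a weight plus a set of
// restraints and scales the summed score on the way out. Scaling the
// derivatives would go through a DerivativeAccumulator instead.
class WeightedRestraintSet : public Restraint {
  double weight_;
  std::vector<Restraint *> children_;  // non-owning, as in RestraintSet
public:
  explicit WeightedRestraintSet(double weight = 1.0) : weight_(weight) {}
  void set_weight(double w) { weight_ = w; }
  void add_restraint(Restraint *r) { children_.push_back(r); }
  virtual double evaluate(bool calc_derivs) {
    double sum = 0.0;
    for (std::size_t i = 0; i < children_.size(); ++i)
      sum += children_[i]->evaluate(calc_derivs);
    return weight_ * sum;  // the set, not the children, applies the weight
  }
};
```

Because the weight lives in the set rather than the Restraint, the same Restraint object can still sit in two sets with different effective weights, which answers the objection raised above.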
Friedrich Foerster wrote:
> just my two cents to the conversation:
> - i would greatly appreciate it if the usage is as simple and intuitive
> as possible. the end users will ideally be biologists if imp is really
> to have significant impact. for that purpose i would rather not see all
> kinds of pointers floating around on the python level. from that point
> of view i vastly prefer option 1 of ben's original proposals.
> - from a philosophical standpoint weights just influence the output,
> but realistically a user will want to influence the restraint itself
> during the optimization. an example: volume exclusions are often
> modelled as an r^-12 potential (eg in modeller, i think). if i intend
> to use such an expression, it will be much more sensible to change the
> radius during optimization - and that's what our great hero frank
> actually did. so it would make sense to access them the same way, i
> argue.
> - by the way, it might be useful to include some more tags in the
> restraints to account for hierarchies: either some fixed number of
> hierarchy levels (frank had 14, i think) or something fancier to model
> the universe.
>
> frido
great. any volunteers who are familiar with the imp kernel?
frido
Daniel Russel wrote:
> Anyway, it seems like the DerivativeAccumulator is worth having. For now
> we can punt on weights and say that the first person who wants them
> writes a WeightedRestraintSet class which stores a weight and a set of
> restraints and manipulates its restraints.
Daniel Russel wrote:
> Anyway, it seems like the DerivativeAccumulator is worth having. For now
> we can punt on weights and say that the first person who wants them
> writes a WeightedRestraintSet class which stores a weight and a set of
> restraints and manipulates its restraints.
Ah, this is why it's better not to write emails and just wait to read yours. ;) I was just about to say something similar.
I was going to write the DerivativeAccumulator code today, since (as Daniel also points out) it's useful regardless of whether we want weights (e.g. if we want to sum derivatives as p·ln(p) rather than just p). But if somebody else wants to write it, let me know!
Once that is in place, it's easy to do weights using any of the methods we've discussed.
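For reference, the core of such an accumulator can be tiny. The following is only a sketch of the idea being discussed, not the code Ben committed in r553; the constructor signatures and `operator()` are assumptions. The key property is that nested RestraintSets derive their accumulator from the parent's, so weights compose multiplicatively down the tree and restraint authors never touch them.

```cpp
#include <cassert>

// Sketch of the DerivativeAccumulator idea: a small value object
// carrying a weight. Restraints deposit derivatives through it, so
// the framework owns the scaling rather than each restraint.
class DerivativeAccumulator {
  double weight_;
public:
  explicit DerivativeAccumulator(double weight = 1.0) : weight_(weight) {}
  // A nested RestraintSet builds its accumulator from its parent's,
  // so weights multiply down the restraint tree.
  DerivativeAccumulator(const DerivativeAccumulator &parent, double weight)
      : weight_(parent.weight_ * weight) {}
  // A restraint would call something like
  //   particle->add_to_derivative(key, (*this)(dv));
  double operator()(double value) const { return weight_ * value; }
};
```

This also covers Ben's p·ln(p) case: a differently scaled accumulation just means passing a different accumulator, with no change to the restraints themselves.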
Ben
Ben Webb wrote:
> I was going to write the DerivativeAccumulator code today, since (as
> Daniel also points out) it's useful regardless of whether we want
> weights (e.g. if we want to sum derivatives as p·ln(p) rather than just
> p). But if somebody else wants to write it, let me know!
OK, so there is a very simple first approximation to a DerivativeAccumulator class in r553, and RestraintSets as of r554 will weight their child restraints (and have a set_weight() method). We can break/fix this with future commits. ;)
Ben
> - by the way, it might be useful to include some more tags to the
> restraints to account for hierarchies. either some fixed number of
> hierarchies (frank had 14, i think) or fancier to model the universe.
>
I don't understand. How would this go in the restraints?
Daniel Russel wrote:
>> - by the way, it might be useful to include some more tags to the
>> restraints to account for hierarchies. either some fixed number of
>> hierarchies (frank had 14, i think) or fancier to model the universe.
>>
> I don't understand. How would this go in the restraints?
As designed, the hierarchy is encoded primarily in the particles and restraint sets, not the restraint attributes. Then you just turn on/off sets of particles (or sets of restraints) accordingly.
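A sketch of Ben's point — hierarchy levels as restraint sets that get switched on or off wholesale, instead of tagging individual restraints. This is a hypothetical illustration; `HierarchyRestraintSet`, `set_is_active`, and the function-pointer stand-ins for restraints are all invented names, not IMP API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Example scoring terms (stand-ins for real restraints).
inline double clash_term() { return 5.0; }
inline double distance_term() { return 2.5; }

// Hypothetical sketch: each hierarchy level is a restraint set that
// can be toggled off wholesale, so no per-restraint level tag (or a
// fixed count of 14 levels) is needed.
class HierarchyRestraintSet {
  std::string name_;
  bool active_;
  std::vector<double (*)()> restraints_;
public:
  explicit HierarchyRestraintSet(const std::string &name)
      : name_(name), active_(true) {}
  void set_is_active(bool a) { active_ = a; }  // hypothetical toggle
  void add_restraint(double (*r)()) { restraints_.push_back(r); }
  double evaluate() const {
    if (!active_) return 0.0;  // an inactive level contributes nothing
    double sum = 0.0;
    for (std::size_t i = 0; i < restraints_.size(); ++i)
      sum += restraints_[i]();
    return sum;
  }
};
```

An optimization schedule then becomes a sequence of activate/deactivate (or reweight) calls on whole levels, which is exactly the MODELLER-style behaviour Friedrich asked for.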
Ben
participants (4)
- Ben Webb
- Daniel Russel
- Daniel Russel
- Friedrich Foerster