Re: Shared state

Ben Webb

16 Oct 2007

5:57 p.m.

Daniel Russel wrote:
> We had discussed but never implemented some mechanism for having shared
> state (such as non-bonded lists). It seems to me the right way to do it
> is to have a class State (or some better name) which has a single
> virtual method "void update()" which is called in Model::evaluate before
> the restraints are evaluated. As with Particle, Restraint it stores a
> ModelData pointer and Model would have add_state and get_state methods.

Well, I think we need to discuss this further, so let's drag in others here:

I agree that a shared state class is needed, and could be used by nonbonded lists. But what do you need it for right now, i.e. what else would people use it for?

Calling State::update() in Model::evaluate() would not be sufficient, at least for nonbonded lists, because you can also call Restraint::evaluate() on individual restraints from the Python interface.
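The proposal and the objection can be sketched in a few lines. This is only an illustration: the State, Restraint and Model classes below are hypothetical stand-ins with assumed signatures (the thread names only update(), evaluate(), add_state and get_state), and CachedValue and CacheRestraint are toy classes invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the classes named in the thread.
struct State {
  virtual ~State() = default;
  virtual void update() = 0;  // refresh whatever shared data this state caches
};

struct Restraint {
  virtual ~Restraint() = default;
  virtual double evaluate() = 0;
};

class Model {
 public:
  // Stores the state and returns an index usable with get_state().
  std::size_t add_state(std::shared_ptr<State> s) {
    assert(s != nullptr);
    states_.push_back(std::move(s));
    return states_.size() - 1;
  }
  State* get_state(std::size_t i) const { return states_.at(i).get(); }
  void add_restraint(std::shared_ptr<Restraint> r) {
    assert(r != nullptr);
    restraints_.push_back(std::move(r));
  }
  // Every state is updated before any restraint is scored.
  double evaluate() {
    for (auto& s : states_) s->update();
    double score = 0.0;
    for (auto& r : restraints_) score += r->evaluate();
    return score;
  }

 private:
  std::vector<std::shared_ptr<State>> states_;
  std::vector<std::shared_ptr<Restraint>> restraints_;
};

// Toy state: copies an external value into a cache when update() runs.
struct CachedValue : State {
  explicit CachedValue(const double* src) : source(src) {}
  void update() override { cached = *source; }
  const double* source;
  double cached = 0.0;
};

// Toy restraint: scores from the cache, so it is only as fresh as the
// last update() call.
struct CacheRestraint : Restraint {
  explicit CacheRestraint(std::shared_ptr<CachedValue> s) : state(std::move(s)) {}
  double evaluate() override { return state->cached; }
  std::shared_ptr<CachedValue> state;
};
```

In this sketch, calling evaluate() on a CacheRestraint directly after the external value has changed returns the old cached score, because only Model::evaluate() runs update() first. That is the gap described above.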

How would a nonbonded list know that it needs to do an update? I don't like the Statistics class proposed in ModelData.h. (The idea of that class is to automatically keep min/max/change statistics on all float variables.) Why:

1. If you don't want nonbonded lists, maintaining these statistics is inefficient.
2. Because it's part of the ModelData class, it can't easily be extended by other classes.

Two suggestions:

1. Classes can register callbacks/actions with ModelData::set_float, or this could trigger a State::set_float method, to be notified whenever the model is changed. The advantage of the former is that the callback can go away after a nonbond update is triggered, saving the overhead of a function call for subsequent set_float()s.
2. Classes to allow the get/set of the 'optimizable state' (right now, this is just all optimizable floats) could have similar methods, useful for optimizers such as CG and steepest descent which change all attributes simultaneously.

Ben

-- ben@salilab.org http://salilab.org/~ben/ "It is a capital mistake to theorize before one has data." - Sir Arthur Conan Doyle

Daniel Russel

16 Oct 2007

6:29 p.m.

> Calling State::update() in Model::evaluate() would not be
> sufficient, at least for nonbonded lists, because you can also call
> Restraint::evaluate() on individual restraints from the Python
> interface.

Sure, but I don't think there is any reason to guarantee that randomly calling evaluate on restraints gives you anything meaningful unless you are careful to update everything the restraint needs yourself. The Model is there to do this. We could put in a hook so that someone can request that the Model update everything.

> How would a nonbonded list know that it needs to do an update?
> I don't like the Statistics class proposed in ModelData.h. (The
> idea of that class is to automatically keep min/max/change
> statistics on all float variables.) Why: because 1. if you don't
> want nonbonded lists, maintaining these statistics is inefficient,
> and 2. because it's part of the ModelData class, it can't easily be
> extended by other classes.

Agreed.

> Two suggestions:
> 1. Classes can register callbacks/actions with
> ModelData::set_float, or this could trigger a State::set_float
> method, to be notified whenever the model is changed.

The main problem is that given "the float at index 10 is being set to 13.5", you have to look up somewhere what is at index 10 and whether you care about it. Doing this every time anything is changed, for each of several State objects, seems likely to be slow.

> The advantage of the former is that the callback can go away after
> a nonbond update is triggered, saving the overhead of a function
> call for subsequent set_float()s.

How can it go away? You still have to keep track for the next update, no?

> 2. Classes to allow the get/set of the 'optimizable state' (right
> now, this is just all optimizable floats) could have similar
> methods, useful for optimizers such as CG and steepest descent
> which change all attributes simultaneously.

I think in the normal course of things, if anything updates, lots of things update, so it is probably more efficient to have State objects go search for changes they care about than to notify them of every little change. However, this would require the non-bonded list to keep a copy of the original coordinates to determine how far things moved.

My suggestion: Keep State as it is, with the exception that ModelData has a dirty bit controlled by Model which Model uses to check if the States should be updated.

Add a FloatDataMonitor class which is called when floats change in the ModelData (we can add String and Int ones later).

We can have a FloatStatistic monitor which keeps statistics for a list of fields. The non-bonded list (a State) can use it to keep track of how far things move and delay update until needed.

Separating the two types (State and *DataMonitor) means we won't have too many things happening on each change in the ModelData and ones which want to do things as a batch don't have to slow down ModelData changes.
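One possible reading of the monitor idea, as a rough sketch only. The FloatDataMonitor and FloatStatistic names come from the message above; the on_change and add_monitor signatures, the choice of "largest single change" as the statistic, and the flat ModelData layout are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Assumed interface: called by ModelData whenever a float changes.
struct FloatDataMonitor {
  virtual ~FloatDataMonitor() = default;
  virtual void on_change(std::size_t index, double old_value,
                         double new_value) = 0;
};

// Hypothetical ModelData: a flat float table that notifies its monitors.
class ModelData {
 public:
  explicit ModelData(std::size_t n) : floats_(n, 0.0) {}
  void add_monitor(FloatDataMonitor* m) {
    assert(m != nullptr);
    monitors_.push_back(m);
  }
  double get_float(std::size_t i) const { return floats_.at(i); }
  void set_float(std::size_t i, double value) {
    const double old_value = floats_.at(i);
    floats_[i] = value;
    for (FloatDataMonitor* m : monitors_) m->on_change(i, old_value, value);
  }

 private:
  std::vector<double> floats_;
  std::vector<FloatDataMonitor*> monitors_;
};

// Keeps the largest single change seen on a chosen list of fields.
class FloatStatistic : public FloatDataMonitor {
 public:
  explicit FloatStatistic(std::set<std::size_t> fields)
      : fields_(std::move(fields)) {}
  void on_change(std::size_t index, double old_value,
                 double new_value) override {
    if (fields_.count(index) == 0) return;  // not a field we track
    max_change_ = std::max(max_change_, std::abs(new_value - old_value));
  }
  double max_change() const { return max_change_; }
  void reset() { max_change_ = 0.0; }

 private:
  std::set<std::size_t> fields_;
  double max_change_ = 0.0;
};
```

A State such as the non-bonded list could read max_change() in its update() and call reset() after rebuilding. The per-field lookup in on_change() is the cost paid on every set_float().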

Ben Webb

7:09 p.m.

Daniel Russel wrote:
>> Calling State::update() in Model::evaluate() would not be sufficient,
>> at least for nonbonded lists, because you can also call
>> Restraint::evaluate() on individual restraints from the Python interface.
> Sure, but I don't think there is any reason to guarantee that randomly
> calling evaluate on restraints gives you anything meaningful unless you
> are careful to update everything the restraint needs yourself. The Model
> is there to do this. We could put in a hook so that someone can request
> the Model update everything.

True, or we could just not allow people to call Restraint::evaluate directly. Do we have any use cases for such direct calls? (If yes, I'll write a quick unit test for that so that we can't break it in future.) I know at least one of Bret's unit tests works that way, but it doesn't necessarily have to, as far as I can tell.

>> 1. Classes can register callbacks/actions with ModelData::set_float,
>> or this could trigger a State::set_float method, to be notified
>> whenever the model is changed.
> The main problem is that given "the float at index 10 is being set to
> 13.5", you have to look up somewhere what is at index 10 and whether you
> care about it. Doing this every time anything is changed, for each of
> several State objects seems likely to be slow.

Well, for nonbonded lists you don't care what the index is, only what the old and new values are. But anyway, this won't work, because we can't tell for sure if we need to do an update based purely on, say, the x coordinate changing, since we need the 3D distance.

>> The advantage of the former is that the callback can go away after a
>> nonbond update is triggered, saving the overhead of a function call
>> for subsequent set_float()s.
> How can it go away? You still have to keep track for the next update, no?

The callback would be removed after an update is triggered. The next rebuild of the nonbonded list would add the callback again.
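A minimal sketch of that remove-then-re-add cycle, under stated assumptions: ModelData here is a hypothetical flat float table with a single optional hook (set_callback and clear_callback are invented names), and "an update is triggered" is simplified to "any value changed" rather than a real distance test.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical ModelData with one optional change hook.
class ModelData {
 public:
  using Callback = std::function<void(std::size_t, double, double)>;
  explicit ModelData(std::size_t n) : floats_(n, 0.0) {}
  void set_callback(Callback cb) { callback_ = std::move(cb); }
  void clear_callback() { callback_ = nullptr; }
  double get_float(std::size_t i) const { return floats_.at(i); }
  void set_float(std::size_t i, double value) {
    const double old_value = floats_.at(i);
    floats_[i] = value;
    if (callback_) {
      Callback cb = callback_;  // copy, so the hook can clear itself while running
      cb(i, old_value, value);
    }
  }

 private:
  std::vector<double> floats_;
  Callback callback_;
};

class NonbondedList {
 public:
  explicit NonbondedList(ModelData* data) : data_(data) {
    assert(data != nullptr);
    rebuild();
  }
  NonbondedList(const NonbondedList&) = delete;
  NonbondedList& operator=(const NonbondedList&) = delete;

  // Rebuilding the list re-registers the hook.
  void rebuild() {
    ++rebuild_count_;
    needs_rebuild_ = false;
    data_->set_callback([this](std::size_t, double old_value, double new_value) {
      ++callback_calls_;
      if (old_value != new_value) {  // simplified trigger condition
        needs_rebuild_ = true;
        data_->clear_callback();  // later set_float() calls skip the hook
      }
    });
  }
  bool needs_rebuild() const { return needs_rebuild_; }
  int callback_calls() const { return callback_calls_; }
  int rebuild_count() const { return rebuild_count_; }

 private:
  ModelData* data_;
  bool needs_rebuild_ = false;
  int callback_calls_ = 0;
  int rebuild_count_ = 0;
};
```

Once the list is flagged for rebuild, further set_float() calls cost only the empty-hook check, which is the saving being claimed.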

>> 2. Classes to allow the get/set of the 'optimizable state' (right now,
>> this is just all optimizable floats) could have similar methods,
>> useful for optimizers such as CG and steepest descent which change all
>> attributes simultaneously.
>
> I think in the normal course of things, if anything updates, lots of
> things update, so it is probably more efficient to have State objects go
> search for changes they care about than notify them of every little
> change. However, this would require the non-bonded list to keep a copy
> of the original coordinates to determine how far things moved.

This would be really inefficient for Monte Carlo-like optimization strategies, unless perhaps you implemented MC by turning on/off particles like crazy. But Keren told me earlier of some thoughts she had in this direction, so perhaps she'd like to offer some wisdom at this point...

> Keep State as it is, with the exception that ModelData has a dirty bit
> controlled by Model which Model uses to check if the States should be
> updated.

How/when would this dirty bit be got/set?

> Add a FloatDataMonitor class which is called when floats change in the
> ModelData (we can add String and Int ones later).
>
> We can have a FloatStatistic monitor which keeps statistics for a list
> of fields. The non-bonded list (a State) can use it to keep track of how
> far things move and delay update until needed.

Now that I think about it, I don't think this would work, at least for nonbonded lists, because:

1. The statistic we want to keep track of is not (dx,dy,dz) but (dx*dx + dy*dy + dz*dz) (and we don't care about statistics once they've exceeded the threshold for update anyway).
2. We want 8 individual moves of 1A to be treated the same as one move of 8A.

So going down this route would require the nonbonded list to keep a copy of the coordinates for comparison purposes - and a method to allow optimizers or other 'model state transforms' to do a more efficient batch update of the variables (e.g. so that we don't have to do our nonbond list check three times when we move a particle, for the individual changes in x,y,z, but just once).
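To make the two requirements concrete, here is a small sketch of a tracker that keeps a reference copy of the coordinates and compares squared displacement against a squared threshold. DisplacementTracker, move_particle and the threshold parameter are hypothetical names for this illustration, not part of the existing code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Vec3 = std::array<double, 3>;

// Hypothetical helper a nonbonded list might own.
class DisplacementTracker {
 public:
  DisplacementTracker(std::vector<Vec3> coords, double max_move)
      : reference_(coords),
        current_(std::move(coords)),
        max_move_sq_(max_move * max_move) {
    assert(max_move >= 0.0);
  }

  // Batch move: x, y and z change together, so the distance check runs
  // once per particle move rather than once per coordinate.
  void move_particle(std::size_t i, const Vec3& delta) {
    Vec3& pos = current_.at(i);
    const Vec3& ref = reference_[i];
    double dist_sq = 0.0;
    for (std::size_t k = 0; k < 3; ++k) {
      pos[k] += delta[k];
      const double d = pos[k] - ref[k];  // measured from the reference copy
      dist_sq += d * d;
    }
    ++checks_;
    if (dist_sq > max_move_sq_) needs_rebuild_ = true;
  }

  // After the list is rebuilt, the current coordinates become the reference.
  void rebuild() {
    reference_ = current_;
    needs_rebuild_ = false;
  }
  bool needs_rebuild() const { return needs_rebuild_; }
  std::size_t checks() const { return checks_; }

 private:
  std::vector<Vec3> reference_;  // coordinates at the last rebuild
  std::vector<Vec3> current_;
  double max_move_sq_;
  bool needs_rebuild_ = false;
  std::size_t checks_ = 0;
};
```

Because displacement is measured from the reference copy, eight moves of 1A along the same direction trip a 7.5A threshold on the eighth move, just as a single 8A move would. A statistic based on the largest single change would never see more than 1A.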

Ben

Daniel Russel

7:21 p.m.

>
>>> 1. Classes can register callbacks/actions with
>>> ModelData::set_float, or this could trigger a State::set_float
>>> method, to be notified whenever the model is changed.
>> The main problem is that given "the float at index 10 is being set
>> to 13.5", you have to look up somewhere what is at index 10 and
>> whether you care about it. Doing this every time anything is
>> changed, for each of several State objects seems likely to be slow.
>
> Well, for nonbonded lists you don't care what the index is, only
> what the old and new values are.

You do need to know that it is a coordinate as opposed to some other float.

> But anyway, this won't work, because we can't tell for sure if we
> need to do an update based purely on, say, the x coordinate
> changing, since we need the 3D distance.

I would have thought that the sqrt(3)*l_infinity norm would be a good enough bound, but what do I know :-) Do you really care about l_2 vs l_inf?
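For reference, the bound being appealed to is the standard relation between the two norms in three dimensions, written here for a displacement d = (d_x, d_y, d_z):

```latex
\|d\|_\infty = \max\bigl(|d_x|, |d_y|, |d_z|\bigr), \qquad
\|d\|_2 = \sqrt{d_x^2 + d_y^2 + d_z^2}

\|d\|_\infty \;\le\; \|d\|_2 \;\le\; \sqrt{3}\,\|d\|_\infty
```

The right-hand inequality holds because each squared component is at most the square of the largest one. So if sqrt(3) times the largest per-coordinate change is below the update threshold, the 3D distance is too. The test is conservative by a factor of at most sqrt(3), about 1.73.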

>> Keep State as it is, with the exception that ModelData has a dirty
>> bit controlled by Model which Model uses to check if the States
>> should be updated.
>
> How/when would this dirty bit be got/set?

If anything changed, it gets set. The Model unsets it when it calls update on the states.
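A sketch of how that could look, with assumed names throughout: is_dirty and clear_dirty are hypothetical accessors, CountingState is a toy class, restraint scoring is left out, and the bit is assumed to start set so that the first evaluate() updates the states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical ModelData carrying the dirty bit.
class ModelData {
 public:
  explicit ModelData(std::size_t n) : floats_(n, 0.0) {}
  double get_float(std::size_t i) const { return floats_.at(i); }
  void set_float(std::size_t i, double value) {
    floats_.at(i) = value;
    dirty_ = true;  // any change marks the data dirty
  }
  bool is_dirty() const { return dirty_; }
  void clear_dirty() { dirty_ = false; }

 private:
  std::vector<double> floats_;
  bool dirty_ = true;  // assumed: start dirty so the first evaluate() updates
};

struct State {
  virtual ~State() = default;
  virtual void update(const ModelData& data) = 0;
};

// Toy state that only counts how often it was asked to update.
struct CountingState : State {
  void update(const ModelData&) override { ++updates; }
  int updates = 0;
};

class Model {
 public:
  explicit Model(std::size_t n_floats) : data_(n_floats) {}
  ModelData& data() { return data_; }
  void add_state(State* s) {
    assert(s != nullptr);
    states_.push_back(s);
  }
  double evaluate() {
    if (data_.is_dirty()) {
      for (State* s : states_) s->update(data_);
      data_.clear_dirty();  // the Model unsets the bit after updating
    }
    return 0.0;  // restraint scoring omitted from this sketch
  }

 private:
  ModelData data_;
  std::vector<State*> states_;
};
```

With this arrangement, two evaluate() calls in a row with no set_float() in between update the states only once.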

> 2. we want 8 individual moves of 1A to be treated the same as one
> move of 8A. So going down this route would require the nonbonded
> list to keep a copy of the coordinates for comparison purposes -
> and a method to allow optimizers or other 'model state transforms'
> to do a more efficient batch update of the variables (e.g. so that
> we don't have to do our nonbond list check three times when we move
> a particle, for the individual changes in x,y,z, but just once).

Good point about the 8x move.

So the big question is whether moving a small number of particles is a common enough occurrence to structure things around it. Someone else has to answer that.
