problems with a sample script (basic IMP optimization)

newer
IMP svn compiling issues Mac OS X...

older
ExcludedVolumeRestraint : meaning...

Benjamin SCHWARZ

23 Jul 2010 23 Jul '10

7 a.m.

Hi list,

As a first step towards a serious usage of IMP, I tried to write a sample code using rigid bodies, restraints and optimizers. With MonteCarlo optimizer, at the end of optimization the model seems to remains unchanged. With ConjugateGradients or SteepestDescent, when using too many steps in the optimizer (under 100 though !?) the script generally crashes with one of these error messages :

File "/usr/local/lib/python2.6/site-packages/IMP/core/__init__.py", line 642, in optimize def optimize(self, *args): return _IMP_core.ConjugateGradients_optimize(self, *args) _IMP.UsageException: Attempting to construct a rotation from a non-quaternion value. The coefficient vector must have a length of 1. Got: 0.165858 -0.916284 0.55176 0.327904 gives 1.27905

Moreover when the script finishes, its outputs seem not relevant at all, though the score is better (smaller). maybe it is normal though, due to the choice of my restraints or to the lack of optimization steps… I think my understanding of IMP usage is still very far from perfect, will you help me sorting that out by having a look at my code ?

You will fin the script attached to this message, its outlet is really basic : - a structure is loaded ( I choose 1RDT, a complex comprising two protein domains and two peptides) - a map is sampled at 10A for that structure - chains are extracted from the structure, rigidified and randomly moved in space - two restraints are added to the model in order to tale into account - the fit of all atoms to the sampled map - no steric clashes between rigid bodies - optimization

thank you very much

Dr. Benjamin SCHWARZ Biocomputing group Email : schwarz@igbmc.fr Voice : +33 (0)3 68 85 47 30 FAX : +33 (0)3 68 85 47 18

Structural Biology & Genomics Dept. - IGBMC 1 rue Laurent Fries BP 10142 F - 67404 Illkirch CEDEX

Attachments:

attachment.htm (text/html — 8.1 KB)
logo-uds-signature.gif (image/gif — 3.1 KB)

Show replies by date

Benjamin SCHWARZ

23 Jul 23 Jul

7:06 a.m.

New subject: problems with a sample script (basic IMP optimization)

…And it is even better with the script.

Dr. Benjamin SCHWARZ Biocomputing group Email : schwarz@igbmc.fr Voice : +33 (0)3 68 85 47 30 FAX : +33 (0)3 68 85 47 18

Structural Biology & Genomics Dept. - IGBMC 1 rue Laurent Fries BP 10142 F - 67404 Illkirch CEDEX

Daniel Russel

10:44 a.m.

New subject: problems with a sample script (basic IMP optimization)

The bug sounds like something Elina found and patched after the 1.0 release. I'll test your script with the current SVN.

As general comments: - with monte-carlo, you have to provide the move set. We really should put up an error if there is no move set :-) I'll fix that. - Conjugate gradients can get stuck very easily (which is why we are working on a better optimizer). This is especially true with rigid bodies composed of my particles as the atom balls can get mixed in together and CG can never separate them.

On Jul 23, 2010, at 7:00 AM, Benjamin SCHWARZ wrote:

> Hi list, > > As a first step towards a serious usage of IMP, I tried to write a sample code using rigid bodies, restraints and optimizers. > With MonteCarlo optimizer, at the end of optimization the model seems to remains unchanged. With ConjugateGradients or SteepestDescent, when using too many steps in the optimizer (under 100 though !?) the script generally crashes with one of these error messages : > > File "/usr/local/lib/python2.6/site-packages/IMP/core/__init__.py", line 642, in optimize > def optimize(self, *args): return _IMP_core.ConjugateGradients_optimize(self, *args) > _IMP.UsageException: Attempting to construct a rotation from a non-quaternion value. The coefficient vector must have a length of 1. Got: 0.165858 -0.916284 0.55176 0.327904 gives 1.27905 > > > File "/usr/local/lib/python2.6/site-packages/IMP/core/__init__.py", line 642, in optimize > def optimize(self, *args): return _IMP_core.ConjugateGradients_optimize(self, *args) > _IMP.UsageException: Attempting to construct a rotation from a non-quaternion value. The coefficient vector must have a length of 1. Got: 0.389486 -0.600546 0.553302 0.99031 gives 1.79921 > > Moreover when the script finishes, its outputs seem not relevant at all, though the score is better (smaller). maybe it is normal though, due to the choice of my restraints or to the lack of optimization steps… I think my understanding of IMP usage is still very far from perfect, will you help me sorting that out by having a look at my code ? > > You will fin the script attached to this message, its outlet is really basic : > - a structure is loaded ( I choose 1RDT, a complex comprising two protein domains and two peptides) > - a map is sampled at 10A for that structure > - chains are extracted from the structure, rigidified and randomly moved in space > - two restraints are added to the model in order to tale into account > - the fit of all atoms to the sampled map > - no steric clashes between rigid bodies > - optimization > > > thank you very much > > > > Dr. Benjamin SCHWARZ > Biocomputing group > Email : schwarz@igbmc.fr > Voice : +33 (0)3 68 85 47 30 > FAX : +33 (0)3 68 85 47 18 > > <logo-uds-signature.gif> > > > Structural Biology & Genomics Dept. - IGBMC > 1 rue Laurent Fries > BP 10142 > F - 67404 Illkirch CEDEX > > _______________________________________________ > IMP-users mailing list > IMP-users@salilab.org > https://salilab.org/mailman/listinfo/imp-users

Benjamin SCHWARZ

26 Jul 26 Jul

1:04 a.m.

New subject: problems with a sample script (basic IMP optimization)

Thanks a lot for the answer

> The bug sounds like something Elina found and patched after the 1.0 release. I'll test your script with the current SVN. The good news for me, is that you seem not to blame my code. I guess it means, my understanding of IMP's usage tends to improve :)

> As general comments: > - with monte-carlo, you have to provide the move set. We really should put up an error if there is no move set :-) I'll fix that. OK, I will have a serious glance on these aspects

> - Conjugate gradients can get stuck very easily (which is why we are working on a better > optimizer). OK, I will have a look at the C++ code for IMP.em.local_rigid_fitting in order to understand how you mixed MC and CG there.

> This is especially true with rigid bodies composed of my particles as the atom balls can get mixed in together and CG can never separate them. I am sorry, I don't understand this point.

Please, let me know when you have tested my script with the latest IMP code.

--Ben

Benjamin SCHWARZ

9:04 a.m.

New subject: problems with a sample script (basic IMP optimization)

Following Daniel's suggestion I added a set of Movers to my MC optimizer (see code attached). Now the optimizer actually does something, though the results are very strange. It is much probable I poorly chose my parameters, and 1000 steps are not so much for a MC optimization.

It also appears I have a problem with the cost function : it evaluates to 18.36 for the native configuration and to 1371 for an initial "noisy" configuration not so far from the initial configuration. It then evaluates to 1.03 for an "optimized" configuration where all four chains are placed far outside the sampled density. I did not discover yet if and how it is possible to set parameters for the global cost function, maybe my problem has something to do with it.

Anyhow, I would be very grateful if someone would glance at my script, just to check I did use IMP in a proper manner.

--Ben

Le 23 juil. 2010 à 19:44, Daniel Russel a écrit :

> The bug sounds like something Elina found and patched after the 1.0 release. I'll test your script with the current SVN. > > As general comments: > - with monte-carlo, you have to provide the move set. We really should put up an error if there is no move set :-) I'll fix that. > - Conjugate gradients can get stuck very easily (which is why we are working on a better optimizer). This is especially true with rigid bodies composed of my particles as the atom balls can get mixed in together and CG can never separate them.

Biocomputing group Email : schwarz@igbmc.fr Voice : +33 (0)3 68 85 47 30 FAX : +33 (0)3 68 85 47 18

Structural Biology & Genomics Dept. - IGBMC 1 rue Laurent Fries BP 10142 F - 67404 Illkirch CEDEX

Benjamin SCHWARZ

27 Jul 27 Jul

6:30 a.m.

New subject: problems with a sample script (basic IMP optimization)

Hi list,

In order to have a better insight I modified my script, keeping only the em.FitRestraint in my model (script attached). The score for the initial model (5.96046447754e-08) is reassuringly near 0; which is reasonable when comparing two maps sampled from the same structure. Nevertheless, when chains are randomly moved, the score does not radically improve, even when the transformations move them far away from their initial position. Moreover, I have an example where the score is minimized from 0.37 for the initial (moved chains) position to 0.49 for the optimized solution !?

I may have missed something, and I would need some technical support to fix the problem quickly.

--Ben

Dr. Benjamin SCHWARZ Biocomputing group Email : schwarz@igbmc.fr Voice : +33 (0)3 68 85 47 30 FAX : +33 (0)3 68 85 47 18

Structural Biology & Genomics Dept. - IGBMC 1 rue Laurent Fries BP 10142 F - 67404 Illkirch CEDEX

Daniel Russel

7:45 a.m.

New subject: problems with a sample script (basic IMP optimization)

I think them main problem is the roughness of the scoring function caused by the excluded volume terms (as I had cryptically mentioned before). Your scoring function looks something like EM+sum over many pairs, where EM goes from 0 to 1 and each of the terms being summed over goes from, more of less 0 to 1 also. So the EM score is swamped by the excluded volume terms as soon as there is any overlap. And MC just moves things randomly, hoping for a good solution. So it will take a while to find anything good. In addition, if you use Conjugate Gradients, the derivatives of the excluded volume score on atoms are pretty useless since they are computed using each atom ball (explanation of the problem would probably benefit from a picture, I'll work on that :-).

My suggestions are - don't use a full atomic representation for your proteins (eg, simplify them using IMP.atom.create_simplified_along_backbone()). This will smooth the excluded volume scoring function out and making scoring faster. - use Conjugate Gradients in conjunction with MC (create a conjugate gradients optimizer and add it to the MC one using mc.set_local_optimizer()). This will then perform local minimization after each MC step. - when using monte carlo for optimization (as opposed to sampling), you should call set_return_best(True) on the object so that it saves the best state it finds, rather than just returning the last state accepted (which may have higher score).

On Jul 27, 2010, at 6:30 AM, Benjamin SCHWARZ wrote:

> Hi list, > > In order to have a better insight I modified my script, keeping only the em.FitRestraint in my model (script attached). > The score for the initial model (5.96046447754e-08) is reassuringly near 0; which is reasonable when comparing two maps sampled from the same structure. > Nevertheless, when chains are randomly moved, the score does not radically improve, even when the transformations move them far away from their initial position. Moreover, I have an example where the score is minimized from 0.37 for the initial (moved chains) position to 0.49 for the optimized solution !? > > I may have missed something, and I would need some technical support to fix the problem quickly. > > --Ben > > > <fitWithRestraints.py> > > Dr. Benjamin SCHWARZ > Biocomputing group > Email : schwarz@igbmc.fr > Voice : +33 (0)3 68 85 47 30 > FAX : +33 (0)3 68 85 47 18 > > <logo-uds-signature.gif> > > > Structural Biology & Genomics Dept. - IGBMC > 1 rue Laurent Fries > BP 10142 > F - 67404 Illkirch CEDEX > > _______________________________________________ > IMP-users mailing list > IMP-users@salilab.org > https://salilab.org/mailman/listinfo/imp-users

Benjamin SCHWARZ

9:10 a.m.

New subject: problems with a sample script (basic IMP optimization)

Hello Daniel,

> I think them main problem is the roughness of the scoring function caused by the excluded volume terms (as I had cryptically mentioned before). Your scoring function looks something like EM+sum over many pairs, where EM goes from 0 to 1 and each of the terms being summed over goes from, more of less 0 to 1 also. So the EM score is swamped by the excluded volume terms as soon as there is any overlap. And MC just moves things randomly, hoping for a good solution. So it will take a while to find anything good.

It perfectly makes sense… So, I understand that diverse restraints may have diverse scoring schemes and images ( e.g. [0,1] for FittingRestraint and [0,+M] for ExcludedVolumeRestraint, with M potentially very high)… And now, what to do with it ? :) There seem to be no "pythonic" way to modify constants in front of the diverse restraints, or to compose a restraint with a function to modulate its score. My lucky guess is : go for C++ to create a new restraint by composing existing restraints… Am I correct ?

> In addition, if you use Conjugate Gradients, the derivatives of the excluded volume score on atoms are pretty useless since they are computed using each atom ball (explanation of the problem would probably benefit from a picture, I'll work on that :-).

I am afraid I need a picture... and a lot more explanation on IMP internal computations ;) So don't spend too much time on this unless you think it can benefit to others.

> My suggestions are > - don't use a full atomic representation for your proteins (eg, simplify them using IMP.atom.create_simplified_along_backbone()). This will smooth the excluded volume scoring function out and making scoring faster. I'll try that

> - use Conjugate Gradients in conjunction with MC (create a conjugate gradients optimizer and add it to the MC one using mc.set_local_optimizer()). This will then perform local minimization after each MC step. I missed the set_local_optimizer() trick… I'll have a closer look at that.

> - when using monte carlo for optimization (as opposed to sampling), you should call set_return_best(True) on the object so that it saves the best state it finds, rather than just returning the last state accepted (which may have higher score).

I missed that to...

Thanks a lot for your answer, I will test that ASAP

Dr. Benjamin SCHWARZ Biocomputing group Email : schwarz@igbmc.fr Voice : +33 (0)3 68 85 47 30 FAX : +33 (0)3 68 85 47 18

Structural Biology & Genomics Dept. - IGBMC 1 rue Laurent Fries BP 10142 F - 67404 Illkirch CEDEX

Ben Webb

9:13 a.m.

New subject: problems with a sample script (basic IMP optimization)

On 7/27/10 9:10 AM, Benjamin SCHWARZ wrote: > So, I understand that diverse restraints may have diverse scoring > schemes and images ( e.g. [0,1] for FittingRestraint and [0,+M] for > ExcludedVolumeRestraint, with M potentially very high)… And now, what to > do with it ? :) > There seem to be no "pythonic" way to modify constants in front of the > diverse restraints, or to compose a restraint with a function to > modulate its score. My lucky guess is : go for C++ to create a new > restraint by composing existing restraints… Am I correct ?

Yes, when combining very different restraints, weighting may need to be considered. You can weight a Restraint by putting it in a RestraintSet and calling that object's set_weight() method.

Ben

-- ben@salilab.org http://salilab.org/~ben/ "It is a capital mistake to theorize before one has data." - Sir Arthur Conan Doyle

Daniel Russel

28 Jul 28 Jul

8:44 a.m.

New subject: problems with a sample script (basic IMP optimization)

> It perfectly makes sense… > So, I understand that diverse restraints may have diverse scoring schemes and images ( e.g. [0,1] for FittingRestraint and [0,+M] for ExcludedVolumeRestraint, with M potentially very high)… And now, what to do with it ? :) > There seem to be no "pythonic" way to modify constants in front of the diverse restraints, or to compose a restraint with a function to modulate its score. My lucky guess is : go for C++ to create a new restraint by composing existing restraints… Am I correct ?

As Ben pointed out, you can use RestraintSets to weight things. This does get to a more fundamental problem that weighting inhomogeneous restraints relative to one another is fundamentally broken and should be avoided. We are working on solutions to that (using an inference based optimizer Keren developed), but for now you have to fool around with weights.

To try to explain better, for a pair of intersecting rigid bodies, the derivatives of the excluded volume restraint are computed by adding up the derivatives for each of the balls which constitute the rigid body. The sum of these individual derivatives does not necessarily point in a direction which separates the two rigid bodies unless the inter-penetration of bodies is small compared to the size of the balls. As a result CG would get stuck in local minima. We are working on a simple alternatively, but I don't think it is there yet. Keren?

--Daniel

Benjamin SCHWARZ

9:09 a.m.

New subject: problems with a sample script (basic IMP optimization)

> As Ben pointed out, you can use RestraintSets to weight things. This does get to a more fundamental problem that weighting inhomogeneous restraints relative to one another is fundamentally broken and should be avoided. We are working on solutions to that (using an inference based optimizer Keren developed), but for now you have to fool around with weights.

Just to be 100% sure to understand : what you mean when saying this approach is "broken" is that rather a technical / software problem, it is a fundamental / theoretical evilness / badness to mix "things" that do not leave in the same space. Though, since a more theoretically justified technique is not yet at hand, a weighted sum of restraints will do the trick.

Concerning Keren's development, has it got anything to do with Domino ?

>>> In addition, if you use Conjugate Gradients, the derivatives of the excluded volume score on atoms are pretty useless since they are computed using each atom ball (explanation of the problem would probably benefit from a picture, I'll work on that :-). > I tried with a picture, but couldn't make anything that wasn't confusingly cluttered. > To try to explain better, for a pair of intersecting rigid bodies, the derivatives of the excluded volume restraint are computed by adding up the derivatives for each of the balls which constitute the rigid body. The sum of these individual derivatives does not necessarily point in a direction which separates the two rigid bodies unless the inter-penetration of bodies is small compared to the size of the balls. As a result CG would get stuck in local minima.

That's perfectly clear, thanks a lot :)

Dr. Benjamin SCHWARZ Biocomputing group Email : schwarz@igbmc.fr Voice : +33 (0)3 68 85 47 30 FAX : +33 (0)3 68 85 47 18

Structural Biology & Genomics Dept. - IGBMC 1 rue Laurent Fries BP 10142 F - 67404 Illkirch CEDEX

Daniel Russel

4:37 p.m.

New subject: problems with a sample script (basic IMP optimization)

On Jul 28, 2010, at 9:09 AM, Benjamin SCHWARZ wrote:

>> As Ben pointed out, you can use RestraintSets to weight things. This does get to a more fundamental problem that weighting inhomogeneous restraints relative to one another is fundamentally broken and should be avoided. We are working on solutions to that (using an inference based optimizer Keren developed), but for now you have to fool around with weights. > > Just to be 100% sure to understand : what you mean when saying this approach is "broken" is that rather a technical / software problem, it is a fundamental / theoretical evilness / badness to mix "things" that do not leave in the same space. Though, since a more theoretically justified technique is not yet at hand, a weighted sum of restraints will do the trick. The latter. It is a fundamental problem, rather than a software problem. Different restraints have different minima, grown to different values and at different rates. As a result there isn't really a set of weights that makes sense over a whole optimization process.

> > Concerning Keren's development, has it got anything to do with Domino ? Yes, We are working on a version (domino2) which is based on filtering of possible conformations based on the scores of individual restraints. --Daniel

Benjamin SCHWARZ

12 Aug 12 Aug

7 a.m.

New subject: CG and MC.set_local_optimizer(CG)

As evoked in a former message, I couldn't manage to have GC optimizer working with the binary MAC OSX version (IMP 1.0). So I compiled the latest version on one of our linux machines. The compilation was easily achieved with the default parameters, though I guess it could be possible to enhance IMP speed by tweaking some magic compilation parameters… I'll see this point later.

I am thus using IMP version SVN 6211

With this version CG optimizer seem to work fine, though I always get warning messages WARNING could not find parameters for radius:1.85 WARNING Setting params for radius :1.85 WARNING could not find parameters for radius:2.275 WARNING Setting params for radius :2.275 WARNING could not find parameters for radius:2 WARNING Setting params for radius :2 WARNING could not find parameters for radius:1.7 WARNING Setting params for radius :1.7 WARNING could not find parameters for radius:2.175 WARNING Setting params for radius :2.175 WARNING could not find parameters for radius:2.06 WARNING Setting params for radius :2.06 WARNING could not find parameters for radius:1.77 WARNING Setting params for radius :1.77 WARNING could not find parameters for radius:1.9924 WARNING Setting params for radius :1.9924 WARNING could not find parameters for radius:1.8 WARNING Setting params for radius :1.8 WARNING Object "Nameless" was never used. See the IMP::Object documentation for an explanation.

But when trying to use CG in conjunction with MC ( mco.set_local_optimizer( cgo ) ), I still experience problems; sometimes it seems to pole in an infinite loop, sometimes it prematurely ends with this message :

Traceback (most recent call last): File "./fitWithRestraints.py", line 222, in <module> opti.optimize(params["optimizer"]["maxSteps"]) File "/usr/local/imp/lib64/python2.6/site-packages/IMP/__init__.py", line 1569, in optimize def optimize(self, *args): return _IMP.Optimizer_optimize(self, *args) _IMP.UsageException: Attempt to use uninitialized vector.

Ultimately, I am wondering if trying to get CG working is not a waste of time for me… I mean, maybe there are better approaches... So, If you have any comment or general advice that I should check, feel free to do so.

--Ben

Keren Lasker

8:40 a.m.

New subject: CG and MC.set_local_optimizer(CG)

Ben - from the warning messages I see that you are using the FitRestraint. Are you using any other restraints as well? It will be useful to get your optimization script so we can take a look. Keren. On Aug 12, 2010, at 7:00 AM, Benjamin SCHWARZ wrote:

> As evoked in a former message, I couldn't manage to have GC > optimizer working with the binary MAC OSX version (IMP 1.0). So I > compiled the latest version on one of our linux machines. The > compilation was easily achieved with the default parameters, though > I guess it could be possible to enhance IMP speed by tweaking some > magic compilation parameters… I'll see this point later. > > I am thus using IMP version SVN 6211 > > With this version CG optimizer seem to work fine, though I always > get warning messages > WARNING could not find parameters for radius:1.85 > WARNING Setting params for radius :1.85 > WARNING could not find parameters for radius:2.275 > WARNING Setting params for radius :2.275 > WARNING could not find parameters for radius:2 > WARNING Setting params for radius :2 > WARNING could not find parameters for radius:1.7 > WARNING Setting params for radius :1.7 > WARNING could not find parameters for radius:2.175 > WARNING Setting params for radius :2.175 > WARNING could not find parameters for radius:2.06 > WARNING Setting params for radius :2.06 > WARNING could not find parameters for radius:1.77 > WARNING Setting params for radius :1.77 > WARNING could not find parameters for radius:1.9924 > WARNING Setting params for radius :1.9924 > WARNING could not find parameters for radius:1.8 > WARNING Setting params for radius :1.8 > WARNING Object "Nameless" was never used. See the > IMP::Object documentation for an explanation. > > > But when trying to use CG in conjunction with MC > ( mco.set_local_optimizer( cgo ) ), I still experience problems; > sometimes it seems to pole in an infinite loop, sometimes it > prematurely ends with this message : > > Traceback (most recent call last): > File "./fitWithRestraints.py", line 222, in <module> > opti.optimize(params["optimizer"]["maxSteps"]) > File "/usr/local/imp/lib64/python2.6/site-packages/IMP/ > __init__.py", line 1569, in optimize > def optimize(self, *args): return _IMP.Optimizer_optimize(self, > *args) > _IMP.UsageException: Attempt to use uninitialized vector. > > > Ultimately, I am wondering if trying to get CG working is not a > waste of time for me… I mean, maybe there are better approaches... > So, If you have any comment or general advice that I should check, > feel free to do so. > > --Ben > _______________________________________________ > IMP-users mailing list > IMP-users@salilab.org > https://salilab.org/mailman/listinfo/imp-users

Benjamin SCHWARZ

9:21 a.m.

New subject: CG and MC.set_local_optimizer(CG)

Hello Keren,

As you can see in the attached script (lines 168-210) I tested several configurations : FitRestraint alone or with an ExcludedVolumeRestraint; with or without weights (through a RestraintSet as suggested by Ben and Daniel).

Le 12 août 2010 à 17:40, Keren Lasker a écrit :

> Ben - > from the warning messages I see that you are using the FitRestraint. > Are you using any other restraints as well? > It will be useful to get your optimization script so we can take a look. > Keren. > On Aug 12, 2010, at 7:00 AM, Benjamin SCHWARZ wrote:

5254

Age (days ago)

5274

Last active (days ago)

List overview

Download

14 comments

4 participants

tags (0)

participants (4)

Ben Webb
Benjamin SCHWARZ
Daniel Russel
Keren Lasker