Hello Daniel,

I think the main problem is the roughness of the scoring function caused by the excluded volume terms (as I had cryptically mentioned before). Your scoring function looks something like EM + sum over many pairs, where EM goes from 0 to 1 and each of the terms being summed over also goes from, more or less, 0 to 1. So the EM score is swamped by the excluded volume terms as soon as there is any overlap. And MC just moves things randomly, hoping for a good solution, so it will take a while to find anything good.

It makes perfect sense…
So, I understand that different restraints may have different scoring schemes and ranges (e.g. [0,1] for FittingRestraint and [0,+M] for ExcludedVolumeRestraint, with M potentially very high)… And now, what to do with it? :)
There seems to be no "pythonic" way to modify the constants in front of the various restraints, or to compose a restraint with a function to modulate its score. My guess is: go to C++ and create a new restraint by composing existing restraints… Am I correct?
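
To make the scaling issue concrete, here is a toy sketch in plain Python. It is not IMP code: the function names, the `ev_weight` parameter, and the numbers are all made up for illustration, and the fitting term is simply assumed to be one minus a cross-correlation. It only shows how a sum over many pair terms hides a [0,1] term, and how a constant in front of the sum would change that.

```python
# Toy illustration only: none of these names are IMP API.

def fitting_score(cc):
    """EM-like term, assumed here to be 1 - cross-correlation, so in [0, 1]."""
    return 1.0 - cc

def excluded_volume_score(overlaps):
    """Sum of per-pair overlap penalties, each roughly in [0, 1]."""
    return sum(overlaps)

def total_score(cc, overlaps, ev_weight=1.0):
    """Weighted sum: ev_weight is the constant in front of the pair terms."""
    return fitting_score(cc) + ev_weight * excluded_volume_score(overlaps)

def em_fraction(cc, overlaps, ev_weight=1.0):
    """Fraction of the total score that comes from the EM-like term."""
    return fitting_score(cc) / total_score(cc, overlaps, ev_weight)

# Hypothetical situation: a poor fit (cc = 0.1) and 200 overlapping pairs.
overlaps = [0.5] * 200
unweighted = em_fraction(0.1, overlaps)          # EM term is a tiny share
downweighted = em_fraction(0.1, overlaps, 0.01)  # EM term is comparable
print(unweighted, downweighted)
```

With the weight left at 1, the optimizer sees almost nothing but the overlap penalty; scaling the pair sum down brings the two terms to the same order of magnitude.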

In addition, if you use Conjugate Gradients, the derivatives of the excluded volume score on atoms are pretty useless since they are computed using each atom ball (an explanation of the problem would probably benefit from a picture; I'll work on that :-).

I am afraid I need a picture... and a lot more explanation of IMP's internal computations ;)
So don't spend too much time on this unless you think it can benefit others.

My suggestions are
- don't use a full atomic representation for your proteins (e.g., simplify them using IMP.atom.create_simplified_along_backbone()). This will smooth out the excluded volume scoring function and make scoring faster.
I'll try that.

- use Conjugate Gradients in conjunction with MC (create a conjugate gradients optimizer and add it to the MC one using mc.set_local_optimizer()). This will then perform local minimization after each MC step.
I missed the set_local_optimizer() trick… I'll have a closer look at that.

- when using Monte Carlo for optimization (as opposed to sampling), you should call set_return_best(True) on the object so that it saves the best state it finds, rather than just returning the last state accepted (which may have a higher score).

I missed that too...
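
For my own understanding, the last two suggestions can be sketched as a toy in plain Python. This is not IMP code: the 1-D score, the gradient-descent helper standing in for Conjugate Gradients, and every parameter name are assumptions made for illustration. Each random move is followed by a local minimization, and the `return_best` flag chooses between returning the best state seen and the last state accepted.

```python
import math
import random

def score(x):
    """Rough 1-D landscape: a broad well at x = 2 with fast ripples on top."""
    return (x - 2.0) ** 2 + 0.5 * math.sin(25.0 * x)

def local_minimize(x, steps=50, lr=0.002, h=1e-5):
    """Plain gradient descent with a finite-difference derivative.

    Stand-in for a real local optimizer such as Conjugate Gradients.
    """
    for _ in range(steps):
        grad = (score(x + h) - score(x - h)) / (2.0 * h)
        x -= lr * grad
    return x

def monte_carlo(x0, n_steps=200, max_move=1.0, kt=0.5,
                return_best=True, seed=0):
    """Metropolis Monte Carlo with a local minimization after each move."""
    rng = random.Random(seed)
    current, current_score = x0, score(x0)
    best, best_score = current, current_score
    for _ in range(n_steps):
        # Random move, then relax the trial state locally.
        trial = local_minimize(current + rng.uniform(-max_move, max_move))
        trial_score = score(trial)
        delta = trial_score - current_score
        # Metropolis criterion: always accept downhill, sometimes uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / kt):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = current, current_score
    if return_best:
        return best, best_score
    return current, current_score

best_x, best_s = monte_carlo(-3.0, return_best=True)
last_x, last_s = monte_carlo(-3.0, return_best=False)
print(best_s, last_s)
```

Because uphill moves are sometimes accepted, the last accepted state can score worse than the best one visited, which is why keeping the best state matters when the goal is optimization rather than sampling.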

Thanks a lot for your answer, I will test that ASAP

Dr. Benjamin SCHWARZ
Biocomputing group
 Voice : +33 (0)3 68 85 47 30
 FAX : +33 (0)3 68 85 47 18



Structural Biology & Genomics Dept. - IGBMC 
1 rue Laurent Fries
BP 10142
F - 67404 Illkirch CEDEX