I found a bug in the SWIG wrappers which I would like to fix. Fixing it, though, exposes issues with various things being passed around improperly by functions that are part of the API in various modules.
In particular, the following have issues (proposed fixes shown):
- em2d::do_segmentation(): return the OpenCV matrix instead of passing by reference. As it is internally reference counted, there is no reason not to.
- em2d::Image::get_min_and_max_values(): return a FloatRange
- em2d::get_peak(): return a pair of values
- em2d::ProjectionParameters::get_keys(): return FloatKeys
- em::MapReaderWriter, all methods: I don't see any need to export them to Python anyway
- em::ImageHeader::get_{date,time,title}(): return std::string
- em::KerenParameters and em::RadiusDependentDistanceMask are pretty much a mess in terms of how they are passed and stored; there are plans to revamp it all, so just hiding the methods dealing with them seems OK for now.
- saxs::Score::fit_profile(): the profile argument can probably be passed by const reference or by value, but it is not entirely clear to me
- multifit::FFTFitting::get_wrapped_index() and get_wrapped_correlation_map(): can return an Ints
- multifit::fitting_clustering(): should probably just return the solutions
- plus a few others which had obvious, non-breaking fixes which I can make.
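To illustrate the general pattern behind these fixes with a hypothetical pure-Python sketch (not the actual IMP API): functions that fill in output parameters are awkward for SWIG to wrap, while functions that return their results map directly onto Python tuples, much like returning a FloatRange or a pair by value from C++.

```python
def get_min_and_max_values_outparam(values, result):
    # Out-parameter style: the caller must supply a mutable container,
    # which SWIG cannot map cleanly onto plain Python floats.
    result[0] = min(values)
    result[1] = max(values)

def get_min_and_max_values(values):
    # Return-by-value style: maps directly onto a Python tuple,
    # analogous to returning a FloatRange (or std::pair) from C++.
    return (min(values), max(values))

lo, hi = get_min_and_max_values([3.0, 1.0, 2.0])
```

The same reasoning applies to the reference-counted OpenCV matrix: returning it by value costs nothing and avoids the pass-by-reference wrapping problem.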
I'd like to commit the wrapper changes. To do this, I'd just hide the above functions from Python; they can be fixed and re-exposed as needed. Thoughts?
On 6/4/11 1:25 PM, Daniel Russel wrote: > I found a bug in the swig wrappers which I would like to fix. ...
All sounds good to me. But what are KerenParameters? ;)
Ben
KernelParameters ..... :) On Jun 4, 2011, at 1:33 PM, Ben Webb wrote:
> All sounds good to me. But what are KerenParameters? ;)
I'll fix the em2d part as soon as I can
On 06/04/2011 01:25 PM, Daniel Russel wrote:
> I found a bug in the swig wrappers which I would like to fix. ...
Is anyone currently using any of the listed methods from Python? The only one of the methods that is called from a test case or an example is the one in saxs, but Dina, as far as I know, doesn't even build the Python wrappers herself.
As a result, since it is a weekend, I checked in the fixed wrappers and hid the problematic methods, to test that the new typemaps pass all of our tests. But I am quite happy to back them out for a while and unhide anything that is currently being used, to give some time to fix them.
On Jun 5, 2011, at 11:36 AM, Javier Velazquez wrote:
> I'll fix the em2d part as soon as I can
>
> --
> Javier Velazquez
> Postdoc at Salilab, UCSF
> 1700 4th st. Byers Hall, office 503
Hi everyone,
I'm having a problem with the memory usage of the IMP::Model::evaluate(bool) method.
I'm trying to optimize a set of particles as done for the alpha-globin domain (I'm doing some testing on a larger chromatin domain) using a large set of restraints (over 10^6 restraints). The memory usage is OK until all the restraints are generated (~2GB), but it then increases to 100% of memory and swap (16GB + 16GB) when I first call m.evaluate(False) (to check the initial score), causing the script to stop. I understand that a large number of restraints are being evaluated, but I was wondering whether this is expected behavior (i.e. whether there is in effect a limit on the number of restraints per GB of memory) or whether there is a workaround, before trying to run the script on a (shared) machine with much more memory.
I'm using an old version of IMP (r7392) with the following optimizer: > # Set up optimizer > o = IMP.core.MonteCarlo() > o.set_return_best(True) > o.set_model(m) > fk = IMP.core.XYZ.get_xyz_keys() > mov = IMP.core.NormalMover(ps, fk, 0.25) > o.add_mover(mov) > lo = IMP.core.ConjugateGradients() > o.set_local_steps(lsteps) > o.set_local_optimizer(lo)
I've tried without the ConjugateGradients after reading this https://salilab.org/imp/bugs/show_bug.cgi?id=106, but I guess it does not depend on CG (I don't have ClosePairContainer in the current code).
Thanks and best regards, Davide
-- Davide Baù Structural Genomics Laboratory Bioinformatics & Genomics Department, Prince Felipe Research Center Avda. Autopista del Saler 16, 46012 Valencia, Spain Tel: +34 96 328 96 80 (ext. 1004) Fax: +34 96 328 97 01 email: dbau@cipf.es web: http://bioinfo.cipf.es/dbau/
*** http://www.saveaswwf.com ***
I don't think we have ever tried nearly that many restraints, so there may be some inefficiency that we haven't noticed. One possibility is building the dependency graph, which is used to figure out the relationships between restraints, particles, and score states. Another is the non-bonded list (or lists), if you have many packed particles. It can be a bit difficult to control their memory usage, and in fact there was a bug a while back with their memory usage increasing uncontrollably.
For tracking down problems, if you have a Mac where you can run things, there is a wonderful program there called "Instruments" which can track all memory allocations and deallocations and report where they were done. I would also suggest first updating to a more recent version of IMP, to see if it is the non-bonded bug.
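If no Mac is handy, a rough way to localize the growth from Python itself is to bracket the suspect call with peak-RSS measurements. A sketch using the standard `resource` module (Unix-only; ru_maxrss units differ by platform, and the list allocation below merely stands in for the expensive call such as m.evaluate(False)):

```python
import resource

def rss_mb():
    # ru_maxrss is reported in kilobytes on Linux (bytes on macOS);
    # this assumes Linux.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

before = rss_mb()
# Stand-in for the expensive call, e.g. m.evaluate(False):
data = [list(range(1000)) for _ in range(1000)]
after = rss_mb()
print("peak RSS grew by about %.1f MB" % (after - before))
```

Comparing the reading before and after the first evaluate call would at least say whether the blow-up happens during dependency-graph setup or during scoring itself.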
On Jun 7, 2011, at 7:52 AM, Davide Baù wrote:
I've never used Instruments, but I'm giving it a try now. I will also try to reduce the number of restraints using lists if possible. On a side note, I've updated to the latest trunk version (r9632). While it compiles fine on both Linux and Mac, I'm not able to compile local modules. With the scons option "local=True", I get the following error:
> Checking for ExampleLib with variables...(cached) no > Checking for ExampleLib with pkg-config...no > The lib argument must be given as a list. It was not for ExampleLib
On IMP version r7392, local modules compile with no errors.

Davide
On Jun 7, 2011, at 6:12 PM, Daniel Russel wrote:
> I don't think we have ever tried nearly that many restraints, so there may be some inefficiency that we haven't noticed. ...
One thing I forgot to ask yesterday was whether the blow up was at the time of the first evaluate call or was a gradual increase as optimization ran.
As for the compilation error: at some point there was a minor change in how external dependencies for modules need to be specified, in order to support certain libraries that need more than one lib linked at once. It looks like your module has the dummy external-library code that used to be in the example module, which predates the change. Just delete the references to Example in your module's SConscript (modules/mymodule/SConscript). If you have real external dependencies, make the libs specification a Python list, ["libname"], instead of "libname". Sorry about that.
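Concretely, the edit in modules/mymodule/SConscript looks something like the following fragment (the names here are illustrative; the list-vs-string change is the only point):

```python
# modules/mymodule/SConscript (illustrative fragment)
#
# Old form -- a bare string, now rejected with
# "The lib argument must be given as a list":
#     lib="mylib"
#
# New form -- a Python list, which also allows dependencies
# that need several libraries linked at once:
#     lib=["mylib", "mylib_extra"]
```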
On Jun 8, 2011, at 7:31 AM, Davide Baù wrote:
Thanks Daniel, the local module compilation went fine now.
The blow up was at the time of the first evaluate call (it does not get through the first evaluate call).
Davide
On Jun 8, 2011, at 5:49 PM, Daniel Russel wrote:
Given that the blowup is at the first evaluate, I'm guessing the culprit might be the dependency computations. I'll put together a test case and look around. So that I head in the right direction, what sort of restraints are you using?
On Jun 8, 2011, at 9:05 AM, Davide Baù <davide.bau@gmail.com> wrote:
I can reproduce the high memory usage. I'll look into what can be done. For the time being, you may want to look into whether you can combine various restraints into one, as that will help with memory usage as well as make your optimization run faster. For example, if you have many restraints on pairs of particles with similar operations, try to replace them with a container::PairsRestraint. This isn't always possible, though.
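A minimal pure-Python sketch of the idea (the class names here are illustrative, not the real IMP API): one restraint object holding a list of index pairs replaces many per-pair restraint objects, so the per-object bookkeeping overhead is paid once rather than 10^6 times, while the total score is unchanged.

```python
class HarmonicPairRestraint:
    """One object per pair: N of these means N sets of bookkeeping."""
    def __init__(self, i, j, x0, k):
        self.i, self.j, self.x0, self.k = i, j, x0, k

    def evaluate(self, coords):
        d = abs(coords[self.i] - coords[self.j])
        return 0.5 * self.k * (d - self.x0) ** 2

class HarmonicPairsRestraint:
    """One object for all pairs, in the spirit of container::PairsRestraint."""
    def __init__(self, pairs, x0, k):
        self.pairs, self.x0, self.k = pairs, x0, k

    def evaluate(self, coords):
        return sum(0.5 * self.k * (abs(coords[i] - coords[j]) - self.x0) ** 2
                   for i, j in self.pairs)

coords = [0.0, 1.0, 3.0]
pairs = [(0, 1), (1, 2)]
many = [HarmonicPairRestraint(i, j, 1.0, 2.0) for i, j in pairs]
one = HarmonicPairsRestraint(pairs, 1.0, 2.0)
```

Both evaluate to the same score; the second form stores only a flat list of pairs plus one set of parameters.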
On Jun 8, 2011, at 9:05 AM, Davide Baù wrote:
> Thanks Daniel, the local module compilation went fine now.
>
> The blow up was at the time of the first evaluate call (it does not get through the first evaluate call).
>
> Davide
>
> On Jun 8, 2011, at 5:49 PM, Daniel Russel wrote:
>
>> One thing I forgot to ask yesterday was whether the blow up was at the time of the first evaluate call, or was a gradual increase as optimization ran.
>>
>> As for the compilation error, at some point there was a minor change in how external dependencies for modules need to be specified, in order to be able to support certain libraries that need more than one lib linked at once. It looks like your module has the dummy external library code that used to be in the example module, and which predates the change. Just delete references to Example in your module's SConscript (modules/mymodule/SConscript). If you have real external dependencies, make the libs specification be a python list ["libname"], instead of "libname". Sorry about that.
>>
>> On Jun 8, 2011, at 7:31 AM, Davide Baù wrote:
>>
>>> I've never used Instruments but I'm giving it a try now. I will also try to reduce the number of restraints using lists if possible.
>>> On a side note, I've updated to the latest trunk version (r9632). While it compiles fine on both linux and mac, I'm not able to compile local modules. With the scons option "local=True", I get the following error:
>>>
>>>> Checking for ExampleLib with variables...(cached) no
>>>> Checking for ExampleLib with pkg-config...no
>>>> The lib argument must be given as a list. It was not for ExampleLib
>>>
>>> On IMP version r7392 local modules compile with no errors.
>>> Davide
>>>
>>> On Jun 7, 2011, at 6:12 PM, Daniel Russel wrote:
>>>> I don't think we have ever tried nearly that many restraints, so there may be some inefficiency that we haven't noticed. One possibility is building the dependency graph, which is used to figure out the relationship between restraints, particles and score states. Another is the non-bonded list or lists, if you have many packed particles. It can be a bit difficult to control their memory usage, and in fact there was a bug a bit back with their memory usage increasing uncontrollably.
>>>>
>>>> For tracking down problems, if you have a mac where you can run things, there is a wonderful program there called "Instruments" which can track all memory allocations and deallocations and report where they were done. I would also suggest first updating to a more recent version of IMP to see if it is the non-bonded bug.
>>>>
>>>> On Jun 7, 2011, at 7:52 AM, Davide Baù wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> I'm having a problem with the memory usage of the IMP::Model::evaluate(bool) method.
>>>>>
>>>>> I'm trying to optimize a set of particles as done for the alpha-globin domain (I'm doing some testing on a larger chromatin domain) using a large set of restraints (over 10^6 restraints).
>>>>> The memory usage is OK until all the restraints are generated (~2GB), and then increases to 100% of memory and swap (16GB + 16GB) when I first call m.evaluate(False) (to check the initial score), causing the script to stop. I understand that a large number of restraints are being evaluated, but I was wondering if this is expected behavior (i.e. if there is a sort of limitation on the number of restraints that can be implemented per GB of memory), or if there is a workaround, before trying to run the script on a (shared) machine with much more memory.
>>>>>
>>>>> I'm using an old version of IMP (r7392) with the following optimizer:
>>>>>> # Set up optimizer
>>>>>> o = IMP.core.MonteCarlo()
>>>>>> o.set_return_best(True)
>>>>>> o.set_model(m)
>>>>>> fk = IMP.core.XYZ.get_xyz_keys()
>>>>>> mov = IMP.core.NormalMover(ps, fk, 0.25)
>>>>>> o.add_mover(mov)
>>>>>> lo = IMP.core.ConjugateGradients()
>>>>>> o.set_local_steps(lsteps)
>>>>>> o.set_local_optimizer(lo)
>>>>>
>>>>> I've tried without the ConjugateGradients after reading this https://salilab.org/imp/bugs/show_bug.cgi?id=106, but I guess it does not depend on CG (I don't have ClosePairContainer in the current code).
>>>>>
>>>>> Thanks and best regards,
>>>>> Davide
>>>>>
>>>>> --
>>>>> Davide Baù
>>>>> Structural Genomics Laboratory
>>>>> Bioinformatics & Genomics Department, Prince Felipe Research Center
>>>>> Avda. Autopista del Saler 16, 46012 Valencia, Spain
>>>>> Tel: +34 96 328 96 80 (ext. 1004) Fax: +34 96 328 97 01
>>>>> email: dbau@cipf.es web: http://bioinfo.cipf.es/dbau/
>>>>>
>>>>> *** http://www.saveaswwf.com ***
>>>>>
>>>>> _______________________________________________
>>>>> IMP-dev mailing list
>>>>> IMP-dev@salilab.org
>>>>> https://salilab.org/mailman/listinfo/imp-dev
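The SConscript fix Daniel describes amounts to a one-line change in the module's build file. A sketch of the relevant fragment (the library names here are illustrative, not taken from the thread; the exact surrounding variables depend on the module's SConscript):

```python
# modules/mymodule/SConscript (hypothetical fragment)
# External dependencies must now be given as a Python list of libraries,
# so that dependencies needing more than one lib at once can be expressed.
#
# Old style, which now triggers "The lib argument must be given as a list":
#   libs = "somelib"
#
# New style:
libs = ["somelib"]            # a dependency with a single library
# libs = ["somelib", "z"]     # a dependency that needs two libs linked
```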
I will try to combine restraints and make use of container::PairsRestraint as you suggested. Besides harmonic (harmonic, lowerbound and upperbound) and excluded volume restraints, I use the ConnectivityRestraint. While I don't know if this has a strong effect on memory usage, it makes the computation longer (as somewhat expected).
Davide
On Jun 8, 2011, at 7:50 PM, Daniel Russel wrote:
> I can reproduce the high memory usage. I'll look into what can be done. For the time being, you may want to look into whether you can combine various restraints into one, as that will help with memory usage as well as make your optimization run faster. For example, if you have many restraints on pairs of particles with similar operations, try to replace them by a container::PairsRestraint. This isn't always possible though.
>
> On Jun 8, 2011, at 9:05 AM, Davide Baù wrote:
>
>> Thanks Daniel, the local module compilation went fine now.
>>
>> The blow up was at the time of the first evaluate call (it does not get through the first evaluate call).
>>
>> Davide
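Daniel's suggestion of folding many per-pair restraints into a single container::PairsRestraint would look roughly like the sketch below. This is an unrunnable illustration, not code from the thread: it assumes an IMP build of that era, and the particular pair score and parameter values are placeholders to be checked against the installed version.

```python
import IMP, IMP.core, IMP.container

m = IMP.Model()
# Assume ps holds XYZ-decorated particles and pair_list the pairs to restrain.

# Before: one Restraint object per pair -- over 10^6 C++ objects for this model.
#   for p0, p1 in pair_list:
#       m.add_restraint(IMP.core.DistanceRestraint(harmonic, p0, p1))

# After: one container holding all the pairs, and a single restraint that
# applies the same score to each of them.
pairs = IMP.container.ListPairContainer(pair_list)
score = IMP.core.HarmonicDistancePairScore(5.0, 1.0)  # (mean, k) -- illustrative values
m.add_restraint(IMP.container.PairsRestraint(score, pairs))
```

As Daniel notes, this only works when the combined pairs share the same functional form; pairs that need different parameters cannot all go through one score object this way.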
You may also want to think about whether you really need a connectivity restraint, as opposed to one of:
- container::ConnectedPairContainer, which handles the case where you have a set of particles which all need to be connected
- core::KClosePairsPairScore, which handles the case where you have two groups of particles that must be connected to one another
- container::MinimumPairRestraint, which handles the case where you have a number of pairs that could be connected and need to choose one
All of those are less general, and hence more efficient, than the ConnectivityRestraint.
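For the first of those alternatives, usage might look like the following sketch (again illustrative and unverified against that IMP revision; the constructor arguments and the choice of pair score are assumptions):

```python
import IMP, IMP.core, IMP.container

m = IMP.Model()
# Assume ps holds the particles that must all end up in one connected component.
sc = IMP.container.ListSingletonContainer(ps)

# ConnectedPairContainer maintains a set of pairs sufficient to connect the
# particles, so a pair score over it penalizes only those pairs, rather than
# the fully general ConnectivityRestraint machinery.
cpc = IMP.container.ConnectedPairContainer(m, sc, 0.0)  # last arg: slack (assumed)
m.add_restraint(IMP.container.PairsRestraint(
    IMP.core.HarmonicSphereDistancePairScore(0.0, 1.0), cpc))
```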
On Jun 8, 2011, at 1:25 PM, Davide Baù wrote:
> I will try to combine restraints and make use of container::PairsRestraint as you suggested.
> Besides harmonic (harmonic, lowerbound and upperbound) and excluded volume restraints, I use the ConnectivityRestraint. While I don't know if this has a strong effect on memory usage, it makes the computation longer (as somewhat expected).
>
> Davide
participants (6)
- Ben Webb
- Daniel Russel
- Davide Baù
- Davide Baù
- Javier Velazquez
- Keren Lasker