performance of Connectivity restraint
hi all,
I am writing a test case for adding hierarchy support in DOMINO, and it seems that the connectivity restraint works incredibly sssllllooowww between two proteins each of ~100 residues.
    i=0
    ub = IMP.core.HarmonicUpperBound(1.0, 0.1)
    ss= IMP.core.DistancePairScore(ub)
    r= IMP.core.ConnectivityRestraint(ss)
    ps = IMP.Particles()
    ps_refined=[]
    for j in xrange(2):
        ps_refined.append(IMP.core.hierarchy_get_leaves(self.h_particles[i+j]))
        ps.append(self.particles[i+j])
    for e in ps_refined:
        r.add_particles(e)

    beg = time.time()
    r.evaluate(None)
    end = time.time()
    dt = end - beg
    print 'connectivity restraint calculation took %9.6f Seconds' % (dt)
The evaluate function takes ~20 seconds.
Am I missing something here? Is there a faster implementation using some external library?
thanks, Keren.
Just for the record, Ben suggested compiling with release=true, which reduced the running time to ~7 seconds. I still think we should consider a faster implementation, perhaps using geometric hashing to query close particles in space.
On Jan 26, 2009, at 9:43 PM, Keren Lasker wrote:
> hi all,
>
> I am writing a test case for adding hierarchy support in DOMINO, and it seems that the connectivity restraint works incredibly sssllllooowww between two proteins each of ~100 residues.
>
> _______________________________________________
> IMP-dev mailing list
> IMP-dev@salilab.org
> https://salilab.org/mailman/listinfo/imp-dev
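The geometric hashing idea mentioned above can be sketched in plain Python 3, independent of IMP. This is only an illustration under assumptions: the function names are made up, points are bare (x, y, z) tuples, and the cell size is set equal to the cutoff so that any two points within the cutoff lie in the same or in adjacent cells.

```python
import itertools
import math
import random
from collections import defaultdict


def close_pairs_grid(points, cutoff):
    """Return the set of index pairs (i, j), i < j, with distance <= cutoff."""
    # Hash every point into a cubic cell whose edge length equals the cutoff.
    cells = defaultdict(list)
    for idx, p in enumerate(points):
        cells[tuple(math.floor(c / cutoff) for c in p)].append(idx)

    offsets = list(itertools.product((-1, 0, 1), repeat=3))
    pairs = set()
    for key, members in cells.items():
        # Only the 27 surrounding cells can hold a partner within the cutoff.
        for off in offsets:
            neighbours = cells.get(tuple(k + o for k, o in zip(key, off)), ())
            for i in members:
                for j in neighbours:
                    if i < j and math.dist(points[i], points[j]) <= cutoff:
                        pairs.add((i, j))
    return pairs


def close_pairs_brute(points, cutoff):
    """Reference implementation that tests every pair."""
    return {
        (i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
        if math.dist(points[i], points[j]) <= cutoff
    }


if __name__ == "__main__":
    random.seed(1)
    pts = [tuple(random.uniform(0.0, 20.0) for _ in range(3)) for _ in range(500)]
    found = close_pairs_grid(pts, 2.0)
    print(len(found), "close pairs out of", len(pts) * (len(pts) - 1) // 2)
```

The grid only pays off when the cutoff is small relative to the extent of the system; with a cutoff comparable to the whole structure, almost every pair is "close" and the work is quadratic again.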
Also make sure that NDEBUG is defined. With the current svn it is defined when you build with build=fast; I am not sure whether it was for release or not.
Also, run it with gprof to see where the time is being spent: is it in the harmonic evaluation, or in the actual shortest-paths computation?
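gprof works on the compiled C++ side. As a rough Python-level analogue, the sketch below uses the standard cProfile module on a toy model of the restraint, to show how a profile separates the pair-scoring step from the graph step. Everything in it is an assumption made for illustration: the harmonic upper bound is a toy stand-in, the graph step is a minimum spanning tree built with Prim's algorithm, and none of the function names come from IMP.

```python
import cProfile
import io
import math
import pstats
import random


def harmonic_upper_bound(d, mean=1.0, k=0.1):
    # Assumed toy form: no penalty up to `mean`, quadratic penalty beyond it.
    return 0.5 * k * (d - mean) ** 2 if d > mean else 0.0


def score_matrix(points):
    # Quadratic step: score every pair of points by their distance.
    return [[harmonic_upper_bound(math.dist(p, q)) for q in points] for p in points]


def tree_score(scores):
    # Graph step: Prim's algorithm on the dense score matrix.
    n = len(scores)
    if n == 0:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v] and scores[u][v] < best[v]:
                best[v] = scores[u][v]
    return total


def evaluate(points):
    return tree_score(score_matrix(points))


def profile_report(func, *args):
    """Run func under cProfile; return (result, text report by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats()
    return result, stream.getvalue()


if __name__ == "__main__":
    random.seed(0)
    pts = [tuple(random.uniform(0.0, 30.0) for _ in range(3)) for _ in range(300)]
    _, report = profile_report(evaluate, pts)
    print(report)
```

In the report, the cumulative times of `score_matrix` and `tree_score` show which of the two steps dominates for a given number of points.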
On Jan 27, 2009, at 7:20 PM, Keren Lasker wrote:
> just for the records,
> Ben suggested compiling with release=true, which reduced the running time to ~7 seconds.
> I still think we should consider a faster implementation. Maybe use geometric hashing to query close particles in space.
I forgot to say, doing set_imp_check_level(NONE) is almost as good as NDEBUG.
How many particles are you using? I get about 1 s for 1000 particles, which, given that you have to compute a million distances, isn't too bad.
    import IMP
    import IMP.core
    import time

    i=0
    m= IMP.Model()
    ub = IMP.core.HarmonicUpperBound(1.0, 0.1)
    ss= IMP.core.DistancePairScore(ub)
    ps = IMP.core.create_xyzr_particles(m, 1000, .1)
    r= IMP.core.ConnectivityRestraint(ss)
    r.set_particles(ps)
    m.add_restraint(r)

    beg = time.time()
    m.evaluate(None)
    end = time.time()
    dt = end - beg
    print 'connectivity restraint calculation took %9.6f Seconds' % (dt)
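For scale, the pair counts behind these timings can be tallied directly. This is a minimal sketch in plain Python 3; the helper names are made up for illustration and are not IMP functions, and it assumes only that the restraint has to consider every pair of particles.

```python
def unordered_pairs(n):
    """Number of distinct pairs {i, j} among n particles."""
    return n * (n - 1) // 2


def ordered_pairs(n):
    """Number of (i, j) combinations with i != j."""
    return n * (n - 1)


def cross_pairs(n_a, n_b):
    """Number of pairs with one particle from each of two groups."""
    return n_a * n_b


if __name__ == "__main__":
    print(ordered_pairs(1000))    # 999000, roughly the "million distances"
    print(unordered_pairs(1000))  # 499500
    print(unordered_pairs(200))   # 19900, two proteins of ~100 residues pooled
    print(cross_pairs(100, 100))  # 10000, pairs between the two proteins only
```

On these counts, the two-protein case involves roughly 25 to 50 times fewer distances than the 1000-particle benchmark.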
It appears that all of the time is spent evaluating the harmonic when the number of particles is large (say 1000). This makes sense, as the code was designed more for Frank's case, where there are only a few proteins of each type (and so the only quadratic steps were computing all pairs between the particles representing a pair of proteins in the LowestRefinedPairScore).
One option would be to use a ClosePairsScoreState and change the connectivity restraint to get its pairs from that. The connectivity restraint would then only search through edges defined by pairs in a PairContainer, and you can fill that however you want. If you are lazy, you just use an AllPairsPairContainer and replicate the current behavior (modulo particle refiners, which would have to be worked in another way). Comments?
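The proposal can be sketched outside IMP as a connectivity score that takes its candidate edges from whatever pair source it is given. This is a hypothetical illustration in plain Python 3, not the IMP implementation: `all_pairs` and `connectivity_score` are made-up names, the pair score is a toy harmonic upper bound, and the tree is built with Kruskal's algorithm and a union-find structure.

```python
import itertools
import math


def all_pairs(n):
    """The "lazy" pair source: every (i, j) with i < j."""
    return itertools.combinations(range(n), 2)


def upper_bound_score(a, b, mean=1.0, k=0.1):
    # Assumed toy pair score: harmonic penalty once the distance exceeds `mean`.
    d = math.dist(a, b)
    return 0.5 * k * (d - mean) ** 2 if d > mean else 0.0


def connectivity_score(points, pairs, pair_score=upper_bound_score):
    """Minimum spanning tree score using only the supplied pairs.

    Returns (total_score, connected); `connected` is False when the
    supplied pairs cannot link all points into one tree.
    """
    edges = sorted((pair_score(points[i], points[j]), i, j) for i, j in pairs)
    parent = list(range(len(points)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, joined = 0.0, 0
    for score, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            total += score
            joined += 1
    return total, joined == max(len(points) - 1, 0)


if __name__ == "__main__":
    chain = [(2.0 * i, 0.0, 0.0) for i in range(5)]
    print(connectivity_score(chain, all_pairs(len(chain))))
    print(connectivity_score(chain, [(0, 1), (1, 2), (2, 3), (3, 4)]))
```

When the pair score does not decrease with distance and the close pairs already connect all the points, the restricted tree scores the same as the all-pairs tree. If the close pairs leave the points disconnected, the caller has to decide how to handle it, for example by enlarging the cutoff.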
hi,
The recent changes still do not compile.
g++ -o modules/em/src/ImageEM.os -c -Wall -fvisibility=hidden -fPIC -D_USE_MATH_DEFINES -DIMPEM_EXPORTS -DGCC_VISIBILITY -Ibuild/include -I/viola1/home/kerenl/bioinformatics/projects/embed/src/emlib modules/em/src/ImageEM.cpp
g++ -o modules/em/src/ImageHeader.os -c -Wall -fvisibility=hidden -fPIC -D_USE_MATH_DEFINES -DIMPEM_EXPORTS -DGCC_VISIBILITY -Ibuild/include -I/viola1/home/kerenl/bioinformatics/projects/embed/src/emlib modules/em/src/ImageHeader.cpp
modules/em/src/ImageHeader.cpp: In member function 'bool IMP::em::ImageHeader::read(std::ifstream&, bool, bool, bool)':
modules/em/src/ImageHeader.cpp:123: error: 'reversed_read' was not declared in this scope
modules/em/src/ImageHeader.cpp:242: error: 'reversed_read' was not declared in this scope
modules/em/src/ImageHeader.cpp: In member function 'void IMP::em::ImageHeader::write(std::ofstream&, bool)':
modules/em/src/ImageHeader.cpp:282: error: 'reversed_write' was not declared in this scope
build/include/IMP/algebra/MultiArray.h: In function 'bool IMP::algebra::roll_inds(T1&, T2*, T3*) [with T1 = std::vector<int, std::allocator<int> >, T2 = const boost::multi_array_types::size_type, T3 = const boost::multi_array_types::index]':
build/include/IMP/algebra/MultiArray.h:185: instantiated from 'IMP::algebra::MultiArray<T, D>& IMP::algebra::MultiArray<T, D>::operator=(const IMP::algebra::MultiArray<T, D>&) [with T = double, int D = 2]'
build/include/IMP/algebra/MultiArray.h:92: instantiated from 'IMP::algebra::MultiArray<T, D>::MultiArray(const IMP::algebra::MultiArray<T, D>&) [with T = double, int D = 2]'
build/include/IMP/algebra/Matrix2D.h:21: instantiated from here
build/include/IMP/algebra/MultiArray.h:50: warning: comparison between signed and unsigned integer expressions
scons: *** [modules/em/src/ImageHeader.os] Error 1
Javi/Daniel, is it OK on your machines? It does not compile on viola with the latest SVN version.
thanks, Keren.
participants (2)
- Daniel Russel
- Keren Lasker