Questions about example script local_fitting.py
hello list,
I started my IMP apprenticeship by trying to run and understand the example script local_fitting.py. I spent some time rewriting it and parsing the documentation, and I come to you now with remaining questions. I am pretty new to the CryoEM field, so I will probably ask naïve questions. If some of these questions can be answered by reading a manual, please tell me which manual and I will be glad to read it.
1.) As a first remark concerning this script, the module IMP.algebra is used throughout the script, though never imported. Funnily, it puzzles my PyDev plugin but does not appear to lead to execution problems.
2.) I have problems with regard to the resolution parameter and its apparent absence in density maps:
at line 30 the header of a loaded density map is modified.
dmap.get_header_writable().set_resolution(resolution)
I thought the idea was to fix a possibly distinct resolution value for the resampled map; so I commented out the line and replaced it by
sampled_input_density.get_header_writable().set_resolution(resolution)
at line 36… Which resulted in an error during the local fitting process.
I thus checked whether the map header contained a resolution or not:
>>> dmap.get_header().get_has_resolution()
False
>>> dmap.get_header().get_resolution()
nan
Does the resolution never appear in map files?
What is the incidence/relevance of "arbitrarily" setting this parameter such as it is apparently done in this example?
3.) I don't fully understand the way objects share elements, and what happens when these elements are moved.
For instance, the three objects:
line 19 : mh=IMP.atom.read_pdb(IMP.em.get_example_path("input.pdb"),m,sel)
line 24 : ps= IMP.Particles(IMP.core.get_leaves(mh))
line 58 : prot_rb = IMP.atom.setup_as_rigid_body(mh)
seem to share atoms, since when moving:
line 60 : IMP.core.transform(prot_rb,local_trans)
it looks like every atom in mh and in ps has also been moved.
I thus have the feeling that all three objects share references on the same atoms.
If this is true, I don't understand what happens when iteratively applying transformations to prot_rb at the end of the script:
line 100 : prot_rb.set_transformation(fitting_sols.get_transformation(i))
line 102 : rmsd=IMP.atom.get_rmsd(IMP.core.XYZs(ps),IMP.core.XYZs(IMP.core.get_leaves(mh_ref)))
line 103 : IMP.atom.write_pdb(mh,"temp_"+str(i)+".pdb")
Since the transformations are applied iteratively, I would expect them to be composed; hence, parsing from 1 to fitting_sols.get_number_of_solutions() in this order or in reverse order would not lead to the same result. I checked this behavior, and the solutions are (happily) the same whatever the order in which they are parsed... and I can't guess why…
4.) Is there a difference between prot_rb.set_transformation and IMP.core.transform ?
5.) The signature of the function IMP::em::local_rigid_fitting is not the same in the documentation and in the script. In the documentation, it has a (FittingSolutions & fr) argument whereas, according to the script, this in fact seems to be the return type.
6.) I could not find the documentation for the class FittingSolutions
Thanks for any hint
--Ben
> I started my IMP apprenticeship by trying to run and understand the example script local_fitting.py. I spent some time rewriting it and parsing the documentation, and I come to you now with remaining questions. I am pretty new to the CryoEM field, so I will probably ask naïve questions. If some of these questions can be answered by reading a manual, please tell me which manual and I will be glad to read it.
I'm not very familiar with the rigid fitting and em code, so I will just answer the general IMP questions. I've cced Keren who wrote that code in case she is not on this list.
> 1.) As a first remark concerning this script, the module IMP.algebra is used throughout the script, though never imported. Funnily, it puzzles my PyDev plugin but does not appear to lead to execution problems.
Now that you mention it, it is puzzling. Various other modules depend on IMP.algebra and the result is that it is imported when you import, say, IMP.core. But why this is, I don't really understand.
> 2.) I have problems with regard to the resolution parameter and its apparent absence in density maps:
> at line 30 the header of a loaded density map is modified.
> dmap.get_header_writable().set_resolution(resolution)
> I thought the idea was to fix a possibly distinct resolution value for the resampled map; so I commented out the line and replaced it by
> sampled_input_density.get_header_writable().set_resolution(resolution)
> at line 36… Which resulted in an error during the local fitting process.
> I thus checked whether the map header contained a resolution or not:
> >>> dmap.get_header().get_has_resolution()
> False
> >>> dmap.get_header().get_resolution()
> nan
> Does the resolution never appear in map files?
> What is the incidence/relevance of "arbitrarily" setting this parameter such as it is apparently done in this example?
>
> 3.) I don't fully understand the way objects share elements, and what happens when these elements are moved.
> For instance, the three objects:
> line 19 : mh=IMP.atom.read_pdb(IMP.em.get_example_path("input.pdb"),m,sel)
> line 24 : ps= IMP.Particles(IMP.core.get_leaves(mh))
> line 58 : prot_rb = IMP.atom.setup_as_rigid_body(mh)
> seem to share atoms, since when moving:
> line 60 : IMP.core.transform(prot_rb,local_trans)
> it looks like every atom in mh and in ps has also been moved.
>
> I thus have the feeling that all three objects share references on the same atoms.
Yes. Particles, Restraints and other things inheriting from IMP.Object are not copied (or copyable). So get_X methods return a shared object when it is an IMP.Object.
> If this is true, I don't understand what happens when iteratively applying transformations to prot_rb at the end of the script:
> line 100 : prot_rb.set_transformation(fitting_sols.get_transformation(i))
> line 102 : rmsd=IMP.atom.get_rmsd(IMP.core.XYZs(ps),IMP.core.XYZs(IMP.core.get_leaves(mh_ref)))
> line 103 : IMP.atom.write_pdb(mh,"temp_"+str(i)+".pdb")
> Since the transformations are applied iteratively, I would expect them to be composed; hence, parsing from 1 to fitting_sols.get_number_of_solutions() in this order or in reverse order would not lead to the same result. I checked this behavior, and the solutions are (happily) the same whatever the order in which they are parsed... and I can't guess why…
See below.
> 4.) Is there a difference between prot_rb.set_transformation and IMP.core.transform ?
Yes, and this has been a constant source of confusion, so a way of simplifying it would be nice. The current orientation of a rigid body is defined by a transformation (equivalent to the x,y,z coordinates of a simple point). set_transformation() replaces this transformation with another one (equivalent to set_coordinates() on an IMP.core.XYZ). transform() transforms a rigid body, that is, it composes the previous transformation with the supplied one and calls set_transformation() with that (equivalent to transforming the coordinates of an IMP.core.XYZ).
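The replace-versus-compose distinction above, and the reason the order in which the fitting solutions are visited does not matter, can be illustrated with a toy model. The sketch below is plain Python in 2D and is not IMP code: Transform2D and ToyRigidBody are hypothetical stand-ins that only mimic the two behaviours described (set_transformation() overwrites the placement, transform() composes with it).

```python
import math


class Transform2D:
    """Rotation by `angle` (radians) followed by a translation (tx, ty)."""

    def __init__(self, angle=0.0, tx=0.0, ty=0.0):
        self.angle, self.tx, self.ty = angle, tx, ty

    def apply(self, point):
        x, y = point
        c, s = math.cos(self.angle), math.sin(self.angle)
        return (c * x - s * y + self.tx, s * x + c * y + self.ty)

    def compose(self, other):
        """Transform equivalent to applying `other` first, then `self`."""
        tx, ty = self.apply((other.tx, other.ty))
        return Transform2D(self.angle + other.angle, tx, ty)


class ToyRigidBody:
    """Hypothetical stand-in for a rigid body: members are stored in a local
    frame, and a single transformation places that frame in space."""

    def __init__(self, local_points):
        self.local_points = list(local_points)
        self.frame = Transform2D()  # identity placement

    def set_transformation(self, t):
        self.frame = t  # replace: the previous placement is discarded

    def transform(self, t):
        self.frame = t.compose(self.frame)  # compose with the current placement

    def global_points(self):
        return [self.frame.apply(p) for p in self.local_points]


solutions = [
    Transform2D(0.3, 1.0, 0.0),
    Transform2D(-1.2, 0.0, 2.0),
    Transform2D(2.0, -3.0, 1.0),
]


def visit(order, method):
    """Visit the solutions in `order`; record the body placement after each."""
    body = ToyRigidBody([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
    seen = {}
    for i in order:
        getattr(body, method)(solutions[i])
        seen[i] = body.global_points()
    return seen


set_forward = visit([0, 1, 2], "set_transformation")
set_backward = visit([2, 1, 0], "set_transformation")
move_forward = visit([0, 1, 2], "transform")
move_backward = visit([2, 1, 0], "transform")

# Replacing the placement makes each solution independent of the visiting order.
print(set_forward == set_backward)
# Composing makes the placement depend on everything applied before.
print(move_forward[2] == move_backward[2])
```

With set_transformation() each solution is an absolute placement, so the loop at the end of the script can visit the solutions in any order; had the loop used transform(), each placement would have been stacked on top of the previous ones.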
yes - I was not part of this mailing list. Please see below for answers to the rest of the questions.
On Jul 19, 2010, at 8:04 AM, Daniel Russel wrote:
>> 2.) I have problems with regard to the resolution parameter and its apparent absence in density maps:
>> at line 30 the header of a loaded density map is modified.
>> dmap.get_header_writable().set_resolution(resolution)
>> I thought the idea was to fix a possibly distinct resolution value for the resampled map; so I commented out the line and replaced it by
>> sampled_input_density.get_header_writable().set_resolution(resolution)
>> at line 36… Which resulted in an error during the local fitting process.
>> I thus checked whether the map header contained a resolution or not:
>> >>> dmap.get_header().get_has_resolution()
>> False
>> >>> dmap.get_header().get_resolution()
>> nan
>> Does the resolution never appear in map files?
Resolution is not stored in the density files.
>> What is the incidence/relevance of "arbitrarily" setting this parameter such as it is apparently done in this example?
Setting a wrong resolution will lead to wrong fitting results. In the next steps of the procedure we simulate the template protein at the resolution of the map and apply a cross-correlation measure to compare the two. If the smoothed template differs significantly from the density map, we will not be able to get correct solutions.
>> Does the resolution never appear in map files?
> Resolution is not stored in the density files.
>> What is the incidence/relevance of "arbitrarily" setting this parameter such as it is apparently done in this example?
> Setting a wrong resolution will lead to wrong fitting results. In the next steps of the procedure we simulate the template protein at the resolution of the map and apply a cross-correlation measure to compare the two. If the smoothed template differs significantly from the density map, we will not be able to get correct solutions.
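The effect described in the quoted answer can be reproduced on a toy problem. The sketch below is a one-dimensional, pure-Python illustration and not the IMP.em implementation: simulate_density and cross_correlation are hypothetical helpers, the atom positions are made up, and the conversion from resolution to Gaussian width (resolution treated as the full width at half maximum) is an assumption chosen only for the demonstration.

```python
import math


def simulate_density(atom_positions, resolution, grid):
    """Blur point atoms into a 1D density profile.

    ASSUMPTION: the resolution is treated as the FWHM of a Gaussian kernel;
    real packages use their own conventions for this conversion.
    """
    sigma = resolution / 2.355
    return [
        sum(math.exp(-0.5 * ((x - a) / sigma) ** 2) for a in atom_positions)
        for x in grid
    ]


def cross_correlation(u, v):
    """Normalized (Pearson) cross-correlation of two equally sampled profiles."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    num = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    den = math.sqrt(
        sum((a - mean_u) ** 2 for a in u) * sum((b - mean_v) ** 2 for b in v)
    )
    return num / den


grid = [float(i) for i in range(-40, 81)]         # 1 Å sampling
atoms = [0.0, 4.0, 9.0, 20.0, 24.0, 31.0, 38.0]   # made-up atom positions (Å)

# Stand-in for the experimental map: the same atoms blurred at 8 Å.
true_resolution = 8.0
experimental_map = simulate_density(atoms, true_resolution, grid)

# Score the correctly placed template when simulated at several resolutions.
scores = {
    res: cross_correlation(simulate_density(atoms, res, grid), experimental_map)
    for res in (4.0, 8.0, 16.0, 30.0)
}
for res in sorted(scores):
    print(res, round(scores[res], 3))
```

Even though the template sits at exactly the right place, its score only reaches 1 when it is simulated at the resolution of the map; with a wrong resolution in the header the correct placement is scored lower, which makes it harder to tell apart from incorrect ones.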
If I get it right: although this parameter is critical for computations, the resolution is not stored in density files. Hence you have to remember the resolution of your map, so that you can fill in this parameter in the map header when you load it in IMP.
In fact, I think I don't really understand the notion of resolution in Cryo EM, or maybe it is just in IMP. I have an intuition from my experience with X-ray structures, where resolution has something to do with the precision and size of the data used for the density map generation, and ultimately represents some kind of a "level of detail".
In order to better understand the two notions, I used IMP.em.SampledDensityMap and IMP.em.write_map to generate density maps of a structure with different values for voxel size and resolution. It appears that increasing the resolution value of a map (while keeping the size of a voxel unchanged) indeed blurs the information and "potatoes" isodensities, which is fine; but for some reason it also affects the data size: the bigger the resolution, the bigger the map size (in angstrom as well as in number of voxels). Is this normal behavior?
--Ben
On Wed, Jul 21, 2010 at 11:46 AM, Benjamin SCHWARZ schwarz.ben@gmail.com wrote:
> If I get it right: although this parameter is critical for computations, the resolution is not stored in density files. Hence you have to remember the resolution of your map, so that you can fill in this parameter in the map header when you load it in IMP.
>
> In fact, I think I don't really understand the notion of resolution in Cryo EM, or maybe it is just in IMP. I have an intuition from my experience with X-ray structures, where resolution has something to do with the precision and size of the data used for the density map generation, and ultimately represents some kind of a "level of detail".
>
> In order to better understand the two notions, I used IMP.em.SampledDensityMap and IMP.em.write_map to generate density maps of a structure with different values for voxel size and resolution. It appears that increasing the resolution value of a map (while keeping the size of a voxel unchanged) indeed blurs the information and "potatoes" isodensities, which is fine; but for some reason it also affects the data size: the bigger the resolution, the bigger the map size (in angstrom as well as in number of voxels). Is this normal behavior?
just to clarify: pixel size and resolution of an EM map are not necessarily correlated - but for computational efficiency one should choose the pixel (or voxel) size according to the resolution. the best resolution an EM map can carry is 2 * voxelsize (the Nyquist limit). however, most people will oversample to avoid signal loss as a result of interpolations, so the resolution should be 3-4 * voxelsize, i.e. the voxelsize should be about 1/3 to 1/4 of the resolution. if the voxelsize of your map is ridiculously small compared to the resolution you can resample your map in dedicated image processing software (spider, eman, tom-toolbox), typically using trilinear interpolation.
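As a worked example of this rule of thumb, the arithmetic can be written down in a few lines of Python. This is only a sketch: the function names are hypothetical, and it assumes the usual Nyquist relationship (the finest resolution a sampled map can represent is twice its voxel size) together with an oversampling factor of 3 to 4 voxels per resolution length.

```python
def nyquist_resolution(voxel_size):
    """Finest resolution (Å) that a map sampled at `voxel_size` (Å) can carry."""
    return 2.0 * voxel_size


def suggested_voxel_size(resolution, oversampling=3.0):
    """Voxel size (Å) for a map of the given resolution (Å).

    ASSUMPTION: `oversampling` is the number of voxels per resolution length;
    values between 3 and 4 follow the rule of thumb discussed in this thread.
    """
    return resolution / oversampling


# A 20 Å map would be sampled at roughly 5 to 6.7 Å per voxel.
coarse = suggested_voxel_size(20.0, oversampling=3.0)
fine = suggested_voxel_size(20.0, oversampling=4.0)
print(round(fine, 2), round(coarse, 2))

# A map sampled at 5 Å per voxel cannot represent detail finer than 10 Å.
print(nyquist_resolution(5.0))
```

A voxel size far below resolution / 4 adds voxels without adding information, which is the case where resampling the map becomes worthwhile.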
cheers
frido
Like Daniel, I will leave the EM-specific parts to one of our EM people...
On 7/19/10 7:14 AM, Benjamin SCHWARZ wrote:
> 1.) As a first remark concerning this script, the module IMP.algebra is used throughout the script, though never imported. Funnily, it puzzles my PyDev plugin but does not appear to lead to execution problems.
Python has a global dictionary called sys.modules which lists all modules that have been loaded. So once you've loaded a module, you can use it for the rest of the lifetime of the interpreter. While IMP.algebra isn't loaded in the script itself, it is loaded by one of the modules that the script loads. I agree it would be clearer if the script loaded the module itself though.
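This mechanism can be reproduced without IMP. The sketch below builds a throwaway package in a temporary directory; demo_pkg, its algebra and core submodules, and the norm2 function are hypothetical names invented for the demonstration. Only core is imported by the "script", yet algebra becomes usable because core imports it.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk: demo_pkg/{__init__, algebra, core}.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
sources = {
    "__init__.py": "",
    "algebra.py": "def norm2(x, y):\n    return x * x + y * y\n",
    # core depends on algebra, just as IMP.core depends on IMP.algebra
    "core.py": "import demo_pkg.algebra\n",
}
for filename, text in sources.items():
    with open(os.path.join(pkg_dir, filename), "w") as handle:
        handle.write(text)

sys.path.insert(0, root)
importlib.invalidate_caches()

# The "script" only imports core ...
import demo_pkg.core

# ... but algebra was loaded as a side effect and registered in sys.modules,
loaded = "demo_pkg.algebra" in sys.modules
# and it is reachable as an attribute of the package without its own import.
value = demo_pkg.algebra.norm2(3, 4)
print(loaded, value)  # prints: True 25
```

A static checker such as the PyDev plugin only looks at the import statements in the script itself, which is why it flags the name even though the interpreter resolves it at run time.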
> 3.) I don't fully understand the way objects share elements, and what happens when these elements are moved.
> For instance, the three objects:
> line 19 : mh=IMP.atom.read_pdb(IMP.em.get_example_path("input.pdb"),m,sel)
> line 24 : ps= IMP.Particles(IMP.core.get_leaves(mh))
> line 58 : prot_rb = IMP.atom.setup_as_rigid_body(mh)
>
> seem to share atoms
A better way of thinking about it would be to say that they all share the same set of IMP Particles (roughly speaking). All of the information about the system is stored in Particles. But you can apply decorators to particles to treat them in different ways. So here 'ps' is a flat list of all of the atoms in the system, as Particles. 'mh' is a Hierarchy decorator - it points to the same set of atoms, but as a hierarchy of atoms, residues, chains, proteins etc. rather than a flat list. 'prot_rb' is a rigid body that again points to the same set of atoms but treats them as a rigid body.
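The idea of several views over one shared set of particles can be sketched in plain Python. This is not how IMP decorators are implemented: Particle, HierarchyView and RigidView are hypothetical classes that only show why moving the rigid body also moves what 'ps' and 'mh' refer to.

```python
class Particle:
    """Holds the actual data; every view refers to these same objects."""

    def __init__(self, name, x, y, z):
        self.name = name
        self.coords = [x, y, z]


class HierarchyView:
    """Tree-shaped view (protein > residue > atom) over shared particles."""

    def __init__(self, name, children):
        self.name = name
        self.children = children

    def leaves(self):
        found = []
        for child in self.children:
            if isinstance(child, HierarchyView):
                found.extend(child.leaves())
            else:
                found.append(child)
        return found


class RigidView:
    """Rigid-body-like view: moves all of its particles together."""

    def __init__(self, particles):
        self.particles = particles

    def translate(self, dx, dy, dz):
        for particle in self.particles:
            x, y, z = particle.coords
            particle.coords = [x + dx, y + dy, z + dz]


# One set of particles ...
ca = Particle("CA", 0.0, 0.0, 0.0)
cb = Particle("CB", 1.0, 0.0, 0.0)
n = Particle("N", 0.0, 1.0, 0.0)

# ... seen as a hierarchy, as a flat list, and as a rigid body.
mh = HierarchyView("protein", [HierarchyView("ALA1", [ca, cb]),
                               HierarchyView("GLY2", [n])])
ps = mh.leaves()
prot_rb = RigidView(mh.leaves())

prot_rb.translate(1.0, 2.0, 3.0)

print(ps[0].coords)               # the flat list sees the move
print(mh.leaves()[0] is ps[0])    # same object, not a copy
```

Because none of the views owns a copy of the coordinates, writing the hierarchy to a file after moving the rigid body writes the moved atoms.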
> 5.) The signature is not the same for function IMP::em::local_rigid_fitting in the documentation and in the script.
> In the documentation, it has a (FittingSolutions & fr) argument whereas, according to the script, this in fact seems to be the return type
Where are you seeing that in the documentation? According to http://salilab.org/imp/1.0/doc/html/namespaceIMP_1_1em.html it returns a FittingSolutions object.
Ben
> Python has a global dictionary called sys.modules which lists all modules that have been loaded. So once you've loaded a module, you can use it for the rest of the lifetime of the interpreter. While IMP.algebra isn't loaded in the script itself, it is loaded by one of the modules that the script loads. I agree it would be clearer if the script loaded the module itself though.
After a few tests I have a better understanding of the mechanism. In fact, there seems to be a special behavior when handling packages: once a module of a package has been imported, it is accessible to other modules in that package without further imports. This seems not to be the case when dealing with unpackaged modules.
> Where are you seeing that in the documentation? According to
> http://salilab.org/imp/1.0/doc/html/namespaceIMP_1_1em.html
> it returns a FittingSolutions object.
You are right, I was not reading the right manual: https://salilab.org/imp/doc/html/namespaceIMP_1_1em.html#d07d358082b943413d6... I think I landed there from the "documentation" link on the wiki pages: http://salilab.org/imp/wiki/
>> 4.) Is there a difference between prot_rb.set_transformation and IMP.core.transform ?
> Yes, and this has been a constant source of confusion, so a way of simplifying it would be nice. The current orientation of a rigid body is defined by a transformation (equivalent to the x,y,z coordinates of a simple point). set_transformation() replaces this transformation with another one (equivalent to set_coordinates() on an IMP.core.XYZ). transform() transforms a rigid body, that is, it composes the previous transformation with the supplied one and calls set_transformation() with that (equivalent to transforming the coordinates of an IMP.core.XYZ).
With your explanations and after a few more tests I finally understand.
Thank you all for your very fast answers
participants (5)
- Ben Webb
- Benjamin SCHWARZ
- Daniel Russel
- Friedrich Foerster
- Keren Lasker