New subject: on documentation

6 Aug 2012


      ---------- Forwarded message ----------
From: Riccardo Pellarin pellarin.riccardo@gmail.com
Date: Mon, Aug 6, 2012 at 12:05 AM
Subject: Re: on documentation
To: Daniel Russel drussel@gmail.com
Hi Guys,
I would like to share my thoughts on the IMP documentation, maybe repeating
what we've already said. I think it is important, though, to share our
experience.
Let's suppose I want to fit two structures and calculate the Calpha-rmsd,
a very simple task.
I typed RMSD in the IMP manual search field and got 36 entries.
1st problem: the entry titles are uninformative unless you know exactly
what each module is supposed to do (statistics, atom, multifit, etc.).
Knowing a little bit of IMP, I could filter the entries and remove all
classes belonging to the multifit and em modules, for instance. Let's take
the first seven entries that might do what I want:
Member IMP::statistics::ConfigurationSetRMSDMetric::ConfigurationSetRMSDMetric
class IMP::atom::RMSDCalculator
Member IMP::atom::RMSDCalculator::RMSDCalculator
Member IMP::atom::RMSDCalculator::RMSDCalculators
class IMP::statistics::ConfigurationSetRMSDMetric
Member IMP::atom::get_pairwise_rmsd_score
IMP::atom::get_rmsd
2nd problem: I see a lot of redundancy in the list, and a lot of confusion:
classes and members are mixed together... why is that? Wouldn't it be
cleaner to separate them into two different lists?
Now, let's clean up the list a little. My eyes go to these candidates:
class IMP::atom::RMSDCalculator
class IMP::statistics::ConfigurationSetRMSDMetric
Member IMP::atom::get_pairwise_rmsd_score
IMP::atom::get_rmsd
3rd problem: there is not one single function for a task as simple
as an RMSD calculation; there are many, with different flavors...
Did many people implement the same thing many times
because they didn't understand what had been implemented before?
Let's have a look at the functions and see if they do what I want...
(As a side note, while reading the documentation of IMP functions,
I would really like to leave notes on many of them....)
Let's start with IMP::atom::RMSDCalculator
Detailed Description
Fast rmsd calculation. Used to calculate rmsd between multiple
transformation that operate on the same particles
Well, that is not detailed.
What is a "fast rmsd"? No structural fitting, I guess? What is the "rmsd
between multiple transformations"? Maybe rigid-body transformations? I am
starting to doubt that this rmsd is calculated between particles at all...
Let's try to rewrite it. This is what I would like to read:
*Short Description:* Calculates the rmsd of a list of particles.
*Detailed Description:* Calculates the root mean square deviation (rmsd)
of particles subjected to rigid-body transformations. The rmsd calculation
does not perform structural best-fit alignment.
*Usage:*
1) construct the class using a list of particles:
RMSDCalculator(particles)
2) get the rmsd using the method get_rmsd(trans3D1, trans3D2)
where trans3D1 and trans3D2 are rigid body transformations of the
reference and displaced configurations, respectively.
*Simple Example: ....*
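
A minimal sketch of the kind of calculation described above, written in plain Python rather than against the IMP API. It assumes a rigid-body transformation is a (rotation, translation) pair with a 3x3 rotation matrix; the names apply_transform and rmsd_between_transforms are hypothetical and are not IMP functions.

```python
import math


def apply_transform(rotation, translation, point):
    """Return rotation * point + translation for a 3x3 rotation matrix."""
    return tuple(
        sum(rotation[i][k] * point[k] for k in range(3)) + translation[i]
        for i in range(3)
    )


def rmsd_between_transforms(coords, trans_a, trans_b):
    """RMSD of the same points placed by two rigid transforms.

    No best-fit alignment is performed: each point is compared with itself
    after applying trans_a and trans_b, both given as (rotation, translation).
    """
    if not coords:
        raise ValueError("need at least one point")
    total = 0.0
    for point in coords:
        a = apply_transform(trans_a[0], trans_a[1], point)
        b = apply_transform(trans_b[0], trans_b[1], point)
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(total / len(coords))


IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
reference = (IDENTITY, (0.0, 0.0, 0.0))   # particles left in place
displaced = (IDENTITY, (3.0, 4.0, 0.0))   # pure translation of length 5
shift_rmsd = rmsd_between_transforms(coords, reference, displaced)
```

For a pure translation every point moves by the same vector, so the rmsd equals the length of that vector regardless of the particle coordinates.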
It would be cool if the short description appeared in the search
page, along with the class name.
Let's go to the second entry: IMP::statistics::ConfigurationSetRMSDMetric
Detailed Description
Compute the RMSD between specified sets of particles in pairs of
configurations, within a configuration set
This is even more cryptic. Maybe:
Calculates the RMSD of a list of particles between all possible
pairs of configurations in a "configuration set", which is....
Strangely, this class has no get_rmsd() method, only a get_distance()
method.... Is that the same?
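
One possible reading, sketched in plain Python and not taken from the IMP sources: the "distance" between two configurations is simply the RMSD of the same particles in each, computed for every pair in the set. Here a configuration is assumed to be a list of (x, y, z) tuples, and the function names are hypothetical.

```python
import math
from itertools import combinations


def rmsd(config_a, config_b):
    """RMSD between equivalent points of two configurations, no fitting."""
    if len(config_a) != len(config_b) or not config_a:
        raise ValueError("configurations must be non-empty and equally long")
    total = sum(
        (xa - xb) ** 2
        for pa, pb in zip(config_a, config_b)
        for xa, xb in zip(pa, pb)
    )
    return math.sqrt(total / len(config_a))


def pairwise_rmsd(configurations):
    """Map each index pair (i, j) with i < j to the RMSD of that pair."""
    return {
        (i, j): rmsd(configurations[i], configurations[j])
        for i, j in combinations(range(len(configurations)), 2)
    }


configs = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],  # same shape, shifted by 2 along z
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # identical to the first
]
distances = pairwise_rmsd(configs)
```

Under this reading, a get_distance(i, j) call would be a lookup of one entry of such a table. Whether the class actually behaves this way is exactly what the documentation should state.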
Let's go to another example: IMP::atom::get_pairwise_rmsd_score
The measure quantifies the RMSD between the relative placements of two
components compared to a reference relative placement. First, the two
compared structures are brought into the same frame of reference by
superposing the first pair of equivalent domains (ref1 and mdl1). Next, the
RMSD is calculated for the second component
What are the components? Maybe subunits? What are the domains? Why is the
function called rmsd_score? Is that different from the rmsd?
OK, I could go on like this for almost every function and method in IMP.
In the end, I'm completely unsure which function I should use
for my task.... they all look the same.
Here's my proposal: the documentation of every function must have these entries:
*Short Description:* (appears in the search page)
*Detailed Description:*
[*Algorithm Description:* in some cases]
*Usage:*
*Simple Example:*
The developer might leave these fields empty, of course.
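
A minimal sketch of how such a template could be checked automatically. It assumes the fields are written as "Field Name:" prefixes inside a documentation string; the parser and the example text are hypothetical and not part of any existing tool.

```python
REQUIRED = ("Short Description", "Detailed Description", "Usage", "Simple Example")
OPTIONAL = ("Algorithm Description",)


def parse_fields(docstring):
    """Split a docstring into the proposed fields; absent fields map to ''."""
    found = {name: "" for name in REQUIRED + OPTIONAL}
    current = None
    for line in (docstring or "").splitlines():
        text = line.strip()
        for name in found:
            if text.startswith(name + ":"):
                current = name
                text = text[len(name) + 1:].strip()
                break
        # Continuation lines are appended to the most recent field.
        if current is not None and text:
            found[current] = (found[current] + " " + text).strip()
    return found


def missing_fields(docstring):
    """Return the required fields that are absent or left empty."""
    fields = parse_fields(docstring)
    return [name for name in REQUIRED if not fields[name]]


EXAMPLE_DOC = """Short Description: RMSD of two coordinate lists, without fitting.

Detailed Description: Root mean square deviation between equivalent
points. No structural best-fit alignment is performed.

Usage: get_rmsd_no_fit(coords_a, coords_b)
"""
missing = missing_fields(EXAMPLE_DOC)
```

Here missing lists the fields the developer left empty, which is the information a search page would need in order to flag an entry as incompletely documented.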
When I search for something, the first entries should be the
ones that are most relevant and best documented.
Or maybe the search page should separate Documented and Undocumented results
(where Undocumented means a function that lacks a long documentation
page).
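
The ordering could be sketched like this, assuming each search hit is a (name, short description, detailed description) tuple. The entry names below are placeholders, not real IMP classes, and relevance ranking is left out.

```python
def rank_results(results):
    """Sort hits so the best documented ones come first.

    Hits with a detailed description rank highest, then hits with only a
    short description, then hits with neither. The sort is stable, so the
    original (relevance) order is kept within each group.
    """
    def score(hit):
        _name, short, detailed = hit
        return (bool(detailed.strip()), bool(short.strip()))

    return sorted(results, key=score, reverse=True)


hits = [
    ("example::undocumented_function", "", ""),
    ("example::BriefOnlyClass", "One-line summary only.", ""),
    ("example::DocumentedClass", "One-line summary.", "Longer text with usage."),
]
ranked = rank_results(hits)

# Split into the two lists suggested for the search page.
documented = [name for name, _short, detailed in ranked if detailed.strip()]
undocumented = [name for name, _short, detailed in ranked if not detailed.strip()]
```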
Of course we cannot force people to write comprehensive documentation,
but at least we can give the user the option of choosing the functions that
are better documented. That will be bad for developers who write
undocumented code, since their code will never be used by anybody else.
As a user, I will be skeptical about using something whose documentation
fields are empty!
Sorry, that was long. I hope to hear your feedback.
