Fwd: on documentation
---------- Forwarded message ----------
From: Riccardo Pellarin pellarin.riccardo@gmail.com
Date: Mon, Aug 6, 2012 at 12:05 AM
Subject: Re: on documentation
To: Daniel Russel drussel@gmail.com
Hi Guys,
I would like to share my thoughts on IMP documentation, maybe repeating what we've already said. I think it is important, though, to share our experience.
Let's suppose I want to fit two structures and calculate the Calpha-rmsd, a very simple task.
I typed RMSD into the IMP manual search field and got 36 entries.
1st problem: the entry titles are uninformative unless you know exactly what each module is supposed to do (statistics, atom, multifit, etc.).
Knowing a little bit of IMP I could filter the entries and remove all classes belonging to multifit and em modules, for instance. Let's take the first seven entries which might do what I want to do:
Member IMP::statistics::ConfigurationSetRMSDMetric::ConfigurationSetRMSDMetric
class IMP::atom::RMSDCalculator
Member IMP::atom::RMSDCalculator::RMSDCalculator
Member IMP::atom::RMSDCalculator::RMSDCalculators
class IMP::statistics::ConfigurationSetRMSDMetric
Member IMP::atom::get_pairwise_rmsd_score
IMP::atom::get_rmsd
2nd problem: I see a lot of redundancy in the list, and a lot of confusion: classes and members are mixed together... why is that? Wouldn't it be cleaner to separate them into two different lists?
Now, let's clean up the list a bit; my eyes settle on these candidates:
class IMP::atom::RMSDCalculator
class IMP::statistics::ConfigurationSetRMSDMetric
Member IMP::atom::get_pairwise_rmsd_score
IMP::atom::get_rmsd
3rd problem: there is no single function that performs a task as simple as an RMSD calculation; instead there are many, in different flavors... Probably many people implemented the same thing many times because they didn't understand what had been implemented before?
Let's have a look at the functions and see if they do what I want... (as a side note, reading the documentation of IMP functions, I really would like to leave notes on many of them....)
Let's start with IMP::atom::RMSDCalculator
Detailed Description
Fast rmsd calculation. Used to calculate rmsd between multiple transformation that operate on the same particles
Well, that is not detailed. What is a "fast rmsd"? No structural fitting I guess? What is the "rmsd between multiple transformations" ? Maybe rigid body transformations? I start to doubt that this rmsd function is calculated between particles at all... Let's try to rewrite it. This is what I would like to read:
*Short Description:* Calculates the rmsd of a list of particles.
*Detailed Description:* Calculates the root mean square displacement (rmsd) of particles subjected to rigid-body transformations. The rmsd calculation does not perform structural best-fit alignment.
*Usage:*
1) construct the class using a list of particles: RMSDCalculator(particles)
2) get the rmsd using the method get_rmsd(trans3D1, trans3D2), where trans3D1 and trans3D2 are rigid body transformations of the reference and displaced configurations, respectively.
*Simple Example: ....*

It would be cool if the short description appeared in the search page, along with the class name.
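To make the proposed description concrete, here is a minimal sketch of what "rmsd of the same particles under two rigid-body transformations, with no best-fit alignment" means numerically. It is plain Python and does not use or reproduce the IMP API; the names apply_transform and rmsd_between_transforms, and the (rotation, translation) tuple representation, are assumptions made only for this illustration.

```python
import math


def apply_transform(rotation, translation, point):
    """Apply a rigid transform (3x3 rotation matrix, 3-vector translation) to a point."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def rmsd_between_transforms(points, transform_a, transform_b):
    """RMSD of the same points placed by two rigid transforms.

    No superposition is attempted: the two placements are compared as they are.
    Each transform is a hypothetical (rotation, translation) pair.
    """
    if not points:
        raise ValueError("need at least one point")
    total = 0.0
    for p in points:
        a = apply_transform(transform_a[0], transform_a[1], p)
        b = apply_transform(transform_b[0], transform_b[1], p)
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(total / len(points))


IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
reference = (IDENTITY, (0.0, 0.0, 0.0))
shifted = (IDENTITY, (3.0, 4.0, 0.0))  # pure translation by a vector of length 5

result = rmsd_between_transforms(points, reference, shifted)
print(result)
```

Because nothing is aligned, a pure translation shows up in full in the result: every point moves by the same vector, so the value equals the length of that vector. A best-fit RMSD would report zero for the same input, which is exactly the distinction the description should spell out.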
Let's go to the second function: IMP::statistics::ConfigurationSetRMSDMetric
Detailed Description
Compute the RMSD between specified sets of particles in pairs of configurations, within a configuration set

This is even more cryptic. Maybe:
Calculates the RMSD of a list of particles between all possible configurations pairs in a "configuration set", which is....
Strangely, this class has no get_rmsd() method, only a get_distance() method.... Is that the same?
Let's go to another example: IMP::atom::get_pairwise_rmsd_score

The measure quantifies the RMSD between the relative placements of two components compared to a reference relative placement. First, the two compared structures are brought into the same frame of reference by superposing the first pair of equivalent domains (ref1 and mdl1). Next, the RMSD is calculated for the second component

What are the components? Maybe subunits? What are the domains? Why is the function called rmsd_score? Is that different from the rmsd?
Ok I can go on for almost every function and method in IMP.
At the end, I'm completely unsure of what function I should use for my task.... they all look the same.
Here's my proposal: Every function documentation must have these entries:
*Short Description:* (appears in the search page)
*Detailed Description:*
[*Algorithm Description:* in some cases]
*Usage:*
*Simple Example:*
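As an illustration of how these entries might look when filled in, here is a sketch of one small, fully documented function. It is a stand-alone Python example rather than IMP code; the function name get_unaligned_rmsd and the use of a Python docstring to carry the fields are assumptions for this sketch, not an existing convention in IMP.

```python
import math


def get_unaligned_rmsd(coords_a, coords_b):
    """Short Description: RMSD between two equally sized coordinate lists.

    Detailed Description: Computes the root mean square deviation between
    corresponding points of coords_a and coords_b. No best-fit
    superposition is performed; the coordinates are compared as given.

    Usage: pass two sequences of (x, y, z) tuples of the same length.

    Simple Example:
        get_unaligned_rmsd([(0, 0, 0)], [(3, 4, 0)]) returns 5.0
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(total / len(coords_a))


print(get_unaligned_rmsd([(0, 0, 0)], [(3, 4, 0)]))
```

The point of the layout is that a reader can stop after the first line if the function is not what they need, and can copy the last line if it is.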
The developer might leave these fields empty, of course. When I search for something, the first entries should be the ones that are most relevant and best documented. Or maybe the search page should have Documented and Undocumented results (where Undocumented means a function that lacks a long documentation page).
Of course we cannot force people to write comprehensive documentation, but at least we can give the user the option of choosing the functions that are better documented. That will be bad for developers who write undocumented code, since their code will never be used by anybody else. As a user, I will be skeptical about using something where the documentation fields are empty!
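The Documented/Undocumented split could be sketched roughly as follows. This is only an illustration of the idea, not how doxygen's search works; the SearchHit record, its fields, and the rule "documented means a non-empty detailed description" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SearchHit:
    """Hypothetical search result record; not a doxygen data structure."""
    name: str
    short_description: str = ""
    detailed_description: str = ""


def is_documented(hit):
    """Treat a hit as documented if its detailed description is non-empty."""
    return bool(hit.detailed_description.strip())


def split_hits(hits):
    """Return (documented, undocumented), each keeping the original order."""
    documented = [h for h in hits if is_documented(h)]
    undocumented = [h for h in hits if not is_documented(h)]
    return documented, undocumented


hits = [
    SearchHit("IMP::atom::RMSDCalculator"),
    SearchHit(
        "IMP::atom::get_rmsd",
        short_description="placeholder short text",
        detailed_description="placeholder long text",
    ),
    SearchHit("IMP::atom::get_pairwise_rmsd_score", detailed_description="   "),
]

documented, undocumented = split_hits(hits)
for hit in documented:
    print("Documented:  ", hit.name, "-", hit.short_description)
for hit in undocumented:
    print("Undocumented:", hit.name)
```

The descriptions above are placeholders, not the real IMP documentation text. Which entries count as documented would of course depend on the rule chosen; a whitespace-only description is treated as empty here.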
Sorry, that was long. I hope to hear your feedback.
To add my two cents:
- the search in doxygen kind of sucks and google definitely isn't better on this. There is no good way that I can think of to prioritize search results, so I'm not sure where to go to make this aspect better. And, unfortunately, as one adds more to the API and documentation, you just get more hits in more or less random order that you have to look through. Anyone have any good ideas on this? We can try going back to the doxygen live search, as that may allow one to experiment more interactively with search terms (it had severe limitations before, but these may have been fixed).
- in general, you need to read the documentation of all the base classes of a class and the module before you will understand the class. I think this cannot be reasonably avoided. Otherwise content would have to be duplicated in many places, which invariably results in it having more errors/being even less complete (or requiring a great deal more time for the same amount of content). Hopefully something like ConfigurationSetRMSDMetric would make more sense in light of understanding statistics::Metric. For example, it has no get_rmsd() method since it is a specialization of the Metric base class and that defines a get_distance() virtual method, so having a get_rmsd() method would be useless where it is supposed to be used.
- What I would really like to see is that when someone spends the time to figure something out like this, they add an example/patch the comments in the files and then send the patch off to someone to integrate :-)
- I'd like to move to a more structured commit model for IMP with some more review of things that go in so that we can prod people (and me) more to improve docs/merge redundant things. I typed up some thoughts on modifying the commit model here: https://github.com/salilab/imp/wiki/A-proposed-commit-model-for-IMP Feel free to edit (or request permissions to edit, I'm a bit unclear on how those are regulated :-) ). The main idea would be that if things, in general, have two people look at them before going into most modules in IMP, they should be a bit more coherent and documented. And, if one is able to share things prior to committing them to the SVN repository, they can stay in purgatory a bit longer (and will hopefully be worked on a bit longer) before they are considered good enough and work on them ceases (as tends to happen). Not sure if this will work :-)
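The point about the Metric base class can be illustrated with a small sketch: generic code is written against get_distance(), so a metric that only offered get_rmsd() could not be plugged into it. The classes below are simplified Python stand-ins written for this example; they are not the IMP classes, and names such as RMSDMetric and get_closest_pair are assumptions.

```python
import math
from abc import ABC, abstractmethod


class Metric(ABC):
    """Stand-in for a generic metric interface (not the IMP class)."""

    @abstractmethod
    def get_number_of_items(self):
        """Return how many items the metric knows about."""

    @abstractmethod
    def get_distance(self, i, j):
        """Return the distance between items i and j."""


class RMSDMetric(Metric):
    """Distance between two stored configurations is their unaligned RMSD."""

    def __init__(self, configurations):
        self._configurations = list(configurations)

    def get_number_of_items(self):
        return len(self._configurations)

    def get_distance(self, i, j):
        a, b = self._configurations[i], self._configurations[j]
        total = sum(
            (x - y) ** 2 for pa, pb in zip(a, b) for x, y in zip(pa, pb)
        )
        return math.sqrt(total / len(a))


def get_closest_pair(metric):
    """Generic algorithm: it only ever calls the Metric interface."""
    n = metric.get_number_of_items()
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return min(pairs, key=lambda pair: metric.get_distance(*pair))


# Three one-particle configurations placed along the x axis.
metric = RMSDMetric([[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(5.0, 0.0, 0.0)]])
closest = get_closest_pair(metric)
print(closest)
```

get_closest_pair never needs to know that the distance happens to be an RMSD, which is why the specialization exposes get_distance() rather than a method with a more specific name. A one-line note to that effect in the class documentation would have answered Riccardo's question.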
I'd like to amend that a bit. Google search does work rather better. E.g., googling "site:salilab.org/imp/nightly/doc/html/ compute rmsd between two hierarchies" does get you to the page with the right function as the top hit (although it is a very long page; I'm trying to figure out something for that).
Perhaps we should just drop the doxygen search entirely. Does anyone find it useful?
For people who don't know "site:xxxx": it restricts google to only return hits whose url matches the prefix.
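For anyone who wants to script such searches or put a ready-made link in the docs, a site-restricted query URL could be built as in the sketch below. The helper name build_site_search_url is made up for this example, and the sketch assumes the usual https://www.google.com/search?q=... URL form.

```python
from urllib.parse import urlencode


def build_site_search_url(site_prefix, terms):
    """Build a Google search URL restricted to pages under site_prefix."""
    query = "site:{} {}".format(site_prefix, terms)
    # urlencode percent-escapes ':' and '/' and turns spaces into '+'.
    return "https://www.google.com/search?" + urlencode({"q": query})


url = build_site_search_url(
    "salilab.org/imp/nightly/doc/html/",
    "compute rmsd between two hierarchies",
)
print(url)
```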
Here are my comments:

> - in general, you need to read the documentation of all the base
> classes of a class and the module before you will understand the
> class. I think this cannot be reasonably avoided.
Agreed
> ConfigurationSetRMSDMetric would make more sense in light of
> understanding statistics::Metric. For example, it has no get_rmsd()
> method since it is a specialization of the Metric base class and that
> defines a get_distance() virtual method, so having a get_rmsd() method
> would be useless where it is supposed to be used.
>
> - What I would really like to see is that when someone spends the
> time to figure something out like this, they add an example/patch the
> comments in the files and then send the patch off to someone to
> integrate :-)

I wouldn't like to see any of these situations. If functions were documented (by the writer), the user wouldn't have to figure out anything or write patches. I also agree with Riccardo: a lot of functions in IMP (mine included) are cryptic, or require some prior knowledge.

> - I'd like to move to a more structured commit model for IMP with some
> more review of things that go in so that we can prod people (and me)
> more to improve docs/merge redundant things. I typed up some thoughts
> on modifying the commit model here
> https://github.com/salilab/imp/wiki/A-proposed-commit-model-for-IMP
> Feel free to edit (or request permissions to edit, I'm a bit unclear
> on how those are regulated :-) The main idea would be that if things,
> in general, have two people look at them before going into most
> modules in IMP, they should be a bit more coherent and documented.
> And, if one is able to share things prior to committing them to the
> SVN repository, they can stay in purgatory a bit longer (and will
> hopefully be worked on a bit longer), before they are considered good
> enough and work on them ceases (as tends to happen). Not sure if this
> will work :-)

Just a practical comment on this. If we are going to make all these changes, please let's switch to git as soon as possible. Learning another tool (git-svn or git flow) that is just an intermediate solution is a mess and has problems. For example, I talked to Ben and Daniel about moving developments that are not in the main repository between computers (mine and the cluster), and the options were suboptimal (use private git repositories, deal with github ...). I tried to use them anyway, and I just gave up because of the commit hooks:
http://jinntech.blogspot.com/2009/11/git-svn-and-failed-svn-commit-hooks.htm...
>> - What I would really like to see is that when someone spends the time
>> to figure something out like this, they add an example/patch the comments
>> in the files and then send the patch off to someone to integrate :-)
>
> I wouldn't like to see any of these situations. If functions were
> documented (by the writer), the user wouldn't have to figure out anything
> or write patches. I agree with Riccardo also, a lot of functions in IMP
> (mine included) are cryptic, or require some knowledge.

I agree it is better when the writer documents it, but there will always be cases where
- functionality is used in a way the writer did not foresee
- something that was clear to the writer was not clear to someone else
- corner cases that the writer wanted to leave ambiguous (to allow more flexibility with implementation, for example) are important to some user (here the patch can act as a proposal to disambiguate the corner case)

And so I think (especially since currently the writers don't always even document well enough for themselves) it is a good habit, at least for people who otherwise contribute to IMP.
> Here's my proposal: Every function documentation must have these entries:
>
> *Short Description:* (appears in the search page)
> *Detailed Description:*
> [*Algorithm Description:* in some cases]
> *Usage:*
> *Simple Example:*

I agree 100% with this suggestion by Riccardo. Also, that was a great example that he used about how the existing documentation doesn't help in many cases.
Here is the thing: as it stands now, it is fine for the lab and close friends to use IMP, because we can walk over to Daniel or Ben when we have a problem and quickly get it answered (thanks guys!). But the goal for IMP is to get it distributed globally. If a lab that doesn't have any connection to ours wants to use it, currently nothing but the most simple functionality will be usable due to the lack of documentation. So really the overall impact of the software is reduced.
A point of comparison is Modeller. Modeller is used by many labs without having to email us for tech support. That's because if you go to the Modeller website, it not only has a great tutorial, but also massive class and method documentation. Most questions can be answered by using that documentation as a reference, and I would argue that the documentation more than almost anything else is responsible for Modeller's widespread use.
The problem is that people don't want to document their code because once you get something working, you either feel great about it and want to apply it to something right away instead of writing how it works, or you're so sick of it that you don't want to go through each method and document it, or you say you'll do it later (but of course, later never comes). I am as guilty as anyone about this for lab stuff. Some insight from my brief time working in industry: I have to document everything I write here for IP reasons and there is a lot going on every day so I don't really have time to revisit old code. Therefore I have to document right away or it's never going to happen.
I thought Barak's attempt at organizing this was noble and was sad to see that people didn't have time to do it, even if it was just a day or two. I think another attempt should be made. Remember that your legacy as a scientist will be defined in large part by who uses your code after you have moved on from the lab. Therefore a day or two of your life to document as much as possible will have a high impact on your output. We have many lab alumni who spent five years on a project that no one else can use because it wasn't properly documented (I myself am still trying to make sure this doesn't happen). Honestly, I think Andrej should crack down a little more on this, i.e., not signing off on paper submission until all the code that was used to do it is documented, or something like that, but I imagine he would prefer a more organic solution.
Another suggestion which might be extreme is to hire a summer college intern whose sole job would be documentation. That would actually get them to be the IMP expert and if they wanted to go into our field it would be a great way to start, and we would get the benefit of documented code.
(tl;dr ?)
thanks, dave
_______________________________________________
IMP-dev mailing list
IMP-dev@salilab.org
https://salilab.org/mailman/listinfo/imp-dev
Cool to see all this interest in improving the docs, let's make sure it doesn't peter out :-) One useful direction to move would be to add more standards for documentation (to the developer guide) and then switch to a commit model as outlined in the previous link so there is someone to poke people to get them to do the work before it gets forgotten about.
On a related note, I find that when using other libraries (boost, CGAL, bullet being some recent examples) and I have a question, I tend to google for the answer (as opposed to looking through the docs directly). Useful hits often include the library docs, but more often than not other sources such as the support email lists and q&a sites such as stackoverflow (probably the single most useful site with popular libraries). As a result, I would highly encourage people to email the imp-users list with questions, rather than email or ask Ben or me, so that the question and answer get indexed for other people to see.
In terms of specific doc additions, do people think it would be useful to have:
- links from each class to all examples that use the class (or, rather, use the name of the class)
- links from each class to all functions that take or return the class

I had an implementation of the former, but it was a bit of a hack and broke.
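For what it's worth, the first kind of link (class name to examples that mention it) can be sketched as a plain text search. This is only a minimal illustration, not the implementation mentioned above: the function name `index_examples`, the list of class names, and the example directory are all hypothetical, and it matches names textually rather than resolving them.

```python
import re
from collections import defaultdict
from pathlib import Path


def index_examples(class_names, example_dir, pattern="*.py"):
    """Map each class name to the example files that mention it.

    Hypothetical helper: purely textual matching, so it finds uses of
    the *name* of the class, not verified uses of the class itself.
    """
    index = defaultdict(list)
    # Word boundaries keep "Model" from matching inside "ModelObject".
    matchers = {
        name: re.compile(r"\b" + re.escape(name) + r"\b")
        for name in class_names
    }
    # Sorted traversal gives a stable order for the generated links.
    for path in sorted(Path(example_dir).rglob(pattern)):
        text = path.read_text(errors="ignore")
        for name, matcher in matchers.items():
            if matcher.search(text):
                index[name].append(path.name)
    return dict(index)
```

The resulting dictionary could then be turned into a "used in examples" list on each class page; classes with no examples simply do not appear as keys.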
On Mon, Aug 6, 2012 at 12:26 PM, Daniel Russel drussel@gmail.com wrote:
> Cool to see all this interest in improving the docs, let's make sure it doesn't peter out :-) One useful direction to move would be to add more standards for documentation (to the developer guide) and then switch to a commit model as outlined in the previous link so there is someone to poke people to get them to do the work before it gets forgotten about.
>
> On a related note, I find that when using other libraries (boost, CGAL, bullet being some recent example) when I have a question I tend to google for the answer (as opposed to looking through the docs directly). Useful hits often include the library docs, but more often than not other sources such as the support email lists and q&a sites such as stackoverflow (probably the single most useful site with popular libraries). As a result, I would highly encourage people to email the imp-users list with questions, rather than email or ask Ben or I, so that the question and answer get indexed for other people to see.
1) Daniel, I think it is a nice idea to add these autolinks
2) I am also really happy about the renewed interest in documentation. A bunch of us (Daniel, Riccardo, Dina, myself) have already started. If you want to help, let us know. We need all the help we can get.

B
p.s. I didn't mean to suggest that only the people listed below have been helpful to IMP, and I bet that even if I add Dave, Ben, Javi, Max, Keren and Charles to the list of past and present contributors, I am still missing out on many people. I apologize in advance to anyone left out!!! All I meant to say is that it's greaaaat that many people think the documentation effort is important, and that we're doing something about it, so please come and join!!!
Barak
Similarly, I didn't realize that Barak's effort appears to have been semi-successful, in that people are working on it, since the last email he sent indicated that people didn't have time. Thanks to everyone for doing that. As usual, just being a P-I-T-A.
db
I'm in, I'll try to document and clean up my documentation bugs in ISD in the near future!

I agree with Barak, it would be great to have links from each class to all examples that use it. Personally, I like hands-on approaches, so I tend to look at examples to see how things work. If we had examples for most classes that aren't experimental, that would be great and would flatten the learning curve. Then again, who has the time to write them...

Ideally, in a few years, it should look like this: http://eigen.tuxfamily.org/dox/ (the search function is just as bad there).

About the doxygen search function: I use it to look for a specific function whose name I know. However, it doesn't work for names with underscores in them. It would definitely be amazing to have a solution for that, but it seems difficult and complicated.
Just to help those who aren't in the salilab's daily discussions: Are you all planning to use git and move everything to git now?
Le 06/08/12 21:40, Barak Raveh a écrit : > 1) Daniel, I think it is a nice idea to add these autolinks > 2) I am also really happy about the renewed interest in documentation. > A bunch of us (Daniel, Riccardo, Dina, myself) have already started. > If you want to help, let us know. We need all the help we can get. > B > > > On Mon, Aug 6, 2012 at 12:31 PM, Daniel Russel <drussel@gmail.com > mailto:drussel@gmail.com> wrote: > > In terms of specific doc additions, do people think it would be > useful to have: > - links from each class to all examples that use the class (or, > rather, use the name of the class) > - links from each class to all functions that take or return the > function > > I had an implementation of the former, but it was a bit of a hack > and broke. > > > > On Mon, Aug 6, 2012 at 12:26 PM, Daniel Russel <drussel@gmail.com > mailto:drussel@gmail.com> wrote: > > Cool to see all this interest in improving the docs, let's > make sure it doesn't peter out :-) One useful direction to > move would be to add more standards for documentation (to the > developer guide) and then switch to a commit model as outlined > in the previous link so there is someone to poke people to get > them to do the work before it gets forgotten about. > > On a related note, I find that when using other libraries > (boost, CGAL, bullet being some recent example) when I have a > question I tend to google for the answer (as opposed to > looking through the docs directly). Useful hits often include > the library docs, but more often than not other sources such > as the support email lists and q&a sites such as stackoverflow > (probably the single most useful site with popular libraries). > As a result, I would highly encourage people to email the > imp-users list with questions, rather than email or ask Ben or > I, so that the question and answer get indexed for other > people to see. 
> On Mon, Aug 6, 2012 at 11:45 AM, Dave Barkan dbarkan@salilab.org wrote:
>
>> Here's my proposal: Every function documentation must have these entries:
>>
>> *Short Description:* (appears in the search page)
>> *Detailed Description:*
>> [*Algorithm Description:* in some cases]
>> *Usage:*
>> *Simple Example:*
>
> I agree 100% with this suggestion by Riccardo. Also, that was a great example that he used about how the existing documentation doesn't help in many cases.
>
> Here is the thing: as it stands now, it is fine for the lab and close friends to use IMP, because we can walk over to Daniel or Ben when we have a problem and quickly get it answered (thanks guys!). But the goal for IMP is to get it distributed globally. If a lab that doesn't have any connection to ours wants to use it, currently nothing but the most simple functionality will be usable due to the lack of documentation. So really the overall impact of the software is reduced.
>
> A point of comparison is Modeller. Modeller is used by many labs without having to email us for tech support. That's because if you go to the Modeller website, it not only has a great tutorial, but also massive class and method documentation. Most questions can be answered by using that documentation as a reference, and I would argue that the documentation more than almost anything else is responsible for Modeller's widespread use.
>
> The problem is that people don't want to document their code because once you get something working, you either feel great about it and want to apply it to something right away instead of writing how it works, or you're so sick of it that you don't want to go through each method and document it, or you say you'll do it later (but of course, later never comes). I am as guilty as anyone about this for lab stuff. Some insight from my brief time working in industry: I have to document everything I write here for IP reasons and there is a lot going on every day so I don't really have time to revisit old code. Therefore I have to document right away or it's never going to happen.
>
> I thought Barak's attempt at organizing this was noble and was sad to see that people didn't have time to do it, even if it was just a day or two. I think another attempt should be made. Remember that your legacy as a scientist will be defined in large part by who uses your code after you have moved on from the lab. Therefore a day or two of your life to document as much as possible will have a high impact on your output. We have many lab alumni who spent five years on a project that no one else can use because it wasn't properly documented (I myself am still trying to make sure this doesn't happen). Honestly, I think Andrej should crack down a little more on this, i.e., not signing off on paper submission until all the code that was used to do it is documented, or something like that, but I imagine he would prefer a more organic solution.
>
> Another suggestion which might be extreme is to hire a summer college intern whose sole job would be documentation. That would actually get them to be the IMP expert and if they wanted to go into our field it would be a great way to start, and we would get the benefit of documented code.
>
> (tl;dr ?)
>
> thanks,
> dave
>
> --
> Barak
>
> _______________________________________________
> IMP-dev mailing list
> IMP-dev@salilab.org
> https://salilab.org/mailman/listinfo/imp-dev
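To make the proposed entries concrete, here is a minimal Python sketch of a docstring that follows the template quoted above. The function `rmsd` is a hypothetical stand-alone helper written only for illustration; it is not IMP's `IMP::atom::get_rmsd`, and it assumes the two point sets are already paired and superposed (no fitting is done).

```python
import math


def rmsd(coords_a, coords_b):
    """Short Description: Root-mean-square deviation between two point sets.

    Detailed Description: Pairs the i-th point of coords_a with the i-th
    point of coords_b and returns the square root of the mean squared
    distance. The points are compared as given; no superposition is done.

    Algorithm Description: A single pass that accumulates squared
    distances, then divides by the number of points.

    Usage: Pass two equal-length, non-empty sequences of (x, y, z) tuples.
    Raises ValueError otherwise.

    Simple Example:
        >>> rmsd([(0.0, 0.0, 0.0)], [(3.0, 4.0, 0.0)])
        5.0
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty point sets of equal length")
    total = 0.0
    for p, q in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired points.
        total += sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(total / len(coords_a))
```

The first docstring line doubles as the short description that a search page could show, which is the part of the proposal aimed at uninformative search results.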
Yannick, you are another very important name I forgot; it's just hard to see across the ocean, sorry :) And Jeremey too!
About git - it's kinda in the twilight zone now. Daniel has already set up a git repository that he keeps in sync with the svn every once in a while, but it might take a little while until we leave svn, probably only following the next release? Daniel? Ben?
On Mon, Aug 6, 2012 at 1:36 PM, Yannick Spill yannick@salilab.org wrote:
> I'm in, I'll try to document and clean up my documentation bugs in ISD in the near future!
>
> I agree with Barak, it would be great to have links from each class to all examples that use them. Personally, I like hands-on approaches, so I tend to look at examples to see how things work. If we had examples for most classes that aren't experimental, that would be great and lower the learning curve. Then again, who has the time to write them...
> Ideally, in a few years, it should look like this:
> http://eigen.tuxfamily.org/dox/
> (search function is as bad there as well)
> About the doxygen search function: I use it to look for a specific function whose name I know. However, it doesn't work for names with underscores in them. It would definitely be amazing to have a solution for that, but it seems difficult and complicated.
>
> Just to help those who aren't in the salilab's daily discussions: Are you all planning to use git and move everything to git now?
>
> On 06/08/12 21:40, Barak Raveh wrote:
>
> 1) Daniel, I think it is a nice idea to add these autolinks
> 2) I am also really happy about the renewed interest in documentation. A bunch of us (Daniel, Riccardo, Dina, myself) have already started. If you want to help, let us know. We need all the help we can get.
> B
>
> On Mon, Aug 6, 2012 at 12:31 PM, Daniel Russel drussel@gmail.com wrote:
>
>> In terms of specific doc additions, do people think it would be useful to have:
>> - links from each class to all examples that use the class (or, rather, use the name of the class)
>> - links from each class to all functions that take or return the class
>>
>> I had an implementation of the former, but it was a bit of a hack and broke.
>>
>> On Mon, Aug 6, 2012 at 12:26 PM, Daniel Russel drussel@gmail.com wrote:
>>
>>> Cool to see all this interest in improving the docs, let's make sure it doesn't peter out :-) One useful direction to move would be to add more standards for documentation (to the developer guide) and then switch to a commit model as outlined in the previous link so there is someone to poke people to get them to do the work before it gets forgotten about.
>>>
>>> On a related note, I find that when using other libraries (boost, CGAL, bullet being some recent examples) when I have a question I tend to google for the answer (as opposed to looking through the docs directly). Useful hits often include the library docs, but more often than not other sources such as the support email lists and q&a sites such as stackoverflow (probably the single most useful site with popular libraries). As a result, I would highly encourage people to email the imp-users list with questions, rather than email or ask Ben or me, so that the question and answer get indexed for other people to see.
>
> --
> Barak
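Daniel's first suggestion in the quoted message, linking each class to every example that uses the name of the class, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `link_classes_to_examples`, the file names, and the example contents are all hypothetical, and this is not the implementation Daniel mentions having had.

```python
import re


def link_classes_to_examples(class_names, examples):
    """Map each class name to the sorted example files that mention it.

    class_names: iterable of bare class names, e.g. "RMSDCalculator".
    examples: dict of {file name: source text}.
    """
    links = {}
    for name in class_names:
        # Word boundaries keep a short name from matching inside a longer
        # identifier (underscores count as word characters).
        pattern = re.compile(r"\b" + re.escape(name) + r"\b")
        links[name] = sorted(
            fname for fname, text in examples.items() if pattern.search(text)
        )
    return links


# Made-up example sources, for illustration only.
examples = {
    "fit.py": "# uses RMSDCalculator\n",
    "cluster.py": "# uses ConfigurationSetRMSDMetric\n",
    "both.py": "# uses RMSDCalculator and ConfigurationSetRMSDMetric\n",
    "helper.py": "# uses my_RMSDCalculator_wrapper\n",
}
links = link_classes_to_examples(
    ["RMSDCalculator", "ConfigurationSetRMSDMetric", "Unused"], examples
)
```

Matching on the bare name is what makes this "use the name of the class" rather than "use the class": a comment that merely mentions the name also counts as a hit, which may be acceptable for documentation links.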
On 8/6/12 1:54 PM, Barak Raveh wrote:
> About git - it's kinda in the twilight zone now, Daniel already has set a git repository that he keeps in sync with the svn every once in a while, but it might take a little while until we leave svn, probably only following the next release? Daniel? Ben?
When we last discussed this, the tentative plan was to move to git after the next stable IMP release. If people have strong opinions either way about this, now would be the time to say so! (If you have no experience with git but are interested in playing with it, you could try out git-svn, which works quite nicely as a 'frontend' to IMP's current SVN repository.)
Ben
I think it is a good idea if we all use the same repository to eliminate the need to sync svn and git. I am sure we can all easily learn git basics fast.
I am cool with moving to git too. I have played with it a little since Daniel's demo. It does come with some "headaches" that are avoided in svn (instead of just committing, you first stage, then commit locally, then push). But it will probably be fine with everybody after some getting used to it.
On Mon, Aug 6, 2012 at 6:17 PM, Dina Schneidman duhovka@gmail.com wrote:
> I think it is a good idea if we all use the same repository to eliminate the need to sync svn and git.
> I am sure we can all easily learn git basics fast.
p.s. and of course the "headaches" are a tradeoff with some advantages
On Mon, Aug 6, 2012 at 6:19 PM, Barak Raveh barak.raveh@gmail.com wrote:
> I am cool with moving to git too. I played with it a little ever since Daniel's demo. It does come with some "headaches" that are avoided in svn (instead of just committing - first stage, then commit locally, then push). But it will probably be fine with everybody after some getting used to.
On Aug 6, 2012, at 6:11 PM, Ben Webb ben@salilab.org wrote:
> On 8/6/12 1:54 PM, Barak Raveh wrote:
>> About git - it's kinda in the twilight zone now, Daniel already has set a git repository that he keeps in sync with the svn every once in a while, but it might take a little while until we leave svn, probably only following the next release? Daniel? Ben?
>
> When we last discussed this, the tentative plan was to move to git after the next stable IMP release. If people have strong opinions either way about this, now would be the time to say so! (If you have no experience with git but are interested in playing with it, you could try out git-svn, which works quite nicely as a 'frontend' to IMP's current SVN repository.)
I now have a pretty decent setup for syncing back and forth between git and svn. It looks more or less like the commit model I am proposing we use for git (https://github.com/salilab/imp/wiki/A-proposed-commit-model-for-IMP), under the extra restriction that I am the one doing the actual commits to salilab/imp/develop. So I would be all for having a few people use the github repo under that model to experiment with things and work out kinks that might arise with that workflow.
To be more concrete, anyone who is interested in committing things to modules that I know things about (ones having me on the developer list) is encouraged to read the page above and try to work through things.
participants (7)
- Barak Raveh
- Ben Webb
- Daniel Russel
- Dave Barkan
- Dina Schneidman
- Javier Velazquez-Muriel
- Yannick Spill