Developer Guide {#devguide}

# Developing with IMP #

This page presents instructions on how to develop code using
IMP. Developers should also read [Getting started as a developer](https:

# Getting around IMP # {#devguide_getting_around}

The input files in the IMP directory are structured as follows:
- `tools` contains various command line utilities for use by developers. They
  are [documented below](#devguide_scripts).
- `doc` contains inputs for general IMP overview documentation (such as this
  page), as well as configuration scripts for `doxygen`.
- `applications` contains various applications implementing easier-to-use
  command line functionality, using a variety of IMP modules.
- each subdirectory of `modules/` defines a module; they all have the same
  structure. The directory for module `name` has the following structure:
   - `README.md` contains a module overview
   - `include` contains the C++ header files
   - `src` contains the C++ source files
   - `bin` contains C++ source files each of which is built into an executable
   - `pyext` contains files defining the Python interface to the module as well
     as Python source files (in `pyext/src`)
   - `test` contains test files that can be run with `ctest`
   - `doc` contains additional documentation that is provided via `.dox`
   - `examples` contains examples in Python and C++, as well as any data needed
   - `data` contains any data files needed by the module

When IMP is built, a number of directories are created in the build directory. They are:
- `include` which includes all the headers. The headers for module `name` are
  placed in `include/IMP/name`
- `lib` where the C++ and Python libraries are placed. Module `name` is built
  into a C++ library `lib/libimp_name.so` (or `.dylib` on a Mac) and a Python
  library with Python files located in `lib/IMP/name` and the binary part in
- `doc` where the HTML documentation is placed in `doc/html` and the examples
  in `doc/examples` with a subdirectory for each module
- `data` where each module gets a subdirectory for its data.
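The build layout listed above can be summarized as a small path helper. This is only a sketch: `build_layout` and its `mac` flag are hypothetical names (not part of the IMP tools), and it assumes that the per-module subdirectories of `doc/examples` and `data` are named after the module.

```python
import posixpath


def build_layout(name, mac=False):
    """Return build-directory locations for module `name`.

    Hypothetical helper for illustration only. Paths are relative to
    the build directory and use forward slashes.
    """
    lib_ext = ".dylib" if mac else ".so"
    return {
        # headers for the module
        "headers": posixpath.join("include", "IMP", name),
        # C++ library
        "cpp_library": posixpath.join("lib", "libimp_" + name + lib_ext),
        # Python source files of the module
        "python_files": posixpath.join("lib", "IMP", name),
        # assumed: the per-module subdirectory carries the module name
        "examples": posixpath.join("doc", "examples", name),
        "data": posixpath.join("data", name),
    }
```

For a module called `core`, the headers would then be expected under `include/IMP/core` and the C++ library at `lib/libimp_core.so`.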

When IMP is installed, the structure from the build directory is
moved over more or less intact except that the C++ and Python
libraries are put in the (different) appropriate locations.

# Writing new code # {#devguide_new_code}

The easiest way to start writing new functions and classes is to
create a new module using [make-module.py](\ref dev_tools_make_module).
This creates a new module in the `modules` directory. Alternatively, you can
simply use the `scratch` module.

We highly recommend using a revision control system to
keep track of changes to your module.

If, instead, you choose to add code to an existing module, you need to
consult with the person or people who control that module. Their names
can be found on the module main page.

When designing the interface for your new code, you should

- search IMP for similar functionality and, if there is any, adapt
  the existing interface for your purposes. For example, the existing
  templates that should be used for the design of any functions that
  create particles from a file or write particles to a file. Since
  IMP::atom::Bond, IMP::algebra::Segment3D and
  IMP::display::Geometry all use methods like
  IMP::algebra::Segment3D::get_point() to access the
  endpoints of a segment, any new object which defines similar
  point-based geometry should do likewise.

- think about how other people are likely to use the code. For
  example, not all molecular hierarchies have atoms as their leaves,
  so make sure your code searches for arbitrary
  IMP::core::XYZ particles rather than atoms if you only care

- look for easy ways of splitting the functionality into pieces. It
  generally makes sense, for %example, to split selection of the
  particles from the action taken on them, either by accepting a
  IMP::kernel::ParticleIndexes object.
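Splitting selection from action can be illustrated with a toy sketch. Plain Python lists and tuples stand in for particles here; `select_by_radius` and `translate` are hypothetical names and none of this is IMP API. The point is only that the action accepts a list of indexes and knows nothing about how they were chosen.

```python
def select_by_radius(radii, max_radius):
    """Selection step: indexes of the particles with radius <= max_radius."""
    return [i for i, r in enumerate(radii) if r <= max_radius]


def translate(coords, indexes, offset):
    """Action step: shift only the listed particles, in place."""
    for i in indexes:
        x, y, z = coords[i]
        coords[i] = (x + offset[0], y + offset[1], z + offset[2])


# Toy data: three particles with coordinates and radii.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
radii = [1.0, 5.0, 1.5]

chosen = select_by_radius(radii, max_radius=2.0)
translate(coords, chosen, offset=(0.0, 1.0, 0.0))
```

Because `translate` only sees indexes, any other selection scheme can be combined with it without changing the action code.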

You may want to read [the design example](\ref designexample) for
some suggestions on how to go about implementing your functionality

## Coding conventions ## {#devguide_conventions}

Make sure you read the [API Conventions](\ref introduction_conventions) page

To ensure code consistency and readability, certain conventions
must be adhered to when writing code for IMP. Some of these
conventions are automatically checked for by source control before
allowing a new commit, and you can also check new
code yourself by running [check_standards.py](#devguide_check_standards).

### Indentation ### {#devguide_indentation}

All C++ headers and code should be indented with 2-space indents. Do not use
tabs. [clang-format](\ref dev_tools_clang_format) can help you do this formatting

All Python code should conform to the [Python style
translates to 4-space indents, no tabs, and similar class, method and
variable naming to the C++ code. You can ensure that your Python code
is correctly indented by using the
[cleanup_code.py script](\ref dev_tools_clang_format).

### Names ### {#devguide_names}

See the [introduction](\ref introduction_names) first. In addition, developers

- all preprocessor symbols must begin with `IMP`.
- files that implement a single class should be named for that
  class; for example the `SpecialVector` class could be implemented in
  `SpecialVector.h` and `SpecialVector.cpp`
- files that provide free functions or macros should be given names
  `separated_by_underscores`, for %example `container_macros.h`
- functions which take a parameter which has units should have the
  unit as part of the function name, for %example
  IMP::atom::SimulationParameters::set_maximum_time_step_in_femtoseconds().
  Remember the Mars orbiter. The exception to this is distance and
  force numbers which should always be in angstroms and kcal/mol
  angstrom respectively unless otherwise stated.
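The same unit-in-the-name rule carries over to Python code. The sketch below uses a hypothetical stand-in class, `SimulationSettings`, which is not the IMP class named above; only the setter name mirrors the convention, and the getter and the validation are assumptions made for illustration.

```python
class SimulationSettings(object):
    """Hypothetical stand-in used only to illustrate unit-bearing names."""

    def __init__(self):
        # The unit is recorded in the attribute name as well.
        self._maximum_time_step_fs = 1.0

    def set_maximum_time_step_in_femtoseconds(self, time_step):
        """Store the time step; the unit is part of the method name."""
        if time_step <= 0:
            raise ValueError("time step must be positive")
        self._maximum_time_step_fs = float(time_step)

    def get_maximum_time_step_in_femtoseconds(self):
        return self._maximum_time_step_fs


settings = SimulationSettings()
settings.set_maximum_time_step_in_femtoseconds(2)
```

A caller reading `set_maximum_time_step_in_femtoseconds(2)` cannot mistake the value for picoseconds, which is the purpose of the rule.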

### Passing and storing data ### {#devguide_passing}

- When a class or function takes a set of particles which are expected to
  be those of a particular type of decorator, it should take a list of
  This makes it clearer what attributes the particle is required to have
  and allows functions to be overloaded (so there can be an
  IMP::core::transform() which takes IMP::core::RigidBody particles instead).

- Store collections of IMP::Object-derived
  objects of type `Name` using a `Names`. Declare functions that
  accept them to take a `NamesTemp` (`Names` is a `NamesTemp`).
  `Names` are reference counted (see IMP::RefCounted for details),
  `NamesTemp` are not. Store collections of particles using a
  `Particles` object, rather than decorators.

### Display ### {#devguide_display}

All values must have a `show` method which takes an optional
`std::ostream` and prints information about the object (see
IMP::base::Array::show() for an example). Add a `write` method if you
want to provide output that can be read back in.
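For classes written in Python, an analogous `show` method might look like the sketch below. `Sphere` is a hypothetical value type, and the use of a file-like object in place of `std::ostream` is an assumption about how the convention translates to Python.

```python
import io
import sys


class Sphere(object):
    """Hypothetical value type used only to illustrate a `show` method."""

    def __init__(self, center, radius):
        self.center = center
        self.radius = radius

    def show(self, out=None):
        """Write a short description to `out` (standard output by default)."""
        if out is None:
            # Looked up at call time so that a redirected stdout is honored.
            out = sys.stdout
        out.write("Sphere(center=%s, radius=%s)\n" % (self.center, self.radius))


# Usage: capture the output in a string buffer instead of printing it.
buffer = io.StringIO()
Sphere((0.0, 0.0, 0.0), 2.5).show(buffer)
```

Taking the stream as an optional argument keeps the method usable both interactively and in tests that need to inspect the output.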

### Errors ### {#devguide_errors}

Classes and methods should use IMP exceptions to report errors. See
[checks](exception_8h.html) for more information.

### Namespaces ### {#devguide_namespace}

Use the provided `IMPMODULE_BEGIN_NAMESPACE`,
`IMPMODULE_END_NAMESPACE`, `IMPMODULE_BEGIN_INTERNAL_NAMESPACE` and
`IMPMODULE_END_INTERNAL_NAMESPACE` macros to put declarations in a
namespace appropriate for module `MODULE`.

Each module has an internal namespace, e.g. `IMP::base::internal`, and an internal
include directory `IMP/base/internal`. Any function which is
- not intended to be part of the API,
- liable to change without notice,

should be declared in an internal header and placed in the internal namespace.

The functionality in such internal headers is
- not exported to Python
- and not part of the documented API

As a result, such functions do not need to obey all the coding conventions
(but we recommend that they do).

## Documenting your code ## {#devguide_documenting}

IMP is documented using `doxygen`. See
[Documenting your code in doxygen](http:
to get started. We use `
You are encouraged to use `Doxygen`'s
[markdown support](http:

Python code should provide Python doc strings.

All headers not in internal directories are parsed through
`doxygen`. To hide any function that you do not want documented (for example,
because it is not well tested), surround it with

    void messy_poorly_thought_out_function();

We provide a number of extra Doxygen commands to aid in producing nice

- To mark that some part of the API has not yet been well planned and may change,
  use `\\unstable{Classname}`. The documentation will include a disclaimer
  and the class or function will be added to a list of unstable classes. It is
  better to simply hide such things from `doxygen`.

- To mark a method as not having been well tested yet, use `\\untested{Classname}`.

- To mark a method as not having been implemented, use `\\untested{Classname}`.

## Debugging and testing your code ## {#devguide_testing}

Ensuring that your code is correct can be very difficult, so IMP
provides a number of tools to help you out.

The first set are assert-style macros:

- IMP_USAGE_CHECK() which should be used to check that arguments to
  functions and methods satisfy the preconditions.

- IMP_INTERNAL_CHECK() which should be used to verify internal state
  and return values to make sure they satisfy pre and post-conditions.

See the [checks](exception_8h.html) page for more details. As a
general guideline, any improper usage should produce at least a warning, and
all return values should be checked by such code.

The second is logging macros such as:

- IMP_LOG() which allows controlled display of messages about what the
  code is doing. See [logging](log_8h.html) for more information.

Finally, each module has a set of unit tests. The
tests are located in the `modules/modulename/test` directory.
These tests should try, as much as possible, to provide independent
verification of the correctness of the code. Any
file in that directory or a subdirectory whose name matches `test_*.{py,cpp}`,
`medium_test_*.{py,cpp}` or `expensive_test_*.{py,cpp}` is considered a test.
Normal tests should run in at most a few seconds on a typical machine, medium
tests in 10 seconds or so and expensive tests in a couple of minutes.
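The naming rule above can be sketched as a small classifier. This is not how the IMP build system detects tests; `classify_test` and the category labels are hypothetical, and only the three file name patterns and the two extensions come from the text.

```python
import fnmatch
import os

# Patterns from the text; fnmatchcase matches the whole name, so
# "medium_test_x" does not match "test_*".
TEST_PATTERNS = [
    ("normal", "test_*"),
    ("medium", "medium_test_*"),
    ("expensive", "expensive_test_*"),
]
TEST_EXTENSIONS = (".py", ".cpp")


def classify_test(path):
    """Return 'normal', 'medium' or 'expensive', or None if not a test."""
    stem, ext = os.path.splitext(os.path.basename(path))
    if ext not in TEST_EXTENSIONS:
        return None
    for category, pattern in TEST_PATTERNS:
        if fnmatch.fnmatchcase(stem, pattern):
            return category
    return None
```

For instance, `classify_test("test/medium_test_docking.py")` yields `'medium'`, while a helper file such as `utils.py` is not treated as a test.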

Some tests will require input files or temporary files. Input files
should be placed in a directory called `input` in the `test`
directory. The test script should then call
\command{self.get_input_file_name(file_name)} to get the true path to
the file. Likewise, appropriate names for temporary files should be
obtained by calling
\command{self.get_tmp_file_name(file_name)}. Temporary files will be
located in `build/tmp`. The test should remove temporary files after
use.
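The pattern of reading from an input file and cleaning up a temporary file can be sketched as follows. So that the sketch runs without IMP, `StandInTestCase` is a hypothetical replacement for the test base class: only the two method names come from the text, and their implementation here (a throwaway directory tree) is an assumption.

```python
import os
import shutil
import tempfile
import unittest


class StandInTestCase(unittest.TestCase):
    """Hypothetical stand-in for the base class that IMP tests derive from."""

    def setUp(self):
        # Fake "input" and "tmp" directories inside a throwaway root.
        self._root = tempfile.mkdtemp()
        self._input_dir = os.path.join(self._root, "input")
        self._tmp_dir = os.path.join(self._root, "tmp")
        os.mkdir(self._input_dir)
        os.mkdir(self._tmp_dir)
        with open(os.path.join(self._input_dir, "points.txt"), "w") as f:
            f.write("1 2 3\n")

    def tearDown(self):
        shutil.rmtree(self._root)

    def get_input_file_name(self, file_name):
        return os.path.join(self._input_dir, file_name)

    def get_tmp_file_name(self, file_name):
        return os.path.join(self._tmp_dir, file_name)


class CopyTests(StandInTestCase):
    def test_copy(self):
        source = self.get_input_file_name("points.txt")
        target = self.get_tmp_file_name("copy.txt")
        shutil.copy(source, target)
        try:
            with open(target) as f:
                self.assertEqual(f.read(), "1 2 3\n")
        finally:
            # The test removes its own temporary file.
            os.remove(target)
        self.assertFalse(os.path.exists(target))
```

The `try`/`finally` ensures the temporary file is removed even when an assertion fails.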

## Writing Examples ## {#devguide_examples}

Writing examples is a very important part of being an IMP developer and
one of the best ways to help people use your code. To write a (Python)
example, create a file `myexample.py` in the example directory of an
appropriate module, along with a file `myexample.readme`. The readme
should provide a brief overview of what the code in the module is
trying to accomplish as well as key pieces of IMP functionality that

When writing examples, one should try (as appropriate) to do the following:
- begin the example with `import` lines for the IMP modules used
- have parameters describing the process taking place. These include names of
  PDB files, the resolution to perform computations at, etc.
- define a function `create_representation` which creates and returns the model
  with the needed particles along with a data structure so that key
  particles can be located. It should define nested functions as
  needed to encapsulate commonly used code
- define a function `create_restraints` which creates the restraints to score
  conformations of the representation
- define a function `get_conformations` to perform the sampling
- define a function `analyze_conformations` to perform some sort of clustering
  and analysis of the resulting conformations
- finally do the actual work of calling the `create_representation` and
  `create_restraints` functions and performing sampling and analysis and
  displaying the solutions.

Obviously, not all examples need all of the above parts.
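The layout described above can be sketched with a toy problem. To keep the sketch runnable without IMP, dictionaries and plain functions stand in for the model, particles and restraints; the parameter names and values are hypothetical, and a real example would begin with `import` lines for the IMP modules it uses.

```python
import random

# Parameters describing the process (hypothetical values).
N_PARTICLES = 3
TARGET_DISTANCE = 1.0
N_SAMPLES = 200
SEED = 7


def create_representation():
    """Create the toy model plus a lookup so key particles can be found."""
    def make_particle(name):
        return {"name": name, "xyz": (0.0, 0.0, 0.0)}

    model = [make_particle("p%d" % i) for i in range(N_PARTICLES)]
    lookup = dict((p["name"], p) for p in model)
    return model, lookup


def create_restraints(model):
    """Scoring terms: consecutive particles should be TARGET_DISTANCE apart."""
    def distance_term(a, b):
        def score(conformation):
            d = sum((u - v) ** 2
                    for u, v in zip(conformation[a], conformation[b])) ** 0.5
            return (d - TARGET_DISTANCE) ** 2
        return score

    return [distance_term(i, i + 1) for i in range(len(model) - 1)]


def get_conformations(model, restraints):
    """Sample random conformations; return (score, conformation) pairs."""
    rng = random.Random(SEED)
    results = []
    for _sample in range(N_SAMPLES):
        conformation = [tuple(rng.uniform(-2.0, 2.0) for _axis in range(3))
                        for _particle in model]
        results.append((sum(r(conformation) for r in restraints), conformation))
    return results


def analyze_conformations(conformations):
    """Keep the best-scoring conformation (a stand-in for clustering)."""
    return min(conformations, key=lambda pair: pair[0])


# Finally, do the actual work.
model, lookup = create_representation()
restraints = create_restraints(model)
conformations = get_conformations(model, restraints)
best_score, best = analyze_conformations(conformations)
print("best score: %.3f" % best_score)
```

Keeping each stage in its own function lets a reader replace, say, the sampling step without touching the representation or the analysis.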

The example should have enough comments that the reasoning behind each line
of code is clear to someone who roughly understands how IMP in general works.

Examples must use methods like IMP::base::get_example_data() to access
data in the example directory. This allows them to be run from

## Exporting code to Python ## {#devguide_swig}

IMP uses SWIG to wrap C++ code and export it to Python. Since SWIG is
relatively complicated, we provide a number of helper macros and an example
file (see `modules/example/pyext/swig.i-in`). The key bits are
- the information goes into a file called `swig.i-in` in the module `pyext` directory
- the first part should be one `IMP_SWIG_VALUE()`, `IMP_SWIG_OBJECT()` or
  `IMP_SWIG_DECORATOR()` line per value type, object type or decorator object
  the module exports to Python. Each of these lines looks like

        IMP_SWIG_VALUE(IMP::module_namespace, ClassName, ClassNames);

- then there should be a number of `%include` lines, one per header file
  in the module which exports a class or function to Python. The header files
  must be in order such that no class is used before a declaration for it
  is encountered (SWIG does not do recursive inclusion)
- finally, any templates that are to be exported to SWIG must have a
  `%template` call. It should look something like

        namespace module_namespace {
          %template(PythonName) CPPName<Restraint, 3>;

# Managing your own module # {#devguide_module}

When there is a significant group of new functionality, a new set of
authors, or code that is dependent on a new external dependency, it is
probably a good idea to put that code in its own module. To create a
new module, run the [make-module.py](\ref dev_tools_make_module) script
from the main IMP source directory, passing the name of your new
module. The module name should consist of lower case characters and
numbers and the name should not start with a number. In addition, the
name "local" is special and is reserved for modules that are internal
to code for handling a particular biological system or application. For example:

    ./tools/make-module.py mymodule
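The naming rule can be checked before running the script. The sketch below encodes only the rule as stated in this section (lower case characters and digits, not starting with a digit); `is_valid_module_name` is a hypothetical helper, and `make-module.py` itself may accept or reject names differently.

```python
import re

# One lower case letter, then any number of lower case letters or digits.
_MODULE_NAME = re.compile(r"[a-z][a-z0-9]*\Z")


def is_valid_module_name(name):
    """Check a proposed module name against the rule stated above."""
    return _MODULE_NAME.match(name) is not None


def is_reserved_module_name(name):
    """The name "local" is reserved for internal, system-specific modules."""
    return name == "local"
```

Under this rule `mymodule` and `em2d` pass, while `2dem` and `MyModule` do not.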

The next step is to update the information about the module stored in
`modules/mymodule/README.md`. This includes the names of the authors and
descriptions of what the module is supposed to do.

If the module makes use of external libraries, see the files `modules/base/dependencies.py` and `modules/base/dependency/Log4CXX.description`

Each module has an auto-generated header called `modulename_config.h`.
This header contains basic definitions needed for the module and
should be included (first) in each header file in the module. In
addition, there is a header `module_version.h` which contains the
version info as preprocessor symbols. This should not be included in
module headers or cpp files as doing so will force frequent

# Contributing code back to the repository # {#devguide_contributing}

In order to be shared with others as part of the IMP distribution,
code needs to be of higher quality and more thoroughly vetted than
typical research code. As a result, it may make sense to keep the
code as part of a private module until you better understand what
capabilities can be cleanly offered to others.

The first set of questions to answer are

- What exactly is the functionality I would like to contribute? Is
  it a single function, a single Restraint, a set of related classes

- Is there similar functionality already in IMP? If so, it might make
  more sense to modify the existing code in cooperation with its
  author. At the very least, the new code needs to respect the
  conventions established by the prior code in order to maintain

- Where should the new functionality go? It can either be added to an
  existing module or as part of a new module. If adding to an existing
  module, you must communicate with the authors of that module to get
  permission and coordinate changes.

- Should the functionality be written in C++ or Python? In general, we
  suggest C++ if you are comfortable programming in that language as
  that makes the functionality available to more people.

You are encouraged to post to the
`imp-dev` list to find help
answering these questions as it can be hard to grasp all the various
pieces of functionality already in the repository.

All code contributed to IMP
- must follow the [IMP coding conventions](#devguide_conventions)
- should follow general good [C++ programming practices](#devguide_cpp)
- must have unit tests
- must pass all unit tests
- must have documentation
- must build on all supported compilers (roughly, recent versions of `gcc`,
  `clang++` and `Visual C++`) without warnings
- should have examples
- must not have warnings when its doc is built

See [getting started as a developer](https:

## Once you have submitted code ## {#devguide_supporting}

Once you have submitted code, you should monitor the [Nightly build
your code builds on all platforms and passes the unit tests. Please
fix all build problems as fast as possible.

In addition to monitoring the `imp-dev` list, developers who have a module or
are committing patches to svn may want to subscribe to the `imp-commits` email
list which receives notices of all changes made to the IMP repository.

## Cross platform compatibility ## {#devguide_cross_platform}

IMP is designed to run on a wide variety of platforms. To detect problems on
we provide nightly test runs on the supported
platforms for code that is part of the IMP repository.

In order to make it more likely that your code works on all the supported platforms:
- use the headers and classes in IMP::compatibility when appropriate
- avoid the use of `and` and `or` in C++ code; use `&&` and `||` instead.
- avoid `friend` declarations involving templates; use the preprocessor,
  conditionally on the symbols `SWIG` and `IMP_DOXYGEN`, to hide code as

### C++ 11 ### {#devguide_cxx11}

IMP now turns on C++ 11 support when it can. However, since compilers
are still quite variable in which C++ 11 features they support, it is
not advisable to use them directly in IMP code at this point. To aid
in their use when practical we provide several helper macros:
- IMP_OVERRIDE inserts the `override` keyword when available
- IMP_FINAL inserts the `final` keyword when available

# Good programming practices # {#devguide_cpp}

Two excellent sources for general C++ coding guidelines are

- [C++ Coding Standards](http:

- [Effective C++](http:

IMP endeavors to follow all of the guidelines published in those
books. The Sali lab owns copies of both of these books that you

# IMP gotchas # {#devguide_gotchas}

Below are suggestions prompted by bugs found in code submitted to IMP.

- Never use '`using namespace`' outside of a function; instead
  explicitly provide the namespace. (This avoids namespace pollution, and
  removes any ambiguity.)

- Never use the preprocessor to define constants. Use `const`
  variables instead. Preprocessor symbols don't have scope or type
  and so can have unexpected effects.

- Don't expect IMP::base::Object::get_name() names to be unique; they
  are there for human viewing. If you need a unique identifier
  associated with an object or non-geometric value, just use the
  object or value itself.

- Pass other objects by value or by `const &` (if the object is
  large) and store copies of them.

- Never expose member variables in an object which has
  methods. All such member variables should be private.

- Don't derive a class from another class simply to reuse some
  code that the base class provides - only do so if your derived
  class could make sense when cast to the base class. As above,
  reuse existing code by pulling it into a function.

- Clearly mark any file that is created by a script so that other
  people know to edit the original file.

- Always return a `const` value or `const` reference if you are not
  providing write access. Returning a `const` copy means the
  compiler will report an error if the caller tries to modify the
  return value without creating a copy of it.

- Include files from the local module first, then files from the
  other IMP modules and kernel and finally outside includes. This
  makes any dependencies in your code obvious, and by including
  standard headers \e after IMP headers, any missing includes in the
  headers themselves show up early (rather than being masked by
  other headers you include).

        #include <IMP/mymodule/mymodule_exports.h>
        #include <IMP/mymodule/MyRestraint.h>
        #include <IMP/Restraint.h>

- Use `double` variables for all computational intermediates.

- Avoid using nested classes in the API as SWIG can't wrap them
  properly. If you must use nested classes, you will have to
  do more work to provide a Python interface to your code.

- Delay initialization of keys until they are actually needed
  (since all initialized keys take up memory within each particle,
  more or less). The best way to do this is to have them be static
  variables in a static function:

        FloatKey get_my_float_key() {
          static FloatKey k("hello");
          return k;
        }

- One is almost always the right number:
   - Information should be stored in exactly one
     place. Duplicated information easily gets out of sync.
   - A given piece of code should only appear once. Do not copy,
     paste and modify to create new functionality. Instead,
     figure out a way to reuse the existing code by pulling it
     into an internal function and adding extra parameters. If
     you don't, when you find bugs, you won't remember to fix
     them in all the copies of the code.
   - There should be exactly one way to represent any particular
     state. If there is more than one way, anyone who writes
     library code which uses that type of state has to handle all
     ways. For %example, there is only one scheme for
     representing proteins, namely the IMP::atom::Hierarchy.
   - Each class/method should do exactly one thing. The presence
     of arguments which dramatically change the behavior of the
     class/method is a sign that it should be split. Splitting
     it can make the code simpler, expose the common code for
     others to use and make it harder to make mistakes by
     getting the mode flag wrong.
   - Methods should take at most one argument of each type (and
     ideally only one argument). If there are several arguments
     of the same type (e.g. two different `double` parameters) it is
     easy for a user to mix up the order of arguments and the compiler will
     not complain. `int` and `double` count as
     equivalent types for this rule since the compiler will
     transparently convert an `int` into a `double`.
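The advice on mode flags applies equally to Python code in a module. As a generic sketch (the function names are hypothetical and unrelated to any IMP API), compare one function whose behavior is switched by a flag with two functions that each do exactly one thing:

```python
import math


# Discouraged: the boolean argument dramatically changes what is computed,
# and a call such as get_size(2.0, True) does not say which result it wants.
def get_size(radius, as_volume):
    if as_volume:
        return 4.0 / 3.0 * math.pi * radius ** 3
    return 4.0 * math.pi * radius ** 2


# Preferred: one function per task, each with a single argument.
def get_surface_area(radius):
    return 4.0 * math.pi * radius ** 2


def get_volume(radius):
    return 4.0 / 3.0 * math.pi * radius ** 3
```

After the split, the call site names the quantity it needs and there is no flag to get wrong.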

# Further reading # {#devguide_further_reading}

- [Developer tools](\ref dev_tools)
- [Developer FAQ](http: