We have been meeting with several of the Chimera developers to discuss a rich file format for saving and loading hierarchical (and coarse-grained) structures in IMP and Chimera, with the intent of pushing a library that supports it as a general standard for interchanging biological data. The current idea is to support one or more conformations of an IMP::atom::Hierarchy, along with markup: geometry associated with one or more particles from the hierarchy, and more abstract features (think restraints) that carry information about sets of particles, scores and other text.
The proposed IMP interface would look something like:

    /** A file storing one or more configurations of an atom::Hierarchy.
        \note Changes in the structure of the hierarchy after construction
        are not permitted; only coordinates are allowed to change.
        \note XYZ, XYZR, Bond, Bonded, Hierarchy, Atom, Residue, Molecule,
        Chain, Domain and Fragment information is stored when available.
    */
    class IMPHDF5EXPORT File {
      boost::scoped_ptr<H5::H5File> file_;
      atom::Hierarchy hierarchy_;

      void initialize_file();
      void check_hierarchy() const;
      void create_hierarchy();
      void create_mapping();
    public:
      //! Load the data from a file into the model
      File(Model *m, std::string fname);
      //! Tie a (possibly empty) file to the given hierarchy
      File(atom::Hierarchy h, std::string fname);
      atom::Hierarchy get_hierarchy() const;

      /** \name ConfigurationSet-like methods
          @{
      */
      void save_configuration();
      void save_configuration_as(unsigned int i);
      void load_configuration(unsigned int i) const;
      void remove_configuration(unsigned int i);
      void clear_configurations();
      unsigned int get_number_of_configurations() const;
      /** @} */

      //! Associate some geometry with a node in the hierarchy
      /** If the geometry is particle-derived it will have different
          coordinates for each frame; if not, it just has one set of
          coordinates.
      */
      void add_geometry(atom::Hierarchy h, display::Geometry *g);
      //! Associate a more general feature with a node in the hierarchy
      void add_feature(atom::Hierarchy h, Feature f);
    };
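To make the intended behaviour of the ConfigurationSet-like methods concrete, here is a rough in-memory sketch. It is not the proposed implementation and uses neither IMP nor HDF5: Coordinates and MockConfigurationFile are hypothetical names invented for illustration, and the choice that save_configuration_as(i) appends when i equals the current number of configurations is my assumption, not something settled above.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the coordinates of a fixed hierarchy.
struct Coordinates {
  std::vector<double> xyz;  // x0, y0, z0, x1, y1, z1, ...
};

// Hypothetical in-memory mock of the ConfigurationSet-like methods.
// The hierarchy structure is fixed; only coordinates are stored per frame.
class MockConfigurationFile {
  Coordinates *current_;             // live coordinates (not owned)
  std::vector<Coordinates> frames_;  // saved configurations
 public:
  explicit MockConfigurationFile(Coordinates *c) : current_(c) {}

  // Append the current coordinates as a new configuration.
  void save_configuration() { frames_.push_back(*current_); }

  // Overwrite configuration i; append if i is one past the end (assumption).
  void save_configuration_as(unsigned int i) {
    if (i > frames_.size()) throw std::out_of_range("no such configuration");
    if (i == frames_.size()) {
      frames_.push_back(*current_);
    } else {
      frames_[i] = *current_;
    }
  }

  // Copy configuration i back into the live coordinates.
  // Throws std::out_of_range if i is not a saved configuration.
  void load_configuration(unsigned int i) const { *current_ = frames_.at(i); }

  // Drop configuration i; later configurations shift down by one.
  void remove_configuration(unsigned int i) {
    if (i >= frames_.size()) throw std::out_of_range("no such configuration");
    frames_.erase(frames_.begin() + i);
  }

  void clear_configurations() { frames_.clear(); }

  unsigned int get_number_of_configurations() const {
    return static_cast<unsigned int>(frames_.size());
  }
};
```

A caller would save a configuration, move some particles, save again, and then use load_configuration(0) to get back to the first set of coordinates.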
If anyone has usage scenarios that might not be covered, or other comments, please let me know.
The initial version of the format will be based on HDF5. This should allow "fast" random access to individual conformations (without loading all of them).
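As a rough illustration of what random access to a single conformation means, the sketch below stores equal-sized frames back to back and reads one by seeking to its offset. It deliberately uses a flat binary file and the C++ standard library rather than HDF5 (where one would instead read part of a dataset that has a frame dimension); write_frames and read_frame are hypothetical helpers, and the fixed frame size is an assumption that follows from the hierarchy not changing after construction.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Write all frames back to back. Every frame must hold the same number of
// doubles so that a frame's byte offset can be computed from its index.
inline void write_frames(const std::string &path,
                         const std::vector<std::vector<double> > &frames) {
  std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
  if (!out) throw std::runtime_error("could not open file for writing");
  for (std::size_t i = 0; i < frames.size(); ++i) {
    out.write(reinterpret_cast<const char *>(frames[i].data()),
              static_cast<std::streamsize>(frames[i].size() * sizeof(double)));
  }
}

// Read only frame `index`, without touching the other frames.
inline std::vector<double> read_frame(const std::string &path,
                                      std::size_t index,
                                      std::size_t doubles_per_frame) {
  std::ifstream in(path.c_str(), std::ios::binary);
  if (!in) throw std::runtime_error("could not open file for reading");
  const std::size_t frame_bytes = doubles_per_frame * sizeof(double);
  // Jump straight to the start of the requested frame.
  in.seekg(static_cast<std::streamoff>(index * frame_bytes), std::ios::beg);
  std::vector<double> frame(doubles_per_frame);
  in.read(reinterpret_cast<char *>(frame.data()),
          static_cast<std::streamsize>(frame_bytes));
  if (!in) throw std::runtime_error("could not read frame");
  return frame;
}
```

The cost of read_frame depends on the size of one frame, not on how many frames the file holds, which is the property we are after.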