Useless technical discussion follows :-)
>> I picked my pdb reader since it is small and so can be stuck in with the
>> rest of imp so that no one has to worry about installing external
>> libraries and it does what I want, namely give me a hierarchy for
>> proteins and bond information.
>
> We're talking about two different things here. You want to distribute
> your PDB reader with IMP. I don't want to include the code in IMP SVN.
> The two issues are orthogonal;

Eventually, yes. At the moment they are not, as people just get IMP from svn and build in their SVN checkout. If we can put a link in SVN to another repository, that would be very cool (and I wouldn't put it beyond SVN), or I guess scons could just fetch files from another repository at build time. Again, I don't see that which way it is done is very important, as long as it does not require people to take extra steps (beyond svn update and scons in the IMP or new_imp tree) to keep it up to date.
> How are you "getting bonds" out of a PDB file? PDB files don't provide
> that information.

It doesn't make sense to talk about a molecule without its bonds, so a molecule loader had better handle them at one level or another. BALL handles standard bonds as a cleanup pass after reading, which is a good solution too, but it makes things a bit funny for CONECT records, as those bonds are created on reading while the other, more standard bonds need to wait for the cleanup.
If someone wants to put tables into IMP that build the bonds from atom names, that would be better than doing it in the reader. It just requires more work right now.
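
To make the table idea concrete, here is a rough sketch of building bonds from atom names. Everything in it is hypothetical (bond_table, build_bonds and the two residue entries are made up for illustration and are not IMP or BALL code), and it only covers bonds within one residue, not the peptide bond between residues.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> AtomNamePair;
typedef std::vector<std::pair<int, int> > BondList;

// Hypothetical table: residue name -> pairs of atom names that are bonded.
// Only two residues are listed here, purely as an illustration.
const std::map<std::string, std::vector<AtomNamePair> >& bond_table() {
  static const std::map<std::string, std::vector<AtomNamePair> > table = {
      {"GLY", {{"N", "CA"}, {"CA", "C"}, {"C", "O"}}},
      {"ALA", {{"N", "CA"}, {"CA", "C"}, {"C", "O"}, {"CA", "CB"}}},
  };
  return table;
}

// Given a residue name and the atom names actually read from the file,
// return the bonds as pairs of indices into `atoms`.
// Table entries whose atoms are missing from the file are skipped.
BondList build_bonds(const std::string& residue,
                     const std::vector<std::string>& atoms) {
  BondList bonds;
  auto entry = bond_table().find(residue);
  if (entry == bond_table().end()) return bonds;  // unknown residue: no bonds

  // Map each atom name to its position in the input.
  std::map<std::string, int> index;
  for (int i = 0; i < static_cast<int>(atoms.size()); ++i) index[atoms[i]] = i;

  for (const AtomNamePair& b : entry->second) {
    auto first = index.find(b.first);
    auto second = index.find(b.second);
    if (first != index.end() && second != index.end()) {
      bonds.push_back(std::make_pair(first->second, second->second));
    }
  }
  return bonds;
}
```

With something like this the reader only has to produce atom names per residue, and explicit bonds from the file can be appended to the same list afterwards.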
> For example, Modeller reads a set of residue types from its parameter
> files at runtime, and after that maps every residue type in the PDB
> file from the string to an integer residue type. Unknown residue types
> result in a warning, and the generation of a new integer residue type
> at runtime. You could of course use Residue objects rather than
> integer types.

Sure. New types are pretty trivial to add on top of any system of predefining the common atom types (just have a function which takes a string and associates it with the next free int in the internal maps). But we might as well give the standard ones standard integer codes and nice C++ names (the enum), which reduces the chance of typos. Where this data happens to reside isn't all that important at the moment.
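
A minimal sketch of that scheme, assuming a made-up ResidueTypeTable class and a three-entry enum (neither is existing IMP or Modeller code): the standard types get fixed enum values, and any unknown string is handed the next free integer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Standard types get fixed codes and C++ names; only three are shown.
// NUM_STANDARD_RESIDUE_TYPES doubles as the first free code for new types.
enum ResidueType { ALA = 0, GLY, SER, NUM_STANDARD_RESIDUE_TYPES };

class ResidueTypeTable {
 public:
  ResidueTypeTable() {
    // The order here must match the enum above.
    static const char* const standard[] = {"ALA", "GLY", "SER"};
    for (std::size_t i = 0; i < NUM_STANDARD_RESIDUE_TYPES; ++i) {
      add(standard[i]);
    }
  }

  // Return the code for `name`, assigning the next free int if it is new.
  int get_or_add(const std::string& name) {
    auto it = index_.find(name);
    return it != index_.end() ? it->second : add(name);
  }

  bool is_standard(int code) const {
    return code >= 0 && code < NUM_STANDARD_RESIDUE_TYPES;
  }

  const std::string& name(int code) const {
    assert(code >= 0 && code < static_cast<int>(names_.size()));
    return names_[code];
  }

 private:
  int add(const std::string& name) {
    int code = static_cast<int>(names_.size());
    names_.push_back(name);
    index_[name] = code;
    return code;
  }

  std::unordered_map<std::string, int> index_;  // name -> code
  std::vector<std::string> names_;              // code -> name
};
```

Code that cares about a standard residue can compare against GLY directly, while something like get_or_add("MSE") would still hand back a usable code; a warning for unknown names could be emitted inside add().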