On Oct 8, 2009, at 8:35 PM, Dina Schneidman wrote:
> Protein is more than a chain. Chain corresponds to tertiary structure.
> Protein's quaternary structure can have more than one chain!
> A classic example is hemoglobin, 4 chains. Another classics is
> antibody, 2 chains.

That is what I always assumed :-)

> So we need chains around! and also how can we add bonds without
> chains? do you plan to connect them together?

No one said anything about getting rid of chains, just about getting rid of the PROTEIN/CHAIN distinction.

> and let me put two more cents:
> PDB format does not define any hierarchy. it is a set of atoms.

It does too define a hierarchy:
    a pdb file contains models
    models contain chains and heterogens
    chains contain residues
    residues contain atoms
    heterogens contain atoms

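To make the containment explicit, here is a minimal sketch of that hierarchy as plain Python data classes. All class and field names are made up for illustration; this is not the reader's actual API, just the nesting listed above written down.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Atom:
    name: str


@dataclass
class Residue:
    name: str
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class Heterogen:
    # Heterogens contain atoms directly, with no residue level in between.
    name: str
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class Chain:
    chain_id: str
    residues: List[Residue] = field(default_factory=list)


@dataclass
class Model:
    # A model holds chains and heterogens side by side.
    chains: List[Chain] = field(default_factory=list)
    heterogens: List[Heterogen] = field(default_factory=list)


@dataclass
class PDBFile:
    models: List[Model] = field(default_factory=list)


def count_atoms(pdb: PDBFile) -> int:
    """Walk every branch of the hierarchy and count the atoms at the leaves."""
    total = 0
    for model in pdb.models:
        for chain in model.chains:
            for residue in chain.residues:
                total += len(residue.atoms)
        for het in model.heterogens:
            total += len(het.atoms)
    return total


# Tiny hand-built example: one chain with one residue, plus one heterogen.
example = PDBFile(models=[
    Model(
        chains=[Chain("A", [Residue("MET", [Atom("N"), Atom("CA")])])],
        heterogens=[Heterogen("HEM", [Atom("FE")])],
    )
])
```

Note that nothing in this nesting groups chains into a protein; that level is simply absent.
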
> if we
> want to build an hierarchy out of PDB it should clearly follow from
> the format.

I don't see that the second follows at all from the first. In fact I would say quite the opposite. But I disagree with the first, so it doesn't really matter :-)

> So the best way is to have 4 levels that are well defined
> by the corresponding PDB fields:
> Atom, Residue, Chain, Root
> I think all other assumptions are only assumptions and a good source
> for bugs.

Plus there are the various ligands and such in parallel, which also need to get attached to root (preventing root from being a protein or molecule).

To put one of the problems another way: ultimately, one would like a hierarchy with a molecule (protein) containing multiple chains. The PDB reader can't, in general, create such a thing, since it doesn't know how the chains are grouped into molecules. As a result, it has to return something intermediate. Currently it returns a hierarchy which would have to be broken apart and put back together in order to get the presumably desired result. I would suggest producing a vector of molecules instead, so that the user can filter them and assemble them as needed. We could provide special versions of the reader to handle easy cases (like where you know the PDB file only contains one protein).
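
A rough sketch of the "vector of molecules" idea, assuming the standard fixed-column ATOM/HETATM layout (atom name in columns 13-16, residue name in 18-20, chain ID in 22, residue number in 23-26). The names `Molecule` and `read_molecules` are hypothetical, not an existing interface. The sketch ignores MODEL records and coordinates, and the sample lines are truncated after the residue number to keep it short.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class Molecule:
    kind: str      # "chain" or "heterogen"
    chain_id: str
    # Each atom is recorded as (residue name, residue number, atom name).
    atoms: List[Tuple[str, int, str]] = field(default_factory=list)


def read_molecules(lines: Iterable[str]) -> List[Molecule]:
    """Return a flat list: one Molecule per chain, one per heterogen."""
    molecules = {}  # insertion-ordered, so output follows file order
    for line in lines:
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        atom_name = line[12:16].strip()
        res_name = line[17:20].strip()
        chain_id = line[21]
        res_seq = int(line[22:26])
        if record == "ATOM":
            key = ("chain", chain_id)
        else:
            # Each heterogen residue becomes its own molecule.
            key = ("heterogen", chain_id, res_name, res_seq)
        if key not in molecules:
            molecules[key] = Molecule(key[0], chain_id)
        molecules[key].atoms.append((res_name, res_seq, atom_name))
    return list(molecules.values())


SAMPLE = [
    "ATOM      1  N   MET A   1",
    "ATOM      2  CA  MET A   1",
    "ATOM      3  N   GLY B   1",
    "HETATM    4 FE   HEM A 101",
]

molecules = read_molecules(SAMPLE)

# The caller, who knows which chains belong together, does the assembly.
protein = [m for m in molecules if m.kind == "chain" and m.chain_id in ("A", "B")]
ligands = [m for m in molecules if m.kind == "heterogen"]
```

The reader makes no guess about which chains form a protein; the "easy case" versions would just wrap a call like this and group everything into one molecule.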