Atom::create

Keren Lasker

6 Aug 2009

6:31 p.m.

It seems that the element is now needed for writing atoms in PDB format. Wouldn't it make sense to set the value of the element in Atom::create()?


Daniel Russel

6 Aug

7:05 p.m.

It should get it from the atomtype for all properly initialized atom types. If it isn't currently doing that, blame Dina :-) We should be able to fix it easily.

I'm not sure there is a reason to store the element in the particle rather than just look it up from the atomtype when get_element is called. Is there?
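The lookup-on-demand idea can be sketched in plain Python. This does not use the IMP API: the table contents, the `UNKNOWN_ELEMENT` value, and this `get_element` function are illustrative assumptions, not IMP's actual implementation.

```python
# Hypothetical stand-in for an atom-type -> element table; not IMP's real data.
UNKNOWN_ELEMENT = "UNKNOWN"

ELEMENT_OF_ATOM_TYPE = {
    "N": "N",
    "CA": "C",   # alpha carbon of an amino acid
    "C": "C",
    "O": "O",
    "CB": "C",
}


def get_element(atom_type):
    """Look the element up from the atom type instead of storing it per atom."""
    return ELEMENT_OF_ATOM_TYPE.get(atom_type, UNKNOWN_ELEMENT)


print(get_element("CA"))  # known type
print(get_element("XX"))  # falls back to UNKNOWN_ELEMENT instead of raising
```

With a table like this, an atom created from a known type needs no stored element, and an unknown type yields a sentinel that a writer can handle explicitly.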


Keren Lasker

7:12 p.m.

Yes, it probably should not be stored. Still, doing something like atom = IMP::atom::Atom::create(p,IMP::atom::AT_CA), connecting this atom to a molecule (with residue and chain, of course), and then writing the molecule as PDB results in an "unknown element" error in get_element.


Dina Schneidman

10:36 p.m.

The right way to determine the element is from the element field in the PDB file, since from the atom type it can be ambiguous. If the element is not determined and stored when the PDB is read, then the information is lost. I suggest we just fix the writer.


Keren Lasker

10:43 p.m.

Shouldn't the get_name function of the element table just handle UNKNOWN_ELEMENT?


Dina Schneidman

10:45 p.m.

It does. Update your version.


Daniel Russel

10:46 p.m.

On Aug 6, 2009, at 10:36 PM, Dina Schneidman wrote:

> the right way to determine the element is from element field in pdb,
> since from atom type it can be ambiguous.

When is it ambiguous? Is it often enough that we care? Certainly some have no element (e.g. water), but for those it doesn't really matter what gets written out.

Dina Schneidman

10:56 p.m.

I care :) For example, calcium and C-alpha atoms will have the same atom name, "CA". There is a reason for having this column in the PDB format.
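The ambiguity can be shown with a small, self-contained sketch. It relies on the fixed-column PDB layout (atom name in columns 13-16, residue name in 18-20, element symbol right-justified in 77-78). The helper names `make_record` and `parse_atom_fields`, the serial numbers, and the coordinates are made up for illustration and are not part of IMP.

```python
def make_record(record, serial, name, res_name, chain, res_seq, xyz, element):
    """Build a fixed-column ATOM/HETATM line (illustrative helper only)."""
    x, y, z = xyz
    return (
        f"{record:<6}{serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}"
        + " " * 4                                   # iCode and padding, cols 27-30
        + f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}"
        + " " * 10                                  # cols 67-76
        + f"{element:>2}"                           # element, cols 77-78
    )


def parse_atom_fields(line):
    """Return (atom name, residue name, element) from a fixed-column line."""
    name = line[12:16].strip()
    res_name = line[17:20].strip()
    element = line[76:78].strip().upper()
    return name, res_name, element


# Arbitrary placeholder coordinates; only the name/element columns matter here.
calpha = make_record("ATOM", 2, " CA ", "ALA", "A", 1, (1.0, 2.0, 3.0), "C")
calcium = make_record("HETATM", 900, "CA  ", "CA", "A", 301, (4.0, 5.0, 6.0), "CA")

print(parse_atom_fields(calpha))
print(parse_atom_fields(calcium))
```

Once whitespace is stripped, both atoms are named "CA"; only the element column (or the residue name) tells them apart, which is why discarding the element at read time loses information.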


Daniel Russel

11:19 p.m.

Well, we certainly wouldn't want C-alpha and calcium to get the same AtomType, right? That would mess all sorts of things up, since you couldn't, e.g., do as Hao is doing and index a force field based on the AtomTypes.


Dina Schneidman

7 Aug

10:30 a.m.

Currently AtomType is more of a ProteinAtomType (plus nucleic acids). Having the element stored resolves the problem. However, it would be nice to have less ambiguity in the AtomType as well.


Daniel Russel

10:37 a.m.

On Aug 7, 2009, at 10:30 AM, Dina Schneidman wrote:

> Currently AtomType is more ProteinAtomType (and nucleic acid).

Of course, because that was handled in the original pdb reading code :-)

> Having element stored resolves the problem. However, it would be nice
> to have less ambiguity in the AtomType as well.

AtomType really needs to be the type of the atom. Otherwise, any code which uses them will have to have all sorts of logic to look at the element, whether it is in a protein etc.

It would be much simpler to put the logic on the edges (ie in code that creates Atoms). From what I understand, the mol2 atom types that Hao is using are compatible with the PDB protein atom types, so we perhaps should standardize on those.

A simpler hack is to just prefix hetatoms with "H" since I'm pretty sure the atom types for ATOM records are all well defined. My guess is that hetatom types in the PDB are a bit of a mess, but perhaps things are better than that :-)


Ben Webb

10:39 a.m.

On 08/07/2009 10:37 AM, Daniel Russel wrote:

> A simpler hack is to just prefix hetatoms with "H" since I'm pretty sure
> the atom types for ATOM records are all well defined. My guess is that
> hetatom types in the PDB are a bit of a mess, but perhaps things are
> better than that :-)

Fortunately as of the last remediation of PDB, they have a standardized dictionary for HETATMs. So while many things in PDB are still a mess, atom names are not (at last).

Ben

-- ben@salilab.org http://salilab.org/~ben/ "It is a capital mistake to theorize before one has data." - Sir Arthur Conan Doyle

Daniel Russel

10:59 a.m.

> Fortunately as of the last remediation of PDB, they have a
> standardized dictionary for HETATMs. So while many things in PDB are
> still a mess, atom names are not (at last).

Cool, so we could then just standardize on using PDB atom names for the AtomType. Do we need to add an "Het" as a prefix to the string to make it unique?

Ben Webb

11:01 a.m.

On 08/07/2009 10:59 AM, Daniel Russel wrote:

> Cool, so we could then just standardize on using PDB atom names for the
> AtomType. Do we need to add an "Het" as a prefix to the string to make
> it unique?

No, because it's not the atom names that are unique - it's the residue+atom name pair that is. In PDB atom names only have to be unique within the residue - you can't use them as types.

Ben


Dina Schneidman

11:42 a.m.

Do you mean this dictionary of chemical components? www.wwpdb.org/ccd.html


Ben Webb

11:47 a.m.

On 08/07/2009 11:42 AM, Dina Schneidman wrote:

> Do you mean to his dictionary of chemical components?
> www.wwpdb.org/ccd.html

Yes, that looks about right. Their Ligand Expo tool will even give you pretty pictures of a given residue.

Ben


Dina Schneidman

11:49 a.m.

Yes, that's of course the ultimate solution. I wanted to do that, but changed my mind when I saw the size of this dictionary. It does not seem reasonable to load it for each PDB parsing. However, I do think it would be nice to have it as an option.


Keren Lasker

11:56 a.m.

So, in the end: did you change write_pdb to support Atoms created using the create function? It still does not work for me ..... :)


Dina Schneidman

11:59 a.m.

yes, check it with test_element.py in atom/tests


Daniel Russel

12:02 p.m.

So where do we stand? The key question is what do we want AtomType to mean?

A couple of options:

1)
- we use PDB atom types for AtomType
- AtomType is only guaranteed to be unique when paired with the ResidueType or XXXType of its parent (we need LigandType or something)
- code that creates atoms must fill in element for all atoms it creates (so add another argument to the setup_particle function)
- we could later add a function that generates a unique atom type by combining the parent type and the atomtype
- we don't have to parse the whole dictionary unless we want to get other information about the ligands or to fix broken names

2)
- we generate unique atom types by prefixing the pdb atom types with, for example, the residue name
- AtomType is then globally unique
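A rough sketch of how the two options differ as keys, in plain Python. The function names `pair_key` and `unique_type` and the ":" separator are assumptions made for illustration; neither is an IMP function.

```python
def pair_key(parent_type, atom_name):
    """Option 1: the atom name is only unique together with its parent type."""
    return (parent_type, atom_name)


def unique_type(parent_type, atom_name, sep=":"):
    """Option 2: fold the parent type into one globally unique string."""
    return f"{parent_type}{sep}{atom_name}"


# (parent residue/ligand name, atom name); the last entry is a calcium ion,
# whose residue name and atom name are both "CA".
atoms = [("ALA", "CA"), ("GLY", "CA"), ("CA", "CA")]

names_only = {name for _, name in atoms}
option1 = {pair_key(parent, name) for parent, name in atoms}
option2 = {unique_type(parent, name) for parent, name in atoms}

print(len(names_only), len(option1), len(option2))
print(sorted(option2))
```

Both options distinguish the three atoms; the difference is whether callers must carry the parent type alongside the atom name (option 1) or get a single self-contained identifier (option 2).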

I would vote for 1 but don't feel too strongly either way. We should probably get input from Hao too.


Dina Schneidman

12:04 p.m.

I vote for 1 too, since we basically have it, except for optional dictionary.


Ben Webb

12:17 p.m.

On 08/07/2009 12:02 PM, Daniel Russel wrote:

> So where do we stand? The key question is what do we want AtomType to mean?

In molecular mechanics packages atom types are usually separated from the PDB atom names (they are defined in the forcefield). Multiple PDB atom name/residue name pairs can map to a single atom type.

To avoid confusing our users we should probably not refer to "PDB atom types" anywhere, because what you are calling an atom type is really an atom name.

I think types that are not globally unique is just asking for trouble long term, so my vote is for 2, where the unique type name is derived from one of the existing lists of unique atom types (e.g. CHARMM or Mol2). This is distinct from the name of the atom, however, so a given atom Particle would need both a type and a name to play nicely with both PDB and forcefields.

Ben


Daniel Russel

12:36 p.m.

I think we need to make a list of what we want to be able to do with the names. It seems like:
1) figure out which atoms are bonded to which other atoms
2) figure out how to set up a forcefield for the atoms
3) write a pdb file
4) write a mol2 file
5) allow one to figure out the element
what else?

The combination of the PDB name with the name of the containing residue allows all of these (assuming we get the mapping right for mol2).

It sounds like Charmm by itself isn't enough to do (3) at least (what about (1)?). What about Mol2?

It is reasonable for a Charmm force field setup code to assign Charmm atom types to accelerate its runtime lookup of scoring functions.

> To avoid confusing our users we should probably not refer to "PDB
> atom types" anywhere, because what you are calling an atom type is
> really an atom name.

Perhaps, although we tend to use name to mean "user-settable string", which it definitely is not.

Ben Webb

12:39 p.m.

On 08/07/2009 12:36 PM, Daniel Russel wrote:

> I think we need to make a list of what we want to be able to do with the
> names. It seems like:
> 1) figure out which atoms are bonded to which other atoms
> 2) figure out how to set up a forcefield for the atoms
> 3) write a pdb file
> 4) write a mol2 file
> 5) allow one to figure out the element
> what else?
>
> The combination of the PDB name with the residue name containing allows
> all of these (assuming we get the mapping right for mol2).
>
> It sounds like Charmm by itself isn't enough to (3) at least (what about
> (1)?). What about Mol2?

The topology file maps CHARMM atom types to PDB atom names, and defines the connectivity for each residue type.

Ben


Daniel Russel

12:43 p.m.

Cool, so then option 2 can be:

Use Charmm atom types as AtomType
- pdb reader/writer translates to and from them using the topology file
- they are unique and so can be used to index things


Ben Webb

12:51 p.m.

On 08/07/2009 12:43 PM, Daniel Russel wrote:

> Cool, so then option 2 can be:
> Use Charmm atom types as AtomType
> - pdb reader/writer translates to and from them using the topology file
> - they are unique and so can be used to index things

Note, however, my earlier point, so there is no room for confusion: multiple atoms in a given residue can have the same atom type (e.g. all the hydrogens may have the same type). The atom type does not suffice to uniquely define the atom name - that's why you need a separate atom name. And this would certainly need to be user-settable if the user wants to make any novel molecules that don't currently exist in the topology file/ligand database.

The important distinction here is that atom names are just a construct to make humans' lives easier, while atom types are what the universe cares about.

Ben


Daniel Russel

1:06 p.m.

> Note, however, my earlier point, so there is no room for confusion:
> multiple atoms in a given residue can have the same atom type (e.g.
> all the hydrogens may have the same type). The atom type does not
> suffice to uniquely define the atom name - that's why you need a
> separate atom name. And this would certainly need to be user-settable
> if the user wants to make any novel molecules that don't
> currently exist in the topology file/ligand database.

I thought you said you could use the topology file to go from charmm atom types to pdb names? Or do you also need bonds or other non-type information? Which might be OK, but perhaps complicated.

> The important distinction here is that atom names are just a
> construct to make humans' lives easier, while atom types are what
> the universe cares about.

If a user wants to name an atom, they can just name the particle, so that is well solved.

The important thing is to have a minimal scheme for tagging atoms which supports something like

> 1) figure out which atoms are bonded to which other atoms
> 2) figure out how to set up a forcefield for the atoms
> 3) write a pdb file
> 4) write a mol2 file
> 5) allow one to figure out the element

Identifying atoms via their PDB names and the type of their parent as is currently sort of done, works fine.

So what is a minimal set of CHARMM-related information that can do what we need? Is it simpler/more useful than the PDB-based alternative?

Ben Webb

1:52 p.m.

On 08/07/2009 01:06 PM, Daniel Russel wrote:

> I thought you said you could use the topology file to go from charmm
> atom types to pdb names? Or do you also need bonds or other non-type
> information? Which might be OK, but perhaps complicated.

The topology file is used to map PDB atom name to CHARMM atom types. You can't easily go the other way, since an atom type is basically just the element and some rough categorization of the electronic environment (e.g. aromatic, sp2, sp3 etc.). For example, in benzene all 6 carbons will have the same type (they are sterically and electronically identical, of course), but they'll have different atom names. In principle you can determine the atom name from the topology, but in general it's far easier just to store that name in the particle, particularly once you start modifying the topology (mutating residues, adding/removing sidechains, etc.).
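The one-way nature of this mapping can be sketched in a few lines of Python. This is a toy table, not the real topology file: the residue label `BENZ`, the type labels `C_AROMATIC` and `H_AROMATIC`, and both function names are placeholders invented for illustration. The forward lookup from (residue, atom name) to type is a plain dictionary access, while the reverse lookup for a single type returns six candidate names.

```python
# Toy topology excerpt (hypothetical labels, not real CHARMM data):
# (residue name, PDB-style atom name) -> force-field atom type.
NAME_TO_TYPE = {}
for i in range(1, 7):
    NAME_TO_TYPE[("BENZ", "C%d" % i)] = "C_AROMATIC"
    NAME_TO_TYPE[("BENZ", "H%d" % i)] = "H_AROMATIC"


def atom_type(residue, atom_name):
    """Forward lookup: a given name in a given residue has exactly one type."""
    return NAME_TO_TYPE[(residue, atom_name)]


def candidate_names(residue, wanted_type):
    """Reverse lookup: return every atom name in the residue with this type."""
    return sorted(name for (res, name), t in NAME_TO_TYPE.items()
                  if res == residue and t == wanted_type)


if __name__ == "__main__":
    print(atom_type("BENZ", "C3"))
    print(candidate_names("BENZ", "C_AROMATIC"))
```

Picking the right name out of the candidates would need extra information such as the bond graph, which is why storing the name alongside the type is the simpler option.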

> Identifying atoms via their PDB names and the type of their parent as is
> currently sort of done, works fine.

Sure, but what we call the atom type isn't the atom type - it's the name. It would make more sense to use the particle name for that purpose, therefore, and put real type information (e.g. CHARMM or Mol2 atom type name, element) in a "real" AtomType.

Ben


Daniel Russel

2:25 p.m.

>> Identifying atoms via their PDB names and the type of their parent
>> as is currently sort of done, works fine.
>
> Sure, but what we call the atom type isn't the atom type - it's the
> name.

Sure it is. Why do you say it is not? It happens that the proposed type is derived from the PDB name and only makes sense within the context of the surrounding molecule, and hence is very structure-focused rather than chemistry-focused. But that doesn't make it any less a type.

- It is canonical
- it provides a fine enough labeling of the atoms that we can determine what we need to determine about them
- it has the virtue of being similar to something everyone is familiar with
- there doesn't seem to be a standard convention for assigning atom types (every bit of software seems to do its own thing)

etc.

> It would make more sense to use the particle name for that purpose,
> therefore, and put real type information (e.g. CHARMM or Mol2 atom
> type name, element) in a "real" AtomType.

Why would this make more sense?

The issue is that the user has to provide some identifier for atoms when they create them. This identifier needs to be sufficient that, when we have a full structure, all other information can be derived from it. If possible (and it is) there should be only one such piece of information. If the user just provides the CHARMM atom type, then there is no way to figure out bonds for a residue or the PDB atom names. If we use the PDB name-derived types, then IMP can determine the CHARMM atom type, bonds, use other forcefields, etc.
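As a rough sketch of that argument, the following Python shows a single identifier, the pair (residue type, PDB atom name), being enough to look up element, force-field type and bonded partners. Everything here is assumed for illustration: `RESIDUE_TABLE`, `describe` and the `FF_*` labels are hypothetical, and only the glycine backbone heavy atoms are listed. It is not how IMP stores this information.

```python
# Hypothetical per-residue table keyed by PDB atom name. Only the glycine
# backbone heavy atoms are listed; the force-field labels are placeholders.
RESIDUE_TABLE = {
    "GLY": {
        "atoms": {
            "N": {"element": "N", "ff_type": "FF_AMIDE_N"},
            "CA": {"element": "C", "ff_type": "FF_ALPHA_C"},
            "C": {"element": "C", "ff_type": "FF_CARBONYL_C"},
            "O": {"element": "O", "ff_type": "FF_CARBONYL_O"},
        },
        "bonds": [("N", "CA"), ("CA", "C"), ("C", "O")],
    },
}


def describe(residue, atom_name):
    """Derive element, force-field type and bonded partners from
    (residue type, PDB atom name) alone."""
    entry = RESIDUE_TABLE[residue]
    info = dict(entry["atoms"][atom_name])  # copy so the table is untouched
    partners = []
    for a, b in entry["bonds"]:
        if a == atom_name:
            partners.append(b)
        elif b == atom_name:
            partners.append(a)
    info["bonded_to"] = sorted(partners)
    return info


if __name__ == "__main__":
    print(describe("GLY", "CA"))
```

Starting from the force-field label alone, the same table could not say which atom is meant or what it is bonded to, since nothing prevents two atoms in a residue from sharing a label.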

If some piece of code wants chimera atom types, it can easily derive them from the PDB atom types (and possibly store them back in the Particle).

What am I confused about?

Ben Webb

3:37 p.m.

On 08/07/2009 02:25 PM, Daniel Russel wrote:
>> Sure, but what we call the atom type isn't the atom type - it's the name.
> Sure it is. Why do you say it is not?

I think anybody that has used any other package will be rather confused that our atom "type" is what PDB calls the "name", and not the forcefield type, particularly given that we already have a particle attribute called "name". (I also have a problem with atoms that are clearly identical, such as benzene carbons, being labeled as different types.) This is why I propose we use the particle name for this purpose. To determine elements, forcefield types, etc. using the combination of the atom name and the residue name (as you appear to be proposing) certainly makes sense to me.

Ben


Daniel Russel

3:55 p.m.

> I think anybody that has used any other package will be rather
> confused that our atom "type" is what PDB calls the "name", and not
> the forcefield type,

PDB happened to pick the word "name" and charmm the word "type" for the equivalent concepts in their two systems. We are conflicting with one either way we choose. Since "name" is already taken in IMP to be a label for objects for the purpose of user inspection and having prettier log messages, we are stuck with type (or could switch all Type to Label throughout the code if it makes you happier).

I think it will be clearer once we have charmm working, since then we will have both AtomTypes and CharmmTypes floating around. We already have FormFactorAtomType.

> particularly given that we already have a particle attribute called
> "name".

No there isn't. All IMP::Objects have names, but it is not an attribute.

Dina Schneidman

11:24 a.m.

> atom = IMP::atom::Atom::create(p,IMP::atom::AT_CA)
> connecting this atom to a molecule ( with residue and chain of course)

I tried to do the same thing just now, i.e. print a single coordinate in PDB format. Something that should normally take 2 lines of code ended up taking 14 lines:

    m = IMP.Model()
    rp = IMP.Particle(m)
    ap = IMP.Particle(m)
    cp = IMP.Particle(m)
    chain = IMP.atom.Chain.create(cp, 'A')
    residue = IMP.atom.Residue.create(rp)
    atom = IMP.atom.Atom.create(ap, IMP.atom.AT_CA)
    xyz = IMP.core.XYZ.create(ap)
    hcd = IMP.atom.Hierarchy.cast(cp)
    hrd = IMP.atom.Hierarchy.cast(rp)
    had = IMP.atom.Hierarchy.cast(ap)
    hcd.add_child(hrd)
    hrd.add_child(had)
    print atom.get_pdb_string()

Is there a way to make it simpler? Am I the only one who thinks it should be simpler?

Dina


Daniel Russel

11:34 a.m.

Having methods to create standard bits of molecules would be great. I think they are on the list of high-level functionality I proposed some time back :-)

That said, creating a single coordinate in PDB format is kind of an odd end goal, and if you just want to do that, printf seems the right way to go :-)

Without such methods, you can't do much better, although Chain, Residue and Atom all inherit from Hierarchy, so the casts are not needed. Nor is the separate creation of the Particles:

> m = IMP.Model()
> chain = IMP.atom.Chain.create(IMP.Particle(m), 'A')
> residue = IMP.atom.Residue.create(IMP.Particle(m))
> atom = IMP.atom.Atom.create(IMP.Particle(m), IMP.atom.AT_CA)
> xyz = IMP.core.XYZ.create(atom.get_particle())
> residue.add_child(atom)
> chain.add_child(residue)
> print atom.get_pdb_string()


Dina Schneidman

11:39 a.m.

> That said, creating a single coordinate in PDB format is kind of an odd end
> goal, and if you just want to do that, printf seems the right way to go :-)

It is, but I don't want to worry about field width, alignments, etc. and therefore it makes sense to use existing functionality.
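To make the field-width concern concrete, here is a minimal sketch of what a hand-written formatter has to get right for one ATOM record. The helper `format_pdb_atom` is hypothetical and not part of IMP. The column positions follow the fixed-width PDB ATOM layout, and the defaults for occupancy and temperature factor are arbitrary choices for the example. Note that the element matters even for the name field, which ties back to the start of this thread: names of one-letter elements start in column 14, so " CA " is an alpha carbon while "CA  " is calcium.

```python
def format_pdb_atom(serial, name, res_name, chain_id, res_seq,
                    x, y, z, element, occupancy=1.0, temp_factor=0.0):
    """Return one 80-column PDB ATOM record (hypothetical helper)."""
    # One-letter elements leave column 13 blank; two-letter elements use it.
    if len(element) == 1 and len(name) < 4:
        padded_name = " " + name.ljust(3)
    else:
        padded_name = name.ljust(4)
    return (
        "ATOM  "                               # columns 1-6: record name
        f"{serial:5d} "                        # 7-11: serial, 12: blank
        f"{padded_name}"                       # 13-16: atom name
        " "                                    # 17: alternate location
        f"{res_name:>3s} "                     # 18-20: residue name, 21: blank
        f"{chain_id:1s}"                       # 22: chain identifier
        f"{res_seq:4d}"                        # 23-26: residue number
        "    "                                 # 27: insertion code, 28-30: blank
        f"{x:8.3f}{y:8.3f}{z:8.3f}"            # 31-54: coordinates
        f"{occupancy:6.2f}{temp_factor:6.2f}"  # 55-66: occupancy, B-factor
        "          "                           # 67-76: blank
        f"{element:>2s}"                       # 77-78: element symbol
        "  "                                   # 79-80: charge (left empty)
    )


if __name__ == "__main__":
    print(format_pdb_atom(1, "CA", "GLY", "A", 1, 0.0, 0.0, 0.0, "C"))
```

That is a fair amount of detail to carry around for a single line, which supports reusing whatever the library already provides.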

