Hello,
I am building a system using PMI and IMP.pmi.topology.TopologyReader. I have proteins, DNA and RNA. In my PDBs, the DNA is encoded as DA, DC, DT, DG and the RNA as A, C, U, G. In the topology file, I specify the objects as DNA and RNA:
|DNA1N |dark magenta|myfasta.fasta|DNAN,DNA
|longRNAR1 |magenta |myfasta.fasta|longRNAR,RNA|
If I have A, C, G, U and T in my fasta file I get the following warnings and errors for DNA.
WARNING: Inconsistency between FASTA sequence and PDB sequence. FASTA type 1 "G" and PDB type "DG"
WARNING: Inconsistency between FASTA sequence and PDB sequence. FASTA type 2 "G" and PDB type "DG"
WARNING: Inconsistency between FASTA sequence and PDB sequence. FASTA type 3 "G" and PDB type "DG"
The script then continues and modeling proceeds. However, when I try to implement the "deposition" part covered in https://integrativemodeling.org/tutorials/deposition/, there are several errors due to not recognizing residue types, since the alphabet is defined as the peptide alphabet in the ihm module by default. How should the DNA be encoded in the fasta file?
Additionally, we have two copies of one protein. This causes the second copy to not be created as an asymmetric unit in create_component in the protocol output of mmcif.py. We could fix this by changing the function as follows:
    def create_component(self, state, name, modeled, asym_name=None):
        if asym_name is None:
            asym_name = name
        new_comp = name not in self._all_components
        self._all_components[name] = None
        if modeled:
            state.all_modeled_components.append(name)
            self.asym_units[asym_name] = None  # this was originally in the if statement below
            if new_comp:
                # assign asym once we get sequence
                self.all_modeled_components.append(name)
Thanks, Swantje
-- ________________________________ Swantje Lenz M. Sc. Biotechnology Fachgebiet Bioanalytik (TIB 4/4-3), Institut für Biotechnologie, Technische Universität Berlin Technische Universität Berlin | Gustav-Meyer-Allee 25 | Gebäude 17a | 13355 Berlin Tel +49 30 314-72906 | web: http://www.bioanalytik.tu-berlin.de/
On 12/5/19 6:39 AM, Lenz, Swantje wrote:
> I am building a system using PMI and IMP.pmi.topology.TopologyReader. I
> have proteins, DNA and RNA.
You may be the first person to try this so I wouldn't be surprised if you run into issues - IMP's support for nucleic acids isn't well tested. If you can get me a simplified version of your setup so I can reproduce your issues I should be able to fix them.
> If I have A,C,G,U and T in my fasta file I get the following warnings
> and errors for DNA.
Looks like IMP's FASTA reader currently assumes all nucleic acids are RNA. But this is easy to fix.
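To illustrate where the warnings come from: PDB files name DNA residues DA/DC/DG/DT and RNA residues A/C/G/U, so a one-letter FASTA code only matches the PDB residue name if the reader knows which kind of polymer it is looking at. The following is a minimal sketch in plain Python, not IMP's actual FASTA reader; the names `expected_pdb_names` and `mismatches` and the `polymer` argument are hypothetical and exist only for this illustration.

```python
# Illustrative only: this is not IMP code. It maps one-letter FASTA codes
# to the residue names used in PDB files for each nucleic acid type.
RNA_NAMES = {"A": "A", "C": "C", "G": "G", "U": "U"}
DNA_NAMES = {"A": "DA", "C": "DC", "G": "DG", "T": "DT"}


def expected_pdb_names(fasta_seq, polymer):
    """Return the PDB residue names expected for a FASTA sequence.

    polymer is "RNA" or "DNA" (a hypothetical argument for this sketch).
    """
    table = {"RNA": RNA_NAMES, "DNA": DNA_NAMES}[polymer]
    return [table[code] for code in fasta_seq]


def mismatches(fasta_seq, pdb_names, polymer):
    """List (position, fasta_code, pdb_name) wherever the names disagree."""
    expected = expected_pdb_names(fasta_seq, polymer)
    return [(i + 1, code, pdb)
            for i, (code, exp, pdb)
            in enumerate(zip(fasta_seq, expected, pdb_names))
            if exp != pdb]
```

Calling `mismatches("GGG", ["DG", "DG", "DG"], "RNA")` reports a mismatch at positions 1, 2 and 3, analogous to the three warnings quoted above, while the same call with `"DNA"` reports none.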
> The script then continues and modeling
> proceeds. However, when I try to implement the "deposition" part covered
> in https://integrativemodeling.org/tutorials/deposition/, there are
> several errors due to not recognizing residue types, since the alphabet
> is defined as the peptide alphabet in the ihm module by default.
There's no support for this in PMI currently, unfortunately. But it shouldn't be too difficult to add.
> Additionally, we have two copies of one protein. This causes the second
> copy to not be created as an asymmetric unit in create_component in the
> protocol output of mmcif.py.
Fixed in git.
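For anyone following the thread, the behaviour reported for the second copy can be reproduced with a toy stand-in. This is only a sketch of the logic in the proposed change from the original message; it is not the code in mmcif.py and makes no claim about what the fix in git looks like. The class `ToyOutput`, the helper `asym_names`, the flag `register_every_copy` and the names "ProtA.0"/"ProtA.1" are all hypothetical.

```python
# Toy stand-in, not IMP's mmcif.py: shows why the asym unit has to be
# registered for every copy rather than only when the component is new.
class ToyOutput:
    def __init__(self, register_every_copy):
        self.register_every_copy = register_every_copy
        self.components = {}   # component name -> None
        self.asym_units = {}   # asym unit name -> None

    def create_component(self, name, asym_name=None):
        if asym_name is None:
            asym_name = name
        new_comp = name not in self.components
        self.components[name] = None
        # Original behaviour: register only when the component is new.
        # Proposed behaviour: register on every call.
        if self.register_every_copy or new_comp:
            self.asym_units[asym_name] = None


def asym_names(register_every_copy):
    """Create two copies of one component and return the asym unit names."""
    out = ToyOutput(register_every_copy)
    out.create_component("ProtA", asym_name="ProtA.0")
    out.create_component("ProtA", asym_name="ProtA.1")  # second copy
    return sorted(out.asym_units)
```

With `register_every_copy=False` only the first copy gets an asym unit; with `True` both copies do.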
Ben
participants (2): Ben Webb; Lenz, Swantje