Hello all, I am struggling to write a python program that takes protein sequence as input and produces some conformations as the output (with no any other input data). With the help of previous replies, I tried merging the examples: “structure_from_sequence.py”, “charm_forcefield.py” and “basic_optimization.py” and wrote a script that is attached. This script runs and produces chimera python files that I can visualize through chimera.
1. Will the script that I have prepared work as an example to produce 3D structures with just sequence as input? OR have I made any fundamental/conceptual mistakes while merging them? 2. I can produce chimera python files for each of the predicted conformations. But I want pdb files. Is it possible to do that? Are there any examples?
3. Is it possible to add contacts restrain (distance between atoms as a restrain) moving forward with this approach? Sincerely, Badri Adhikari CS Graduate Student, University of Missouri-Columbia, Columbia, Missouri |
import IMP.atom import IMP.container import IMP.example import IMP.statistics ########################################################################################### # Structure from sequence ########################################################################################### # Use the CHARMM all-atom (i.e. including hydrogens) topology and parameters topology = IMP.atom.CHARMMTopology(IMP.atom.get_all_atom_CHARMM_parameters()) # Create a single chain of amino acids and apply the standard C- and N- # termini patches topology.add_sequence('IACGACKPECPVNIIQGS') topology.apply_default_patches() # Make an IMP Hierarchy (atoms, residues, chains) that corresponds to # this topology m = IMP.Model() h = topology.create_hierarchy(m) # Generate coordinates for all atoms in the Hierarchy, using CHARMM internal # coordinate information (an extended chain conformation will be produced). # Since in some cases this information can be incomplete, better results will # be obtained if the atom types are assigned first and the CHARMM parameters # file is loaded, as we do here, so missing information can be filled in. # It will still work without that information, but will approximate the # coordinates. topology.add_atom_types(h) topology.add_coordinates(h) # Write out the final structure to a PDB file IMP.atom.write_pdb(h, 'structure.pdb') ########################################################################################### # Charmm forcefield ########################################################################################### # Create an IMP model and add a heavy atom-only protein from a PDB file m = IMP.Model() # example_protein.pdb is assumed to be just extended chain structure obtained using structure_from_sequence example prot = IMP.atom.read_pdb('structure.pdb', m, IMP.atom.NonWaterNonHydrogenPDBSelector()) IMP.atom.show_molecular_hierarchy(prot) # Read in the CHARMM heavy atom topology and parameter files ff = IMP.atom.get_heavy_atom_CHARMM_parameters() # Using the CHARMM libraries, determine the ideal topology (atoms and their # connectivity) for the PDB file's primary sequence topology = ff.create_topology(prot) # Typically this modifies the C and N termini of each chain in the protein by # applying the CHARMM CTER and NTER patches. Patches can also be manually # applied at this point, e.g. to add disulfide bridges. topology.apply_default_patches() # Make the PDB file conform with the topology; i.e. if it contains extra # atoms that are not in the CHARMM topology file, remove them; if it is # missing atoms (e.g. sidechains, hydrogens) that are in the CHARMM topology, # add them and construct their Cartesian coordinates from internal coordinate # information. topology.setup_hierarchy(prot) # Set up and evaluate the stereochemical part (bonds, angles, dihedrals, # impropers) of the CHARMM forcefield r = IMP.atom.CHARMMStereochemistryRestraint(prot, topology) m.add_restraint(r) # Add non-bonded interaction (in this case, Lennard-Jones). This needs to # know the radii and well depths for each atom, so add them from the forcefield # (they can also be assigned manually using the XYZR or LennardJones # decorators): ff.add_radii(prot) ff.add_well_depths(prot) # Get a list of all atoms in the protein, and put it in a container atoms = IMP.atom.get_by_type(prot, IMP.atom.ATOM_TYPE) cont = IMP.container.ListSingletonContainer(atoms) # Add a restraint for the Lennard-Jones interaction. This is built from # a collection of building blocks. First, a ClosePairContainer maintains a list # of all pairs of Particles that are close. Next, all 1-2, 1-3 and 1-4 pairs # from the stereochemistry created above are filtered out. # Then, a LennardJonesPairScore scores a pair of atoms with the Lennard-Jones # potential. Finally, a PairsRestraint is used which simply applies the # LennardJonesPairScore to each pair in the ClosePairContainer. nbl = IMP.container.ClosePairContainer(cont, 4.0) nbl.add_pair_filter(r.get_pair_filter()) sf = IMP.atom.ForceSwitch(6.0, 7.0) ps = IMP.atom.LennardJonesPairScore(sf) m.add_restraint(IMP.container.PairsRestraint(ps, nbl)) ps=IMP.core.create_xyzr_particles(m, m.get_number_of_particles(), 1.0) chain= IMP.container.ListSingletonContainer(ps, "chain") ########################################################################################### # Basic Optimization and Chain ########################################################################################### s= IMP.core.MCCGSampler(m) s.set_number_of_attempts(2) # but we do want something to watch s.set_log_level(IMP.TERSE) s.set_number_of_monte_carlo_steps(2) # find some configurations which move the particles far apart configs= s.get_sample() print "Found ", configs.get_number_of_configurations(), " configurations" for i in range(0, configs.get_number_of_configurations()): configs.load_configuration(i) d=IMP.display.ChimeraWriter("solution"+str(i)+".py") for p in chain.get_particles(): d.add_geometry(IMP.core.XYZRGeometry(p))