Re: [modeller_usage] Including hetatm when hetatm does not immediately follow ATOM list

To: Shuo Huai Johnny Wu <wushuohuai@gmail.com>
Subject: Re: [modeller_usage] Including hetatm when hetatm does not immediately follow ATOM list
From: Modeller Caretaker <modeller-care@salilab.org>
Date: Fri, 04 Sep 2009 15:23:34 -0700
Cc: modeller_usage@salilab.org

On 09/02/2009 01:54 PM, Shuo Huai Johnny Wu wrote:

  I used the simple automodel script, and it worked well. When I tried
to include both the ligands and the carbohydrates (hetatms) into the
structure by setting env.io.hetatm = True, I get errors that do not
make much sense to me. My structure pdb is different from the hetatm
example pdb in that the ligand hetatm residue numbers do not
immediately follow the atom residue sequence.

...

The way Modeller handles this situation is very simple and predictable. (Admittedly it would be nice if it were more complex and did the "right thing", but then it would be less predictable. ;)

When Modeller reads a PDB file, it reads it sequentially, from beginning to end. Each residue (ATOM or HETATM or water, if you have env.io.hetatm and/or env.io.water turned on) is read in, in exactly the order given in the PDB file. So if your PDB file contains 10 amino acid residues in chain A, then 10 more in chain B, then two ligands in chain A then two ligands in chain B, Modeller will read the following sequence from the PDB file, where a and b are amino acids and A and B ligands:

aaaaaaaaaa/bbbbbbbbbb/AA/BB
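
To make the reading order concrete, here is a small pure-Python sketch. It does not use Modeller and is not Modeller's actual reader; it only mimics the sequential behaviour described above on a made-up PDB file. The helper names (make_line, example_pdb, read_residues, as_string) are hypothetical, the residue names ALA and NAG are placeholders, and water handling is left out.

```python
def make_line(record, serial, resname, chain, resnum):
    """Build a minimal fixed-column PDB coordinate line (hypothetical helper)."""
    atom_name = "CA"  # placeholder atom name; only the residue fields matter here
    return (f"{record:<6}{serial:>5} {atom_name:<4} {resname:>3} {chain}{resnum:>4}"
            f"    {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


def example_pdb():
    """Toy file: 10 residues in A, 10 in B, then 2 ligands in A, 2 in B."""
    layout = [
        ("ATOM", "ALA", "A", range(1, 11)),
        ("ATOM", "ALA", "B", range(1, 11)),
        ("HETATM", "NAG", "A", range(11, 13)),
        ("HETATM", "NAG", "B", range(11, 13)),
    ]
    lines = []
    serial = 0
    for record, resname, chain, numbers in layout:
        for number in numbers:
            serial += 1
            lines.append(make_line(record, serial, resname, chain, number))
    return lines


def read_residues(lines, hetatm=False):
    """Collect residues strictly in file order; HETATM only if requested."""
    residues = []
    last_key = None
    for line in lines:
        record = line[:6].strip()
        if record == "ATOM" or (record == "HETATM" and hetatm):
            # record type, residue name, chain ID, residue number (+ insertion code)
            key = (record, line[17:20].strip(), line[21], line[22:27].strip())
            if key != last_key:  # a new residue starts when the key changes
                residues.append(key)
                last_key = key
    return residues


def as_string(residues):
    """Lowercase chain letter for ATOM, uppercase for HETATM, '/' at chain changes."""
    out = []
    prev_chain = None
    for record, _resname, chain, _resnum in residues:
        if prev_chain is not None and chain != prev_chain:
            out.append("/")
        out.append(chain.lower() if record == "ATOM" else chain.upper())
        prev_chain = chain
    return "".join(out)


print(as_string(read_residues(example_pdb(), hetatm=True)))
```

With hetatm=False the same toy file yields only the two amino acid chains, since the HETATM records are skipped.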

When you read an alignment that contains a structure, Modeller needs to read the PDB file to match the sequence. This sequence must match exactly. Because you often only want a subset of the PDB, the alignment file header can specify the first residue and chain to start reading at, and the residue and chain to finish at. So if in the example above your A chain amino acids are numbered 1 through 10, the B chain also 1 through 10, and the four ligands are labeled 11:A, 12:A, 11:B, and 12:B and you tell Modeller to read from 5:A to 11:A, you will get (remember that it reads the PDB file sequentially):

aaaaaa/bbbbbbbbbb/A

i.e. the sequence of residues starting at 5:A and ending at 11:A. Note that since the entire B chain lies between 10:A and 11:A, it'll also read that.
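
The range selection can be sketched the same way. This is a toy model in plain Python, not Modeller code: residues are (number, chain, is_ligand) tuples listed in file order, and file_order, select_range and to_string are hypothetical names chosen for this illustration.

```python
def file_order():
    """Residues of the example PDB in file order: (number, chain, is_ligand)."""
    order = [(n, "A", False) for n in range(1, 11)]
    order += [(n, "B", False) for n in range(1, 11)]
    order += [(11, "A", True), (12, "A", True), (11, "B", True), (12, "B", True)]
    return order


def select_range(residues, start, end):
    """Return every residue from `start` to `end` inclusive, in file order.

    `start` and `end` are (number, chain) pairs, like 5:A and 11:A.
    Nothing is skipped in between, whatever chain it belongs to.
    """
    selected = []
    reading = False
    for number, chain, is_ligand in residues:
        if (number, chain) == start:
            reading = True
        if reading:
            selected.append((number, chain, is_ligand))
            if (number, chain) == end:
                break
    return selected


def to_string(residues):
    """Lowercase for amino acids, uppercase for ligands, '/' at chain changes."""
    out = []
    prev_chain = None
    for _number, chain, is_ligand in residues:
        if prev_chain is not None and chain != prev_chain:
            out.append("/")
        out.append(chain.upper() if is_ligand else chain.lower())
        prev_chain = chain
    return "".join(out)


print(to_string(select_range(file_order(), (5, "A"), (11, "A"))))
```

Counting inclusively, 5:A through 10:A is six residues, so the sketch returns six a's before the first chain break, followed by the whole B chain and the single ligand 11:A.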

I first tried to include all hetatms as '.' with the unspecified
residues past the end of the protein (aa 348) replaced with '-'.

P1;B._taurus

structureY:B._taurus:1    :A:  977:A:ground state rhodopsin:Bovine: :
MNGTEGPNFYVPFSNKTGVVRSPFEAPQYYLAEPWQFSMLAAYMFLLIMLGFPINFLTLYVTVQHKKLRTPLNYILLNLA
VADLFMVFGGFTTTLYTSLHGYFVFGPTGCNLEGFFATLGGEIALWSLVVLAIERYVVVCKPMSNFRFGENHAIMGVAFT
WVMALACAAPPLVGWSRYIPEGMQCSCGIDYYTPHEETNNESFVIYMFVVHFIIPLIVIFFCYGQLVFTVKEAAA----S
ATTQKAEKEVTRMVIIMVIAFLICWLPYAGVAFYIFTHQGSDFGPIFMTIPAFFAKTSAVYNPVIYIMMNKQFRNCMVTT
LCCGKNP------STTVSKTETSQVAPA----------------------------------------------------
--------------------------------------------------------------------------------
--------------------..----------------------------------------------------------
--------------------------------------------------------------------------------
------------------------------------------------------------..------------------
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--------------------.-.-.----.--------------------------------------------------
----------------.*

You have a typo here - 'structureY' should be 'structureX'. But you are asking Modeller to read residues 1:A through 977:A, so it'll read all of the A chain, then all of the B chain (since it hasn't reached 977:A yet), then all the HETATM residues until it gets to 977:A (a bunch of NAG, HG, and ZN residues).

It results in this error:

_modeller.ModellerError: read_te_290E>  Number of residues in the
alignment and  pdb files are different:      347      650 For
alignment entry:        1  B._taurus

Remember, Modeller needs to match the alignment sequence against the PDB sequence. So you need the full sequence of the B chain in your B._taurus alignment entry - you can't have any "unspecified residues". (So your solution above is close, but you need the gaps in the target sequence, not the template structure.)
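
A quick way to see why the counts in the error differ is to count the residues an alignment entry actually specifies. The sketch below is plain Python, not a Modeller API; count_residues and matches_pdb are hypothetical helpers. It assumes the usual alignment conventions: '-' is a gap, '/' a chain break, '*' the terminator, and everything else (including '.' for a HETATM residue) counts as one residue.

```python
def count_residues(alignment_sequence):
    """Count residues in an alignment sequence string.

    Gaps ('-'), chain breaks ('/'), the terminator ('*') and whitespace
    are not residues; every other character, '.' included, is one residue.
    """
    return sum(
        1
        for char in alignment_sequence
        if char not in "-/*" and not char.isspace()
    )


def matches_pdb(alignment_sequence, pdb_residue_count):
    """True if the entry specifies as many residues as were read from the PDB."""
    return count_residues(alignment_sequence) == pdb_residue_count


# Toy template entry: four residues, a gap, two more residues, two ligands.
entry = "ACDE--/GH..*"
print(count_residues(entry))
print(matches_pdb(entry, 8))
```

Padding a template entry with '-' therefore never raises its residue count; only real residue characters (or '.') can make it match what was read from the PDB file.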

Alternatively, you can edit the PDB file in a text editor and make a modified PDB file that only contains the residues you're interested in, if you don't want to specify the whole B sequence in your alignment.
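
If you would rather script that edit than do it by hand, something along these lines may work. It is a minimal sketch in plain Python under a few assumptions: the file uses standard fixed-column PDB records, you want one chain's ATOM records plus HETATM records of selected residue names in that same chain, and all other records can be dropped. The names keep_residues, make_line and example_lines are hypothetical, and the residue names are placeholders.

```python
def make_line(record, serial, resname, chain, resnum):
    """Build a minimal fixed-column PDB coordinate line (hypothetical helper)."""
    atom_name = "CA"  # placeholder atom name
    return (f"{record:<6}{serial:>5} {atom_name:<4} {resname:>3} {chain}{resnum:>4}"
            f"    {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


def example_lines():
    """Tiny made-up file with two chains and a few HETATM residues."""
    return [
        make_line("ATOM", 1, "ALA", "A", 1),
        make_line("ATOM", 2, "ALA", "B", 1),
        make_line("HETATM", 3, "NAG", "A", 11),
        make_line("HETATM", 4, "ZN", "A", 12),
        make_line("HETATM", 5, "NAG", "B", 11),
        "END",
    ]


def keep_residues(lines, chain, het_names):
    """Keep ATOM lines of `chain` and HETATM lines of `chain` named in `het_names`."""
    kept = []
    for line in lines:
        if len(line) < 22:  # too short to carry a chain ID column
            continue
        record = line[:6].strip()
        in_chain = line[21] == chain
        if record == "ATOM" and in_chain:
            kept.append(line)
        elif record == "HETATM" and in_chain and line[17:20].strip() in het_names:
            kept.append(line)
    return kept


for line in keep_residues(example_lines(), "A", {"NAG"}):
    print(line)
```

The kept lines stay in their original order, so the residue sequence read from the trimmed file is just the chain followed by the selected ligands.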

I've seen the following thread and tried to follow the advice
enclosed, but I am still having trouble.
http://salilab.org/archives/modeller_usage/2008/msg00183.html
http://salilab.org/archives/modeller_usage/2008/msg00186.html

The problem in that thread is different - they had the wrong ending residue number in their alignment file header, so Modeller simply stopped reading the PDB file before it reached the ligands.

	Ben Webb, Modeller Caretaker
--
modeller-care@salilab.org             http://www.salilab.org/modeller/
Modeller mail list: http://salilab.org/mailman/listinfo/modeller_usage
