Re: [modeller_usage] PDB updates for Modeller

To: procter@zbh.uni-hamburg.de
Subject: Re: [modeller_usage] PDB updates for Modeller
From: Eswar Narayanan <eashwar@salilab.org>
Date: Mon, 4 Oct 2004 17:41:05 -0700
Cc: modeller_usage@salilab.org

Procter is right. BUILD_PROFILE can be seen as a command that supersedes SEQUENCE_SEARCH, to identify potential templates and get a reliable alignment for modeling.

Eswar.


On Oct 4, 2004, at 6:37 AM, J B Procter wrote:

It is possible to build new sequence databases for Modeller and, as
Eswar said, there are two relevant commands. Writing a script to do this
is unavoidable, though, unless the caretaker has one ready for everyone
to download!

As a very quick fix, you could get the current pdb sequence list from
here :
ftp://ftp.rcsb.org/pub/pdb/derived_data/pdb_seqres.txt

Then, follow the script in
modeller7v7/examples/commands/build_profile.top, which shows how you
can read in a simple FASTA sequence flatfile database, like the one from
the PDB website, and then use it to align against your sequence in order
to build a sequence profile (and by that, retrieve all homologous
sequences from the PDB).
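
Before handing the downloaded flatfile to BUILD_PROFILE, it may be worth
checking what is in it. The sketch below is only an illustration: the
function names are hypothetical, and it assumes that each header line in
pdb_seqres.txt looks like ">101m_A mol:protein length:154  MYOGLOBIN",
which you should verify against your own copy.

```shell
#!/bin/bash

# Hypothetical helper: count the FASTA records in a flatfile.
# Every record starts with a header line beginning with '>'.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Hypothetical helper: print only the records whose header contains
# "mol:protein" (assumed header layout, see the note above), so that
# nucleic acid chains are left out.
protein_records_only() {
    awk '/^>/ { keep = ($0 ~ /mol:protein/) } keep' "$1"
}
```

For example, "protein_records_only pdb_seqres.txt > pdb_protein.fsa"
followed by "count_fasta_records pdb_protein.fsa" would give the number
of protein chains that remain (pdb_protein.fsa is just a made-up output
name).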

To do the job properly, you need to apply the make_chains command
(modeller7v7/examples/commands/make_chains.top) to generate the extra
information that is written into the PIR information fields and used by
Modeller to fetch the correct PDB file for each sequence in the database.

If you have a mirror of the PDB, then this script (for unix) might work:

#!/bin/bash

# Makes chain records and places them in pdb_seq.chn in the current
# working directory.
# You need to change this to point to your local copy of the PDB.
PDBDIR="/projects/biodata/pdb/data/structures/all/pdb"

for p in `ls -1 $PDBDIR`
do
    # Only process compressed PDB entries (names ending in .ent.Z).
    y=`basename $p .ent.Z`
    if [[ $p != $y ]]; then
        # Write a small TOP script for this entry: READ_MODEL, then MAKE_CHAINS.
        echo READ_MODEL FILE = \'$PDBDIR/$p\' > make_chains_.top
        echo MAKE_CHAINS MINIMAL_CHAIN_LENGTH = 30, \
            MINIMAL_RESOLUTION = 2.0, MINIMAL_STDRES = 30, \
            CHOP_NONSTD_TERMINII = on, \
            STRUCTURE_TYPES =\'structureN structureX\' >> make_chains_.top
        mod7v7 make_chains_.top
        # pdbXXXX becomes ./XXXX; append the matching .chn files, then remove them.
        cat ${y/pdb/\.\/}.*.chn >> pdb_seq.chn
        rm ${y/pdb/\.\/}.*.chn
    fi
done

After that, which will take some time to run, pdb_seq.chn will contain a
subset of all the PDB chains, in a similar form to the CHAINS_all.seq
file.
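
A quick way to see how many chains made it into pdb_seq.chn is to count
the PIR headers. This is a minimal sketch with hypothetical function
names; it assumes every entry starts with a ">P1;code" line, as in the
PIR files that ship with Modeller.

```shell
#!/bin/bash

# Hypothetical helper: count the entries in a PIR file by counting
# the ">P1;" header lines.
count_pir_entries() {
    grep -c '^>P1;' "$1"
}

# Hypothetical helper: list the entry codes, one per line, by
# stripping the ">P1;" prefix from each header line.
list_pir_codes() {
    sed -n 's/^>P1;//p' "$1"
}
```

Running "count_pir_entries pdb_seq.chn" after the loop has finished
gives the size of the new database, and "list_pir_codes pdb_seq.chn"
shows which chains passed the length and resolution cutoffs.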

You should, then, be able to read this new database in, apply SEQFILTER
(see examples/commands/seqfilter.top), and write out the list of
chain representatives (at 95%, for instance). For best use, you should
rewrite the database (via READ_SEQUENCE_DB and WRITE_SEQUENCE_DB) in
binary format and limit it to just the representative sequences
generated by SEQFILTER (by specifying the CHAINS_LIST option on
READ_SEQUENCE_DB).
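
If you want to look at the representative subset as plain text, outside
of Modeller, something like the following might do. It is not the
CHAINS_LIST mechanism itself, just an awk equivalent for inspection. The
function name is hypothetical, and it assumes the representatives list
has one entry code per line and that the database uses ">P1;code"
headers.

```shell
#!/bin/bash

# Hypothetical helper: print only the PIR entries whose code appears
# in a list file.
#   $1 = list of codes, one per line (assumed format)
#   $2 = PIR database, e.g. pdb_seq.chn
extract_pir_entries() {
    awk 'NR == FNR { want[$1] = 1; next }   # first file: remember the codes
         /^>P1;/   { keep = (substr($0, 5) in want) }   # header: wanted?
         keep' "$1" "$2"
}
```

For example, "extract_pir_entries reps.txt pdb_seq.chn > pdb_reps.chn",
where reps.txt and pdb_reps.chn are made-up file names.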


Enjoy!
j.

_______________________________________________________________________
Dr JB Procter:Biomolecular Modelling at ZBH - Center for Bioinformatics
Hamburg       http://www.zbh.uni-hamburg.de/staff.php
