Re: [modeller_usage] PDB updates for Modeller

4 Oct 2004


Procter is right. BUILD_PROFILE can be seen as a command that
supersedes SEQUENCE_SEARCH: it identifies potential templates and
produces a reliable alignment for modeling.
Eswar.
On Oct 4, 2004, at 6:37 AM, J B Procter wrote:
>
> It is possible to build new sequence databases for modeller - and, as
> Eswar said, there are two relevant commands. Writing a script to do this
> is unavoidable, though, unless the caretaker has one ready for everyone
> to download!
>
> As a very quick fix, you could get the current pdb sequence list from
> here :
> ftp://ftp.rcsb.org/pub/pdb/derived_data/pdb_seqres.txt
>
> Then, follow the script in
> modeller7v7/examples/commands/build_profile.top, which shows how you
> can read in a simple FASTA sequence flatfile database, like the one from
> the pdb website, and then use it to align against your sequence in order
> to build a sequence profile (and by that, retrieve all homologous
> sequences from the PDB).
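
Before feeding the downloaded file to a profile-building script, it may be worth checking that it really is the flat FASTA file described above. The following is only a minimal sketch: it assumes header lines start with '>' and that the first word of each header is the sequence identifier. The function names are made up for illustration and are not part of Modeller.

```shell
#!/bin/bash
# Hypothetical helpers (not part of Modeller) for inspecting a FASTA flatfile.

# Print the number of sequence records, i.e. lines that start with '>'.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Print one identifier per record: the first word of each header line,
# with the leading '>' removed.
fasta_ids() {
    sed -n 's/^>\([^ ]*\).*/\1/p' "$1"
}
```

For example, "count_fasta_records pdb_seqres.txt" gives the number of sequences in the file, and "fasta_ids pdb_seqres.txt | head" shows the first few identifiers.
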
>
> To do the job properly, you need to apply the make_chains command
> (modeller7v7/examples/commands/make_chains.top) to generate the extra
> information that is written into the PIR information fields, and used by
> modeller to fetch the correct PDB file for each sequence in the database.
>
> If you have a mirror of the PDB, then this script (for unix) might work:
>
> #!/bin/bash
> # makes chain records and places them in pdb_seq.chn in the current
> # working directory.
> # you need to change this to point to your local copy of the PDB.
>
> PDBDIR="/projects/biodata/pdb/data/structures/all/pdb"
>
> for p in `ls -1 $PDBDIR`
>  do
>     y=`basename $p .ent.Z`;
>     if [[ $p != $y ]]; then
>         # Double quotes let the shell expand $PDBDIR/$p while keeping the
>         # single quotes that the TOP script needs around string values.
>         echo "READ_MODEL FILE = '$PDBDIR/$p'" > make_chains_.top
>         echo "MAKE_CHAINS MINIMAL_CHAIN_LENGTH = 30, \
> MINIMAL_RESOLUTION = 2.0, MINIMAL_STDRES = 30, \
> CHOP_NONSTD_TERMINI = on, \
> STRUCTURE_TYPES = 'structureN structureX'" >> make_chains_.top
>         mod7v7 make_chains_.top
>         cat ${y/pdb/./}.*.chn >> pdb_seq.chn
>         rm ${y/pdb/./}.*.chn
>     fi
>  done;
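
The filename handling in the loop is terse, so here is the same logic pulled out into a small function that can be tried on its own. It is a sketch of what the script above does with each name, assuming mirror files are named like pdb1abc.ent.Z; the function name chn_prefix is hypothetical.

```shell
#!/bin/bash
# Sketch: how the loop maps a PDB mirror filename to the prefix used in
# the *.chn glob. chn_prefix is an illustrative name, not a Modeller tool.

# Print the glob prefix for one mirror filename, or print nothing and
# return 1 if the name does not end in .ent.Z.
chn_prefix() {
    local p=$1
    local y
    y=$(basename "$p" .ent.Z)      # pdb1abc.ent.Z -> pdb1abc
    if [[ $p == "$y" ]]; then      # suffix was not stripped: skip this file
        return 1
    fi
    echo "${y/pdb/./}"             # first "pdb" replaced by "./": ./1abc
}
```

So "chn_prefix pdb1abc.ent.Z" prints "./1abc", and the loop then collects every file matching ./1abc.*.chn in the current directory. Any file in the mirror directory that lacks the .ent.Z suffix is skipped.
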
>
> After that, which will take some time to run, pdb_seq.chn will contain a
> subset of all the PDB chains, in a similar form to the CHAINS_all.seq
> file.
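
A quick way to see what ended up in pdb_seq.chn is to count and list its entries. This sketch assumes the file is in PIR format, where each entry starts with a line of the form ">P1;code". The function names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helpers for a quick look at a PIR-format chain file.

# Print the number of entries, i.e. lines that start with ">P1;".
count_pir_entries() {
    grep -c '^>P1;' "$1"
}

# Print the code of each entry (the text after ">P1;"), one per line.
list_pir_codes() {
    sed -n 's/^>P1;//p' "$1"
}
```

Running "count_pir_entries pdb_seq.chn" after the loop finishes gives the number of chains that passed the MAKE_CHAINS criteria.
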
>
> You should, then, be able to read this new database in, apply SEQFILTER
> (see examples/commands/seqfilter.top), and write out the list of
> chain representatives (at 95%, for instance). For best use, you should
> rewrite the database (via READ_SEQUENCE_DB and WRITE_SEQUENCE_DB) in
> binary format and limit it to just the representative sequences
> generated by SEQFILTER (by specifying the CHAINS_LIST option on
> READ_SEQUENCE_DB).
>
>
> Enjoy!
> j.
>
> _______________________________________________________________________
> Dr JB Procter:Biomolecular Modelling at ZBH - Center for Bioinformatics
> Hamburg       http://www.zbh.uni-hamburg.de/staff.php

Eswar Narayanan