Re: [modeller_usage] Searching the PDB for sequence with structure

28 May 2008


Hi Christian,
> If no one will come up with a better solution, you might want to build
> this tool yourself:
That sounds like fun!
>
> All you need to do is to download the PDB files and check whether the
> sequences in question are contained.
The catch is that the sequence might be present but with gaps. It would
be nice to have such cases matched by BLAST, since regular expressions
are not suited for this.
If I am not the only one with this problem, I might do something like:
- Download all PDB Structures.
- Extract the full sequence of each protein as well as the sequence of
the residues that have ATOM records.
- Make a quick alignment between those two sequences.
- Put the two sequences and the PDB ID in a database.
- Make a BLAST database file with the full sequences.
- Make BLAST accessible on a web server.
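
A minimal sketch of the extraction and alignment steps above, in Python
with only the standard library. It assumes classic fixed-column PDB
files, takes the full sequence from the SEQRES records and the
structured sequence from the CA atoms in the ATOM records. The function
names are made up for illustration, and difflib is only a stand-in for
a real pairwise aligner, so repeats in the sequence could be placed
wrongly.

```python
from difflib import SequenceMatcher

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def seqres_sequence(pdb_lines, chain):
    """Full sequence of one chain, read from the SEQRES records."""
    residues = []
    for line in pdb_lines:
        # Column 12 holds the chain ID, residue names start at column 20.
        if line.startswith("SEQRES") and line[11:12] == chain:
            residues.extend(line[19:].split())
    return "".join(THREE_TO_ONE.get(r, "X") for r in residues)


def atom_sequence(pdb_lines, chain):
    """Sequence of the residues that have a CA atom in the ATOM records."""
    seq, seen = [], set()
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # only look at the first model of multi-model files
        is_ca = line[12:16].strip() == "CA"
        if line.startswith("ATOM") and line[21:22] == chain and is_ca:
            res_id = line[22:27]  # residue number plus insertion code
            if res_id not in seen:  # skip alternate locations
                seen.add(res_id)
                seq.append(THREE_TO_ONE.get(line[17:20], "X"))
    return "".join(seq)


def gap_structured(full, structured):
    """Place the structured sequence on the full one, '-' where no ATOM."""
    gapped = ["-"] * len(full)
    matcher = SequenceMatcher(None, full, structured, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            gapped[block.a + k] = structured[block.b + k]
    return "".join(gapped)


# Made-up example: seven residues in the middle have no coordinates.
print(gap_structured("MKTAYIAKQRQISFVKSHFSRQ", "MKTAYIFVKSHFSRQ"))
# MKTAYI-------FVKSHFSRQ
```

The full sequence and its gapped counterpart have the same length, so
both could go into the database next to the PDB ID without storing a
separate alignment.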
The search results could be presented as an alignment of the sequence  
that was searched for and the sequence that actually has structure.
It would be easy to implement queries like "search for sequence X and  
return only results where more than 30% has structure".
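
Such a filter could look roughly like the sketch below. It assumes
each hit is stored as the full sequence plus a copy of the same length
in which residues without ATOM records are written as '-', and that
the 30% is measured over the whole stored sequence (measuring it over
only the BLAST-aligned region would be an equally reasonable reading).
The hit IDs, sequences and function names are made up for
illustration.

```python
def structured_fraction(gapped_structured):
    """Fraction of positions that are not '-' (i.e. have coordinates)."""
    if not gapped_structured:
        return 0.0
    covered = sum(1 for c in gapped_structured if c != "-")
    return covered / len(gapped_structured)


def filter_hits(hits, min_fraction=0.30):
    """Keep only hits where more than min_fraction has structure."""
    return [
        hit for hit in hits
        if structured_fraction(hit["structured"]) > min_fraction
    ]


# Hypothetical search hits, not real PDB entries.
hits = [
    {"id": "hit1", "full": "MKTAYIAKQR", "structured": "MKTAY-----"},
    {"id": "hit2", "full": "MKTAYIAKQR", "structured": "--TA------"},
]

kept = filter_hits(hits)
for hit in kept:
    # Present each result as an alignment of full vs. structured sequence.
    print(hit["id"])
    print("full       " + hit["full"])
    print("structure  " + hit["structured"])
```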
> checks) is contained in the biopython project ( http://biopython.org/ ).
I would prefer to do it in Ruby, but either would work.
> MD is not an option for you?
I think the stretches are too long (several dozen amino acids) and too
numerous (~100 structures * several gaps/structure). I would also feel
better if I could use experimental data.
Cheers,
Florian

Florian Odronitz