Re: [modeller_usage] How to get Secondary Structure in Alignment file

Thomas,
I didn't expect everything you wrote in your reply.

I understand everything you said. But there is a big problem with trying multiple alignments and iterated trials to arrive at good models (which is itself ambiguous, since you can't really discriminate between a good and a bad model until you have the answer or a GDT score).

Two important factors influence the ability to predict accurate models: the extent of structural conservation between target and template, and the correctness of alignment (Kryshtafovych et al 2005, PMID: 16187365). The alternative alignments serve as a means to optimize the alignment at regions of sparse sequence similarity, which usually do not include common SSEs. Apart from statistical potentials and protein geometry checks, I'm not aware of any other method to discriminate good and bad structures.

So I am developing a complete in-house algorithm that produces the best representative alignment of the considered templates against the target. I want to incorporate secondary structure information or localized fold topologies to generate a 3D alignment file. Hence I am looking for a Python script that can be integrated with MODELLER to produce information like this:

Template1> HCLCH
Template2> HCLHH
Target> xxxxx
(H=Helix, C=Coil, L=Loop)

I JUST NEED THE SECONDARY STRUCTURE INFORMATION OF EACH ALIGNED RESIDUE OF EACH TEMPLATE, BASED ON ITS PROBABILITY OF LANDING IN THE SAME FOLD, AS PER THE AVAILABLE PROXIMAL RESIDUES.

I'm not aware of how you can get the sequence alignment in that form from MODELLER, but there must be a way, perhaps with another program. Alternatively, you can write a Python code snippet that calls PyMOL or VMD to get the secondary structure of each residue of your templates (I can give you my scripts in case you are interested).
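A minimal sketch of the second half of that idea, assuming the per-residue secondary structure of a template has already been exported (for example from PyMOL or VMD) as a plain string with one letter per residue. The function name and the toy sequences are hypothetical, and nothing here calls MODELLER itself; it only projects the SS letters onto a gapped alignment row so that gaps stay in place.

```python
def project_ss_onto_alignment(aligned_seq, ss_string, gap_chars="-"):
    """Return an SS row with the same length as aligned_seq.

    aligned_seq: one template row taken from the alignment, gaps included.
    ss_string:   one SS letter per residue of the ungapped template
                 (assumed to come from an external tool).
    """
    n_residues = sum(1 for c in aligned_seq if c not in gap_chars)
    if n_residues != len(ss_string):
        raise ValueError(
            f"{n_residues} residues in the alignment row but "
            f"{len(ss_string)} SS letters were supplied"
        )
    ss_iter = iter(ss_string)
    # Copy gaps through unchanged; consume one SS letter per residue.
    return "".join("-" if c in gap_chars else next(ss_iter) for c in aligned_seq)


if __name__ == "__main__":
    # Toy data, not taken from a real structure.
    print(project_ss_onto_alignment("MK-TAY", "CHHHC"))
```

Running the same function over every template row gives one SS line per template, in the same column frame as the sequence alignment.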

Yes, and by the way, SCWRL won't help if you choose a bad template or alignment. Refinement improves GDT_TS scores by just 4 or 5%. Good coordinate selection is what matters most. MODELLER also can't produce good models if you mix in incorrect template information, and the problem becomes more complex when a new fold comes up.

I never said that. When sequence identity is <35%, the rotamers of conserved residues may differ in up to 45% of the cases (Sanchez et al. 1997, PMID: 9485495). SCWRL4 is used after the optimum template(s) and alignment have been selected, to optimize the rotamers. This approach has also been implemented by the best performing human-expert group in CASP8 (Venclovas et al. 2009, PMID: 19639635).

In essence, I use the alternative-alignment approach when I have to choose between templates that have 25-40% sequence identity with the target. Provided that all templates share the same fold (i.e. they may belong to the same SCOP family or have high CE-MC mean Z-scores), selecting multiple templates at that low level of sequence similarity produces better models than those constructed from a single parent (Moult J 2005, PMID: 15939584; Dalton et al. 2007, PMID: 17510171). HMOPAL (that's the name of my pipeline) aims to select the best template combination and alignment based on the native-likeness of the resulting homology models, as judged primarily by the DOPE-HR raw score, and secondarily by PROCHECK and WHAT_CHECK.
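The two-level selection described above can be sketched as follows. This is only an illustration under stated assumptions, not the HMOPAL code: the record layout, the `stereo_problems` count standing in for the PROCHECK and WHAT_CHECK results, the shortlist size and all numbers are hypothetical. The one fact relied on is that DOPE is an energy-like score, so lower values are better.

```python
from typing import NamedTuple


class ModelScore(NamedTuple):
    name: str             # hypothetical model identifier
    dope_hr: float        # raw DOPE-HR score; lower (more negative) is better
    stereo_problems: int  # hypothetical count of stereochemistry flags


def select_model(models, shortlist_size=3):
    """Shortlist by DOPE-HR, then pick the model with the fewest flags."""
    # Primary criterion: keep only the best-scoring models by DOPE-HR.
    shortlist = sorted(models, key=lambda m: m.dope_hr)[:shortlist_size]
    # Secondary criterion: fewest stereochemistry flags, DOPE-HR breaks ties.
    return min(shortlist, key=lambda m: (m.stereo_problems, m.dope_hr))


if __name__ == "__main__":
    # Toy numbers for illustration only.
    candidates = [
        ModelScore("aln1_tmplAB", -52000.0, 7),
        ModelScore("aln2_tmplAB", -51500.0, 2),
        ModelScore("aln1_tmplA", -51000.0, 4),
        ModelScore("aln3_tmplB", -48000.0, 0),
    ]
    print(select_model(candidates).name)
```

Keeping the two criteria in separate steps means a model with clean stereochemistry but a poor DOPE-HR score never reaches the second stage.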

regards,
Thomas

Thanks for the support.

Ashish

Ashish Runthala,
Lecturer, Structural Biology Cell,
Biological Sciences Group,
BITS, Pilani
Rajasthan, INDIA

----- Original Message -----
From: "Thomas Evangelidis" <tevang3 AT gmail.com>
To: "Ashish Runthala" <ashishr AT bits-pilani.ac.in>
Cc: "modeller usage" <modeller_usage@listsrv.ucsf.edu>
Sent: Tuesday, March 29, 2011 4:46:20 AM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
Subject: Re: [modeller_usage] How to get Secondary Structure in Alignment file

On 27 March 2011 10:05, Ashish Runthala < ashishr AT bits-pilani.ac.in > wrote:

Dear Thomas,
I will write it clearly this time.
Take the simplest case. Suppose I have a structural/sequence alignment of 3 templates against a target sequence. Since MODELLER predicts the tertiary conformation of the target sequence, it must also derive alignment-based secondary structure information from all the templates. I want that specific secondary structure information alongside the alignment file for the templates aligned against the target sequence.
My C script will then take this alignment file and generate the best possible representative alignment for the target sequence. In the iterations I tried on CASP8 targets, the resulting models gave excellent results.

So if I understand correctly, your script creates alternative alignments, then runs Modeller to build a model of the target from each alignment, it reads the secondary structure (SS) from each model, and based on this SS it judges which alternative alignment is the optimum. Is this how it works? Can you elaborate on the algorithm? What are the criteria to select the optimum alignment and what is the answer you seek from the mailing list?

I think I made myself clear this time.
By the way, what are you working on? I think you are also working on the same topics. What have you redefined? I would be interested to know.

I've been working on optimization at the alignment level in the past, but in my approach the selection of the optimum alignment and template combination is based on the overall native-likeness of the resulting homology models, not just the secondary structure of the target. More specifically, I create homology models at several levels of optimization and refinement from alternative alignments and different template combinations. The resulting models are sorted according to the DOPE-HR score, and the best of them undergo side-chain optimization with SCWRL4, and extensive stereochemical tests with the programs PROCHECK and WHAT_CHECK to select the best of them. The whole procedure is streamlined and parallelized with a combination of bash and python scripts. However, it has been tested only on my cases and a few CASP9 targets. More details can be found at:

https://sites.google.com/site/hmopalpipeline/home (I will update the website tomorrow)

Thanks in advance,

Ashish


----- Original Message -----
From: "Thomas Evangelidis" < tevang3 AT gmail.com >
To: "Ashish Runthala" < ashishr AT bits-pilani.ac.in >
Cc: "modeller usage" < modeller_usage@listsrv.ucsf.edu >
Sent: Saturday, March 26, 2011 6:06:02 PM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
Subject: Re: [modeller_usage] How to get Secondary Structure in Alignment file

On 26 March 2011 07:47, Ashish Runthala < ashishr AT bits-pilani.ac.in > wrote:

Dear Thomas,
You described the free tools in detail, and I appreciate that. But calling each of them manually, over and over, is tedious, fun though it is.

So if I get it right this time, you were not asking which free, non-commercial tools are available to display your sequence alignment along with the secondary structure, is that correct? Could you please be more careful with your syntax, because it's rather difficult to follow you.

My query was this: the secondary structure information is present in the templates, but indels may disturb it. I mean that the native information is degraded.

So, based on the indels present in the current alignment, I would like to feed it directly into a Python script, so that my next program, which generates the optimal alignment, can succeed. I have done all of this manually several times in order to design the algorithm that generates the best representative alignment.

So you want to write a python script that can cope with indels inside SSEs. Namely, given a structure-based alignment between your target and templates, you want to select which of the following alignments is the optimum:

target   > HCSIHHSC
Template1> HCSIHISC
Template2> HCSHHISC
-------------------------------------------------------
target   > HCS-IHHSC
Template1> HCS-IHISC
Template2> HCSH-HISC
-------------------------------------------------------
target   > HCSI-HHSC
Template1> HCS-IHISC
Template2> HCS-HHISC

etc.

Is this what you seek?
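One way to make the question concrete is to give each alternative alignment a numeric score and keep the highest. The sketch below is only an assumed scoring scheme, not the algorithm under discussion: a column scores +1 when target and template carry the same SS letter, a residue paired with a gap costs `gap_penalty`, and a column where both rows have a gap is neutral. The function names are hypothetical; the rows are the three alternatives listed above.

```python
def ss_agreement_score(target, templates, gap="-", gap_penalty=1):
    """Sum of SS matches minus gap penalties over all templates."""
    score = 0
    for tmpl in templates:
        if len(tmpl) != len(target):
            raise ValueError("all rows must have the same length")
        for t, s in zip(target, tmpl):
            if t == gap and s == gap:
                continue              # shared gap column: neutral
            if t == gap or s == gap:
                score -= gap_penalty  # residue paired with a gap
            elif t == s:
                score += 1            # same SS state in both rows
    return score


def best_alignment(candidates, **kwargs):
    """Return the (target, templates) pair with the highest score."""
    return max(candidates, key=lambda c: ss_agreement_score(c[0], c[1], **kwargs))


if __name__ == "__main__":
    candidates = [
        ("HCSIHHSC", ["HCSIHISC", "HCSHHISC"]),
        ("HCS-IHHSC", ["HCS-IHISC", "HCSH-HISC"]),
        ("HCSI-HHSC", ["HCS-IHISC", "HCS-HHISC"]),
    ]
    for target, templates in candidates:
        print(target, ss_agreement_score(target, templates))
    print("best:", best_alignment(candidates)[0])
```

A real implementation would probably weight gaps inside helices and strands more heavily than gaps in coil, which could be added by making `gap_penalty` depend on the SS letter of the residue facing the gap.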

Will you please elaborate this time?

Thanks,
Ashish


----- Original Message -----
From: "Thomas Evangelidis" < tevang3 AT gmail.com >
To: "Ashish Runthala" < ashishr AT bits-pilani.ac.in >
Cc: "modeller" < modeller_usage@listsrv.ucsf.edu >
Sent: Friday, March 25, 2011 4:50:42 PM GMT +05:30 Chennai, Kolkata, Mumbai, New Delhi
Subject: Re: [modeller_usage] How to get Secondary Structure in Alignment file

Maestro from Schrodinger can display both the predicted secondary structure elements (SSEs) of the target and the SSEs of the templates above the alignment dynamically (that is, if you change the alignment, the SSEs change accordingly).

Alternatively, you can use the following free, non-commercial tools:

ESPript ( http://espript.ibcp.fr/ESPript/ESPript/ ): can create an image of your alignment with the SSEs of your templates above it. No SSE of the target is displayed though.

MolIDE ( http://dunbrack.fccc.edu/molide/index.php ): I believe this alignment editor (it is more than just that) can display both the predicted SSEs of the target and the SSEs of the templates. However, I haven't managed to run it yet due to compilation problems. I intend to give it a try in the near future.

STRAP ( http://3d-alignment.eu/ ): it can do both, but the GUI is awkward on Linux (window and font sizes are inconsistent, etc.).

Finally you can add a row with the secondary structure (as you wrote it in your email) above each sequence in your alignment editor. I prefer Jalview and like to play around with groups and colors. See the attached image for an example. SSE of your templates can be retrieved from PDB (e.g. http://www.pdb.org/pdb/explore/sequenceText.do?structureId=2IXF&chainId=A ) whereas the SSE of your target can be predicted by Jpred3 ( http://www.compbio.dundee.ac.uk/www-jpred/ ).

HTH,

Thomas

On 25 March 2011 05:47, Ashish Runthala < ashishr AT bits-pilani.ac.in > wrote:

Dear Modellers,
Suppose I have two templates and a target sequence. Instead of a file based on the sequence alignment, I want to consider the secondary structure information, as given below:

Template1> HCSIHISC
Template2> HCSHHISC

H=Helix, C=Coil, S=Strand, I=Indels

Is it possible to get the alignment in this form for multiple templates against the target sequence (whose structure can also be represented as the best aligned folds)?

Thanks
Ashish

_______________________________________________
modeller_usage mailing list
modeller_usage@listsrv.ucsf.edu
https://salilab.org/mailman/listinfo/modeller_usage

--

======================================================================
Thomas Evangelidis
PhD student
Biomedical Research Foundation, Academy of Athens
4 Soranou Ephessiou, 115 27 Athens, Greece
email: tevang AT bioacademy.gr
       tevang3 AT gmail.com
website: https://sites.google.com/site/thomasevangelidishomepage/
