Thanks Karsten,
Your message was helpful. I think I'll try T-COFFEE/SAP along with the ALIGN2D routine. I've developed a few programs/scripts that helped me correct the sequence ID codes to match the atom files. The TOP scripts also came in very handy in preparing these files.
Cvetan
Quoting Karsten Suhre Karsten.Suhre@igs.cnrs-mrs.fr:
> Hello Cvetan,
>
> from my own experience I know that Modeller is quite picky about correct
> sequences (as it should be).
>
> Anyhow, I would not use ClustalW alignments in the first place. When you
> model a particular protein using several templates, you are *much* better
> off using structural alignments instead of sequence-based alignments alone.
> You could use, for example, T-COFFEE together with SAP (or maybe Fugue, but
> I have no experience with it). You would then use your PDB files from the
> start in the alignment process, not unrelated GenBank sequences, and
> Modeller would thus find all residues it needs in the alignment.
> Alternatively, there are structural alignments readily available at
> HOMSTRAD for a large number of proteins.
>
> Note also that Modeller comes with a file modlib/CHAINS_all.seq. If you
> took the sequences from this file in your ClustalW alignments it should
> work with Modeller.
>
> Hope this helps,
>
> Kind regards,
>
> Karsten.
>
> > I'm trying to model 400+ proteins based on ~100 templates. I have an
> > alignment file of 1500+ sequences comprising the templates, targets and
> > others. ClustalW was used to align the sequences.
> >
> > I have a few problems.
> > - The sequences in the alignment file do not match the amino acids
> >   present in the pdb files. _Generally_ the pdb files contain more
> >   residues than specified in the aligned sequence. Therefore I have to
> >   either concatenate the pdb files or specify the appropriate residue
> >   ranges in the alignment file.
> > - The ID codes in the alignment file do not match the atom file names.
> > - There is no "second" line in each entry in the alignment file.
> >
> > Although all this can be done manually, I can't help but wonder if there
> > is a way to automate/expedite the process. A paper published by Sanchez
> > and Sali (1998) mentioned a Perl script that allowed for rapid progress
> > through the various steps involved with modelling. Suggestions will be
> > most appreciated.
> >
> > Some of the pdb files are complexes. If it can be avoided I'd prefer not
> > to use these structures. However, if I do decide to use some of them, I
> > plan to minimise the energy via MD (CNS) of the protein (minus the
> > ligand) before using it as a template. What are people's thoughts about
> > this?
> >
> > Many thanks
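
For anyone facing the same three problems listed above (sequence not matching the atom file, ID codes not matching file names, missing "second" line), here is a minimal Python sketch of one way to automate it. It is not one of the scripts mentioned in this thread. It reads the residues actually present in the ATOM records of one chain and writes a PIR entry whose second line carries the matching residue range. The function names `read_chain` and `pir_entry` are made up for illustration, and mapping non-standard residues to "X" is an assumption that may need adjusting for your Modeller setup.

```python
# Sketch: build a Modeller-style PIR entry directly from PDB ATOM records,
# so the sequence, ID code and residue range agree with the atom file.

THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def read_chain(pdb_lines, chain_id):
    """Return (sequence, first_residue, last_residue) for one chain.

    Only ATOM records of the first model are read. Residue numbers are
    returned as strings and include the insertion code, if any.
    """
    residues = []
    seen = set()
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # stop after the first model of a multi-model file
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        if line[21] != chain_id:
            continue
        key = line[22:27]  # residue number (cols 23-26) + insertion code
        if key in seen:
            continue  # further atoms of a residue already recorded
        seen.add(key)
        # Assumption: anything outside the 20 standard residues becomes "X".
        residues.append((key.strip(), THREE_TO_ONE.get(line[17:20], "X")))
    if not residues:
        raise ValueError("no ATOM records found for chain %r" % chain_id)
    sequence = "".join(one_letter for _, one_letter in residues)
    return sequence, residues[0][0], residues[-1][0]


def pir_entry(code, chain_id, sequence, first, last):
    """Format a PIR entry; `code` should match the atom file name."""
    header = ">P1;%s" % code
    # Second line: type:file:start:chain:end:chain:name:source:resol:rfactor
    second = "structureX:%s:%s:%s:%s:%s::::" % (
        code, first, chain_id, last, chain_id)
    return "\n".join([header, second, sequence + "*"]) + "\n"
```

With a hypothetical atom file 1abc.pdb, usage would look like `pir_entry("1abc", "A", *read_chain(open("1abc.pdb"), "A"))`. Because the sequence is taken from the coordinates themselves, residues missing from the structure are simply absent from the entry, so the entry should agree with what Modeller finds in the file.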