Hey Modeller users,
I'm trying to model 400+ proteins based on ~100 templates. I have an
alignment file of 1500+ sequences comprising the templates, targets and
others. ClustalW was used to align the sequences.
I have a few problems.
- The sequences in the alignment file do not match the amino acids present in
the pdb files. _Generally_ the pdb files contain more residues than specified
in the aligned sequence. Therefore I have to either truncate the pdb files
or specify the appropriate residue ranges in the alignment file.
- The ID codes in the alignment file do not match the atom file names.
- There is no "second" line in each entry in the alignment file.
Although all this can be done manually, I can't help but wonder if there is a
way to automate/expedite the process. A paper published by Sanchez and Sali
(1998) mentioned a Perl script that allowed for rapid progress through the
various steps involved in modelling. Suggestions would be most appreciated.
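
To make the three problems above concrete, here is a minimal sketch of the
kind of automation I have in mind. It is written in Python (it is not the
Perl script from the paper) and builds one PIR-style entry per sequence,
including the missing "second" line. The function name pir_entry and its
arguments are made up for illustration, and the colon-separated field layout
is my reading of the PIR format in the Modeller manual, so please check it
against the manual before relying on it.

```python
def pir_entry(code, aligned_seq, atom_file=None, first="", last="", chain=""):
    """Build one PIR-style alignment entry as a string.

    code        -- ID used in the alignment file
    aligned_seq -- aligned sequence, gaps written as '-'
    atom_file   -- PDB file code for a template; None marks a target
    first, last -- residue numbers bounding the aligned segment (assumed known)
    chain       -- chain identifier, used for both ends of the segment
    """
    if atom_file is None:
        # Target: no structure, so every field after the code stays empty.
        header = "sequence:%s::::::::" % code
    else:
        # Template: atom file name and residue range go on the second line,
        # which also decouples the alignment ID from the atom file name.
        header = "structureX:%s:%s:%s:%s:%s::::" % (
            atom_file, first, chain, last, chain)
    # Terminate the sequence with '*' and break it into 75-character lines.
    seq = aligned_seq + "*"
    lines = [seq[i:i + 75] for i in range(0, len(seq), 75)]
    return ">P1;%s\n%s\n%s\n" % (code, header, "\n".join(lines))
```

Looping this over the ClustalW output, with a lookup table from alignment ID
to atom file name and residue range, would cover all three points. Building
that table (i.e. finding where each aligned sequence starts and ends in its
pdb file) is the part I still don't know how to automate.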
Some of the pdb files are complexes. If it can be avoided I'd prefer not to
use these structures. However, if I do decide to use some of them, I plan to
minimise the energy of the protein (minus the ligand) via MD (CNS) before
using it as a template. What are people's thoughts about this?
Many thanks