Hey Modeller users,
I'm trying to model 400+ proteins based on ~100 templates. I have an alignment file of 1500+ sequences comprising the templates, the targets and others. ClustalW was used to align the sequences.
I have a few problems.
- The sequences in the alignment file do not match the amino acids present in the PDB files. _Generally_ the PDB files contain more residues than the aligned sequence. Therefore I have to either truncate the PDB files or specify the appropriate residue ranges in the alignment file.
- The ID codes in the alignment file do not match the atom file names.
- There is no "second" line in each entry in the alignment file.
Although all this can be done manually, I can't help but wonder if there is a way to automate or expedite the process. A paper published by Sanchez and Sali (1998) mentioned a Perl script that allowed rapid progress through the various steps involved in modelling. Suggestions would be most appreciated.
Some of the PDB files are complexes. If it can be avoided, I'd prefer not to use these structures. However, if I do decide to use some of them, I plan to energy-minimise the protein (minus the ligand) via MD in CNS before using it as a template. What are people's thoughts about this?
Many thanks