Trouble running superpose on subsets of multichain pdbs
Dear Modelling Community,
I have two pdb structures I am interested in superposing, using an alignment I have created outside of modeller. The alignment is in pir format, and only represents a single chain from each PDB (each PDB has multiple chains in this case). Ideally, I would like to have the superposition be performed by only considering a subset of the aligned residues that I define (the most conserved positions).
I have already looked through the superpose.py example, converted my alignment into pir format, and familiarized myself with the modeller 'selection' object. Unfortunately, I am running into problems when I attempt to run the superposition. The pir alignment I am using is below:
>P1;1ck0 structureX:1ck0:1:H:216:H:Ab1:Human:2.0:0.10 QVQLQESGGGLVQPRGSLKLSCAASGFTFNTDAMNWVRQAPGKGLEWVARIRSKGFNFATYYADSVRDRFTISRD DSQSMLYLQMNNLKTEDTGIYYCVRGRDGEAMDYWGQGTTLTVSSAKTTPPSVYPLAPM--------VTLGCLVK GYFPEPVTVTWNSGSLSSGVHTFPAVLQSDLYTLSSSVTVPSSPRPSETVTCNVAHPASSTKVDKKIVN * >P1;1e4w structureX:1e4w:1:H:211:H:Ab2:Human:2.0:0.10 QVQLQQPGAELVKPGASVKLSCKASGFTFTNYWMHWVKQRPGQGLEWIGEILPS--NGRTNYNEKFKTKATLTVD KSSNTAYMQLSSLTSEDSAVYYCARSPS----DYWGQGTTLTVSSAKTTAPSVYPLAPVCGDTTGSSVTLGCLVK GYFPEPVTLTWNSGSLSSGVHTFPAVLQSDLYTLSSSVTVTSSTWPSQSITCNVAHPASSTKVDKKIEP *
When I attempt to just swap out these pdbs and this alignment with the ones in the example superpose.py, it returns an error, complaining that my submitted sequence length does not match the pdb sequence length.
-bash-3.00$ mod9v3 superpose.py Traceback (most recent call last): File "superpose.py", line 17, in ? r = atmsel2.superpose(mdl2, aln) File "/usr/lib/modeller9v3/modlib/modeller/selection.py", line 476, in superpose fit, refine_local, rms_cutoff) _modeller.ModellerError: chk_aln_340E> Number of residues in model ( 427) does not match that in alignment ( 211).
What am I doing wrong? The superpose.py I used is shown below.
# Example for: selection.superpose()
# This will use a given alignment to superpose Calpha atoms of # one structure (2ctx) on the other (1fas).
from modeller import *
env = environ() env.io.atom_files_directory = '/datasets/pdb/'
mdl = model(env, file='1ck0') mdl2 = model(env, file='1e4w') aln = alignment(env, file='test.ali', align_codes=('1ck0', '1e4w'))
atmsel = selection(mdl).only_atom_types('CA') r = atmsel.superpose(mdl2, aln)
# We can now use the calculated RMS, DRMS, etc. from the returned 'r' object: rms = r.rms drms = r.drms print "%d equivalent positions" % r.num_equiv_pos
mdl2.write(file='1e4w.fit')
Jake Gunn-Glanville wrote: > -bash-3.00$ mod9v3 superpose.py > Traceback (most recent call last): > File "superpose.py", line 17, in ? > r = atmsel2.superpose(mdl2, aln) > File "/usr/lib/modeller9v3/modlib/modeller/selection.py", line 476, in > superpose > fit, refine_local, rms_cutoff) > _modeller.ModellerError: chk_aln_340E> Number of residues in model ( > 427) does not match that in alignment ( 211). > > > What am I doing wrong? The superpose.py I used is shown below. ...
> mdl = model(env, file='1ck0') > mdl2 = model(env, file='1e4w') > aln = alignment(env, file='test.ali', align_codes=('1ck0', '1e4w'))
The two models have to match the alignment sequences, and the problem here is that while your alignment file header specifies only single chains (1:H through 216:H in 1ck0, and 1:H through 211:H in 1e4w) when you read the models you don't ask for those same ranges, so by default Modeller reads all chains. To fix this you need to specify model_segment when you read in the models, i.e.
mdl = model(env, file='1ck0', model_segment=('1:H', '216:H')) mdl2 = model(env, file='1e4w', model_segment=('1:H', '211:H')) aln = alignment(env, file='test.ali', align_codes=('1ck0', '1e4w'))
Of course, this is a little error prone because you have to specify everything twice. But fortunately the alignment object remembers the residue:chain range specified in the alignment header, so you can simplify this to:
aln = alignment(env, file='test.ali', align_codes=('1ck0', '1e4w')) mdl = model(env, file='1ck0', model_segment=aln['1ck0'].range) mdl2 = model(env, file='1e4w', model_segment=aln['1e4w'].range)
Ben Webb, Modeller Caretaker
participants (2)
-
Jake Gunn-Glanville
-
Modeller Caretaker