I cannot get Chain.join() working properly for multi-chain PDB files.
For example, for PDB code 2YFH, which has chains A, B, C, D, E, F:
* when I join D and E I get A, B, C, D, F as expected
* but if I try to join A and B I get A, F only
To reproduce run the following script:
-----
from modeller import *

env = environ()

mdl = model(env)
mdl.read(file='2yfh')

print "before:"
for c in mdl.chains:
    print c

mdl.chains['A'].join(mdl.chains['B'])

print "after:"
for c in mdl.chains:
    print c

mdl.write('out.pdb')

print "reading new file:"
mdl1 = model(env)
mdl1.read('out.pdb')

for c in mdl1.chains:
    print c
------------
will print in the log:

before:
<Chain 'A'>
<Chain 'B'>
<Chain 'C'>
<Chain 'D'>
<Chain 'E'>
<Chain 'F'>
after:
<Chain 'A'>
<Chain 'F'>
<Chain 'F'>
<Chain 'F'>
<Chain 'F'>
reading new file:
relabel_387W> Model has multiple chains, and they do not all have a unique chain ID. Suggest you relabel them as A, B, C, etc.
relabel_469W> At least one residue number in the model is duplicated. (First duplication is residue " 1", chain " A".) Suggest you renumber the residues and/or chains to avoid selecting the wrong residues (e.g. with model_segment).
<Chain 'A'>
<Chain 'A'>
<Chain 'F'>
Tested with Modeller 9v8 and Modeller 9v9.
Is it a bug or expected behavior?
Best regards, Jan Kosinski
On 4/5/12 4:00 AM, Jan Kosinski wrote:
> I cannot get Chain.join() working properly for multi-chain PDB files.
>
> For example, for PDB code 2YFH, which has chains A, B, C, D, E, F:
> * when I join D and E I get A, B, C, D, F as expected
> * but if I try to join A and B I get A, F only
Yes, you're right. There is a bug that affects joining adjacent chains when the second chain is *not* the last chain in the PDB (but as you noticed, joining non-adjacent chains works fine). The fix for this will be in the next release (out within about a month). Below is a workaround you can use in the meantime:
from modeller import *
from modeller.scripts import complete_pdb

env = environ()
env.libs.topology.read('${LIB}/top_heav.lib')
env.libs.parameters.read('${LIB}/par.lib')

mdl = model(env)
mdl.read(file='2yfh')

mdl.chains['B'].name = 'A'
mdl.write(file='out.pdb', no_ter=True)
mdl = complete_pdb(env, 'out.pdb')
Note that even using Chain.join() you would still need to write out the file then read it back with complete_pdb(), in order to rebuild the topology (Chain.join() does not remove OXT atoms, for example).
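To see whether a written PDB file still carries a leftover OXT atom in the middle of a chain, a quick check can be done in plain Python without Modeller. This is only a sketch: interior_oxt() and atom_line() are hypothetical helpers written for this illustration, and it assumes standard fixed-column ATOM records and that a chain ends at a TER record or where the chain ID changes.

```python
def atom_line(serial, name, res_name, chain_id, res_seq):
    """Format a minimal fixed-column PDB ATOM record (coordinates zeroed)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, chain_id, res_seq, 0.0, 0.0, 0.0)


def interior_oxt(pdb_lines):
    """Return (chain_id, residue_number) for residues that carry an OXT atom
    but are followed by another residue in the same chain."""
    residues = []        # [chain_index, chain_id, res_key, has_oxt], file order
    chain_index = 0
    prev_chain = None
    after_ter = False
    for line in pdb_lines:
        record = line[:6].strip()
        if record == 'TER':
            after_ter = True
            continue
        if record != 'ATOM':
            continue
        chain_id = line[21]       # column 22: chain ID
        res_key = line[22:27]     # columns 23-27: residue number + insertion code
        if prev_chain is not None and (after_ter or chain_id != prev_chain):
            chain_index += 1      # assumed chain break
        after_ter = False
        prev_chain = chain_id
        if (not residues or residues[-1][0] != chain_index
                or residues[-1][2] != res_key):
            residues.append([chain_index, chain_id, res_key, False])
        if line[12:16].strip() == 'OXT':
            residues[-1][3] = True
    hits = []
    for current, following in zip(residues, residues[1:]):
        if current[3] and following[0] == current[0]:
            hits.append((current[1], current[2].strip()))
    return hits


# Made-up records: residue 2 is the old C-terminus of a chain that has been
# merged with the next one, so its OXT now sits inside the chain.
sample = [
    atom_line(1, 'CA', 'ALA', 'A', 1),
    atom_line(2, 'CA', 'GLY', 'A', 2),
    atom_line(3, 'OXT', 'GLY', 'A', 2),
    atom_line(4, 'CA', 'SER', 'A', 3),
    atom_line(5, 'OXT', 'SER', 'A', 3),   # genuine C-terminus, not reported
]
leftover = interior_oxt(sample)   # [('A', '2')]
```

On a real file the same function could be called as interior_oxt(open('out.pdb')); an empty list would suggest no stray OXT atoms under the assumptions above.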
Note that the workaround uses no_ter=True. This generates a PDB file containing no TER records. When complete_pdb() reads in such a model, it has to assume that chain breaks occur between residues in differently-named chains. Since we forced the B chain to also be labeled 'A', it will merge the two 'A' chains at this point.
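The chain-break rule described above can be illustrated without Modeller. The following is a minimal sketch in plain Python and not Modeller's actual reader: split_chains() and atom_line() are hypothetical helpers written for this illustration, and the only assumption is that a new chain starts at a TER record or wherever the chain ID column changes.

```python
def atom_line(serial, name, res_name, chain_id, res_seq):
    """Format a minimal fixed-column PDB ATOM record (coordinates zeroed)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, chain_id, res_seq, 0.0, 0.0, 0.0)


def split_chains(pdb_lines):
    """Group residues into chains: a chain break is assumed after a TER
    record or when the chain ID differs from the previous atom's.
    Returns a list of (chain_id, [residue keys]) tuples in file order."""
    chains = []
    prev_id = None
    force_break = False
    for line in pdb_lines:
        record = line[:6].strip()
        if record == 'TER':
            force_break = True
            continue
        if record not in ('ATOM', 'HETATM'):
            continue
        chain_id = line[21]       # column 22: chain ID
        res_key = line[22:27]     # columns 23-27: residue number + insertion code
        if not chains or force_break or chain_id != prev_id:
            chains.append((chain_id, []))
            force_break = False
        residues = chains[-1][1]
        if not residues or residues[-1] != res_key:
            residues.append(res_key)
        prev_id = chain_id
    return chains


# Made-up three-chain example written without TER records.
original = [
    atom_line(1, 'CA', 'ALA', 'A', 1),
    atom_line(2, 'CA', 'GLY', 'A', 2),
    atom_line(3, 'CA', 'SER', 'B', 1),
    atom_line(4, 'CA', 'THR', 'C', 1),
]
# Relabel chain B as 'A', as the workaround does before writing the file.
relabeled = [line[:21] + 'A' + line[22:] if line[21] == 'B' else line
             for line in original]

ids_before = [cid for cid, _ in split_chains(original)]    # ['A', 'B', 'C']
ids_after = [cid for cid, _ in split_chains(relabeled)]    # ['A', 'C']
```

If a TER record is left between the two 'A' segments, the same function returns two separate 'A' chains, which is why the workaround writes the file with no_ter=True.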
Ben Webb, Modeller Caretaker
The workaround works great, thanks!
And great that I could contribute to making Modeller even better! ;-)
Cheers, Jan Kosinski