Hi All,
I would like to create a hexameric model using a target sequence and a hexameric PDB template. I am trying to create an alignment using input files that have the \ character to separate the chains. Example:
>P1;target
sequence:target:::::::0.00: 0.00
AAAAA\
BBBBB\
CCCCC\
DDDDD\
EEEEE\
FFFFF*
I'm using the align2d.py example, as detailed in tutorial 2. My input PDB file has six chains (A-F). My align2d.py file is as follows:
""" align2d.py START
from modeller import *
log.verbose()
env = environ()
aln = alignment(env)
mdl = model(env, file='PDBfile_hexamer', model_segment=('FIRST:A','LAST:F'))
aln.append_model(mdl, align_codes='PDBfile_hexamer', atom_files='PDBfile_hexamer.pdb')
aln.append(file='target_hexamer.ali', align_codes='target')
aln.align2d()
aln.write(file='target_hexamer-PDBfile_hexamer.ali', alignment_format='PIR')
aln.write(file='target_hexamer-PDBfile_hexamer.pap', alignment_format='PAP')
""" align2d.py END
Using modeller9v1 (linux RPM), I started the above calculation (python align2d.py), but it has not finished after 20 hours (and is still going; 100% CPU and 22.6% memory usage out of 3 GB total, according to 'top'). Although I used log.verbose(), there is very little output in the associated log (see below).
Any ideas?
Thanks, Doug
""" LOG START
MODELLER 9v1, 2007/01/19, r4822
PROTEIN STRUCTURE MODELLING BY SATISFACTION OF SPATIAL RESTRAINTS
Copyright(c) 1989-2007 Andrej Sali All Rights Reserved
Written by A. Sali with help from B. Webb, M.S. Madhusudhan, M-Y. Shen, M.A. Marti-Renom, N. Eswar, F. Alber, M. Topf, B. Oliva, A. Fiser, R. Sanchez, B. Yerkovich, A. Badretdinov, F. Melo, J.P. Overington, E. Feyfant University of California, San Francisco, USA Rockefeller University, New York, USA Harvard University, Cambridge, USA Imperial Cancer Research Fund, London, UK Birkbeck College, University of London, London, UK
Kind, OS, HostName, Kernel, Processor: 4, Linux carbo.msbb.uc.edu 2.6.18-1.2869.fc6 i686 Date and time of compilation : 2007/01/19 13:47:34 MODELLER executable type : i386-intel8 Job starting time (YY/MM/DD HH:MM:SS): 2007/05/11 11:26:27
openf___224_> Open $(LIB)/restyp.lib openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL9v1}/modlib/ resgrp.lib rdresgr_266_> Number of residue groups: 2 openf___224_> Open ${MODINSTALL9v1}/modlib/sstruc.lib
Dynamically allocated memory at amaxlibraries [B,KiB,MiB]: 3216122 3140.744 3.067
Dynamically allocated memory at amaxlibraries [B,KiB,MiB]: 3216650 3141.260 3.068 openf___224_> Open ${MODINSTALL9v1}/modlib/resdih.lib
Dynamically allocated memory at amaxlibraries [B,KiB,MiB]: 3265250 3188.721 3.114 rdrdih__263_> Number of dihedral angle types : 9 Maximal number of dihedral angle optima: 3 Dihedral angle names : Alph Phi Psi Omeg chi1 chi2 chi3 chi4 chi5 openf___224_> Open ${MODINSTALL9v1}/modlib/radii.lib
Dynamically allocated memory at amaxlibraries [B,KiB,MiB]: 3280562 3203.674 3.129 openf___224_> Open ${MODINSTALL9v1}/modlib/radii14.lib openf___224_> Open ${MODINSTALL9v1}/modlib/solv.lib openf5__224_> Open 11 OLD SEQUENTIAL ${MODINSTALL9v1}/modlib/ af_mnchdef.lib rdwilmo_274_> Mainchain residue conformation classes: APBLE openf___224_> Open ${MODINSTALL9v1}/modlib/mnch.lib rdclass_257_> Number of classes: 5 openf___224_> Open ${MODINSTALL9v1}/modlib/mnch1.lib openf___224_> Open ${MODINSTALL9v1}/modlib/mnch2.lib openf___224_> Open ${MODINSTALL9v1}/modlib/mnch3.lib openf___224_> Open ${MODINSTALL9v1}/modlib/xs4.mat rdrrwgh_268_> Number of residue types: 21 runcmd______> model.read(file='PDBfile_hexamer', (def) model_format='PDB', model_segment=('FIRST:A', 'LAST:F')) openf___224_> Open PDBfile_hexamer.pdb
Dynamically allocated memory at amaxmodel [B,KiB,MiB]: 7756771 7574.972 7.397
openf___224_> Open PDBfile_hexamer.pdb
read_mo_297_> Segments, residues, atoms: 24 4818 38574
read_mo_298_> Segment: 1 4 A 234 A 1790
read_mo_298_> Segment: 2 246 A 271 A 214
read_mo_298_> Segment: 3 291 A 636 A 2799
read_mo_298_> Segment: 4 651 A 850 A 1626
read_mo_298_> Segment: 5 4 B 234 B 1790
read_mo_298_> Segment: 6 246 B 271 B 214
read_mo_298_> Segment: 7 291 B 636 B 2799
read_mo_298_> Segment: 8 651 B 850 B 1626
read_mo_298_> Segment: 9 4 C 234 C 1790
read_mo_298_> Segment: 10 246 C 271 C 214
read_mo_298_> Segment: 11 291 C 636 C 2799
read_mo_298_> Segment: 12 651 C 850 C 1626
read_mo_298_> Segment: 13 4 D 234 D 1790
read_mo_298_> Segment: 14 246 D 271 D 214
read_mo_298_> Segment: 15 291 D 636 D 2799
read_mo_298_> Segment: 16 651 D 850 D 1626
read_mo_298_> Segment: 17 4 E 234 E 1790
read_mo_298_> Segment: 18 246 E 271 E 214
read_mo_298_> Segment: 19 291 E 636 E 2799
read_mo_298_> Segment: 20 651 E 850 E 1626
read_mo_298_> Segment: 21 4 F 234 F 1790
read_mo_298_> Segment: 22 246 F 271 F 214
read_mo_298_> Segment: 23 291 F 636 F 2799
read_mo_298_> Segment: 24 651 F 850 F 1626
relabel_387W> Model has multiple chains, and they do not all have a unique chain ID.
              Suggest you relabel them as A, B, C, etc.
runcmd______> alignment.append_model(atom_files='PDBfile_hexamer.pdb', align_codes='PDBfile_hexamer')
Dynamically allocated memory at amaxalignment [B,KiB,MiB]: 7896464 7711.391 7.531
Dynamically allocated memory at amaxstructure [B,KiB,MiB]: 10639518 10390.154 10.147 runcmd______> alignment.append(align_codes=['PDBfile_hexamer', 'target'], atom_files=['PDBfile_hexamer.pdb'], file='target_hexamer.ali', (def)remove_gaps=True, (def) alignment_format='PIR', (def)rewind_file=False, (def)close_file=True) openf___224_> Open target_hexamer.ali
Dynamically allocated memory at amaxalignment [B,KiB,MiB]: 12123550 11839.404 11.562
Dynamically allocated memory at amaxalignment [B,KiB,MiB]: 13607494 13288.568 12.977
Dynamically allocated memory at amaxsequence [B,KiB,MiB]: 13607522 13288.596 12.977
Dynamically allocated memory at amaxsequence [B,KiB,MiB]: 13607550 13288.623 12.977
Dynamically allocated memory at amaxsequence [B,KiB,MiB]: 13607578 13288.650 12.977
Dynamically allocated memory at amaxsequence [B,KiB,MiB]: 13607606 13288.678 12.977
Dynamically allocated memory at amaxsequence [B,KiB,MiB]: 13607634 13288.705 12.977
Dynamically allocated memory at amaxsequence [B,KiB,MiB]: 13627070 13307.686 12.996
Read the alignment from file : target_hexamer.ali
Total number of alignment positions: 4865

  # Code          #_Res  #_Segm  PDB_code        Name
-------------------------------------------------------------------------------
  1 PDBfile_hexa   4818      24  PDBfile_hexame  undefined
  2 target         4860       6  target
runcmd______> alignment.align2d((def)overhang=0, (def)align_block=0,
              (def)rr_file='$(LIB)/as1.sim.mat', (def)align_what='BLOCK',
              (def)off_diagonal=100, (def)max_gap_length=999999,
              (def)local_alignment=False, (def)matrix_offset=0.0,
              (def)gap_penalties_1d=(-900.0, -50.0),
              (def)gap_penalties_2d=(3.5, 3.5, 3.5, 0.20000000000000001, 4.0, 6.5, 2.0, 0.0, 0.0),
              (def)surftyp=1, (def)fit=True,
              (def)fix_offsets=(0.0, -1.0, -2.0, -3.0, -4.0),
              (def)read_weights=False, (def)write_weights=False,
              (def)input_weights_file='', (def)output_weights_file='',
              (def)n_subopt=1, (def)subopt_offset=0.0, (def)read_profile=False,
              (def)input_profile_file='', (def)write_profile=False,
              (def)output_profile_file='', (def)weigh_sequences=False,
              (def)smooth_prof_weight=10, (def)weights_type='SIMILAR')
align2d_276_> 'align_block' changed to 1.
openf___224_> Open PDBfile_hexamer.pdb
Dynamically allocated memory at amaxstructure [B,KiB,MiB]: 13627550 13308.154 12.996
openf___224_> Open PDBfile_hexamer.pdb
mkapsa__637W> No residue topology library is in memory.
              Better radii would be used if topology.read() is called first.
iup2crm_280W> No topology library in memory or assigning a BLK residue.
              Default CHARMM atom type assigned: N --> N
              This message is written only for the first such atom.
openf___224_> Open $(LIB)/as1.sim.mat
rdrrwgh_268_> Number of residue types: 20
""" LOG END
Douglas Kojetin wrote:
> I would like create a hexameric model using a target sequence and a
> hexameric PDB template. I am trying to create an alignment using
> input files that have the \ character to separate the chains. Example:
>
> >P1;target
> sequence:target:::::::0.00: 0.00
> AAAAA\
> BBBBB\
> CCCCC\
> DDDDD\
> EEEEE\
> FFFFF*

Do you actually mean the / character here? '/' designates a chain break, but '\' has no special meaning in Modeller alignments.
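If the input file really does contain backslashes, it can be rewritten before Modeller reads it. The following is only a minimal sketch in plain Python: the function name fix_chain_breaks is made up for illustration and is not part of Modeller, and it assumes that the '>P1;' line and the description line right after it should be left untouched.

```python
def fix_chain_breaks(pir_text):
    """Replace '\\' with '/' in the sequence lines of PIR-format text.

    Hypothetical helper, not part of Modeller. The '>P1;code' line and the
    description line that follows it are copied through unchanged.
    """
    out = []
    next_is_description = False
    for line in pir_text.splitlines():
        if line.startswith('>'):
            out.append(line)
            next_is_description = True
        elif next_is_description:
            out.append(line)
            next_is_description = False
        else:
            # Sequence line: '/' is the chain-break character.
            out.append(line.replace('\\', '/'))
    return '\n'.join(out) + '\n'


# Shortened version of the entry from the original post.
example = (
    ">P1;target\n"
    "sequence:target:::::::0.00: 0.00\n"
    "AAAAA\\\n"
    "BBBBB\\\n"
    "FFFFF*\n"
)
fixed = fix_chain_breaks(example)
print(fixed)
```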
> Using modeller9v1 (linux RPM), I started the above calculation
> (python align2d.py), but it has not finished after 20 hours
I suggest you try setting the max_gap_length argument to align2d to something smaller than the default - say 20. By default it will backtrack through your alignment matrix to try and find the very best alignment, and for long sequences this can be extremely time consuming, so it is best to limit the gap length.
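As a rough sketch, the original script with that one change could look like the following. The file names and align codes are the ones from the post; wrapping the job in a function called run_align2d is just a choice for this example, and the value 20 is the suggestion above rather than a tuned setting.

```python
def run_align2d(max_gap_length=20):
    """Rerun the align2d job from the original post with a capped gap length."""
    # Imported inside the function so that defining it does not require
    # Modeller; calling it needs a Modeller installation and the input files.
    import modeller

    modeller.log.verbose()
    env = modeller.environ()
    aln = modeller.alignment(env)
    mdl = modeller.model(env, file='PDBfile_hexamer',
                         model_segment=('FIRST:A', 'LAST:F'))
    aln.append_model(mdl, align_codes='PDBfile_hexamer',
                     atom_files='PDBfile_hexamer.pdb')
    aln.append(file='target_hexamer.ali', align_codes='target')
    # The log shows the default as max_gap_length=999999.
    aln.align2d(max_gap_length=max_gap_length)
    aln.write(file='target_hexamer-PDBfile_hexamer.ali',
              alignment_format='PIR')
    aln.write(file='target_hexamer-PDBfile_hexamer.pap',
              alignment_format='PAP')
    return aln

# Usage (requires Modeller and the input files in the working directory):
#   run_align2d(max_gap_length=20)
```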
> according to 'top'). Although I used log.verbose(), there is very
> little output in the associated log (see below).
Yes - you won't see any output from align2d until it has finished.
You may also want to try aligning each chain individually. There is no clear way to align chain breaks, so most of Modeller's alignment routines will either remove them entirely or simply ignore them. In many cases you want the chain breaks in your templates to be "aligned" with the chain breaks in the model, and one way to do that is to make separate alignments for each chain, and then recombine them later.
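The recombination step can be sketched in plain Python. This assumes each chain has already been aligned separately (for example by running align2d once per chain) so that you have one pair of equal-length gapped strings per chain; the function names below are hypothetical and not part of Modeller.

```python
def split_chains(seq):
    """Split a multi-chain PIR sequence on '/' and drop the trailing '*'."""
    return seq.rstrip('*').split('/')


def join_chains(aligned_chains):
    """Join per-chain aligned strings into one PIR sequence string."""
    return '/'.join(aligned_chains) + '*'


def recombine(template_chains, target_chains):
    """Combine per-chain pairwise alignments into one two-row alignment.

    Each argument is a list with one aligned string per chain (gaps as '-').
    Because both rows of a chain have the same length, the '/' chain breaks
    end up in the same column in the template and the target.
    """
    if len(template_chains) != len(target_chains):
        raise ValueError("template and target need the same number of chains")
    for i, (t, s) in enumerate(zip(template_chains, target_chains)):
        if len(t) != len(s):
            raise ValueError("chain %d: aligned rows differ in length" % i)
    return join_chains(template_chains), join_chains(target_chains)


# Toy data only; real chains would come from the per-chain alignments.
chains = split_chains("AAAAA/BBBBB/CCCCC*")
tmpl, targ = recombine(["AA-AA", "BBBBB"], ["AAAAA", "BB-BB"])
print(tmpl)
print(targ)
```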
Ben Webb, Modeller Caretaker