November 2010 - modeller_usage

last residue no. when there are x-ray invisible residues
by Irene Newhouse 18 Nov '10

Now that, thanks to you, I've got the order of chain/residue number in the pir file fixed, I'm random-walking my way through other errors. I had to drop one of my matches because there are differences between the sequence as deposited in the PDB & the pdb file[!] [At least that has nothing to do with me].

Now another of my matches is giving me fits. It has missing residues 74-85. I carefully replaced them with -, but I'm still getting the same mismatch error message that I got before I did that. I'm wondering if I have to alter the final residue number in line 2 of the entry for that protein. I have 242, which is the position of the last residue in the sequence, but the pdb file's residue numbers jump from 73 to 86, reflecting the missing residues.

I can just drop this entry from the multi-alignment I'm using, but that would make the 'casualty rate' 40%, so I'd rather understand better how to fix this issue correctly. I already checked, & this is the only deposited structure for that protein, so it's either fix it or drop it.

Thanks!
Irene Newhouse
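One way to see exactly which residues the structure file contains is to list them straight from the ATOM records. The following is a minimal sketch, not part of Modeller: the function names are made up for illustration, it assumes standard fixed-column PDB records, and it ignores HETATM records and multiple MODEL blocks.

```python
# Sketch: list the residues present in one chain of a PDB file and report
# breaks in the residue numbering. Only ATOM records are read.

THREE_TO_ONE = {
    'ALA': 'A', 'ARG': 'R', 'ASN': 'N', 'ASP': 'D', 'CYS': 'C',
    'GLN': 'Q', 'GLU': 'E', 'GLY': 'G', 'HIS': 'H', 'ILE': 'I',
    'LEU': 'L', 'LYS': 'K', 'MET': 'M', 'PHE': 'F', 'PRO': 'P',
    'SER': 'S', 'THR': 'T', 'TRP': 'W', 'TYR': 'Y', 'VAL': 'V',
}


def chain_residues(pdb_lines, chain):
    """Return [(residue_number, one_letter_code), ...] in file order."""
    residues = []
    seen = set()
    for line in pdb_lines:
        if not line.startswith('ATOM') or len(line) < 27:
            continue
        if line[21] != chain:            # column 22 holds the chain ID
            continue
        key = line[22:27]                # residue number plus insertion code
        if key in seen:
            continue                     # another atom of a residue already seen
        seen.add(key)
        code = THREE_TO_ONE.get(line[17:20], 'X')   # 'X' for unknown residues
        residues.append((int(line[22:26]), code))
    return residues


def numbering_breaks(residues):
    """Return [(last_number_before_break, first_number_after_break), ...]."""
    return [(a, b)
            for (a, _), (b, _) in zip(residues, residues[1:])
            if b - a > 1]
```

Called as `chain_residues(open('template.pdb'), 'A')` (the file name is a placeholder), the first and last residue numbers returned are the ones as numbered in the PDB file, and each break reported by `numbering_breaks` is a stretch that would need gap characters in the alignment.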

more sequence difference issues
by Irene Newhouse 18 Nov '10

In the interests of appearing to make progress, I dropped the alignment of which I wrote an hour or so ago from the multiple alignments considered. Now I have another sequence difference issue with 1x06.pdb. This is how it aligns with my sequence [clustal w server pir output, with line 2 edited by hand]:

>P1;1x06
structureX:1x06:12:A:240:A:UPP :Erischeria coli : :
-------------------------------------MMLSATQPLSEKLPAHGCR-HVAIIMDGNGRWAKKQGK-IRAFGHKAGAKSVRRAVSFAANNGIEALTLYAFSSENWNRPAQEVSALMELFVWALD---SEVKSLHRHNVRLRIIGDTSRFNSRLQERIRKSEALTAGNTGLTLNIAANYGGRWDIVQGVRQLAEKVQQ----GNLQPDQIDEEMLNQHVCMHELA------------------PVDLVIRTGGEHRISNFLLWQIAYAELYFTDVLWPDFDEQDFEGALNAFANRERRFGGTEPGDETA

I checked it myself against the original fasta sequence deposited in the pdb & can't find any differences. The only missing residues are those below 12 & above 240. I scrolled through the pdb file & can't find any residues missing that might not have been noted in the header. Is there a way to get more detailed information on the problem region, or a tool to check fasta sequences against a pdb file?

Thanks!
Irene

The relevant end of the error log is:

Dynamically allocated memory at amaxstructure [B,KiB,MiB]: 3879167 3788.249 3.699
read_te_291E> Sequence difference between alignment and pdb :
x (mismatch at alignment position 1)
Alignment  MMLSATQPLSEKLPAHGCRHVAIIMDGNGRWAKKQGKIRAFGHKAGAKSVRRAVSF
PDB        KLPAHGCRHVAIIMDGNGRWAKKQGKIRAFGHKAGAKSVRRAVSFAANNGIEALTL
Match      * * * * *
Alignment residue type 11 (M, MET) does not match pdb
residue type 9 (K, LYS),
for align code 1x06 (atom file 1x06), pdb residue number "12", chain "A"

Please check your alignment file header to be sure you correctly specified
the starting and ending residue numbers and chains. The alignment sequence
must match that from the atom file exactly.

Another possibility is that some residues in the atom file are missing,
perhaps because they could not be resolved experimentally.
(Note that Modeller reads only the ATOM and HETATM records in PDB, NOT the
SEQRES records.) In this case, simply replace the section of your alignment
corresponding to these missing residues with gaps.
read_te_288W> Protein not accepted: 3 1x06
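The two sequences printed in the log can be compared mechanically. The sketch below is plain Python rather than a Modeller tool, and its function names are invented for illustration. It looks for the start of the PDB sequence inside the ungapped alignment sequence, which shows how many leading alignment residues have no counterpart in the ATOM records.

```python
# Sketch: find how many residues at the start of an alignment entry are
# absent from the structure, using the two strings quoted in the log.


def strip_gaps(aligned):
    """Remove gap characters and the PIR terminator from a sequence."""
    return aligned.replace('-', '').replace('*', '')


def leading_residues_absent(aligned_seq, pdb_seq, probe=10):
    """Return the count of alignment residues preceding the first PDB
    residue, or None if the first `probe` PDB residues are not found."""
    index = strip_gaps(aligned_seq).find(pdb_seq[:probe])
    return index if index >= 0 else None


def gap_out_prefix(aligned_seq, count):
    """Replace the first `count` residues (not gaps) with '-'."""
    out = []
    for ch in aligned_seq:
        if count > 0 and ch not in '-*':
            out.append('-')
            count -= 1
        else:
            out.append(ch)
    return ''.join(out)


alignment = "MMLSATQPLSEKLPAHGCRHVAIIMDGNGRWAKKQGKIRAFGHKAGAKSVRRAVSF"
pdb = "KLPAHGCRHVAIIMDGNGRWAKKQGKIRAFGHKAGAKSVRRAVSFAANNGIEALTL"
offset = leading_residues_absent(alignment, pdb)
print(offset)
```

With the strings from the log this reports 11, which fits a header start residue of 12: the alignment entry still carries residues 1 to 11 (MMLSATQPLSE) that the ATOM records do not contain. Under that reading, `gap_out_prefix(entry, offset)` would turn them into gaps, as the error text recommends. Residues missing at the C-terminal end would need the same treatment and are not checked here.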

Facing Problem in compare.py
by bharat lakhani 17 Nov '10

Hi,

When I run compare.py I get this error:

Traceback (most recent call last):
  File "compare_1.py", line 32, in ?
    aln.malign3d()
  File "/usr/lib/modeller9v8/modlib/modeller/alignment.py", line 329, in malign3d
    edit_file_ext)
_modeller.ModellerError: fit2xyz_296E> Number of equivalent positions < 3: 1

In this particular case I am using 12 templates to make alignment.ali. But as soon as I start with two or three, that is, any random number of templates, to make my alignment.ali, it works.
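Since the run succeeds with fewer templates, it may help to find which template pairs share very few aligned positions. This is a rough sketch in plain Python with hypothetical function names. It counts alignment columns where both sequences have a residue, which is only a proxy: the assumption that this approximates what malign3d counts as equivalent positions is not confirmed by the post.

```python
# Sketch: flag pairs of aligned sequences with very few shared columns.
from itertools import combinations


def shared_positions(seq_a, seq_b):
    """Count columns where neither sequence has a gap."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != '-' and b != '-')


def sparse_pairs(aligned, minimum=3):
    """aligned maps align code -> gapped sequence (all of equal length).
    Return [(code_a, code_b, count), ...] for pairs below `minimum`."""
    flagged = []
    for (code_a, seq_a), (code_b, seq_b) in combinations(
            sorted(aligned.items()), 2):
        count = shared_positions(seq_a, seq_b)
        if count < minimum:
            flagged.append((code_a, code_b, count))
    return flagged
```

Feeding it the gapped sequences from alignment.ali would show whether one or two templates barely overlap with the rest, which are the first candidates to leave out or realign.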

1st time trying modeller
by Irene Newhouse 17 Nov '10

I'm trying to get the format right for a multiple alignment I did with a Clustal W server to use as input to modeller. I don't understand the error message; it probably means I don't understand the numbering convention. My sequence has 290 residues & ends up with 316 characters due to alignment gaps. I get exactly the same error message whether I use 290 or 316 as the end residue. Below are various bits of information that might be helpful in debugging what I'm doing.

Thanks for your help!
Irene Newhouse

***The input python script:
[newhoir@localhost rubber]$ cat mod-rubber1.py
# Homology modeling with multiple templates
from modeller import *              # Load standard Modeller classes
from modeller.automodel import *    # Load the automodel class

log.verbose()    # request verbose output
env = environ()  # create a new MODELLER environment to build this model in

# directories for input atom files
env.io.atom_files_directory = ['.', '/home/newhoir/rubber/atom_files']

a = automodel(env,
              alnfile  = 'multi.ali', # alignment filename
              knowns   = ('2d2r', '1ueh', '2vg3', '1f75', '2vg0'), # codes of the templates
              sequence = '2cpt')      # code of the target
a.starting_model= 1                 # index of the first model
a.ending_model  = 5                 # index of the last model
                                    # (determines how many models to calculate)
a.make()                            # do the actual homology modeling

***The run command:
mod9v8 mod-rubber1.py

***atom_files:
ls -l /home/newhoir/rubber/atom_files
total 2640
-rw-rw-r-- 1 newhoir newhoir 324891 Nov 16 11:17 1f75.pdb
-rw-rw-r-- 1 newhoir newhoir 405810 Nov 16 11:18 1ueh.pdb
-rw-rw-r-- 1 newhoir newhoir 372276 Nov 16 11:19 2d2r.pdb
-rw-rw-r-- 1 newhoir newhoir 707778 Nov 16 11:18 2vg0.pdb
-rw-rw-r-- 1 newhoir newhoir 862326 Nov 16 11:18 2vg3.pdb

*** input file multi.ali
>P1;2d2r
structureX:2d2r:A:3:A:227:UPP :Helicobacter pylori: :
---------------------------------------MLSATQPLSEKLDST-LKHLAIIMDGNGRWAKLKNK-ARAYGHKKGVKTLKDITIWCANHKLECLTLYAFSTENWKRPKSEVDFLMKMLKKYLK---DERSTYLDNNIRFRAIGDLEGFSKELRDTILQLENDTRHFKDFTQVLALNYGSKNELSRAFKSLLESPPS-NISLLE---------------------SLENEISNRLDTRNLPEVDLLLRTGGEMRLSNFLLWQSSYAELFFTPILWPDFTPKDLENIISDFYKRVRKFGELKA-----*
>P1;1ueh
structureX:1ueh:A:13:A:240 :UPP :Erischeria coli : :
--------------------------------------MMLSATQPLSEKLPAHGCRHVAIIMDGNGRWAKKQGK-IRAFGHKAGAKSVRRAVSFAANNGIEALTLYAFSSENWNRPAQEVSALMELFVWALD---SEVKSLHRHNVRLRIIGDTSRFNSRLQERIRKSEALTAGNTGLTLNIAANYGGRWDIVQGVRQLAEKVQQGNLQPDQ---------------------IDEEMLNQHVCMHELAPVDLVIRTGGEHRISNFLLWQIAYAELYFTDVLWPDFDEQDFEGALNAFANRERRFGGTEPGDETA*
>P1;2vg3
structureX:2vg3:A:13:A:296:rv2361c decaprenyl PP:M. tuberculosis: :
FPQLPPAPDDYPTFPDTSTWPVVFPELPAAPYGGPCRPPQHTSKAAAPRIPADRLPNHVAIVMDGNGRWATQRGL-ARTEGHKMGEAVVIDIACGAIELGIKWLSLYAFSTENWKRSPEEVRFLMGFNRDVVR---RRRDTLKKLGVRIRWVGSRPRLWRSVINELAVAEEMTKSNDVITINYCVNYGGRTEITEATREIAREVAAGRLNPER---------------------ITESTIARHLQRPDIPDVDLFLRTSGEQRSSNFMLWQAAYAEYIFQDKLWPDYDRRDLWAACEEYASRTRRFGSA-------*
>P1;1f75
structureX:1f75:A:14:A:242:UPP :Micrococcus luteus: :
-----------------------------------MFPIKKRKAIKNNNINAAQIPKHIAIIMDGNGRWAKQKKM-PRIKGHYEGMQTVRKITRYASDLGVKYLTLYAFSTENWSRPKDEVNYLMKLPGDFLN---TFLPELIEKNVKVETIGFIDDLPDHTKKAVLEAKEKTKHNTGLTLVFALNYGGRKEIISAVQLIAERYKSGEISLDE---------------------ISETHFNEYLFTANMPDPELLIRTSGEERLSNFLIWQCSYSEFVFIDEFWPDFNEESLAQCISIYQ---------------*
>P1;2vg0
structureX:2vg0:A:30:A:256:Rv 1086 farnesyl PP:M. tuberculosis: :
-----------------------------------------------------DLPRHIAVLCDGNRRWARSAGYDDVSYGYRMGAAKIAEMLRWCHEAGIELATVYLLSTENLQRDPDELAALIEIITDVVE---EICAPANHWSVRT--VGDLGLIGEEPARRLRGAVESTPEVASFHVNVAVGYGGRREIVDAVRALLSKELANGATAEELVDA-----------------VTVEGISENLYTSGQPDPDLVIRTSGEQRLSGFLLWQSAYSEMWFTEAHWPAFRHVDFLRALRDYSAR--------------*
>P1;2cpt
sequence:2cpt::1: :316:rubber c-prenylxferase:Hevea brasiliensis: :
-----------------------MELYTGERPSVFRLLGKYMRKGLYGILTQGPIPTHLAFILDGNRRFAKKHKL-PEGGGHKAGFLALLNVLTYCYELGVKYATIYAFSIDNFRRKPHEVQYVMDLMLEKIEGMIMEESIINAYDICVRFVGNLKLLSEPVKTAADKIMRATANNSKCVLLIAVCYTSTDEIVHAVEESSELNSNEVCNNQELEEANATGSSTVIQTENMESYSGIKLVDLEKNTYINPYPDVLIRTSGETRLSNYLLWQTTNCILYSPHALWPEIGLRHVVWAVINCQRHYSYLEKHKEYLK--*

***The runtime messages:
[newhoir@localhost rubber]$ ./r
'import site' failed; use -v for traceback
Traceback (most recent call last):
  File "mod-rubber1.py", line 18, in ?
    a.make() # do the actual homology modeling
  File "/usr/lib/modeller9v8/modlib/modeller/automodel/automodel.py", line 98, in make
    self.homcsr(exit_stage)
  File "/usr/lib/modeller9v8/modlib/modeller/automodel/automodel.py", line 424, in homcsr
    self.check_alignment(aln)
  File "/usr/lib/modeller9v8/modlib/modeller/automodel/automodel.py", line 406, in check_alignment
    aln.check()
  File "/usr/lib/modeller9v8/modlib/modeller/alignment.py", line 200, in check
    self.check_structure_structure(io=io)
  File "/usr/lib/modeller9v8/modlib/modeller/alignment.py", line 209, in check_structure_structure
    return f(self.modpt, io.modpt, self.env.libs.modpt, eqvdst)
_modeller.ModellerError: rdpdb___303E> No atoms were read from the specified
input PDB file, since the starting residue number and/or chain id in
MODEL_SEGMENT (or the alignment file header) was not found; requested starting
position: residue number " A", chain " 3"; atom file name:
/home/newhoir/rubber/atom_files/2d2r.pdb
[newhoir@localhost rubber]$

***The file 2d2r.pdb is exactly as I downloaded it from RCSB

**The first lines of the header:
HEADER TRANSFERASE 16-SEP-05 2D2R
TITLE CRYSTAL STRUCTURE OF HELICOBACTER PYLORI UNDECAPRENYL PYROPHOSPHATE
TITLE 2 SYNTHASE
COMPND MOL_ID: 1;
COMPND 2 MOLECULE: UNDECAPRENYL PYROPHOSPHATE SYNTHASE;
COMPND 3 CHAIN: A, B;

**The opening ATOM records:
ORIGX2 0.000000 1.000000 0.000000 0.00000
ORIGX3 0.000000 0.000000 1.000000 0.00000
SCALE1 0.020149 0.000000 0.000000 0.00000
SCALE2 0.000000 0.016974 0.000000 0.00000
SCALE3 0.000000 0.000000 0.006518 0.00000
ATOM 1 N SER A 3 -4.458 53.857 94.802 1.00 59.58 N
ATOM 2 CA SER A 3 -4.745 52.989 93.628 1.00 59.67 C
ATOM 3 C SER A 3 -3.414 52.652 92.952 1.00 59.42 C
ATOM 4 O SER A 3 -2.843 51.569 93.142 1.00 59.53 O
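The error text reports residue number " A" and chain " 3", which suggests the header fields are being read in the order residue number, then chain, for both the start and the end of the segment. The sketch below is plain Python with illustrative function names, and the set of accepted non-numeric values is an assumption. It parses a header line in that order and flags residue fields that do not look like residue numbers.

```python
# Sketch: check the colon-separated second line of a PIR entry.
import re

# Field order assumed from the error message: residue number before chain.
FIELDS = ('kind', 'code', 'start_res', 'start_chain', 'end_res',
          'end_chain', 'name', 'source', 'resolution', 'rfactor')

RESNUM = re.compile(r'^[+-]?\d+[A-Za-z]?$')   # e.g. 3, -2, 86A
ALLOWED_WORDS = ('', 'FIRST', 'LAST')          # assumed acceptable as well


def parse_pir_header(line):
    """Split a header line into named, whitespace-stripped fields."""
    parts = [p.strip() for p in line.split(':')]
    parts += [''] * (len(FIELDS) - len(parts))
    return dict(zip(FIELDS, parts))


def suspicious_residue_fields(header):
    """Return the residue fields whose value is not a residue number."""
    return [name for name in ('start_res', 'end_res')
            if header[name] not in ALLOWED_WORDS
            and not RESNUM.match(header[name])]


def format_pir_header(kind, code, start_res, start_chain, end_res,
                      end_chain, name='', source=''):
    """Build a header line with residue number before chain."""
    return ':'.join([kind, code, str(start_res), start_chain,
                     str(end_res), end_chain, name, source, '', ''])


header = parse_pir_header(
    'structureX:2d2r:A:3:A:227:UPP :Helicobacter pylori: :')
print(suspicious_residue_fields(header))
print(format_pir_header('structureX', '2d2r', 3, 'A', 227, 'A',
                        'UPP', 'Helicobacter pylori'))
```

For the 2d2r entry both residue fields are flagged because they hold the chain letter. The rebuilt line puts 3 and 227 in the residue slots and A in the chain slots; whether 227 is the right end residue still depends on the numbering in the PDB file itself.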

Error in ini modeller-generated file
by Felipe Villanelo 17 Nov '10

Hi all,

I'm trying to add some residues to a multimeric PDB, with a very simple model script:

env = environ()
env.io.hetatm = True
env.io.atom_files_directory = ['.', '../../atom_files']
a = automodel(env, alnfile='modelo4-2.ali',
              knowns=('fil1','fil2'), sequence='modelo4-Al')
a.starting_model = 1
a.ending_model = 1
a.make()

But after a short run, the program crashes with the following message:

Traceback (most recent call last):
  File "model2.py", line 32, in ?
    a.make()
  File "/usr/lib/modeller9v8/modlib/modeller/automodel/automodel.py", line 109, in make
    self.multiple_models(atmsel)
  File "/usr/lib/modeller9v8/modlib/modeller/automodel/automodel.py", line 215, in multiple_models
    self.outputs.append(self.single_model(atmsel, num))
  File "/usr/lib/modeller9v8/modlib/modeller/automodel/automodel.py", line 280, in single_model
    self.randomize_initial_structure(atmsel)
  File "/usr/lib/modeller9v8/modlib/modeller/automodel/automodel.py", line 246, in randomize_initial_structure
    self.read_initial_model()
  File "/usr/lib/modeller9v8/modlib/modeller/automodel/automodel.py", line 242, in read_initial_model
    self.read(file=self.inifile, io=io)
  File "/usr/lib/modeller9v8/modlib/modeller/model.py", line 117, in read
    model_format, model_segment)
_modeller.FileFormatError: read_pd_702E> File: modelo4-Al.ini, Line 2404
  Could not parse data from file; file is probably corrupt.

Curiously, the 'modelo4-Al.ini' file is generated by modeller during the run, and I cannot bypass this problem. The alignment file is ok. Maybe the problem is that there is a ~40 aa coil with no template sequence to model (between other sequences that do have a template).

Any suggestion will help.
Thanks

Felipe Villanelo Lizana
Biochemist
Laboratorio de Biología Estructural y Molecular
Universidad de Chile
"Curiosity didn't kill the cat; the lack of it did"
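A first step is to look at what line 2404 of the generated file actually contains. The sketch below assumes, without confirmation from the post, that the .ini file uses PDB-style fixed columns. It reports every ATOM or HETATM line whose three coordinate fields cannot be read as numbers. The function name is made up for illustration.

```python
# Sketch: list ATOM/HETATM lines whose x, y, z columns do not parse.


def unparsable_coordinate_lines(lines):
    """Return [(line_number, text), ...] for records with bad coordinates."""
    bad = []
    for lineno, line in enumerate(lines, start=1):
        if not line.startswith(('ATOM', 'HETATM')):
            continue
        try:
            # x, y and z occupy columns 31-38, 39-46 and 47-54.
            for start in (30, 38, 46):
                float(line[start:start + 8])
        except ValueError:
            bad.append((lineno, line.rstrip('\n')))
    return bad
```

Running it as `unparsable_coordinate_lines(open('modelo4-Al.ini'))` would show whether line 2404 holds overflowed or non-numeric coordinates, which would point at the untemplated coil region rather than at the alignment.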

Error: No atoms were read from the specified input PDB file
by Ann Spevacek 17 Nov '10

Hello -

I am trying to align two models, one an NMR structure (1XYX.pdb) and one a simulation (frame100.pdb), and I get the following error:

No atoms were read from the specified input PDB file, since the starting residue number and/or chain id in MODEL_SEGMENT (or the alignment file header) was not found; requested starting position: residue number " FIRST", chain " A"; atom file name: ./frame100.pdb

This is what the frame100.pdb file looks like:

REMARK GENERATED BY TRJCONV
TITLE Protein t= 2.00000
REMARK THIS IS A SIMULATION BOX
CRYST1 183.372 183.372 183.372 60.00 60.00 90.00 P 1 1
MODEL 1
ATOM 1 N LYS 1 85.750 9.110 4.370 1.00 0.00
ATOM 2 H1 LYS 1 84.780 9.360 4.220 1.00 0.00
ATOM 3 H2 LYS 1 86.050 8.430 3.680 1.00 0.00
ATOM 4 H3 LYS 1 85.790 8.630 5.260 1.00 0.00
ATOM 5 CA LYS 1 86.670 10.260 4.390 1.00 0.00
ATOM 6 HA LYS 1 86.750 10.670 3.380 1.00 0.00
ATOM 7 CB LYS 1 86.190 11.350 5.350 1.00 0.00
ATOM 8 HB1 LYS 1 86.000 10.980 6.350 1.00 0.00
ATOM 9 HB2 LYS 1 86.920 12.120 5.570 1.00 0.00
ATOM 10 CG LYS 1 84.950 12.080 4.820 1.00 0.00
ATOM 11 HG1 LYS 1 85.000 12.330 3.760 1.00 0.00
ATOM 12 HG2 LYS 1 84.170 11.340 4.990 1.00 0.00
....
ATOM 3050 HH12 ARG 206 120.270 -79.800 115.360 1.00 0.00
ATOM 3051 NH2 ARG 206 119.370 -82.100 116.310 1.00 0.00
ATOM 3052 HH21 ARG 206 118.870 -82.870 116.730 1.00 0.00
ATOM 3053 HH22 ARG 206 120.020 -81.560 116.870 1.00 0.00
ATOM 3054 C ARG 206 120.920 -85.910 117.630 1.00 0.00
ATOM 3055 O1 ARG 206 121.830 -86.740 117.840 1.00 0.00
ATOM 3056 O2 ARG 206 119.950 -85.640 118.370 1.00 0.00
ATOM 3057 Zn Zn2 207 171.290 -12.580 16.660 1.00 0.00
TER
ENDMDL

And this is the salign.py file that I am trying to use:

# Illustrates the SALIGN multiple structure/sequence alignment

from modeller import *

log.verbose()
env = environ()
env.io.atom_files_directory = './:../atom_files/'

aln = alignment(env)
for (code, chain) in (('1XYX', 'A'), ('frame100', 'A')):
    mdl = model(env, file=code, model_segment=('FIRST:'+chain, 'LAST:'+chain))
    aln.append_model(mdl, atom_files=code, align_codes=code+chain)

for (weights, write_fit, whole) in (((1., 0., 0., 0., 1., 0.), False, True),
                                    ((1., 0.5, 1., 1., 1., 0.), False, True),
                                    ((1., 1., 1., 1., 1., 0.), True, False)):
    aln.salign(rms_cutoff=3.5, normalize_pp_scores=False,
               rr_file='$(LIB)/as1.sim.mat', overhang=30,
               gap_penalties_1d=(-450, -50),
               gap_penalties_3d=(0, 3), gap_gap_score=0, gap_residue_score=0,
               dendrogram_file='PrP.tree',
               alignment_type='tree', # If 'progressive', the tree is not
                                      # computed and all structures will be
                                      # aligned sequentially to the first
               feature_weights=weights, # For a multiple sequence alignment only
                                        # the first feature needs to be non-zero
               improve_alignment=True, fit=True, write_fit=write_fit,
               write_whole_pdb=whole, output='ALIGNMENT QUALITY')

aln.write(file='PrP.pap', alignment_format='PAP')
aln.write(file='PrP.ali', alignment_format='PIR')

aln.salign(rms_cutoff=1.0, normalize_pp_scores=False,
           rr_file='$(LIB)/as1.sim.mat', overhang=30,
           gap_penalties_1d=(-450, -50), gap_penalties_3d=(0, 3),
           gap_gap_score=0, gap_residue_score=0, dendrogram_file='1is3A.tree',
           alignment_type='progressive', feature_weights=[0]*6,
           improve_alignment=False, fit=False, write_fit=True,
           write_whole_pdb=False, output='QUALITY')

Any input would be greatly appreciated!

Thanks!
Ann
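The ATOM records shown for frame100.pdb carry no chain letter, while the script asks for chain 'A' in that file. One workaround, sketched here in plain Python, is to write a chain ID into the records before handing the file to the alignment script. The function name is illustrative, and the sketch assumes the file on disk uses standard fixed-column PDB records (chain ID in column 22).

```python
# Sketch: give a chain ID to ATOM/HETATM records that have none.


def set_chain_id(lines, chain='A'):
    """Return a copy of `lines` with blank chain IDs replaced by `chain`."""
    out = []
    for line in lines:
        is_atom = line.startswith(('ATOM', 'HETATM'))
        if is_atom and len(line) > 21 and line[21] == ' ':
            line = line[:21] + chain + line[22:]   # column 22 is index 21
        out.append(line)
    return out
```

A possible use is `open('frame100_A.pdb', 'w').writelines(set_chain_id(open('frame100.pdb')))`, where the output file name is a placeholder. Records that already carry a chain ID are left untouched, as are REMARK, TER and other record types.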

Modelling a protein
by bharat lal 17 Nov '10

Hi,

I want to model the structure of a protein with ser gly tyr, but the template structure that I want to use has the reacted form of ser gly tyr. What I want to model is the position of those 3 residues in my model in place of their reacted form. Can anybody please help me out in doing so?

Thanks
------
Bharat
Ph.D. Candidate
Room No. : 7202A, 2nd Floor
Biomolecular Engineering Laboratory
Division of Chemical Engineering and Polymer Science
Pusan National University
Busan -609735 South Korea
Lab phone no. - +82-51-510-3680, +82-51-583-8343
E-mail : monu46010(a)yahoo.com

How to force Modeller to use a unique name for a log file?
by Starr Hazard 16 Nov '10

Folks,

/path/bin/mod9v8 foo.py

writes to a log file "foo.log".

/path/bin/mod9v8 foo.py > foo_out.log

ALSO writes output to 'foo.log', and nothing is written to 'foo_out.log'.

What I want to do is to model a series of sequences from the same alignment. I can do this by copy/editing the foo.py script, e.g. foo_1.py etc., and executing the modeling on each sequence from a unique script. Editing more than a few such scripts gets old. If I pass the sequence name to a foo_input.py, I have less of an editing chore, but each call to foo_input.py overwrites the previous log file, and I want to retain these logs to examine.

Is there a way for me to redirect to a named file of my choosing? The usual UNIX redirect method is not working for me at the moment.

Thanks,
Starr
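One workaround that does not depend on redirection is to rename the log after each run. This is a minimal sketch in plain Python: it assumes, as described above, that the run leaves a log named after the script with a .log extension next to it. The function name and the tag argument are made up for illustration.

```python
# Sketch: keep each run's log by renaming it before the next run starts.
import os
import shutil


def archive_log(script_path, tag):
    """Rename <script>.log to <script>_<tag>.log and return the new path."""
    base, _ext = os.path.splitext(script_path)
    source = base + '.log'
    target = '%s_%s.log' % (base, tag)
    shutil.move(source, target)      # raises an error if the log is missing
    return target
```

A driver loop could run the modeling command for one sequence, then call `archive_log('foo_input.py', sequence_name)` before moving on to the next, so each sequence ends up with its own log file such as foo_input_seq1.log.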
