Fwd: Help for "Sequence difference between alignment and pdb"
Forwarding to list; please reply to original author.
----- Forwarded message from Dong Chen dong.chen@usu.edu -----
Subject: Help for "Sequence difference between alignment and pdb"
Date: Fri, 26 Sep 2003 18:10:45 -0600
From: "Dong Chen" dong.chen@usu.edu
To: modeller-care@salilab.org
Hi,
I am trying to model a protein using its own structural information, but I get a "Sequence difference between alignment and pdb" error message in the log file. Could somebody help me see where the problem is? Attached are the files I used.
Thanks,
Dong
####1EDU.atm
####model-Peter.top:
# Homology modelling by the MODELLER TOP routine 'model'.

INCLUDE                             # Include the predefined TOP routines

SET OUTPUT_CONTROL = 1 1 1 1 1      # uncomment to produce a large log file
SET ALNFILE  = 'alignment_Peter.ali'      # alignment filename
SET KNOWNS   = '1EDU'               # codes of the templates
SET SEQUENCE = 'peter'              # code of the target
SET ATOM_FILES_DIRECTORY = './:../atom_files' # directories for input atom files
SET STARTING_MODEL= 1               # index of the first model
SET ENDING_MODEL  = 1               # index of the last model
                                    # (determines how many models to calculate)

CALL ROUTINE = 'model'              # do homology modelling

####alignment_Peter.ali:
C; A sample alignment in the PIR format; used in tutorial

>P1;1EDU
structure:1EDU:2:A :150:A: : : :
-NIVHNYSEAEIKVREATSNDPWGPSSSLXSEIADLTYNVVAFSEIXSXIWKRLNDHGKNWRHVY
KAXTLXEYLIKTGSERVSQQCKENXYAVQTLKDFQYVDRDGKDQGVNVREKAKQLVALLRDEDR
LREERAHALKTKEKLAQTATA*

>P1;peter
sequence:peter:1 : :150 : : : : :
-NIVHNYSEAEIKVREATSNDPWGPSSSLXSEIADLTYNVVAFSEIXSXIWKRLNDHGKNWRHVY
KAXTLXEYLIKTGSERVSQQCKENXYAVQTLKDFQYVDRDGKDQGVNVREKAKQLVALLRDEDR
LREERAHALKTKEKLAQTATA*
####logfile.log:
MODELLER 6v2, 17 Feb 2002
PROTEIN STRUCTURE MODELLING BY SATISFACTION OF SPATIAL RESTRAINTS
Copyright(c) 1989-2002 Andrej Sali All Rights Reserved
Written by A. Sali
with help from A. Fiser, R. Sanchez, M.A. Marti-Renom,
B. Jerkovic, A. Badretdinov, F. Melo,
J.P. Overington & E. Feyfant
Rockefeller University, New York, USA
Harvard University, Cambridge, USA
Imperial Cancer Research Fund, London, UK
Birkbeck College, University of London, London, UK

Kind, OS, HostName, Kernel, Processor: 4, Windows_NT BTC101DC x86 Family 15 Model 1 Stepping 2, Genuin
Date and time of compilation : Jul 09 2002 16:21:30
Job starting time (YY/MM/DD HH:MM:SS): 2003/09/23 17:56:59.346
TOP_________> 105 705 SET ALNFILE = 'alignment_Peter.ali'
TOP_________> 106 706 SET KNOWNS = '1EDU'
TOP_________> 107 707 SET SEQUENCE = 'peter'
TOP_________> 108 708 SET ATOM_FILES_DIRECTORY = './:../atom_files'
TOP_________> 109 709 SET STARTING_MODEL = 1
TOP_________> 110 710 SET ENDING_MODEL = 1
TOP_________> 111 711 CALL ROUTINE = 'model'
TOP_________> 112 399 CALL ROUTINE = 'getnames'
TOP_________> 113 509 STRING_IF STRING_ARGUMENTS = MODEL 'undefined', OPERATION; = 'EQ', THEN = 'STRING_OPERATE OPERATION = CONCATENA; TE, STRING_ARGUMENTS = SEQUENCE .ini, RESULT = MODEL'
TOP_________> 114 510 STRING_IF STRING_ARGUMENTS = CSRFILE 'undefined', OPERATI; ON = 'EQ', THEN = 'STRING_OPERATE OPERATION = CONCATE; NATE, STRING_ARGUMENTS = SEQUENCE .rsr, RESULT = CSRFILE; '
TOP_________> 115 511 STRING_OPERATE OPERATION = 'CONCATENATE', ; STRING_ARGUMENTS = SEQUENCE '.sch', RESULT = SCHFILE
TOP_________> 116 512 STRING_OPERATE OPERATION = 'CONCATENATE', ; STRING_ARGUMENTS = SEQUENCE '.mat', RESULT = MATRIX_FI; LE
TOP_________> 117 513 SET ROOT_NAME = SEQUENCE
TOP_________> 118 514 RETURN
TOP_________> 119 400 CALL ROUTINE = 'homcsr'
TOP_________> 120 106 READ_ALIGNMENT FILE = ALNFILE, ALIGN_CODES = KNOWNS SEQUE; NCE
Dynamically allocated memory at amaxseq [B,kB,MB]: 2205269 2153.583 2.103
openf5__224_> Open 11 OLD SEQUENTIAL alignment_Peter.ali

Dynamically allocated memory at amaxbnd [B,kB,MB]: 4458129 4353.642 4.252
openf5__224_> Open 11 OLD SEQUENTIAL alignment_Peter.ali
read_al_374_> Non-standard residue type,position,sequence: X 29 1
read_al_374_> Non-standard residue type,position,sequence: X 46 1
read_al_374_> Non-standard residue type,position,sequence: X 48 1
read_al_374_> Non-standard residue type,position,sequence: X 67 1
read_al_374_> Non-standard residue type,position,sequence: X 70 1
read_al_374_> Non-standard residue type,position,sequence: X 89 1
read_al_374_> Non-standard residue type,position,sequence: X 29 2
read_al_374_> Non-standard residue type,position,sequence: X 46 2
read_al_374_> Non-standard residue type,position,sequence: X 48 2
read_al_374_> Non-standard residue type,position,sequence: X 67 2
read_al_374_> Non-standard residue type,position,sequence: X 70 2
read_al_374_> Non-standard residue type,position,sequence: X 89 2

Read the alignment from file : alignment_Peter.ali
Total number of alignment positions: 149

  #  Code   #_Res  #_Segm  PDB_code  Name
-------------------------------------------------------------------------------
  1  1EDU     149       1  1EDU
  2  peter    149       1  peter
TOP_________> 121 107 CHECK_ALIGNMENT

check_a_343_> >> BEGINNING OF COMMAND
openf5__224_> Open 11 OLD SEQUENTIAL ../atom_files/1EDU.atm
rdabrk__291E> Sequence difference between alignment and pdb :

 STRUCTURE RES_IND ALN_ITYP ALN_RES X_ITYP X_RES -----*-----
         1      29       24     UNK     11   MET PSSSLXSEIAD
rdabrk__288W> Protein not accepted: 1
check_a_337E> Structure not read in: 1
recover____E> ERROR_STATUS >= STOP_ON_ERROR: 1 1

Dynamically allocated memory at finish [B,kB,MB]: 4458129 4353.642 4.252
Starting time : 2003/09/23 17:56:59.346
Closing time : 2003/09/23 17:57:01.549
Total CPU time [seconds] : 1.72
----- End forwarded message -----
Dear Dong,
You should replace the "X" with its real code. It might be M (Met).
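To see which alignment positions are affected before editing anything, the "X" residues can be located with a few lines of Python. This is only a sketch added for illustration: the helper name nonstandard_positions is hypothetical, and the sequence is copied from alignment_Peter.ali above. Positions are counted over residues only, skipping gap characters and the terminating "*".

```python
def nonstandard_positions(aligned_seq, symbol="X"):
    """Return 1-based residue positions of `symbol`, ignoring gaps and '*'."""
    positions = []
    residue_index = 0
    for char in aligned_seq:
        # Gaps, the PIR terminator and whitespace are not residues.
        if char in "-*" or char.isspace():
            continue
        residue_index += 1
        if char == symbol:
            positions.append(residue_index)
    return positions


# Template sequence from alignment_Peter.ali (the target entry is identical).
TEMPLATE = (
    "-NIVHNYSEAEIKVREATSNDPWGPSSSLXSEIADLTYNVVAFSEIXSXIWKRLNDHGKNWRHVY"
    "KAXTLXEYLIKTGSERVSQQCKENXYAVQTLKDFQYVDRDGKDQGVNVREKAKQLVALLRDEDR"
    "LREERAHALKTKEKLAQTATA*"
)

print(nonstandard_positions(TEMPLATE))  # [29, 46, 48, 67, 70, 89]
```

These are the same six positions reported by the read_al_374_ warnings in the log, and position 29 is the one where the structure file has MET.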
Good luck.
Xiao-Ping
Xiao-Ping Zhang, PhD
Section of Microbiology
Division of Biological Sciences
University of California, Davis
Davis, CA 95616
Hi
More and more new structures are solved using selenomethionine (Se-Met) replacement, and yours is one of them.
In the PDB file these Se-Mets are stored as heteroatoms (HETATM records).
Your choices:
1- Put "SET HETATM_IO = on" in the MODELLER TOP file, and select only the protein segment (1-150) if you would like to avoid the water and other possible ligands (e.g. EGL in your case) listed after the protein sequence in the PDB file. In the sequence you would use "." for the Se-Mets.
2- Replace the HETATM records with ATOM records in the PDB file and introduce MSE as a new residue type (see the FAQ section of the manual for details on which library files to edit for this: restyp.lib, top_heav.lib, etc.).
3- Do nothing to the PDB and TOP files; just leave out the Se-Met positions (i.e. in your sequence alignment these will be gaps in your template sequence). Met residues will be built in your target with more uncertainty, because no template residues/atoms will be available at those positions.
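The alignment edits implied by choices 1 and 3 can be sketched in Python as follows. The function name rewrite_unknowns is hypothetical, and two assumptions are made that the messages do not confirm: that every "X" in this alignment is a Se-Met, and that for choice 1 the "." goes into both the template and the target entry. The fragment PSSSLXSEIAD is the window shown in the error message.

```python
def rewrite_unknowns(template_seq, target_seq, template_char, target_char,
                     unknown="X"):
    """Replace every `unknown` symbol in both aligned sequences.

    The sequences keep their length, so the alignment columns stay in step.
    """
    if len(template_seq) != len(target_seq):
        raise ValueError("aligned sequences must have the same length")
    return (template_seq.replace(unknown, template_char),
            target_seq.replace(unknown, target_char))


fragment = "PSSSLXSEIAD"  # window around residue 29 from the log

# Choice 1 (assumption: '.' in both entries, used together with HETATM_IO).
print(rewrite_unknowns(fragment, fragment, ".", "."))

# Choice 3: gap in the template, Met in the target (assumes X is always Met).
print(rewrite_unknowns(fragment, fragment, "-", "M"))
```

For choice 3 the template line becomes PSSSL-SEIAD and the target line PSSSLMSEIAD, so the target Met is built without template atoms at that position.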
Approach 3 is the quickest but the least accurate. Approach 2 is the biggest hassle but the most exact. Approach 1 falls between the other two in all regards.
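The record-renaming half of approach 2 can be sketched in Python. This is an illustration under stated assumptions, not a complete recipe: the function name hetatm_to_atom is hypothetical, the two sample lines use made-up atom numbers and coordinates rather than data from 1EDU, and the library edits (restyp.lib, top_heav.lib, etc.) still have to be done by hand as described in the manual FAQ. It relies only on the fixed-column PDB layout, where the record name occupies columns 1-6 and the residue name columns 18-20.

```python
def hetatm_to_atom(pdb_lines, residue_name="MSE"):
    """Turn HETATM records of one residue type into ATOM records.

    Other HETATM records (water, ligands) are left untouched.
    """
    converted = []
    for line in pdb_lines:
        # Columns 1-6 hold the record name, columns 18-20 the residue name.
        if line.startswith("HETATM") and line[17:20] == residue_name:
            line = "ATOM  " + line[6:]
        converted.append(line)
    return converted


# Made-up sample records, for illustration only.
sample = [
    "HETATM  215  N   MSE A  30      12.345  23.456  34.567  1.00 20.00",
    "HETATM 1201  O   HOH A 301      10.000  11.000  12.000  1.00 30.00",
]

for line in hetatm_to_atom(sample):
    print(line)
```

Only the MSE line changes its record name; the water stays a HETATM record, so it can still be excluded when the structure is read.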
Best wishes, Andras Fiser