ModPipe change history

ModPipe 2.3.1 10-27-2023

  • Modeller 10 syntax is now used throughout; ModPipe requires Modeller 10.0 or later.

  • Template-based modeling now takes a max_concurrent_tasks_tb argument which can be used to throttle its cluster jobs.

ModPipe 2.3.0 05-14-2020

  • ModPipe should now work with Python 3. The setup script is now called “setup.py” (not “Setup”) and should be run as “python2 setup.py” or “python3 setup.py”. It will configure ModPipe to use the same version of Python it was run with.

  • Added HHSuite programs to the fold assignment/alignment methods. With HHSuite programs, some ModPipe models are now created from multiple templates. The following hits modes are used:
      - HHBlitsSP - Sequences against HHBlits profile database
      - HHBlitsPP - HHBlits UniProt20 profiles against HHBlits profile database
      - HHSearchSP - Sequences against HHSearch profile database
      - HHSearchPP - HHBlits UniProt20 profiles against HHSearch profile database
    The HHSuite hits modes are only allowed when ligands and water molecules are not included in the final models. They can be combined with any other ModPipe hits modes.

  • Changed hits mode handling: the numeric method code (1337, etc.) is replaced by a list of method keywords (Prf-Prf, etc.). The new keywords are:
      - Seq-Seq (was 1000)
      - Seq-Prf (was 0001)
      - Prf-Seq (was 0100)
      - PSI-Blast-Prf-Seq (was 0200)
      - Prf-Prf (was 0010)
      - PSI-Blast-Prf-Prf (was 0020)
      - Max-PSSM-Seq-Prf (was 0002)
      - Max-Freq-Seq-Prf (was 0004)
    The input is checked against a list of allowed methods. This change will allow us to add other methods.
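
    The keyword list in this entry can be written down as a lookup table with a simple check. This is only an illustrative sketch: the names LEGACY_TO_KEYWORD, ALLOWED_KEYWORDS and validate_hit_modes are hypothetical, the table covers only the keywords listed in this entry, and ModPipe's own check may work differently.

```python
# Old numeric codes and their replacement keywords, as listed in this entry.
# (Hypothetical table; not taken from the ModPipe sources.)
LEGACY_TO_KEYWORD = {
    "1000": "Seq-Seq",
    "0001": "Seq-Prf",
    "0100": "Prf-Seq",
    "0200": "PSI-Blast-Prf-Seq",
    "0010": "Prf-Prf",
    "0020": "PSI-Blast-Prf-Prf",
    "0002": "Max-PSSM-Seq-Prf",
    "0004": "Max-Freq-Seq-Prf",
}
ALLOWED_KEYWORDS = set(LEGACY_TO_KEYWORD.values())


def validate_hit_modes(modes):
    """Return the modes as a list, or raise ValueError naming unknown ones."""
    unknown = [m for m in modes if m not in ALLOWED_KEYWORDS]
    if unknown:
        raise ValueError("unknown hits mode(s): " + ", ".join(unknown))
    return list(modes)
```

    In this sketch, validate_hit_modes(["Seq-Seq", "Prf-Prf"]) is accepted, while an old-style value such as "1337" raises ValueError.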

  • Added a CAMEO option for model selection in the gathering script. This flexible option does not require a ModWebd restart when the model gathering is changed.

  • BenchMark.py now prints out training data for TSVMod and potentially other scoring methods.

  • Added option -k to AddSeq.py to retain the residue X (non-standard residue).

  • PSI-Blast and Build-Profile now recognize the input E-value threshold. PSI-Blast’s default maximum number of sequences in the profile (RunPsiBlast.pl) has been increased from 2000 to 20,000.

  • Changed the handling of non-standard residues in MakeChains.py. Previously, only PDB files with 10 or fewer non-standard residues were processed. Now, PDB files with 10% or fewer non-standard residues are processed; these residues are represented as X in the PIR and FASTA files.
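
    The percentage rule can be illustrated with a short check on a chain sequence. This is a sketch only: passes_nonstandard_filter is a hypothetical name, and it assumes the sequence is given as a one-letter string in which every non-standard residue already appears as X.

```python
def passes_nonstandard_filter(sequence, max_percent=10):
    """True if at most max_percent of the residues are 'X' (non-standard).

    Integer arithmetic is used so that a chain sitting exactly on the
    threshold is not affected by floating-point rounding.
    """
    return sequence.count("X") * 100 <= max_percent * len(sequence)
```

    Under this sketch a 200-residue chain with 15 X residues (7.5%) passes, whereas the previous absolute limit of 10 would have excluded it.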

  • Added a TOP value for the template_option flag of ModPipe.pl. It only processes template hits whose sequence identity is at most 20% lower than the highest sequence identity for each region. This is used for regular sequence-based calculations, to save compute time when high-quality templates are available.
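
    One possible reading of the TOP rule is sketched here. Several things are assumptions: the function name top_hits, the representation of a hit as a (region, template, sequence identity) tuple, and the interpretation of "20% lower" as 20 percentage points below the best hit of the same region. The actual ModPipe.pl logic is not shown in this file.

```python
from collections import defaultdict


def top_hits(hits, window=20.0):
    """Keep hits whose identity is within `window` points of the region's best.

    hits: iterable of (region, template, seq_identity_percent) tuples.
    """
    hits = list(hits)
    best = defaultdict(float)  # highest sequence identity seen per region
    for region, _template, seqid in hits:
        best[region] = max(best[region], seqid)
    return [h for h in hits if h[2] >= best[h[0]] - window]
```

    With a best hit of 90% identity in a region, this sketch keeps a 75% hit in that region and drops a 60% hit; other regions are judged against their own best hit.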

  • Added a TEMPLATE_FAST value for the template_option flag of ModPipe.pl. It skips statistics and assumes the use of short PDB databases (only for the input PDB file). When used, calculations are significantly faster, but the E-value and MPQS are not calculated correctly. This is used for template-based calculations.

  • Added support for split BLAST databases (previously only databases of the form foo.phr were recognized; now split databases of the form foo.00.phr, foo.01.phr etc. are also supported).
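
    The two naming schemes described above can be recognized with a single pattern. The sketch below is illustrative only: blast_db_volumes is a hypothetical helper, and ModPipe's own detection code may differ.

```python
import os
import re


def blast_db_volumes(prefix):
    """List the .phr files belonging to a BLAST database prefix.

    Matches both the single-file form (foo.phr) and the split form
    (foo.00.phr, foo.01.phr, ...), returned in sorted order.
    """
    directory, base = os.path.split(prefix)
    pattern = re.compile(re.escape(base) + r"(\.\d+)?\.phr$")
    names = sorted(n for n in os.listdir(directory or ".") if pattern.match(n))
    return [os.path.join(directory, n) for n in names]
```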

  • All generated models should now have chain IDs assigned.

ModPipe 2.2.0 11-01-2010

  • Introduced a flag to disable the TSVMod calculation if needed.

  • TSVMod scores are incorporated into ModPipe. Scores will be calculated for all structures. Additionally, when chosen as a gathering option, models with a predicted no35>0.4 will be selected.

  • PDB files are now also searched for in PDB-style subdirectories (e.g. PDB code 1abc will be looked for in the ‘ab’ subdirectory). If these PDB files are compressed (.Z, .gz, or .bz2) they will be automatically uncompressed when opened.
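
    The lookup described above can be sketched as follows. The helper names candidate_paths and open_pdb are hypothetical, as is passing the file name separately (the naming scheme of the PDB files is not stated here). The .Z format is left out because the Python standard library has no reader for it.

```python
import bz2
import gzip
import os


def candidate_paths(pdb_dir, code, basename):
    """Yield paths to try: the top directory first, then the PDB-style
    subdirectory named after the two middle characters of the code."""
    sub = code[1:3].lower()
    for directory in (pdb_dir, os.path.join(pdb_dir, sub)):
        for ext in ("", ".gz", ".bz2"):
            yield os.path.join(directory, basename + ext)


def open_pdb(pdb_dir, code, basename):
    """Open the first existing candidate as text, uncompressing if needed."""
    for path in candidate_paths(pdb_dir, code, basename):
        if os.path.exists(path):
            if path.endswith(".gz"):
                return gzip.open(path, "rt")
            if path.endswith(".bz2"):
                return bz2.open(path, "rt")
            return open(path)
    raise FileNotFoundError(basename)
```

    For PDB code 1abc, this sketch looks in the given directory and then in its ‘ab’ subdirectory.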

  • Changed the gathering mechanism. The GatherModMP.py script now has two modes:
      (i) local and global gathering using final_models_by and YAML files
      (ii) fast gathering by concatenating the local files
    By default, ModWeb gathers locally for every sequence and runs a fast gathering at the end of the job.

  • ModBaseImport.py now inserts the run information after loading models and sequences, to avoid access in the middle of the loading process.

  • GatherModMP.py also creates .fin files in sequence directories, which can be used by ModBaseImport.py and are default for ModWebd.

  • ModWebd: jobs are moved to the current directory before running.

  • ModWebd: uses the enddatapath option in ModBaseImport.

  • ModWebd: template-based functionality implemented.

  • ModWeb_TemplateBased.pl submits a TemplateBased job to the cluster.

  • StrucImpact.pl has been renamed to TemplateBased.pl. It has been modified to separate the template and target parts (different tmp directories and configuration files). It can also be called with an option to only model the sequences (from UniProt or input) using the input template.

  • ModWeb.pl now deletes any running sequence-based SGE job if it is killed.

  • ModBaseImport.py: New option to copy relevant data from network storage to a permanent disk.

ModPipe 2.1.3 06-03-2009

  • Insertion codes (e.g. residue numbers such as 6A) should now be handled properly in PDB structures.

  • Bugfix: in MPModules.pm: Changed HitsPrfSeq.pl to HitsPrfSeq.py and adjusted options.

ModPipe 2.1.2 04-06-2009

  • read_hits_file() and read_models_file() in the modpipe.serialize module now return generators, rather than lists. Hits and models files can contain multiple YAML documents.

  • Bugfix: GatherModMP.py now correctly clusters models by region.

ModPipe 2.1.1 03-31-2009

  • main/AddSeq.pl has been renamed to main/AddSeq.py, and src/ConvertSeq.pl to src/ConvertSeq.py.

  • Bugfix: ModWeb no longer crashes if a number of options other than 3 is given for ‘select models by’.

ModPipe 2.1.0 02-23-2009

  • Changed GatherModMP.py to check whether the .mod file exists. This could potentially be extended to print out which .mod files are missing (for resubmission).

  • Generated .mod, .fin, .sel and .hit (model and hit) files are now in YAML format.

  • main/BenchMark.pl has been renamed to main/BenchMark.py and main/GatherModMP.pl to main/GatherModMP.py.

  • PDB files generated by ModPipe now contain information on the ModPipe version used to build them.

  • Remove MODELLER option from configuration files; it is not used (ModPipe has its own copy of MODELLER in the ext/mod directory).

  • Several scripts now take positional arguments rather than “mandatory” options; e.g. “MakeChains.py [-p DIR] [-o FILE] pdblist” rather than “MakeChains.py [-p DIR] [-o FILE] -f pdblist”.

  • Structure impact / Leverage. For a given new PDB structure (“template of interest”), find a candidate set of UniProt sequences and model each with ModPipe. StrucImpact.pl is the main routine; AugmentPDB.pl and GetFullSeqs.pl are subsidiary scripts; StrucImpactResults.py collates the output.

  • Structure impact related change: add option to ModPipe to do modeling only if template of interest is among hits, and to retain template of interest after clustering. Also accommodate option in WriteSGEMP() (in lib/perl/MPLib/MPUtils.pm).

  • Structure impact related change: add options to GetProfile() (lib/perl/MPLib/MPModules.pm).

  • Structure impact related changes: add FULLSEQDB (full UniProt database), XPRF_PIR (PDB95 structure sequences), and XPRF_DATDIR (directory of PDB95 profile data) to conf file – all optional (though required for StrucImpact.pl).

  • Structure impact auxiliary program: UniqueSeq.py determines which sequences in a file are unique within a specified percent identity.

ModPipe 2.0.2 12-09-2008

  • It is no longer necessary to set the MODPIPEBASE environment variable or to run Modeller’s modpy.sh script to use Python scripts; this is all set up automatically by ./Setup now.

  • MD5 sums of generated model (PDB) files no longer include the EXPDTA line. (This is because the EXPDTA line contains the time the model was generated, which changes on each run.)
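
    The effect of leaving out the EXPDTA line can be sketched as below. The function name model_md5 is hypothetical, and the sketch assumes the checksum is taken over the remaining lines of the file unchanged; ModPipe's actual checksum code is not shown here.

```python
import hashlib


def model_md5(pdb_text):
    """MD5 hex digest of PDB text, ignoring any EXPDTA lines."""
    kept = [
        line
        for line in pdb_text.splitlines(keepends=True)
        if not line.startswith("EXPDTA")
    ]
    return hashlib.md5("".join(kept).encode("utf-8")).hexdigest()
```

    Two models that differ only in the timestamp on their EXPDTA line then produce the same checksum.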

  • Scripts use regular long options (-- prefix) by default, and all scripts should understand the standard --help and --version options.

ModPipe 2.0.1 12-05-2008

  • Choose SEQID and MPQS as mandatory gathering options in MODWEB.

  • Add ModWeb queue status page, and allow multiple jobs to run simultaneously.

  • Update Modeller to correctly handle sequence databases containing more than 2^31 residues.

  • Add basic test suite (‘make’ in tests directory) to run as part of the nightly builds. Currently only reference outputs on synth are present.

ModPipe 2.0 10-09-2008

  • Latest code from Eashwar. Version history prior to this point is unknown.