
 +-------------------------------------------------------------------------+
 |                                                                         |
 |             Published by the Movement of the 23rd of November           |
 |                                                                         |
 |              Copyright April-November 1989, John Overington             |
 |                Copyright May 1990, Perfect Cock Software                |
 |                                                                         |
 |           This code is the sole property of the main author.            |
 | Unauthorised use/distribution is prohibited without the authors consent |
 |                                                                         |
 |                            John Overington                              |
 |                               ICRF Unit                                 |
 |                            Birkbeck College                             |
 |                                Malet St                                 |
 |                                 London                                  |
 |                                WC1E 7HX                                 |
 |                                                                         |
 |                                  JOY                                    |
 |                                                                         |
 |                              Version 2.0                                |
 |                                                                         |
 +-------------------------------------------------------------------------+
 
VERSION HISTORY
 
0.1 - Preliminary version, simply a formatting program.

0.2 - Bugs removed from output, IO varied, etc.

0.3 - Further enhancements and debugging.

0.4 - Extra string handling capabilities, and more rational handling of
      non-sequence data, also handles tidying up of N- and C- termini.

0.5 - Option to ignore data from alignment and to just print out things as
      they are but under latex, extra string handling facilities for top of
      alignment, replaced 'mnyfit' and 'conser' with 'text5' and 'text6',
      added label3, to allow flexible alignment numberings. Most importantly,
      changed the program so that the ordering of the lines in the output
      reflects that of the input.

0.6 - No longer bugs in output of different height lines, also thinner
      height lines in number lines.

0.7 - Added sidechain-sidechain hydrogen bonds to the output, also ask user
      for value of sidechain accessibility cutoff, changed sidechain atom
      determination routine to mainchain one, (should be faster).

0.8 - Added analysis options to program, simply counting of various features
      and their occurrences, additions now allow counting of invariant
      residues etc.

0.9 - Added pairwise analysis of aligned sequences. Intelligent formatting
      with respect to the number of blocks per output page, this also speeds
      up the later processing considerably.

1.0 - Added generalized substitution matrices for features, output of 
      substitution data in both raw score and normalised forms. Also added 
      output to unit 4, to allow further summing of raw scores.

1.1 - Distances added as measure of similarities between residues.

1.2 - Output of matrices generalized and simplified, (included subroutine
      DISPMAT).

1.3 - Output now refined so that a template can be constructed from a .tem
      file. Fixed bug with user interface.

1.4 - Made construction of the substitution matrices more general, made title
      handling for matrices more general, automated determination of the
      number of substitution matrices to be determined. Rearranged ordering of
      code. Made first attempts to remove program's dependency on latex,
      will port typesetting functions to troff, controlled by logical TROFF.
      Fixed bug in RDPSA part of program (Cutoff used to act on absolute
      sidechain accessibility). Put in comparison of sequence data read from
      .psa, .hbd and .all files. Program now identifies inconsistencies and
      stops with a nice message. (Should stop some of the idiots).

1.5 - Restructured code in pairwise for determination of matrices, also
      throughout program, assigned tokens for automatic determination of
      matrix environment descriptors, combined secondary structural data
      with positive phi to give a new mainchain conformation feature. Should
      improve classification. Now to create single property matrices will
      have to calculate them from the individual ones in SUMMER. Slightly
      changed control of output of insertion characters at end of alignment.
      NOTE: The format of the section that reports on the number of each
      type of residue is different to that in previous versions; this affects
      the program SUMMER. Added automatic numbering of alignment option.
      Added code to check for consistency of length of read data.

1.6 - Changed analysis of alignment and sequence properties by reference to
      the RDISC vector, simplifies code somewhat. Changed output of both
      sequence and alignment data. Simplified RDALI and WRALI. 
      Added insertion character as the 21st residue type, useful for 
      determination of gap penalties etc. Added logical CALCDIS to control
      calculation of distance matrices from probability matrices. NOTE: The
      subroutine DISPMAT is not fully debugged, use this with caution.
      Fixed bug in input of accessibility cutoff. Added necessary code for 
      automatic display of consensus secondary structure, (see WRALI), 
      controlled by AUTOSEC. Changed determination of length of alignment. 
      Now the program will use the longest sequence from the alignment, 
      (irrespective of whether it is a featured or unfeatured one). This 
      should make the input easier. However if there are spurious characters 
      in the alignment this will upset it still. Fixed bug in no. of blocks 
      in LATEX file. Fixed bug in LATEX formatting, program can now produce 
      bold italic characters whereas before these characters were simply 
      bold. Stopped special LATEX characters appearing in TEXT and LABEL 
      strings. Fixed bug in spacing between consecutive alignment label 
      lines. 

1.7 - Added ability to use sequences alone to allow better statistics in the
      substitution matrices. Note that this information can only be used for
      what a particular environment residue goes to. Will apply weights to
      this data eventually. Changed structure of initialization and summing
      of matrices. To allow reasonable analyses increased MAXNEWSEQ to 20.
      Started addition of code to assign weights for each of the pairwise
      comparisons. Did a bit more for correct formatting under troff. Ported
      to UNIX, minor changes required, still the troff stuff to do though.
      Added automatic option to use .all or .sst files. (Messy code for this
      though). Changed user questions a bit, used UNIX exit function.
 
      Exit codes are:
                      0   Successful run
                      1   User data bad
                      2   User requested exit
                      3   Internal system or program error
 
      Added option for checking of datafiles, it can be annoying to find that
      one or two residues are stopping you producing the alignment that you 
      need yesterday, controlled by logical CHECK. Fixed bug in logical 
      assignments if no features were to be used in analysis, if this is the
      case then cannot write out lots of the files etc.
      Improved .out file data, added alignment totals. Added section to
      allow processing of omega angles, N.B. this is only possible when
      .sst files are used for secondary structure assignment information.
      Have not decided how these will be formatted.
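
      Illustration of the exit codes listed above: a minimal Python sketch of
      how a wrapper script might interpret them. The names JoyExit and
      describe_exit are hypothetical and are not part of joy itself; only the
      four codes and their meanings come from the list above.

```python
from enum import IntEnum


class JoyExit(IntEnum):
    """Exit codes as listed in the 1.7 notes (names are hypothetical)."""
    SUCCESS = 0          # Successful run
    BAD_USER_DATA = 1    # User data bad
    USER_EXIT = 2        # User requested exit
    INTERNAL_ERROR = 3   # Internal system or program error


# Human-readable text for each code, copied from the list above.
MESSAGES = {
    JoyExit.SUCCESS: "Successful run",
    JoyExit.BAD_USER_DATA: "User data bad",
    JoyExit.USER_EXIT: "User requested exit",
    JoyExit.INTERNAL_ERROR: "Internal system or program error",
}


def describe_exit(code: int) -> str:
    """Return the documented meaning of an exit code, if it is one of the four."""
    try:
        return MESSAGES[JoyExit(code)]
    except ValueError:
        return f"Unrecognised exit code {code}"
```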

1.8 - Added draft output of alignment to formatting option, if you choose
      to format the alignment then you can produce a .prn file, this is
      a formatted ASCII alignment that you can print on a normal printer,
      (useful for checking the .pap file.) Changed program to command
      line interface.
 
      Command format is: joy -options align.file
      options :
                d - produce draft alignment, .prn
                f - full analysis, i.e. produce .out, .tem and .sum
                    (requires all structural data files)
                h - list options 
                i - run program interactively
                l - typeset alignment in latex format
                n - don't check data for integrity with alignment
                s - sequences only included in processing
                t - typeset alignment in troff format
                v - version number
                x - add extra sequences to analysis
 
      Some of these options are mutually exclusive, run it to find out
      which. Precedence generally is what you would expect it to be.
      The file name must have a '.' in it (not in the first position).
      Kept in most of the dependencies of logicals to allow easy porting
      to eg VAX. Added back option to run program interactively, most
      users should 'alias joy joy i'. Added v option (report version no.).
      Improved error codes and user questions. Improved code for hydrogen
      bond data. Used include instead of parameter statements. Fixed bug in
      reading of .hbd file, can now read both old and new formats. Changed
      output of consensus secondary structure to alpha and beta characters
      under latex.
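
      Illustration of the 1.8 command format: a minimal Python sketch that
      checks a command line of the form 'joy -options align.file'. It assumes
      the option letters are combined after a single '-' and applies the rule
      that the file name must contain a '.' that is not its first character.
      The function name parse_joy_args is hypothetical, and the mutually
      exclusive combinations are not modelled because they are not listed
      here.

```python
# Option letters and descriptions as given in the 1.8 notes.
KNOWN_OPTIONS = {
    "d": "produce draft alignment, .prn",
    "f": "full analysis, i.e. produce .out, .tem and .sum",
    "h": "list options",
    "i": "run program interactively",
    "l": "typeset alignment in latex format",
    "n": "don't check data for integrity with alignment",
    "s": "sequences only included in processing",
    "t": "typeset alignment in troff format",
    "v": "version number",
    "x": "add extra sequences to analysis",
}


def parse_joy_args(args):
    """Split arguments (program name excluded) into (option letters, file name).

    Assumption: options are single letters grouped after one '-'.
    """
    options = set()
    filename = None
    for arg in args:
        if arg.startswith("-") and len(arg) > 1:
            for letter in arg[1:]:
                if letter not in KNOWN_OPTIONS:
                    raise ValueError(f"unknown option: {letter}")
                options.add(letter)
        else:
            filename = arg
    # The file name needs a '.', and not in the first position:
    # str.find gives -1 (no dot) or 0 (leading dot) in the bad cases.
    if filename is not None and filename.find(".") < 1:
        raise ValueError(f"file name needs a '.' after the first character: {filename}")
    return options, filename
```

      For example, parse_joy_args(["-dl", "align.ali"]) gives the option set
      {"d", "l"} and the file name "align.ali" (the .ali extension is only an
      example), while a file name such as ".ali" is rejected.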

1.9 - Made troff typesetting option fully active, (planned). Added checking
      of h-bonding and accessibility data (checks to see if data is likely
      to be valid). Added yet more error checking. Changed output of latex
      strings, file should now be smaller. Added types of H-bond to process
      latest version of hbond (0.3). Added cis-peptide and HET hbonds to
      composition statistics. Generalized accumulation of counts for
      composition. Added a few extra lines to .sum file. Made default not to
      produce a .tex file. Added hettoside variable to control treatment of
      sidechain to heteroatom h-bonds. Only write out .sum file if there is
      more than one sequence. Added counting of free thiol and disulphide
      cysteines. Changed format of .sum file to cope with larger counts.
      Fixed troff formatting, changed option t and added option l. Allowed
      use of both latex and troff simultaneously. troff not there just yet
      though, seems to be a limit on the number of tab stops that one can 
      have. Fixed bug in last character subroutine. Changed reporting of 
      cis peptides, now is associated with the residue on the C terminal side,
      i.e. the residue after the bond. Added output of number of comparisons
      per pairwise comparison. (including insertions).
      Decided to split cysteine into the two oxidation states. Will call the
      free thiol J and keep the oxidized form as C in the one letter code.
      This will take a fair amount of recoding. Maybe will do tomorrow.
      Broke down identity information into secondary structural class, trying
      to show that beta structure is more conserved than alpha structure.
      Added special matrix for cis-peptide prolines. Added debugging lines.
      Fixed bugs in initialization of several variables, now do not need
      -saveall and -noopt flags in command line. This has probably uncovered 
      several more bugs though. Added implicit none to most of subroutines.
      This found a few bugs in the declaration of logical items as assumed
      integers. Added ! declarator in alignment file, can be used as comment.

2.0 - Added 'segments' coded by Andrej. Removed READALL subroutine.
      Replaced READHBOND with Andrej's version. Added g option, fixed bug in g
      option. Fixed bug in RDSST. Moved implicit statement to joy.par.
      Split source into sections, can now use proper make. Renamed READHBOND
      to RDHBD. Shortened names of several other subroutines. Only called
      WRTEM when there are real sequences. Moved help to HELP subroutine.
      Added routines LQUEST, RQUEST and CQUEST for questioning. Added ERROR
      routine to handle simple errors and exits. Added subroutine SETOPT for
      option handling. Added DOINVARI control of invariant subroutine (no 
      external control at the moment). Added LSPAP routine, mainly to
      teach myself how to call C routines from FORTRAN code. Will add code
      to prepare data from a .atm file if it is needed. Added all the code
      to make all the data. It seems to work at the moment. Added code that if
      a .pap file does not exist then a .atm file is looked for. If this is
      found then a .pap file is created from its contents and the program run
      normally. Changed LSPAP call to SPAWN call for consistency. Removed 
      LSPAP altogether. Added typesetting coding for cis-peptide bonds,
      disulphide bonds and also for H-bonds to heterogen groups. See WRALI
      for full details. Added code to label according to one of the specified
      sequences. See RDALI for further details. Fixed bug if there were no
      structures but still wanted sequence PIDs. Added code to allow direct
      printing on UNIX systems with latex. Controlled by -L option.
      Increased the number of structures that can be labelled intelligently.
      Can label according to alignment position and structure labels
      by setting the flag -A on command line.

2.1 - Removed support for troff. Checked for special characters in
      names. Also new syntax for labelling structures (a '%' after the
      structure name in the alignment file). Added output of number of
      structures to template file. Added the ability to read a template
      file, assume a .tem extension, if read template then do not
      rewrite one. Added more features to output of template file.
      Fixed bug in reading of rdseg, added checking to sequences.
      Renamed NMATRIX NVEC, added J as free thiol residue type, this
      will have released a lot of bugs. Still to fix DISPMAT for 21
      residue types, also looks like PAIR will need changing. Renamed
      DISPMAT to PRTBL. Renamed a few other routines, fixed bug in
      command line parsing, added log file to program.

2.2 - Got the kid to fix the line spacing in latex. For the test
      dataset 1gcr.atm the program takes 8 minutes and 16 seconds on
      my 386 PC running SCO UNIX 3.2. For a bit of fun, added logging
      of use. (pain under XENIX). Added output of conserved buried residues
      to alignment (bullet under respective row of alignment). Still a bug
      in labelling of alignments. Started to add code to look for effects at
      the ends of helices, etc. Removed logging.

2.3 - Changed format of .hbd file. Added mainchain to mainchain hydrogen
      bonds as a distinct feature. Fixed a few new bugs. Changed treatment
      of secondary structure in rdhbd, positive phi no longer over-rides
      alpha and beta states, but the housekeeping is taken care of in nvec.
      Correspondingly the DSSP and positive phi angle are separate features.
      Found bug in initialization subroutine, no longer segment violations,
      removed all references to IERR. Spruced up wrtex, removed odd do loops,
      added key to all .tex output, page breaks should now cause no
      problems. Removed reference to .prn file, data now goes to output file.
      Removed lots of spurious and confusing options.

2.4 - Fixed assignment of logicals from reading a template file, also split
      treatment of mainchain to mainchain hydrogen bonds into NH and CO per
      residue. Changed format of .tem file, changed reading of .tem. Fixed
      bugs in several parts of program (new compiler!!). Changed format of
      .tex file, larger page area, better inter-block spacing. Added option
      of not printing out key to alignment.

2.5 - Many small changes, replaced pair with pair2, main change to this is that
      R1 and R2 are now two-dimensional arrays inside pair2 whereas they used
      to be three-dimensional. Fixed bugs in pair2 to do with counting of
      substitutions. Added output of Kabat and Wu variability, entropy, etc
      for the alignment in the output file. Removed interlanguage calling
      routine submit, now do things directly. Removed PDBDIR variable.
      Added output of templates for all structures in an alignment (T option).
      Placed psa calc at end of hbond and sstruc, looks like things are quicker.
      Added support routines for joy in joy source dir. Ported to IRIS, removed
      bugs in string handling in RDSST, RDHBD, RDPSA and WRTEM, some files were
      missing from Makefile, fixed this.
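
      Illustration of the variability and entropy output mentioned above: a
      minimal Python sketch for a single alignment column. It assumes the
      usual Wu and Kabat definition (number of different residue types
      divided by the frequency of the most common type) and Shannon entropy
      in bits. The function name column_stats, the gap character '-', the
      exclusion of gaps and the choice of log base are all assumptions; the
      code in joy may differ.

```python
from collections import Counter
from math import log2


def column_stats(column, gap="-"):
    """Return (Wu-Kabat variability, Shannon entropy in bits) for one column.

    `column` is a string or list of one-letter residue codes, one per
    sequence. Gap characters are ignored (an assumption of this sketch).
    """
    residues = [r for r in column if r != gap]
    total = len(residues)
    if total == 0:
        # Nothing but gaps: no meaningful statistics for this column.
        return 0.0, 0.0

    counts = Counter(residues)
    most_common_count = counts.most_common(1)[0][1]

    # Variability = (number of different types) / (frequency of commonest type)
    variability = len(counts) / (most_common_count / total)

    # Shannon entropy: sum over types of p * log2(1 / p)
    entropy = sum((n / total) * log2(total / n) for n in counts.values())
    return variability, entropy
```

      With these definitions an invariant column gives a variability of 1 and
      an entropy of 0, and both values rise as the column becomes more mixed.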

2.6 - Many small bugs, again to IRIS, again a few bugs fixed, translated a
      few routines with f2c, found a few more bugs. Removed Ramachandran
      map to Ramaplot, numerous changes to Makefile. This is the final
      version I will leave at Birkbeck, changed syntax of atm2seq, and 
      associated changes in joy. Fixed dependencies, will now work
      correctly with all extensions and programs to run. Fixed bug in
      path for atm2seq. Added startup file (but removed almost straight
      away). Removed T option (single structure templates from alignments),
      might as well run joy on relevant .atm files. Fixed bug
      (inconsistency) in psa, used to produce funny results for missing
      atom residues, they could end up as inaccessible. Changed psa
      and rdpsa.f accordingly, they are now always considered to be 
      accessible.

2.7 - Removed option to read a template file. Reduces size of program a lot.
      Got portrait and landscape format code from Andrej. Added Helvetica
      font as an option.
