Joy is an analysis and formatting program for multiple protein sequence alignments or single protein structures. It produces a number of files that are used either directly or by other programs. There are a large number of options, but the defaults are usually what you want for a basic alignment. The program is invoked as follows:
joy -options file
The options may be omitted if you are happy with the defaults (see below for a description of the options).
If your files follow the simple naming convention described below, you can drop the extension on the command line, as in:
joy gamma
What joy does next depends on which files exist, but you should get something on standard output. In the above example the first file searched for is the alignment file gamma.ali. If this file exists, it is read and processed. If it does not exist, joy looks for the file gamma.seq and uses that instead. If gamma.seq does not exist either, joy searches the current directory for gamma.atm, which should be a standard PDB format file. If the .atm file is missing, joy checks for a file with a .pdb extension, which is converted to a .atm file with the filter pdb2atm. The program then creates a .seq file from the contents of the .atm file and processes it as before. Note that any existing .seq file is overwritten if you explicitly specify gamma.atm on the command line.
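The search order described above can be sketched in a few lines of Python. This is only an illustration of the documented lookup sequence, not joy's own code; the function name `find_input` and the constant `SEARCH_ORDER` are hypothetical, and the conversion steps (pdb2atm, creation of the .seq file) are deliberately left out.

```python
from pathlib import Path

# Extensions in the order the text says joy looks for them:
# alignment, sequence, processed coordinates, raw PDB coordinates.
SEARCH_ORDER = (".ali", ".seq", ".atm", ".pdb")


def find_input(stem, directory="."):
    """Return the first existing file for `stem` in the documented order.

    Hypothetical helper for illustration only. Returns None when none of
    the candidate files exists in `directory`.
    """
    for ext in SEARCH_ORDER:
        candidate = Path(directory) / (stem + ext)
        if candidate.is_file():
            return candidate
    return None
```

Calling `find_input("gamma")` in a directory that holds both gamma.seq and gamma.atm would return the path to gamma.seq, because the .seq file comes earlier in the search order.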
If you want to include structural information in the analysis, the program will try to calculate any missing data, as long as the corresponding .atm file for every entry in the alignment is in the current directory (and joy is correctly installed). This feature relies on the programs psa, hbond, sstruc, pdb2atm, and atm2seq being in your path. See the section on data files below for more details.
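Before relying on the automatic calculation of missing data, it can help to confirm that the helper programs named above are actually on your path. The following Python sketch does that with the standard library; it is not part of the joy distribution, and the function name `missing_helpers` is hypothetical.

```python
import shutil

# Helper programs the text says joy relies on.
HELPERS = ("psa", "hbond", "sstruc", "pdb2atm", "atm2seq")


def missing_helpers(names=HELPERS):
    """Return the names from `names` that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]
```

An empty list from `missing_helpers()` means all five programs were found. Any names returned are the ones to install or add to your path.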
The default file extensions used by joy are detailed below.
extension | contents |
pdb | Raw PDB format coordinates |
atm | Processed PDB format coordinates |
hbd | Hydrogen-bonding data |
psa | Accessibility data |
sst | Secondary structure data |
seg | Segment definition file |
lbl | Label data for a structure |
ali | Alignment |
seq | Sequence |
sub | Substitution data |
tem | File containing a `template' representation of structure |
tex | LaTeX file containing alignment |
One of the files that joy produces has a .sub extension; it contains a breakdown of residue substitutions classified according to local environment. As you would expect, these data are quite sparse, so there is an ancillary program called summer that merges the data from many datasets. The data produced by summer are then used by a number of other programs.
Another file produced by the program, usually with a .tex extension, can be processed with latex to produce a nicely formatted alignment for typesetting. The .tem file is the main input to the qslave and pslave template alignment programs.
Joy has a large number of options. To see the current ones, simply type joy at the command line and the options will be listed. Some of the more important options are:
1gcr 1gcr 1 45 2gcr 2gcr 1 45
>P1;1gcr
structure
--GKITFYEDRGFQGHCYECSSDCPNLQP-YFSRCNSIRVDSGCWMLYERPNYQGHQYFLRRGDYPDYQQWMGF-
-NDSIRSCRLIPQHTGTFRMRIYERDDFRGQMSEITD-DCPSLQDRFHLSEVHSLNVLEGSWVLYEMPSYRGRQY
LLRPGEYRRYLDWGAMNAKVGSLRRVMDFY-*
>P1;2gcr
structure
--GKITFYEDRGFQGRHYECSSDHSNLQP-YFSRCNSIRVDSGCWMLYEQPNFTGCQYFLRRGDYPDYQQWMGF-
-SDSVRSCRLIP-HTSSHRLRIYEREDYRGQMVEITE-DCSSLQDRFHFSDIHSFHVMEGYWVLYEMPNYRGRQY
LLRPGDYRRYLDWGAANARVGSLRRAVDFY-*
>P1;1bb2
structure
LNPKIIIFEQENFQGHSHELNGPCPNLKETGVEKAGSVLVQAGPWVGYEQANCKGEQFVFEKGEYPRWDSWTSSR
RTDSLSSLRPIKVDSQEHKITLYENPNFTGKKMEVIDDDVPSFHAHGYQEKVSSVRVQSGTWVGYQYPGYRGLQY
LLEKGDYKDSGDFGAPQPQVQSVRRIRDMQW*
By default, the consensus secondary structure is shown underneath the alignment. It should be obvious what it all means (if it isn't, then what can you expect to gain from using the program?). A position is given a `consensus' state when a fraction greater than 0.7 of the structures are in a particular conformational state at that position. A hidden flag is available if you want to change this fraction. Also underneath the alignment is a series of bullets showing the positions of consensus buried residues. You can turn this feature off if you want to.
The current limitations on the size of various things the user is likely to encounter are:
item | limit |
total length of alignment | 1000 |
total number of structures | 35 |
total number of `plain' sequences | 30 |
number of text strings | 6 |
number of label strings | 3 |
You may want to mix `featured' and `plain' sequences in the formatted alignment. To do this, simply prefix the title of the sequence with a `*'. This marks the sequence as a plain string of characters, for which no data files are required. A comment line may be added to the alignment file by preceding it with a `#'.
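The consensus rule given earlier (a fraction greater than 0.7 of the structures in one conformational state at a position) can be illustrated with a short Python sketch. This is not joy's implementation. The function name `consensus_state` and the one-letter state codes used in the example are hypothetical, and the sketch assumes the fraction is taken over every entry passed in for the position; how joy treats gaps is not stated in the text.

```python
from collections import Counter


def consensus_state(states, threshold=0.7):
    """Return the consensus state for one alignment position, or None.

    `states` holds one state code per structure at this position.
    A state is the consensus only if its fraction is strictly greater
    than `threshold` (0.7 is the default quoted in the text).
    Assumption: the fraction is computed over all entries in `states`.
    """
    if not states:
        return None
    state, count = Counter(states).most_common(1)[0]
    return state if count / len(states) > threshold else None
```

With four structures, three in state `H` and one in state `E` gives a fraction of 0.75, so `consensus_state("HHHE")` returns `"H"`. An even two-to-two split returns `None`.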