MDT, a module for protein structure analysis.
Copyright 1989-2011 Andrej Sali.
MDT is free software: you can redistribute it and/or modify it under the terms of version 2 of the GNU General Public License as published by the Free Software Foundation.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with MDT. If not, see <http://www.gnu.org/licenses/>.
Library data used in the construction and use of MDTs.
Parameters : |
|
---|
Angle classes; see BondClasses
Atom classes; see BondClasses
Bond classes; see BondClasses
Dihedral classes; see BondClasses
Hydrogen bond atom classes; see HydrogenBondClasses
Atom tuple classes; see TupleClasses and Tuple features
Classifications of tuples of atoms into classes. Usually accessed as Library.tuple_classes. These classes are used by tuple or tuple pair features.
Read atom tuple information from a file
Classifications of atoms/bonds/angles/dihedrals into classes. These classes are used by atom and chemical bond features. Usually accessed as Library.atom_classes, Library.bond_classes, Library.angle_classes, or Library.dihedral_classes. (There is no need to create your own BondClasses objects.)
Read class information from filename
Classifications of atoms into hydrogen bond classes. Usually accessed as Library.hbond_classes. These classes are used by the features.HydrogenBondAcceptor, features.HydrogenBondDonor and features.HydrogenBondSatisfaction features.
Read hydrogen bond atom class information from a file
A multi-dimensional table.
Parameters : |
|
---|
Individual elements from the table can be accessed in standard Python fashion, e.g.
>>> import mdt
>>> import modeller
>>> env = modeller.environ()
>>> mlib = mdt.Library(env)
>>> restyp1 = mdt.features.ResidueType(mlib, protein=0)
>>> restyp2 = mdt.features.ResidueType(mlib, protein=1)
>>> gap = mdt.features.GapDistance(mlib, mdt.uniform_bins(10, 0, 1))
>>> m = mdt.Table(mlib, features=(restyp1,restyp2,gap))
>>> print m[0,0,0]
You can also access an element as m[0][0][0], a 1D section as m[0][0], or a 2D section as m[0]. See TableSection.
Add data from a Modeller alignment to this MDT.
Parameters : |
|
---|
Add data from a Modeller alignment to this MDT. Same as add_alignment except the errors in data are taken into account. The parameter errorscale controls how the error is used:
- 0: the errors are ignored; this function is the same as
add_alignment.
- >0 : the errors are taken into account by propagating the errors
in each axis of each atom into the calculated distances or angles. The errors in the position of individual atoms are first calculated using B-iso, X-ray resolution, and R-factor, and then divided by this errorscale value.
Attempt to ‘close’ the MDT, so that it is useful for creating splines of periodic features.
If dimensions = 1, it makes the two terminal points equal to their average. If dimensions = 2, it applies the averages to both pairs of edges and then again to all four corner points.
Returns: | the closed MDT. |
---|---|
Return type: | Table |
If bin_type is specified, it is the storage type to convert the bin data to (see Storage for bin data).
Returns: | a copy of this MDT table. |
---|---|
Return type: | Table |
Print full entropy information.
The MDT is integrated to get a 1D histogram, then normalized by the sum of the bin values. Finally, entropy is calculated as Σi -pi ln pi
Returns: | the entropy of the last dependent variable. |
---|---|
Return type: | float |
Apply an exponential transform to the MDT. Each element in the new MDT, b, is obtained from the original MDT element a, using the following relation: b = offset + exp(expoffset + multiplier * a ^ power).
Return type: | Table |
---|
Integrate the MDT, and reorder the features. This is useful for squeezing large MDT arrays into smaller ones, and also for eliminating unwanted features (such as X-ray resolution) in preparation for Table.write().
Parameters : |
|
---|---|
Returns: | the integrated MDT. |
Return type: |
Apply an inverse transform to the MDT. Each element in the new MDT, b, is obtained from the original MDT element a, using the following relation: b = offset + multiplier / a. Where a is zero, b is assigned to be undefined.
Returns: | the transformed MDT. |
---|---|
Return type: | Table |
Apply a linear transform to the MDT. Each element in the new MDT, b, is obtained from the original MDT element a, using the following relation: b = offset + a * multiplier.
Returns: | the transformed MDT. |
---|---|
Return type: | Table |
Apply a log transform to the MDT. Each element in the new MDT, b, is obtained from the original MDT element a, using the following relation: b = ln(offset + multiplier * a). Where this would involve the logarithm of a negative number, b is assigned to be undefined.
Returns: | the transformed MDT. |
---|---|
Return type: | Table |
Clear the table, and set the features. features must be a list of previously created objects from the mdt.features module.
Number of protein pairs
Number of proteins
Normalize or scale the MDT. It does not really matter what the contents of the input MDT are; sensible contents include the raw or normalized frequencies.
Parameters : |
|
---|---|
Returns: | the normalized MDT. |
Return type: |
Offset the MDT by the minimum value, either in each 1D section (dimensions = 1) or in each 2D section (dimensions = 2).
Returns: | the transformed MDT. |
---|---|
Return type: | Table |
Open a Modeller alignment to allow MDT indices to be queried (see Source). Arguments are as for Table.add_alignment().
Return type: | Source |
---|
Whether this MDT is a PDF
Read an MDT from file.
Read an MDT in HDF5 format from file.
Reorder the MDT features and optionally decrease their ranges. When an MDT is created, each feature has exactly the bins defined in the Library‘s bin file. However, for each feature, you can change the offset (initial number of bins from the bin file to omit) from the default 0, and the shape (total number of bins).
All parameters should be lists with the same number of elements as the MDT has features.
Parameters : |
|
---|---|
Returns: | the reshaped MDT. |
Return type: |
Number of sample points
Smooth the MDT with a uniform prior. The MDT is treated either as a histogram (if dimensions = 1) or a 2D density (dimensions = 2) of dependent features (the last 1 or 2 features in the table) and a uniform distribution is added followed by scaling:
pi = w1 / n + w2 vi / S
S = Σin vi
w1 = 1 / ( 1 + S / (weight * n))
w2 = 1 - w1
where v is the input MDT array, n is the number of bins in the histogram, and p is the output MDT array, smoothed and normalized. weight is the number of points per bin in the histogram at which the relative weights of the input histogram and the uniform prior are equal.
The sum of the bins in the output MDT array is 1, for each histogram.
Note that the resulting output MDT array is not necessarily a PDF, because the bin widths are not taken into account during scaling. That is, the sum of all bin values multiplied by the bin widths is not 1 if the bin widths are not 1.
Returns: | the smoothed MDT. |
---|---|
Return type: | Table |
Multi-level smoothing. This super-smoothes the raw frequencies in the MDT using the hierarchical smoothing procedure for 1D histograms described in Sali and Blundell, JMB 1993. It was also employed in Sali and Overington, Prot Sci. 1994.
Briefly, the idea is to recursively construct the best possible prior distribution for smoothing 1D data p(x | a, b, c, ...). The best prior is a weighted sum (weights optionally based on entropy) of the best possible estimate of p(x | a, b, ...) integrated over c for each c. Each one of these can itself be obtained from a prior and the data, and so on recursively.
The example above is for a single dependent feature (x), which is the case when dimensions = 1. x should be the last feature in the table. dimensions can be set to other values if you have more dependent features - for example, dimensions = 2 will work with p(x, y | a, b, c, ...) where x and y are the last two features in the table.
Parameters : |
|
---|---|
Returns: | the smoothed MDT. |
Return type: |
True if a symmetric scan can be performed
Write the table to file. If write_preamble is False, it will only write out the contents of the MDT table, without the preamble including the feature list, bins, etc. This is useful for example for creating a file to be read by another program, such as Mathematica.
Make input files for ASGL.
Parameters : |
|
---|
Write an MDT in HDF5 format to file.
A section of a multi-dimensional table. You should not create TableSection objects directly, but rather by indexing a Table object, as a TableSection is just a ‘view’ into an existing table. For example,
>>> m = mdt.Table(mlib, features=(residue_type, xray_resolution))
>>> print m[0].entropy()
would create a section (using m[0]) which is a 1D table over the 2nd feature (X-ray resolution) for the first bin (0) of the first feature (residue type), and then get the entropy using the TableSection.entropy() method.
Entropy of all points in the table
Mean and standard deviation of the table
Array offsets; see Feature.offset
Array shape; the number of bins for each feature
Sum of all points in the table
A single feature in an MDT. Generally accessed as TableSection.features.
Integer type
Offset of first bin compared to the MDT library feature (usually 0, but can be changed with Table.reshape())
Whether feature is periodic
A single bin in a feature. Generally accessed as Feature.bins.
Bin range; usually the minimum and maximum floating-point values for the feature to be placed in this bin.
Bin symbol
A source of data for an MDT (generally a Modeller alignment, opened with Table.open_alignment()).
Return the bin index (starting at 1) of a single MDT feature. (Arguments ending in 2 and 3 are used for features involving pairs or triples of proteins.)
Warning
This is a low-level interface, and no bounds checking is performed on these parameters. Avoid this function if possible.
Parameters : |
|
---|
Scan all data points in the source, and return the sum. See Table.add_alignment() for a description of the residue_span_range, chain_span_range and exclude_* arguments.
The full MDT version number, as a string, e.g. ‘5.0’ or ‘SVN’.
For release builds, the major and minor version numbers as a tuple of integers - e.g. (5, 0). For SVN builds, this is the same as ‘version’.
Make a list of num equally-sized bins, each of which has the given width, and starting at start. This is suitable for input to any of the classes in mdt.features which need a list of bins.
Write out a Modeller bond library file from an MDT. The input MDT should be a 2D table (usually of bond type and bond distance). For each bond type, the 1D MDT section (see TableSection) of bond distance is examined, and its mean and standard deviation used to generate a Modeller harmonic restraint.
Parameters : |
|
---|
Write out a Modeller angle library file from an MDT. See write_bondlib() for more details. The MDT should be a 2D table, usually of angle type and bond angle.
Write out a Modeller dihedral angle library file from an MDT. See write_bondlib() for more details. The MDT should be a 2D table, usually of dihedral type and bond dihedral angle.
Write out a Modeller 1D spline library file from an MDT. The MDT should be a 2D table, usually of residue type and a chi dihedral angle. dihtype should identify the dihedral type (i.e. chi1/chi2/chi3/chi4). The operation is similar to write_bondlib(), but each MDT section is treated as the spline values. No special processing is done, so it is expected that the user has first done any necessary transformations (e.g. normalization with Table.normalize() to convert raw counts into a PDF, negative log transform with Table.log_transform() and Table.linear_transform() to convert a PDF into a statistical potential).
Write out a Modeller 2D spline library file from an MDT. See write_splinelib() for more details. The input MDT should be a 3D table, e.g. of residue type, phi angle, and psi angle.