mol.alg – Algorithms for Structures
-
LocalDistDiffTest(model, reference, tolerance, radius, local_ldt_property_string="")
This function calculates the agreement of local contacts between a model and
a reference structure (Local Distance Difference Tests). The overlap is a number
between zero and one, where one indicates complete agreement, zero indicates no
agreement at all. This score is similar to the GDT, but does not require any
superposition between the model and the reference.
The distance of atom pairs in the reference structure, when shorter than a
certain predefined distance (inclusion radius), is compared with the same distance in
the model. If the difference between these two distances is smaller than a
threshold value (tolerance), the distance is considered conserverd in the model. Missing atoms
in the model lead to non-conserved distances and thus lower the score.
The function only processes standard residues in the first chains of the model and of the reference
For residues with symmetric sidechains (GLU, ASP, ARG, VAL, PHE, TYR), the
naming of the atoms is ambigous. For these residues, the function computes the Local Distance Difference
Test score that each naming convention would generate when considering all non-ambigous surrounding atoms.
The solution that gives higher score is then picked to compute the final Local Difference
Distance Score for the whole model.
If a string is passed as last parameter to the function, the function computes the Local Difference Distance
Test score for each residue and saves it as a float property in the ResidueHandle, with the passed string
as property name. Additionally, the actual residue-based counts of the total checked distances and of
the distances conserved in the model are stored as integer properties in the ResidueHandle. The property
names are respectively <passed string>_total and <passed string>_conserved.
Parameters: |
- model (EntityView) – the model structure
- reference (EntityView) – the reference structure
- tolerance – the tolerance threshold used to determine distance conservation
- radius – the inclusion radius in Angstroms
- local_ldt_property_string – the base name for the ResidueHandle properties that store the local scores
|
Returns: | the Local Distance Difference Test score
|
-
LocalDistDiffTest(model, distance_list, tolerance_list, sequence_separation=0, local_ldt_property_string="")
This function counts the conserved local contacts between the model and the reference structure
(these are the values needed to compute the Local Distance Difference Test score, see description of
the previous function). It shares the same properties as the previous function, with some differences:
the thresholds can be more than one (the return counts are then the average over all thresholds), and
the input is not the reference structure, but already a list of distances to be checked for conservation
A sequence separation parameter can be passed to the function. If this happens, only distances between residues
whose separation is higher than the provided parameter are considered when computing the score
If a string is passed as the last parameter, residue-based counts and the value of the residue-based Local
Distance Difference Test score are saved in each ResidueHandle as int and float properties, as detailed in
the description of the previous function.
Parameters: |
- model (EntityView) – the model structure
- distance_list (GlobalRDMap) – the list of distances to check for conservation
- tolerance_list – a list of thresholds used to determine distance conservation
- sequence_separation – sequence separation parameter used when computing the score
- local_ldt_property_string – the base name for the ResidueHandle properties that store the local scores
|
Returns: | a tuple containing the counts of the conserved distances in the model and of all the checked
distances
|
-
LocalDistDiffTest(alignment, tolerance, radius, ref_index=0, mdl_index=1);
Calculates the Local Distance Difference Test score (see previous functions) starting from an
alignment between a reference structure and a model. The AlignmentHandle parameter used to provide the
alignment to the function needs to have the two structures attached to it. By default the first structure in the
alignment is considered to be the reference structure, and the second structure is taken as the model. This
can however be changed by passing the indexes of the two structures in the AlignmentHandle as parameters to the
function.
BEWARE: This function uses the old implementation of the Local Distance Difference Test algorithm and
will give slightly different results from the new one.
Parameters: |
- alignment (AlignmentHandle) – an alignment containing the sequences of the reference and of the model structures, with the structures themselves
attached
- tolerance – a list of thresholds used to determine distance conservation
- radius – the inclusion radius in Angstroms (to determine which distances are checked for conservation)
- ref_index – index of the reference structure in the alignment
- mdl_index – index of the model in the alignment
|
Returns: | the Local Distance Difference Test score
|
-
LDDTHA(model, distance_list, sequence_separation=0);
This function calculates the Local Distance Difference Test - High Accuracy score (see previous functions).
The High Accuracy name comes from the fact that the tolerance levels used by this function are the same
as the thresholds used by GDT-HA (0.5, 1, 2, and 4 Angstrom).
The function only compares the input distance list to the first chain of the model structure
The local Local Distance Difference Test score values are stored in the ResidueHandles of the model passed to the
function in a float property called “locallddt”
A sequence separation parameter can be passed to the function. If this happens, only distances between residues
whose separation is higher than the provided parameter are considered when computing the score
Parameters: |
- model (EntityView) – the model structure
- distance_list (GlobalRDMap) – the list of distances to check for conservation
- sequence_separation – sequence separation parameter used when computing the score
|
Returns: | the Local Distance Difference Test score
|
-
CreateDistanceList(reference, radius);
-
CreateDistanceListFromMultipleReferences(reference_list, tolerance_list, sequence_separation, radius);
Both these functions create lists of distances to be checked during a Local Distance Difference Test
(see description of the functions above).
Both functions process only standard residues present in the first chain of the reference structures.
The only difference between the two functions is that one takes a single reference structure and the other
a list of reference structures. The structures in the list have to be properly prepared before being passed
to the function. Corresponding residues in the structures must have the same residue number, the same chain name,
etc. Gaps are allowed and automatically dealt with: if information about a distance is present in at least one of
the structures, it will be considered.
If a distance between two atoms is shorter than the inclusion radius in all structures in which the two atoms are
present, it is included in the list. However, if the distance is longer than the inclusion radius in at least
one of the structures, it is not considered to be a local interaction and is excluded from the list.
The multiple-reference function takes care of residues with ambigous symmetric sidechains. To decide which naming
convention to use, the function computes a Local Distance Difference Test score for each reference against the
first reference structure in the list, using only non ambigously-named atoms. It picks then the naming convention
that gives the highest score, guaranteeing that all references are processed with the correct atom names.
The cutoff list that will later be used to compute the Local Distance Difference Test score and the sequence
separation parameter must be passed to the multi-reference function. These parameters do not influence the output
distance list, which always includes all distances within the provided radius (to make it consistent with the
single-reference corresponding function). However, the parameters are used when dealing with the naming convention
of residues with ambiguous nomenclature.
Parameters: |
- reference (EntityView) – a reference structure from which distances are derived
- reference_list (list of EntityView) – a list of of reference structures from which distances are derived
- tolerance_list – a list of thresholds used to determine distance conservation when computing the LDDT score
- sequence_separation – sequence separation parameter used when computing the LDDT score
- radius – inclusion radius (in Angstroms) used to determine the distances included in the list
|
Returns: | class ~ost.mol.alg.GlobalRDMap
|
-
class UniqueAtomIdentifier
Object containing enough information to uniquely identify an atom in a structure
-
UniqueAtomIdentifier(chain, residue_number, residue_name, atom_name)
Creates an UniqueAtomIdentifier object starting from relevant atom information
Parameters: |
- chain – a string containing the name of the chain to which the atom belongs
- residue_number (ResNum) – the number of the residue to which the atom belongs
- residue_name – a string containing the name of the residue to which the atom belongs
- atom_name – a string containing the name of the atom
|
-
GetChainName()
Returns the name of the chain to which the atom belongs, as a String
-
GetResNum()
Returns the number of the residue the atom belongs to, as a ResNum object
-
GetResidueName()
Returns the name of the residue to which the atom belongs, as a String
-
GetAtomName()
Returns the name of the atom, as a String
-
class ResidueRDMap
Dictionary-like object containing the a list of distances that originate from the a single residue residue, to
check during a run of the Local Distance Difference Test algorithm
-
class GlobalRDMap
Dictionary-like object containing all the ResidueRDMap objects related to residues
of a single structure
Steric Clashes
The following function detects steric clashes in atomic structures. Two atoms are clashing if their euclidian distance is smaller than a threshold value (minus a tolerance offset).
-
FilterClashes(entity, clashing_distances, always_remove_bb=False)
This function filters out residues with non-bonded clashing atoms. If the clashing atom
is a backbone atom, the complete residue is removed from the structure, if the atom is part of
the sidechain, only the sidechain atoms are removed. This behavior is changed
by the always_remove_bb flag: when the flag is set to True the whole residue is removed even if
a clash is just detected in the side-chain.
The function performs the filtering directly on the the entity which is passed as an argument. The entity
gets altered by the function.
Two atoms are defined as clashing if their distance is shorter than the reference distance minus a tolerance
threshold. The information about the clashing distances and the tolerance thresholds for all possible pairs of
atoms is passed to the function as a parameter
Hydrogen and deuterium atoms are ignored by this function.
Parameters: |
- entity (EntityView or EntityHandle) – The input entity
- clashing_distances (ClashingDistances) – information about the clashing distances
- always_remove_bb – if set to True, the whole residue is removed even if the clash happens in the side-chain
|
Returns: | The filtered EntityView
|
-
CheckStereoChemistry(entity, bond_stats, angle_stats, bond_tolerance, angle_tolerance, always_remove_bb=False)
This function filters out residues with severe stereo-chemical violations. If the violation
involves a backbone atom, the complete residue is removed from the structure, if it involves an atom that is
part of the sidechain, only the sidechain is removed. This behavior is changed
by the always_remove_bb flag: when the flag is set to True the whole residue is removed even if
a violation is just detected in the side-chain
The function performs the filtering directly on the the entity which is passed as an argument. The entity
gets altered by the function.
A violation is defined as a bond length that lies outside of the range: [mean_length-std_dev*bond_tolerance <-> meanlength+std_dev*bond_tolerance] or an angle width lying outside of the range [mean_width-std_dev*angle_tolerance <-> mean_width+std_dev*angle_tolerance ]. The information about the mean lengths and widths and the corresponding standard deviations is passed to the function using two parameters.
Hydrogen and deuterium atoms are ignored by this function.
Parameters: |
- entity (EntityView or EntityHandle) – The input entity
- bond_stats (StereoChemicalParams) – statistics about bond lengths
- angle_stats (StereoChemicalParams) – statistics about angle widths
- bond_tolerance – tolerance for bond lengths (in standard deviations)
- angle_tolerance – tolerance for angle widths (in standard deviations)£
- always_remove_bb – if set to True, the whole residue is removed even if a violation in just detected in the side-chain
|
Returns: | The filtered EntityView
|
-
class ClashingDistances
Object containing information about clashing distances between non-bonded atoms
-
ClashingDistances()
Creates an empty distance list
-
SetClashingDistance(ele1, ele2, clash_distance, tolerance)
Adds or replaces an entry in the list
Parameters: |
- ele1 – string containing the first element’s name
- ele2 – string containing the second element’s name
- clash_distance – minimum clashing distance (in Angstroms)
- tolerance – tolerance threshold (in Angstroms)
|
-
GetMaxAdjustedDistance()
Returns the longest clashing distance in the list, after adjustment with tolerance threshold
-
IsEmpty()
Returns True if the list is empty (i.e. in an invalid, useless state)
-
PrintAllDistances()
Prints all distances in the list to standard output
-
class StereoChemicalParams
Object containing stereo-chemical information about bonds and angles. For each item (bond or angle
in a specific residue), stores the mean and standard deviation
-
StereoChemicalParams()
Creates an empty parameter list
-
SetParam(item, residue, mean, standard_dev)
Adds or replaces an entry in the list
Parameters: |
- item – string defining a bond (format: X-Y) or an angle (format:X-Y-Z), where X,Y an Z are atom names
- residue – string containing the residue type the information pertains to
- mean – mean bond length or angle width
- standard_dev – standard deviation of the bond length or of the angle width
|
-
IsEmpty()
Returns True if the list is empty (i.e. in an invalid, useless state)
-
PrintAllParameters()
Prints all distances in the list to standard output
-
FillClashingDistances(file_content)
-
FillBondStereoChemicalParams(file_content)
-
FillAngleStereoChemicalParams(file_content)
These three functions fill a list of reference clashing distances, a list of stereo-chemical parameters for
bonds and a list of stereo-chemical parameters for angles, respectively, starting from a the content of
parameter file. The content of the file is passed to the function as a list of strings, each containing
a line from the parameter file
-
FillClashingDistancesFromFile(filename)
-
FillBondStereoChemicalParamsFromFile(filename)
-
FillAngleStereoChemicalParamsFromFile(filename)
These three functions fill a list of reference clashing distances, a list of stereo-chemical parameters for
bonds and a list of stereo-chemical parameters for angles, respectively, starting from a file. The filename
passed to the function can be a full path.
-
DefaultClashingDistances()
-
DefaultBondStereoChemicalParams()
-
DefaultAngleStereoChemicalParams()
These three functions fill a list of reference clashing distances, a list of stereo-chemical parameters for
bonds and a list of stereo-chemical parameters for angles, respectively, using the default parameter file
distributed with OpenStructure.
Trajectory Analysis
This is a set of functions used for basic trajectory analysis such as extracting
positions, distances, angles and RMSDs. The organization is such that most
functions have their counterpart at the individual frame level so that they can also be called on one frame instead
of the whole trajectory.
All these functions have a “stride” argument that defaults to stride=1, which is
used to skip frames in the analysis.
-
AnalyzeAtomPos(traj, atom1, stride=1)
This function extracts the position of an atom from a trajectory. It returns
a vector containing the position of the atom for each analyzed frame.
Parameters: |
- traj (CoordGroupHandle) – The trajectory to be analyzed.
- atom1 – The AtomHandle.
- stride – Size of the increment of the frame’s index between two
consecutive frames analyzed.
|
-
AnalyzeCenterOfMassPos(traj, sele, stride=1)
This function extracts the position of the center-of-mass of a selection
(EntityView) from a trajectory and returns it as a vector.
Parameters: |
- traj (CoordGroupHandle) – The trajectory to be analyzed.
- sele (EntityView.) – The selection from which the center of mass is computed
- stride – Size of the increment of the frame’s index between two
consecutive frames analyzed.
|
-
AnalyzeDistanceBetwAtoms(traj, atom1, atom2, stride=1)
This function extracts the distance between two atoms from a trajectory
and returns it as a vector.
Parameters: |
- traj (CoordGroupHandle) – The trajectory to be analyzed.
- atom1 – The first AtomHandle.
- atom2 – The second AtomHandle.
- stride – Size of the increment of the frame’s index between two
consecutive frames analyzed.
|
-
AnalyzeAngle(traj, atom1, atom2, atom3, stride=1)
This function extracts the angle between three atoms from a trajectory
and returns it as a vector. The second atom is taken as being the central
atom, so that the angle is between the vectors (atom1.pos-atom2.pos) and
(atom3.pos-atom2.pos).
Parameters: |
- traj (CoordGroupHandle) – The trajectory to be analyzed.
- atom1 – The first AtomHandle.
- atom2 – The second AtomHandle.
- atom3 – The third AtomHandle.
- stride – Size of the increment of the frame’s index between two
consecutive frames analyzed.
|
-
AnalyzeDihedralAngle(traj, atom1, atom2, atom3, atom4, stride=1)
This function extracts the dihedral angle between four atoms from a trajectory
and returns it as a vector. The angle is between the planes containing the
first three and the last three atoms.
-
AnalyzeDistanceBetwCenterOfMass(traj, sele1, sele2, stride=1)
This function extracts the distance between the center-of-mass of two
selections (EntityView) from a trajectory and returns it as
a vector.
Parameters: |
- traj (CoordGroupHandle) – The trajectory to be analyzed.
- sele1 (EntityView.) – The selection from which the first center of mass is computed
- sele2 (EntityView.) – The selection from which the second center of mass is computed
- stride – Size of the increment of the frame’s index between two
consecutive frames analyzed.
|
-
AnalyzeRMSD(traj, reference_view, sele_view, stride=1)
This function extracts the rmsd between two EntityView and
returns it as a vector. The views don’t have to be from the same entity. The
reference positions are taken directly from the reference_view, evaluated only
once. The positions from the sele_view are evaluated for each frame.
If you want to compare to frame i of the trajectory t, first use
t.CopyFrame(i) for example:
eh=io.LoadPDB(...)
t=io.LoadCHARMMTraj(eh,...)
sele=eh.Select(...)
t.CopyFrame(0)
mol.alg.AnalyzeRMSD(t,sele,sele)
Parameters: |
- traj (CoordGroupHandle) – The trajectory to be analyzed.
- reference_view (EntityView.) – The selection used as reference structure
- sele_view (EntityView.) – The selection compared to the reference_view
- stride – Size of the increment of the frame’s index between two
consecutive frames analyzed.
|
-
AnalyzeMinDistance(traj, view1, view2, stride=1)
This function extracts the minimal distance between two sets of atoms
(view1 and view2) for each frame in a trajectory and returns it as a vector.
Parameters: |
- traj (CoordGroupHandle) – The trajectory to be analyzed.
- view1 (EntityView.) – The first group of atoms
- view2 (EntityView.) – The second group of atoms
- stride – Size of the increment of the frame’s index between two
consecutive frames analyzed.
|
-
AnalyzeMinDistanceBetwCenterOfMassAndView(traj, view_cm, view_atoms, stride=1)
This function extracts the minimal distance between a set of atoms
(view_atoms) and the center of mass of a second set of atoms (view_cm)
for each frame in a trajectory and returns it as a vector.
Parameters: |
- traj (CoordGroupHandle) – The trajectory to be analyzed.
- view_cm (EntityView.) – The group of atoms from which the center of mass is taken
- view_atoms (EntityView.) – The second group of atoms
- stride – Size of the increment of the frame’s index between two
consecutive frames analyzed.
|
-
AnalyzeAromaticRingInteraction(traj, view_ring1, view_ring2, stride=1)
This function is a crude analysis of aromatic ring interactions. For each frame in a trajectory, it calculates
the minimal distance between the atoms in one view and the center of mass of the other
and vice versa, and returns the minimum between these two minimal distances.
Concretely, if the two views are the heavy atoms of two rings, then it returns the minimal
center of mass - heavy atom distance betweent he two rings
Parameters: |
- traj (CoordGroupHandle) – The trajectory to be analyzed.
- view_ring1 (EntityView.) – First group of atoms
- view_ring2 (EntityView.) – Second group of atoms
- stride – Size of the increment of the frame’s index between two
consecutive frames analyzed.
|
-
SuperposeFrames(frames, sel, from=0, to=-1, ref=-1)
This function superposes the frames of the given coord group and returns them
as a new coord group.
Parameters: |
- frames (CoordGroupHandle) – The source coord group.
- sel (ost.mol.EntityView) – An entity view containing the selection of atoms to be used for
superposition. If set to an invalid view, all atoms in the coord group are
used.
- from – index of the first frame
- to – index of the last frame plus one. If set to -1, the value is set to
the number of frames in the coord group
- ref – The index of the reference frame to use for superposition. If set
to -1, the each frame is superposed to the previous frame.
|
Returns: | A newly created coord group containing the superposed frames.
|
-
SuperposeFrames(frames, sel, ref_view, from=0, to=-1)
Same as SuperposeFrames above, but the superposition is done on a reference
view and not on another frame of the trajectory.
Parameters: |
- frames (CoordGroupHandle) – The source coord group.
- sel (ost.mol.EntityView) – An entity view containing the selection of atoms of the frames to be used for
superposition.
- ref_view (ost.mol.EntityView) – The reference view on which the frames will be superposed. The number
of atoms in this reference view should be equal to the number of atoms in sel.
- from – index of the first frame
- to – index of the last frame plus one. If set to -1, the value is set to
the number of frames in the coord group
|
Returns: | A newly created coord group containing the superposed frames.
|
-
ParseAtomNames(atoms)
Parses different representations of a list of atom names and returns a
set, understandable by MatchResidueByNum(). In
essence, this function translates
- None to None
- ‘all’ to None
- ‘backbone’ to set(['N', 'CA', 'C', 'O'])
- ‘aname1, aname2’ to set(['aname1', 'aname2'])
- ['aname1', 'aname2'] to set(['aname1', 'aname2'])
Parameters: | atoms (str, list, set) – Identifier or list of atoms |
Returns: | A set of atoms. |
-
MatchResidueByNum(ent_a, ent_b, atoms='all')
Returns a tuple of views containing exactly the same number of atoms.
Residues are matched by residue number. A subset of atoms to be included in
the views can be specified in the atoms argument. Regardless of what the
list of atoms says, only those present in two matched residues will be
included in the views. Chains are processed in the order they occur in the
entities. If ent_a and ent_b contain a different number of chains,
processing stops with the lower count.
Parameters: |
|
Returns: | Two EntityView instances with the same amount of
residues matched by number. Each residue will have the same number
& type of atoms.
|
-
MatchResidueByIdx(ent_a, ent_b, atoms='all')
Returns a tuple of views containing exactly the same number of atoms.
Residues are matched by position in the chains of an entity. A subset of
atoms to be included in the views can be specified in the atoms argument.
Regardless of what the list of atoms says, only those present in two
matched residues will be included in the views. Chains are processed in order
of appearance. If ent_a and ent_b contain a different number of
chains, processing stops with the lower count. The number of residues per
chain is supposed to be the same.
Parameters: |
|
Returns: | Two EntityView instances with the same amount of
residues matched by position. Each residue will have the same number
& type of atoms.
|
-
MatchResidueByLocalAln(ent_a, ent_b, atoms='all')
Match residues by local alignment. Takes ent_a and ent_b, extracts
the sequences chain-wise and aligns them in Smith/Waterman manner using the
BLOSUM62 matrix for scoring. The residues of the entities are then matched
based on this alignment. Only atoms present in both residues are included in
the views. Chains are processed in order of appearance. If ent_a and
ent_b contain a different number of chains, processing stops with
the lower count.
Parameters: |
|
Returns: | Two EntityView instances with the same number of
residues. Each residue will have the same number & type of atoms.
|
-
MatchResidueByGlobalAln(ent_a, ent_b, atoms='all')
Match residues by global alignment. Takes ent_a and ent_b, extracts
the sequences chain-wise and aligns them in Needleman/Wunsch manner using the
BLOSUM62 matrix for scoring. The residues of the entities are then matched
based on this alignment. Only atoms present in both residues are included in
the views. Chains are processed in order of appearance. If ent_a and
ent_b contain a different number of chains, processing stops with
the lower count.
Parameters: |
|
Returns: | Two EntityView instances with the same number of
residues. Each residue will have the same number & type of atoms.
|
-
Superpose(ent_a, ent_b, match='number', atoms='all')
Superposes the model entity onto the reference. To do so, two views are
created, returned with the result. atoms describes what goes in to these
views and match the selection method. For superposition,
SuperposeSVD() is called. For matching, following methods
are recognised:
Parameters: |
- ent_a (EntityView or EntityHandle) – The model entity
- ent_b (EntityView or EntityHandle) – The reference entity
- match (str) – Method to gather residues/ atoms
- atoms (str, list, set) – The subset of atoms to be used in the superposition.
|
Returns: | An instance of SuperpositionResult, containing members
|
|
Contents
Search
Enter search terms or a module, class or function name.
Previous topic
Trajectories
Next topic
conop – Connectivity and Topology of Molecules
You are here
|