def __init__ (self, target, compound_lib=None, custom_compounds=None, inclusion_radius=15, sequence_separation=0, symmetry_settings=None, seqres_mapping=dict(), bb_only=False)
def ref_indices (self)
def ref_distances (self)
def sym_ref_indices (self)
def sym_ref_distances (self)
def ref_indices_sc (self)
def ref_distances_sc (self)
def sym_ref_indices_sc (self)
def sym_ref_distances_sc (self)
def ref_indices_ic (self)
def ref_distances_ic (self)
def sym_ref_indices_ic (self)
def sym_ref_distances_ic (self)
def lDDT (self, model, thresholds=[0.5, 1.0, 2.0, 4.0], local_lddt_prop=None, local_contact_prop=None, chain_mapping=None, no_interchain=False, no_intrachain=False, penalize_extra_chains=False, residue_mapping=None, return_dist_test=False, check_resnames=True, add_mdl_contacts=False)
def GetNChainContacts (self, target_chain, no_interchain=False)
lDDT scorer object for a specific target.
Sets up everything needed to score models of that target. lDDT (local
distance difference test) is defined as the fraction of pairwise distances
that differ by less than a threshold between target and model. If multiple
thresholds are given, the average over all thresholds is returned. See
V. Mariani, M. Biasini, A. Barbato, T. Schwede, lDDT : A local
superposition-free score for comparing protein structures and models using
distance difference tests, Bioinformatics, 2013
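The definition above can be illustrated with a minimal pure-Python sketch. It is not the scorer's implementation: the function name `lddt_sketch` and the plain coordinate lists are hypothetical, atoms are assumed to correspond by index, and symmetry handling, sequence separation and chain mapping are all ignored.

```python
import math

def lddt_sketch(target_xyz, model_xyz,
                thresholds=(0.5, 1.0, 2.0, 4.0), inclusion_radius=15.0):
    """Toy lDDT: atom i in target_xyz is assumed to match atom i in model_xyz."""
    conserved = 0
    total = 0
    n = len(target_xyz)
    for i in range(n):
        for j in range(i + 1, n):
            d_target = math.dist(target_xyz[i], target_xyz[j])
            # Only distances below the inclusion radius in the target count.
            if d_target >= inclusion_radius:
                continue
            d_model = math.dist(model_xyz[i], model_xyz[j])
            diff = abs(d_target - d_model)
            # Each threshold is tested separately; pooling the counts is
            # equivalent to averaging the per-threshold fractions.
            for threshold in thresholds:
                total += 1
                if diff < threshold:
                    conserved += 1
    # No contacts in the target means there is nothing to score.
    return conserved / total if total else None

target = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
shifted = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 5.5, 0.0)]
print(lddt_sketch(target, target))
print(lddt_sketch(target, shifted))
```

An identical model conserves every distance at every threshold, while moving one atom reduces the count only for the thresholds that the distance change exceeds.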
:param target: The target
:type target: :class:`ost.mol.EntityHandle`/:class:`ost.mol.EntityView`
:param compound_lib: Compound library from which a compound for each residue
is extracted based on its name. Uses
:func:`ost.conop.GetDefaultLib` if not given, raises
if this returns no valid compound library. Atoms
defined in the compound are searched in the residue and
build the reference for scoring. If the residue has
atoms with names ["A", "B", "C"] but the corresponding
compound only has ["A", "B"], "A" and "B" are
considered for scoring. If the residue has atoms
["A", "B"] but the compound has ["A", "B", "C"], "C" is
considered missing and does not influence scoring, even
if present in the model.
:param custom_compounds: Custom compounds defining reference atoms. If
given, *custom_compounds* takes precedence over
*compound_lib*.
:type custom_compounds: :class:`dict` with residue names (:class:`str`) as
key and :class:`CustomCompound` as value.
:type compound_lib: :class:`ost.conop.CompoundLib`
:param inclusion_radius: All pairwise distances < *inclusion_radius* are
considered for scoring
:type inclusion_radius: :class:`float`
:param sequence_separation: Only pairwise distances between atoms of
residues which are further apart than this
threshold are considered. Residue distance is
based on resnum. The default (0) considers all
pairwise distances except intra-residue
distances.
:type sequence_separation: :class:`int`
:param symmetry_settings: Define residues exhibiting internal symmetry, uses
:func:`GetDefaultSymmetrySettings` if not given.
:type symmetry_settings: :class:`SymmetrySettings`
:param seqres_mapping: At the scoring stage, model residues are mapped
using residue numbers that define their location
in a reference sequence (SEQRES) with one-based
indexing. If the residue numbers in *target* don't
correspond to that SEQRES, you can specify the
mapping manually. You can provide a dictionary to
specify a reference sequence (SEQRES) for one or more
chain(s). Key: chain name, value: alignment
(seq1: SEQRES, seq2: sequence of residues in chain).
Example: The residues in a chain with name "A" have
sequence "YEAH" and residue numbers [42,43,44,45].
You can provide an alignment with seq1 "``HELLYEAH``"
and seq2 "``----YEAH``". "Y" gets assigned residue
number 5, "E" gets assigned 6 and so on no matter
what the original residue numbers were.
:type seqres_mapping: :class:`dict` (key: :class:`str`, value:
:class:`ost.seq.AlignmentHandle`)
:param bb_only: Only consider atoms with name "CA" in case of amino acids and
"C3'" in case of nucleotides. This invalidates *compound_lib*.
Raises if any residue in *target* is neither
`r.chem_class.IsPeptideLinking()` nor
`r.chem_class.IsNucleotideLinking()`.
:type bb_only: :class:`bool`
:raises: :class:`RuntimeError` if *target* contains a compound that is not in
*compound_lib*, :class:`RuntimeError` if *symmetry_settings*
specifies symmetric atoms that are not present in the corresponding
compound in *compound_lib*, :class:`RuntimeError` if
*seqres_mapping* is not provided and *target* contains residue
numbers with insertion codes or the residue numbers for each chain
are not monotonically increasing, :class:`RuntimeError` if
*seqres_mapping* is provided but an alignment is invalid
(seq1 contains gaps, mismatch in seq1/seq2, seq2 does not match
residues in corresponding chains).
Definition at line 116 of file lddt.py.
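The renumbering described for *seqres_mapping* can be sketched in plain Python. This is an illustration only: `renumber_from_seqres` is a hypothetical helper that works on aligned strings rather than :class:`ost.seq.AlignmentHandle` objects, and its checks merely mirror the invalid-alignment conditions listed above.

```python
def renumber_from_seqres(seqres_row, chain_row):
    """Return (one-letter code, residue number) pairs for the chain residues.

    seqres_row: aligned SEQRES string (seq1), must not contain gaps.
    chain_row:  aligned chain sequence (seq2), gaps written as '-'.
    """
    if len(seqres_row) != len(chain_row):
        raise ValueError("aligned rows must have the same length")
    if "-" in seqres_row:
        raise ValueError("SEQRES row (seq1) must not contain gaps")
    numbered = []
    # Alignment columns are counted from 1, matching one-based SEQRES indexing.
    for column, (ref, res) in enumerate(zip(seqres_row, chain_row), start=1):
        if res == "-":
            continue  # residue not resolved in the chain
        if res != ref:
            raise ValueError("mismatch between seq1 and seq2 in column %d" % column)
        numbered.append((res, column))
    return numbered

# Chain "A" from the example: sequence "YEAH", original numbers 42..45.
print(renumber_from_seqres("HELLYEAH", "----YEAH"))
```

The original residue numbers never enter the calculation; only the column position in the alignment determines the assigned number.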
def lDDT ( self, model, thresholds = [0.5, 1.0, 2.0, 4.0], local_lddt_prop = None, local_contact_prop = None, chain_mapping = None, no_interchain = False, no_intrachain = False, penalize_extra_chains = False, residue_mapping = None, return_dist_test = False, check_resnames = True, add_mdl_contacts = False )
Computes lDDT of *model*, globally and per-residue.
:param model: Model to be scored. Models should preferably pass
stereo-chemistry checks before scoring, so that nonsensical
irregularities are penalized. This must be done separately
as a pre-processing step. Target contacts that are not
covered by *model* are considered not conserved, thus
decreasing the lDDT score. This also includes missing model
chains or model chains for which no mapping is provided in
*chain_mapping*.
:type model: :class:`ost.mol.EntityHandle`/:class:`ost.mol.EntityView`
:param thresholds: Thresholds of distance differences to be considered
as correct - see docs in constructor for more info.
default: [0.5, 1.0, 2.0, 4.0]
:type thresholds: :class:`list` of :class:`float`
:param local_lddt_prop: If set, per-residue scores will be assigned as
generic float property of that name
:type local_lddt_prop: :class:`str`
:param local_contact_prop: If set, the number of expected contacts as well
as the number of conserved contacts will be
assigned as generic int properties.
Expected contacts will be set as
<local_contact_prop>_exp, conserved contacts
as <local_contact_prop>_cons. Values
are summed over all thresholds.
:type local_contact_prop: :class:`str`
:param chain_mapping: Mapping of model chains (key) onto target chains
(value). This is required if target or model have
more than one chain.
:type chain_mapping: :class:`dict` with :class:`str` as keys/values
:param no_interchain: Whether to exclude interchain contacts
:type no_interchain: :class:`bool`
:param no_intrachain: Whether to exclude intrachain contacts (i.e. only
consider interface related contacts)
:type no_intrachain: :class:`bool`
:param penalize_extra_chains: Whether to include a fixed penalty for
additional chains in the model that are
not mapped to the target. ONLY AFFECTS
RETURNED GLOBAL SCORE. In detail: adds the
number of intra-chain contacts of each
extra chain to the expected contacts, thus
adding a penalty.
:type penalize_extra_chains: :class:`bool`
:param residue_mapping: By default, residue mapping is based on residue
numbers. That means, a model chain and the
respective target chain map to the same
underlying reference sequence (SEQRES).
Alternatively, you can specify one or
several alignment(s) between model and target
chains by providing a dictionary. key: Name
of chain in model (respective target chain is
extracted from *chain_mapping*),
value: Alignment with first sequence
corresponding to target chain and second
sequence to model chain. There is NO reference
sequence involved, so the two sequences MUST
exactly match the actual residues observed in
the respective target/model chains (ATOMSEQ).
:type residue_mapping: :class:`dict` with key: :class:`str`,
value: :class:`ost.seq.AlignmentHandle`
:param return_dist_test: Whether to additionally return the underlying
per-residue data for the distance difference
test. Adds five objects to the return tuple.
First: number of total contacts summed over all
thresholds.
Second: number of conserved contacts summed
over all thresholds.
Third: list whose length is the number of
scored residues; contains indices referring to
model.residues.
Fourth: numpy array of size
len(scored_residues) containing the number of
total contacts.
Fifth: numpy matrix of shape
(len(scored_residues), len(thresholds))
specifying how many contacts are conserved for
each threshold.
:type return_dist_test: :class:`bool`
:param check_resnames: On by default. Enforces residue name matches
between mapped model and target residues.
:type check_resnames: :class:`bool`
:param add_mdl_contacts: Adds model contacts. Using only contacts that
are within a certain distance threshold in the
target does not penalize added model contacts.
If set to True, contacts that are within the
specified distance threshold in the model, but
not necessarily in the target, are considered
as well. No contact will be added if the
respective atom pair is not resolved in the
target.
:type add_mdl_contacts: :class:`bool`
:returns: global and per-residue lDDT scores as a tuple -
first element is global lDDT score (None if *target* has no
contacts) and second element a list of per-residue scores with
length len(*model*.residues). None is assigned to residues that
are not covered by target. If a residue is covered but has no
contacts in *target*, 0.0 is assigned.
Definition at line 428 of file lddt.py.
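How the extra objects returned with *return_dist_test* relate to the scores can be sketched as follows. This is a hedged illustration, not the library code: `scores_from_dist_test` is a hypothetical helper, plain lists stand in for the numpy arrays, the numbers are invented toy values, and it is assumed that the per-residue total (fourth object) counts each contact once rather than once per threshold.

```python
def scores_from_dist_test(n_model_residues, total_sum, conserved_sum,
                          scored_indices, per_res_total, per_res_conserved):
    """Rebuild global and per-residue scores from distance test data."""
    # Residues that are not covered by the target keep None.
    per_residue = [None] * n_model_residues
    for idx, total, conserved_row in zip(scored_indices, per_res_total,
                                         per_res_conserved):
        if total == 0:
            # Covered residue without contacts in the target scores 0.0.
            per_residue[idx] = 0.0
        else:
            # Average the conserved fraction over all thresholds.
            per_residue[idx] = sum(conserved_row) / (len(conserved_row) * total)
    # The global sums are already summed over thresholds.
    global_score = conserved_sum / total_sum if total_sum else None
    return global_score, per_residue

# Toy data: three model residues, residue 1 is not covered by the target.
global_score, local_scores = scores_from_dist_test(
    n_model_residues=3,
    total_sum=8,
    conserved_sum=7,
    scored_indices=[0, 2],
    per_res_total=[2, 0],
    per_res_conserved=[[1, 2, 2, 2], [0, 0, 0, 0]],
)
print(global_score, local_scores)
```

Keeping the raw counts rather than only the ratios makes it possible to pool contacts across several models or chains before dividing.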