Handling All Atom Positions

To represent and handle all atom loops, we provide a container (AllAtomPositions) to represent arbitrary amino acid sequences with the positions of all their heavy atoms and an environment (AllAtomEnv) to handle changes during loop modelling.

The example below showcases some operations on the two classes:

from ost import io, geom
from promod3 import loop

# load example (has res. numbering starting at 1)
ent = io.LoadPDB("data/1CRN.pdb")
res_list = ent.residues
seqres_str = ''.join([r.one_letter_code for r in res_list])

# initialize an environment
env = loop.AllAtomEnv(seqres_str)
env.SetInitialEnvironment(ent)

# get all atom position for part of it
all_atoms = loop.AllAtomPositions(res_list[35:44])
# change pos. at res. index 1 (= res. number 37)
idx_ca_res_37 = all_atoms.GetIndex(1, "CA")
new_pos = all_atoms.GetPos(idx_ca_res_37) + geom.Vec3(0.2, 0.2, 0.2)
all_atoms.SetPos(idx_ca_res_37, new_pos)
# insert into ent and store updated entity
all_atoms.InsertInto(1, ent.chains[0], 37)
io.SavePDB(ent, "all_atom_pos.pdb")

# update environment with new positions
env.SetEnvironment(all_atoms, 36)
# store all positions of environment
io.SavePDB(env.GetAllAtomPositions().ToEntity(), "all_atom_env.pdb")

The AllAtomEnv class

class promod3.loop.AllAtomEnv(seqres)

The all atom environment contains and handles positions of all atoms during loop modelling. It is linked to a (list of) seqres (one per chain) at construction. The idea is to initialize it at the beginning of the modelling process with all known positions and then update the environment whenever a new loop is being added.

Parameters:

seqres (str / ost.seq.SequenceHandle / list of str / ost.seq.SequenceList) – Internal SEQRES to be set (single chain or list with one per chain). Whenever setting structural data, consistency with this SEQRES is enforced.

Indexing to access parts of the environment:

  • chain_idx: Index of chain as it occurs in seqres (0 for single sequence)

  • start_resnum: Residue number defining the position in the SEQRES of chain with index chain_idx. The numbering starts for every chain with the value 1.

  • internal residue indexing: all residues of all chains are simply concatenated one after each other (indexing starts at 0)

SetInitialEnvironment(env_structure)

Sets full environment. Existing data is cleared first.

Parameters:

env_structure (ost.mol.EntityHandle) – Structral data to be set as environment. The chains in env_structure are expected to be in the same order as the SEQRES items provided in constructor.

Raises:

RuntimeError if env is inconsistent with SEQRES set in constructor. This can be because of corrupt residue numbers or sequence mismatches.

SetEnvironment(new_env_pos)
SetEnvironment(new_pos, start_resnum, chain_idx=0)
SetEnvironment(bb_list, start_resnum, chain_idx=0)

Add/update atom positions in environment. In the end, all residues covered in new_env_pos / new_pos / bb_list will be set as defined there. This means, that positions in the env. may be reset, newly set or cleared.

Parameters:
  • new_env_pos (AllAtomEnvPositions) – Structural data to be set as environment.

  • new_pos (AllAtomPositions) – Structural data to be set as environment.

  • bb_list (BackboneList) – Backbone data to be set as environment.

  • start_resnum (int / ost.mol.ResNum) – Res. number defining the start position in the SEQRES.

  • chain_idx (int) – Index of chain the structural data belongs to.

Raises:

RuntimeError if new_pos / new_env_pos / bb_list is inconsistent with SEQRES set in constructor or when either start_resnum or chain_idx point to invalid positions in the SEQRES.

ClearEnvironment(start_resnum, num_residues, chain_idx=0)

Clears a stretch of length num_residues in the environment in chain with idx chain_idx starting from residue number start_resnum

Parameters:
  • start_resnum (int) – Start of stretch to clear

  • num_residues (int) – Length of stretch to clear

  • chain_idx (int) – Chain the stretch belongs to

Raises:

RuntimeError when either start_resnum or chain_idx point to invalid positions in the SEQRES.

GetEnvironment(start_resnum, num_residues, chain_idx=0)
Returns:

Copy of stretch of structural data in environment. Useful to store a loop to reset later with SetEnvironment().

Return type:

AllAtomEnvPositions

Parameters:
  • start_resnum (int) – Start of stretch to store

  • num_residues (int) – Length of stretch to store

  • chain_idx (int) – Chain the stretch belongs to

Raises:

RuntimeError when either start_resnum or chain_idx point to invalid positions in the SEQRES.

GetSeqres()
Returns:

SEQRES that was set in constructor (one sequence per chain).

Return type:

ost.seq.SequenceList

GetAllAtomPositions()
Returns:

Reference (use with caution!) to the internal storage of all atom positions for the environment. All residues of all chains are stored continuously in there. To get a copy of some positions use GetEnvironment().

Return type:

AllAtomPositions

The AllAtomPositions class

class promod3.loop.AllAtomPositions

Container for the positions of all heavy atoms of an arbitrary amino acid sequence. This is tailored for fast operations within C++ codes. The Python export described here, is mainly meant for debugging or to initialize the object and feed it to other classes using it.

Indexing of positions and residues:

Each atom position is initially unset (unless a list of residues is passed when constructing it) and can only be set with calls to SetPos() or SetResidue().

AllAtomPositions(sequence)

Creates a container for the given sequence with all positions unset.

Parameters:

sequence (str) – Sequence of amino acid one letter codes.

Raises:

RuntimeError if sequence contains a one letter code which is not one of the 20 default amino acids.

AllAtomPositions(residues)

Creates a container for the given residues. Both sequence and positions are extracted from the given residues.

Parameters:

residues (ost.mol.ResidueHandleList) – List of residues

Raises:

RuntimeError if any residue has a one letter code which is not one of the 20 default amino acids.

AllAtomPositions(sequence, residues)

Creates a container for the given sequence and extracts positions from residues. The residues may be different amino acids than the given sequence (see SetResidue()).

Parameters:
  • sequence (str) – Sequence of amino acid one letter codes.

  • residues (ost.mol.ResidueHandleList) – List of residues from which positions are extracted.

Raises:

RuntimeError if sequence contains a one letter code which is not one of the 20 default amino acids or if sequence and residues are inconsistent in size.

AllAtomPositions(bb_list)

Creates a container for the given backbone. Both sequence and backbone positions are extracted from the given residues.

Parameters:

bb_list (BackboneList) – Backbone list of residues

Raises:

RuntimeError if any residue has a one letter code which is not one of the 20 default amino acids.

SetResidue(res_index, res)

Set positions for residue at index res_index given the atoms in res. For each expected heavy atom, we search for an atom of that name in res and if found set the corresponding position, otherwise we unset it.

Parameters:
Raises:

RuntimeError if res_index out of bounds.

SetResidue(res_index, other, other_res_index)

Set positions for residue at index res_index given the positions of residue at index other_res_index in other position container. Each position is set or cleared according to the data in other.

Parameters:
  • res_index (int) – Residue index

  • other (AllAtomPositions) – Position container from which we take data

  • other_res_index (int) – Residue index in other

Raises:

RuntimeError if res_index or other_res_index out of bounds or if residues in the two containers are inconsistent (different amino acids).

SetPos(index, pos)
Parameters:
  • index (int) – Set position at that index.

  • pos (ost.geom.Vec3) – Set position to pos.

Raises:

RuntimeError if index out of bounds.

ClearPos(index)
Parameters:

index (int) – Unset position at that index.

Raises:

RuntimeError if index out of bounds.

ClearResidue(res_index)
Parameters:

res_index (int) – Unset all positions for residue at that index.

Raises:

RuntimeError if res_index out of bounds.

GetPos(index)
Returns:

Position at given index.

Return type:

ost.geom.Vec3

Parameters:

index (int) – Atom position index.

Raises:

RuntimeError if index out of bounds.

IsSet(index)
Returns:

True, if the position at that index is currently set.

Return type:

bool

Parameters:

index (int) – Atom position index.

Raises:

RuntimeError if index out of bounds.

GetIndex(res_index, atom_name)
Returns:

Atom position index for atom named atom_name (standard PDB naming) for residue at index res_index.

Return type:

int

Parameters:
  • res_index (int) – Residue index

  • atom_name (str) – Atom name

Raises:

RuntimeError if res_index out of bounds or if atom_name is not one of that residue’s heavy atoms.

GetFirstIndex(res_index)
Returns:

First atom position index for residue at index res_index.

Return type:

int

Parameters:

res_index (int) – Residue index

Raises:

RuntimeError if res_index out of bounds.

GetLastIndex(res_index)
Returns:

Last atom position index for residue at index res_index.

Return type:

int

Parameters:

res_index (int) – Residue index

Raises:

RuntimeError if res_index out of bounds.

GetAA(res_index)
Returns:

Amino acid type of residue at index res_index.

Return type:

ost.conop.AminoAcid

Parameters:

res_index (int) – Residue index

Raises:

RuntimeError if res_index out of bounds.

IsAnySet(res_index)
Returns:

True, if any atom position of residue at index res_index is set.

Return type:

bool

Parameters:

res_index (int) – Residue index

Raises:

RuntimeError if res_index out of bounds.

IsAllSet(res_index)
Returns:

True, if all atom positions of residue at index res_index are set.

Return type:

bool

Parameters:

res_index (int) – Residue index

Raises:

RuntimeError if res_index out of bounds.

GetPhiTorsion(res_index, def_angle=-1.0472)
Returns:

Phi torsion angle of residue at index res_index or def_angle if required atom positions (C-N-CA-C) are not set.

Return type:

float

Parameters:
  • res_index (int) – Residue index

  • def_angle (float) – Default phi angle.

Raises:

RuntimeError if res_index - 1 or res_index out of bounds.

GetPsiTorsion(res_index, def_angle=-0.7854)
Returns:

Psi torsion angle of residue at index res_index or def_angle if required atom positions (N-CA-C-N) are not set.

Return type:

float

Parameters:
  • res_index (int) – Residue index

  • def_angle (float) – Default psi angle.

Raises:

RuntimeError if res_index or res_index + 1 out of bounds.

GetOmegaTorsion(res_index, def_angle=3.14159)
Returns:

Omega torsion angle of residue at index res_index or def_angle if required atom positions (CA-C-N-CA) are not set. Here, we use CA-C of residue res_index - 1 and N-CA of residue res_index (consistent with OST’s GetOmegaTorsion()).

Return type:

float

Parameters:
  • res_index (int) – Residue index

  • def_angle (float) – Default omega angle.

Raises:

RuntimeError if res_index - 1 or res_index out of bounds.

GetNumAtoms()
Returns:

Number of atom positions stored in this container.

Return type:

int

GetNumResidues()
Returns:

Number of residues stored in this container.

Return type:

int

GetSequence()
Returns:

Sequence of one letter codes of all residues stored here.

Return type:

str

Copy()
Returns:

Full copy of this object.

Return type:

AllAtomPositions

Extract(from, to)
Returns:

Container with residues with indices in range [from, to-1].

Return type:

AllAtomPositions

Parameters:
  • from (int) – First residue index

  • to (int) – One after last residue index

Raises:

RuntimeError if from >= to or if any residue index is out of bounds.

Extract(res_indices)
Returns:

Container with residues with indices in res_indices.

Return type:

AllAtomPositions

Parameters:

res_indices (list of int) – List of residue index

Raises:

RuntimeError if any residue index is out of bounds.

ExtractBackbone(from, to)
Returns:

Backbone list of residues with indices in range [from, to-1]. CB atoms are reconstructed if unset.

Return type:

BackboneList

Parameters:
  • from (int) – First residue index

  • to (int) – One after last residue index

Raises:

RuntimeError if from >= to, if any residue index is out of bounds or if any residue has any unset backbone atom (N, CA, C, O).

ToEntity()
Returns:

All residues packed in a single chain as an OST entity. Connectivity resolved with ost.conop.HeuristicProcessor.

Return type:

ost.mol.EntityHandle

InsertInto(res_index, chain, res_num)

Insert a single residue (taken from given index) into the chain (with given res. number). Existing data is replaced and atoms are (re)connected according to the default connectivity of that amino acid. Peptide links to neighboring residues are set according to residue numbering. To make this function efficient, we require the backbone atoms (N, C, CA) to be set.

Parameters:
Raises:

RuntimeError if res_index out of bounds, if chain is invalid or if not all backbone atoms (N, C, CA) set.

The AllAtomEnvPositions class

class promod3.loop.AllAtomEnvPositions

To link the arbitrary amino acid sequence defined in AllAtomPositions and the SEQRES of AllAtomEnv, we provide a helper class containing structural data as well as a mapping to the internal residue indices of AllAtomEnv.

all_pos

Container for the positions of all heavy atoms of some residues.

Type:

AllAtomPositions

res_indices

Residue indices to be used by AllAtomEnv for each residue defined in all_pos.

Type:

list of int

Distinguishing amino acid atoms

class promod3.loop.AminoAcidAtom

Enumerates all heavy atoms of all amino acids. The naming scheme is TLC_AN, where TLC is the standard three letter code of the amino acid and AN is the atom name (standard PDB naming) of the heavy atom. Examples: ALA_CB, ARG_CA, ASN_C, ASP_O, CYS_SG, GLU_OE1, GLN_NE2, GLY_N.

We include all heavy atoms that amino acids have when they are peptide bound to other residues (i.e. no OXT).

Additionally, there is the value XXX_NUM_ATOMS, which corresponds to the number of atoms in the enumerator. Each heavy atom hence corresponds to an integer in the range [0, XXX_NUM_ATOMS-1].

class promod3.loop.AminoAcidHydrogen

Enumerates all hydrogens of all amino acids. The naming scheme is TLC_AN, where TLC is the standard three letter code of the amino acid and AN is the atom name (standard PDB naming) of the hydrogen. Examples: ALA_H, ARG_HD3, ASN_HB2, ASP_HA, CYS_HB3, LEU_H.

We include all hydrogens that amino acids can have including H1 (def. PDB name = “H”), H2 and (if not PRO) H3 for charged N-terminal residues. Note that the H atom attached to N when peptide bound (H) is distinct from the N terminal hydrogens (e.g. ALA_H != ALA_H1). For HIS we consider the fully protonated state, while ASP and GLU are included in their charged state.

Additionally, there is the value XXX_NUM_HYDROGENS, which corresponds to the number of hydrogens in the enumerator. Each hydrogen hence corresponds to an integer in the range [0, XXX_NUM_HYDROGENS-1].

class promod3.loop.AminoAcidLookup

Collection of static methods to lookup properties of amino acid types (ost.conop.AminoAcid), heavy atom types (AminoAcidAtom) and hydrogen types (AminoAcidHydrogen).

static GetOLC(aa)
Returns:

One letter code for the given amino acid

Return type:

str

Parameters:

aa (AminoAcid) – Amino acid type

static GetAAA(aa, atom_idx)
static GetAAA(aa, atom_name)
Returns:

Heavy atom type for the given amino acid and atom.

Return type:

AminoAcidAtom

Parameters:
  • aa (AminoAcid) – Amino acid type

  • atom_idx (int) – Atom index (in [0, GetNumAtoms(aa)-1])

  • atom_name (str) – Atom name

Raises:

RuntimeError if atom_idx out of bounds or if atom_name is not one of the heavy atoms of aa.

static GetAAH(aa, atom_idx)
static GetAAH(aa, atom_name)
Returns:

Hydrogen type for the given amino acid and atom.

Return type:

AminoAcidHydrogen

Parameters:
  • aa (AminoAcid) – Amino acid type

  • atom_idx (int) – Atom index (in [0, GetNumHydrogens(aa)-1])

  • atom_name (str) – Atom name

Raises:

RuntimeError if atom_idx out of bounds or if atom_name is not one of the hydrogens of aa.

static GetIndex(aa, atom_name)
Returns:

Atom index (in [0, GetNumAtoms(aa)-1]) for the given amino acid and atom.

Return type:

int

Parameters:
  • aa (AminoAcid) – Amino acid type

  • atom_name (str) – Atom name

Raises:

RuntimeError if atom_name is not one of the heavy atoms of aa.

static GetHydrogenIndex(aa, atom_name)
Returns:

Atom index (in [0, GetNumHydrogens(aa)-1]) for the given amino acid and atom.

Return type:

int

Parameters:
  • aa (AminoAcid) – Amino acid type

  • atom_name (str) – Atom name

Raises:

RuntimeError if atom_name is not one of the hydrogens of aa.

static GetNumAtoms(aa)
Returns:

Number of heavy atoms of the given amino acid

Return type:

int

Parameters:

aa (AminoAcid) – Amino acid type

static GetMaxNumAtoms()
Returns:

Max. number of heavy atoms for any amino acid

Return type:

int

static GetNumHydrogens(aa)
Returns:

Number of hydrogens of the given amino acid

Return type:

int

Parameters:

aa (AminoAcid) – Amino acid type

static GetMaxNumHydrogens()
Returns:

Max. number of hydrogens for any amino acid

Return type:

int

static GetAA(aaa)
static GetAA(aah)
Returns:

Amino acid type of the given heavy atom type

Return type:

AminoAcid

Parameters:
static GetAtomName(aaa)
static GetAtomName(aah)
static GetAtomNameCharmm(aaa)
static GetAtomNameCharmm(aah)
static GetAtomNameAmber(aaa)
static GetAtomNameAmber(aah)
Returns:

Atom name of the given heavy atom type according to PDB (default), CHARMM or AMBER naming.

Return type:

str

Parameters:
static GetElement(aaa)
Returns:

Chemical element of the given heavy atom type

Return type:

str

Parameters:

aaa (AminoAcidAtom) – Heavy atom type

static GetAnchorAtomIndex(aah)
Returns:

Atom index (in [0, GetNumAtoms(GetAA(aah))-1]) of the anchor to which the given hydrogen is attached.

Return type:

int

Parameters:

aah (AminoAcidHydrogen) – Hydrogen type

static GetHNIndex(aa)
Returns:

Atom index (in [0, GetNumHydrogens(aa)-1]) of H atom attached to N when residue is peptide bound.

Return type:

int

Parameters:

aa (AminoAcid) – Amino acid type

Raises:

RuntimeError if no such atom (i.e. PRO)

static GetH1Index(aa)
static GetH2Index(aa)
static GetH3Index(aa)
Returns:

Atom index (in [0, GetNumHydrogens(aa)-1]) of H atom attached to N when residue is N terminal.

Return type:

int

Parameters:

aa (AminoAcid) – Amino acid type

Raises:

RuntimeError if no such atom (i.e. H3 for PRO)

Search

Enter search terms or a module, class or function name.

Contents