Modelling Pipeline¶

A protein homology modelling pipeline has the following main steps:

Build a raw model from the template (see BuildRawModel() function)
Perform loop modelling to close (or remove) all gaps (see functions CloseSmallDeletions(), RemoveTerminalGaps(), MergeGapsByDistance(), FillLoopsByDatabase(), FillLoopsByMonteCarlo(), CloseLargeDeletions() or CloseGaps() that calls all these functions using predefined heuristics)
Build sidechains (see BuildSidechains() function)
Minimize energy of final model using molecular mechanics (see MinimizeModelEnergy() function)

The last steps to go from a raw model to a final model can easily be executed with the BuildFromRawModel() function. If you want to run and tweak the internal steps, you can start with the following code and adapt it to your purposes:

from ost import io
from promod3 import modelling, loop

# setup
merge_distance = 4
fragment_db = loop.LoadFragDB()
structure_db = loop.LoadStructureDB()
torsion_sampler = loop.LoadTorsionSamplerCoil()

# get raw model
tpl = io.LoadPDB('data/1crn_cut.pdb')
aln = io.LoadAlignment('data/1crn.fasta')
aln.AttachView(1, tpl.CreateFullView())
mhandle = modelling.BuildRawModel(aln)

# we're not modelling termini
modelling.RemoveTerminalGaps(mhandle)

# perform loop modelling to close all gaps
modelling.CloseGaps(mhandle, merge_distance, fragment_db,
                    structure_db, torsion_sampler)

# build sidechains
modelling.BuildSidechains(mhandle, merge_distance, fragment_db,
                          structure_db, torsion_sampler)

# minimize energy of final model using molecular mechanics
modelling.MinimizeModelEnergy(mhandle)

# check final model and report issues
modelling.CheckFinalModel(mhandle)

# extract final model
final_model = mhandle.model
io.SavePDB(final_model, 'model.pdb')

Build Raw Modelling Handle¶

class promod3.modelling.ModellingHandle¶

Handles the result for structure model building and provides high-level methods to turn an initial raw model (see BuildRawModel()) into a complete protein model by removing any existing gaps.

model¶

The resulting model. This includes one chain per target chain (in the same order as the sequences in seqres) and (if they were included) a chain named ‘_’ for ligands. You can therefore access model.chains items and seqres items with the same indexing and the optional ligand chain follows afterwards.

Type:: EntityHandle

gaps¶

List of gaps in the model that could not be copied from the template. These gaps may be the result of insertions/deletions in the alignment or due to missing or incomplete backbone coordinates in the template structure. Gaps of different chains are appended one after another.

Type:: StructuralGapList

seqres¶

List of sequences with one SequenceHandle for each chain of the target protein.

Type:: SequenceList

profiles¶

List of profiles with one ost.seq.ProfileHandle for each chain of the target protein (same order as in seqres). Please note, that this attribute won’t be set by simply calling BuildFromRawModel(). You have to fill it manually or even better by the convenient function SetSequenceProfiles(), to ensure consistency with the seqres.

Type:: list of ost.seq.ProfileHandle

psipred_predictions¶

List of predictions with one promod3.loop.PsipredPrediction for each chain of the target protein (same order as in seqres). Please note, that this attribute won’t be set by simply calling BuildFromRawModel(). You have to fill it manually or even better by the convenient function SetPsipredPredictions(), to ensure consistency with the seqres.

Type:: list of PsipredPrediction

backbone_scorer_env¶

Backbone score environment attached to this handle. A default environment is set with SetupDefaultBackboneScoring() when needed. Additional information can be added to the environment before running the pipeline steps.

Type:: BackboneScoreEnv

backbone_scorer¶

Backbone scorer container attached to this handle. A default set of scorers is initialized with SetupDefaultBackboneScoring() when needed.

Type:: BackboneOverallScorer

all_atom_scorer_env¶

All atom environment attached to this handle for scoring. A default environment is set with SetupDefaultAllAtomScoring() when needed. This environment is for temporary work only and is only updated to score loops. It is not to be updated when loops are chosen and added to the final model.

Type:: AllAtomEnv

all_atom_scorer¶

All atom scorer container attached to this handle. A default set of scorers is initialized with SetupDefaultAllAtomScoring() when needed.

Type:: AllAtomOverallScorer

all_atom_sidechain_env¶

All atom environment attached to this handle for sidechain reconstruction. A default environment is set with SetupDefaultAllAtomScoring() when needed.

Type:: AllAtomEnv

sidechain_reconstructor¶

A sidechain reconstructor to add sidechains to loops prior to all atom scoring. A default one is set with SetupDefaultAllAtomScoring() when needed.

Type:: SidechainReconstructor

fragger_handles¶

Optional attribute which is set in SetFraggerHandles(). Use hasattr() to check if it’s available. If it’s set, it is used in BuildFromRawModel().

Type:: list of FraggerHandle

modelling_issues¶

Optional attribute which is set in AddModellingIssue(). Use hasattr() to check if it’s available. If it’s set, it can be used to check issues which occurred in BuildFromRawModel() (see MinimizeModelEnergy() and CheckFinalModel() for details).

Type:: list of ModellingIssue

Copy()¶

Generates a deep copy. Everything will be copied over to the returned ModellingHandle, except the potentially set scoring members backbone_scorer, backbone_scorer_env, all_atom_scorer_env, all_atom_scorer, all_atom_sidechain_env and sidechain_reconstructor.

Returns:: A deep copy of the current handle
Return type:: ModellingHandle

The Default Pipeline¶

Modelling Steps¶

promod3.modelling.SetupDefaultBackboneScoring(mhandle)¶

Setup scorers and environment for meddling with backbones. This one is already tailored towards a certain modelling job. The scorers added (with their respective keys) are:

“cb_packing”: CBPackingScorer
“cbeta”: CBetaScorer
“reduced”: ReducedScorer
“clash”: ClashScorer
“hbond”: HBondScorer
“torsion”: TorsionScorer
“pairwise”: PairwiseScorer

Parameters:: mhandle (ModellingHandle) – The modelling handle. This will set the properties backbone_scorer and backbone_scorer_env of mhandle.

promod3.modelling.IsBackboneScoringSetUp(mhandle)¶

Returns:: True, if backbone_scorer and backbone_scorer_env of mhandle are set.
Return type:: bool
Parameters:: mhandle (ModellingHandle) – Modelling handle to check.

promod3.modelling.SetupDefaultAllAtomScoring(mhandle)¶

Setup scorers and environment to perform all atom scoring. This one is already tailored towards a certain modelling job, where we reconstruct sidechains for loop candidates and score them. The scorers added (with their respective keys) are:

“aa_interaction”: AllAtomInteractionScorer
“aa_packing”: AllAtomPackingScorer
“aa_clash”: AllAtomClashScorer

Parameters:: mhandle (ModellingHandle) – The modelling handle. This will set the properties all_atom_scorer_env, all_atom_scorer, all_atom_sidechain_env and sidechain_reconstructor.

promod3.modelling.IsAllAtomScoringSetUp(mhandle)¶

Returns:: True, if all_atom_scorer_env, all_atom_scorer, all_atom_sidechain_env and sidechain_reconstructor of mhandle are set.
Return type:: bool
Parameters:: mhandle (ModellingHandle) – Modelling handle to check.

promod3.modelling.InsertLoop(mhandle, bb_list, start_resnum, chain_idx)¶

Insert loop into model and ensure consistent updating of scoring environments. Note that we do not update all_atom_scorer_env as that one is meant to be updated only while scoring. To clear a gap while inserting a loop, use the simpler InsertLoopClearGaps().

Parameters:

mhandle (ModellingHandle) – Modelling handle on which to apply change.
bb_list (BackboneList) – Loop to insert (backbone only).
start_resnum (int) – Res. number defining the start position in the SEQRES.
chain_idx (int) – Index of chain the loop belongs to.

promod3.modelling.RemoveTerminalGaps(mhandle)¶

Removes terminal gaps without modelling them (just removes them from the list of gaps). This is useful for pipelines which lack the possibility to properly model loops at the termini.

Parameters:: mhandle (ModellingHandle) – Modelling handle on which to apply change.
Returns:: Number of gaps which were removed.
Return type:: int

promod3.modelling.ReorderGaps(mhandle)¶: Reorders all gaps to ensure sequential order by performing lexicographical comparison on the sequence formed by chain index of the gap and start residue number.

promod3.modelling.MergeMHandle(source_mhandle, target_mhandle, source_chain_idx, target_chain_idx, start_resnum, end_resnum, transform)¶

Merges the specified stretch of source_mhandle into target_mhandle by replacing all structural information and gaps in the stretch start_resnum and end_resnum (inclusive). The residues specified by start_resnum and end_resnum must be valid in the source_mhandle, i.e. not be enclosed by a gap. If a gap encloses start_resnum or end_resnum in the target_mhandle, the gap gets replaced by a shortened version not including the part overlapping with the defined stretch. If there is any scoring set up (backbone or all atom), the according environments get updated in target_mhandle.

Parameters:

source_mhandle (ModellingHandle) – Source of structural information and gaps
target_mhandle (ModellingHandle) – Structural information and gaps will be copied in here
source_chain_idx (int) – This is the chain where the info comes from
target_chain_idx (int) – This is the chain where the info goes to
start_resnum (int) – First residue of the copied stretch
end_resnum (int) – Last residue of the copied stretch
transform (ost.geom.Mat4) – Transformation to be applied to all atom positions when they’re copied over

Raises:

A RuntimeError when:

the chain indices are invalid
the SEQRES of the specified chains do not match
the start and end residue numbers are invalid or when the residues at the specified positions in the source_mhandle do not exist
a gap in the source_mhandle encloses the residues specified by start_resnum and end_resnum

promod3.modelling.SetSequenceProfiles(mhandle, profiles)¶

Sets the sequence profiles of mhandle while ensuring consistency with the seqres.

Parameters:

mhandle (ModellingHandle) – Will have the profiles attached afterwards
profiles (list of ost.seq.ProfileHandle) – The sequence profiles to attach

Raises:

ValueError when the given profiles are not consistent with seqres in mhandle

promod3.modelling.SetPsipredPredictions(mhandle, predictions)¶

Sets the predictions of mhandle while ensuring consistency with the seqres.

Parameters:

mhandle (ModellingHandle) – Will have the predictions attached afterwards
predictions (list of PsipredPrediction) – The predictions to attach

Raises:

ValueError when the given predictions are not consistent with seqres in mhandle

Modelling Pipeline¶

Build Raw Modelling Handle¶

The Default Pipeline¶

Modelling Steps¶

Alignment Fiddling¶

Search

Contents