Modelling Pipeline

A protein homology modelling pipeline has the following main steps:

  • Build a raw model from the template (see BuildRawModel() function)

  • Perform loop modelling to close (or remove) all gaps (see functions CloseSmallDeletions(), RemoveTerminalGaps(), MergeGapsByDistance(), FillLoopsByDatabase(), FillLoopsByMonteCarlo() and CloseLargeDeletions(), or CloseGaps(), which calls all of these functions using predefined heuristics)

  • Build sidechains (see BuildSidechains() function)

  • Minimize energy of final model using molecular mechanics (see MinimizeModelEnergy() function)

The last steps to go from a raw model to a final model can easily be executed with the BuildFromRawModel() function. If you want to run and tweak the internal steps, you can start with the following code and adapt it to your purposes:

from ost import io
from promod3 import modelling, loop

# setup
merge_distance = 4
fragment_db = loop.LoadFragDB()
structure_db = loop.LoadStructureDB()
torsion_sampler = loop.LoadTorsionSamplerCoil()

# get raw model
tpl = io.LoadPDB('data/1crn_cut.pdb')
aln = io.LoadAlignment('data/1crn.fasta')
aln.AttachView(1, tpl.CreateFullView())
mhandle = modelling.BuildRawModel(aln)

# we're not modelling termini
modelling.RemoveTerminalGaps(mhandle)

# perform loop modelling to close all gaps
modelling.CloseGaps(mhandle, merge_distance, fragment_db,
                    structure_db, torsion_sampler)

# build sidechains
modelling.BuildSidechains(mhandle, merge_distance, fragment_db,
                          structure_db, torsion_sampler)

# minimize energy of final model using molecular mechanics
modelling.MinimizeModelEnergy(mhandle)

# check final model and report issues
modelling.CheckFinalModel(mhandle)

# extract final model
final_model = mhandle.model
io.SavePDB(final_model, 'model.pdb')
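
When the individual steps do not need tweaking, the shortcut via BuildFromRawModel() mentioned above can be sketched as follows. This is a minimal sketch, assuming that BuildFromRawModel() returns the final model entity; build_model() is a hypothetical helper introduced here for illustration and is not part of ProMod3.

```python
def build_model(template_path, alignment_path, output_path):
    """Hypothetical helper: run the default pipeline in a single call."""
    # imported lazily so the helper can be defined without OST/ProMod3
    from ost import io
    from promod3 import modelling

    # get raw model, exactly as in the step-by-step example
    tpl = io.LoadPDB(template_path)
    aln = io.LoadAlignment(alignment_path)
    aln.AttachView(1, tpl.CreateFullView())
    mhandle = modelling.BuildRawModel(aln)

    # assumption: BuildFromRawModel() runs the remaining steps
    # (loop modelling, sidechains, minimization) and returns the model
    final_model = modelling.BuildFromRawModel(mhandle)
    io.SavePDB(final_model, output_path)
    return final_model
```

With the files from the example above, this could be called as build_model('data/1crn_cut.pdb', 'data/1crn.fasta', 'model.pdb').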

Build Raw Modelling Handle

class promod3.modelling.ModellingHandle

Handles the result for structure model building and provides high-level methods to turn an initial raw model (see BuildRawModel()) into a complete protein model by removing any existing gaps.

model

The resulting model. This includes one chain per target chain (in the same order as the sequences in seqres) and (if they were included) a chain named ‘_’ for ligands. You can therefore access model.chains items and seqres items with the same indexing and the optional ligand chain follows afterwards.

Type:

EntityHandle
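
The shared indexing of model.chains and seqres can be used to walk over both at once. The following is a minimal sketch that relies only on the attributes described here; chains_with_seqres() is a hypothetical helper name.

```python
def chains_with_seqres(mhandle):
    """Pair each target chain of the model with its SEQRES entry."""
    # zip() stops at the shorter input, so an optional ligand chain
    # that follows the target chains is left out automatically
    return list(zip(mhandle.model.chains, mhandle.seqres))
```

Because the ligand chain, if present, comes after all target chains, no explicit filtering by chain name is needed in this sketch.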

gaps

List of gaps in the model that could not be copied from the template. These gaps may be the result of insertions/deletions in the alignment or due to missing or incomplete backbone coordinates in the template structure. Gaps of different chains are appended one after another.

Type:

StructuralGapList
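
Since gaps of all chains end up in one flat list, it can be handy to group them by chain. The sketch below assumes that each gap reports its chain through a GetChainIndex() method; gaps_by_chain() is a hypothetical helper, not a ProMod3 function.

```python
from collections import defaultdict


def gaps_by_chain(mhandle):
    """Group the flat gap list of a handle by chain index."""
    grouped = defaultdict(list)
    for gap in mhandle.gaps:
        # assumption: the gap object provides GetChainIndex()
        grouped[gap.GetChainIndex()].append(gap)
    return dict(grouped)
```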

seqres

List of sequences with one SequenceHandle for each chain of the target protein.

Type:

SequenceList

profiles

List of profiles with one ost.seq.ProfileHandle for each chain of the target protein (same order as in seqres). Note that this attribute is not set by simply calling BuildFromRawModel(). Fill it manually or, preferably, with the convenience function SetSequenceProfiles(), which ensures consistency with seqres.

Type:

list of ost.seq.ProfileHandle

psipred_predictions

List of predictions with one promod3.loop.PsipredPrediction for each chain of the target protein (same order as in seqres). Note that this attribute is not set by simply calling BuildFromRawModel(). Fill it manually or, preferably, with the convenience function SetPsipredPredictions(), which ensures consistency with seqres.

Type:

list of PsipredPrediction

backbone_scorer_env

Backbone score environment attached to this handle. A default environment is set with SetupDefaultBackboneScoring() when needed. Additional information can be added to the environment before running the pipeline steps.

Type:

BackboneScoreEnv

backbone_scorer

Backbone scorer container attached to this handle. A default set of scorers is initialized with SetupDefaultBackboneScoring() when needed.

Type:

BackboneOverallScorer

all_atom_scorer_env

All atom environment attached to this handle for scoring. A default environment is set with SetupDefaultAllAtomScoring() when needed. This environment is for temporary work only and is only updated to score loops. It is not to be updated when loops are chosen and added to the final model.

Type:

AllAtomEnv

all_atom_scorer

All atom scorer container attached to this handle. A default set of scorers is initialized with SetupDefaultAllAtomScoring() when needed.

Type:

AllAtomOverallScorer

all_atom_sidechain_env

All atom environment attached to this handle for sidechain reconstruction. A default environment is set with SetupDefaultAllAtomScoring() when needed.

Type:

AllAtomEnv

sidechain_reconstructor

A sidechain reconstructor to add sidechains to loops prior to all atom scoring. A default one is set with SetupDefaultAllAtomScoring() when needed.

Type:

SidechainReconstructor

fragger_handles

Optional attribute which is set in SetFraggerHandles(). Use hasattr() to check if it’s available. If it’s set, it is used in BuildFromRawModel().

Type:

list of FraggerHandle

modelling_issues

Optional attribute which is set in AddModellingIssue(). Use hasattr() to check if it’s available. If it’s set, it can be used to check issues which occurred in BuildFromRawModel() (see MinimizeModelEnergy() and CheckFinalModel() for details).

Type:

list of ModellingIssue
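
The hasattr() check recommended for the optional attributes fragger_handles and modelling_issues can be wrapped in a small accessor. This is a minimal sketch; optional_list() is a hypothetical helper name.

```python
def optional_list(mhandle, name):
    """Return an optional handle attribute as a list, or [] when unset."""
    if hasattr(mhandle, name):
        return list(getattr(mhandle, name))
    return []
```

For example, optional_list(mhandle, 'modelling_issues') yields an empty list for a handle on which no issue has been recorded, so calling code does not need its own hasattr() guard.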

Copy()

Generates a deep copy. Everything will be copied over to the returned ModellingHandle, except the potentially set scoring members backbone_scorer, backbone_scorer_env, all_atom_scorer_env, all_atom_scorer, all_atom_sidechain_env and sidechain_reconstructor.

Returns:

A deep copy of the current handle

Return type:

ModellingHandle

The Default Pipeline

Modelling Steps

promod3.modelling.SetupDefaultBackboneScoring(mhandle)

Sets up scorers and environment for meddling with backbones. This setup is already tailored towards a certain modelling job. The scorers added (with their respective keys) are:

Parameters:

mhandle (ModellingHandle) – The modelling handle. This will set the properties backbone_scorer and backbone_scorer_env of mhandle.

promod3.modelling.IsBackboneScoringSetUp(mhandle)

Returns:

True, if backbone_scorer and backbone_scorer_env of mhandle are set.

Return type:

bool

Parameters:

mhandle (ModellingHandle) – Modelling handle to check.

promod3.modelling.SetupDefaultAllAtomScoring(mhandle)

Sets up scorers and environment to perform all atom scoring. This setup is already tailored towards a certain modelling job, where we reconstruct sidechains for loop candidates and score them. The scorers added (with their respective keys) are:

Parameters:

mhandle (ModellingHandle) – The modelling handle. This will set the properties all_atom_scorer_env, all_atom_scorer, all_atom_sidechain_env and sidechain_reconstructor.

promod3.modelling.IsAllAtomScoringSetUp(mhandle)

Returns:

True, if all_atom_scorer_env, all_atom_scorer, all_atom_sidechain_env and sidechain_reconstructor of mhandle are set.

Return type:

bool

Parameters:

mhandle (ModellingHandle) – Modelling handle to check.
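
The check and setup functions above combine into a "set up only when needed" pattern. The following is a minimal sketch using only the four functions documented here; ensure_default_scoring() is a hypothetical helper name.

```python
def ensure_default_scoring(mhandle):
    """Set up default backbone and all atom scoring if still missing."""
    # imported lazily so the helper can be defined without ProMod3
    from promod3 import modelling

    if not modelling.IsBackboneScoringSetUp(mhandle):
        modelling.SetupDefaultBackboneScoring(mhandle)
    if not modelling.IsAllAtomScoringSetUp(mhandle):
        modelling.SetupDefaultAllAtomScoring(mhandle)
```

Checking first leaves any scoring that is already attached to the handle untouched.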

promod3.modelling.InsertLoop(mhandle, bb_list, start_resnum, chain_idx)

Insert loop into model and ensure consistent updating of scoring environments. Note that we do not update all_atom_scorer_env as that one is meant to be updated only while scoring. To clear a gap while inserting a loop, use the simpler InsertLoopClearGaps().

Parameters:
  • mhandle (ModellingHandle) – Modelling handle on which to apply change.

  • bb_list (BackboneList) – Loop to insert (backbone only).

  • start_resnum (int) – Residue number defining the start position in the SEQRES.

  • chain_idx (int) – Index of chain the loop belongs to.

promod3.modelling.RemoveTerminalGaps(mhandle)

Removes terminal gaps without modelling them (just removes them from the list of gaps). This is useful for pipelines which lack the possibility to properly model loops at the termini.

Parameters:

mhandle (ModellingHandle) – Modelling handle on which to apply change.

Returns:

Number of gaps which were removed.

Return type:

int
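
To see beforehand how many gaps are terminal, the gap list can be inspected without modifying it. The sketch below assumes that each gap provides an IsTerminal() method that is true for N- and C-terminal gaps; count_terminal_gaps() is a hypothetical helper and does not replace RemoveTerminalGaps().

```python
def count_terminal_gaps(mhandle):
    """Count terminal gaps in a handle without removing them."""
    # assumption: the gap object provides IsTerminal()
    return sum(1 for gap in mhandle.gaps if gap.IsTerminal())
```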

promod3.modelling.ReorderGaps(mhandle)

Reorders all gaps to ensure sequential order by performing lexicographical comparison on the sequence formed by chain index of the gap and start residue number.
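
The lexicographical ordering can be illustrated on plain data. This is a minimal sketch in which every gap is represented as a hypothetical (chain index, start residue number) tuple rather than a real gap object.

```python
def reorder_gaps(gaps):
    """Sort (chain_idx, start_resnum) pairs lexicographically."""
    # Python compares tuples element by element: the chain index is
    # compared first, the start residue number breaks ties
    return sorted(gaps)
```

For instance, reorder_gaps([(1, 5), (0, 40), (0, 12)]) places both gaps of chain 0 before the gap of chain 1, ordered by their start residue numbers.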

promod3.modelling.MergeMHandle(source_mhandle, target_mhandle, source_chain_idx, target_chain_idx, start_resnum, end_resnum, transform)

Merges the specified stretch of source_mhandle into target_mhandle by replacing all structural information and gaps in the stretch from start_resnum to end_resnum (inclusive). The residues specified by start_resnum and end_resnum must be valid in the source_mhandle, i.e. not be enclosed by a gap. If a gap encloses start_resnum or end_resnum in the target_mhandle, the gap is replaced by a shortened version that excludes the part overlapping with the defined stretch. If any scoring is set up (backbone or all atom), the corresponding environments are updated in target_mhandle.

Parameters:
  • source_mhandle (ModellingHandle) – Source of structural information and gaps

  • target_mhandle (ModellingHandle) – Structural information and gaps will be copied in here

  • source_chain_idx (int) – This is the chain where the info comes from

  • target_chain_idx (int) – This is the chain where the info goes to

  • start_resnum (int) – First residue of the copied stretch

  • end_resnum (int) – Last residue of the copied stretch

  • transform (ost.geom.Mat4) – Transformation to be applied to all atom positions when they’re copied over

Raises:

A RuntimeError when:

  • the chain indices are invalid

  • the SEQRES of the specified chains do not match

  • the start and end residue numbers are invalid or when the residues at the specified positions in the source_mhandle do not exist

  • a gap in the source_mhandle encloses the residues specified by start_resnum and end_resnum
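
The shortening of target gaps that overlap the merged stretch can be pictured with plain interval arithmetic. This is an illustrative sketch only, not the library's implementation: a gap is modelled as an inclusive range of missing residue numbers, and trim_gap() is a hypothetical helper.

```python
def trim_gap(gap_first, gap_last, start_resnum, end_resnum):
    """Return the parts of a gap that lie outside the merged stretch.

    All arguments are inclusive residue numbers. The result is a list
    of (first, last) tuples and is empty if the gap is fully covered.
    """
    remaining = []
    if gap_first < start_resnum:
        # part of the gap located before the stretch
        remaining.append((gap_first, min(gap_last, start_resnum - 1)))
    if gap_last > end_resnum:
        # part of the gap located after the stretch
        remaining.append((max(gap_first, end_resnum + 1), gap_last))
    return remaining
```

In this model, a gap covering residues 5 to 12 merged against the stretch 10 to 20 is shortened to residues 5 to 9, while a gap lying entirely inside the stretch disappears.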

promod3.modelling.SetSequenceProfiles(mhandle, profiles)

Sets the sequence profiles of mhandle while ensuring consistency with the seqres.

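Parameters:
  • mhandle (ModellingHandle) – Modelling handle on which to set the profiles.

  • profiles (list of ost.seq.ProfileHandle) – Profiles to set, one per chain, in the same order as seqres.

Raises: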

ValueError when the given profiles are not consistent with seqres in mhandle
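
What a consistency check of this kind might look like can be sketched on plain strings. The sketch assumes that "consistent" means one profile per chain whose sequence equals the corresponding SEQRES entry; that reading, and the helper check_profiles(), are assumptions for illustration and not taken from the ProMod3 implementation.

```python
def check_profiles(seqres, profile_sequences):
    """Raise ValueError unless there is one matching profile per chain."""
    if len(seqres) != len(profile_sequences):
        raise ValueError("expected one profile per SEQRES entry")
    for idx, (target, prof) in enumerate(zip(seqres, profile_sequences)):
        if target != prof:
            raise ValueError(f"profile {idx} does not match SEQRES")
```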

promod3.modelling.SetPsipredPredictions(mhandle, predictions)

Sets the predictions of mhandle while ensuring consistency with the seqres.

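Parameters:
  • mhandle (ModellingHandle) – Modelling handle on which to set the predictions.

  • predictions (list of PsipredPrediction) – Predictions to set, one per chain, in the same order as seqres.

Raises: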

ValueError when the given predictions are not consistent with seqres in mhandle

Alignment Fiddling
