ProMod3 Actions

ProMod3 provides a pure command line interface through actions. Run pm help for a list of available actions, and pm <ACTION> -h for a description of how to use a particular action.

Here we list the most prominent actions with simple examples.

Building models

You can run a full protein homology modelling pipeline from the command line with

$ pm build-model [-h] (-f <FILE> | -c <FILE> | -j <OBJECT>|<FILE>)
                 (-p <FILE> | -e <FILE>) [-s <FILE>] [-o <FILENAME>]
                 [-r] [-t]

Example usage:

$ pm build-model -f aln.fasta -p tpl.pdb

This reads a target-template alignment from aln.fasta and a matching structure from tpl.pdb and produces a gap-less model which is stored as model.pdb. The output filename can be controlled with the -o flag.

Target-template alignments can be provided in FASTA (-f), CLUSTAL (-c) or as JSON files/objects (-j). Files can be plain or gzipped. At least one alignment must be given and you cannot mix file formats. Multiple alignment files can be given and target chains will be appended in the given order. The chains of the target model are named with default chain names (A, B, C, …, see BuildRawModel()). Notes on the input formats:

  • Leading and trailing whitespace in sequence names is always removed.

  • FASTA input example:

    >target
    HGFHVHEFGDNTNGCMSSGPHFNPYGKEHGAPVDENRHLG
    >2jlp-1.A|55
    RAIHVHQFGDLSQGCESTGPHYNPLAVPH------PQHPG
    

    The target sequence is the one named “trg” or “target”; if no sequence has such a name, the first sequence is used. Template sequence names can encode an identifier for the chain to attach and, optionally, an offset (here: 55, see below for details). Leading whitespace in FASTA headers is removed.

  • CLUSTAL input follows the same logic as FASTA input

  • JSON input: filenames are not allowed to start with ‘{’. A JSON object contains an entry with key ‘alignmentlist’, which is an array of objects with keys ‘target’ and ‘template’. Each of those is an object with keys ‘name’ (string identifier of the sequence), ‘seqres’ (string holding the aligned sequence) and, optionally for templates, ‘offset’ (number of residues to skip in the structure file attached to it). Example:

    {"alignmentlist": [ {
      "target": {
          "name": "mytrg",
          "seqres": "HGFHVHEFGDNTNGCMSSGPHFNPYGKEHGAPVDENRHLG"
      },
      "template": {
          "name": "2jlp-1.A",
          "offset": 55,
          "seqres": "RAIHVHQFGDLSQGCESTGPHYNPLAVPH------PQHPG"
      }
    } ] }
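
If the alignment is produced by a script, the JSON input described above can be generated with the Python standard library. The following is only a minimal sketch: the helper name make_alignment_json is an assumption made for illustration and is not part of ProMod3.

```python
import json

def make_alignment_json(trg_name, trg_seq, tpl_name, tpl_seq, offset=None):
    """Return a JSON string holding one target-template alignment.

    Hypothetical helper, not part of ProMod3.
    """
    if len(trg_seq) != len(tpl_seq):
        raise ValueError("aligned sequences must have equal length")
    template = {"name": tpl_name, "seqres": tpl_seq}
    if offset is not None:
        # Number of residues to skip in the attached structure file
        template["offset"] = offset
    entry = {"target": {"name": trg_name, "seqres": trg_seq},
             "template": template}
    return json.dumps({"alignmentlist": [entry]}, indent=2)

# Reproduce the example alignment shown above
aln_json = make_alignment_json(
    "mytrg", "HGFHVHEFGDNTNGCMSSGPHFNPYGKEHGAPVDENRHLG",
    "2jlp-1.A", "RAIHVHQFGDLSQGCESTGPHYNPLAVPH------PQHPG", offset=55)
```

The resulting string can be saved to a file and passed to the action with -j.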
    

Structures can be provided in PDB (-p) or in any format readable by the ost.io.LoadEntity() method (-e). In the latter case, the format is chosen by file ending. Recognized File Extensions: .ent, .pdb, .ent.gz, .pdb.gz, .cif, .cif.gz. At least one structure must be given and you cannot mix file formats. Multiple structures can be given and each structure may have multiple chains, but care must be taken to identify which chain to attach to which template sequence. Chains for each sequence are identified based on the sequence name of the templates in the alignments. Valid sequence names are:

  • anything, if only one structure with one chain

  • “<FILE>.<CHAIN>”, where <FILE> is the base file name of an imported structure with no extensions and <CHAIN> is the identifier of the chain in the imported structure.

  • “<FILE>” if only one chain in file

  • “<CHAIN>” if only one file imported

  • “<CHAINID>|<OFFSET>”, where <CHAINID> identifies the chain as above and <OFFSET> is the number of residues to skip for that chain to reach the first residue in the aligned sequence. Leading/trailing whitespaces of <CHAINID> and <OFFSET> are ignored.

Example: ... -p data/2jlp.pdb.gz, where the pdb file has chains A, B, C and the template sequence is named 2jlp.A|55.
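
To illustrate the “<CHAINID>|<OFFSET>” convention, the sketch below splits a template sequence name into its chain identifier and offset. It is not the parser used by ProMod3: the function name split_template_name is hypothetical, and treating a missing offset as 0 is an assumption.

```python
def split_template_name(name):
    """Split '<CHAINID>|<OFFSET>' into (chain_id, offset).

    Hypothetical helper; a missing offset is reported as 0 (assumption).
    """
    chain_id, sep, offset = name.partition("|")
    # Whitespace around both parts is ignored, as stated above
    return chain_id.strip(), (int(offset.strip()) if sep else 0)

chain_id, offset = split_template_name("2jlp.A|55")
```

For the example above, chain_id is "2jlp.A" and offset is 55.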

You can optionally specify sequence profiles to be added (-s) and linked to the corresponding target sequences. This has an impact on loop scoring with the database approach. The profiles can be provided as plain or gzipped files. The following file extensions are understood: .hhm, .hhm.gz, .pssm, .pssm.gz. Consider using ost.bindings.hhblits.HHblits.A3MToProfile() if you have a file in a3m format at hand.

  • The profiles are mapped by exact match to the gapless target sequences from the provided alignment files, i.e. one profile is mapped to several chains in the case of homo-oligomers

  • Every profile must have a unique sequence to avoid ambiguities

  • All or nothing: you cannot provide profiles for only a subset of target sequences
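
The three rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the ProMod3 implementation: the function map_profiles is hypothetical, and sequences are handled as plain strings.

```python
def map_profiles(aligned_targets, profile_seqs):
    """Return, for each target chain, the index of the matching profile.

    aligned_targets: aligned target sequences (may contain '-' gaps)
    profile_seqs:    sequences of the provided profiles
    Hypothetical helper, not part of ProMod3.
    """
    # Every profile must have a unique sequence
    if len(set(profile_seqs)) != len(profile_seqs):
        raise ValueError("profiles must have unique sequences")
    index = {seq: i for i, seq in enumerate(profile_seqs)}
    mapping = []
    for aligned in aligned_targets:
        gapless = aligned.replace("-", "")
        # All or nothing: every target sequence needs a profile
        if gapless not in index:
            raise ValueError("no profile for target sequence " + gapless)
        mapping.append(index[gapless])
    return mapping

# Homo-dimer: both chains share the single profile
dimer_mapping = map_profiles(["HGF-HV", "HGFHV"], ["HGFHV"])
```

Here both chains of the homo-dimer are mapped to the profile with index 0.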

Example usage:

$ pm build-model -f aln.fasta -p tpl.pdb -s prof.hhm

Where Monte Carlo sampling is used, a fast torsion-angle-based sampling is performed by default. You can enforce the use of structural fragments with -r, but this increases runtime because the required fragments must be searched for. The corresponding promod3.modelling.FraggerHandle objects are set up in the PM3ArgumentParser class, as described in its documentation.

The default modelling pipeline in ProMod3 is optimized to generate a gap-free model of the region in the target sequence(s) that is covered by template information. Terminal extensions without template coverage are neglected. You can enforce a model of the full target sequence(s) by adding -t. The terminal parts will then be modelled with a crude Monte Carlo approach. Be aware that the accuracy of those termini is likely to be limited. Termini of length 1 won’t be modelled.

Possible exit codes of the action:

  • 0: all went well

  • 1: an unhandled exception was raised

  • 2: arguments cannot be parsed or required arguments are missing

  • 3: failed to perform modelling (internal error)

  • 4: failed to write results to file

  • other non-zero: failure in argument checking (see promod3.core.pm3argparse.PM3ArgumentParser)
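
When the action is called from a script, the exit codes listed above can be translated into messages. The sketch below assumes that the pm executable is available on the PATH; the names EXIT_CODES, describe_exit_code and run_build_model are hypothetical and not part of ProMod3.

```python
import subprocess

# Exit codes as documented above for the build-model action
EXIT_CODES = {
    0: "all went well",
    1: "an unhandled exception was raised",
    2: "arguments cannot be parsed or required arguments are missing",
    3: "failed to perform modelling (internal error)",
    4: "failed to write results to file",
}

def describe_exit_code(code):
    """Translate an exit code of 'pm build-model' into a message."""
    # Any other non-zero code signals a failure in argument checking
    return EXIT_CODES.get(code, "failure in argument checking")

def run_build_model(aln_file, tpl_file, out_file="model.pdb"):
    """Run the action and return (exit code, message).

    Hypothetical wrapper; requires 'pm' on the PATH.
    """
    cmd = ["pm", "build-model", "-f", aln_file, "-p", tpl_file,
           "-o", out_file]
    code = subprocess.run(cmd).returncode
    return code, describe_exit_code(code)
```

A call such as run_build_model("aln.fasta", "tpl.pdb") would then return the exit code together with its description.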

Sidechain Modelling

You can (re-)construct the sidechains in a model from the command line.

$ pm build-sidechains [-h] (-p <FILE> | -e <FILE>) [-o <FILENAME>] [-k] [-n]
                      [-r] [-i] [-s]

Example usage:

$ pm build-sidechains -p input.pdb

This reads a structure stored in input.pdb, strips all sidechains, detects and models disulfide bonds and reconstructs all sidechains with the flexible rotamer model. The result is stored as out.pdb. The output filename can be controlled with the -o flag.

A structure can be provided in PDB (-p) or in any format readable by the ost.io.LoadEntity() method (-e). In the latter case, the format is chosen by file ending. Recognized File Extensions: .ent, .pdb, .ent.gz, .pdb.gz, .cif, .cif.gz.

Several flags control the modelling behaviour:

-k, --keep-sidechains

Keep existing sidechains.

-n, --no-disulfids

Do not build disulfide bonds before sidechain optimization.

-r, --rigid-rotamers

Do not use rotamers with subrotamers.

-i, --backbone-independent

Use the backbone independent rotamer library (from promod3.sidechain.LoadLib()) instead of the default backbone dependent one (from promod3.sidechain.LoadBBDepLib()).

-s, --no-subrotamer-optimization

Do not perform subrotamer optimization if the flexible rotamer model is used.

-f, --energy_function

The energy function to be used. Default is SCWRL4, can be any function supported by promod3.modelling.ReconstructSidechains().
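
For scripted use, the boolean flags described above can be assembled into a command line from Python keyword options. This is a minimal sketch: the helper build_sidechains_cmd is hypothetical, covers PDB input (-p) only, and leaves out the -f option.

```python
def build_sidechains_cmd(pdb_file, out_file=None, keep_sidechains=False,
                         no_disulfids=False, rigid_rotamers=False,
                         backbone_independent=False,
                         no_subrotamer_optimization=False):
    """Return the argument list for 'pm build-sidechains'.

    Hypothetical helper, not part of ProMod3.
    """
    cmd = ["pm", "build-sidechains", "-p", pdb_file]
    if out_file is not None:
        cmd += ["-o", out_file]
    # Pair each short flag with the option that switches it on
    flags = [("-k", keep_sidechains), ("-n", no_disulfids),
             ("-r", rigid_rotamers), ("-i", backbone_independent),
             ("-s", no_subrotamer_optimization)]
    cmd += [flag for flag, enabled in flags if enabled]
    return cmd

# Keep existing sidechains and use rigid rotamers
cmd = build_sidechains_cmd("input.pdb", out_file="result.pdb",
                           keep_sidechains=True, rigid_rotamers=True)
```

The returned list can be handed to subprocess.run() once the pm executable is available on the PATH.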
