clustalw - Perform multiple sequence alignment

ClustalW(seq1, seq2=None, clustalw=None, keep_files=False, nopgap=False, clustalw_option_string=False)

Runs a ClustalW multiple sequence alignment. The results are returned as an AlignmentHandle instance.

There are two ways to use this function:

  • align exactly two sequences:

    param seq1: sequence_one
    type seq1: SequenceHandle or str

    param seq2: sequence_two
    type seq2: SequenceHandle or str

    The two sequences can be specified as two separate function parameters (seq1, seq2). The type of both parameters can be either SequenceHandle or str, but must be the same for both parameters.

  • align two or more sequences:

    param seq1: sequence_list
    type seq1: SequenceList

    param seq2: must be None

    Two or more sequences can be specified by using a SequenceList. It is then passed as the first function parameter (seq1). The second parameter (seq2) must be None.
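The two calling conventions above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes OST is installed together with a ClustalW executable that the binding can locate, and it assumes the helpers `seq.CreateSequence`, `seq.CreateSequenceList` and `AddSequence` from OST's `seq` module, which are not described in this section. The wrapper names `align_pair` and `align_many` are hypothetical.

```python
def align_pair(seq_a, seq_b):
    """Align exactly two sequences passed as two separate arguments."""
    # Imported lazily so the sketch can be loaded without OST installed.
    from ost.bindings import clustalw

    # Both arguments must have the same type (for example, both str).
    return clustalw.ClustalW(seq_a, seq_b)


def align_many(named_sequences):
    """Align two or more (name, residues) pairs through a SequenceList."""
    from ost import seq  # assumed helper module, see the note above
    from ost.bindings import clustalw

    seq_list = seq.CreateSequenceList()
    for name, residues in named_sequences:
        seq_list.AddSequence(seq.CreateSequence(name, residues))

    # seq2 keeps its default of None when a SequenceList is passed as seq1.
    return clustalw.ClustalW(seq_list)
```

Under these assumptions, `align_pair("ACDEFGH", "ACDFGH")` would use the two-sequence form, while `align_many([("s1", "ACDEFGH"), ("s2", "ACDFGH"), ("s3", "ACEFGH")])` would use the SequenceList form.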

Parameters:

Note

  • In the passed sequences, ClustalW converts lowercase letters to uppercase and changes every ‘.’ to ‘-’. OST converts every ‘?’ to ‘X’ before aligning the sequences with ClustalW.

  • If a sequence name contains spaces, only the part before the first space is used as the sequence name. To avoid surprises, remove spaces from sequence names before aligning.

  • Sequence names must be unique (ValueError exception raised otherwise).
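The notes above suggest cleaning names and residues before calling ClustalW. The following is a rough sketch in plain Python, independent of OST. The function names `prepare_names` and `normalize_residues` are hypothetical, and replacing spaces with underscores is only one possible way to remove them.

```python
def prepare_names(names):
    """Replace spaces in sequence names and reject duplicates."""
    # Assumption: an underscore is an acceptable substitute for a space.
    cleaned = [name.strip().replace(" ", "_") for name in names]
    seen = set()
    for name in cleaned:
        if name in seen:
            # Mirrors the documented behaviour of raising ValueError.
            raise ValueError(f"duplicate sequence name: {name!r}")
        seen.add(name)
    return cleaned


def normalize_residues(residues):
    """Apply the conversions from the notes: uppercase, '.' to '-', '?' to 'X'."""
    return residues.upper().replace(".", "-").replace("?", "X")
```

Checking uniqueness after the spaces are removed matters because two names that differ only after a space, such as "chain A" and "chain B", would both be cut to "chain" if left unchanged.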

ClustalW will accept only IUB/IUPAC amino acid and nucleic acid codes:

Residue  Name                      Residue  Name
-------  ------------------------  -------  ---------------------------
A        alanine                   P        proline
B        aspartate or asparagine   Q        glutamine
C        cysteine                  R        arginine
D        aspartate                 S        serine
E        glutamate                 T        threonine
F        phenylalanine             U        selenocysteine
G        glycine                   V        valine
H        histidine                 W        tryptophan
I        isoleucine                Y        tyrosine
K        lysine                    Z        glutamate or glutamine
L        leucine                   X        any
M        methionine                *        translation stop
N        asparagine                -        gap of indeterminate length
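A sequence can be checked against the table before alignment. The sketch below is plain Python and not part of OST. `ACCEPTED_CODES` and `find_unsupported` are hypothetical names, and the set of characters is copied from the table above.

```python
# Characters listed in the table above (note that J and O are absent).
ACCEPTED_CODES = frozenset("ABCDEFGHIKLMNPQRSTUVWXYZ*-")


def find_unsupported(residues):
    """Return (position, character) pairs that are not in the table."""
    # Lowercase input is uppercased first, since ClustalW does the same.
    return [
        (index, char)
        for index, char in enumerate(residues.upper())
        if char not in ACCEPTED_CODES
    ]
```

An empty list means every character appears in the table. A ‘.’ or ‘?’ is reported by this check because neither is in the table, even though the notes say both are converted before alignment.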