[J. Biochem. Vol. 129, pp. 653-664 (2001), JB Review; © 2001 by The Japanese Biochemical Society]


The Structure of Calpain

Hiroyuki Sorimachi*,1 and Koichi Suzuki

*Laboratory of Biological Function, Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo 113-8657; and Tokyo Metropolitan Institute of Gerontology, Tokyo 173-0015

Received March 2, 2001; accepted March 12, 2001

Recent rapid progress in genome and EST projects has identified an increasing number of gene products homologous to those previously identified by other methods. Calpain is no exception. At the time of writing, 83 genes from 23 living organisms have been identified in the database as encoding amino acid sequences with significant similarity to the protease domain of "conventional" calpain, which was first purified as a homogeneous protein in 1978. Progress in genome/EST projects has been so quick that there seems to be some confusion as to the identity of each calpain molecule. This review attempts to clarify the identity of all calpain homologues, to describe their common and differing features in terms of structure-function relationships, and to discuss the evolutionary process of calpain.

Key words: calpain, calcium, C2-domain, connectin/titin, EF-hand motif.




In 1964, Guroff and Guroff first described a unique proteinase in a soluble brain fraction (1), which was finally purified to homogeneity as an enzyme in 1978 by Ishiura et al. (2). This protease, after many twists and turns, is now called "calpain" (3, 4). Calpain [EC 3.4.22.17, clan CA family C2 (5)] is a Ca2+-requiring cysteine protease. In animals such as vertebrates (Table I) and nematodes, calpain forms a large gene family comprising more than 10 members (6-9). On the other hand, Drosophila and budding yeast have only 4 and 1 genes, respectively, encoding a calpain-like protease domain. Nevertheless, polypeptides showing significant similarity to the calpain protease domain have been found in all kinds of living organisms, including animals, plants, fungi, yeast, and bacteria. This strongly suggests that calpain-type proteases (or protease domains) have ubiquitous, distinct, and essential function(s) required for the maintenance of life. Indeed, defects in some calpain genes have been shown to be responsible for pathogenic/lethal phenotypes (Table I).

Various members of the "calpain superfamily," however, have structures that are rather diverged, probably because their physiological functions have been adapted and specialized to the cells in which they function. Some mammalian calpains such as hTRA-3/calpain5, calpain6, PalBH/calpain7, and SOLH, have structures that are highly conserved with respect to those found in lower eukaryotes such as Drosophila, nematodes, and fungi (Table I). In this review, the amino acid sequences of calpain superfamily members are compared in detail, and, together with recent knowledge about the 3D structure of m-calpain, an attempt is made to clarify the consensus and variant sequences.

Conventional/classical calpains

The best characterized members of the calpain superfamily, so far, are μ-calpain and m-calpain, which are now called "conventional" and "classical" calpains (6, 10-22). The word "calpain" itself should mean a papain-like cysteine protease that requires Ca2+ for its activity. Exemplifying the word "calpain," the primary structure of chicken calpain, which is now assigned as the μ/m-calpain large subunit, i.e., an intermediate type between μ- and m-calpains, has been revealed to have both a papain-like cysteine protease domain and a calmodulin-like Ca2+-binding domain in the same polypeptide chain (23, 24, Fig. 1).

The μ- and m-calpains consist of two distinct subunits, a larger ca. 80-kDa catalytic subunit and a smaller ca. 30-kDa regulatory subunit, forming a heterodimer structure. The smaller subunit (called "30K" after its molecular mass) is common to both μ- and m-calpains, but the larger subunits are different (called "μCL" and "mCL" standing for μ- and m-calpain large subunits, respectively). As shown in Fig. 1 (the top two structures), the small and large subunits can be divided into 2 (V and VI) and 4 (I to IV) domains, respectively, according to the structure.

Domain I is actually not a domain, but rather a single α-helix composed of ten-odd amino acid residues (see Fig. 2). This helix is very important for the stability and activation of some calpains, but several other members instead have unique sequences at their N-termini, including Zn-fingers, transmembrane sequences, etc. (Fig. 1). Most significantly, the N-termini of Trypanosoma brucei calpain-like proteases show similarity to calpastatin, although the identities are quite low (around 15%). The N-terminal domains of some nematode calpain-like proteases contain Gly clusters resembling domain V of 30K. These members are interesting when considering how calpain was assembled during evolution.

Domain III shows no significant amino acid sequence similarity to any other sequences in the database. Thus, its structure and functions have long been unknown. The 3D structure of m-calpain has now revealed that this domain resembles the C2-domain found in several Ca2+-regulated proteins such as protein kinase C and synaptotagmins (25, 26) (described later, see Fig. 2).

Domain IV is highly similar to domain VI in 30K, and has 5 EF-hand motifs in one domain. The 5th EF-hand motifs (EF-5s) of domains IV and VI interact with each other to form a heterodimer. Each EF-hand motif shows slight similarity to that of calmodulin. EF-hand containing proteins can be divided into 45 classes according to Kawasaki et al. (27). The 5EF-hand-containing proteins, the FEF or PEF (penta-EF-hand) family represented by calpain, are unique not only in that they contain an odd number of EF-hand motifs, but also in that the primary sequences are significantly distinct from those of calmodulin (6, 27, 28). Furthermore, FEF proteins are known to form homo- and/or hetero-dimers, and Ca2+-binding causes only slight changes to the structure (29).

Domain V of 30K contains clusters of Gly residues that contribute to its hydrophobic nature. This domain is not visible in the X-ray crystal structure, strongly suggesting that it has a very flexible structure that is not anchored to other parts of the calpain molecule (30).
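The domain organization described above can be summarized as a small data structure. The following is a minimal sketch only: the class and function names (Domain, Subunit, ef_hand_domains) are hypothetical and chosen for illustration, and the feature labels merely paraphrase the descriptions given in the text for conventional calpain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str     # Roman numeral used in the text
    feature: str  # short paraphrase of the description in the text


@dataclass(frozen=True)
class Subunit:
    name: str
    approx_mass_kda: int  # "ca." values quoted in the text
    domains: tuple


# Large catalytic subunit (domains I to IV), as described in the text.
LARGE = Subunit("large catalytic subunit", 80, (
    Domain("I", "N-terminal alpha-helix"),
    Domain("II", "papain-like cysteine protease domain"),
    Domain("III", "C2-like domain"),
    Domain("IV", "5 EF-hand Ca2+-binding domain"),
))

# Small regulatory subunit "30K" (domains V and VI).
SMALL = Subunit("small regulatory subunit (30K)", 30, (
    Domain("V", "Gly-clustering flexible domain"),
    Domain("VI", "5 EF-hand Ca2+-binding domain"),
))


def ef_hand_domains(*subunits):
    """Return the names of the domains annotated as EF-hand domains."""
    return [d.name for s in subunits for d in s.domains
            if "EF-hand" in d.feature]


# Domains IV and VI are the two that pair to form the heterodimer.
print(ef_hand_domains(LARGE, SMALL))  # ['IV', 'VI']
```

Listing the EF-hand domains this way simply restates that the heterodimer interface lies between domain IV of the large subunit and domain VI of 30K; it carries no information beyond the text.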

3D structure of calpain

The 3D structures of human and rat m-calpains in the absence of Ca2+ have recently been solved (30, 31). Apo-m-calpain has an oval disk-like shape, with each domain placed side by side almost on the same plane (Fig. 2). The 3D structure of m-calpain revealed a number of important characteristics, as follows:

i) Domains in the 3D structure virtually coincide with those presumed from the primary sequence.

ii) The protease domain is divided into two sub-domains, IIa and IIb, which contain the active site Cys-105 and His-262/Asn-286, respectively, indicating that the latency of calpain protease activity in the absence of Ca2+ is caused by a structural phenomenon, not by the pro-peptide inhibition found in other cysteine proteases such as cathepsins.

iii) The third domain consists of 8 β-strands, forming a β-sandwich structure with a topology similar to tumor necrosis factor α and the C2-domains. Although no primary sequence similarity can be found between calpain domain III and any of the C2-domains, the overall 3D-structural similarity strongly suggests that this domain is responsible for the Ca2+-dependent translocation of calpain to the membrane.

iv) Domains IV and VI, mainly by their C-termini, form a heterodimer structure that is very similar to the 3D structure of the domain VI homodimer reported previously (32, 33). It should be noted that domain I, the N-terminus of calpain, is in contact with EF-2 of domain VI in 30K, and that the interaction is broken either by Ca2+-binding to EF-2 or by autolysis of the N-terminus upon activation.

Newly arisen questions include how this inactive open structure is activated upon Ca2+-binding, and where the Ca2+-binding site most responsible for activation is located. For a long time it has been presumed that domain IV is responsible for the Ca2+-dependent activation of calpain, since μ- and m-calpains, which have domain VI of 30K in common and different domains IV, show different Ca2+-requirements for activity. The 3D-structure of m-calpain, however, shows that domain IV is at the opposite end from the active site. Furthermore, considering the similarity to the domain VI homodimer, it is very likely that the Ca2+-induced conformational change of the domains IV and VI heterodimer is small (30-34). Interestingly, our preliminary results (Hata, S. personal communication) strongly suggest that domain II also binds Ca2+. In other words, m-calpain binds Ca2+ over the entire molecule. Therefore, solving the 3D-structure of calpain in the presence of Ca2+ is an urgent and essential task for the future. The 3D structure of p94, which is active even in the absence of Ca2+, also holds a key to answering the above questions.

Calpain superfamily

Members of the calpain superfamily have diverged structures, but the primary sequences of the protease domains show significant similarities to each other, and form a family distinct from papains [clan CA family C1 (35)] or other cysteine proteases. The so-called "papain superfamily" includes clan CA families C1 and C2. Berti and Storer, however, pointed out the interesting fact that the papain superfamily is divided into three independent families, papain-type, bleomycin hydrolase (BLH)-type, and calpain-type (36). In bacteria and yeast, only the BLH- and calpain-types have been found. BLH-type bacterial proteases, such as Lactococcus helveticus PepC, however, are considered to be gene products transferred laterally from eukaryotes. On the other hand, the Porphyromonas gingivalis calpain-type protease, Tpr (37), has a sequence so diverged that the evolutionary distance should be more than 2.6 billion years, which corresponds to the period when prokaryotes and eukaryotes branched (36). Thus, it is possible to assert that calpain-type Tpr is the most ancestral gene product not only for calpains, but also for papain superfamily members.

In addition to protease domains, most members of the calpain superfamily have Ca2+-binding and/or probable Ca2+-binding domains, such as 5EF-hand, C2-like, and C2 domains. The SOL subfamily, which consists of Drosophila SOL (small optic lobes) homologues found in mammals and nematodes, does not seem to have an extra Ca2+-binding domain, but has an SOL homology domain (SOH) and several Zn-finger motifs. Yeast and bacterial calpain-like proteases found in Saccharomyces cerevisiae and P. gingivalis have the most diverged primary sequences and no obvious Ca2+-binding motifs.

Interestingly, some calpain superfamily members, such as mammalian calpain 6, Drosophila CG3692, some nematode calpains, and all trypanosome calpains, lack the conserved active site Cys, although the overall similarity is not so low. Further information about these molecules would shed light on other physiological functions of the calpain superfamily.

Comparison of the primary sequences of calpain

When the primary sequences are aligned as shown in Fig. 3, striking conservations can be found locally. Generally, highly conserved regions correspond to the α-helices and β-strands found in the 3D structure of m-calpain. In particular, conservation from the 1st β-strand to the end of domain IIa extends beyond species (Fig. 3A).

When the 3D structures of papain and m-calpain are compared, the overall topology and positions of α-helices and β-strands are highly conserved, and can be overlaid on each other. One loop between the 8th and 9th β-strands (the upper-most loop in Fig. 2) of m-calpain is significantly different from that in papain. This loop contains four strictly conserved Trp residues (Trp297, 304, 325, and 356; Fig. 3B). These Trp residues are located very close to each other in the 3D structure, suggesting that they are important for the formation or character of the calpain protease domain. In addition to the conserved Trp residues, the N-terminal and C-terminal short stretches containing acidic residues in this loop are highly conserved among various calpain molecules (Fig. 3B). This suggests that the Ca2+-requirement of the calpain protease domain can be ascribed to this region, which is not present in papain. This region actually overlaps the illusory 6th EF-hand motif once predicted from the primary structure. No obvious EF-hand structure was found at the predicted position in the 3D structure of m-calpain. It cannot be excluded that the conserved Asp and Glu residues (Asp360, 352, 361, and Glu354) in this region form a novel Ca2+-binding motif(s).

The C2-like domain III in some calpain homologues is most intriguing. In the 3D structure of m-calpain, this domain is characterized by an acidic loop between the 2nd and 3rd β-strands (Fig. 3C). Although the overall sequence similarity is not so high, especially in the most diverged calpain homologues, the 5th, 6th and 7th β-strands and the loop between the 1st and 2nd β-strands are highly conserved from humans to fungi (Fig. 3C). These portions should play important roles in the functions of domain III.

Another point that should be noted about domain III is the tandem arrangement of two domains III found in mammalian calpain 10 (Figs. 1 and 3C), and the C-terminal domain III downstream of the PBH domain found in some PalB homologues (38). Based on phylogenetic considerations, these distal domains III are in positions comparable to those of conventional domains III (Fig. 4C). Domain III of m-calpain is in close contact with domain II, and is considered to be very important in the regulation of activity (30, 31). Therefore, the positional relationship between the protease domain and the distal domain III in these calpain homologues may be a key to solving the regulatory mechanism of these calpains.

Although domain IV, the 5EF-hand domain, is highly conserved from schistosomes to humans, no calpain homologue with a 5EF-hand domain exists in nematodes or yeast (Figs. 1 and 4D). Since several substrates have been shown to bind to domain IV (39, 40), it is likely that this domain is important for substrate recognition by calpain. If so, in homologues lacking domain IV this function may be assumed by other protein-interacting domains, such as the C2-containing domain T, and, possibly, the SOH and PBH domains.

Conclusion and perspective

Three-dimensional studies of calpain have just begun. The 3D structure of m-calpain is an art of nature arising through long evolutionary processes, as in the case of every other protein. The 3D structure has provided answers to long-discussed questions, and given unexpected and valuable information for understanding the functions of calpain. At the same time, many new questions now arise. At present, no common function of calpain superfamily members from bacteria to humans has been identified. The ultimate goal of structure-function relationship studies is to discover the consensus functions, and also, functions that are shared in organisms with more than two calpains.

Another point that should be emphasized is that evolutionary considerations should be taken into account at every stage of calpain studies, as in all other biological studies. It should be remembered that calpain genes are the products of evolutionary combinations of several ancestral unit genes (for the phylogenetic relationship of each domain of the calpain superfamily, see Fig. 4), that is, genes for the C2-like, 5EF-hand, C2-containing T, SOH, PBH, Gly-clustering, calpastatin-like, Zn-finger, and transmembrane domains. In this sense, the most ancestral yeast Cpl1p and P. gingivalis Tpr are exciting subjects not only for functional studies but also for structural studies.


