Inteins are protein splicing elements that remove themselves from host proteins (exteins) during post-translational processing. Intein-mediated protein splicing does not require any exogenous enzymes or cofactors. Inteins are recognized as insertions within other genes and their protein products. They share four conserved splicing domain motifs (A, B, F and G) [1], [2], [3], [4], with many of the most highly conserved residues playing catalytic roles [1], [5], [6], [7]. Some inteins are chimeric proteins with a centrally located homing endonuclease domain containing four endonuclease motifs (C, D, E and H) [2], [3], [4]. The His at position 10 of intein Motif B (B:10) is the most conserved intein amino acid (aa) and is essential for splicing in all inteins previously tested [1], [5], [6], [8], [9], [10], [11]; however, it is an Asn in the Arthrobacter species FB24 (Arsp-FB24) Arth_1007 (DnaB) intein. Only this intein and the Thermococcus kodakaraensis KOD1 (Tko) CDC21-1 intein (and its orthologs) lack a His at B:10 [6], which calls into question whether these inteins are active and, if they are, whether they have evolved to use other residues to compensate for the lack of the catalytically essential His B:10.
Inteins have been divided into three classes based on specific signature sequences and protein splicing mechanisms [1], [10], [11]. Most inteins are Class 1 inteins, which splice themselves out of precursor proteins by a four-step mechanism [5], [7], [12]. The first step is an acyl shift at the intein N-terminal Ser, Thr or Cys that forms a linear (thio)ester intermediate, followed by a transesterification reaction that results in a branched intermediate. This branched intermediate is resolved by cyclization of the intein C-terminal Asn, which separates the intein from the ligated exteins. A standard peptide bond is then formed between the exteins by an acyl shift. Class 2 inteins begin with other residues, but their splicing motifs are otherwise similar to those of Class 1 inteins [6], [13]. Class 1 and Class 2 inteins splice by a mechanism that proceeds through a single branched intermediate [5], [7], [12], [13].
Class 3 inteins also lack the Class 1 N-terminal nucleophile, but have an additional class-specific WCT motif consisting of a Trp at intein Motif B position 12 (B:12), a Cys at intein Motif F position 4 (F:4) and a Thr at intein Motif G position 5 (G:5) [1]. Class 3 inteins splice by a mechanism that includes two branched intermediates, with Cys F:4 being the nucleophile and branch point for the Class 3-specific branched intermediate [1], [10], [11].
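The sequence signatures described above can be summarized as a small decision rule. The following Python sketch is only illustrative: the function names, the dictionary layout keyed by motif position, and the "Class 2 candidate" label are assumptions made for this example, not part of any published tool. It also assumes that the residues at the motif positions have already been identified by alignment. Sequence signatures alone do not establish the splicing mechanism, so the result is a provisional assignment that still requires experimental confirmation.

```python
# One-letter codes for the Class 1 N-terminal nucleophiles (Ser, Thr, Cys).
N_TERMINAL_NUCLEOPHILES = {"S", "T", "C"}

# Class 3 WCT signature: motif position -> required residue.
WCT_MOTIF = {"B:12": "W", "F:4": "C", "G:5": "T"}


def missing_wct_positions(motif_residues):
    """Return the WCT positions whose residue is absent or does not match."""
    return [
        position
        for position, required in WCT_MOTIF.items()
        if motif_residues.get(position) != required
    ]


def classify_intein(n_terminal, motif_residues):
    """Provisionally assign an intein class from sequence signatures only.

    n_terminal     -- one-letter code of the first intein residue
    motif_residues -- dict mapping motif positions (e.g. "F:4") to residues
    """
    # Class 1 is checked first because some Class 1 inteins also have Cys at F:4.
    if n_terminal.upper() in N_TERMINAL_NUCLEOPHILES:
        return "Class 1"
    if not missing_wct_positions(motif_residues):
        return "Class 3"
    # No N-terminal nucleophile and an incomplete WCT motif (illustrative label).
    return "Class 2 candidate"


# Hypothetical residues chosen for illustration; they do not describe a real intein.
example = {"B:12": "W", "F:4": "G", "G:5": "T"}
print(classify_intein("A", example))
print(missing_wct_positions(example))
```

Reporting the unmatched positions separately from the class label makes it easy to see which signature residue prevents a Class 3 assignment.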
The generally accepted assumption is that Class 2 and Class 3 inteins arose from mutation of the N-terminus of a Class 1 intein. To date, every non-conservative experimental substitution of a Class 1 intein N-terminal residue (that is, any change to a residue other than Ser, Thr or Cys) has blocked splicing, so these naturally mutated inteins most likely failed to splice or spliced very poorly unless and until further mutations restored robust splicing. Class 2 inteins solved this problem by overcoming the barrier, present in all Class 1 inteins tested, to direct attack on the amide bond at the N-terminal splice junction by the C-terminal nucleophile (Step 1 in Class 2 inteins). Class 3 inteins solved this problem by having Cys F:4 attack the amide bond at the intein N-terminus to form the class-specific branched intermediate (Step 1 in Class 3 inteins), which then forms the standard Class 1 intein branched intermediate after a transesterification reaction.
To date, Class 2 inteins have only been found in KlbA genes [6], [13]. Class 3 inteins were previously found to be monophyletic, while other helicase inteins, phage-derived inteins and Class 1 inteins having Cys at F:4 were polyphyletic [11]. This led to the hypothesis that all Class 3 inteins arose from a phage-encoded progenitor intein that lost its N-terminal Ser, Thr or Cys [11]. Based on mutagenesis studies of modern-day Class 1 inteins, these early Class 3 inteins would not have spliced well, if at all. They could have been retained in the population because the extein function was provided by other phage co-infecting the cell or by the host. Eventually, these early Class 3 inteins accumulated second-site mutations that enabled them to splice as efficiently as standard Class 1 inteins, as exemplified by the Class 3 Mycobacteriophage Bethlehem (MP-Be) DnaB intein [1], Deinococcus radiodurans (Dra) Snf2 intein [10] and Mycobacteriophage Catera (MP-catera) Gp206 intein [11], which all spliced efficiently in a model precursor consisting of the intein flanked by the Escherichia coli Maltose Binding Protein (M) and the ΔSal fragment of Dirofilaria immitis paramyosin (P).
The Arsp-FB24 DnaB intein, which was annotated to be of phage origin [14], is a Class 3 intein based on phylogenetic analysis, and it fulfils all of the sequence criteria listed above for Class 3 inteins except that the catalytically essential Cys F:4 is absent [1], [10], [11]. This suggests that it is either an inactive intein or not a Class 3 intein.