InBase has expanded since the 2000 NAR Database Issue (6). The InBase home page has been reorganized to move background information into a separate section called Intein Basics. This section is suitable for students and includes downloadable PDF files of protein splicing figures. An intein-specific BLAST server is now available, and the amino acid sequence is given on each individual intein page. The Identification of Intein section reflects the growing list of intein polymorphisms. A new splicing mechanism for inteins lacking an N-terminal nucleophile has been added to the Splicing Mechanism section (7).
The InBase BLAST server was added in response to requests from researchers and genome sequencing groups, since intein identification is not always trivial. Searches of general sequence databases often yield hits with very low scores and poor (non-significant) probability values because of (i) the small size of mini-inteins (as few as 134 amino acids), (ii) the low level of sequence similarity even in conserved motifs and (iii) the high degree of polymorphism in conserved splice junction residues. Limiting the search to intein sequences shrinks the search space, so the same alignments can reach significant scores and P-values. Since inteins are predominantly found in extein active sites and cofactor or substrate binding pockets, identification of inteins can potentially help locate these elements in uncharacterized proteins.
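The effect of database size on significance can be illustrated with the standard Karlin-Altschul relation, E = K·m·n·exp(−λS), where m is the query length, n the database length and S the raw alignment score. The sketch below is only an illustration: the score, the two database sizes and the function names are hypothetical, the λ and K values are typical of gapped BLOSUM62 searches rather than InBase settings, and real BLAST additionally applies effective-length corrections that are ignored here.

```python
import math

def expected_hits(raw_score, query_len, db_len, lam=0.267, k=0.041):
    """Karlin-Altschul E-value: E = K * m * n * exp(-lambda * S).

    lam and k are assumed, illustrative parameters (typical of gapped
    BLOSUM62 searches); they are not taken from the InBase server.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)

def p_value(e_value):
    """Probability of at least one chance hit scoring this well or better."""
    return 1.0 - math.exp(-e_value)

# Hypothetical numbers, chosen only for illustration.
query_len = 134            # query the size of a small mini-intein
raw_score = 70             # a modest raw alignment score
general_db = 200_000_000   # residues in a large general database (assumed)
intein_db = 50_000         # residues in an intein-only database (assumed)

e_general = expected_hits(raw_score, query_len, general_db)
e_intein = expected_hits(raw_score, query_len, intein_db)

print(f"general database: E = {e_general:.3g}, P = {p_value(e_general):.3g}")
print(f"intein-only database: E = {e_intein:.3g}, P = {p_value(e_intein):.3g}")
```

Because E scales linearly with database length, the same alignment score that is indistinguishable from chance in the large database becomes significant in the small one under these assumptions.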
Several important advances have been reported in the protein splicing field. Most notable was the discovery of a non-canonical protein splicing pathway for inteins beginning with Ala, instead of Ser, Thr or Cys (7). The first crystal structure of an intein precursor shed light onto the amino acids that assist catalysis and the need for conformational changes to align nucleophiles during the sequential steps in the protein splicing pathway (8). Numerous protein engineering applications take advantage of the C-terminal α-thioester formed on target proteins purified from intein vectors (2,3,9–11), such as the commercially available IMPACT™ system (NEB). Papers using intein vectors for protein purification are not included in the bibliography unless they add to our understanding of intein technologies. A green (APP) label in the bibliography section highlights application papers.