Bioconjugation is a burgeoning field of research. Novel methods for the mild and site-specific derivatization of proteins, DNA, RNA, and carbohydrates have been developed for applications such as ligand discovery, disease diagnosis, and high-throughput screening. These powerful methods owe their existence to the discovery of chemoselective reactions that enable bioconjugation under physiological conditions—a tremendous achievement of modern organic chemistry. Here, we review recent advances in bioconjugation chemistry. Additionally, we discuss the stability of bioconjugation linkages—an important but often overlooked aspect of the field. We anticipate that this information will help investigators choose optimal linkages for their applications. Moreover, we hope that the noted limitations of existing bioconjugation methods will provide inspiration to modern organic chemists.
The enormous complexity and diversity of life present a formidable challenge to scientists attempting to reveal its chemical basis. The discovery that genes contain the information required to generate proteins, the molecules that orchestrate biological processes, provided a universal axiom that enabled countless discovery-based investigations [1–3]. Deciphering the genetic composition of various organisms was a logical next step towards understanding biology. The ensuing whole-genome sequencing projects have yielded a wealth of information [4,5].
The initial enthusiasm over the attainment of complete genetic information about various organisms has, however, been tempered by the realization that this information is of limited use without knowledge of the function of the encoded proteins. Elucidation of the functions of other biomolecules, such as RNA and carbohydrates, is likewise imperative. “Bioconjugation”, which refers to the covalent derivatization of biomolecules, provides a means to attain this goal [6–8]. This review focuses on modern methods for bioconjugation, and delineates both imperatives and means for making useful bioconjugates. We restrict our analysis to wild-type proteins composed of the 20 amino acids specified by the genetic code, or close analogues thereof. Strategies involving the addition of an exogenous domain and its subsequent modification have been reviewed elsewhere [9–14].
Proteins and other biopolymers regulate and perform biological functions by binding to ligands. Accordingly, discovering and characterizing the natural ligands of biopolymers is crucial to understanding biological processes. A promising approach for ligand discovery involves appending synthetic small molecules to biomolecules of interest, where they function as probes that report on ligand binding. Such probes include fluorescent molecules [16,17], biotin [18,19], and NMR probes. The ability to screen large numbers of potential ligands rapidly is highly desirable. An especially promising “high-throughput” approach involves the introduction of nonnatural functional groups into biomolecules, followed by site-specific immobilization on surfaces via a chemoselective reaction that occurs exclusively at the nascent appendage, as in Fig. (1). The immobilized biomolecule can be exposed subsequently to various molecules to identify ligands. DNA microarrays [21–23] and protein microarrays are important examples of this approach.
Small molecules appended to biomolecules can serve as probes for rigorous biochemical analyses. For example, Förster resonance energy transfer (FRET) can be used to generate signals that are sensitive to molecular conformational changes in the 1–10 nm range. A typical FRET experiment entails attachment of a pair of fluorescent molecules to different regions of a biomolecule. One of these fluorophores serves as a “donor” by transferring energy nonradiatively to the other fluorophore, which serves as an “acceptor”. Subsequently, the acceptor emits radiation at its characteristic emission frequency, thereby reporting on the distance between the donor and acceptor. FRET has been used to characterize protein folding, RNA folding [27,28], and biochemical reactions [29,30]. Modern single-molecule fluorescence methods have elevated FRET-based approaches to an unprecedented level of specificity [31,32].
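The distance sensitivity described above can be illustrated with the standard Förster relation, E = 1/(1 + (r/R0)^6), which is not stated in the text but is the usual way of relating transfer efficiency E to the donor–acceptor distance r. The sketch below is only illustrative: the function name is hypothetical, and the Förster radius R0 = 5 nm is an assumed value (real values depend on the particular fluorophore pair).

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Energy-transfer efficiency from the Förster relation E = 1 / (1 + (r/R0)^6)."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


# Assumed (hypothetical) Förster radius; it varies with the donor-acceptor pair.
R0_NM = 5.0

# Scan distances across the 1-10 nm range mentioned in the text.
for r in (1.0, 2.5, 5.0, 7.5, 10.0):
    print(f"r = {r:4.1f} nm  ->  E = {fret_efficiency(r, R0_NM):.3f}")
```

Because the efficiency falls off with the sixth power of distance, it changes most steeply near r = R0, which is why a FRET pair is typically chosen so that the distances of interest bracket its Förster radius.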
Non-fluorescent small molecules are also employed as mechanistic probes. For example, biotin has been attached to a K⁺-ion channel, enabling the conformational changes accompanying channel opening to be mapped by measuring accessibility of the biotin to exogenous avidin. In another example, a nitrile group was introduced into an enzyme as a vibrational probe, and its stretching frequency was a sensitive reporter of the electrostatic environment within the enzymic active site.
Qualitative and quantitative detection of analytes in clinical samples is crucial for the early diagnosis of disease. The complexity and heterogeneity of clinical samples present a challenging environment for the detection of individual molecules. Chromatographic purification of analytes prior to analysis is time-consuming and labor-intensive, and hence impractical. Accordingly, chemical and immunological methods have become favored for medical diagnoses.
Clinical chemistry exploits an intrinsic physicochemical property of the analyte to generate a unique signal, thus circumventing analyte purification. Examples of this approach include spectrophotometric detection of metal ions and chromogenic and fluorogenic substrate-based assays for characterizing enzymes of interest. Clinical chemistry approaches are limited to special cases because many analytes lack a unique signal-generating property. Moreover, clinical chemistry approaches are often not sensitive enough to be useful in clinical regimes.
In comparison to chemical methods, immunological approaches are often more sensitive. The high specificity of antibody–antigen interactions obviates the need for sample purification. Moreover, since antibodies can be generated against almost any analyte, this method is widely applicable.
Traditional diagnostic methods rely on lengthy biochemical protocols and specialized laboratory equipment, which limits their applicability. There is an urgent need to develop reusable biosensors for economical and rapid detection of analytes that would be usable in locations far removed from a laboratory setting, such as in the office of a medical doctor or in a remote geographical location. Most biosensors consist of biomolecules attached to surfaces via robust bioconjugation linkages. For example, a commercially available glucose sensor has been developed in which glucose oxidase is immobilized on an electrode surface. The immobilized enzyme oxidizes glucose, generating hydrogen peroxide, which is recorded as a digital signal. This device is used to monitor glucose levels in diabetes patients. Some biosensor applications employ optical techniques such as surface plasmon resonance (SPR) to detect binding of analytes to biomolecules immobilized on a surface. SPR is used to measure binding of ligands, and yields accurate binding constant values [38,39]. SPR detection does, however, require expensive instrumentation. A more practical and still highly sensitive detection method based on the orientational behavior of liquid crystals on nanostructured surfaces is demonstrating immense promise [40–43].
The diagnostic methods discussed above are limited to cases wherein the nature of the disease allows for the preparation of clinical samples. In many cases, sample preparation is unfeasible, and the diagnosis needs to be performed directly inside the body. Methods such as magnetic resonance imaging (MRI) and radioimaging are employed in such situations.
Contrast agents are used to improve signal-sensitivity in MRI. Gd(III) complexes are effective contrast agents [44–46]. Antibodies conjugated to Gd(III) complexes have been used for in vivo targeting. Other contrast agents such as magnetite have also been conjugated to antibodies for similar applications.
Radioimaging is another powerful method for in vivo imaging. Isotopes of iodine (that is, ¹²³I and ¹³¹I) are commonly used radionuclides. The iodo group is especially convenient because it can be introduced readily into the tyrosine residues of proteins, but the observation of in vivo deiodination raises concerns. Metal nuclides such as ⁹⁹ᵐTc and ¹¹¹In are useful alternatives, and can be attached to proteins via organic chelating agents such as EDTA.
Positron emission tomography (PET) continues to grow as an imaging tool. PET is used often in clinical oncology, as well as for the clinical diagnosis of certain diffuse brain diseases such as those causing various types of dementias. PET is also an important research tool to map normal human brain and heart function. PET relies on gamma rays emitted indirectly by a positron-emitting radionuclide, usually an [¹⁸F]fluoro group attached to glucose. The conjugation of ¹⁸F to proteins is a promising area for future development.
The conjugation of polyethyleneglycol (PEG) molecules to proteins is a well-established technique. Commonly referred to as “PEGylation”, attachment of PEGs can endow proteins with many desirable attributes, such as enhanced water solubility, reduced immunogenicity, improved circulating half-life in vivo, enhanced proteolytic resistance, reduced toxicity, and improved thermal and mechanical stability. PEGylation has been reviewed extensively [53–55], and will not be discussed in detail here.
Immobilized enzymes are used as industrial catalysts [56,57]. The first commercial application of immobilized enzymes was the resolution of amino acids by an aminoacylase. Applications in the food industry include use of fumarase to catalyze the hydration of fumaric acid to malic acid. The pharmaceutical industry employs immobilized enzymes for the synthesis of drugs. For example, immobilized penicillin amidase is used in the preparation of 6-aminopenicillanic acid. Applications of bioconjugation are also prevalent in the chemical industry. One prominent example is the use of immobilized nitrile hydratase for the production of acrylamide from acrylonitrile.
Traditional strategies for covalent bioconjugation preclude control over the regiochemistry of reactions, producing heterogeneous reaction products, as in Fig. (1A). Poor control over the site of modification often results in loss of the biological function of the target biomolecule. In contrast, novel methods of bioconjugation are highly site-specific and cause minimal perturbation to the active form of the biomolecule. Moreover, biomolecules immobilized site-specifically can possess higher ligand binding ability [62–64,41], as in Fig. (1B), and display stronger spectral polarization. Thus, site-specific bioconjugation is preferable to random bioconjugation. Common linkages for site-specific bioconjugation rely on cysteine or lysine residues. Newer methods target nonnatural functional groups, including olefins via metathesis [67,68]. Chemical reactions that target tryptophan and tyrosine have been reviewed elsewhere.
Thiolates (though not thiols) are potent nucleophiles in aqueous solutions. Accordingly, the derivatization of proteins via the thiolate group of a cysteine residue is a popular method of bioconjugation. As cysteine is the second least common amino acid in natural proteins, site-specific conjugation can often be performed at a unique cysteine residue.
Typical thiol-reactive functional groups include iodoacetamides, maleimides, and disulfides, as in Fig. (2). Iodoacetamides (Fig. (2A)) were used in classic experiments for determining the presence of free cysteines in proteins. More recently, iodoacetamido groups have been used extensively for labeling proteins with fluorophores, PEGylation, and protein immobilization. Chloroacetamides appear to exhibit even greater specificity than iodoacetamides for cysteine residues.
Like iodoacetamides, maleimides are commonly used electrophiles for thiol-mediated bioconjugation [6,7,75,8,74]. Thiolates undergo a Michael addition reaction with maleimides to form succinimidyl thioethers (Fig. (2B)). A problematic and underappreciated aspect of maleimide conjugates is the susceptibility of their imido groups to spontaneous hydrolysis, which produces a heterogeneous mixture of products. Both molybdate and chromate have been shown to catalyze the hydrolysis of an imido group near neutral pH, providing a means to decrease the heterogeneity of bioconjugates derived from maleimides.
The thiol-selectivity of iodoacetamides and maleimides is compromised at high concentrations of the reagents, as nucleophilic side chains of amino acid residues such as histidines and lysines can be modified covalently. In contrast, disulfide reagents react selectively with thiols, as in Fig. (2C). Disulfides are, however, susceptible to reduction by biological reducing agents, like glutathione. Hence, the use of disulfides is limited to in vitro applications, such as the crosslinking [77,78] and immobilization of peptides and proteins. Thiol–disulfide interchange is also the basis of an innovative tethering method that enables the identification of small-molecule fragments that bind to specific regions of a target protein.
Amide bonds have a half-life of ca. 600 years in neutral solution at 25 °C. This extraordinary stability makes amide linkages highly attractive for bioconjugation. The random introduction of amide linkages in biomolecules is trivial. For example, a protein can be treated with a small molecule or surface displaying an activated ester (e.g., an N-hydroxysuccinimidyl ester) to form amide bonds with the amino groups on lysine side chains and the N terminus [82,41]. In contrast, the site-specific generation of amides is challenging. Native chemical ligation and the Staudinger ligation are two modern approaches for generating amide linkages at a specific site in a protein.
In native chemical ligation, an N-terminal cysteine residue reacts with a thioester to undergo transthioesterification followed by a rapid S→N acyl transfer to form an amide, as in Fig. (3). This reaction is a powerful tool for peptide ligation and hence protein synthesis [84–88]. Expressed protein ligation is an extension of native chemical ligation [89–91]. In this method, a target protein is expressed as a fusion protein with an intein, a protein domain that catalyzes the formation of a thioester at the C terminus of the target protein, as in Fig. (4). The protein–intein fusion proteins are treated with peptides containing an N-terminal cysteine residue to effect native chemical ligation, as in Fig. (3). Surfaces displaying cysteines were treated with protein–intein thioesters to perform site-specific protein immobilization via amide bonds. Using a similar approach, fluorescent molecules were conjugated to specific sites in proteins. Furthermore, proteins were biotinylated using expressed protein ligation, and then used for high-throughput proteomic analyses. An undesirable aspect of native chemical ligation and expressed protein ligation is the introduction of a residual thiol at the site of bioconjugation, which can be a focal point for side reactions [95–97]. Chemical desulfurization approaches provide a solution to this problem, but are not applicable if the protein contains other cysteine residues, which would also be desulfurized. Hydrazine nucleophiles react with protein–intein thioesters without installing a residual reactive group, and enable a functional group or surface to be appended at the C terminus of a protein, as in Fig. (4).
The Staudinger ligation provides another solution to the cysteine limitation [101,83]. This conjugation method is based on the venerable Staudinger reaction, in which an azide is reduced to an amine by a phosphine [102,103]. The Staudinger ligation employs a phosphine that also serves as an acyl donor. The phosphorus first attacks the azide to form an iminophosphorane with the concomitant liberation of nitrogen gas; the iminophosphorane nitrogen is then acylated to form an amidophosphonium salt that hydrolyzes to yield the amide [104,105]. One version of the Staudinger ligation leaves a phosphine oxide in the amide product, as in Fig. (5A) [106,107,104]. Another version, the “traceless” Staudinger ligation, employs a phosphinothioester that yields an acyclic amidophosphonium salt, resulting in an amide product that lacks the phosphine oxide moiety or other residual atoms, as in Fig. (5B) [108–111,83,105,112].
The Staudinger ligation is used often for bioconjugation. The initial uses were with azido-containing carbohydrates introduced onto cell surfaces by biosynthesis, and enabled quantitative measurements by flow cytometry [106,113]. Subsequently, the Staudinger ligation has been used for N-glycopeptide synthesis. In addition, an azido group has been installed into a protein by using the methionyl-tRNA synthetase of Escherichia coli activated with γ-azidohomoalanine, and then subjected to Staudinger ligation with a peptide. Azido groups have also been installed into proteins by diazo transfer. The Staudinger ligation has been used for the rapid and site-specific immobilization of peptides and proteins [117,64,118,100]. Water-soluble reagents for the traceless version avail new possibilities, such as integration with expressed protein ligation. The site-specific labeling of DNA by fluorescent molecules has also been performed by Staudinger ligation. A phosphinothiol that mediates the traceless Staudinger ligation also reacts with S-nitrosothiols to generate bis-conjugates. Finally, a gentle phosphine-mediated means to convert an azido group into a diazo compound, a ready precursor to a carbene, provides the potential for random crosslinking to biomolecules, as in Fig. (5C).
The special reactivity of squarates has been exploited for bioconjugation (Fig. (6)) [122–124]. The reaction of two amino groups with a squarate results in their conjugation via two vinylogous amides. Notable advantages of this conjugation method include the small size of the squarate, and the greatly reduced rate of its reaction with the second amino group, limiting the undesirable synthesis of homodimers.
The facile synthesis of carbon–nitrogen double bonds via condensation of nitrogen bases with aldehydes and ketones in aqueous solutions at neutral pH renders them attractive for bioconjugation. Hydrazones (C=N–N) are generated when the nitrogen base is a hydrazine, as in Fig. (7). Oximes (C=N–O) are formed when the nitrogen base is an alkoxyamine. Both hydrazones and oximes are significantly more stable than are simple imines (C=N), the products of condensation of amines with aldehydes or ketones. Anilines are especially effective catalysts of hydrazone and oxime formation [125–128].
Carbohydrates are especially amenable to modification with carbon–nitrogen double bonds, as their hydroxyl groups can be oxidized readily into aldehydes. Alternatively, ketones can be introduced into cell-surface carbohydrates by biosynthesis [130,131]. Carbohydrates immobilized via oxime linkages have been used to generate carbohydrate microarrays.
There are numerous examples in the literature of hydrazone and oxime conjugates of oligonucleotides. For example, acylhydrazone linkages have been used for the immobilization of aldehydic oligonucleotides on surfaces displaying acylhydrazines. Additionally, peptide nucleic acid–peptide conjugates have been generated using oxime conjugation.
Peptide microarrays generated by immobilizing peptides via acylhydrazone linkages enable the sensitive detection of antibodies in blood samples [136,137]. Peptides and small molecules have been immobilized via oxime linkages onto glass slides displaying aldehydes, and the resulting microarrays used for protein binding and cell-adhesion assays. Peptide fragments bearing aminooxy functional groups were incubated with a polyaldehyde template to generate large protein-like molecules containing multiple oxime linkages [139,140]. The chemoselectivity of oxime formation has been used to assemble a transcription factor-related protein that is not readily accessible by recombinant DNA technology. A conceptually related approach was used to synthesize glycodendrimers appended with an antigen [142,143]. Finally, oxidative deamination mediated by pyridoxal 5′-phosphate can be used to generate an aldehyde or ketone at the N terminus of some peptides and proteins [144,145].
Although hydrazones and oximes are common conjugates, both are labile to spontaneous hydrolysis. The hydrolytic stabilities of isostructural alkylhydrazones, acylhydrazones, and an oxime were examined at pD 5.0–9.0. The hydrolysis of each adduct was found to be catalyzed by acid. Rate constants for oxime hydrolysis were nearly 10³-fold lower than those for the hydrazones; a trialkylhydrazonium ion (formed after condensation) was even more stable than the oxime. These data led to a general mechanism for the hydrolysis of C=N–X linkages. There are several important messages from this work. First, alkylhydrazones and acylhydrazones should not be used for bioconjugation, as their half-lives are only an hour or so under physiological conditions. Second, oximes are far more stable than hydrazones, having half-lives close to a month. Finally, efforts to develop a gentle means to condense a trialkylhydrazine with an aldehyde or ketone are worthwhile.
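To make these half-lives concrete, the following minimal sketch estimates the fraction of a conjugate that survives a one-day experiment. It assumes simple first-order hydrolysis at fixed pH, takes "an hour or so" as 1 h and "close to a month" as 30 days, and uses a hypothetical function name; none of these specifics come from the study cited above.

```python
def fraction_intact(t_hours: float, half_life_hours: float) -> float:
    """Fraction of linkage remaining after t_hours, assuming first-order decay."""
    return 0.5 ** (t_hours / half_life_hours)


HOURS_PER_DAY = 24.0

# Half-lives are rounded from the text; the exact values are assumptions.
half_lives_hours = {
    "hydrazone (assumed 1 h)": 1.0,
    "oxime (assumed 30 d)": 30.0 * HOURS_PER_DAY,
}

# Compare how much of each conjugate survives a 24-h incubation.
for label, t_half in half_lives_hours.items():
    remaining = fraction_intact(24.0, t_half)
    print(f"{label}: {remaining:.3g} of the conjugate intact after 24 h")
```

Under these assumptions a hydrazone conjugate is essentially gone within a day, whereas an oxime conjugate is still almost entirely intact, which is the practical content of the first two messages above.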
The discovery of the rate acceleration availed by Cu(I) has made the Huisgen 1,3-dipolar azide–alkyne cycloaddition one of the most useful reactions for bioconjugation [147–151]. This cycloaddition results in the irreversible formation of a 1,4-disubstituted[1,2,3]triazole linkage, as in Fig. (5D). The reaction has been used in an extraordinary range of contexts, including labeling proteins with small molecules [152,99,153,154], immobilizing proteins and peptides [155,118], proteomics applications, immobilizing carbohydrates, functionalizing DNA [155,157], and decorating virus particles and bioactive polymers with fluorescent molecules [158,159]. A Ru(II) catalyst leads to the 1,5-disubstituted[1,2,3]triazole, which mimics a cis (that is, E) peptide bond, as in Fig. (5E).
The Cu(I)-catalyzed version of the Huisgen cycloaddition can cause cytotoxicity and protein precipitation due to the Cu(I) ion [156,99]. Moreover, the reaction rates are slow, precluding its use for studying cellular processes. To overcome these drawbacks, several groups have turned to a reaction discovered in the 1950s [162,163] and exploited latterly, using the ring strain of a cyclooctyne group to enable the Huisgen 1,3-dipolar azide–alkyne cycloaddition to proceed rapidly without a catalyst [165–167], as in Fig. (5F).
Another cycloaddition reaction, the Diels–Alder reaction, between a diene on a peptide and a dienophile on a glass surface has been used for peptide immobilization. A similar approach was employed for immobilizing carbohydrates onto glass slides displaying hydroquinone functional groups. The inverse–electron-demand Diels–Alder reaction between a tetrazine and trans-cyclooctene is especially rapid and has much potential for bioconjugation.
Of the functional groups described above, the azido group is foremost in having an intrinsic reactivity that is versatile though relatively chemoselective under physiological conditions (Fig. (5)) [171,172]. Accordingly, the Staudinger ligation and Huisgen 1,3-dipolar azide–alkyne cycloaddition have gained special favor amongst chemical biologists. There are, however, two limitations to consider.
The first limitation is the relatively low chemoselectivity of azide coreactants. The oxidizing extracellular environment is electron-poor, and hence replete with electrophiles. Prevalent there are disulfide bonds and singlet oxygen, which can react rapidly with the phosphines of the Staudinger ligation. Conversely, the reducing environment of the cytosol is electron-rich and awash with nucleophilic thiolates that can attack cyclooctyne and its congeners (including trans-cyclooctene). Although these bioconjugation reactions are compromised by coreactant promiscuity, extracellular Huisgen cycloadditions avoid the most calamitous of side reactions.
The second limitation is the rather modest reaction rate constants of azide-mediated conjugation reactions. The second-order rate constant for the fastest known Staudinger ligation is 7.7 × 10⁻³ M⁻¹s⁻¹. That for the most rapid Huisgen cycloaddition is 2.3 M⁻¹s⁻¹. What do these rate constants mean for a chemical biologist? Consider a bioconjugation reaction between equimolar reactants A and B to form A–B with second-order rate constant k:
$$\frac{d[\mathrm{A{-}B}]}{dt} = k[\mathrm{A}][\mathrm{B}] = k[\mathrm{A}]^{2}$$
Integrating the rate equation gives
$$[\mathrm{A}]_{t} = \frac{[\mathrm{A}]_{t=0}}{1 + kt[\mathrm{A}]_{t=0}}$$
and the yield of conjugate, $[\mathrm{A{-}B}]_{t}/[\mathrm{A}]_{t=0}$, is
$$\frac{[\mathrm{A{-}B}]_{t}}{[\mathrm{A}]_{t=0}} = \frac{kt[\mathrm{A}]_{t=0}}{1 + kt[\mathrm{A}]_{t=0}}$$
If attainable concentrations of $[\mathrm{A}]_{t=0} = [\mathrm{B}]_{t=0} = 1$ µM are allowed to react for a reasonable time of t = 1 h without any side reactions, then the fastest known Staudinger ligation and Huisgen cycloaddition would provide A–B yields of 0.003% and 0.8%, respectively. For comparison, we note that highly chemoselective enzyme-mediated conjugation reactions can occur with a second-order rate constant of 2.7 × 10⁶ M⁻¹s⁻¹, which would provide an A–B yield of >90%. This value provides a benchmark for the development of rapid but still chemoselective reactions for biological contexts.
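The yields quoted above can be checked numerically. The sketch below assumes the integrated second-order rate law for equimolar reactants, for which the fractional yield is kt[A]₀/(1 + kt[A]₀), and uses the rate constants, concentration, and reaction time given in the text; the function and variable names are illustrative only.

```python
def conjugate_yield(k: float, a0: float, t: float) -> float:
    """Fractional yield of A-B for equimolar A and B (second-order, no side reactions).

    k  : second-order rate constant in M^-1 s^-1
    a0 : initial concentration of each reactant in M
    t  : reaction time in s
    """
    x = k * a0 * t
    return x / (1.0 + x)


A0 = 1e-6        # 1 µM of each reactant, as in the text
T = 3600.0       # 1 h, as in the text

# Rate constants (M^-1 s^-1) quoted in the text.
rate_constants = {
    "Staudinger ligation": 7.7e-3,
    "Huisgen cycloaddition": 2.3,
    "enzyme-mediated conjugation": 2.7e6,
}

for name, k in rate_constants.items():
    print(f"{name}: {100.0 * conjugate_yield(k, A0, T):.3g}% yield")
```

Because the yield depends only on the dimensionless product kt[A]₀, a slow reaction can be rescued only by raising the concentration or extending the reaction time by the same factor by which its rate constant falls short.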
Bioconjugation is being applied in research laboratories, industrial facilities, and medical clinics. Choosing the optimal linkage for a particular application is crucial. One imperative is the ease of generating the desired bioconjugate quickly under physiological conditions. Another is the stability of the bioconjugate during the course of its use. These imperatives present interesting challenges of substantial importance. As a consequence, new conjugation modalities are being pursued with vigor.
Work on bioconjugation in our laboratory is supported by grant NIH GM044783 and the Materials Research Science and Engineering Center at the University of Wisconsin–Madison (NSF DMR-0520527).