Protein microarray is an emerging technology that provides a versatile platform for characterization of hundreds of thousands of proteins in a highly parallel and high-throughput way. Two major classes of protein microarrays are defined to describe their applications: analytical and functional protein microarrays. In addition, tissue or cell lysates can also be fractionated and spotted on a slide to form a reverse-phase protein microarray. While the fabrication technology is maturing, applications of protein microarrays, especially functional protein microarrays, have flourished during the past decade. Here, we will first review recent advances in protein microarray technologies, and then present a series of examples to illustrate the applications of analytical and functional protein microarrays in both basic and clinical research. The research areas will include detection of various binding properties of proteins, study of protein posttranslational modifications, analysis of host-microbe interactions, profiling of antibody specificity, and identification of biomarkers in autoimmune diseases. As a powerful technology platform, it would not be surprising if protein microarrays become one of the leading technologies in the proteomic and diagnostic fields in the next decade.
Microarray technology is a term that refers to the miniaturization of thousands of assays on one small plate. This concept was developed from an earlier concept called ambient analyte immunoassay, which was first introduced by Roger Ekins in 1989. In the following decade, this concept was successfully transformed into the DNA microarray, a technology that determines mRNA expression levels of thousands of genes in parallel. However, this DNA microarray technology possesses some limitations because mRNA profiles do not always correlate with protein expression (Gygi et al., 1999; Kopf and Zharhary, 2007; Zhu and Snyder, 2001). More importantly, proteins are the major driving force in almost all cellular processes. Therefore, protein microarrays were developed as a high-throughput tool to overcome the limitation of DNA microarrays and to provide a direct platform for protein function analyses.
Immunoassays, the first form of protein microarray, take advantage of highly specific antigen-antibody recognition to build a protein detection system. The expansion of the capability of conventional immunoassays to antibody arrays enabled a parallel and multiple detection system using a small amount of sample (Haab, 2005; Kopf and Zharhary, 2007). Moreover, this technology has high sensitivity and good reproducibility in quantitative assays. The sensitive and reliable performance of antibody arrays is a valuable advantage in studying complex biological samples.
Around the same time, another type of protein microarray was developed via the immobilization of purified proteins on glass slides. To distinguish this type of array from antibody arrays, protein microarrays are divided into two classes: analytical and functional (Chen and Zhu, 2006). Unlike antibody arrays (analytical microarrays), functional protein microarrays are made by spotting all of the proteins encoded by an organism and therefore are useful for characterization of protein functions, such as protein-protein binding, biochemical activity, enzyme-substrate relationships, and immune responses (Chen and Zhu, 2006; Poetz et al., 2005). More recently, a so-called reverse-phase array was developed, providing an alternative format to the analytical microarrays in which tissue/cell lysates (or fractionated lysates) are used to form such an array (Poetz et al., 2005). An overview of these three categories of protein microarrays is given in Figure 1.
The most representative model of analytical protein microarrays is the antibody array. The first model to demonstrate the application of antibody arrays was the “analyte-labeled” assay format. In this format, proteins are detected after antibody capture using direct protein labeling (Haab, 2005). Using this format, Knezevic et al. (2001) successfully found alterations in protein expression in cancer cell development. Multiple differences in protein expression could be detected in epithelial and stromal cells using this “analyte-labeled” antibody array. However, some limitations have to be considered because this method lacks specificity in protein target labeling and has poor sensitivity for low abundance proteins. Moreover, targeted protein labeling may lead to the epitope destruction due to some chemical reactions (Poetz et al., 2005). Another model of antibody array provides higher sensitivity using the “sandwich” assay format. This format employs two different antibodies to detect the targeted protein (Haab, 2005; Poetz et al., 2005). One antibody, called the capture antibody, immobilizes the targeted protein on the solid phase, while the other antibody, called the reporter or detection antibody, generates a signal for the detection system. Using two antibodies significantly increases the specificity and sensitivity of the “sandwich” assay format, even at femtomolar levels (Chen and Zhu, 2006; Poetz et al., 2005). Gonzalez et al. (2011) reported the application of this “sandwich” assay format to distinguish between two similar samples: blood plasma and serum. They were able to perform high specificity and sensitivity detection using less than 1 μL of sample with this platform. These assays offer a multiplexed format of the original Enzyme-linked Immunosorbent Assay (ELISA), but they can only detect dozens, rather than hundreds, of analytes simultaneously because cross-reactivity between antibodies may occur (Poetz et al., 2005).
Antibodies are the most popular protein capture reagents, although their affinity and/or specificity can vary dramatically (Talapatra et al., 2002). For instance, many antibodies may cross-react with proteins other than their expected target proteins when tested on functional protein microarrays, especially when multiple analyte detection is employed. The need for highly specific antibodies has become a major challenge in analytical protein microarrays (Chen and Zhu, 2006; Poetz et al., 2005; Talapatra et al., 2002) because nonspecific binding will lead to large numbers of false positive results (Talapatra et al., 2002). Another challenge comes from producing a large number of antibodies in a high-throughput fashion. Recombinant antibodies have become a promising means to overcome this problem; however, they are not ready for prime time performance yet (Brichta et al., 2005).
In the post-genomics era, the ability to characterize protein functionality at the proteome level has been highly desirable for modern proteomics and systems biology studies. Because functional protein microarrays are constructed using individually purified proteins, they enable the study of various biochemical properties of proteins, such as binding activities, including protein-protein, protein-DNA, protein-lipid, protein-drug and protein-peptide interactions, and enzyme-substrate relationships via various types of biochemical reactions (Chen and Zhu, 2006; Poetz et al., 2005).
The first use of functional protein microarrays was demonstrated by Zhu et al. (2001) to determine the substrate specificity of protein kinases in yeast. Since then, reported applications of functional protein microarrays in basic research, as well as in clinical applications, have increased rapidly. Great achievements in providing the whole proteome of several organisms (e.g., human, yeast, Escherichia coli, virus) on arrays have allowed for many important biological discoveries (Thao et al., 2010; Zhu et al., 2006; Zhu and Snyder, 2001; Zhu et al., 2009). Moreover, protein microarrays enable us to study many post-translational modifications (e.g., phosphorylation, acetylation, ubiquitylation, S-nitrosylation) in a large-scale fashion, which is critical for understanding cellular protein synthesis and function (Foster et al., 2009; Lin et al., 2009; Lu et al., 2008; Zhu et al., 2000).
Employing the opposite format of classic protein microarrays, reverse-phase microarrays expand the applications of this technology. This method allows for the analysis of many samples obtained at different states by directly spotting tissue, cell lysates or even fractionated cell lysates on a glass slide. Many different probes can be tested to specifically identify certain proteins in lysate samples (Poetz et al., 2005). This type of microarray was first established by Paweletz and colleagues to monitor histological changes in prostate cancer patients. Using this method, they successfully detected microscopic transition stages of pro-survival checkpoint protein in three different stages of prostate cancer: normal prostate epithelium, prostate intraepithelial neoplasia, and invasive prostate cancer. The high degree of sensitivity, precision and linearity achieved by reverse-phase protein microarrays enabled this method to quantify the phosphorylation status of some proteins (such as Akt and ERK) in these samples; phosphorylation was statistically correlated with prostate cancer progression. Harnessing this sophisticated technology, Ciaccio et al. (2010) profiled EGF receptor signaling dynamics using micro-western arrays (MWA), which combine western blotting and reverse-phase protein microarrays to produce better sensitivity by separation of whole lysate sample components. This method allowed them to precisely measure 91 phosphosites of 67 proteins at 6 different time points with five EGF concentrations in A431 human carcinoma cells to analyze the dynamic profile of EGF receptor concentrations. A significant drawback of this approach, however, is that it is highly dependent on the availability and specificity of commercially produced antibodies. Because of this dependence, it has limited applications.
Commercial protein microarrays are available in many different formats, which represent the three types of protein microarrays. While some companies provide high-content protein microarrays comprised of hundreds of purified proteins or antibodies, most of them provide a specialized array platform for certain purposes of research, such as detection of human cytokines, phosphorylation events, and study of certain pathogen-host interactions (Table 1). For example, Invitrogen provides a human protein microarray of ~9000 full-length human proteins purified individually from insect cells. The quantity and quality of this array are carefully controlled. The major advantage of commercial protein microarrays is that they give researchers who are not familiar with this technology access to it. However, the disadvantages are obvious as well. First, most of these commercial microarrays do not have full coverage of the proteome (mostly human) or only provide antibodies against a small fraction of the human proteome. Second, most of them are quite costly.
Fabrication of protein microarrays still poses many challenges, especially in producing proteins in a high-throughput fashion. Moreover, the quality of the proteins is crucial because it determines the quality of the resulting protein microarrays.
Recombinant antibodies have become an alternative to the traditional hybridoma-based technology in generating high quality antibodies. Phage display is one of the early methods developed to generate recombinant antibodies (Brichta et al., 2005; Carmen and Jermutus, 2002). Using antibody-fragment encoding genes (VH and VL) and bacteriophage capsid gene fusion, this technology enables sets of human antibody libraries to be stored in prokaryotic systems where they can be readily expressed by phage infection. Prokaryotic expression systems promise large-scale antibody production in short time periods. In addition, this system generates antibody fragments lacking the Fc domain rather than intact IgG, eliminating nonspecific binding to the Fc receptor (Knappik and Brundiers, 2009). The recombinant antibodies are expressed and displayed in the phage capsid, and then purified using column chromatography. Recently, other methods have been developed to generate recombinant antibodies in eukaryotic expression systems (e.g., yeast display) or even in vitro environments (e.g., mRNA display, ribosome display), which provide additional advantages for recombinant antibody fabrication (Chao et al., 2006; Irving et al., 2001; Tabata et al., 2009).
Fabrication of functional protein microarrays faces an even bigger challenge due to the need for large amounts of highly purified proteins. Furthermore, biochemical characteristics (such as protein folding and posttranslational modifications) and physical conditions during the purification procedure have to be considered to generate functional proteins. To overcome these hurdles, high-throughput protein purification protocols have been developed using both Saccharomyces cerevisiae (yeast) and E. coli protein expression systems. Using a batch purification protocol in a 96-well format, >4000 recombinant proteins can be overexpressed and purified in yeast or E. coli (Chen et al., 2008; Jeong et al., 2012; Zhu et al., 2001). Because it is challenging for most labs to purify large numbers of proteins, Angenendt et al. (2006) developed an alternative technology in protein chip fabrication, dubbed nucleic acid-programmable protein array (NAPPA). Spotting plasmid DNAs with capture antibodies allows for the generation of a protein microarray via simultaneous in situ transcription/translation reactions and protein immobilization on the printed slides. A significant benefit of this approach is that the printed template DNA microarray can be stored for a long time, and the resulting protein microarrays are always freshly made. This method allows the generation of up to 13,000 protein spots on one slide without laborious cloning and expression vectors. In addition, our group has developed two strategies to fabricate protein microarrays by directly capturing nascent polypeptides on a solid surface during translation using puromycin. Employing synthetic or in vitro–transcribed RNAs for this strategy, we can produce high-density, high-quality protein microarrays (Tao and Zhu, 2006). However, such arrays have not flourished due to low protein yield and difficulties in producing large proteins (e.g., >60 kD).
Another method is to obtain lysates for the construction of reverse-phase protein microarrays. Proper samples can be isolated from cell culture; frozen, ethanol-fixed, or paraffin-embedded tissue or laser captured microdissections of cell populations from certain tissues (Charboneau et al., 2002; Espina et al., 2007).
Following protein production, protein immobilization on a solid support (e.g., derivatized glass slides) is also crucial. An ideal surface for protein microarray fabrication has to be capable of protein immobilization and preserving three dimensional (3D) conformation of proteins (Guo and Zhu, 2006). Table 2 summarizes several surfaces which have been used in protein microarray fabrication (Chen et al., 2007).
The detection system for protein microarrays is another important design parameter. There are two main types of detection systems: label-dependent and label-free. In label-dependent detection, several types of labeling reagents have been developed, such as fluorescent dyes, enzymes, radioisotopes, and liposomes. Fluorescent dyes with narrow excitation and emission spectra, such as Cy3, Cy5 and equivalents, are the most commonly used because they are convenient and provide a wide linear detection range compared to other labeling systems (Hall et al., 2007). Moreover, these fluorescent dyes provide a multicolor detection system, which allows for multiplex assay design. Harnessing the advantage of signal amplification, enzymatic methods offer significant improvement in signal detection sensitivity. The most popular enzyme is horseradish peroxidase. Rolling circle amplification (RCA) and tyramide signal amplification (TSA) have also been developed to detect low abundance proteins (Schweitzer et al., 2002; Varnum et al., 2004). For some assays, such as enzymatic reactions, radioisotopes (e.g., 32P, 33P, and 14C) are the most straightforward for detection. However, other labeling methods are preferable to radioisotope labeling due to safety issues (Chen et al., 2007; Hall et al., 2007).
Because labeling processes can affect protein activity, label-free methods have been developed. Label-free detection gives additional advantages by providing real-time measurement to monitor the dynamics of protein interactions. Surface plasmon resonance spectroscopy (SPR), imaging optical ellipsometry (OE), and reflectometric interference spectroscopy (RIFS) use the measurement of the optical dielectric response of a thin film and thereby detect changes in physical or chemical properties (Piehler et al., 1997; Thiel et al., 1997; Wang and Jin, 2003). In addition to these label-free detection systems, the oblique-incidence reflectivity difference (OIRD) technique performs extremely sensitive detection by measuring the difference in reflectivity between S- and P-polarization. This method can dramatically speed up kinetics measurements of protein binding by detecting a tiny change in its physical properties, such as thickness and density (Chen et al., 2001; Evans-Nguyen et al., 2008; Landry et al., 2008).
Other sophisticated methods have also been developed to achieve good detection in protein microarrays. Atomic force microscopy (AFM) uses the movement of a cantilever controlled by piezoelectric crystals and a laser-based optical system to image the topological structure of the protein, thereby detecting protein interactions. The range of methods has been expanded by combining mass spectrometry with other surface techniques (MS-coupled), such as SELDI and MALDI-TOF MS. These methods provide further advantages by providing chemical and structural information for small amounts of sample, information which is difficult to obtain through other methods (Yu et al., 2006).
An obvious advantage of functional protein microarrays is their ability to provide a flexible platform that can characterize a wide range of biochemical properties of spotted proteins. To date, these assays have been successfully developed to detect various types of protein binding properties, such as protein-protein, protein-DNA, protein-RNA, protein-lipid, protein-drug, and protein-glycan interactions (Chen et al., 2008; Hall et al., 2004; Ho et al., 2006; Hu et al., 2009; Huang et al., 2004; Kung et al., 2009; MacBeath and Schreiber, 2000; Popescu et al., 2007; Zhu et al., 2001; Zhu et al., 2007), and identify substrates of various classes of enzymes, such as protein kinases, ubiquitin/SUMO E3 ligases, and acetyltransferases, to name a few (Lin et al., 2009; Lu et al., 2008; Ptacek et al., 2005; Schnack et al., 2008; Zhu et al., 2000).
During the development of various assay types, it became obvious that surface chemistry plays an important role in the success of a new assay (Table 2). For example, protein-DNA interaction assays were first performed on yeast proteome microarrays on a nitrocellulose surface (i.e., FAST slide) with randomly sheared yeast genomic DNA fragments (~500 bp) that were labeled with Cy5 (Hall et al., 2004). Later, Hu et al. also found that the FAST slide, among other tested surfaces, produced the best signal-to-noise ratio for detecting interactions between short DNA motifs (36-100 bp) and proteins (Hu et al., 2009). In another example, when the Zhu group was developing protein acetylation reactions using 14C-labeled Ac-CoA as a donor, they first tested the HAT activity of the NuA4 acetyltransferase complex using histones H3 and H4 as substrates on FAST slides, as well as aldehyde- and Ni-NTA-coated slides (Lin et al., 2009; Lu et al., 2011). The results clearly showed that both FAST and nickel surfaces worked, but the FAST surface produced better signal-to-noise ratios. However, the FAST surface was not suitable for phosphorylation reactions because the background noise was too high (Ptacek et al., 2005; Zhu et al., 2009). Another instance where surface chemistry is vital to assay development is profiling of cell surface glycans on a lectin microarray. Our group and others have found that the only proper surface for this type of binding assay is a commercial Schott slide, although the exact surface chemistry is proprietary (Hsu et al., 2006; Pilobello et al., 2007; Tao et al., 2008). Several reasons factor into the importance of surface chemistry. First, for low-affinity binding assays (e.g., protein-DNA interactions), a porous surface (e.g., FAST) is likely to retain more proteins and hence improve sensitivity. Second, when radioisotope-labeled small molecules are used, it is important to completely remove unincorporated radioisotopes from the surface to reduce background noise. Excess retention of radioisotopes might explain why phosphorylation assays do not work well on FAST surfaces. Third, in the case of using live cells to probe a lectin microarray, the grafted chemical ligands must not be too repulsive to cells. Other factors, such as protein conformation and stability, can also be affected by surface chemistry. Therefore, whenever a novel assay is to be developed, a variety of surfaces should be tested first in a pilot study.
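A pilot comparison of surfaces can be reduced to a simple signal-to-noise calculation. The sketch below is only an illustration: the surface names and intensity values are made up, and the signal-to-noise definition used here (mean spot signal minus mean background, divided by the standard deviation of the background) is one common convention, not necessarily the one used in the studies cited above.

```python
from statistics import mean, stdev

def signal_to_noise(spot_signals, background_signals):
    """(mean spot signal - mean background) / standard deviation of background."""
    return (mean(spot_signals) - mean(background_signals)) / stdev(background_signals)

# Hypothetical pilot data: (positive-control spot intensities, local background intensities).
pilot = {
    "surface_A": ([900, 950, 1000], [100, 120, 110]),
    "surface_B": ([1500, 1600, 1550], [600, 800, 700]),
    "surface_C": ([400, 420, 410], [50, 55, 45]),
}

# Score every surface, then pick the one with the highest signal-to-noise ratio.
snr = {name: signal_to_noise(spots, bg) for name, (spots, bg) in pilot.items()}
best_surface = max(snr, key=snr.get)

for name, value in snr.items():
    print(f"{name}: SNR = {value:.1f}")
print("best:", best_surface)
```

Note that in this toy data the surface with the brightest spots (surface_B) is not the best choice, because its background is both high and variable, which mirrors the high-background problem described for phosphorylation assays.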
Application of these assays has had a profound impact on a wide range of research areas. This is especially true when they are used in large-scale, high-throughput projects, exemplified in both network construction and biomarker identification (see below and Table 3).
Among the first applications of protein microarrays was the analysis of protein-protein and protein-lipid interactions where test ligands were directly or indirectly labeled with fluorescent dyes. For example, Zhu and Snyder (2001) developed the first proteome microarray composed of ~5800 recombinant yeast proteins (>85% of the yeast proteome) and identified binding partners of calmodulin and phosphatidylinositides (PIPs). They first incubated the microarrays with biotinylated bovine calmodulin and discovered 39 new calmodulin binding partners. In addition, using liposomes as a carrier for various PIPs, they identified more than 150 binding proteins, >50% of which were known membrane-associated proteins. Popescu et al. (2007) developed a protein microarray containing 1,133 Arabidopsis thaliana proteins and also used it to globally identify proteins binding to calmodulins or calmodulin-like proteins in Arabidopsis. A large number of previously known and novel targets were identified, including transcription factors, receptor and intracellular protein kinases, F-box proteins, RNA-binding proteins, and proteins of unknown function. Alternative approaches to identifying protein-protein interactions, such as the yeast two-hybrid system and protein complex purification coupled with mass spectrometry analysis, are well-established and are used as standard high-throughput methods to detect protein-protein interactions in higher eukaryotes (Krogan et al., 2006; Vidal et al., 1996). Thus, while protein microarray-based approaches provide a rapid approach to characterizing protein-protein interactions, they have much competition in this arena.
MacBeath and colleagues fabricated protein domain microarrays to investigate protein-peptide interactions that might play an important role in signaling in a semi-quantitative fashion (Jones et al., 2006). They constructed an array by printing 159 human Src homology 2 (SH2) and phosphotyrosine binding (PTB) domains on the aldehyde-modified glass substrates and incubated the arrays with 61 peptides representing tyrosine phosphorylation sites on the four ErbB receptors. Eight concentrations of each peptide (10 nM to 5 mM) were tested in the assay, allowing semi-quantitative measurement of the binding affinity of each peptide to its protein ligand.
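To illustrate how a concentration series can yield an approximate binding affinity, the sketch below fits a one-site saturation binding model, signal = Fmax · c / (KD + c), to eight data points. It is a minimal example under stated assumptions: the concentrations, signals, units, and the grid-search fitting procedure are all hypothetical and are not taken from Jones et al. (2006).

```python
def fit_saturation_binding(concs, signals, kd_grid):
    """Fit signal = fmax * c / (kd + c) by least squares, scanning candidate kd values.

    Returns (kd, fmax, sse) for the candidate kd with the smallest squared error.
    """
    best = None
    for kd in kd_grid:
        # Predicted fraction of binding sites occupied at each concentration.
        frac_bound = [c / (kd + c) for c in concs]
        # For a fixed kd, the optimal fmax is a linear least-squares slope through the origin.
        fmax = (sum(x * y for x, y in zip(frac_bound, signals))
                / sum(x * x for x in frac_bound))
        sse = sum((y - fmax * x) ** 2 for x, y in zip(frac_bound, signals))
        if best is None or sse < best[2]:
            best = (kd, fmax, sse)
    return best

# Eight made-up peptide concentrations (arbitrary units) and noise-free
# signals simulated with kd = 50 and fmax = 1000.
concs = [1, 3, 10, 30, 100, 300, 1000, 3000]
signals = [1000 * c / (50 + c) for c in concs]

kd, fmax, sse = fit_saturation_binding(concs, signals, range(1, 1001))
print(f"estimated KD = {kd}, Fmax = {fmax:.1f}")
```

With real array data the signals are noisy and the surface-bound format can distort apparent affinities, which is one reason such measurements are described as semi-quantitative.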
Protein microarrays have also been applied extensively and successfully to characterize protein-DNA interactions (PDIs). In an earlier study, Snyder and colleagues screened for novel DNA-binding proteins by probing yeast proteome microarrays with fluorescently labeled yeast genomic DNA (Hall et al., 2004). Of the ~200 positive proteins, half were not previously known to bind to DNA. By focusing on a single yeast gene, ARG5,6, encoding two enzymes involved in arginine biosynthesis, they discovered that its protein product bound to a specific DNA motif and associated with specific nuclear and mitochondrial loci in vivo.
In a later report, the Snyder and Johnston groups constructed a protein microarray with 282 known and predicted yeast transcription factors (TFs) to identify their interactions with 75 evolutionarily conserved DNA motifs (Ho et al., 2006). Over 200 specific PDIs were identified and >60% of them were previously unknown. The binding site of a previously uncharacterized DNA-binding protein, Yjl103p, was defined, and a number of its target genes were identified, many of which are involved in stress response and oxidative phosphorylation. This study was the first to demonstrate that an unbiased screen with short DNA motifs on a protein microarray could reveal novel function of a previously uncharacterized protein.
Our team developed a bacterial proteome microarray composed of 4,256 proteins encoded by the E. coli K12 strain (~99% coverage of the proteome) using a bacterial high-throughput protein purification protocol (Chen et al., 2008). To demonstrate its usefulness, end-labeled, double-stranded (ds) DNA probes carrying abasic or mismatched base pairs were used to identify proteins involved in DNA damage recognition. A small number of proteins were specifically recognized with high affinity by each type of probe. Two of these proteins, YbaZ and YbcN, were further characterized using biochemical assays and found to possess base-flipping activity.
Recently, Zhu and colleagues also undertook a large-scale analysis of human PDIs using a protein microarray composed of 4,191 unique, full-length human proteins, including ~90% of the annotated TFs and a wide range of other protein categories, such as RNA-binding proteins, chromatin-associated proteins, nucleotide-binding proteins, transcription co-regulators, mitochondrial proteins and protein kinases (Hu et al., 2009). The protein microarrays were probed with 400 predicted and 60 known DNA motifs, and a total of 17,718 PDIs were identified. Many known PDIs and a large number of new PDIs for both well-characterized and predicted TFs were recovered, and new consensus sites for over 200 TFs were determined, doubling the number of previously reported consensus sites for human TFs (Hu et al., 2009; Xie et al., 2010). Surprisingly, over 300 proteins that were previously unknown to specifically interact with DNA showed sequence-specific PDIs, suggesting that many human proteins may bind specific DNA sequences as a moonlighting function. To further investigate whether the DNA-binding activities of these unconventional DNA binding proteins (uDBPs) were physiologically relevant, we carried out in-depth analysis of a well-studied protein kinase, Erk2, to determine the potential mechanism behind its DNA-binding activity. Using a series of in vitro and in vivo approaches, such as EMSA, luciferase assay, mutagenesis, and ChIP, we demonstrated that the DNA-binding activity of Erk2 is independent of its protein kinase activity, and Erk2 acts as a transcription repressor of transcripts induced by interferon gamma signaling (Hu et al., 2009). Other than Erk2, many other uDBPs show sequence-specific DNA-binding activity and more intriguingly, many of their consensus sequences are highly similar to those recognized by annotated TFs. This observation suggests that these uDBPs may work synergistically with the TFs to achieve highly accurate transcription regulation. 
Again, an unbiased approach demonstrated the power of functional protein microarrays.
Discovering new drug molecules and drug targets is another field in which protein microarrays have shown their potential. For example, Huang et al. (2004) incubated biotinylated small-molecule inhibitors of rapamycin (SMIRs) on the yeast proteome microarrays, and obtained the binding profiles of the SMIRs across the entire yeast proteome. They identified candidate target proteins of the SMIRs, including Tep1p, a homolog of the mammalian PTEN tumor suppressor, and Ybr077cp (Nir1p), a protein of previously unknown function, both of which were validated to associate with PI(3,4)P2, suggesting a novel mechanism by which phosphatidylinositides might modulate the target-of-rapamycin pathway.
Protein glycosylation, a general posttranslational modification of proteins involved in cell membrane formation, dictates the proper conformation of many membrane proteins, retains stability on some secreted glycoproteins, and plays a role in cell-cell adhesion. To further understand the roles of protein glycosylation in yeast, the Zhu and Snyder teams reasoned that because proteins on the yeast proteome microarrays are expressed in their original host, they should maintain most of their PTMs; thus these arrays can be used to profile glycosylation using fluorescently labeled lectins, such as Concanavalin A (ConA) and Wheat-Germ Agglutinin (WGA) (Kung et al., 2009). A total of 534 proteins were identified, 406 of which were not previously known to be glycosylated. Many proteins in the secretory pathway were identified, as well as other functional classes of proteins, including TFs and mitochondrial proteins. Upon treatment with tunicamycin, an inhibitor of N-linked protein glycosylation, two of the four mitochondrial proteins identified showed partial distribution to the cytosol and reduced localization to the mitochondria, suggesting a new role of protein glycosylation in mitochondrial protein function and localization.
Antibodies are widely applied for many purposes in proteomic studies. Because of their specificity, monoclonal antibodies (MAbs) are a better option than polyclonal antibodies for most applications. In 2007, Hu et al. identified and characterized monoclonal antibodies against human liver antigens using a high-throughput screening system of 1,058 unique human proteins in protein microarrays. Five potentially specific MAbs were successfully identified and applied to detect the protein profile difference between normal liver and hepatoma cells. Using a similar platform, Jeong et al. (2012) combined immunization with live human cells and microarray-based analysis to develop a rapid method for identifying monospecific monoclonal antibodies (mMAbs). Because a human proteome microarray composed of ~17,000 individually purified full-length human proteins was used in the monoclonal antibody binding assays, antibodies that only recognized a single antigen on the microarrays could be identified as highly specific mMAbs. Using a series of assays, including Western blot (WB), immunoprecipitation (IP), and immunocytochemistry (ICC), in transfected human cells, these authors demonstrated that the identified mMAbs are likely to be useful in all three of the above applications.
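The criterion that an antibody recognizes only a single antigen on the array can be sketched as a simple outlier call. The example below is a hypothetical illustration, not the scoring scheme used by Jeong et al. (2012): the antigen names, the intensities, the robust z-score based on the median and the median absolute deviation, and the cutoff of 6 are all assumptions.

```python
from statistics import median

def call_monospecific(intensities, z_cutoff=6.0):
    """Return the single antigen whose robust z-score exceeds z_cutoff, else None.

    intensities maps antigen name -> spot intensity for one antibody.
    """
    values = list(intensities.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scale = 1.4826 * mad  # scales the MAD to match a standard deviation for normal data
    if scale == 0:
        raise ValueError("background has no spread; robust z-score is undefined")
    hits = [name for name, v in intensities.items() if (v - med) / scale > z_cutoff]
    return hits[0] if len(hits) == 1 else None

# Hypothetical antibody 1: one strong spot over a flat background.
ab1 = {"ANTIGEN_A": 5000, "P2": 100, "P3": 110, "P4": 95, "P5": 105,
       "P6": 98, "P7": 102, "P8": 107, "P9": 93}
# Hypothetical antibody 2: two strong spots, i.e. cross-reactive on the array.
ab2 = dict(ab1, P8=3000)

print(call_monospecific(ab1))  # the single recognized antigen
print(call_monospecific(ab2))  # None, because two antigens score above the cutoff
```

A median-based score is used here because a very bright true target would inflate an ordinary mean and standard deviation and could mask itself on a small array.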
Protein posttranslational modifications (PTMs) are one of the most important mechanisms to directly regulate protein activity. Among the hundreds of PTMs identified so far, enzyme-dependent, reversible protein (de)phosphorylation, (de)ubiquitylation, (de)SUMOylation, and (de)acetylation, as well as glycosylation, are perhaps the most well studied. To fully understand the biological consequences of these PTMs, it is important to identify their downstream targets at the systems level. Recent advances in “shotgun” MS/MS techniques have identified many PTM sites in mammalian proteomes; however, this bottom-up approach does not help to connect the identified PTM sites to their upstream modification enzymes. Therefore, we and others have developed various types of enzymatic reactions on the functional protein microarrays to identify direct in vitro targets of these enzymes (Table 3).
Protein phosphorylation plays a central role in almost all aspects of cellular processes. The application of protein microarray technology to protein phosphorylation was first demonstrated by Zhu et al. (2000). They immobilized 17 different substrates on a nanowell protein microarray, followed by individual kinase assays with almost all of the yeast kinases (119/122). This approach allowed them to determine the substrate specificity of the yeast kinome and identify new tyrosine phosphorylation activity.
In a later report, Snyder’s group accomplished a large scale “Phosphorylome Project” using the yeast proteome microarrays. Eighty-seven purified yeast kinases or kinase complexes were individually incubated on yeast proteome arrays in a kinase buffer in the presence of 33P-γ-ATP. A total of 1,325 distinct protein substrates were identified, representing 4,129 phosphorylation events. These results provided a global network that connects kinases to their potential substrates and offered a new opportunity to identify new signaling pathways or cross-talk between pathways. Several smaller scale studies of kinase-substrate interactions have been reported in higher eukaryotes. For instance, a commercially available human protein microarray composed of approximately 3,000 individual proteins was used to identify substrates of cyclin-dependent kinase 5 (Cdk5), a serine/threonine kinase that plays an important role during CNS development (Schnack et al., 2008).
Using an Epstein Barr herpesvirus (EBV) protein microarray, Zhu et al. (2009) investigated the function of an EBV-encoded protein kinase, BGLF4, via phosphorylation and binding assays on the arrays and identified a total of 23 BGLF4 substrates and interactors. By focusing on EBNA1, which is essential for replication and maintenance of the episomal EBV genome during latency, the authors showed that BGLF4 acts as a negative regulator of EBNA1 replication function and raised the possibility that induction of BGLF4 kinase activity may provide a novel means of eliminating EBV genomes from latently infected cells.
Ubiquitylation is one of the most prevalent PTMs and controls almost all types of cellular events in eukaryotes. To establish a protein microarray-based approach for identification of ubiquitin E3 ligase substrates, Lu et al. (2008) developed an assay for yeast proteome microarrays that uses a HECT-domain E3 ligase, Rsp5, in combination with the E1 and E2 enzymes. More than 90 new substrates were identified, eight of which were validated as in vivo substrates of Rsp5. Further in vivo characterization of two substrates, Sla1 and Rnr2, demonstrated that Rsp5-dependent ubiquitylation affects either posttranslational processing of the substrate or subcellular localization.
Histone acetylation and deacetylation, which are catalyzed by histone acetyltransferases (HATs) and histone deacetylases (HDACs), respectively, are emerging as critical regulators of chromatin structure and transcription. It has been hypothesized that many HATs and HDACs might also modify non-histone substrates. For example, Esa1, the core enzyme of the essential nucleosome acetyltransferase of H4 (NuA4) complex, is the only essential HAT in yeast, which strongly suggests that it may target additional non-histone proteins that are crucial for cell survival. To identify non-histone substrates of the NuA4 complex, Lin et al. (2009) established and performed acetylation reactions on yeast proteome microarrays using the NuA4 complex in the presence of [14C]-Acetyl-CoA as a donor. Surprisingly, 91 proteins were found to be readily acetylated by the NuA4 complex on the array. Twenty of the identified proteins were randomly chosen for further validation, and 13 of them showed Esa1-dependent acetylation in cells. One of them, phosphoenolpyruvate carboxykinase (Pck1p), was further characterized to explore the possible link between acetylation and metabolism. A mass spectrometry-based assay revealed Lys19 and Lys514 as the acetylation sites of Pck1p, and mutagenesis analyses demonstrated that acetylation on K514 is critical to enhance Pck1p’s enzyme activity and results in a longer life span for yeast cells growing under starvation. This study offers a molecular link between the HDAC Sir2 and yeast longevity.
In a more recent study, Lu et al. (2011) focused on in-depth characterization of another nonhistone substrate, Sip2. Sip2 is one of three regulatory β subunits of the SNF1 complex (the yeast homolog of AMP-activated protein kinase), and its protein level decreases as cells age. We used mutants at four acetylation sites, K12, 16, 17 and 256, to study acetyl-Sip2 function. Sip2 acetylation, controlled by antagonizing NuA4 acetyltransferase and Rpd3 deacetylase, enhances interaction with kinase Snf1, the catalytic α subunit of Snf1 complex. Sip2-Snf1 interaction inhibits Snf1 activity, thus decreasing phosphorylation of a downstream target, Sch9, ultimately leading to impaired growth but extending the yeast replicative lifespan. We also demonstrate that the anti-aging effect of Sip2 acetylation is independent of nutrient availability and TORC1 activity. Therefore, intrinsic aging stress, signaled via the Sip2-Snf1 acetylation, constitutes a second TORC1-independent pathway regulating Sch9 activity that controls lifespan in yeast.
S-nitrosylation, although independent of enzyme catalysis, is an important PTM that affects a wide range of proteins involved in many cellular processes. Foster et al. (2009) developed a protein microarray-based approach to detect proteins reactive to S-nitrosothiol (SNO), the donor of NO+ in S-nitrosylation, and to investigate determinants of S-nitrosylation. S-nitrosocysteine (CysNO), a highly reactive SNO, was added to a yeast proteome microarray, and the nitrosylated proteins were then detected using a modified biotin switch technique. The top 300 proteins with the highest relative signal intensity were further analyzed, and the results revealed that proteins with active-site Cys thiols residing at N-termini of alpha-helices or within catalytic loops were particularly prominent. However, substantial variations of S-nitrosylation were observed even within these protein families, indicating that secondary structure or intrinsic nucleophilicity of Cys thiols was not sufficient to explain the specificity of S-nitrosylation. Further analyses revealed that NO-donor stereochemistry and structure had a significant impact on S-nitrosylation efficiency.
Extending the applications in basic research, protein microarrays have proven highly useful in clinical research because the development of almost all diseases is related to protein function and interaction. In addition, the protein microarray format can be directly employed to develop highly sensitive and specific diagnostic and detection tools.
A new trend in the field of host-pathogen interactions is to use host protein microarrays to survey relationships between a pathogenic factor of interest and the host proteome. This idea is particularly suited for studying host-virus interactions because, after entering the host cells, the viral genome and proteins are in direct physical contact with the host components, whereby a pathogen can hijack/exploit the host pathways and machineries for its own replication. This approach would alleviate the problems associated with RNAi-based screening in identifying direct host targets (Brass et al., 2009; Karlas et al., 2010; Shapira et al., 2009).
Yeast proteome microarrays have been used to identify specific RNA-binding proteins for antiviral activities (Zhu et al., 2007). In these experiments, arrays were incubated with a fluorescently tagged small RNA hairpin containing a clamped adenine motif (CAM), which is required for the replication of Brome Mosaic Virus (BMV), a plant-infecting RNA virus that can also replicate in budding yeast. Two of the candidate proteins, Pseudouridine Synthase 4 (Pus4) and the Actin Patch Protein 1 (App1), were further characterized in Nicotiana benthamiana. Both of them modestly reduced BMV genomic plus-strand RNA accumulation and dramatically inhibited the spread of BMV in plants.
Pathogen entry and infection in the host cell involve a series of processes that exploit host protein pathways and interactions. Li et al. (2011) demonstrated the use of protein microarrays to study the conserved serine/threonine kinases of herpesviruses, which play an important role in viral replication in human cells. They identified shared substrates of the conserved kinases from herpes simplex virus, human cytomegalovirus, EBV, and Kaposi’s sarcoma-associated herpesvirus using human proteomic chips. From this study, they found that the histone acetyltransferase TIP60, an upstream regulator of the DNA damage response pathway, was essential for herpesvirus replication. This finding is promising for broad-spectrum anti-viral development.
Recently, Chen and his colleagues used a unique approach to study the interaction between an antimicrobial peptide and the bacterial proteome in the gut. They probed a protein chip that contained the whole proteome of E. coli K12 with Lactoferricin B (Lfcin B) and successfully identified 16 proteins that affected the tricarboxylic acid (TCA) cycle. Further validation using knockout assays revealed that phosphoenolpyruvate carboxylase was the target of Lfcin B (Tu et al., 2011). In a parallel study, this group found that the same antimicrobial peptide also targeted two other response regulators, BasR and CreB, of the two-component system (TCS) by inhibiting their phosphorylation (Ho et al., 2011). This is the first study to show that a host antimicrobial peptide attacks bacterial TCS.
Biomarkers are a crucial tool in expeditious detection of infection or diagnosis of autoimmune diseases. In most cases, production of antibodies against pathogens or autoantibodies in blood serum is correlated with infection or occurrence of autoimmune diseases. Therefore, to find a powerful biomarker, protein microarrays can be used to directly detect antibodies that statistically correlate with the corresponding disease in a patient’s serum. Zhu et al. (2006) developed the first viral protein microarray to detect biomarkers for severe acute respiratory syndrome (SARS). This array, which consisted of all SARS coronavirus (SARS-CoV) proteins, as well as proteins of five additional coronaviruses, could readily distinguish serum samples as SARS-positive or SARS-negative with 94% accuracy compared to the traditional ELISA method.
Another application was a 2009 report by Chen et al., in which potential biomarkers of inflammatory bowel diseases (IBDs) were identified. Serum samples from healthy and IBD (Crohn’s disease (CD) and ulcerative colitis (UC)) patients were screened using E. coli K12 proteome microarrays, and 417 proteins were detected as possible candidates (169 proteins were identified as highly immunogenic in healthy controls, 186 proteins highly immunogenic in CD, and only 19 highly immunogenic in UC). Following k-TSP bioinformatics analysis, novel sets of biomarkers for the diagnosis of CD versus healthy control and CD versus UC were identified (Czajkowski and Krętowski, 2011).
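To make the k-TSP (k top scoring pairs) step more concrete, the following Python sketch shows the core idea: the classifier looks for pairs of features whose relative ordering (signal i lower than signal j) is common in one class and rare in the other, then classifies a new sample by majority vote over the k best disjoint pairs. This is a simplified illustration under stated assumptions, not the analysis performed in the study: the function names `train_ktsp` and `predict_ktsp` and the toy data are hypothetical, and the secondary rank-based tie-breaking used in full k-TSP implementations is omitted.

```python
from itertools import combinations

def train_ktsp(X, y, k=1):
    """Pick the k best disjoint feature pairs for a two-class problem.

    X: list of samples, each a list of spot intensities.
    y: list of class labels (0 or 1), one per sample.
    Returns a list of (i, j, votes_one) tuples, where votes_one is True
    when "x[i] < x[j]" is more frequent in class 1 than in class 0.
    """
    class0 = [row for row, label in zip(X, y) if label == 0]
    class1 = [row for row, label in zip(X, y) if label == 1]
    scored = []
    for i, j in combinations(range(len(X[0])), 2):
        p0 = sum(row[i] < row[j] for row in class0) / len(class0)
        p1 = sum(row[i] < row[j] for row in class1) / len(class1)
        scored.append((abs(p0 - p1), i, j, p1 > p0))
    scored.sort(key=lambda item: -item[0])  # stable: ties keep pair order

    pairs, used = [], set()
    for _, i, j, votes_one in scored:
        if i in used or j in used:
            continue  # each feature may appear in only one pair
        pairs.append((i, j, votes_one))
        used.update((i, j))
        if len(pairs) == k:
            break
    return pairs

def predict_ktsp(pairs, row):
    """Majority vote over the selected pairs; ties go to class 0."""
    votes_for_one = sum((row[i] < row[j]) == votes_one
                        for i, j, votes_one in pairs)
    return 1 if 2 * votes_for_one > len(pairs) else 0

# Hypothetical toy data: in class 0, feature 0 is below feature 1;
# in class 1 the ordering is reversed. Features 2 and 3 are noise.
X = [[1, 5, 3, 3], [2, 6, 4, 1], [1, 4, 2, 9],
     [7, 2, 3, 3], [8, 3, 1, 4], [9, 1, 5, 2]]
y = [0, 0, 0, 1, 1, 1]

pairs = train_ktsp(X, y, k=1)
print(pairs)
print(predict_ktsp(pairs, [9, 1, 0, 0]))
```

Because the rule depends only on which of two signals is larger within the same sample, it is insensitive to monotonic differences in overall intensity between arrays. An odd k avoids tied votes.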
More recently, human protein microarrays have been widely used to identify biomarkers for a variety of autoimmune diseases. In contrast to the two studies described above, when the cohorts to be screened are small (e.g., 20 patients versus 20 healthy controls), researchers often face the so-called “overfitting” problem. In other words, even though a candidate protein shows a high penetration rate in a small patient cohort, its ability to distinguish a patient sample from the healthy ones often falls apart when applied to a much larger patient population (e.g., >100). Therefore, validation of potential biomarkers using a larger cohort in a double-blind fashion is always desired. However, because of the high cost associated with functional protein microarrays, consumption of a large number of protein microarrays may not be practical. To alleviate this problem, Song et al. (2009) developed a two-phase strategy for rapid identification of biomarkers for autoimmune hepatitis (AIH) in a cost-effective way. In phase I, the authors fabricated a human protein microarray composed of 5,136 individually purified proteins on poly-L-lysine-coated surfaces and quickly identified 11 candidate autoantigens with a relatively small serum collection. In phase II, they fabricated an AIH-specific protein chip with the 11 proteins and obtained autoimmunogenic profiles of serum samples from 44 AIH patients, 50 healthy controls, and 184 additional patients suffering from hepatitis B, hepatitis C, systemic lupus erythematosus, primary Sjogren’s syndrome, rheumatoid arthritis, or primary biliary cirrhosis. Statistical analyses of these profiles revealed that three antigens, RPS20, Alba-like, and dUTPase, were highly AIH-specific biomarkers, with sensitivities of 47.5% (RPS20), 45.5% (Alba-like), and 22.7% (dUTPase). These potential biomarkers were further validated with additional AIH samples in a double-blind design.
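For readers less familiar with the terms, sensitivity is the fraction of patient sera that test positive for a candidate autoantigen, and specificity is the fraction of control sera that test negative. The short Python sketch below shows the arithmetic. The per-sample calls are invented for illustration and are not data from Song et al.; the function name `sensitivity_specificity` is likewise hypothetical.

```python
def sensitivity_specificity(patient_calls, control_calls):
    """Compute sensitivity and specificity from per-sample positive calls.

    patient_calls: list of bools, True when a patient serum is positive.
    control_calls: list of bools, True when a control serum is positive.
    """
    true_positives = sum(patient_calls)
    true_negatives = len(control_calls) - sum(control_calls)
    sensitivity = true_positives / len(patient_calls)
    specificity = true_negatives / len(control_calls)
    return sensitivity, specificity

# Hypothetical calls for one candidate antigen: 10 of 44 patient sera
# and 2 of 50 control sera score positive.
patients = [True] * 10 + [False] * 34
controls = [True] * 2 + [False] * 48

sens, spec = sensitivity_specificity(patients, controls)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

A marker with modest sensitivity can still be useful when its specificity is high, which is why individually weak but highly specific antigens are often combined into a panel.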
In addition, a series of studies have employed pathogen protein microarrays to profile serological responses following infection. For example, protein microarrays have been developed for bacteria and viruses for biomarker identification in various infectious diseases (Doolan et al., 2008; Liang et al., 2011; Luevano et al., 2010; Vigil et al., 2011). These studies have clearly demonstrated the power of protein microarrays in identification of potential biomarkers; however, several shortcomings are repeatedly seen in these studies. For instance, many of these arrays were fabricated using proteins translated in E. coli lysates without purification (Doolan et al., 2008; Liang et al., 2011; Luevano et al., 2010; Vigil et al., 2011). Because these proteins are contaminated with unwanted E. coli proteins, sensitivity of the assay is likely reduced due to their high immunogenicity (Chen et al., 2008). As a result, E. coli lysates had to be used as a blocking reagent to alleviate this problem. Also problematic is that in many of these studies, identified biomarkers were not validated with additional cohorts, and therefore the possibility of overfitting was not completely ruled out.
Protein microarrays have become one of the most powerful tools in proteomic studies and can be applied for many different purposes. High-throughput processing has become the trend because it reduces cost and increases productivity. As a high-throughput and parallel system, protein microarrays will speed up new findings in protein interactions for basic research as well as clinical research purposes. The small sample volume required is one important advantage of this technology over other techniques. This factor is especially important for clinical research that uses precious samples, such as human serum samples. In addition, the high sensitivity and specificity of protein microarrays make them a powerful tool for quantifying and profiling proteins.
Despite many successful applications of protein microarrays, this technology still has limitations that leave many challenges to be overcome. The large-scale production of high-quality antibodies using recombinant platforms remains difficult because of the complexity of expression and purification procedures. Another consideration is to perform high-throughput detection without sample labeling. Label-free detection systems should be the future of protein microarrays. In conclusion, although this technique still needs to be explored, it would not be surprising if, after several years, it becomes the leading technology in proteomic and diagnostic fields.
This work is supported in part by the NIH (RR020839, DK082840, RO1GM076102, CA125807, CA160036, and HG006434 to HZ; R01EY017589 to JQ), National Science Council, Taiwan (NSC 100-2320-B-008-001 to C-S C) and Taipei Medical University Hospital (100TMUH-NCU-005 to C-S C).
NIH-funded authors: J. Qian and H. Zhu