Technical advances have seen the rapid adoption of genomics and multiplex genetic polymorphism identification in research on vascular diseases. By comparison, the utilization of proteomics for the study of vascular diseases has been limited. In this review we outline currently available proteomics techniques, the challenges to using these approaches, and modifications which may improve the utilization of proteomics in the study of vascular diseases.
Common vascular diseases, including atherosclerosis and aortic aneurysm, appear to result from interactions between environmental risk factors and genetic predisposition which exacerbate normal aging processes. Mechanisms underlying these diseases include inflammation, alterations of the vascular extracellular matrix and dyslipidaemia. Changes in protein expression are a feature of pathological progression and drive such alterations. Inflammation, for example, is determined by the expression of adhesion molecules, chemokines and pro-inflammatory cytokines. Understanding the protein profiles associated with different vascular diseases may help to improve disease management in a number of areas including diagnosis, prognosis and treatment.
Traditionally, identification of new disease-associated proteins relied on hypothesis-led approaches, using techniques such as ELISA and western blotting to investigate expression of individual candidates. However, the relatively recent development of techniques which enable more complete protein profiling, including two-dimensional electrophoresis in 1975,1 has given rise to a discipline known as “proteomics”, with potential to investigate protein expression in response to endogenous and exogenous stimuli. Using a suite of sample preparation, fractionation, separation and analysis tools, a modern proteomics facility can perform quantitative ‘differential display’ experiments comparing cohorts of control and disease samples, highlighting molecules directly related to and/or indicative of disease processes (Figure 1). Whilst similar data can be gained from gene expression studies, mRNA levels do not always reflect protein expression and cellular phenotype.2 Thus, proteomics provides data not easily obtained from other post-genomic technologies (Figure 2).3 Consequently, proteomic techniques have been applied to a wide variety of organisms including plants, bacteria, fungi, and metazoa. In a medical context, proteomic investigations have the potential to uncover proteins of therapeutic and diagnostic value. However, despite ongoing endeavors, proteomic science has yet to provide a significant breakthrough in this area, with many putative markers failing during validation. In this review we outline proteomic techniques and discuss current technical challenges and likely future developments.
The Human Proteome Organisation (HUPO) defines a protein as a complex molecule made up of one or more peptide chains (a peptide consisting of two or more chemically-linked amino acids), which perform a wide variety of functions and are essential to the life of the cell (http://www.hupo.org/overview/glossary/). Further definitions of some common terms used in proteomics are provided in Table 1.
Proteins are encoded by the genome, with each open reading frame giving rise to an average of 6–7 protein products. Thus, it is estimated that the human genome comprising 23,000–40,000 genes may give rise to up to one million proteins, in addition to approximately 600,000 serum immunoglobulins with slight variations in epitope binding regions.4–6 However, protein structure is not limited to genetic coding, as post-translational modifications (PTMs) such as oxidation, phosphorylation and glycosylation are performed throughout cellular metabolism and cannot be predicted by gene analysis. PTMs alter protein function, and consequently have pathological significance. For example, the matrix gamma-carboxyglutamate protein (MGP) is a known inhibitor of cardiovascular calcification, and the uncarboxylated form (uMGP) has been shown to accumulate within calcified tissue. Consequently, circulating uMGP is a proposed biomarker for vascular calcification, with low titres a risk factor for vascular calcification.7 Accordingly, investigations into the expression of proteins and PTMs in disease states have potential to generate data of clinical significance.
Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), which separates proteins according to charge (first dimension) and molecular weight (second dimension), can profile thousands of polypeptides within a sample, enabling non-hypothesis-led investigations. Changes in protein expression, evidenced through alterations in spot density, are determined through computer-aided comparisons of 2D profiles produced from diseased and control tissues and highlight proteins specifically expressed in response to disease. Spots of interest can be directly excised from gels for identification through mass spectrometry (MS). Many of the published vascular proteomics investigations have been conducted using a gel-based format (Table 2).8–13 For example, gel-based investigations performed by Urbonavicius et al. (2008) confirm the role of oxidative stress in aneurysm rupture, as shown by the upregulation of stress-response proteins when compared to non-ruptured controls.14
Criticisms of 2D-PAGE include gel-to-gel variability, inability to analyse highly basic and hydrophobic proteins, poor visualization of very high or very low molecular weight proteins, under-representation of weakly expressed proteins, and a lack of easy automation.15–19 Some of these criticisms relate to the difficulty of good sample preparation; poor samples are the cause of most poor 2D gels. However, 2D-PAGE is the only technique capable of simultaneously resolving thousands of post-translationally modified proteins, because such modifications influence the migration of proteins during electrophoresis and the resulting shifts are easily visualized following 2D-PAGE. For example, oxidized 1-Cys peroxiredoxin expressed within the aortas of apolipoprotein E (Apo E−/−) deficient mice has a more acidic charge than the form expressed by Apo E+/+ controls, and migrates to a different position following 2D-PAGE.20 Thus, gel-based investigations generate data difficult to obtain from other proteomic techniques, supporting their continued use.
The introduction of increasingly sensitive protein stains has enhanced the analytical power of 2D-PAGE. The development of the difference gel electrophoresis (DiGE) technique,21 in which fluorescent cyanine dyes are covalently bound to lysine residues within sample proteins, enables analysis of multiple samples within the same gel. Furthermore, in DiGE, protein expression is compared to a known internal standard, reducing gel-to-gel variation. Currently, the simultaneous analysis of 2 samples (2-plex) is possible, and is particularly useful when comparing protein expression in the same individual at different time points or following interventions, such as a drug treatment. However, the application of DiGE to vascular investigations has been limited, possibly due to the cost of compatible reagents and specialized equipment.13, 22
MS is the central protein identification technique in almost all proteomic investigations, with the exception of chip array experiments. A mass spectrometer measures the mass of ions within a sample, and comprises an ion source, a mass analyzer which measures the mass-to-charge ratio (m/z) of the ions, and a detector to record the frequency of ions at each m/z value.23 In proteomic studies MS analysis is commonly conducted on ionised peptides produced by enzymatic digestion of sample proteins. MS analysis enables quantitative investigation into complex protein samples regardless of the hydrophobicity, charge and size of constituent components,24 and requires a smaller quantity of sample protein than gel-based approaches.
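As a minimal illustration of the mass-to-charge ratio described above, the sketch below computes m/z for a peptide ion carrying one or more protons. It assumes positive-mode ionisation by protonation; the function name `mz` and the 1500.0 Da peptide mass are hypothetical and chosen only for illustration.

```python
# Mass of a proton in daltons (standard physical constant, rounded).
PROTON_MASS = 1.007276


def mz(neutral_mass: float, charge: int) -> float:
    """Return m/z for an ion formed by adding `charge` protons to a neutral molecule."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical peptide with a neutral monoisotopic mass of 1500.0 Da.
for z in (1, 2, 3):
    print(f"z={z}: m/z = {mz(1500.0, z):.4f}")
```

The same peptide therefore appears at different positions on the m/z axis depending on its charge state, which is why the analyzer reports m/z rather than mass directly.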
With this in mind, MS-based proteomic approaches are useful when working with scarce or protein-poor samples. The commercialization of reagents for MS-led investigations has improved access to these approaches. For example, protein labeling techniques using stable ‘heavy’ isotopes (e.g. 15N, 2H, 18O and 13C) generate a detectable shift from native molecular mass, which is used to distinguish between experimental treatments.25–30 These techniques have increased the multiplex capabilities of MS, enabling the simultaneous analysis of several samples. Stable isotopes can also be used to supplement cell culture media or animal chow, enabling MS-based investigations of protein expression in actively metabolising cells and tissues.31–34 However, MS-based methods do not readily provide information about PTMs which are plainly visible on 2D gels. An online supplement discussing the principles of MS is available.
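The mass shift produced by stable isotope labelling can be estimated with simple arithmetic. The sketch below is illustrative only: the per-atom mass differences are standard isotope values (rounded), while the function names and the example label (six 13C atoms substituted into one peptide) are assumptions, not a description of any specific reagent used in the cited studies.

```python
# Mass difference (Da) between each heavy isotope and its common light form (rounded).
ISOTOPE_DELTA = {
    "13C": 1.003355,  # 13C minus 12C
    "15N": 0.997035,  # 15N minus 14N
    "2H": 1.006277,   # 2H minus 1H
    "18O": 2.004246,  # 18O minus 16O
}


def label_shift(heavy_atoms: dict) -> float:
    """Total mass shift (Da) for a label given counts of heavy atoms per isotope."""
    return sum(ISOTOPE_DELTA[iso] * n for iso, n in heavy_atoms.items())


def mz_shift(mass_shift: float, charge: int) -> float:
    """Spacing between light and heavy peaks on the m/z axis for a given charge."""
    return mass_shift / charge


# Hypothetical label: six 13C atoms incorporated into one peptide.
shift = label_shift({"13C": 6})
print(f"mass shift: {shift:.4f} Da")
print(f"peak spacing at z=2: {mz_shift(shift, 2):.4f}")
```

Because labelled and unlabelled peptides are otherwise chemically alike, pairs of peaks separated by this spacing can be attributed to the same peptide from different treatments.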
Protein arrays are equivalent to gene arrays, and assess relative protein expression within sample tissue. Such assays usually consist of specific antibodies immobilized on a chip or membrane which are incubated with proteins extracted from diseased or control samples, enabling antigen/antibody complexes to form. This approach can be used as a means to fractionate crude samples for further analysis (for example in SELDI-MS), or protein binding can be quantified, reflecting antigen levels within patient samples permitting direct investigation into protein expression. The investigative potential of this approach is limited to commercially available antibody panels, thus protein microarrays are more suited to hypothesis-driven research than novel protein discovery per se. However, the two approaches are compatible as proteins discovered through non-biased investigations can be validated in large cohorts using the microarray format, providing that appropriate antibodies are available. It is expected that future developments of this technique will allow more comprehensive protein assessment; however, the range of protein isotypes, fragments and modifications expressed in vivo present considerable challenges for true global analysis using protein arrays.
Examination of whole tissue samples is complicated by the presence of abundant proteins which mask more conservatively expressed molecules. This is particularly true of serum or plasma, in which albumin accounts for 55% of total protein (normal concentration range 35–50 mg/ml). In contrast, other proteins are present in much lower concentrations (e.g. interleukin 6, normal concentration range 0–5 pg/ml).35–38 Thus, plasma protein expression is estimated to span 10 orders of magnitude, often referred to as the ‘dynamic range’ of the sample. Such variation cannot be visualised, as current techniques capture a dynamic range spanning 4–5 orders of magnitude and only abundant molecules will be profiled from crude samples.35, 39 Similarly, analysis of vascular tissue is complicated by cellular heterogeneity, favouring proteins expressed by dominant cell types, where the dynamic range spans at least 5–6 orders of magnitude.17 Weakly expressed molecules can be enriched by depleting dominant proteins. However, this can introduce experimental error by removing protein complexes, or misrepresenting genuine variations in the expression of abundant proteins.36, 40
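The figure of 10 orders of magnitude can be checked against the concentrations quoted above. The following sketch uses the upper limits of the albumin and interleukin 6 ranges given in the text; the function name and the choice of five orders of magnitude as the instrument's captured range are assumptions made for illustration.

```python
import math


def orders_of_magnitude(high: float, low: float) -> float:
    """Number of orders of magnitude separating two concentrations (same units)."""
    return math.log10(high / low)


# Concentrations from the text, converted to g/ml.
albumin = 50e-3        # 50 mg/ml, upper limit of the normal albumin range
interleukin_6 = 5e-12  # 5 pg/ml, upper limit of the normal interleukin 6 range

span = orders_of_magnitude(albumin, interleukin_6)
print(f"albumin vs interleukin 6: {span:.1f} orders of magnitude")

# Assumption: a technique capturing 5 orders of magnitude, anchored on albumin.
captured_orders = 5
floor = albumin / 10 ** captured_orders
print(f"lowest concentration profiled alongside albumin: {floor * 1e6:.1f} ug/ml")
```

Under these assumptions, anything much below the microgram-per-millilitre level would be missed in crude plasma, which is the rationale for depletion or fractionation.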
Sample complexity can be reduced by studying a sub-group of proteins expressed within tissues using physical extraction techniques such as laser microdissection to isolate specific cells from tissue preparations,25, 41 biochemical methods,22, 42 or cellular approaches to identify proteins secreted from cell lines or tissue explants (the secretome). Secretome analysis is attractive since soluble secreted proteins are more amenable to proteomic profiling than their hydrophobic counterparts. Furthermore, such studies highlight proteins which may enter the bloodstream, revealing a panel of putative circulating biomarkers.
Secretome analysis is commonly conducted through in vitro incubation of tissue explants,43–45 or cell lines,46, 47 and 2D-PAGE/MS analysis of culture fluid. However, in vitro culture inevitably results in some degree of sample deterioration which releases cellular proteins into the culture media. Under typical conditions cellular proteins cannot be differentiated from secreted molecules at the gel level, potentially resulting in the discovery of false biomarkers. Following MS identification, truly secreted proteins can be putatively distinguished by screening their sequences for amino acid motifs involved in protein secretion.48 Such analyses generate more robust datasets, although extensive MS investigations are necessary before the truly secreted proteins can be differentiated. Modifications to cell culture and proteomic techniques facilitate identification of secreted proteins, offering an alternative to exhaustive MS/bioinformatic analysis. Radiolabelled amino acids (e.g. 35S-methionine) can be used as metabolic tags that are detectable at the gel level; non-labelled proteins can be considered degradation products and discounted from further analysis.49 Potential drawbacks, such as the need for specialist radioactive facilities and radiation-induced alterations in protein expression, can be alleviated by using stable isotopes, and stable isotope labelling by amino acids in cell culture (SILAC) has been used to characterise biomarkers secreted from a variety of tissues.50–52
There are currently no formalized guidelines for the design of proteomic experiments and data presentation, and early proteomic investigations often documented two-dimensional protein profiles or mass spectra. Such analyses are no longer considered suitable for publication, and protein identification is essential. The recent publication of the minimum information about a proteomics experiment (MIAPE) document by HUPO’s protein standards initiative (HUPO PSI), has provided more stringent guidelines for reporting proteomic data.57 Whilst these suggestions are not concrete, many of the leading journals in this field employ MIAPE criteria, insisting on the provision of metadata to increase confidence in the proteomic experiment and subsequent analyses (e.g. MS and statistics). Increased visibility of metadata has stimulated the development of open-access online repositories such as the proteomics identifications database (PRIDE),53, 54 which capture MS identifications and annotations from proteomics investigations in a standard format. It is envisaged that such repositories will act as a valuable research tool, whilst upholding HUPO PSI standards and vocabulary.
As with other scientific methodologies, the MIAPE document emphasises the need for replication to generate reliable data. Calculation of suitable sample sizes can be problematic, and depends on the complexity of the proteomes to be analysed and the level of variation anticipated between treatments. Calibration experiments using standard protein mixtures of similar complexity to tissues of interest may help to ensure suitable replication.55, 56 Alternatively, the use of power calculations can help determine suitable sample sizes.57 In the case of gel-based investigations, a minimum of 5 replicates per treatment has been recommended, with modifications to established statistical analyses to limit the number of false protein spots observed.58 Whilst this improves confidence in experimental data, replication in medical experiments may be problematic due to the paucity of suitable samples. Similarly, the availability of control tissue can limit potential investigations, especially in vascular experiments where healthy vasculature is not routinely sampled. Samples recovered at autopsy may increase experimental variation through physiological changes post-mortem and differences in tissue collection protocols. Furthermore, post-mortem samples are of limited use when investigating chronic diseases such as atherosclerosis, where variables including age and prolonged exposure to therapeutic compounds may complicate case/control matching. Small quantities of tissue such as aneurysm neck which can be removed from patients during surgical repair will match experimental samples and may be used as controls, but are unlikely to represent truly healthy vasculature despite being visibly free of complications.
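A power calculation of the kind mentioned above can be sketched as follows. This is a generic two-group calculation using the normal approximation, not the specific method of the cited reference; the function name, the significance level, the desired power, and the example effect size and standard deviation are all assumptions for illustration.

```python
import math
from statistics import NormalDist


def samples_per_group(delta: float, sd: float, alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate replicates per group needed to detect a mean difference `delta`.

    Uses the normal approximation for a two-sided, two-sample comparison with
    equal group sizes and a common standard deviation `sd`.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sd / delta) ** 2
    return math.ceil(n)


# Hypothetical example: detect a two-fold change (1.0 on a log2 scale)
# when spot intensities vary with a standard deviation of 0.5 log2 units.
print(samples_per_group(delta=1.0, sd=0.5))
```

When thousands of spots or peptides are tested at once, a stricter `alpha` would normally be needed to limit false positives, and this increases the required number of replicates.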
Biological and technical variations can affect the reproducibility of proteomic investigations, although commercialisation of common proteomic consumables has contributed towards increased reproducibility and experimental robustness. The development and standardization of sample preparation protocols appropriate to the experimental tissues represents the greatest challenge to effective and reproducible proteomic profiling. Successful proteomic investigations rely on understanding the sample that is to be analysed and knowledge of how to extract and solubilise the desired proteins whilst eliminating contaminants. Sample preparation can appear daunting, primarily because diversity within even the simplest proteome cannot be captured by any single extraction and separation. Over the past decade, fractionation has been widely adopted to reduce sample complexity and enable analysis of weakly expressed proteins. Diversity in sample preparation is reflected in the scientific literature.59, 60 Inter-laboratory variation nevertheless remains an issue, as discussed by Callesen et al. (2008), who report that only 25% of reported biomarker peaks for breast cancer could be detected by other research groups following SELDI-MS.61
Molecular identification is commonly achieved after MS analysis by searching electronic databases containing protein and/or translated gene sequences. Database searches generate putative identifications based on homology between experimental peptide sequences or fragmentation patterns, to those of known molecules. Putative identifications are quantified by assigning statistical scores (such as statistical error) to reflect the surety of each search; identifications are only accepted when the calculated error falls below accepted thresholds.17 This approach is only possible when databases contain data relating to the experimental proteins and relies on comprehensive datasets. In the case of medical investigations, MS-based protein identification is facilitated by the completion of the human genome project, and increased availability of genomic data for common animal models. However, identification of weakly expressed polypeptides remains challenging in the absence of a PCR equivalent for proteins.
Whilst desirable for publication, protein identification may not always be needed for clinical applications, especially where disease-state protein profiles are distinctly different to healthy controls. For example, MS analysis of low-molecular weight proteins expressed by platelets reveals distinctive changes in spectral peaks which discriminate resting platelets from those aggregated by ADP or thrombin-activator peptide (sensitivity 65%, specificity 89%). Platelets isolated from patients receiving clopidogrel therapy show a 70% reduction in ADP-induced aggregation, which is mirrored by an MS profile similar to that of resting platelets.10 Such analyses may be of particular benefit when assessing patient response to medication, highlighting appropriate therapeutic regimes. Similarly, case-control comparisons of urine protein profiles in coronary artery disease (CAD) demonstrated increased expression of a panel of 17 polypeptides in CAD patients. Using this pattern as a diagnostic tool, von Zur Muhlen et al. (2009) reported 81% sensitivity and 92% specificity using samples from 26 CAD patients and 12 controls in a blinded study. However, later analyses revealed a similar protein profile in patients with chronic renal failure, suggesting that renal damage may disturb the urine proteome and complicate proteomic analysis.62
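Sensitivity and specificity are simple ratios of classification counts, as the sketch below shows. The counts used here (21 of 26 patients and 11 of 12 controls correctly classified) are an assumption: they are the whole-number counts that round to the reported 81% and 92% for the stated group sizes, and are not taken from the cited study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased individuals correctly classified as diseased."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of healthy individuals correctly classified as healthy."""
    return true_neg / (true_neg + false_pos)


# Assumed counts consistent with 26 CAD patients and 12 controls (not from the source).
sens = sensitivity(true_pos=21, false_neg=5)
spec = specificity(true_neg=11, false_pos=1)
print(f"sensitivity: {sens:.1%}")
print(f"specificity: {spec:.1%}")
```

With only 12 controls, a single misclassified sample changes specificity by about 8 percentage points, which illustrates why validation in larger cohorts is needed before such figures can be relied upon.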
Proteomic techniques document the expression of molecules which directly influence cell phenotype, and therefore provide data of clinical relevance. These techniques can be applied to a wide variety of organisms and biological samples, and can be used to investigate protein expression in vitro or in vivo. Proteomic tools permit analysis of thousands of proteins and their PTMs and enable non-biased appraisals of the molecular biology of disease states, highlighting molecules which may be overlooked in hypothesis-driven scenarios. As described above, proteomic science is a relatively recent development. Currently proteomics receives many criticisms, notably low reproducibility evidenced by gel-to-gel and inter-laboratory variation. It is likely that these issues will diminish as protocols and reporting requirements become standardized and are universally applied. Identification of weakly expressed molecules via MS remains problematic, although techniques such as western blotting may be used to overcome this challenge.
The application of proteomics to vascular disease is at an early stage, and without subsequent downstream analyses, proteomic experiments merely provide lists of protein data with little practical value. Proteomic investigations generate large datasets which require expertise for meaningful analysis, especially in the case of large or complex experiments. The continued use of proteomics has provided the stimulus for improvements in data visualization, contributing to the establishment of bioinformatics as a specialist field. It is becoming common for proteomic data to be displayed as interaction networks, highlighting associations (e.g. synchronous up/down-regulation) between identified molecules.63, 64 This is of particular benefit when investigating disease mechanisms or therapeutic pathways. The most routine application of proteomic techniques in vascular medicine is to discover biomarkers of diagnostic, therapeutic and prognostic potential. However, protein identifications generated by proteomic techniques remain putative, requiring validation through more established approaches including ELISA, western blotting and immunohistochemistry, using reliable antibodies. Furthermore, in order to assess biomarker performance, molecules of interest must be validated in large patient populations using high-throughput techniques, which is time consuming and limited by access to large patient cohorts.
The investigative power of proteomics is greatly magnified when combined with other post-genomic techniques. Combined proteomic and metabolomic analyses provide direct evidence of the effect of protein expression on cellular processes. Using this approach, Perlman et al. (2009) investigated the cardioprotective mechanisms following nitrate exposure. Nitrate administration resulted in a short-lived increase in cardiac nitrate levels, but substantial elevations in cardiac ascorbate oxidation. This was accompanied by significant improvements in cardiac contractile recovery following ischemia-reperfusion after preconditioning with low (0.1 mg/kg) or high (10 mg/kg) nitrate doses. Proteomic analysis of cardiac mitochondria revealed dose-dependent PTM of isoforms of 3 proteins involved in serine/threonine kinase signaling, anti-oxidant protection and cell metabolism. This led the authors to suggest that a similar mechanism may underpin the cardioprotective value of physical exercise and a diet containing nitrite/nitrate-rich foods.65 Such systemic strategies have been successfully employed in other biological situations, although relatively few vascular investigations have embraced this multi-omics approach. Thus it seems likely that vascular medicine will greatly benefit from the future application of proteomics in a ‘multi-omics’ context.
JVM and JG thank Dr Cathy Rush for proof reading and helpful discussion. JG is supported by grants from the NHMRC, Australia (project grants 540403, 540404, 540405) and NIH, USA (RO1 HL080010), and a Practitioner Fellowship from the NHMRC, Australia (431503).
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.