The cellular adhesion pathway is critical in the pathophysiology of atherosclerosis, and genetic factors contributing to regulation of circulating levels of related proteins may be relevant to risk prediction of cardiovascular disease. In contrast to conducting separate genome-wide protein quantitative trait loci (pQTL) mapping analyses of each individual protein, joint genetic association analyses of multiple quantitative traits can leverage cross-trait co-variation and identify simultaneous regulatory effects on protein levels across the pathway. We conducted a multi-pQTL (mpQTL) analysis of 15 proteins related to cellular adhesion assayed on 2,313 participants from the Multi-Ethnic Study of Atherosclerosis (MESA). We applied the MQFAM multivariate association analysis method in PLINK on normalized protein level residuals derived from univariate linear regression, adjusting for age, sex, and principal components of ancestry. Race/ethnicity-stratified analyses identified nine genome-wide significant (P<5e-08) loci associated with co-variation of protein levels. Although the majority of these SNPs were in proximity to structural genes of the assayed proteins, we discovered multiple loci demonstrating co-association with the circulation of at least two proteins. Of these, two significant loci specific to non-Hispanic white participants, rs17074898 at ALOX5AP (P = 1.78E-08) and rs7521237 at KIAA1614 (P = 2.2E-08), would not have met statistical significance using univariate analyses. Moreover, common patterns of multi-protein associations were discovered at the ABO locus across race/ethnicity. These results indicate the biological relevance of blood group antigens on regulation of circulating cellular adhesion pathway proteins while also demonstrating race/ethnicity-specific co-regulatory effects.
The cellular adhesion pathway is a promising avenue of research for understanding molecular mechanisms of inflammation that could lead to biomarkers for prediction, early detection, and prognosis of atherosclerotic diseases. Members of these adhesion families in combination with a host of other types of molecules, including actin, thrombin, chemokines, kinases, transcription factors, cytokines, apolipoproteins, fibrinogen, growth factors, and matrix metalloproteinases, comprise the cellular adhesion pathway and link the processes of hemostasis, thrombosis, and inflammation. Although adhesion proteins have been shown to predict increased risk of incident CHD (Bielinski et al. 2015; Cesari et al. 2003; Demerath et al. 2001; Hwang et al. 1997), many potentially important proteins and the corresponding coding genes have yet to be fully studied. Single-protein genome-wide association studies (GWASs) have been conducted for only a few select adhesion proteins (Barbalic et al. 2010; Pare et al. 2008; Pare et al. 2011; Paterson et al. 2009; Reiner et al. 2013).
In GWAS where multiple traits (e.g., protein measurements) are available for association analysis, it is common practice to assess these traits in a separate, or univariate, manner, with post-hoc comparisons of associated loci across traits. However, recently developed multivariate analysis methods (Ferreira and Purcell 2009; O’Reilly et al. 2012; Stephens 2013) accommodate the simultaneous analysis of multiple potentially correlated traits. Multivariate GWAS methods have been shown to be as or more powerful than univariate GWAS approaches, taking advantage of cross-trait covariance and reducing multiple testing dimensionality (Galesloot et al. 2014). Simultaneous assessment of multiple protein measurements relevant to the adhesion pathway may provide additional context to genetic modulation of these proteins at a systemic level. Therefore, in this study, we present a multi-protein quantitative trait locus (mpQTL) association analysis to identify genetic variants associated with co-variation in circulating levels of 15 proteins related to cellular adhesion in the Multi-Ethnic Study of Atherosclerosis (MESA).
MESA is a multi-center study that enrolled African (AFA), Chinese (CHN), Hispanic (HIS), and non-Hispanic white (EUR) Americans with no history of CVD between 2000 and 2002 (Exam 1) in order to investigate subclinical and clinical cardiovascular endpoints in a large and diverse population. Participating centers were located in Baltimore, MD; Chicago, IL; Forsyth County, NC; Los Angeles County, CA; northern Manhattan, NY; and Saint Paul, MN. The MESA study has been described in detail elsewhere (Bild et al. 2002). In order to measure a large number of circulating adhesion proteins in a representative sample of the MESA population, a stratified random sample including 720 individuals for each of the four races/ethnicities represented in MESA was used (N = 2880). This subgroup was randomly generated from the parent MESA cohort including all Exam 1 participants who gave consent for DNA sample use, and was stratified by race/ethnicity in order to reflect the diversity of the main MESA study. MESA and its ancillary studies were approved by the Institutional Review Boards at the participating centers, and all participants gave written informed consent.
The assayed genome-wide genotype data consisted of four individual genotype panels on all MESA participants who consented for genetic studies: the Illumina Exome BeadChip (Huyghe et al. 2013), the Illumina Cardio-MetaboChip (Voight et al. 2012), the Illumina iSelect ITMAT/Broad/CARe (IBC) Chip (Keating et al. 2008), and the Affymetrix Genome-Wide Human SNP Array 6.0 (Affymetrix, Santa Clara). Quality control measures were performed on each panel individually prior to merging the genotype data using PLINK v1.07 (Purcell et al. 2007) under genome build NCBI build 37, and in total included 1,222,4822 directly genotyped polymorphic autosomal positions across all participants. Evaluation of relatedness revealed that multiple participants belonged to sibships; to eliminate family structure, one individual from each observed sibship was retained. Population stratification was assessed using STRUCTURE (Pritchard et al. 2000), and EIGENSTRAT (Price et al. 2006) was used to obtain ancestry informative principal components (PCs) based upon all unrelated MESA participants. The total number of informative PCs for subsequent analyses was selected based upon statistical and graphical evaluation of the corresponding eigenvalues.
Fifteen adhesion proteins were measured in plasma or serum samples obtained between September 2002 and February 2004 (Exam 2). Of the 2880 individuals included in the random sample, plasma was available for 2574 participants and serum was available for 2441 participants. Among these, 34 individuals were excluded due to CVD events prior to Exam 2, 1 due to cognitive impairment, and 3 due to inconsistencies between the field center where they were enrolled and their self-reported race/ethnicity, as ethnic groups were specific for each field center during recruitment. Plasma was collected using EDTA as an anticoagulant and stored on ice, whereas serum was obtained by allowing blood samples to clot at room temperature for 40 minutes. Both sample types were centrifuged at 4°C at either 2,000g × 15 minutes or 3,000g × 10 minutes, for a total of 30,000 g-minutes, and subsequently stored at −70°C. Quantitative sandwich enzyme-linked immunosorbent assays (ELISA) were used to measure soluble proteins. Chemokine ligand 21 (CCL21), hepatocyte growth factor (HGF), interleukin-2 soluble receptor α (IL-2sRα), pro-matrix metalloproteinase 1 (pro-MMP-1), matrix metalloproteinase 2 (MMP-2), secretory leukocyte protease inhibitor (SLPI), soluble E-Cadherin (sE-cadherin), soluble intercellular adhesion molecule 1 (sICAM-1), soluble L-selectin (sL-selectin), soluble vascular cell adhesion molecule 1 (sVCAM-1), and tissue inhibitor of metalloproteinase 2 (TIMP-2) were measured in serum, while soluble P-selectin (sP-selectin), regulated on activation normal T cell expressed and secreted (RANTES), stromal derived factor 1α (SDF-1α), and transforming growth factor β1 (TGFβ1) were measured in plasma. The inter-assay coefficient of variation and the minimum detectable level for each of the ELISA assays, as well as the specific type of assay used, are summarized in Table S1.
All circulating protein measurements were initially log-transformed and linear regression was used to obtain protein-level residuals after adjusting for age, sex, and the first three ancestry-informative PCs. Residuals were then centered and scaled per protein to unit variance prior to analysis. Multivariate association analysis of single SNPs with the vector of 15 cellular adhesion protein residuals was conducted using the MQFAM (Ferreira and Purcell 2009) multivariate test of association implemented in PLINK (Purcell et al. 2007). This method is based upon canonical correlation analysis (CCA) and identifies the linear combination of protein measurements (i.e., canonical variate) optimally correlated with SNP minor allele dosage under an additive genetic model, producing an F-statistic and a vector of CCA variable (i.e., protein) loadings, ω, such that −1 ≤ ωp ≤ 1 for p = 1,…,15. These loadings correspond to correlations between the protein measurement and the canonical variate, and can be interpreted as an indication of the relative importance of that given protein for the association. All polymorphic autosomal SNPs with an empirical minor allele frequency (MAF) ≥5% and variant genotype call rate ≥90% in the respective racial/ethnic strata were considered for analysis. The overall sign of the canonical loading vector is arbitrary, as CCA is based upon the solution of an eigenvalue equation; only the signs of the loadings relative to one another are determined. To determine directionality of effect, we used linear regression to estimate univariate associations with the protein corresponding to the largest loading magnitude. Since MQFAM can also accommodate missing phenotype data, participants were included in the analysis if at least 8 out of 15 (>50%) protein levels were observed. To adjust for multiple testing, a single-SNP association was declared to be statistically significant if the nominal P-value was below a genome-wide significance threshold of α = 5e-08.
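The single-SNP CCA test described above can be illustrated with a minimal sketch. With one SNP, the canonical variate coincides (up to scale) with the fitted values from regressing allele dosage on the traits, the squared canonical correlation equals that regression's R², and the F-statistic has (p, n − p − 1) degrees of freedom. This is not PLINK's MQFAM code: the function names (`solve`, `corr`, `single_snp_cca`) and the toy data are hypothetical, complete data are assumed (no missing phenotypes), and conversion of F to a P-value is omitted because it requires an F-distribution routine outside the standard library.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def center(v):
    mu = sum(v) / len(v)
    return [x - mu for x in v]

def corr(u, v):
    """Pearson correlation between two equal-length sequences."""
    u, v = center(u), center(v)
    num = sum(a * b for a, b in zip(u, v))
    return num / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

def single_snp_cca(dosage, traits):
    """CCA of one SNP against p traits.

    dosage: minor allele counts, length n.
    traits: list of p trait vectors (e.g., standardized residuals), each length n.
    Returns (squared canonical correlation, F-statistic, loadings).
    """
    n, p = len(dosage), len(traits)
    g = center(dosage)
    y = [center(t) for t in traits]
    # Normal equations for regressing dosage on the traits: (Y'Y) w = Y'g
    yty = [[sum(a * b for a, b in zip(y[i], y[j])) for j in range(p)] for i in range(p)]
    ytg = [sum(a * b for a, b in zip(y[i], g)) for i in range(p)]
    w = solve(yty, ytg)
    # Canonical variate: the linear combination of traits best correlated with dosage
    variate = [sum(w[k] * y[k][i] for k in range(p)) for i in range(n)]
    r2 = corr(variate, g) ** 2
    f_stat = (r2 / p) / ((1.0 - r2) / (n - p - 1))
    # Loadings: correlation of each trait with the canonical variate
    loadings = [corr(y[k], variate) for k in range(p)]
    return r2, f_stat, loadings

# Toy data (hypothetical): 8 subjects, 2 traits
toy_dosage = [0, 1, 2, 0, 1, 2, 1, 0]
toy_trait1 = [0.1, 0.9, 2.2, -0.3, 1.1, 1.8, 1.0, 0.2]
toy_trait2 = [0.5, -0.2, 0.3, 1.0, -0.7, 0.4, 0.0, 0.6]
r2, f_stat, loadings = single_snp_cca(toy_dosage, [toy_trait1, toy_trait2])
```

In this sketch the variate is oriented so that it correlates positively with dosage; that orientation is a convention of the example, and flipping it would flip every loading together.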
All analysis steps were stratified by race/ethnicity; to combine analysis p-values across race/ethnicity for purposes of trans-ethnic meta-analysis, we applied Stouffer’s weighted Z method(Stouffer 1949), with weights based on relative number of analyzed samples per race/ethnicity. As a p-value aggregation approach, this analysis does not take into account consistency of protein-specific effects (direction and/or magnitude) across race/ethnicity.
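As an illustration of this aggregation step, the following is a minimal sketch of Stouffer's weighted Z method for one-sided p-values. The function name `stouffer_weighted` and the p-values in the usage example are hypothetical. The weights are passed in directly; sample sizes are used here to mirror the description above, although square-root-of-sample-size weights are another common choice and the exact weighting used in the analysis is not specified beyond "relative number of analyzed samples".

```python
import math
from statistics import NormalDist

def stouffer_weighted(p_values, weights):
    """Combine one-sided p-values with Stouffer's weighted Z method.

    Each p-value is converted to a Z-score, the Z-scores are combined as
    sum(w_i * z_i) / sqrt(sum(w_i^2)), and the result is mapped back to an
    upper-tail p-value.
    """
    nd = NormalDist()
    # -inv_cdf(p) equals inv_cdf(1 - p) but stays accurate for very small p
    z = [-nd.inv_cdf(p) for p in p_values]
    z_comb = sum(w * zi for w, zi in zip(weights, z)) / math.sqrt(sum(w * w for w in weights))
    # Upper-tail normal probability via erfc, which avoids cancellation in the far tail
    return 0.5 * math.erfc(z_comb / math.sqrt(2.0))

# Usage: stratum sizes as reported for AFA, CHN, EUR, HIS; p-values are hypothetical
stratum_sizes = [516, 570, 611, 571]
hypothetical_p = [0.2, 1e-4, 1e-6, 1e-3]
combined_p = stouffer_weighted(hypothetical_p, stratum_sizes)
```

Because only p-values enter the calculation, two strata with opposite protein loading patterns contribute exactly as two strata with matching patterns would, which is the limitation noted above.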
After subject-level exclusions, a total of N = 2,268 participants (NAFA = 516, NCHN = 570, NEUR = 611, NHIS = 571) were eligible for mpQTL association analysis. Distribution summaries of the 15 proteins, separated by race/ethnicity, are presented in Table 1. A total of 14 out of 15 proteins exhibited significant mean differences in protein values by race/ethnicity when adjusting for age and sex; results were comparable with additional covariate adjustment of body mass index, alcohol use, and smoking status. In general, CHN participants tended to have lower circulation of cellular adhesion proteins than the other races/ethnicities, with the notable exception of TGFβ-1. In contrast, EUR and HIS participants tended to have higher protein concentrations, while AFA participants corresponded to the lowest levels of sE-cadherin and sVCAM-1. Graphical representations of the correlation structures of the residual protein measurements by race/ethnicity are presented in Figure S1, indicating strong positive correlations between protein pairs TIMP-2/MMP2 and RANTES/TGF-β1 as well as comparable overall patterns across racial/ethnic groups.
Overall, 970,081 common (MAF ≥5%) SNPs were assessed in our stratified analyses, and the number, genetic location, and pattern of protein association for significantly associated SNPs varied by race/ethnicity. A total of nine distinct loci corresponded to significant associations in at least one race/ethnicity (Table 2), with 13 proteins exhibiting high-magnitude (|ωp|>0.25) loadings among the top findings. Despite multiple strong associations identified in the other races/ethnicities, no significant findings were observed in African Americans. Most associated SNPs were in close physical proximity to one of the structural genes of the assayed proteins, correspondent to high-magnitude CCA protein loadings for that specific protein. Moreover, many protein loading patterns for loci significant across multiple racial/ethnic strata were similar in direction and magnitude. The complete set of significant and suggestive (P<1e-05) stratified analysis results with corresponding protein loadings are presented in Tables S2–S5 and as Manhattan plots (Figures S2–S5).
There were a total of 850,009 SNPs that had association results for >1 race/ethnicity and were evaluated under our trans-ethnic meta-analysis (Table S6). The combined race/ethnicity analysis p-values are presented as a Manhattan plot in Figure 1, revealing seven distinct significant loci. Of these, only one (rs12722588 near IL2RA; P = 7.66E-16) was not previously identified in the stratified analysis. IL2RA encodes the alpha chain of the IL-2sR protein complex. The significant meta-analysis SNP rs492602, although not in linkage disequilibrium (LD) with the nearby CHN-associated FUT2 SNP rs1047781, tags SNP rs601338 in non-Asian populations (r2 > 0.99). Of note, rs1047781 and rs601338 respectively represent population-specific functional SNPs that knock out FUT2-encoded protein function (Kelly et al. 1995).
Two associated loci were identified solely in non-Hispanic whites. SNP rs17074898 at the ALOX5AP locus (P = 1.78E-08) demonstrated high-magnitude loadings for multiple proteins, including sP-selectin, SDF-1α, and TIMP-2, indicating a potential pathway-wide effect. SNP rs7521237 at the KIAA1614 locus (P = 2.2E-08) corresponded to similar loading patterns. Protein loadings and univariate protein association analysis results using linear regression are presented in Figure 2 for both SNPs. Notably, neither of these loci would have been identified by single protein QTL analysis under the same genome-wide significance threshold (min. P = 1.3E-05).
Multiple significant SNPs were identified in proximity to the respective protein coding genes. For example, SNP rs6136 at the SELP/SELL locus was significantly associated with protein levels in EUR and marginally associated in HIS (P = 2.84E-07). Both findings demonstrated strong loadings for sP-selectin and, to a lesser degree, sL-selectin; however, in HIS, loading magnitudes were also high (|ωp|>0.25) for four additional proteins (RANTES, SLPI, TGFβ-1, sICAM-1). Similar findings in our analyses were identified for MMP1 (MMP1), RANTES (CCL5), IL-2sR (IL2RA), and sICAM-1 (ICAM1). Investigation into potential regulatory function using HaploReg (Ward and Kellis 2012) revealed that all but two SNPs (rs10905876 and rs12722588 near IL2RA) corresponded to cis-acting expression QTLs (cis-eQTLs) for the proximal protein coding genes in relevant tissues (Table S7).
We observed multiple significant mpQTL SNP associations within ABO separately across CHN, EUR, and HIS, as well as in our combined meta-analysis. Moreover, these SNPs are known to collectively tag functional alleles that define the ABO blood group (i.e., A, B, and O). Effect heterogeneity was observed across ABO SNPs as well as by racial/ethnic strata, as indicated by the respective SNP protein loading vectors for CHN, EUR, and HIS (Figure 3). These differences were most pronounced for sE-cadherin and sICAM-1, with HIS results exhibiting reduced magnitude effects relative to CHN and EUR participants.
Using a large, multi-ethnic cohort, we observed substantial variation in protein levels by race/ethnicity, indicating potential genetic factors regulating cellular adhesion pathway activity. Although mean levels of the majority of circulating adhesion proteins (e.g., sP-selectin, sL-selectin, sICAM-1) were generally lowest amongst Chinese American participants, rates of subclinical atherosclerosis (defined by presence of coronary artery calcium) for CHN in MESA (59.2%) are second only to EUR participants (70.1%)(Bild et al. 2005). These results indicate that there may exist complex regulatory effects on circulating cellular adhesion protein profiles in atherogenesis beyond broad unidirectional dysregulation of the pathway. Multivariate methods for mpQTL mapping can identify pathway-wide genetic effects of protein regulation while leveraging cross-trait correlation to improve statistical power. Using a large, multi-ethnic cohort, we identified multiple genetic loci with strong multivariate associations with circulating levels of cellular adhesion proteins assayed in serum or plasma. Many of the associated SNPs identified were in close proximity to the structural genes of one of the measured proteins and corresponded to prior evidence of eQTL associations. For example, SNP rs12938, located in the 3′ UTR of SELL, demonstrated high-magnitude loadings for sL-selectin across all race/ethnicities. Beyond structural genes for the measured proteins, some distal loci also demonstrated multivariate associations with adhesion proteins, such as ABO.
ABO blood type has been widely studied for risk association with atherosclerosis (Gong et al. 2014) and CVD (He et al. 2012; Medalie et al. 1971; Saha et al. 1973; von Beckerath et al. 2004; Zhang et al. 2012) and is best known for its strong association with circulating von Willebrand Factor, a procoagulant with 22–30% reduced levels in Type O subjects relative to non-O individuals (Souto et al. 2000; Tirado et al. 2005). This glycoprotein is recognized to be directly modified by ABO glycosyltransferases, and Type O subjects have been shown to be at reduced risk for venous thromboembolism (Ohira et al. 2007; Sode et al. 2013; Wiggins et al. 2009; Zakai et al. 2014). Recent GWASs (Barbalic et al. 2010; Chen et al. 2015; Kiechl et al. 2011) have also identified multiple ABO SNP associations with circulating levels of a number of markers of endothelial function, including sP-selectin and sICAM-1. Mechanistically, it has been suggested that ABO glycosyltransferases may regulate endothelial markers through cleavage or proteolysis, since ABO associations with sP-selectin have not been replicated for platelet-bound levels (Barbalic et al. 2010). Our mpQTL findings are consistent with these reports while also providing additional context of other protein effects across the cellular adhesion pathway. Specifically, SNPs perfectly tagging the A1 allele (e.g., rs507666) corresponded to lower sP-selectin and sE-cadherin levels across all races/ethnicities. Moreover, in AFA (P = 3.4E-06), we observed evidence of increased levels of RANTES and TGFβ-1, while HIS participants exhibited increased circulation of SLPI. These results, like others (Chen et al. 2015; Kiechl et al. 2011), seemingly contradict the hypothesis that sP-selectin may mediate the association between ABO blood type and atherosclerotic CVD, as increased sP-selectin has been shown to correlate with higher CVD risk (Ridker et al. 2001).
However, potential explanations have been proffered that accommodate this apparently paradoxical relationship(Kiechl et al. 2011). While underlying population differences in overall ABO allele frequencies may contribute to the significant differences we observed in cellular adhesion protein distributions by race/ethnicity, effect heterogeneity of ABO alleles across populations may also be a contributing factor. Further research is necessary to understand the molecular mechanisms underlying these racial/ethnic differences and their role in atherogenesis.
ABO and FUT2 SNP associations with the canonical variates also indicated sizable loadings for sE-cadherin. E-cadherin plays a critical role in epithelial cell-cell adhesion, and its soluble form is a biomarker for inflammatory response. Lower circulating levels of sE-cadherin are also indicative of reduced cellular adhesion and increased cellular motility, and loss of E-cadherin expression is commonly observed in many malignant cancers (van Roy and Berx 2008). FUT2 encodes the enzyme galactoside 2-alpha-L-fucosyltransferase 2 (FUT2), a Golgi stack membrane protein that contributes to the synthesis of an H antigen precursor and is of relevance to the Lewis blood group system. Subjects with at least one functional FUT2 allele are referred to phenotypically as secretors, characterized by the additional presence of ABH antigens in plasma and other bodily fluids (Slomiany and Slomiany 1978). Our findings for both CHN and our meta-analysis demonstrate independent associations of population-specific functional FUT2 knockout variants, lending credence to the non-secretor phenotype playing a large role in E-cadherin circulation. E-cadherin is one of many proteins targeted for glycosylation, and sugar remodeling is critical for E-cadherin functional regulation (Zhao et al. 2008). Moreover, core N-glycan fucosylation of E-cadherin by fucosyltransferase FUT8 has been shown to be critical to regulation of E-cadherin expression and function (Osumi et al. 2009). Oligosaccharide modification by either ABO glycosyltransferases or FUT2 may similarly alter E-cadherin and impact cellular adhesion.
The SNP rs17074898, significantly associated with adhesion protein levels in EUR (P = 1.78E-08), corresponded to high-magnitude loadings across a large number of proteins, particularly sP-selectin, TIMP-2, and SDF-1α. This SNP is located approximately 24 kb upstream of the gene ALOX5AP and has been previously identified as a cis-eQTL for that gene in whole blood (Westra et al. 2013). ALOX5AP encodes the protein FLAP, the activating protein for 5-lipoxygenase (5-LOX), which in turn is necessary for leukotriene biosynthesis (Vickers 1995). Leukotrienes are inflammatory mediators with evidence of involvement in atherosclerosis (Capra et al. 2013; Haeggstrom and Funk 2011). The ALOX5AP locus itself has also been genetically associated with an increased risk of stroke and/or myocardial infarction in multiple studies (Helgadottir et al. 2005; Helgadottir et al. 2004; Kajimoto et al. 2005), and FLAP is a therapeutic target for asthma and CVD (Evans et al. 2008). Our findings in EUR suggest FLAP dysregulation may have broad downstream regulatory effects on the cellular adhesion pathway and provide additional mechanistic information linking FLAP with CVD. The functional context of EUR-associated SNP rs7521237, displaying a similar protein association pattern to rs17074898, is less clear. An intronic variant within uncharacterized protein gene KIAA1614, rs7521237 has been previously reported to be a cis-eQTL of neighboring gene STX6 (syntaxin-6) in brain tissues (Zou et al. 2012). Syntaxin-6 has been shown to play an important role in angiogenesis by regulating trafficking of VEGFR2 and α5β1 integrin within endothelial cells (Jung et al. 2012), which in turn is regulated by cholesterol levels within the trans-Golgi network (Reverter et al. 2014). Although rs17074898 was also evaluated in HIS, this SNP was not genome-wide significant (P = 0.05). Similarly, SNP rs7521237 was evaluated for mpQTL association in AFA (MAF = 0.24) and found to be non-significant (P = 0.95).
The population-specificity of these associations in non-Hispanic white participants may indicate effect-modifying exposures and/or differences in underlying local linkage disequilibrium patterns, and validation in an independent cohort will be necessary.
There were no genome-wide significant mpQTL associations in African Americans despite relatively comparable sample size and protein measurement distributions to other racial/ethnic strata. Examination of AFA mpQTL association results for independent SNPs present in Table 2, significant in other races/ethnicities, revealed enrichment for lower p-values (Fisher’s combined P < 1e-16) and comparable protein loading patterns for SNPs with PAFA < 0.05. This indicates that many of these effects may be present in African Americans, albeit with more modest magnitude, although the biological reasons for this are not immediately clear.
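For readers unfamiliar with the enrichment test mentioned above, here is a minimal sketch of Fisher's combined probability test. For k independent p-values, the statistic X = −2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom under the null hypothesis, and because the degrees of freedom are even, the upper-tail probability has a closed form. The function name `fisher_combined` and the input values are hypothetical and are not the AFA results.

```python
import math

def fisher_combined(p_values):
    """Fisher's combined probability test for independent p-values.

    Returns the upper-tail probability of X = -2 * sum(ln p_i) under a
    chi-square distribution with 2k degrees of freedom, using the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!  (valid for even degrees of freedom).
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # equals X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total

# Usage with hypothetical p-values for a set of independent SNPs
hypothetical_afa_p = [0.04, 0.20, 0.003, 0.01, 0.65]
combined = fisher_combined(hypothetical_afa_p)
```

The test assumes the p-values are independent, which is why restricting to independent SNPs, as described above, matters.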
There are many notable strengths to this study, including the extensive amount of genotype data collected, the high quality and simultaneous assessment of the assayed proteins, and the well-studied multi-ethnic cohort. Multiple associations were also reproduced across race/ethnicity in our stratified analyses, providing internal replication of mpQTL loci for the cellular adhesion pathway. However, the numbers of participants in each race/ethnicity stratum, ranging from N = 516 to N = 611, are relatively low by commonly accepted GWAS standards and precluded evaluation of low-frequency/rare variation. Consequently, any modest effects from common variants and/or high impact rare variants will have gone undetected. Finally, additional independent replication of race/ethnicity-specific associations is necessary.
In summary, our mpQTL analysis of circulating levels of proteins in the cellular adhesion pathway revealed multiple significant associations of biological relevance while reproducing many previously identified univariate associations. Many findings indicated potential race/ethnicity-specific effects and/or allelic heterogeneity, notably at blood antigen loci, and may be of particular interest for resolving population disparities in cardiovascular disease. Further research into the molecular mechanisms behind the modulation of adhesion pathway protein levels may improve our understanding of cellular adhesion and cardiovascular risk.
Cardiometabochip genotyping data was supported in part by grants and contracts R01HL98077, N02-HL-64278, HL071205, UL1TR000124, DK063491, RD831697, and P50 ES015915. Funding for CARe IBC chip genotyping was provided by NHLBI Contract N01-HC-65226. Although the research described in this presentation has been funded in part by the United States Environmental Protection Agency through RD831697 to the University of Washington, it has not been subjected to the Agency’s required peer and policy review and therefore does not necessarily reflect the views of the Agency and no official endorsement should be inferred. Funding for adhesion protein levels was provided by NHLBI by grant R01HL98077. MESA and the MESA SHARe project are conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with MESA investigators. Support for MESA is provided by contracts N01-HC-95159, N01-HC-95160, N01-HC-95161, N01-HC-95162, N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-TR-001079, UL1-TR-000040, and DK063491.
The authors thank the other investigators, the staff, and the participants of the MESA study for their valuable contributions. A full list of participating MESA investigators and institutions can be found at http://www.mesa-nhlbi.org.
The authors declare no conflicts of interest.