Hepatocyte growth factor (HGF) is a mesenchyme-derived pleiotropic factor that regulates cell growth, motility, mitogenesis, and morphogenesis in a variety of cells, and increased serum levels of HGF have been linked to a number of clinical and subclinical cardiovascular disease phenotypes. However, little is currently known regarding what genetic factors influence HGF levels, despite evidence of substantial genetic contributions to HGF variation. Based upon ethnicity-stratified single-variant association analysis and trans-ethnic meta-analysis of 6201 participants of the Multi-Ethnic Study of Atherosclerosis (MESA), we discovered five statistically significant common and low-frequency variants: HGF missense polymorphism rs5745687 (p.E299K) as well as four variants (rs16844364, rs4690098, rs114303452, rs3748034) within or in proximity to HGFAC. We also identified two significant ethnicity-specific gene-level associations (A1BG in African Americans; FASN in Chinese Americans) based upon low-frequency/rare variants, while meta-analysis of gene-level results identified a significant association for HGFAC. However, identified single-variant associations explained modest proportions of the total trait variation and were not significantly associated with coronary artery calcium or coronary heart disease. Our findings indicate genetic factors influencing circulating HGF levels may be complex and ethnically diverse.
Hepatocyte growth factor (HGF) is a mesenchyme-derived pleiotropic factor that regulates cell growth, motility, mitogenesis, and morphogenesis in a variety of cells (Zarnegar & Michalopoulos, 1995). It is secreted as a single polypeptide that is subsequently cleaved to form an active heterodimer (Nakamura & Mizuno, 2010). HGF initiates a tyrosine kinase signaling cascade as the sole ligand to c-Met, an epithelial cell membrane receptor, which in turn induces a variety of biological effects related to tissue regeneration and repair. Originally discovered as a mitogen for hepatocytes (Nakamura et al., 1987, Nakamura et al., 1984), several in vitro studies (Matsumoto & Nakamura, 1992, Nakamura, 1991) have shown HGF to have a regenerative effect on multiple tissue types and elevated circulating HGF levels have been observed in organ injury patients (Funakoshi & Nakamura, 2003). Despite increased circulation of HGF resulting in a systemic exposure of the protein, accumulation of HGF remains localized to the injured tissue (Tajima et al., 1992).
While increased circulation of HGF is a biological response to acute tissue injury, elevated levels have also been associated with a wide variety of cardiovascular disease (CVD) states. In humans, serum HGF has been associated with advanced age, current smoking and diabetes (Miller et al., 2014), systolic blood pressure (Aspinall & Pockros, 2006, Billich, 2002), intima-media thickness (Miller et al., 2014, Billich, 2002), and aorto-iliac artery atherosclerosis (Nieva et al., 2012). Obesity has been associated with higher levels of serum HGF (Rehman et al., 2003), with concomitant decreases following weight loss (Swierczynski et al., 2005). Higher serum HGF has also been associated with presence of coronary atherosclerosis (Mamelak et al., 2012) and chronic renal disease (Tsien et al., 2009). Furthermore, higher serum concentrations have been observed in patients with myocardial infarction, unstable angina, acute aortic dissection, and pulmonary thromboembolism compared to patients with stable angina (Solivetti et al., 1998).
Although a familial aggregation study (Vistoropsky et al., 2008) demonstrated circulating HGF to be highly heritable (h² = 48.4%), little is currently known regarding what genetic factors influence HGF levels. Previous genetic association analyses with single nucleotide polymorphisms (SNPs) have been limited to two narrowly focused studies: a single candidate-gene study, which associated CYP19A1 haplotypes with plasma HGF levels in post-menopausal women (Lin et al., 2012); and a follow-up analysis of a keratoconus risk variant (HGF SNP rs3735520), which was associated with increased serum HGF levels in an Australian control population (Burdon et al., 2011). Genetic variants within the protein-coding gene HGF have also been associated with a number of complex diseases, including hypertension (Motone et al., 2004), myopia (Yanovitch et al., 2009), and ovarian cancer (Goode et al., 2011), although the connection between these variants and circulating HGF levels is unknown. Characterization of genetic factors that influence HGF levels may provide additional insight into biological mechanisms behind HGF regulation and identify new biomarkers for disease risk.
In this study, we present the first extensive investigation of common, low-frequency, and rare-variant genetic associations with serum HGF levels in a large, multi-ethnic cohort. In addition to ethnicity-stratified analyses, we also employ meta-analytic methods to combine results across sub-cohorts, gaining insight into genetic effects common across ethnicities.
The Multi-Ethnic Study of Atherosclerosis (MESA), described in greater detail elsewhere (Nieva et al., 2012), enrolled 6814 participants aged 45-84 from 2000-2002 without existing clinical CVD. This parent study population included 38% non-Hispanic White American (EUR), 28% African American (AFA), 22% Hispanic American (HIS), and 12% Chinese American (CHN) subjects. MESA participants were examined at six field centers located in Baltimore, MD; Chicago, IL; Forsyth County, NC; Los Angeles County, CA; northern Manhattan, NY; and Saint Paul, MN. For this study, Exam 1 serum was available for HGF measurement on 6769 participants. MESA and its ancillary studies were approved by the Institutional Review Board at participating centers and all participants gave written informed consent.
At each visit, information on demographics, cardiovascular risk factors, past medical history and co-morbidities, social history, family history, and medications was collected through a combination of self-administered and interviewer-administered questionnaires. Height was measured while participants were standing without shoes, heels together against a vertically mounted ruler. Body mass index (BMI) was calculated as weight (kg)/height² (m²). Resting seated blood pressure was measured three times using an automated oscillometric method (Dinamap), and the average of the second and third readings was used in analyses. Hypertension was defined according to the Seventh Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure (JNC 7) guidelines as systolic blood pressure ≥140 mmHg, diastolic blood pressure ≥90 mmHg, or use of anti-hypertensive medications (Chobanian et al., 2003). Diabetes was defined as a self-reported physician diagnosis, use of diabetic medication, a fasting glucose ≥126 mg/dL, or a non-fasting glucose ≥200 mg/dL. Serum glucose was assayed by a hexokinase/glucose-6-phosphate dehydrogenase method. Triglycerides were measured in plasma by a glycerol-blanked enzymatic method, and cholesterol was measured in plasma using a cholesterol oxidase method. HDL cholesterol was measured by the cholesterol oxidase method after precipitation of non-HDL cholesterol with magnesium/dextran. LDL cholesterol was calculated via the Friedewald equation in specimens having triglycerides <400 mg/dL.
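The derived clinical variables above can be sketched in a few lines of Python. This is an illustrative sketch only, not the MESA analysis code: the function and argument names are hypothetical, and the Friedewald estimate is written in its standard mg/dL form (LDL = total cholesterol − HDL − triglycerides/5).

```python
from typing import Optional


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def friedewald_ldl(total_chol: float, hdl: float,
                   triglycerides: float) -> Optional[float]:
    """Estimate LDL cholesterol (mg/dL) as TC - HDL - TG/5.

    Returns None when triglycerides are >= 400 mg/dL, since the text
    only calculates LDL for specimens below that level.
    """
    if triglycerides >= 400:
        return None
    return total_chol - hdl - triglycerides / 5.0


def has_hypertension(sbp: float, dbp: float, on_medication: bool) -> bool:
    """JNC 7 style definition as stated in the text (mmHg thresholds)."""
    return sbp >= 140 or dbp >= 90 or on_medication


def has_diabetes(self_reported: bool, on_medication: bool,
                 glucose_mg_dl: float, fasting: bool) -> bool:
    """Diabetes definition as stated in the text (mg/dL thresholds)."""
    cutoff = 126 if fasting else 200
    return self_reported or on_medication or glucose_mg_dl >= cutoff
```

For example, under these assumptions a participant with total cholesterol 200, HDL 50, and triglycerides 150 mg/dL would have an estimated LDL of 120 mg/dL.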
Blood samples were obtained from fasting participants at MESA Exam 1 and processed within 30 minutes of phlebotomy. Serum was obtained by allowing blood samples to clot at room temperature for 40 minutes. Samples were centrifuged at 4°C at 2000 g × 15 minutes or 3000 g × 10 minutes for a total of 30,000 g-minutes and subsequently stored at −70°C. DNA extraction was performed for genotyping assays.
Circulating levels of HGF protein were measured at MESA Exam 1 in serum by a quantitative sandwich enzyme-linked immunosorbent assay (ELISA) using the Human soluble HGF/CD62P Immunoassay kit (R&D Systems, Minneapolis, MN), with a lower limit of detection of 40 pg/mL. The interassay laboratory coefficients of variation for the HGF method were 12.0%, 8.0%, and 7.4% at respective mean concentrations of 686.6, 2039.1, and 4079.5 pg/mL for lyophilized manufacturer’s controls; and 10.4% at a mean concentration of 687.7 pg/mL for an in-house pooled serum control.
The assayed genotype data consisted of three individual genotype panels on all MESA participants who consented to genetic studies: the Illumina Exome BeadChip (Huyghe et al., 2013), the Illumina Cardio-MetaboChip (Mamelak et al., 2012), and the Illumina iSelect ITMAT/Broad/CARe (IBC) Chip (Keating et al., 2008). Quality control measures were performed on each of the three panels individually before the genotype data were merged using PLINK v1.07 (Purcell et al., 2007) under genome build NCBI build 37. This produced a genotype dataset for all MESA samples that passed quality control procedures, with a total sample size of N = 6323 (AFA = 1635; CHN = 766; EUR = 2491; HIS = 1431) for 377,173 variants with minor allele frequency (MAF) >0% in at least one subcohort. Population stratification was assessed using STRUCTURE (Pritchard et al., 2000) and EIGENSTRAT (Price et al., 2006) for participants with genome-wide SNP data. All variants in the final merged dataset were annotated using BioR (Kocher et al., 2014). All reported calculations of linkage disequilibrium are based on the 1000 Genomes EUR Phase 1 population unless otherwise noted.
SNP-trait associations for HGF serum levels were tested by linear regression under an assumed additive genetic model with PLINK v1.07 (Purcell et al., 2007), adjusting for age, sex, BMI, smoking status, and the first three ancestry-informative principal components (PCs). Analyses were conducted stratified by ethnicity on any autosomal SNP in the merged genotype data, and SNPs with a subcohort-specific empirical MAF <0.5% and/or call rate <90% were excluded from corresponding single-variant analysis. To combine results across ethnicity, trans-ethnic meta-analysis of association results on all SNPs observed in ≥2 subcohorts was performed using METASOFT (Han & Eskin, 2011), which provides association results under a fixed-effect model. Cochran’s Q statistics and accompanying p-values were calculated to evaluate effect heterogeneity across ethnicities.
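To make the fixed-effect combination step concrete, the following is a minimal sketch that assumes the standard inverse-variance weighting of per-subcohort effect estimates, together with Cochran's Q. It is not METASOFT itself; the function name and the input values in the usage note are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effect meta-analysis with Cochran's Q.

    betas: per-subcohort regression coefficients for one SNP
    ses:   their standard errors (same order)
    """
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    # Pooled effect is the weighted mean of the subcohort effects.
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    z = beta / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    # Cochran's Q: weighted squared deviations from the pooled effect.
    # Under homogeneity Q follows a chi-square with k - 1 degrees of freedom.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    return {"beta": beta, "se": se, "z": z, "p": p,
            "q": q, "q_df": len(betas) - 1}
```

With two hypothetical subcohort estimates of −80 and −90 pg/mL per allele, each with standard error 10, `fixed_effect_meta([-80.0, -90.0], [10.0, 10.0])` gives a pooled effect of −85 with a smaller standard error than either input, and a Q statistic of 0.5 on 1 degree of freedom.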
For low-frequency and rare variants, single-variant association testing may be less powerful than gene-level aggregation tests, which combine the effects of multiple variants within a given gene (Lee et al., 2014). Ethnicity-stratified gene-based tests were performed such that variants were included in a gene set if they corresponded to an ethnicity-specific MAF ≤5% as well as a maximum MAF across ethnicities ≤10%. Furthermore, we selected variants that were potentially functional based upon the BioR annotation, defined as being either non-synonymous, start- or stop-codon altering, or a splice-site variant for the given gene. To avoid Type I error inflation due to unstable parameter estimation, gene-level associations were evaluated if at least two such variants were present and if the total minor allele count (MAC) across all included variants was ≥20. Testing was conducted using the optimally unified rare-variant association method SKAT-O (Lee et al., 2012), which combines gene-burden testing with the kernel machine method SKAT (Wu et al., 2011). These analyses were adjusted for the same set of covariates used in the single-variant analyses. We additionally conducted a meta-analysis of the ethnicity-specific test results using MetaSKAT (Lee et al., 2013), assuming fixed effects across subcohorts under the SKAT-O model. All testing was completed under the linear-weighted kernel using the default Beta distribution weighting scheme, along with ethnicity-specific MAFs for MetaSKAT.
To adjust for multiple testing, a single-SNP association was declared to be statistically significant if the nominal P-value was below a Bonferroni-corrected alpha level of 0.05, with the adjustment factor based upon the total number of SNPs evaluated across any subcohort (255,930 variants, P < 1.95E-07). Similarly defined thresholds were applied for the gene-level testing based upon the total number of evaluated genes (12,492 genes, P < 4.00E-06).
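The two significance thresholds follow directly from dividing the family-wise alpha by the number of tests. A short check of that arithmetic, using a hypothetical helper name:

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Counts taken from the text: SNPs evaluated in any subcohort, and genes tested.
snp_threshold = bonferroni_threshold(0.05, 255_930)
gene_threshold = bonferroni_threshold(0.05, 12_492)

print(f"single-SNP threshold: {snp_threshold:.2E}")
print(f"gene-level threshold: {gene_threshold:.2E}")
```

Both values agree with the reported thresholds after rounding to three significant figures.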
Follow-up analysis for all significant single-variant meta-analysis results was performed for marginal association with subclinical and clinical CVD by examining their association with coronary artery calcium (CAC) and coronary heart disease (CHD) across the entire MESA cohort with available genetic data (Table S1), stratified by ethnicity. We evaluated CAC Agatston score associations using a Tobit model (Fornage et al., 2004) to account for inflated zero measurements, while incident CHD was evaluated under a Cox proportional hazards model. All associations were considered under both a minimally adjusted model (age and sex) and a fully adjusted model (age, sex, BMI, smoking and alcohol use status, LDL and HDL cholesterol, triglycerides, and hypertension and diabetes status).
A total of 6201 participants (AFA=1550; CHN=762; EUR=2477; HIS=1412) with genotype information (subject call rate >90%) were successfully phenotyped for association analysis in our study. Distributional characteristics of the genotyped variants per ethnicity can be found in Table S2 and Figure S1, while summaries of the HGF levels and their association with traditional CVD risk factors are presented in Table 1 and Table S3, respectively.
Manhattan plots for the stratified single-SNP association analyses are presented in Figure S2. The top SNPs per ethnicity are reported in Table 2. Only one SNP, HGF missense variant rs5745687 (PEUR = 5.374E-11), exceeded the statistical significance threshold, while rs200231675 was borderline significant (PCHN = 2.68E-07). The variant allele for rs5745687 showed strong evidence of association with reduced HGF levels in the EUR subcohort, with each additional copy corresponding to a decrease in serum HGF of 85.7 pg/mL.
A Manhattan plot of the single-variant trans-ethnic meta-analysis results is presented in Figure 1, with significant findings highlighted in Table 3. Statistically significant results were localized to two loci: EUR-associated SNP rs5745687 (PMETA = 2.88E-17) as well as a cluster of four SNPs (rs16844364, rs4690098, rs114303452, rs3748034) within or in proximity to gene HGFAC. Regional association plots (Pruim et al., 2010) for these loci are presented in Figure 2. Although Cochran’s Q statistic is known to be conservative (Higgins et al., 2003), results for these five SNPs indicated little to no evidence of variation in true effect size across subcohorts. Analyses assuming effect heterogeneity (Han & Eskin, 2011) did not result in any changes to significance status. A QQ-plot of all single-SNP analyses (meta- and stratified) is displayed in Figure S3, while the proportion of explained variation of HGF by clinical characteristics and significant meta-analysis SNPs is reported in Table 4. Overall, the explained variability by the associated SNPs was very modest, ranging from 0.9% to 2.6%. Empirical evaluation of pair-wise LD among the four significant SNPs at the HGFAC locus across ethnicity indicated moderate LD between rs16844364 and rs3748034 (r² from 0.13 to 0.47), with all other values <0.05.
Although the previously reported associated SNP rs3735520, located in the promoter region of HGF, was included in our genotype data, we did not replicate the finding in any of our analyses (min. P-value: PAFA=0.018). We similarly found no significant associations with any variants in the aromatase gene CYP19A1, although those findings corresponded to a highly specific study population. HGF variants in our data previously associated with disease phenotypes (rs2074725 and hypertension (Motone et al., 2004), rs2214825 and ovarian cancer mortality (Goode et al., 2011)) also did not achieve statistical significance in our study.
A QQ-plot of the complete gene-level association results is presented in Figure S4, with significant and suggestive (P<1E-04) findings reported in Table S4. Two genes were significantly associated in our stratified analyses: A1BG (PAFA = 6.93E-07; Table S5) and FASN (PCHN = 4.34E-07; Table S6), while no significant gene-based findings were observed for the remaining subcohorts. Results from the MetaSKAT analysis revealed HGFAC to be significant (P = 5.09E-07; Table S7) when combining results across ethnicities. Examination of the stratified analyses for HGFAC suggests this meta-analysis result to be driven largely by variants present in the EUR and HIS subcohorts (PAFA = 0.706, PCHN = 0.77, PEUR = 1.12E-05, PHIS = 0.013).
Table S8 reports the ethnicity-specific association p-values of serum HGF with CAC and CHD under minimally and fully adjusted models. These findings indicate strong associations between HGF and both clinical and subclinical disease in European and African Americans, with some attenuation after adjustment for risk factors in the latter. Association analysis results for the five significant meta-analysis SNPs with CAC and CHD among all subjects with available genetic data are reported by ethnicity in Table S9. No findings were significant after adjusting for multiple testing (5 SNPs by 4 subcohorts by 2 phenotypes; α = 0.05/40 = 0.00125).
In this study, we evaluated common and rare genetic associations with circulating levels of serum HGF in four ethnically distinct populations within the MESA cohort, constituting the first large-scale genetic association analysis for this trait. HGF is highly expressed in vascular smooth muscle cells (Zarnegar & Michalopoulos, 1995) and plays a critical part in the regulation of cellular adhesion. Moreover, HGF is present in atherosclerotic lesions but not in normal vessels (Solivetti et al., 1998) and c-Met is expressed on vascular smooth muscle cells isolated from the intima of atherosclerotic plaque of carotid arteries (Fischer et al., 2014). Consequently, identification of genetic associations with circulating HGF may elucidate heritability of CVD and provide further insight into the molecular mechanisms of atherosclerosis.
Overall, we identified five single variants to be significantly associated with serum HGF levels based on stratified and meta-analysis findings. HGF polymorphism rs5745687 was highly significant in both our EUR analysis and our trans-ethnic meta-analyses, with consistent effect size observed across ethnicities. The SNP rs5745687 is a missense variant (p.E299K) located in exon 8 of HGF, the gene encoding the HGF protein. Annotation analysis using HaploReg V2 (Ward & Kellis, 2012) with the 1000 Genomes EUR Phase 1 population data did not indicate any other variants in high LD (r² > 0.6) with rs5745687, while functional predictions from SIFT (Kumar et al., 2009) and PolyPhen (Adzhubei et al., 2013) classified the variant allele effect to be tolerated and benign, respectively. Despite evidence that rs5745687 is unlikely to be tagging an underlying causal SNP, the functional role of the variant allele in relation to circulating HGF is unclear.
We also identified four single-SNP associations within or in proximity to gene HGFAC based upon our trans-ethnic meta-analyses, and HGFAC was a significant gene-level finding in our MetaSKAT rare-variant meta-analysis. HGFAC encodes the serine protease HGFA that activates HGF by cleaving the single chain pro-HGF to form the active heterodimeric protein (Kataoka et al., 2003). Common SNP rs16844364 is an intronic RGS12 polymorphism that is also within 5 kb of the transcription start site of HGFAC and 8.5 kb of associated HGFAC intronic variant rs4690098. Despite their relative proximity, the two SNPs are independent of one another (r² < 0.2). HaploReg annotation for rs16844364 indicated overlap with enhancer-associated histone modifications across multiple cell types, suggesting a potential regulatory function. Query of the FANTOM5 Transcribed Enhancer Atlas (Andersson et al., 2014) identified an enhancer site 226 bp from rs16844364 (chr4:3438789-3438989) preferentially expressed in blood vessel cells. However, we are unaware of any published eQTL findings that associate rs16844364 with regulation of HGFAC expression, and suppositions about the functional relationship between rs16844364 and HGFAC are conjectural. Variant rs16844401 within exon 12 of HGFAC was recently found to be associated with circulating fibrinogen levels in a large meta-analysis of European Americans (Sabater-Lleal et al., 2013). Although this SNP did not demonstrate any evidence of association with HGF in our single-SNP analyses, it was included in our gene-based testing for HGFAC.
We also identified two significant gene-based results in our stratified analyses: A1BG (α-1-B glycoprotein) in the AFA subcohort and FASN (fatty acid synthase) in the CHN subcohort. A1BG is a protein-coding gene for a plasma glycoprotein of unknown function, and is primarily expressed in the liver (Lonsdale et al., 2013). A pharmacogenomic association with the nonsynonymous A1BG SNP rs893184 was previously reported for treatment-related outcomes in high-risk CVD white and Hispanic subjects with hypertension (McDonough et al., 2013). However, this particular SNP was not associated with HGF levels in our study, and our gene-based association may be due to a different biological mechanism. FASN is a homodimeric multi-enzyme that solely catalyzes fatty acid synthesis in mammals. Upregulation of FASN has been observed in multiple cancers, and inhibition of FASN has been linked to concomitant dysregulation of c-Met protein activation by HGF (Coleman et al., 2009, Zaytseva et al., 2012). It is hypothesized that this may be due to disruption of lipid rafts, which are key factors in compartmentalization of growth factor receptors like c-Met, and aberrant FASN activity could play a role in circulating levels of HGF through this biological mechanism.
There are some notable limitations to our analyses. While our merged genotype dataset is highly enriched for plausibly functional coding variation, it does not provide complete coverage of all functional variants, nor were the merged chips designed to tag genome-wide variation. Moreover, much of the array design of the genotype panels used in our analyses is based upon European ancestry. This may limit the ability to detect functional variants in the other ethnicities evaluated in our study. There were also multiple suggestive common variant associations (P < 1E-05, data not shown) in our stratified analyses, indicating that larger ethnicity-specific sample sizes may be needed to identify polymorphisms of modest effect. Subcohort sample sizes for gene-based rare-variant association testing were smaller than generally recommended (Wu et al., 2011), and true associations may have gone undetected. Finally, although our meta-analyses provide a type of internal validation of our single-variant findings through consistent effect estimates across ethnicities with diverse phenotype distributions and LD patterns, additional external validation and functional evaluation are necessary to confirm our findings.
In summary, our study identified multiple biologically plausible single-SNP and gene-based associations with serum HGF. In addition to a variant within the protein-coding gene HGF, common and rare variants related to HGFAC, a gene encoding an activating protein for HGF, were also found to be associated with protein levels. Our gene-based results indicate rare variation in A1BG and FASN may also play an important role in regulating circulating HGF in African and Chinese Americans, respectively. Although we did not identify any associations between these variants and CVD, our findings may be relevant to other traits where HGF levels are of consequence, including cancer. These results also suggest genetic factors regulating HGF may be diverse and differ across ethnicity, and that future work evaluating the functional consequences of rare variation within HGFAC, A1BG, and FASN is warranted.
The authors thank the other investigators, the staff, and the participants of the MESA study for their valuable contributions. A full list of participating MESA investigators and institutions can be found at http://www.mesa-nhlbi.org. MESA is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with MESA investigators. Support is provided by grants and contracts N01-HC-95159, N01-HC-95160, N01-HC-95161, N01-HC-95162, N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169 and RR-024156. Funding for adhesion protein levels was provided by NHLBI by grant R01HL98077. The provision of genotyping data was supported in part by the National Center for Advancing Translational Sciences, CTSI grant UL1TR000124, and the National Institute of Diabetes and Digestive and Kidney Diseases Diabetes Research Center (DRC) grant DK063491 to the Southern California Diabetes Endocrinology Research Center. NIH/NIEHS P50 ES015915 “The Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air) is supported by the U.S. Environmental Protection Agency (EPA) under Science to Achieve Results (STAR) Program Grant # RD831697 and NIH/NIEHS P50 ES015915 award. Although the research described in this presentation has been funded wholly or in part by the United States Environmental Protection Agency through RD831697 to the University of Washington, it has not been subjected to the Agency’s required peer and policy review and therefore does not necessarily reflect the views of the Agency and no official endorsement should be inferred.”
Conflict of Interest
The authors declare no conflict of interest.