|Home | About | Journals | Submit | Contact Us | Français|
Femoral neck (FN) bone geometry is an important predictor of bone strength with high heritability. Previous studies have revealed certain candidate genes for FN bone geometry. However, the majority of the underlying genetic factors remain to be discovered. In this study, pathway-based genome-wide association analysis was performed to explore the joint effects of genes within biological pathways on FN bone geometry variations in a cohort of 1,000 unrelated U.S. whites. Nominal significant associations (nominal p value < 0.05) were observed between 76 pathways and a key FN bone geometry variable - section modulus (Z), biomechanically indicative of bone strength subject to bending. Among them, EphrinA-EphR pathway was most significantly associated with FN Z even after multiple testing adjustments (pFWER value = 0.035). The association of EphrinA-EphR pathway with FN Z was also observed in an independent sample from Framingham Osteoporosis Study. Overall, these results suggest the significant genetic contribution of EphrinA-EphR pathway to femoral neck bone geometry.
Hip fracture occurs when the load applied to hip bone exceeds its mechanical strength. The mortality and morbidity associated with hip fracture become an ever-increasing threat to the public health . Femoral neck (FN) bone geometry, an important predictor of bone strength, is under strong genetic control with heritability ranging from 30–70% [2;3]. To date, some studies have been carried out to identify the genetic basis of FN bone geometry, with certain progresses being made [4–6]. However, the majority of FN bone geometry candidate genes are still largely unknown, leaving a considerable gap in knowledge.
Recently, a pathway-based approach was proposed to further interpret genome-wide association (GWA) data at the level of gene sets . Generally speaking, it is a pathway-driven gene set enrichment analysis, which focuses on groups of genes [i.e. genes sharing a biochemical or cellular function, chromosomal location or regulation, or Gene Ontology (GO) category] that make modest contributions to disease risk, rather than a few single nucleotide polymorphisms (SNPs) and/or genes with strong evidence of association. The method starts with gene ranking based on its degree of association with the studied phenotype to generate a gene list, then uses a statistic enrichment score (ES) to determine whether members of a given gene set tend to be concentrated at the top (or bottom) of the list, i.e. the gene set is related to the phenotype. The significance of ES is assessed by comparing it with the set of score obtained by 1,000 re-sampling runs, adjusting for variation in gene size and controlling for family-wise error rate (FWER). Through application to two Parkinson diseases (PD) and one amyotrophic lateral sclerosis (AMD) GWA data sets, this approach was proven to be powerful in the exploration of disease-susceptibility mechanism .
To identify the potential pathway underlying FN bone geometry variation, we employed this approach to analyze the GWA data from a cohort of 1,000 unrelated U.S. whites. We detected an interesting pathway for FN bone geometry – EphrinA-EphR signaling pathway, whose significant relevance to FN bone geometry was verified in an independent sample from Framingham Osteoporosis Study and some of whose gene components have been known to be important for bone. Our findings have important implications regarding the genetic regulation of bone strength and the pathogenesis of hip fracture.
The study was approved by the Institutional Review Board of University of Missouri – Kansas City. Signed informed-consent documents were obtained from all study participants before they entered the study. The basic characteristics of the study subjects are summarized in Table 1.
Based on our established and expanding genetic repertoire with more than 6,000 subjects, a total of 1,000 unrelated healthy U.S. whites were selected for this study. They were normal healthy subjects defined by a comprehensive suite of exclusion criteria . Briefly, efforts were made to exclude subjects with chronic diseases and conditions that might potentially affect bone metabolism so as to increase the statistical power for detecting bone candidate genes.
Areal bone mineral density (BMD, g/cm2) and bone size (cm2) of FN were measured by using dual energy X-ray absorptiometry (DXA) machines (Hologic Inc., Bedford, MA, USA). The machines were calibrated daily. The coefficient of variation (CV) values of the DXA measurements for FN BMD and bone size were 1.87% and 1.97%, respectively. On the basis of FN BMD and FN bone size, hip structural or hip strength analysis (HSA) was applied to estimate four parameters, which are generally used to describe bone strength in the proximal femur. They are section modulus (Z, in cm3), an index of bone bending strength; cross-sectional area (CSA, in cm2), an indicator of bone axial compression strength; cortical thickness (CT, in cm), an estimate of mean cortical thickness; and buckling ratio (BR), an index of bone structural instability. Briefly, the method assumes that: 1) the cross-section of bone within the femoral neck region is a right circular annulus; 2) 60% of the measured bone mass is cortical; 3) the effective BMD in fully mineralized bone tissue is 1.05 g/cm3. All of these assumptions were substantiated as good approximations . The formulas used to calculate the five parameters are presented below.
where W is the FN periosteal diameter and can be approximated by dividing FN bone size by the width of the region of interest (in Hologic DXA systems, the width of the FN region is standardized at 1.5cm );
, where the ED (Endocortical diameter) is calculated by the formula
A set of 1,151 unrelated Framingham subjects were chosen from Framingham Osteoporosis study . The participants underwent bone densitometry by DXA with a Lunar DPX-L (Lunar Corp., Madison, WI, USA). Femoral geometry parameters was assessed noninvasively by DXA-based HSA [11;13]. The region assessed was the narrowest width of the femoral neck (NN), which overlaps or is proximal to the standard femoral neck region. FN CSA, and FN Z were measured directly from the mass profiles using a principle first described by Martin and Burr . The CV values for the different variables were previously reported to range from 7.9% (CSA) to 10.4% (BR). Detail descriptions could be found in the corresponding previous reports [12;15].
Genomic DNA was extracted from whole human blood using a commercial isolation kit (Gentra systems, Minneapolis, MN, USA) following the protocols detailed in the kit. DNA concentration was assessed by a DU530 UV/VIS spectrophotometer (Beckman Coulter, Inc, Fullerton, CA, USA). Before advancing to the hybridization step, the measurements of DNA concentration were double-checked by pico-green analysis that can detect fluorescent signal enhancement of PicoGreen® dsDNA Quantitation Reagent, which selectively binds to dsDNA [16;17]. Genotyping with Affymetrix mapping 500K array (Affymetrix, Inc., Santa Clara, CA) was performed at the Vanderbilt Microarray Shared Resource at Vanderbilt University Medical Center, Nashville, TN using the standard protocol recommended by the manufacturer (see detailed description in ). Genotyping calls were determined with the dynamic model (DM) algorithm  as well as with the B-RLMM (Bayesian Robust Linear Model with Mahalanobis Distance Classifier) algorithm, an extention of the RLMM (Robust Linear Model with Mahalanobis Distance Classifier)  developed for the Mapping 500K product. DM calls were used for quality control of the genotyping experiment, and unsatisfactory arrays were subjected to re-genotyping. Eventually, 997 subjects (501 males and 496 females) who had at least one array (Nsp or Sty) reaching 93% call rate were retained. BRLMM calls were used for the following pathway-based GWA analyses, where only autosomal SNPs with per-SNP call rates (i.e. call rate for each SNP across all samples) > 95%, Hardy-Weinberg equilibrium (HWE) p values >= 0.001, minor allele frequencies (MAF) >= 5% and the physical distances from genes < 500 kb were evaluated. The distance setting was from the consideration that most enhancers and repressors are < 500 kb away from genes and most LD blocks are < 500 kb.
Genotype (Affymetrix 500K mapping array plus Affymetrix 50K supplemental array) and phenotype information were downloaded from the dbGaP database (http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap). Data download and usage was authorized by the SHARe data-access committee. Only individuals (469 males and 679 females) with genotypic call rates > 93% and SNPs on a given pathway/gene set with per-SNP call rates > 95%, HWE p value >= 0.001 and MAF >= 5% were included in the following association study.
We retrieved 260,190 and 380 annotated human pathways from BioCarta pathway database (http://www.biocarta.com/genes), KEGG pathway database (http://www.genome.ad.jp/kegg/pathway.html) and Ambion GeneAssist™ Pathway Atlas (http://www.ambion.com/tools/pathway), respectively. In addition, we downloaded GO annotation files for human genes from the GO website (http://www.geneontology.org) to integrate the GO information into our study. Gene sets were obtained from Level 4 Biological Process and Molecular Function GO terms. Genes whose GO annotations were in level 5 or lower in the GO hierarchy were assigned to their ancestral GO annotations in level 4. In our analysis, we tested only those pathways/gene sets with 10–200 genes included in our GWA data set so as to alleviate the multiple-testing problem by avoiding testing too narrowly- or too broadly- defined functional categories. Overall, 963 pathways/gene sets were constructed by 14,585 genes/312,172 SNPs.
For raw FN bone geometry in both the U.S. whites and the Framingham sample, we tested their correlations with parameters like age, age2, sex, age/age2-by-sex interactions, height and weight. After adjustments for significant (p value < 0.05) terms (age, age2, sex, height and weight) in each sample, the phenotype data, if not following normal distribution, were further subjected to Box-Cox transformation. General association analyses on individual SNPs were conducted by the Wald test implemented in PLINK (version 1.03) to obtain a t-statistic for each tested SNP.
The main procedures of pathway-based GWA analysis  were briefly summarized as follows:
where . From Formula (1), it is clear that ESS is calculated by walking down the list L, increasing a running-sum statistic when we encounter a gene in S and decreasing it when we encounter genes not in S. The magnitude of the increment depends on the correlation of the gene with the phenotype. In short, ESS is the maximum deviation from zero encountered in the random walk. It will be high if the association signal is concentrated at the top of list L, reflected by the significance level of observed ESS (i.e. nominal p value).
Then, in the context of multiple testing, FWER p value (denoted as pFWER) was calculated through a highly conservative procedure, which sought to ensure that the reported results did not include even a single false positive gene set [7;20]. The pFWER can be calculated as the fraction of all gene sets whose greatest in all permutations was higher than the NESS in the given pathway/gene set S.
For a certain interesting pathway, we performed multiple step-wise regressions using SAS 8.2 (SAS Institute Inc., Cary, NC), to verify the associations of pathway genes with FN Z in Framingham samples, while adjusting for significant covariates (p < 0.05). At first, raw bone geometry value was adjusted through the same procedure, described above, for the U.S. whites. Then, association analyses for all the SNPs in certain interesting pathway were performed using the PLINK software package (version 1.02). In the multiple step-wise regression analysis, each pathway gene was represented by its top significant SNP as independent variable. P value of 0.05 was set as the criteria for the entry and removal of predictive variables. F test was used to assess the significance of the overall model. R-square value as an indicator of how well the model fits the data (e.g., an R-square close to 1.0 indicates that we have accounted for almost all of the variability with the variables specified in the model) was used to evaluate the additive contributions of pathway genes to the trait.
For our U.S. White sample, pathway-based GWA analyses were also conducted for FN BMD in the total population and for FN Z in the sex-specific groups through the same procedure described in the above sub-sections of “Routine statistical analysis” and “Pathway-based GWA analysis in the U.S. whites”. In addition, to detect potential population stratification, which may lead to spurious association results, we used the software Structure 2.2 (http://pritch.bsd.uchicago.edu/software.html) to investigate the potential substructures of the U.S. whites sample and the Framingham sample. The program uses a Markov chain Monte Carlo (MCMC) algorithm to cluster individuals into different cryptic sub-populations on the basis of multi-locus genotype data . To ensure the robustness of our results, we performed independent analyses under three assumed numbers for population strata (k = 2, 3, and 4), using 200 un-linked markers that were randomly selected across the entire genome. To confirm the results achieved through Structure 2.2, we further calculated inflation factors based on median chi-square statistic using a method of genomic control .
In the pathway-based association analysis for FN Z in the total U.S. whites, 76 out of the 963 tested pathways showed significant association at the nominal significance level (p value < 0.05). Among them, two pathways were significant with pFWER less than 0.05. One was 1- and 2-Methylnaphthalene degradation pathway retrieved from KEGG database (pFWER value = 0.021) and the other was EphrinA-Eph receptors (EphR) signaling pathway queried from Ambion database (pFWER value = 0.035) (Figure 1). For the other three tested bone geometric measures (CSA, CT and BR), no significant result was found after multiple testing adjustments, consequently, no further analyses were done for them.
In the sex-specific pathway-based GWA analysis for FN Z, we found the EphrinA-EphR pathway was significant in the male subgroup with a nominal p value of 0.006, while the 1- and 2-Methylnaphthalene degradation pathway was associated with FN Z in the female subgroup at a level of marginal significance (nominal p value = 0.056). However, these two correlations did not remain significant after adjusting for multiple tests. Furthermore, pathway-based GWA analysis showed that these two pathways were not associated with FN BMD (data not shown). In Table 2, basic information about the top 18 pathways associated with –lg(p)value > 2 in the pathway-based GWA analysis for FN Z in the total U.S. whites are summarized, including source database, database entry, pathway name, and association results in the total population as well as in the sex-specific groups.
1- and 2-Methylnaphthalene degradation pathway and EphrinA-EphR pathway are composed of 25 genes and 40 genes, respectively. Most of their pathway genes are covered by our genotype platform. Tables 3 and and44 individually list the information of SNPs representing the 21 1- and 2-Methylnaphthalene degradation pathway genes, and the 35 EphrinA-EphR pathway genes. A total of 15 genes in the 1- and 2-Methylnaphthalene degradation pathway and 21 genes in the EphrinA-EphR pathway were associated with FN Z at the nominal significance level (p value < 0.05). The most significant SNP in PAK gene, rs2105803, achieved a p value of 0.001 for FN Z. Figure 2 depicts running-sum plots for the two pathways (“Gene list rank” versus “ESs”) and a plot for –lg(p)values of all the 14,585 pathway genes (“Gene list rank” versus “-lg nominal p value”) to describe the results of pathway-based GWA analysis for FN Z in the total U.S. whites sample. We observed that most of these two pathway genes were ranked at the top significance level among the 14,585 gene lists. Accordingly, the 1- and 2-Methylnaphthalene degradation pathway and EphrinA-EphR pathway achieved high ESs of 0.542 and 0.482, respectively.
To validate our findings, we conducted multiple step-wise regression analyses for 1- and 2-Methylnaphthalene degradation pathway and EphrinA-EphR pathway in Framingham samples. After adjustment for age, age2, sex, height and weight, 2 main-effect genes (ADH6 and ADH1A) were chosen out of 21 1- and 2-Methylnaphthalene degradation pathway genes (p value < 0.05). They composed the final model that had an overall p value of 0.0021 and could explain 1.24% of FN Z variation. While 8 main-effect genes (EphA1-2, EphA7-8, EFNA3, PAK1, PIK3CG and LIMK2) were chosen out of the 35 EphrinA-EphR pathway genes as predictive variables (p value < 0.05). They composed the final model that had an overall p value < 0.0001 and could explain 6.06% of FN Z variation.
With 200 randomly selected unlinked SNPs, all the subjects in the U.S. whites sample were consistently clustered together under all the three assumed number of population strata, so were the subjects in the Framingham sample. Moreover, the inflation factors were estimated to be 1.030 (BR), 1.009 (Z), 1.035 (CSA) and 1.071 (CT) for U.S. whites sample, and 1.022 (BR), 1.013 (Z), 1.018 (CSA) and 1.047 (CT) for the Framingham sample, suggesting no significant population substructure in the two samples.
In this study, we used the genomic-wide based pathway approach to explore the joint effects of different gene variants in common pathways on FN bone geometry variations. Two pathways were identified as the most statistical promising pathways for FN Z after multiple testing adjustments, 1- and 2-Methylnaphthalene degradation pathway and EphrinA-EphR signaling cascade.
As we known, 1- and 2-Methylnaphthalene are components of polycyclic aromatic hydrocarbons (PAHs) present in cigarette smoke. A recent in vivo study showed PAHs may function as ligand for aryl-hydrocarbon receptor (AhR) a transcription factor that regulates gene expression, to cause the loss of bone mass and strength . However, there has been little direct research into the effects of methylnaphthalene and its metabolites on bone and no biological correlation has been made thereof. In the present study, validation analysis in the Framingham sample demonstrated a significant but modest effect of 1- and 2-Methylnaphthalene degradation pathway genes (R-square=1.24%).
EphR and their cell-surface-bound ephrin ligands, including A- and B-subclass, constitute the largest subfamily of receptor protein-tyrosine kinases. EphrinA can activate EphA receptors and their downstream molecules (as reviewed in ). EphrinA-EphR pathway is known to function in reorganization of the actin cytoskeleton through Rho family GTPases. Rho family GTPases, including RhoA, Rac1 and Cdc42, have been implicated in the contraction of actin cytoskeletal systems, promoting the formation and elongation of lamellipodia and filopodia, respectively . Since actin cytoskeleton including lamellipodia and filopodia are involved in the biological processes of osteoblastic adhesion and migration that are closely related to the bone geometry qualities, it is reasonable to hypothesize that EphrinA-EphR pathway may contribute to the genetic architecture underlying the variation of bone geometry.
In addition, the potential role of EphrinA-EphR pathway was revealed in the skeletal patterning (as reviewed in ). Several molecules such as ephrin A5 and EphA2-5 were expressed on the surface of osteoblasts or pre-osteoblasts [27–29]. Besides, the functions of EFNB2 in osteoclasts and EphB4 in osteoblasts were found to associate with the switch from bone resorption to bone formation . In previous studies, crosstalks were detected between EFNB2 with both EphA3 and EphA4 [31–33], and between EphB2 and ephrinA5 . These cross-talks may bridge A- and B-subclass Eph receptors/ephrins signaling, which suggested more complex regulation of EphrinA-EphR pathway in normal and pathological bone remodeling.
In this study, the pathway-based association analysis in U.S. whites demonstrated that 21 genes in this pathway were associated with FN Z at the nominal significance level (p value < 0.05). These genes included an ephrin (EPNA5) gene and 5 EphR genes (EphA1, EphA3-5 and EphA7), 2 Rho family member genes (Rac1 and cdc42) and their downstream p21-activated kinase genes (PAK1 and PAK7), 4 RhoA downstream genes (ROCK2, LIMK2, CFL1 and ACTG2), 3 adaptor molecular of EphR genes (PIK3CG, FAK1 and PTPN11), and others. Some of the identified pathway genes have been known to be important for bone, especially for osteoblasts. For example, RhoA-ROCK cascade was found to mediate the differentiation of human mesenchymal stem cells (hMSCs) to osteoblast and then promote osteogenesis, specifically via its effects on cytoskeletal tension . GTPases Cdc42 and Rac were also suggested to communicate with receptor of advanced glycation end products to regulate the osteoblast motility . PIK3CG, as an important modulator of extracellular signals, was found to be important for osteoblastic differentiation, recruitment, migration and survival [37–40]. In addition, activated EphA receptors can transmit signals to focal adhesion kinase (FAK), which is protein tyrosine kinase acting as a regulator of the integrin signaling cascade. This kinase was found to be important for mechanotransduction in osteoblasts . Subsequent validation study confirmed our findings were by demonstrating the significant genetic effects of the same EphrinA-EphR pathway genes (EphA7, PAK1, PIK3CG and LIMK2) and the whole pathway on FN Z in the Framingham sample (Table 5).
On the whole, the evidences from multiple aspects (pathway-base GWA analyses, validation analyses and functional relevance to bone) strongly supported the importance of EphrinA-EphR pathway for FN Z. Nevertheless, this pathway was associated with FN Z but not with either the other three FN bone geometry parameters or FN BMD in the total U.S. whites. Besides, in the sex-stratified analysis, this pathway was shown to be associated with FN Z only in the male subgroup at a nominal significance level. These results may reflect site-specific effect due to genetic heterogeneity at different skeletal sites and sex-specific effect only present in male subjects. On the other hand, they might also be treated with caution because of the inflated false-positive and/or false-negative rates caused by increased multiple comparisons and insufficient power in individual subgroups. In this study, efforts like permutation-based adjustments have been taken to alleviate it.
Notably, no one EphrinA-EphR pathway gene attained GWA significance (4.2×10−7) in our conventional GWA studies on FN bone geometry . That is, pathway-based GWA approach may incorporate information from markers with moderate significance levels so as to detect novel genetic risk factors, which can serve as candidates for further replication studies. This approach has the merit of high-throughput in comparison with traditional candidate pathway association strategy, which may only figure out a very small fraction of the whole genetic architecture of complex diseases/traits. Besides, this approach provides an opportunity to systematically study a large number of pathways without assumptions about the causal pathways, and thus may have the higher power to identify new candidate genetic factors. However, a potential limitation of this study is that some of pathway genes were not covered by our genotype platform. For example, the genome coverage for EphrinA-EphR pathway was only 87.5%.
In summary, our results suggested that the polymorphisms in EphrinA-EphR pathway genes may affect FN bone geometry variations. Future molecular studies as well as validation studies with customized pathway array are necessary to further investigate this pathway in order to clarify its role in bone biology.
Investigators of this work were partially supported by grants from NIH (R01 AR050496, R21 AG027110, R01 AG026564, P50 AR055081 and R21 AA015973). The study also benefited from grants from National Science Foundation of China, Xi’an Jiaotong University, and the Ministry of Education of China. We also thank all contributors to the Framingham Heart Study. The Framingham Heart Study and the Framingham SHARe project are conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with Boston University. The Framingham SHARe data used for the analyses described in this manuscript were obtained through dbGaP. This manuscript was not prepared in collaboration with investigators of the Framingham Heart Study and does not necessarily reflect the opinions or views of the Framingham Heart Study, Boston University, or the NHLBI.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.