The New England Centenarian Study (NECS) was founded in 1994 as a longitudinal study of centenarians to determine whether centenarians could be a model of healthy human aging. Over time, the NECS, along with other centenarian studies, has demonstrated that the majority of centenarians markedly delay high mortality risk-associated diseases toward the ends of their lives, but many centenarians have a history of enduring more chronic age-related diseases for many years, women more so than men. However, the majority of centenarians seem to deal with these chronic diseases more effectively, not experiencing disability until well into their nineties. Unlike most centenarians who are less than 101 years old, people who live to the most extreme ages, e.g., 107+ years, are generally living proof of the compression of morbidity hypothesis. That is, they compress morbidity and disability to the very ends of their lives. Various studies have also demonstrated a strong familial component to extreme longevity, and evidence, particularly from the NECS, is now revealing an increasingly important genetic component to survival to older and older ages beyond 100 years. It appears to us that this genetic component consists of many genetic modifiers, each with modest effects, that as a group can have a strong influence.
centenarians; genetics of longevity; heritability of longevity; compression of morbidity; genetic variation
Genome-wide association studies (GWAS) have identified numerous associations between genetic loci and individual phenotypes; however, relatively few GWAS have attempted to detect pleiotropic associations, in which loci are simultaneously associated with multiple distinct phenotypes. We show that pleiotropic associations can be directly modeled via the construction of simple Bayesian networks, and that these models can be applied to produce single Bayesian classifiers, or ensembles of them, that leverage pleiotropy to improve genetic risk prediction. The proposed method includes two phases: (1) Bayesian model comparison, to identify single-nucleotide polymorphisms (SNPs) associated with one or more traits; and (2) cross-validation feature selection, in which a final set of SNPs is selected to optimize prediction. To demonstrate the capabilities and limitations of the method, a total of 1600 case-control GWAS datasets with two dichotomous phenotypes were simulated under 16 scenarios, varying the association strengths of causal SNPs, the size of the discovery sets, the balance between cases and controls, and the number of pleiotropic causal SNPs. Across the 16 scenarios, prediction accuracy ranged from 50% to 90%. In the 14 scenarios that included pleiotropically associated SNPs, the pleiotropic model search and prediction methods consistently outperformed the naive model search and prediction. In the two scenarios in which there were no true pleiotropic SNPs, the differences between the pleiotropic and naive model searches were minimal. To further evaluate the method on real data, a discovery set of 1071 sickle cell disease (SCD) patients was used to search for pleiotropic associations between cerebral vascular accidents and fetal hemoglobin level. Classification was performed on a smaller validation set of 352 SCD patients, and showed that the inclusion of pleiotropic SNPs may slightly improve prediction, although the difference was not statistically significant.
The proposed method is robust, computationally efficient, and provides a powerful new approach for detecting and modeling pleiotropic disease loci.
pleiotropy; SNP; GWAS; prediction; Bayesian
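The first phase described above, Bayesian model comparison for a single SNP and two dichotomous traits, can be illustrated with a minimal sketch. This is not the published method: it assumes a symmetric Dirichlet prior with `alpha = 1`, equal prior probability for each candidate model, and hypothetical function names (`log_marginal`, `model_score`, `best_model`). Each candidate model lets the genotype distribution depend on no trait, on trait A, on trait B, or on both (the pleiotropic case), and the model with the highest log marginal likelihood is retained.

```python
from math import lgamma


def log_marginal(counts, alpha=1.0):
    """Log marginal likelihood of multinomial counts under a
    symmetric Dirichlet(alpha) prior (assumed prior, for illustration)."""
    n = sum(counts)
    k = len(counts)
    out = lgamma(k * alpha) - lgamma(k * alpha + n)
    for c in counts:
        out += lgamma(alpha + c) - lgamma(alpha)
    return out


def model_score(genotypes, parent_keys):
    """Score a model in which the genotype (0/1/2) depends on the
    trait configuration given by parent_keys (one tuple per subject)."""
    groups = {}
    for g, key in zip(genotypes, parent_keys):
        groups.setdefault(key, [0, 0, 0])[g] += 1
    # One independent genotype distribution per trait configuration.
    return sum(log_marginal(c) for c in groups.values())


def best_model(genotypes, trait_a, trait_b):
    """Compare the four candidate dependency structures for one SNP."""
    candidates = {
        "none": [() for _ in genotypes],
        "A": [(a,) for a in trait_a],
        "B": [(b,) for b in trait_b],
        "A+B": list(zip(trait_a, trait_b)),  # pleiotropic model
    }
    scores = {name: model_score(genotypes, keys)
              for name, keys in candidates.items()}
    return max(scores, key=scores.get), scores
```

Because the marginal likelihood integrates over the genotype frequencies, the richer "A+B" model is preferred only when the data support the extra parameters; with identical genotype distributions across trait groups, the "none" model scores highest.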
The inheritance of genetic disease depends on ancestry that must be considered when interpreting genetic association studies and can provide insights when comparing traits in a population. We compared the genetic profiles of African Americans with sickle cell disease to those of Black Africans and Caucasian populations of European descent and found that they are less genetically admixed than other African Americans and have an ancestry similar to Yorubans, Mandenkas and Bantu.
sickle cell disease; genetic ancestry; admixture; genetic association
Despite the success of highly active antiretroviral therapy (HAART), HIV infected individuals remain at increased risk for frailty and declines in physical function that are more often observed in older uninfected individuals. This may reflect premature or accelerated muscle aging.
Skeletal muscle gene expression profiles were evaluated in three uninfected independent microarray datasets including young (19 to 29 years old), middle aged (40 to 45 years old) and older (65 to 85 years old) subjects, and a muscle dataset from HIV infected subjects (36 to 51 years old). Using Bayesian analysis, a ten gene muscle aging signature was identified that distinguished young from old uninfected muscle and included the senescence and cell cycle arrest gene p21/Cip1 (CDKN1A). This ten gene signature was then evaluated in muscle specimens from a cohort of middle aged (30 to 55 years old) HIV infected individuals. Expression of p21/Cip1 and related pathways were validated and further analyzed in a rodent model for HIV infection.
We identify and replicate the expression of a set of muscle aging genes that were prematurely expressed in HIV infected, but not uninfected, middle aged subjects. We validated select genes in a rodent model of chronic HIV infection. Because the signature included p21/Cip1, a cell cycle arrest gene previously associated with muscle aging and fibrosis, we explored pathways related to senescence and fibrosis. In addition to p21/Cip1, we observed HIV associated upregulation of the senescence factor p16INK4a (CDKN2A) and fibrosis associated TGFβ1, CTGF, COL1A1 and COL1A2. Fibrosis in muscle tissue was quantified based on collagen deposition and confirmed to be elevated in association with infection status. Fiber type composition was also measured and displayed a significant increase in slow twitch fibers associated with infection.
The expression of genes associated with a muscle aging signature is prematurely upregulated in HIV infection, with a prominent role for fibrotic pathways. Based on these data, therapeutic interventions that promote muscle function and attenuate pro-fibrotic gene expression should be considered in future studies.
Skeletal muscle; Aging; Gene expression; HIV infection; Senescence
Serum bilirubin levels have been associated with polymorphisms in the UGT1A1 promoter in normal populations and in patients with hemolytic anemias, including sickle cell anemia. When hemolysis occurs, circulating heme increases, leading to elevated bilirubin levels and an increased incidence of cholelithiasis. We performed the first genome-wide association study (GWAS) of bilirubin levels and cholelithiasis risk in a discovery cohort of 1,117 sickle cell anemia patients. We found 15 single nucleotide polymorphisms (SNPs) associated with total bilirubin levels at the genome-wide significance level (p value < 5×10^−8). SNPs in UGT1A1, UGT1A3, UGT1A6, UGT1A8 and UGT1A10, different isoforms within the UGT1A locus, were identified (most significant rs887829, p = 9.08×10^−25). All of these associations were validated in 4 independent sets of sickle cell anemia patients. We tested the association of the 15 SNPs with cholelithiasis in the discovery cohort and found a significant association (most significant p value 1.15×10^−4). These results confirm that the UGT1A region is the major regulator of bilirubin metabolism in African Americans with sickle cell anemia, similar to what is observed in other ethnicities.
One of the most popular modeling approaches to genetic risk prediction is to use a summary of risk alleles in the form of an unweighted or a weighted genetic risk score, with weights that relate to the odds for the phenotype in carriers of the individual alleles. Recent contributions have proposed the use of Bayesian classification rules using Naïve Bayes classifiers. We examine the relation between the two approaches for genetic risk prediction and show that the methods are mathematically related. In addition, we study the properties of the two approaches and describe how they can be generalized to include various models of inheritance.
genetic risk prediction; genetic score; Naïve Bayes classifier; classification score; classification rule
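The relation between the two approaches can be made concrete with a small sketch. It assumes, for illustration only, that each SNP's case genotype frequencies are an exponential tilt of the control frequencies, so that the per-SNP log-likelihood ratio is linear in the allele count; the function names and the numbers are hypothetical. Under that assumption the Naïve Bayes log posterior odds equals a weighted genetic risk score plus a constant that does not depend on the individual.

```python
from math import exp, log


def tilt(control_freqs, weight):
    """Case genotype frequencies proportional to q[g] * exp(weight * g),
    i.e. a log-additive (per-allele) effect. Illustrative assumption."""
    unnorm = [q * exp(weight * g) for g, q in enumerate(control_freqs)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def naive_bayes_log_odds(genotypes, p_case, p_control, prior_case=0.5):
    """Log posterior odds of being a case, assuming SNPs are
    conditionally independent given the phenotype."""
    score = log(prior_case / (1 - prior_case))
    for i, g in enumerate(genotypes):
        score += log(p_case[i][g] / p_control[i][g])
    return score


def weighted_grs(genotypes, weights):
    """Weighted genetic risk score: sum of weight * risk-allele count."""
    return sum(w * g for w, g in zip(weights, genotypes))


# Hypothetical two-SNP example.
weights = [0.3, -0.2]
p_control = [[0.5, 0.4, 0.1], [0.6, 0.3, 0.1]]
p_case = [tilt(q, w) for q, w in zip(p_control, weights)]

person_a, person_b = [2, 0], [0, 1]
nb_diff = (naive_bayes_log_odds(person_a, p_case, p_control)
           - naive_bayes_log_odds(person_b, p_case, p_control))
grs_diff = weighted_grs(person_a, weights) - weighted_grs(person_b, weights)
```

Here `nb_diff` and `grs_diff` agree up to floating-point error, so the two scores rank individuals identically. When the genotype log-likelihood ratios are not linear in the allele count (for example under a dominant or recessive model), the Naïve Bayes score is the more general of the two.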
Sickle cell anemia (SCA, HBB glu6val) is characterized by multiple complications and a high degree of phenotypic variability: some subjects have only sporadic pain crises and few acute hospitalizations, while others experience multiple serious complications, high levels of morbidity, and accelerated mortality.[1] The tumor necrosis factor-α (TNF-α) signaling pathway plays important roles in inflammation and the immune response; variation in this pathway might be expected to modify the overall severity of SCA through the pathway’s effects on the vascular endothelium.[2, 3] We examined plasma biomarkers of TNF-α activity and endothelial cell activation for associations with SCA severity in 24 adults (12 mild, 12 severe). Two biomarkers, tumor necrosis factor-α receptor-1 (TNF-R1) and vascular cell adhesion molecule-1 (VCAM-1), were significantly higher in subjects with severe SCA. Along with these biomarker differences, we also examined data from a genome-wide association study (GWAS) using SCA severity as a disease phenotype, and found evidence of genetic association between disease severity and a single nucleotide polymorphism (SNP) in VCAM1, which codes for VCAM-1, and several SNPs in ARFGEF2, a gene involved in TNF-R1 release.[4]
Sickle Cell Anemia; TNF-α; Disease Severity
Like most complex phenotypes, exceptional longevity is thought to reflect a combined influence of environmental (e.g., lifestyle choices, where we live) and genetic factors. To explore the genetic contribution, we undertook a genome-wide association study of exceptional longevity in 801 centenarians (median age at death 104 years) and 914 genetically matched healthy controls. Using these data, we built a genetic model that includes 281 single nucleotide polymorphisms (SNPs) and discriminated between cases and controls of the discovery set with 89% sensitivity and specificity, and with 58% specificity and 60% sensitivity in an independent cohort of 341 controls and 253 genetically matched nonagenarians and centenarians (median age 100 years). Consistent with the hypothesis that the genetic contribution is largest at the oldest ages, the sensitivity of the model in the independent cohort increased with age (71% for subjects with an age at death >102 and 85% for subjects with an age at death >105). For further validation, we applied the model to an additional, unmatched 60 centenarians (median age 107 years), resulting in 78% sensitivity, and to 2863 unmatched controls, with 61% specificity. The 281 SNPs include the SNP rs2075650 in TOMM40/APOE that reached irrefutable genome-wide significance (posterior probability of association = 1) and replicated in the independent cohort. Removal of this SNP from the model reduced the accuracy by only 1%. Further in-silico analysis suggests that 90% of centenarians can be grouped into clusters characterized by different “genetic signatures” of varying predictive values for exceptional longevity. The correlation between 3 signatures and 3 different life spans was replicated in the combined replication sets. The different signatures may help dissect this complex phenotype into sub-phenotypes of exceptional longevity.
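For readers less familiar with the two metrics reported above, the following minimal sketch shows how sensitivity (the fraction of true cases classified as cases) and specificity (the fraction of true controls classified as controls) are computed from labels and predictions. The function name and the toy data are hypothetical and unrelated to the study data.

```python
def sensitivity_specificity(is_case, predicted_case):
    """Return (sensitivity, specificity) from true and predicted labels.

    Assumes at least one true case and one true control are present.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(is_case, predicted_case):
        if truth and pred:
            tp += 1          # case correctly classified
        elif truth:
            fn += 1          # case missed
        elif pred:
            fp += 1          # control wrongly called a case
        else:
            tn += 1          # control correctly classified
    return tp / (tp + fn), tn / (tn + fp)


# Toy example: 4 cases followed by 5 controls.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 1, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(truth, pred)
```

Evaluating sensitivity separately within age strata (for example, only cases with age at death above a threshold) reuses the same function on the filtered subjects.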
Supercentenarians (age 110+ years old) generally delay or escape age-related diseases and disability well beyond the age of 100, and this exceptional survival is likely to be influenced by a genetic predisposition that includes both common and rare genetic variants. In this report, we describe the complete genomic sequences of one male and one female supercentenarian, both older than 114 years. We show that: (1) the sequence variant spectrum of these two individuals’ DNA sequences is largely comparable to existing non-supercentenarian genomes; (2) the two individuals do not appear to carry most of the well-established human longevity enabling variants already reported in the literature; (3) they have a comparable number of known disease-associated variants relative to most human genomes sequenced to date; (4) approximately 1% of the variants these individuals possess are novel and may point to new genes involved in exceptional longevity; and (5) both individuals are enriched for coding variants near longevity-associated variants that we discovered through a large genome-wide association study. These analyses suggest that there are both common and rare longevity-associated variants that may counter the effects of disease-predisposing variants and extend lifespan. The continued analysis of the genomes of these and other rare individuals who have survived to extremely old ages should provide insight into the processes that contribute to the maintenance of health during extreme aging.
whole genome sequence; genetics; longevity; centenarian; supercentenarian; aging
With the progressive aging of the human population, there is an inexorable decline in muscle mass, strength and function. Anabolic supplementation with testosterone has been shown to effectively restore muscle mass in both young and elderly men. In this study, we were interested in identifying serum factors that change with age in two distinct age groups of healthy men, and whether these factors were affected by testosterone supplementation.
We measured the protein levels of a number of serum biomarkers using a combination of banked serum samples from older men (60 to 75 years) and younger men (ages 18 to 35), as well as new serum specimens obtained through collaboration. We compared baseline levels of all biomarkers between young and older men. In addition, we evaluated potential changes in these biomarker levels in association with testosterone dose (low dose defined as 125 mg per week or below compared to high dose defined as 300 mg per week or above) in our banked specimens.
We identified nine serum biomarkers that differed between the young and older subjects. These age-associated biomarkers included: insulin-like growth factor (IGF1), N-terminal propeptide of type III collagen (PIIINP), monokine induced by gamma interferon (MIG), epithelial-derived neutrophil-activating peptide 78 (ENA78), interleukin 7 (IL-7), p40 subunit of interleukin 12 (IL-12p40), macrophage inflammatory protein 1β (MIP-1β), platelet derived growth factor β (PDGFβ) and interferon-inducible protein 10 (IP-10). We further observed testosterone dose-associated changes in some but not all age related markers: IGF1, PIIINP, leptin, MIG and ENA78. Gains in lean mass were confirmed by dual energy X-ray absorptiometry (DEXA).
Results from this study suggest that there are potential phenotypic biomarkers in serum that can be associated with healthy aging and that some but not all of these biomarkers reflect gains in muscle mass upon testosterone administration.
Testosterone; Age; Biomarker
microRNA (miRNA) are short, noncoding RNA that negatively regulate gene expression and may play a causal role in invasive breast cancer. Since many genetic aberrations of invasive disease are detectable in early stages, we hypothesized that miRNA expression dysregulation and the predicted changes in gene expression might also be found in early breast neoplasias.
Expression profiling of 365 miRNA by real-time quantitative polymerase chain reaction assay was combined with laser capture microdissection to obtain an epithelium-specific miRNA expression signature of normal breast epithelium from reduction mammoplasty (RM) (n = 9) and of paired samples of histologically normal epithelium (HN) and ductal carcinoma in situ (DCIS) (n = 16). To determine how miRNA may control the expression of codysregulated mRNA, we also performed gene expression microarray analysis in the same paired HN and DCIS samples and integrated this with miRNA target prediction. We further validated several target pairs by modulating the expression levels of miRNA in MCF7 cells and measured the expression of target mRNA and proteins.
Thirty-five miRNA were aberrantly expressed between RM, HN and DCIS. Twenty-nine miRNA and 420 mRNA were aberrantly expressed between HN and DCIS. Combining these two data sets with miRNA target prediction, we identified two established target pairs (miR-195:CCND1 and miR-21:NFIB) and tested several novel miRNA:mRNA target pairs. Overexpression of the putative tumor suppressor miR-125b, which is underexpressed in DCIS, repressed the expression of MEMO1, which is required for ErbB2-driven cell motility (also a target of miR-125b), and NRIP1/RIP140, which modulates the transcriptional activity of the estrogen receptor. Knockdown of the putative oncogenic miRNA miR-182 and miR-183, both highly overexpressed in DCIS, increased the expression of chromobox homolog 7 (CBX7) (which regulates E-cadherin expression), DOK4, NMT2 and EGR1. Augmentation of CBX7 by knockdown of miR-182 expression, in turn, positively regulated the expression of E-cadherin, a key protein involved in maintaining normal epithelial cell morphology, which is commonly lost during neoplastic progression.
These data provide the first miRNA expression profile of normal breast epithelium and of preinvasive breast carcinoma. Further, we demonstrate that altered miRNA expression can modulate gene expression changes that characterize these early cancers. We conclude that miRNA dysregulation likely plays a substantial role in early breast cancer development.
editorial; genetics; risk factors
Individuals from families recruited for the Long Life Family Study (LLFS) (n= 4559) were examined and compared to individuals from other cohorts to determine whether the recruitment targeting longevity resulted in a cohort of individuals with better health and function. Other cohorts with similar data included the Cardiovascular Health Study, the Framingham Heart Study, and the New England Centenarian Study. Diabetes, chronic pulmonary disease and peripheral artery disease tended to be less common in LLFS probands and offspring compared to similar aged persons in the other cohorts. Pulse pressure and triglycerides were lower, high density lipids were higher, and a perceptual speed task and gait speed were better in LLFS. Age-specific comparisons showed differences that would be consistent with a higher peak, later onset of decline or slower rate of change across age in LLFS participants. These findings suggest several priority phenotypes for inclusion in future genetic analysis to identify loci contributing to exceptional survival.
longevity; exceptional survival; family studies; genetics; healthy aging; genome wide association study; multicenter studies; aging phenotypes
Family studies of exceptional longevity can potentially identify genetic and other factors contributing to long life and healthy aging. Although such studies seek families that are exceptionally long lived, they also need living members who can provide DNA and phenotype information. On the basis of these considerations, the authors developed a metric to rank families for selection into a family study of longevity. Their measure, the family longevity selection score (FLoSS), is the sum of 2 components: 1) an estimated family longevity score built from birth-, gender-, and nation-specific cohort survival probabilities and 2) a bonus for older living siblings. The authors examined properties of FLoSS-based family rankings by using data from 3 ongoing studies: the New England Centenarian Study, the Framingham Heart Study, and screenees for the Long Life Family Study. FLoSS-based selection yields families with exceptional longevity, satisfactory sibship sizes and numbers of living siblings, and high ages. Parameters in the FLoSS formula can be tailored for studies of specific populations or age ranges or with different conditions. The first component of the FLoSS also provides a conceptually sound survival measure to characterize exceptional longevity in individuals or families in various types of studies and correlates well with later-observed longevity.
aged, 80 and over; family data; longevity; Shannon information
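The two-component structure of the score described above can be sketched as follows. This is not the published FLoSS formula: the per-sibling term `-log(survival probability)` is an assumption suggested only by the "Shannon information" keyword, and the bonus rule, its age threshold, and its weight are hypothetical placeholders. In practice the survival probabilities would come from birth-, gender-, and nation-specific cohort life tables.

```python
from math import log


def family_score(siblings, bonus_age=90, bonus_weight=1.0):
    """Illustrative family longevity score with two components.

    siblings: list of (survival_prob, alive, age) tuples, where
    survival_prob is the cohort probability of surviving to the
    sibling's attained age (assumed input, e.g. from a life table).
    """
    # Component 1: rarer survival (smaller probability) scores higher.
    longevity = sum(-log(p) for p, _alive, _age in siblings)
    # Component 2: bonus for each living sibling at or above bonus_age
    # (hypothetical rule, reflecting the need for living participants).
    living_bonus = bonus_weight * sum(
        1 for _p, alive, age in siblings if alive and age >= bonus_age
    )
    return longevity + living_bonus


# Toy sibship: one deceased centenarian and one living 92-year-old.
example = [(0.01, False, 100), (0.5, True, 92)]
score = family_score(example)
```

Ranking candidate families by such a score rewards sibships whose members reached ages that were improbable for their cohort while still favoring families with old living members who can provide DNA and phenotype data.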
Population stratification can cause spurious associations in a genome-wide association study (GWAS), and occurs when differences in allele frequencies of single nucleotide polymorphisms (SNPs) are due to ancestral differences between cases and controls rather than the trait of interest. Principal components analysis (PCA) is the established approach to detect population substructure using genome-wide data and to adjust the genetic association for stratification by including the top principal components in the analysis. An alternative solution is genetic matching of cases and controls that requires, however, well defined population strata for appropriate selection of cases and controls.
We developed a novel algorithm to cluster individuals into groups with similar ancestral backgrounds based on the principal components computed by PCA. We demonstrate the effectiveness of our algorithm in real and simulated data, and show that matching cases and controls using the clusters assigned by the algorithm substantially reduces population stratification bias. Through simulation we show that the power of our method is higher than adjustment for PCs in certain situations.
In addition to reducing population stratification bias and improving power, matching creates a clean dataset free of population stratification which can then be used to build prediction models without including variables to adjust for ancestry. The cluster assignments also allow for the estimation of genetic heterogeneity by examining cluster specific effects.
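The general idea of clustering individuals on principal components before matching can be sketched in a few lines. This is not the authors' algorithm: it computes only the first principal component by power iteration and splits subjects by the sign of their score, a deliberately crude two-cluster stand-in. The function names and the toy genotype matrix are hypothetical, and the sketch assumes the genotype matrix is not constant across subjects.

```python
import random


def first_pc_scores(genotypes, iterations=200, seed=0):
    """Scores of each subject on the first principal component.

    genotypes: list of rows (subjects) of allele counts (0/1/2) per SNP.
    """
    n, m = len(genotypes), len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(m)]
    for _ in range(iterations):
        # Power iteration on X^T X: first project subjects onto v ...
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        # ... then map back to SNP space and renormalize.
        v = [sum(centered[i][j] * scores[i] for i in range(n))
             for j in range(m)]
        norm = sum(w * w for w in v) ** 0.5
        v = [w / norm for w in v]
    return [sum(x * w for x, w in zip(row, v)) for row in centered]


def assign_clusters(scores):
    """Crude two-cluster split by the sign of the PC score."""
    return [0 if s < 0 else 1 for s in scores]


# Toy data: two ancestral groups with different allele frequencies.
toy = [
    [2, 2, 1, 0, 0, 0], [2, 1, 2, 0, 0, 1], [1, 2, 2, 0, 1, 0],
    [0, 0, 0, 2, 2, 1], [0, 1, 0, 2, 1, 2], [0, 0, 1, 1, 2, 2],
]
clusters = assign_clusters(first_pc_scores(toy))
```

Cases would then be matched to controls drawn from the same cluster, so that allele-frequency differences due to ancestry do not masquerade as trait associations.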
The availability of affordable high throughput technology for parallel genotyping has opened the field of genetics to genome-wide association studies (GWAS), and in the last few years hundreds of articles reporting results of GWAS for a variety of heritable traits have been published. What do these results tell us? Although GWAS have discovered a few hundred reproducible associations, this number is underwhelming in relation to the huge amount of data produced, and challenges the conjecture that common variants may be the genetic causes of common diseases. We argue that the massive amount of genetic data that result from these studies remains largely unexplored and unexploited because of the challenge of mining and modeling enormous data sets, the difficulty of using nontraditional computational techniques and the focus of accepted statistical analyses on controlling the false positive rate rather than limiting the false negative rate. In this article, we will review the common approach to analysis of GWAS data and then discuss options to learn more from these data. We will use examples from our ongoing studies of sickle cell anemia and also GWAS in multigenic traits.
We conducted a genome-wide association study (GWAS) to discover single nucleotide polymorphisms (SNPs) associated with the severity of sickle cell anemia in 1,265 patients with either “severe” or “mild” disease based on a network model of disease severity. We analyzed data using single SNP analysis and a novel SNP set enrichment analysis (SSEA) developed to discover clusters of associated SNPs. Single SNP analysis discovered 40 SNPs that were strongly associated with sickle cell severity (odds for association >1,000); of the 32 that we could analyze in an independent set of 163 patients, five replicated, eight showed consistent effects but failed to reach statistical significance, and 19 did not show any convincing association. Among the replicated associations are SNPs in KCNK6, a K+ channel gene. SSEA identified 27 genes with a strong enrichment of significant SNPs (P < 10^−6); 20 were replicated with varying degrees of confidence. Among the novel findings identified by SSEA is the telomere length regulator gene TNKS. These studies are the first to use GWAS to understand the genetic diversity that accounts for the phenotypic heterogeneity of sickle cell anemia as estimated by an integrated model of severity. Additional validation, resequencing, and functional studies to understand the biology and reveal mechanisms by which candidate genes might have their effects are the future goals of this work.
To determine whether the offspring of centenarians have personality characteristics that are distinct from the general population.
Nationwide U.S. sample.
Unrelated offspring of centenarians (n = 246, mean age 75) were compared with published norms.
Using the NEO-Five-Factor Inventory (NEO-FFI) questionnaire, measures of the personality traits neuroticism, extraversion, openness, agreeableness, and conscientiousness were obtained. T-scores and percentiles were calculated according to sex and used to interpret the results.
Male and female offspring of centenarians scored in the low range of published norms for neuroticism and in the high range for extraversion. The women also scored comparatively high in agreeableness. Otherwise, both sexes scored within normal range for conscientiousness and openness, and the men scored within normal range for agreeableness.
Specific personality traits may be important to the relatively successful aging demonstrated by the offspring of centenarians. The similarity across four of the five domains between male and female offspring is noteworthy and may relate to their successful aging. Measures of personality are an important phenotype to include in studies that assess genetic and environmental influences on longevity and successful aging.
personality; longevity; centenarian; extraversion; neuroticism; agreeableness
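The T-score conversion mentioned in the methods follows the standard definition: a raw score is standardized against a normative mean and standard deviation and rescaled to mean 50 and standard deviation 10. The sketch below shows that conversion and, assuming normally distributed norms, the corresponding percentile. The normative values in the example are hypothetical placeholders, not the published sex-specific NEO-FFI norms.

```python
from math import erf, sqrt


def t_score(raw, norm_mean, norm_sd):
    """Standard T-score: mean 50, standard deviation 10."""
    return 50 + 10 * (raw - norm_mean) / norm_sd


def percentile_from_t(t):
    """Percentile of a T-score, assuming a normal distribution."""
    z = (t - 50) / 10
    return 100 * 0.5 * (1 + erf(z / sqrt(2)))


# Hypothetical norms: mean 20, SD 5 for one trait and one sex.
example_t = t_score(25, norm_mean=20, norm_sd=5)
example_pct = percentile_from_t(example_t)
```

A raw score one normative standard deviation above the mean therefore maps to a T-score of 60, roughly the 84th percentile under the normality assumption.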
Sickle cell anemia (SCA) is a paradigmatic single gene disorder caused by homozygosity with respect to a unique mutation at the β-globin locus. SCA is phenotypically complex, with different clinical courses ranging from early childhood mortality to a virtually unrecognized condition. Overt stroke is a severe complication affecting 6–8% of individuals with SCA. Modifier genes might interact to determine the susceptibility to stroke, but such genes have not yet been identified. Using Bayesian networks, we analyzed 108 SNPs in 39 candidate genes in 1,398 individuals with SCA. We found that 31 SNPs in 12 genes interact with fetal hemoglobin to modulate the risk of stroke. This network of interactions includes three genes in the TGF-β pathway and SELP, which is associated with stroke in the general population. We validated this model in a different population by predicting the occurrence of stroke in 114 individuals with 98.2% accuracy.
Prevention of diabetic retinopathy would benefit from availability of drugs that preempt the effects of hyperglycemia on retinal vessels. We aimed to identify candidate drug targets by investigating the molecular effects of drugs that prevent retinal capillary demise in the diabetic rat.
RESEARCH DESIGN AND METHODS
We examined the gene expression profile of retinal vessels isolated from rats with 6 months of streptozotocin-induced diabetes and compared it with that of control rats. We then tested whether the aldose reductase inhibitor sorbinil and aspirin, which have different mechanisms of action, prevented common molecular abnormalities induced by diabetes. The Affymetrix GeneChip Rat Genome 230 2.0 array was complemented by real-time RT-PCR, immunoblotting, and immunohistochemistry.
The retinal vessels of diabetic rats showed differential expression of 20 genes of the transforming growth factor (TGF)-β pathway, in addition to genes involved in oxidative stress, inflammation, vascular remodeling, and apoptosis. The complete loop of TGF-β signaling, including Smad2 phosphorylation, was enhanced in the retinal vessels, but not in the neural retina. Sorbinil normalized the expression of 71% of the genes related to oxidative stress and 62% of those related to inflammation. Aspirin had minimal or no effect on these two categories. The two drugs were instead concordant in reducing the upregulation of genes of the TGF-β pathway (55% for sorbinil and 40% for aspirin) and apoptosis (74 and 42%, respectively).
Oxidative and inflammatory stress is the distinct signature that the polyol pathway leaves on retinal vessels. TGF-β and apoptosis are, however, the ultimate targets to prevent the capillary demise in diabetic retinopathy.
Although it is commonly held that survival to age 100 years entails markedly delaying or escaping age-related morbidities, nearly one-third of centenarians have age-related morbidities for 15 or more years. Yet, we have previously observed that many centenarians compress disability toward the end of their lives. Therefore, we hypothesize that for some centenarians, compression of disability rather than morbidity is a key feature for survival to old age.
This cross-sectional, nationwide study included 523 women and 216 men 97 years or older. The participants were stratified by sex and age at onset (age <85 years [termed survivors] and age ≥85 years [termed delayers]) of chronic obstructive pulmonary disease, dementia, diabetes, heart disease, hypertension, osteoporosis, Parkinson disease, and stroke. Dependent variables were the Barthel Activities of Daily Living Index (Barthel Index) and the Information-Memory-Concentration test of the Blessed Dementia Scale.
Thirty-two percent of the participants were survivors. For men with hypertension and/or heart disease for 15 or more years, the median Barthel Index score was 90 (independence range, 80–100). For female survivors with hypertension, heart disease, and/or osteoporosis, the median Barthel Index score was 65 (minimal assistance range, 60–79). Generally, men had better function than women: 60% of male survivors had Barthel Index scores of 90 or higher compared with 18% of female survivors (P<.001), and 50% of male delayers had Barthel Index scores of 90 or higher compared with 27% of female delayers (P<.001).
Whereas the compression of both morbidity and disability are essential features of survival to old age for some centenarians, for others, the compression of disability alone may be the key prerequisite. Though far fewer in number, male centenarians tend to have significantly better cognition and physical function than their female counterparts.
BMapBuilder builds maps of pairwise linkage disequilibrium (LD) in either two or three dimensions. The optimized resolution allows for graphical display of LD for single nucleotide polymorphisms (SNPs) in a whole chromosome.
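The quantity displayed in such a map is a pairwise LD statistic between SNPs. The sketch below is not BMapBuilder code; it shows the two standard measures, D′ and r², computed from phased haplotypes coded 0/1, and builds the symmetric r² matrix that a two-dimensional LD map would color. The function names and the toy haplotypes are hypothetical.

```python
def pairwise_ld(hap_a, hap_b):
    """Return (D_prime, r_squared) for two biallelic SNPs.

    hap_a, hap_b: allele (0/1) carried at each SNP on the same
    phased haplotypes, in the same order.
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # raw disequilibrium D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    # D' rescales D by its maximum given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d_prime, r2


def ld_map(haplotypes_by_snp):
    """Symmetric matrix of pairwise r^2 values across SNPs."""
    m = len(haplotypes_by_snp)
    return [[pairwise_ld(haplotypes_by_snp[i], haplotypes_by_snp[j])[1]
             for j in range(m)] for i in range(m)]


# Toy data: SNP 0 and SNP 1 in perfect LD, SNP 2 independent of both.
snps = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0]]
matrix = ld_map(snps)
```

For a whole chromosome the full matrix grows quadratically with the number of SNPs, which is why display resolution becomes the practical constraint for tools of this kind.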