We explore the factor structure of DSM-5 cannabis use disorders, examine its prevalence across European- and African-American respondents as well as its genetic underpinnings, utilizing data from a genome-wide study of single nucleotide polymorphisms (SNPs). We also estimate the heritability of DSM-5 cannabis use disorders explained by these common SNPs.
Data on 3053 subjects reporting a lifetime history of cannabis use were utilized. Exploratory and confirmatory factor analyses were conducted to create a factor score, which was used in a genomewide association analysis. P-values from the single SNP analysis were examined for evidence of gene-based association. The aggregate effect of all SNPs was also estimated using Genome-Wide Complex Traits Analysis.
The unidimensionality of DSM-5 cannabis use disorder criteria was demonstrated. Comparing DSM-IV to DSM-5, a decrease in prevalence of cannabis use disorders was only noted in European-American respondents and was exceedingly modest. For the DSM-5 cannabis use disorders factor score, no SNP surpassed the genome-wide significance testing threshold. However, in the European-American subsample, gene-based association testing resulted in significant associations in 3 genes (C17orf58, BPTF and PPM1D) on chromosome 17q24. In aggregate, 21% of the variance in DSM-5 cannabis use disorders was explained by the genomewide SNPs; however, this estimate was not statistically significant.
DSM-5 cannabis use disorder represents a unidimensional construct, the prevalence of which is only modestly elevated above the DSM-IV version. Considerably larger sample sizes will be required to identify individual SNPs associated with cannabis use disorders and unequivocally establish its polygenic underpinnings.
Cannabis; DSM-5; GWAS; association; genetics; heritability
It is well established that risk for developing psychosis is largely mediated by the influence of genes, but identifying precisely which genes underlie that risk has been problematic. Focusing on endophenotypes, rather than illness risk, is one solution to this problem. Impaired cognition is a well-established endophenotype of psychosis. Here we aimed to characterize the genetic architecture of cognition using phenotypically detailed models as opposed to relying on general IQ or individual neuropsychological measures. In so doing we hoped to identify genes that mediate cognitive ability which might also contribute to psychosis risk. Hierarchical factor models of genetically clustered cognitive traits were subjected to linkage analysis followed by QTL region-specific association analyses in a sample of 1,269 Mexican American individuals from extended pedigrees. We identified four genome wide significant QTLs, two for working and two for spatial memory, and a number of plausible and interesting candidate genes. The creation of detailed models of cognition seemingly enhanced the power to detect genetic effects on cognition and provided a number of possible candidate genes for psychosis.
schizophrenia; genetics; cognition; GWAS; linkage
We report a GWAS for cocaine dependence (CD) in three sets of African- and European-American subjects (AAs and EAs, respectively), to identify pathways, genes, and alleles important in CD risk.
The discovery GWAS dataset (n=5,697 subjects) was genotyped using the Illumina OmniQuad microarray (890,000 analyzed SNPs). Additional genotypes were imputed based on the 1000 Genomes reference panel. Top-ranked findings were evaluated by incorporating information from publicly available GWAS data from 4,063 subjects. Then, the most significant GWAS SNPs were genotyped in 2,549 independent subjects.
We observed one genomewide-significant (GWS) result: rs7086629 at the FAM53B (“family with sequence similarity 53, member B”) locus. This was supported in both AAs and EAs; the p-value from the meta-analysis of all samples was 4.28 × 10⁻⁸. The gene maps to the same chromosomal region as the maximum peak we observed in a previous linkage study. NCOR2 (“nuclear receptor corepressor 2”) SNP rs150954431 was associated at p = 1.19 × 10⁻⁹ in the EA discovery sample. SNP rs2456778, which maps to CDK1 (“cyclin-dependent kinase 1”), was associated with cocaine-induced paranoia in AAs in the discovery sample only (p = 4.68 × 10⁻⁸).
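As an aside on the term "genomewide-significant": the abstract does not state the threshold it used, so the sketch below assumes the conventional value of 5 × 10⁻⁸ (0.05 spread over roughly one million independent tests). The constant and the helper name are hypothetical, introduced only for illustration.

```python
# Conventional genome-wide significance threshold. This is an assumption:
# the abstract does not say which threshold the authors applied.
GWS_THRESHOLD = 0.05 / 1_000_000  # about 5e-8


def is_genome_wide_significant(p_value, threshold=GWS_THRESHOLD):
    """Return True when a p-value falls below the chosen threshold."""
    return p_value < threshold


# Meta-analysis p-value quoted in the abstract for rs7086629 (FAM53B).
print(is_genome_wide_significant(4.28e-8))
```

Under this assumed threshold the meta-analysis p-value for rs7086629 clears the bar only narrowly, which is one reason replication in independent samples matters for a result of this kind.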
This is the first study to identify risk variants for CD using GWAS. Our results implicate novel risk loci and provide insights into potential therapeutic and prevention strategies.
Cocaine dependence; cocaine-induced paranoia; GWAS; population genetics; European-American and African-American populations
Blood pressure (BP) is a complex trait, with a heritability of 30 to 40%. Several genome wide associated BP loci explain only a small fraction of the phenotypic variation. Family studies can provide an important tool for gene discovery by utilizing trait and genetic transmission information among relative-pairs. We have previously described a quantitative trait locus at chromosome 17q25.3 influencing systolic BP in American Indians of the Strong Heart Family Study (SHFS). This locus has been reported to associate with variation in BP traits in family studies of Europeans, African Americans and Hispanics.
To follow up the persuasive linkage findings at this locus, we performed comprehensive genotyping in the 1-LOD unit support interval surrounding this QTL using a multi-step strategy. We first genotyped 1,334 single nucleotide polymorphisms (SNPs) in 928 individuals from families that showed evidence of linkage for BP. We then genotyped a second panel of 306 SNPs in all SHFS participants (N = 3,807) for genes that displayed the strongest evidence of association in the region, and, in a third step, included additional genotyping to better cover the genes of interest and to interrogate plausible candidate genes in the region.
Three genes had multiple SNPs marginally associated with systolic BP (TBC1D16, HRNBP3 and AZI1). In BQTN analysis, used to estimate the posterior probability that any variant in each gene had an effect on the phenotype, AZI1 showed the most prominent findings (posterior probability of 0.66). Importantly, upon correction for multiple testing, none of our study findings could be distinguished from chance.
Our findings demonstrate the difficulty of follow-up studies of linkage studies for complex traits, particularly in the context of low powered studies and rare variants underlying linkage peaks.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2261-14-158) contains supplementary material, which is available to authorized users.
A significant proportion of the variability in carotid artery lumen diameter is attributable to genetic factors.
Carotid ultrasonography and genotyping were performed in the 3,300 American Indian participants in the Strong Heart Family Study (SHFS) to identify chromosomal regions harboring novel genes associated with inter-individual variation in carotid artery lumen diameter. Genome-wide linkage analysis was conducted using standard variance component linkage methods, implemented in SOLAR, based on multipoint identity-by-descent matrices.
Genome-wide linkage analysis revealed significant evidence of linkage for a locus influencing left carotid artery diastolic and systolic lumen diameter in Arizona SHFS participants on chromosome 7 at 120 cM (lod=4.85 and 3.77, respectively, after sex and age adjustment, and lod=3.12 and 2.72, respectively, after adjustment for sex, age, height, weight, systolic and diastolic blood pressure, diabetes mellitus and current smoking). Suggestive evidence of linkage for left carotid artery diastolic and systolic lumen diameter was found on chromosome 12 at 153 cM (lod=2.20 and 2.60, respectively, after sex and age adjustment, and lod=2.44 and 2.16, respectively, after full covariate adjustment) in Oklahoma SHFS participants; suggestive linkage for right carotid artery diastolic and systolic lumen diameter was found on chromosome 9 at 154 cM (lod=2.72 and 3.19, respectively, after sex and age adjustment, and lod=2.36 and 2.21, respectively, after full covariate adjustment) in Oklahoma SHFS participants.
We found significant evidence for loci influencing carotid artery lumen diameter on chromosome 7q and suggestive linkage on chromosomes 12q and 9q.
genetics; carotid artery; ultrasonography; linkage analysis; variance components
Obesity is a major contributor to the global burden of chronic disease and disability, though current knowledge of causal biologic underpinnings is lacking. Through the regulation of energy homeostasis and interactions with adiposity and gut signals, the brain is thought to play a significant role in the development of this disorder. While neuroanatomic variation has been associated with obesity, it is unclear if this relationship is influenced by common genetic mechanisms. In this study, we sought genetic components that influence both brain anatomy and body mass index (BMI) to provide further insight into the role of the brain in energy homeostasis and obesity.
MRI images of brain anatomy were acquired in 839 Mexican American individuals from large extended pedigrees. Bivariate linkage and quantitative analyses were performed in SOLAR.
Genetic factors associated with increased BMI were also associated with reduced cortical surface area and subcortical volume. We identified two genome-wide quantitative trait loci that influenced BMI and ventral diencephalon volume, and BMI and supramarginal gyrus surface area, respectively.
This study represents the first genetic analysis seeking evidence of pleiotropic effects acting on both brain anatomy and BMI. Results suggest that a region on chromosome 17 contributes to the development of obesity, potentially through leptin-induced signaling in the hypothalamus, and that a region on chromosome 3 appears to jointly influence food-related reward circuitry and the supramarginal gyrus.
BMI; obesity; imaging; brain; pleiotropy
Type 2 diabetes (T2DM) is a complex metabolic disease and is more prevalent in certain ethnic groups such as the Mexican Americans. The goal of our study was to perform a genome-wide linkage analysis to localize T2DM susceptibility loci in Mexican Americans.
We used the phenotypic and genotypic data from 1,122 Mexican American individuals (307 families) who participated in the Veterans Administration Genetic Epidemiology Study (VAGES). Genome-wide linkage analysis was performed, using the variance components approach. Data from two additional Mexican American family studies, the San Antonio Family Heart Study (SAFHS) and the San Antonio Family Diabetes/Gallbladder Study (SAFDGS), were combined with the VAGES data to test for improved linkage evidence.
After adjusting for covariate effects, T2DM was found to be under significant genetic influence (h² = 0.62, P = 2.7 × 10⁻⁶). The strongest evidence for linkage of T2DM occurred between markers D9S1871 and D9S2169 on chromosome 9p24.2-p24.1 (LOD = 1.8). Because we had previously reported suggestive evidence for linkage of T2DM at this region in SAFDGS, we repeated the genome-wide linkage analysis with the VAGES data combined with the SAFHS and SAFDGS data and found significant, increased linkage evidence for T2DM at the same chromosomal region (LOD = 4.3, empirical P = 1.0 × 10⁻⁵, genome-wide P = 1.6 × 10⁻³).
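To give a feel for how LOD scores relate to p-values, here is a minimal sketch of the standard asymptotic conversion for variance-component linkage. It assumes the likelihood-ratio statistic 2·ln(10)·LOD follows a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom. The abstract reports empirical p-values, so this asymptotic approximation is not the authors' procedure and need not reproduce their numbers; the function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod):
    """Asymptotic pointwise p-value for a variance-component LOD score.

    Assumption: LRT = 2*ln(10)*LOD is distributed as a 50:50 mixture of
    zero and chi-square(1 df), the usual boundary-case approximation.
    """
    if lod <= 0:
        return 1.0
    # Chi-square(1 df) survival function at x is erfc(sqrt(x / 2)),
    # and x / 2 equals ln(10) * LOD.
    return 0.5 * math.erfc(math.sqrt(math.log(10.0) * lod))


for lod in (1.8, 3.0, 4.3):
    print(lod, lod_to_pointwise_p(lod))
```

The familiar rule of thumb that a LOD of 3 corresponds to a pointwise p-value near 10⁻⁴ falls out of this conversion.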
Significant T2DM linkage evidence was found on chromosome 9p24 in Mexican Americans. Importantly, the chromosomal region of interest in this study overlaps with several recent genome-wide association studies (GWASs) involving T2DM related traits. Given its overlap with such findings and our own initial T2DM association findings in the 9p24 chromosomal region, high throughput sequencing of the linked chromosomal region could identify the potential causal T2DM genes.
Type 2 diabetes; Linkage; Chromosome 9p24; Mexican Americans; VAGES
Pediatric metabolic syndrome (MS) and its cardiometabolic components (MSCs) have become increasingly prevalent, yet little is known about the genetics underlying MS risk in children. We examined the prevalence and genetics of MS-related traits among 670 non-diabetic Mexican American (MA) children and adolescents, aged 6–17 years (49 % female), who were participants in the San Antonio Family Assessment of Metabolic Risk Indicators in Youth (SAFARI) study. These children are offspring or biological relatives of adult participants from three well-established Mexican American family studies in San Antonio, Texas, at increased risk of type 2 diabetes. MS was defined as ≥ 3 abnormalities among 6 MSC measures: waist circumference, systolic and/or diastolic blood pressure, fasting insulin, triglycerides, HDL-cholesterol, and fasting and/or 2-h OGTT glucose. Genetic analyses of MS, number of MSCs (MSC-N), MS factors, and bivariate MS traits were performed. Overweight/obesity (53 %), pre-diabetes (13 %), acanthosis nigricans (33 %), and MS (19 %) were strikingly prevalent, as were MS components, including abdominal adiposity (32 %) and low HDL-cholesterol (32 %). Factor analysis of MS traits yielded three constructs: adipo-insulin-lipid, blood pressure, and glucose factors, and their factor scores were highly heritable. MS itself exhibited 68 % heritability. MSC-N showed strong positive genetic correlations with obesity, insulin resistance, inflammation, and acanthosis nigricans, and negative genetic correlation with physical fitness. MS trait pairs exhibited strong genetic and/or environmental correlations. These findings highlight the complex genetic architecture of MS/MSCs in MA children, and underscore the need for early screening and intervention to prevent chronic sequelae in this vulnerable pediatric population.
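The MS definition used here (at least 3 abnormalities among 6 MSC measures) can be sketched as a simple count. The abstract does not give the cut-offs that make each component "abnormal", so the sketch takes precomputed boolean flags; the key names and function names are hypothetical.

```python
# The six MSC measures named in the abstract. Each flag means "abnormal";
# the thresholds behind the flags are not given in the abstract.
MSC_MEASURES = (
    "waist_circumference",
    "blood_pressure",    # systolic and/or diastolic
    "fasting_insulin",
    "triglycerides",
    "hdl_cholesterol",
    "glucose",           # fasting and/or 2-h OGTT
)


def count_abnormal(flags):
    """Number of abnormal components (the MSC-N trait)."""
    return sum(bool(flags.get(name, False)) for name in MSC_MEASURES)


def has_metabolic_syndrome(flags, minimum=3):
    """MS is present when at least `minimum` components are abnormal."""
    return count_abnormal(flags) >= minimum


child = {"waist_circumference": True, "hdl_cholesterol": True, "triglycerides": True}
print(count_abnormal(child), has_metabolic_syndrome(child))
```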
Waist circumference (WC), the clinical marker of central obesity, is gaining popularity as a screening tool for type 2 diabetes (T2D). While there is epidemiologic evidence favoring the WC-T2D association, its biological substantiation is generally weak. Our objective was to determine the independent association of plasma lipid repertoire with WC.
Design and methods
We used samples and data from the San Antonio Family Heart Study of 1208 Mexican Americans from 42 extended families. We determined association of plasma lipidomic profiles with the cross-sectionally assessed WC. Plasma lipidomic profiling entailed liquid chromatography with mass spectrometry. Statistical analyses included multivariable polygenic regression models and bivariate trait analyses using the SOLAR software.
After adjusting for age and sex interactions, body mass index, homeostasis model of assessment – insulin resistance, total cholesterol, triglycerides, high density lipoproteins and use of lipid lowering drugs, dihydroceramides as a class were associated with WC. Dihydroceramide species 18:0, 20:0, 22:0 and 24:1 were significantly associated and genetically correlated with WC. Two sphingomyelin species (31:1 and 41:1) were also associated with WC.
Plasma dihydroceramide levels independently associate with WC. Thus, high resolution plasma lipidomic studies can provide further credence to the biological underpinnings of the association of WC with T2D.
waist circumference; lipidomics; central obesity; family studies; Mexican Americans
Certain cognitive measures are heritable and differentiate individuals at risk for schizophrenia from unaffected family members and healthy comparison subjects. These deficits in neurocognitive performance in patients with schizophrenia appear stable in the short term. However, the duration of most, but not all, longitudinal studies is modest, and the majority have relied on traditional average performance measures to examine stability. Using a computerized neurocognitive battery (CNB), we assessed mean performance (accuracy and speed) and intra-individual variability (IIV) in a longitudinal study of neurocognitive stability in European-American multiplex families with schizophrenia. Thirty-four patients with schizophrenia, 65 unaffected relatives, and 45 healthy comparison subjects completed the same computerized neurocognitive assessment over approximately 5 years. Measures of mean performance showed that patients had stable accuracy but became slower in many neurocognitive domains over time as compared with unaffected family members and healthy subjects. Furthermore, patients and family members showed dissociable patterns of change in IIV for speed across cognitive domains: compared with controls, patients showed higher across-task IIV, whereas family members showed lower across-task IIV. Patients showed an increase in IIV over time, whereas family members showed a decrease. These findings suggest that measures of mean performance and IIV of speed during a CNB may provide useful information about genetic susceptibility to schizophrenia.
intra-individual variability; schizophrenia; cognition; family
Discrete time survival analysis (DTSA) was used to assess the age-specific association of event related oscillations (EROs) and CHRM2 gene variants with the onset of regular alcohol use and alcohol dependence. The subjects were 2938 adolescents and young adults aged 12 to 25. Results showed that the CHRM2 gene variants and ERO risk factors had hazards which varied considerably with age. The bulk of the significant age-specific associations occurred in those whose age of onset was under 16. These associations were concentrated in those subjects who at some time took an illicit drug. These results are consistent with studies that report greater rates of alcohol dependence among those who begin drinking at an early age. The age specificity of the genetic and neurophysiological factors is consistent with recent studies of adolescent brain development, which locate an interval of heightened vulnerability to substance use disorders in the early to mid teens.
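The core quantity in discrete time survival analysis is the age-specific hazard: among subjects still at risk at a given age, the fraction who experience onset at that age. The sketch below builds only this life table from made-up toy records; it does not include the covariate modelling (genotype or ERO effects) that the study itself would need, and the function name and data are hypothetical.

```python
def discrete_time_hazards(records, ages):
    """Life table of age-specific hazards and cumulative survival.

    records: list of (age, event) pairs, where age is the onset age when
    event is True and the age at last observation (censoring) otherwise.
    Returns rows of (age, at_risk, events, hazard, survival).
    """
    table = []
    survival = 1.0
    for age in ages:
        at_risk = sum(1 for a, _ in records if a >= age)
        events = sum(1 for a, e in records if a == age and e)
        hazard = events / at_risk if at_risk else 0.0
        survival *= 1.0 - hazard
        table.append((age, at_risk, events, hazard, survival))
    return table


# Toy data, invented for illustration only.
toy = [(13, True), (14, True), (14, False), (16, True), (18, False)]
for row in discrete_time_hazards(toy, range(12, 19)):
    print(row)
```

An age-varying effect of a risk factor, as described in the abstract, would show up as hazards that differ between carrier groups at some ages but not at others.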
alcoholism; CHRM2; survival analysis; ERO; genetics; adolescents
Both as a component of metabolic syndrome and as an independent entity, hypertension poses a continued challenge with regard to its diagnosis, pathogenesis and treatment. Previous studies have documented connections between hypertension and indicators of lipid metabolism. Novel technologies like plasma lipidomic profiling promise a better understanding of disorders in which there is a derangement of the lipid metabolism. However, association of plasma lipidomic profiles with hypertension in a high-risk population, like Mexican Americans, has not been evaluated before. Using the rich data and sample resource from the ongoing San Antonio Family Heart Study, we conducted plasma lipidomic profiling by combining high performance liquid chromatography with tandem mass spectroscopy to characterize 319 lipid species in 1192 individuals from 42 large and extended Mexican American families. Robust statistical analyses employing polygenic regression models, liability threshold models and bivariate trait analyses implemented in the SOLAR software were conducted after accounting for obesity, insulin resistance and relative abundance of various lipoprotein fractions. Diacylglycerols in general and the DG 16:0/22:5 and DG 16:0/22:6 lipid species in particular were significantly associated with systolic, diastolic and mean arterial pressures as well as liability of incident hypertension measured during 7767.42 person-years of follow-up. Four lipid species, including the DG 16:0/22:5 and DG 16:0/22:6 species, showed significant genetic correlations with the liability of hypertension in bivariate trait analyses. Our results demonstrate the value of plasma lipidomic profiling in the context of hypertension and identify disturbance of diacylglycerol metabolism as an independent biomarker of hypertension.
lipidomics; hypertension; blood pressure; lipid species; Mexican Americans
Alcohol dependence (AD) is a heritable substance addiction with adverse physical and psychological consequences, representing a major health and economic burden on societies worldwide. Genes thus far implicated via linkage, candidate gene and genome-wide association studies (GWAS) account for only a small fraction of its overall risk, with effects varying across ethnic groups. Here we investigate the genetic architecture of alcoholism and report on the extent to which common, genome-wide SNPs collectively account for risk of AD in two US populations, African-Americans (AAs) and European-Americans (EAs). Analyzing GWAS data for two independent case-control sample sets, we compute polymarker scores that are significantly associated with alcoholism (P = 1.64 × 10⁻³ and 2.08 × 10⁻⁴ for EAs and AAs, respectively), reflecting the small individual effects of thousands of variants derived from patterns of allelic architecture that are population-specific. Simulations show that disease models based on rare and uncommon causal variants (MAF<0.05) best fit the observed distribution of polymarker signals. When scoring bins were annotated for gene location and examined for constituent biological networks, gene enrichment was observed for several cellular processes and functions in both EA and AA populations, transcending their underlying allelic differences. Our results reveal key insights into the complex etiology of AD, raising the possibility of an important role for rare and uncommon variants, and identify polygenic mechanisms that encompass a spectrum of disease liability, with some, such as chloride transporters and glycine metabolism genes, displaying subtle, modifying effects that are likely to escape detection in most GWAS designs.
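A polymarker (polygenic) score aggregates many small SNP effects into one number per person. The abstract does not say how its scores were weighted, so the sketch below assumes one common convention: each risk-allele count is weighted by the natural log of an odds ratio estimated in a separate discovery sample. The function name and example values are hypothetical.

```python
import math


def polymarker_score(allele_counts, odds_ratios):
    """Sum over SNPs of (risk-allele count) x ln(odds ratio).

    allele_counts: 0, 1 or 2 copies of the scored allele at each SNP.
    odds_ratios:   per-allele odds ratios from a discovery sample
                   (assumed weighting; not stated in the abstract).
    """
    if len(allele_counts) != len(odds_ratios):
        raise ValueError("need one odds ratio per SNP")
    return sum(g * math.log(o) for g, o in zip(allele_counts, odds_ratios))


# Invented genotypes and weights for three SNPs.
print(polymarker_score([0, 1, 2], [1.2, 1.0, 0.9]))
```

The score's association with case status in an independent target sample is then tested, which is how thousands of individually undetectable effects can yield a significant aggregate signal.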
alcohol dependence; GWAS; polymarker scores; synthetic association; rare variants; pathway analysis
Statistical genetic methods incorporating temporal variation allow for greater understanding of genetic architecture and consistency of biological variation influencing development of complex diseases. This study proposes a bivariate association method jointly testing association of two quantitative phenotypic measures from different time points. Measured genotype association was analyzed for single-nucleotide polymorphisms (SNPs) for systolic blood pressure (SBP) from the first and third visits using 200 simulated Genetic Analysis Workshop 18 (GAW18) replicates. Bivariate association, in which the effect of an SNP on the mean trait values of the two phenotypes is constrained to be equal for both measures and is included as a covariate in the analysis, was compared with a bivariate analysis in which the effect of an SNP was estimated separately for the two measures and with univariate association analyses in 9 SNPs that explained greater than 0.001% of SBP variance over all 200 GAW18 replicates. The SNP 3_48040283 was significantly associated with SBP in all 200 replicates, with the constrained bivariate method providing increased signal over the unconstrained bivariate method. This method improved signal in all 9 SNPs with simulated effects on SBP for nominal significance (p-value <0.05). However, this appears to be determined by the effect size of the SNP on the phenotype. This bivariate association method applied to longitudinal data improves genetic signal for quantitative traits when the effect size of the variant is moderate to large.
Genetic Analysis Workshop 18 (GAW18) focused on identification of genes and functional variants that influence complex phenotypes in human sequence data. Data for the workshop were donated by the T2D-GENES Consortium and included whole genome sequences for odd-numbered autosomes in 464 key individuals selected from 20 Mexican American families, a dense set of single-nucleotide polymorphisms in 959 individuals in these families, and longitudinal data on systolic and diastolic blood pressure measured at 1-4 examinations over a period of 20 years. Simulated phenotypes were generated based on the real sequence data and pedigree structures. In the design of the simulation model, gene expression measures from the San Antonio Family Heart Study (not distributed as part of the GAW18 data) were used to identify genes whose mRNA levels were correlated with blood pressure. Observed variants within these genes were designated as functional in the GAW18 simulation if they were nonsynonymous and predicted to have deleterious effects on protein function or if they were noncoding and associated with mRNA levels. Two simulated longitudinal phenotypes were modeled to have the same trait distributions as the real systolic and diastolic blood pressure data, with effects of age, sex, and medication use, including a genotype-medication interaction. For each phenotype, more than 1000 sequence variants in more than 200 genes present on the odd-numbered autosomes individually explained from less than 0.01% to 2.78% of phenotypic variance. Cumulatively, variants in the most influential gene explained 7.79% of trait variance. An additional simulated phenotype, Q1, was designed to be correlated among family members but to not be associated with any sequence variants. Two hundred replicates of the phenotypes were simulated, with each including data for 849 individuals.
The concept of breeding values, an individual's phenotypic deviation from the population mean as a result of the sum of the average effects of the genes they carry, is of great importance in livestock, aquaculture, and cash crop industries where emphasis is placed on an individual's potential to pass desirable phenotypes on to the next generation. As breeding or genetic values (as referred to here) cannot be measured directly, estimated genetic values (EGVs) are based on an individual's own phenotype, phenotype information from relatives, and, increasingly, genetic data. Because EGVs represent additive genetic variation, calculating EGVs in an extended human pedigree is expected to provide a more refined phenotype for genetic analyses. To test the utility of EGVs in genome-wide association, EGVs were calculated for 847 members of 20 extended Mexican American families based on 100 replicates of simulated systolic blood pressure. Calculations were performed in GAUSS to solve a variation on the standard Best Linear Unbiased Predictor (BLUP) mixed model equation with age, sex, and the first 3 principal components of sample-wide genetic variability as fixed effects and the EGV as a random effect distributed around the relationship matrix. Three methods of calculating kinship were considered: expected kinship from pedigree relationships, empirical kinship from common variants, and empirical kinship from both rare and common variants. Genome-wide association analysis was conducted on simulated phenotypes and EGVs using the additive measured genotype approach in the SOLAR software package. The EGV-based approach showed only minimal improvement in power to detect causative loci.
Genetic Analysis Workshop 18 provided a platform for developing and evaluating statistical methods to analyze whole-genome sequence data from a pedigree-based sample. In this article we present an overview of the data sets and the contributions that analyzed these data. The family data, donated by the Type 2 Diabetes Genetic Exploration by Next-Generation Sequencing in Ethnic Samples Consortium, included sequence-level genotypes based on sequencing and imputation, genome-wide association genotypes from prior genotyping arrays, and phenotypes from longitudinal assessments. The contributions from individual research groups were extensively discussed before, during, and after the workshop in theme-based discussion groups before being submitted for publication.
Mexican Americans are at an increased risk of both thyroid dysfunction and metabolic syndrome (MS). Thus it is conceivable that some components of the MS may be associated with the risk of thyroid dysfunction in these individuals. Our objective was to investigate and replicate the potential association of MS traits with thyroid dysfunction in Mexican Americans.
We conducted association testing for 18 MS traits in two large studies of Mexican Americans – the San Antonio Family Heart Study (SAFHS) and the National Health and Nutrition Examination Survey (NHANES) 2007–10. A total of 907 participants from 42 families in SAFHS and 1633 unrelated participants from NHANES 2007–10 were included in this study. The outcome measures were prevalence of clinical and subclinical hypothyroidism and thyroid function index (TFI) – a measure of thyroid function. For the SAFHS, we used polygenic regression analyses with multiple covariates to test associations in the setting of family studies. For the NHANES 2007–10, we corrected for the survey design variables as needed for association analyses in survey data. In both datasets, we corrected for age, sex and their linear and quadratic interactions.
TFI was an accurate indicator of clinical thyroid status (area under the receiver-operating-characteristic curve to detect clinical hypothyroidism, 0.98) in both SAFHS and NHANES 2007–10. Of the 18 MS traits, waist circumference (WC) showed the most consistent association with TFI in both studies independently of age, sex and body mass index (BMI). In the SAFHS and NHANES 2007–10 datasets, each standard deviation increase in WC was associated with a 0.13 (p < 0.001) and 0.11 (p < 0.001) unit increase in the TFI, respectively. In a series of polygenic and linear regression models, central obesity (defined as WC ≥ 102 cm in men and ≥ 88 cm in women) was associated with clinical and subclinical hypothyroidism independent of age, sex, BMI and type 2 diabetes in both datasets. Estimated prevalence of hypothyroidism was consistently high in those with central obesity, especially below 45 years of age.
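The area under the receiver-operating-characteristic curve quoted above can be read as the probability that a randomly chosen case has a higher index value than a randomly chosen non-case. A minimal sketch of that pairwise definition follows, using invented numbers and assuming that higher values indicate the condition; it is not the authors' code, and the function name is hypothetical.

```python
def roc_auc(scores_cases, scores_controls):
    """AUC as the fraction of case/control pairs ranked correctly.

    Ties between a case and a control count as half a correct ranking.
    """
    wins = 0.0
    for case_score in scores_cases:
        for control_score in scores_controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))


# Invented index values for three cases and four controls.
print(roc_auc([3.1, 2.4, 1.9], [0.2, 0.5, 1.9, 0.1]))
```

An AUC of 0.98, as reported for the TFI, means that almost every case/non-case pair is ordered correctly by the index.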
WC independently associates with increased risk of thyroid dysfunction. Use of WC to identify Mexican American subjects at high risk of thyroid dysfunction should be investigated in future studies.
Waist circumference; Central obesity; Thyroid dysfunction; Mexican Americans
Statistical genetic analysis of quantitative traits in large pedigrees is a formidable computational task due to the necessity of taking the non-independence among relatives into account. With the growing awareness that rare sequence variants may be important in human quantitative variation, heritability and association study designs involving large pedigrees will increase in frequency due to the greater chance of observing multiple copies of rare variants amongst related individuals. Therefore, it is important to have statistical genetic test procedures that utilize all available information for extracting evidence regarding genetic association. Optimal testing for marker/phenotype association involves the exact calculation of the likelihood ratio statistic, which requires the repeated inversion of potentially large matrices. In a whole genome sequence association context, such computation may be prohibitive. Toward this end, we have developed a rapid and efficient eigensimplification of the likelihood that makes analysis of family data commensurate with the analysis of a comparable sample of unrelated individuals. Our theoretical results, which are based on a spectral representation of the likelihood, yield simple exact expressions for the expected likelihood ratio test statistic (ELRT) for pedigrees of arbitrary size and complexity. For heritability, the ELRT is:
\[ \mathrm{ELRT} = -\sum_{i} \ln\left[1 + \hat{h}^2(\lambda_{gi} - 1)\right] \]
where $\hat{h}^2$ and $\lambda_{gi}$ are, respectively, the heritability and the eigenvalues of the pedigree-derived genetic relationship kernel (GRK). For association analysis of sequence variants, the ELRT is given by
\[ \mathrm{ELRT}[h_q^2 > 0 : \text{unrelateds}] - \left(\mathrm{ELRT}[h_t^2 > 0 : \text{pedigrees}] - \mathrm{ELRT}[h_r^2 > 0 : \text{pedigrees}]\right) \]
where $h_t^2$, $h_q^2$, and $h_r^2$ are the total, quantitative trait nucleotide, and residual heritabilities, respectively. Using these results, fast and accurate analytical power analyses are possible, eliminating the need for computer simulation. Additional benefits of eigensimplification include a simple method for calculation of the exact distribution of the ELRT under the null hypothesis, which turns out to differ from that expected under the usual asymptotic theory. Further, when combined with the use of empirical GRKs, estimated over a large number of genetic markers, our theory reveals potential problems associated with non-positive semi-definite kernels. These procedures are being added to our general statistical genetic computer package, SOLAR.
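To illustrate how such closed-form expressions replace simulation, the sketch below evaluates the expected test statistics directly from a list of kernel eigenvalues. It rests on stated assumptions rather than on the SOLAR implementation: the heritability ELRT is taken as −Σ ln[1 + ĥ²(λ_gi − 1)], the unrelated-sample association term as −n·ln(1 − h_q²), and the pedigree association ELRT as that term minus the difference between the total and residual heritability ELRTs. Function names are hypothetical.

```python
import math


def elrt_heritability(h2, eigenvalues):
    """Assumed form: -sum_i ln[1 + h2 * (lambda_i - 1)]."""
    return -sum(math.log(1.0 + h2 * (lam - 1.0)) for lam in eigenvalues)


def elrt_association(h2_q, h2_t, h2_r, eigenvalues):
    """Assumed form for a sequence variant tested in pedigrees.

    h2_q: variance fraction due to the tested variant
    h2_t: total heritability, h2_r: residual heritability
    """
    n = len(eigenvalues)
    unrelateds = -n * math.log(1.0 - h2_q)
    return unrelateds - (
        elrt_heritability(h2_t, eigenvalues) - elrt_heritability(h2_r, eigenvalues)
    )


# 100 sib pairs: each pair's relationship kernel [[1, 0.5], [0.5, 1]]
# has eigenvalues 1.5 and 0.5.
sib_pair_eigenvalues = [1.5, 0.5] * 100
print(elrt_heritability(0.5, sib_pair_eigenvalues))
```

For unrelated individuals every eigenvalue is 1 and the heritability ELRT is zero, as expected when no relatives are available to separate genetic from residual variance. A negative eigenvalue, as can arise with an empirical kernel that is not positive semi-definite, can make the logarithm's argument non-positive at high heritability, in which case `math.log` raises an error.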
Plasma lipidomic studies using high performance liquid chromatography and mass spectroscopy offer detailed insights into metabolic processes. Taking the example of the most abundant plasma lipid class (phosphatidylcholines) we used the rich phenotypic and lipidomic data from the ongoing San Antonio Family Heart Study of large extended Mexican American families to assess the variability of association of the plasma phosphatidylcholine species with metabolic syndrome. Using robust statistical analytical methods, our study made two important observations. First, there was a wide variability in the association of phosphatidylcholine species with risk measures of metabolic syndrome. Phosphatidylcholine 40:7 was associated with a low risk while phosphatidylcholines 32:1 and 38:3 were associated with a high risk of metabolic syndrome. Second, all the odd chain phosphatidylcholines were associated with a reduced risk of metabolic syndrome implying that phosphatidylcholines derived from dairy products might be beneficial against metabolic syndrome. Our results demonstrate the value of lipid species-specific information provided by the upcoming array of lipidomic studies and open potential avenues for prevention and control of metabolic syndrome in high prevalence settings.
high performance liquid chromatography; mass spectroscopy; phosphatidylcholine; molecular biology
Several studies have identified genes associated with alcohol use disorders, but the variation in each of these genes explains only a small portion of the genetic vulnerability. The goal of the present study was to perform a genome-wide association study (GWAS) in extended families from the Collaborative Study on the Genetics of Alcoholism (COGA) to identify novel genes affecting risk for alcohol dependence. To maximize the power of the extended family design we used a quantitative endophenotype, measured in all individuals: number of alcohol dependence symptoms endorsed (symptom count). Secondary analyses were performed to determine if the single nucleotide polymorphisms (SNPs) associated with symptom count were also associated with the dichotomous phenotype, DSM-IV alcohol dependence. This family-based GWAS identified SNPs in C15orf53 that are strongly associated with DSM-IV alcohol dependence symptom count (p=4.5×10⁻⁸, inflation-corrected p=9.4×10⁻⁷). Results with DSM-IV alcohol dependence in the regions of interest support our findings with symptom count, though the associations were less significant. Attempted replications of the most promising association results were conducted in two independent samples: non-overlapping subjects from the Study of Addiction: Genes and Environment (SAGE) and the Australian twin-family study of alcohol use disorders (OZALC). Nominal association of C15orf53 with symptom count was observed in SAGE. The variant that showed the strongest association with symptom count, rs12912251, and its highly correlated variants (D′=1, r²≥0.95) have previously been associated with risk for bipolar disorder.
DSM-IV alcohol dependence symptoms; Family-based GWAS; C15orf53; Quantitative traits
Copy number variation (CNV) remains poorly defined in many populations, including Mexican Americans. We report the discovery and genetic confirmation of copy number variable regions (CNVRs) in subjects of the San Antonio Family Heart and the San Antonio Family Diabetes Gallbladder Studies, both composed of multigenerational pedigrees of Mexican American descent. In a discovery group of 1677 participants genotyped using Illumina Infinium Beadchips, we identified 2937 unique CNVRs, some with observation frequencies as low as 0.002, using a process that integrates pedigree information with CNV calls made by PennCNV and/or QuantiSNP. Quantitative copy number values had statistically significant (P≤1.792e-5) heritability estimates ranging from 0.139 to 0.863 for 2776 CNVRs. Additionally, 920 CNVRs showed evidence of linkage to their genomic location, providing strong genetic confirmation. Linked CNVRs were enriched in a set of independently identified CNVRs from a second group of 380 samples, confirming that these CNVRs can be used as predefined CNVRs of high confidence. Interestingly, we identified 765 putatively novel variants that do not overlap with the Database of Genomic Variants. This study is the first to use linkage and heritability in multigenerational pedigrees as a confirmation approach for the discovery of CNVRs, and the largest study to date investigating copy number variation on a genome-wide scale in individuals of Mexican American descent. These results provide insight into the structural variation present in Mexican Americans and show the strength of multigenerational pedigrees to elucidate structural variation in the human genome.
copy number variation; Mexican Americans; MODY5; pedigree CNVRs; pedigree
Intima-media thickness (IMT) of the common and internal carotid arteries is an established surrogate for atherosclerosis and predicts risk of stroke and myocardial infarction. Often IMT is measured as the average of these two arteries, yet they are believed to result from separate biological mechanisms. The aim of this study was to conduct a family-based genome-wide association study (GWAS) for IMT to identify polymorphisms influencing IMT and to determine if distinct carotid artery segments are influenced by different genetic components.
Methods and Results
IMT for the common and internal carotid arteries was determined through B-mode ultrasound in 772 Mexican Americans from the San Antonio Family Heart Study. A GWAS utilizing 931,219 single nucleotide polymorphisms (SNPs) was undertaken with six internal and common carotid artery IMT phenotypes utilizing an additive measured genotype model. The most robust association detected was for two SNPs (rs16983261, rs6113474, p=1.60e−7) in complete linkage disequilibrium on chromosome 20p11 for the internal carotid artery near wall, next to the gene PAX1. We also replicated previously reported GWAS regions on chromosomes 19q13 and 7q22. We found no overlapping associations between internal and common carotid artery phenotypes at p<5.0e−6. The genetic correlation between the two carotid IMT arterial segments was 0.51.
This study represents the first large scale GWAS of carotid IMT in a non-European population and identified several novel loci. We do not detect any shared GWAS signals between common and internal carotid arterial segments but the moderate genetic correlation implies both common and unique genetic components.
intima-media thickness; carotid artery; GWAS; Hispanics
The Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA) Consortium is a collaborative network of researchers working together on a range of large-scale studies that integrate data from 70 institutions worldwide. Organized into Working Groups that tackle questions in neuroscience, genetics, and medicine, ENIGMA studies have analyzed neuroimaging data from over 12,826 subjects. In addition, data from 12,171 individuals were provided by the CHARGE consortium for replication of findings, for a total of 24,997 subjects. By meta-analyzing results from many sites, ENIGMA has detected factors that affect the brain that no individual site could detect on its own, and that require larger numbers of subjects than any individual neuroimaging study has currently collected. ENIGMA’s first project was a genome-wide association study identifying common variants in the genome associated with hippocampal volume or intracranial volume. Continuing work is exploring genetic associations with subcortical volumes (ENIGMA2) and white matter microstructure (ENIGMA-DTI). Working groups also focus on understanding how schizophrenia, bipolar illness, major depression and attention deficit/hyperactivity disorder (ADHD) affect the brain. We review the current progress of the ENIGMA Consortium, along with challenges and unexpected discoveries made on the way.
Genetics; MRI; GWAS; Consortium; Meta-analysis; Multi-site
Increased serum uric acid (SUA) is a risk factor for gout and renal and cardiovascular disease (CVD). The purpose of this study was to identify genetic factors that affect the variation in SUA in 632 Mexican American participants of the San Antonio Family Heart Study (SAFHS). A genome-wide association (GWA) analysis was performed using the Illumina Human Hap 550K single nucleotide polymorphism (SNP) microarray. We used a linear regression-based association test under an additive model of allelic effect, while accounting for non-independence among family members via a kinship variance component. All analyses were performed in the software package SOLAR. SNPs rs6832439, rs13131257, and rs737267 in solute carrier protein 2 family, member 9 (SLC2A9) were associated with SUA at genome-wide significance (p < 1.3 × 10⁻⁷). The minor alleles of these SNPs had frequencies of 36.2, 36.2, and 38.2%, respectively, and were associated with decreasing SUA levels. All of these SNPs were located in introns 3–7 of SLC2A9, the location of the previously reported associations in European populations. When analyzed for association with cardiovascular-renal disease risk factors, conditional on SLC2A9 SNPs strongly associated with SUA, significant associations were found for SLC2A9 SNPs with BMI, body weight, and waist circumference (p < 1.4 × 10⁻³) and suggestive associations with albumin-creatinine ratio and total antioxidant status (TAS). The SLC2A9 gene encodes a urate transporter that has considerable influence on variation in SUA. In addition to the primary association locus, suggestive evidence (p < 1.9 × 10⁻⁶) for joint linkage/association (JLA) was found at a previously reported urate quantitative trait locus (logarithm of odds score = 3.6) on 3p26.3. In summary, our GWAS extends and confirms the association of SLC2A9 with SUA for the first time in a Mexican American cohort and also shows for the first time its association with cardiovascular-renal disease risk factors.
variance components decomposition approach; joint linkage/association analysis; kinship; hyperuricemia