1.  A Network-Based Kernel Machine Test for the Identification of Risk Pathways in Genome-Wide Association Studies 
Human heredity  2014;76(2):64-75.
Biological pathways provide rich information and biological context on the genetic causes of complex diseases. The logistic kernel machine test integrates prior knowledge on pathways in order to analyze data from genome-wide association studies (GWAS). Here, the kernel converts genomic information of two individuals to a quantitative value reflecting their genetic similarity. With the selection of the kernel one implicitly chooses a genetic effect model. Like many other pathway methods, none of the available kernels accounts for topological structure of the pathway or gene-gene interaction types. However, evidence indicates that connectivity and neighborhood of genes are crucial in the context of GWAS, because genes associated with a disease often interact. Thus, we propose a novel kernel that incorporates the topology of pathways and information on interactions. Using simulation studies, we demonstrate that the proposed method maintains the type I error correctly and can be more effective in the identification of pathways associated with a disease than non-network-based methods. We apply our approach to genome-wide association case control data on lung cancer and rheumatoid arthritis. We identify some promising new pathways associated with these diseases, which may improve our current understanding of the genetic mechanisms.
PMCID: PMC4026009  PMID: 24434848
Kernel Machine Test; Pathways; Networks; Gene-Gene Interactions; Score Test; Generalized Linear Model; Lung Cancer; Rheumatoid Arthritis; Disease Association; Genetic Association Studies
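To illustrate the statement above that a kernel converts the genomic information of two individuals into a single similarity value, here is a minimal sketch of a plain linear kernel on additively coded genotypes (0, 1, or 2 minor alleles). It is not the network-based kernel proposed in the paper; the function name, the genotype coding, and the toy data are assumptions made for illustration only.

```python
def linear_kernel(genotypes):
    """Return the n x n matrix K with K[i][j] = sum over SNPs of g_i * g_j.

    genotypes: list of per-individual lists of minor-allele counts (0, 1, 2).
    """
    n = len(genotypes)
    return [
        [sum(a * b for a, b in zip(genotypes[i], genotypes[j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: 3 individuals x 4 SNPs.
G = [
    [0, 1, 2, 0],
    [0, 1, 2, 0],
    [2, 0, 0, 1],
]
K = linear_kernel(G)
# Individuals 0 and 1 carry identical genotypes, so K[0][1] equals K[0][0];
# individuals 0 and 2 share no minor alleles at any SNP, so K[0][2] is zero.
```

Because the linear kernel sums products of allele counts, it corresponds to an additive effect model, which is one concrete instance of the abstract's point that selecting a kernel implicitly selects a genetic effect model.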
2.  Human Cardiovascular Disease IBC Chip-Wide Association with Weight Loss and Weight Regain in the Look AHEAD Trial 
Human heredity  2013;75(0):160-174.
The present study identified genetic predictors of weight change during behavioral weight loss treatment.
Participants were 3,899 overweight/obese individuals with type 2 diabetes from Look AHEAD, a randomized controlled trial to determine the effects of intensive lifestyle intervention (ILI), including weight loss and physical activity, relative to diabetes support and education, on cardiovascular outcomes. Analyses focused on associations of single nucleotide polymorphisms (SNPs) on the Illumina CARe iSelect (IBC) chip (minor allele frequency >5%; n = 31,959) with weight change at year 1 and year 4, and weight regain at year 4, among individuals who lost ≥ 3% at year 1.
Two novel regions of significant chip-wide association with year-1 weight loss in ILI were identified (p < 2.96E-06). ABCB11 rs484066 was associated with 1.16 kg higher weight per minor allele at year 1, whereas TNFRSF11A, or RANK, rs17069904 was associated with 1.70 kg lower weight per allele at year 1.
This study, the largest to date on genetic predictors of weight loss and regain, indicates that SNPs within ABCB11, related to bile salt transfer, and TNFRSF11A, implicated in adipose tissue physiology, predict the magnitude of weight loss during behavioral intervention. These results provide new insights into potential biological mechanisms and may ultimately inform weight loss treatment.
PMCID: PMC4257841  PMID: 24081232
Type 2 diabetes; Obesity; Weight loss; Diet; Genetics
3.  A Rapid Genome-wide Gene-based Association Test with Multivariate Traits 
Human heredity  2013;76(2):53-63.
A gene-based genome-wide association study (GWAS) provides a powerful alternative to the traditional single SNP association analysis due to its substantial reduction in the multiple testing burden and possible gain in power due to modeling multiple SNPs within a gene. A gene-based association analysis on multivariate traits is often of interest, but poses substantial analytical as well as computational challenges when implemented at a genome-wide level.
We have proposed a rapid implementation of the multivariate multiple linear regression approach (RMMLR) in unrelated individuals as well as in families. Our approach allows for covariates. Moreover, the asymptotic distribution of the test statistic is not heavily influenced by the linkage disequilibrium (LD) among the SNPs and hence can be used efficiently to perform a gene-based GWAS. We have developed a corresponding R package to implement such multivariate gene-based GWAS with this RMMLR approach.
We compare several approaches for both single and multivariate traits through extensive simulation. Our RMMLR maintains the correct type I error level even for sets of SNPs in strong LD. It also has a substantial gain in power to detect a gene when it is associated with a subset of the traits. We have also studied the performance of these approaches on the Minnesota Center for Twin Family Research dataset.
In our overall comparison, our RMMLR approach provides an efficient and powerful tool to perform a gene-based GWAS with single or multivariate traits and maintains the type I error appropriately.
PMCID: PMC4228787  PMID: 24247328
Multivariate regression; Gene-based genome-wide association studies; Multivariate trait
4.  Genetic admixture and obesity: recent perspectives and future applications 
Human heredity  2013;75(0):10.1159/000353180.
The process of the colonization of the New World that occurred centuries ago served as a natural experiment, creating unique combinations of genetic material in newly formed admixed populations. The identification and genotyping of ancestry informative markers (AIMs) have allowed for the estimation of proportions of ancestral parental populations among individuals in a sample through the genetic admixture approach. These admixture estimates have been used in different ways to understand the genetic contributions to individual variation in obesity and body composition parameters, particularly among diverse admixed groups known to differ in obesity prevalence within the United States. Although progress has been made through the use of genetic admixture approaches, future investigations are needed in order to explore the interaction of environmental factors with the degree of genetic ancestry in individuals. A challenge to confront at this time would be to further stratify and define environments in progressively more granular terms, including nutrients, muscle biology, stress responses at the cellular level, and the social and built environments.
PMCID: PMC3884567  PMID: 24081225
Genetic admixture; obesity; body composition; race/ethnicity; Ancestry Informative Markers
5.  Identification of Pleiotropic Genetic Effects on Obesity and Brain Anatomy 
Human heredity  2013;75(0):136-143.
Obesity is a major contributor to the global burden of chronic disease and disability, though current knowledge of causal biologic underpinnings is lacking. Through the regulation of energy homeostasis and interactions with adiposity and gut signals, the brain is thought to play a significant role in the development of this disorder. While neuroanatomic variation has been associated with obesity, it is unclear if this relationship is influenced by common genetic mechanisms. In this study, we sought genetic components that influence both brain anatomy and body mass index (BMI) to provide further insight into the role of the brain in energy homeostasis and obesity.
MRI images of brain anatomy were acquired in 839 Mexican American individuals from large extended pedigrees. Bivariate linkage and quantitative analyses were performed in SOLAR.
Genetic factors associated with increased BMI were also associated with reduced cortical surface area and subcortical volume. We identified two genome-wide quantitative trait loci that influenced BMI and ventral diencephalon volume, and BMI and supramarginal gyrus surface area, respectively.
This study represents the first genetic analysis seeking evidence of pleiotropic effects acting on both brain anatomy and BMI. Results suggest that a region on chromosome 17 contributes to the development of obesity, potentially through leptin-induced signaling in the hypothalamus, and that a region on chromosome 3 appears to jointly influence food-related reward circuitry and the supramarginal gyrus.
PMCID: PMC3889074  PMID: 24081229
BMI; obesity; imaging; brain; pleiotropy
6.  Next-Generation Sequence Analysis of Genes Associated with Obesity and Nonalcoholic Fatty Liver Disease-Related Cirrhosis in Extreme Obesity 
Human heredity  2013;75(0):144-151.
Genome-wide association studies (GWAS) have led to the identification of single nucleotide polymorphisms in or near several loci that are associated with the risk of obesity and nonalcoholic fatty liver disease (NAFLD). We hypothesized that missense variants in GWAS and related candidate genes may underlie cases of extreme obesity and NAFLD-related cirrhosis, an extreme manifestation of NAFLD.
We performed whole-exome sequencing on 6 Caucasian patients with extreme obesity [mean body mass index (BMI) 84.4] and 4 obese Caucasian patients (mean BMI 57.0) with NAFLD-related cirrhosis.
Sequence analysis was performed on 24 replicated GWAS and selected candidate obesity genes and 5 loci associated with NAFLD. No missense variants were identified in 19 of the 29 genes analyzed, although all patients carried at least 2 missense variants in the remaining genes without excess homozygosity. One patient with extreme obesity carried 2 novel damaging mutations in BBS1 and was homozygous for benign and damaging MC3R variants. In addition, 1 patient with NAFLD-related cirrhosis was compound heterozygous for rare damaging mutations in PNPLA3.
These results indicate that analyzing candidate loci previously identified by GWAS analyses using whole-exome sequencing is an effective strategy to identify potentially causative missense variants underlying extreme obesity and NAFLD-related cirrhosis.
PMCID: PMC3981063  PMID: 24081230
Extreme obesity; Nonalcoholic fatty liver disease; Cirrhosis; Genome-wide association studies; Whole-exome sequencing
7.  Propagation of Obesity Across Generations: The Roles of Differential Realized Fertility and Assortative Mating by Body Mass Index 
Human heredity  2013;75(0):204-212.
To quantify the extent to which the increase in obesity observed across recent generations of the American population is associated with the individual or combined effects of assortative mating for body mass index (BMI; kg/m²) and differential realized fertility by BMI.
A Monte Carlo framework is formed and informed using data collected from the National Longitudinal Survey of Youth (NLSY). The model has two portions, one that generates childbirth events on an annual basis and another that produces a BMI for each child. Once the model is informed using the data, a reference distribution of offspring BMIs is simulated. We quantify the effects of our factors of interest by removing them from the model and comparing the resulting offspring BMI distributions with that of the baseline scenario.
An association between maternal BMI and number of offspring is evidenced in the NLSY data, as well as the presence of assortative mating. These two factors combined are associated with increased mean BMI (+0.067, C.I. [0.056, 0.078]), increased BMI variance (+0.578, C.I. [0.418, 0.736]) and increased prevalence of obesity (RR 1.032, 95% C.I. [1.023, 1.041]) and BMIs over 40 (RR 1.083, 95% C.I. [1.053, 1.118]) among offspring.
Our investigation suggests that both differential realized fertility and assortative mating by BMI appear to play a role in the increasing prevalence of obesity in America.
PMCID: PMC4010105  PMID: 24081235
Obesity; Body Mass Index; Assortative Mating; Realized Fertility; Monte Carlo Simulation
8.  The positive association of obesity variants with adulthood adiposity strengthens over an 80-year period: A gene-by-birth year interaction 
Human heredity  2013;75(0):175-185.
To test the hypothesis that the statistical effect of obesity-related genetic variants on adulthood adiposity traits depends on birth year.
The study sample included 907 related, non-Hispanic White participants in the Fels Longitudinal Study, born between 1901 and 1986, and aged 25–64.99 years (474 females; 433 males) at the time of measurement. All had both genotype data, from which a genetic risk score (GRS) composed of 32 well-replicated obesity-related common single nucleotide polymorphisms was created, and phenotype data (including body mass index (BMI), waist circumference, and the sum of four subcutaneous skinfolds). Maximum likelihood-based variance components analysis was used to estimate trait heritabilities, main effects of GRS and birth year, GRS-by-birth year interaction, sex, and age.
Positive GRS-by-birth year interaction effects were found for BMI (p<0.001), waist circumference (p=0.007), and skinfold thickness (p<0.007). For example, each one-allele increase in GRS was estimated to result in a 0.16 kg/m² increase in BMI among males born in 1930 compared to a 0.47 kg/m² increase among those born in 1970.
These novel findings suggest the influence of common obesity susceptibility variants has increased during the obesity epidemic.
PMCID: PMC4091039  PMID: 24081233
gene; genetic; heritability; risk score; obesity; BMI; adiposity; waist circumference; interaction; gene-by-environment interaction; secular trend; single nucleotide polymorphism (SNP)
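As a worked illustration of what a GRS-by-birth year interaction means, the sketch below interpolates the per-allele BMI effect between the two values reported above for males (0.16 kg/m² for those born in 1930, 0.47 kg/m² for those born in 1970). It assumes the per-allele effect changes linearly with birth year, which is an assumption of this sketch and not a fitted model from the study; the function name and defaults are hypothetical.

```python
def per_allele_bmi_effect(birth_year,
                          year_a=1930, effect_a=0.16,
                          year_b=1970, effect_b=0.47):
    """Linearly interpolate the per-allele BMI effect (kg/m^2) by birth year.

    The two anchor points are the values quoted in the abstract for males;
    the straight line between them is an illustrative assumption.
    """
    slope = (effect_b - effect_a) / (year_b - year_a)  # change per birth year
    return effect_a + slope * (birth_year - year_a)


# Implied change in the per-allele effect for each later year of birth.
slope_per_year = (0.47 - 0.16) / (1970 - 1930)
effect_1950 = per_allele_bmi_effect(1950)
```

Under this assumption the per-allele effect grows by about 0.008 kg/m² for each later year of birth, so the value implied for a birth year midway between the two anchors is simply their average.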
9.  Linkage of Type 2 Diabetes on Chromosome 9p24 in Mexican Americans: Additional Evidence from the Veterans Administration Genetic Epidemiology Study (VAGES) 
Human heredity  2013;76(1):36-46.
Type 2 diabetes (T2DM) is a complex metabolic disease and is more prevalent in certain ethnic groups such as the Mexican Americans. The goal of our study was to perform a genome-wide linkage analysis to localize T2DM susceptibility loci in Mexican Americans.
We used the phenotypic and genotypic data from 1,122 Mexican American individuals (307 families) who participated in the Veterans Administration Genetic Epidemiology Study (VAGES). Genome-wide linkage analysis was performed, using the variance components approach. Data from two additional Mexican American family studies, the San Antonio Family Heart Study (SAFHS) and the San Antonio Family Diabetes/Gallbladder Study (SAFDGS), were combined with the VAGES data to test for improved linkage evidence.
After adjusting for covariate effects, T2DM was found to be under significant genetic influences (h² = 0.62, P = 2.7 × 10⁻⁶). The strongest evidence for linkage of T2DM occurred between markers D9S1871 and D9S2169 on chromosome 9p24.2-p24.1 (LOD = 1.8). Given that we had previously reported suggestive evidence for linkage of T2DM at this region in SAFDGS, we performed genome-wide linkage analysis of the VAGES data combined with the SAFHS and SAFDGS data and found significant and increased linkage evidence (LOD = 4.3, empirical P = 1.0 × 10⁻⁵, genome-wide P = 1.6 × 10⁻³) for T2DM at the same chromosomal region.
Significant T2DM linkage evidence was found on chromosome 9p24 in Mexican Americans. Importantly, the chromosomal region of interest in this study overlaps with several recent genome-wide association studies (GWASs) involving T2DM related traits. Given its overlap with such findings and our own initial T2DM association findings in the 9p24 chromosomal region, high throughput sequencing of the linked chromosomal region could identify the potential causal T2DM genes.
PMCID: PMC3919448  PMID: 24060607
Type 2 diabetes; Linkage; Chromosome 9p24; Mexican Americans; VAGES
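For readers who want to relate the LOD scores above to pointwise p-values, the following sketch applies the standard asymptotic conversion for a one-parameter linkage test: 2·ln(10)·LOD is treated as a 50:50 mixture of a point mass at zero and a chi-square variable with 1 degree of freedom. The study reports empirical p-values, so this approximation is an assumption of the sketch and is not expected to reproduce the published figures exactly; the function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod):
    """Asymptotic pointwise p-value for a non-negative LOD score.

    Uses chi2 = 2 * ln(10) * LOD and the one-sided mixture
    p = 0.5 * P(chi-square_1 > chi2) = 0.5 * erfc(sqrt(chi2 / 2)).
    """
    if lod < 0:
        raise ValueError("LOD score must be non-negative")
    chi2 = 2.0 * math.log(10.0) * lod
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


# LOD scores quoted in the abstract: VAGES alone, then the combined data.
p_vages = lod_to_pointwise_p(1.8)
p_combined = lod_to_pointwise_p(4.3)
```

The conversion makes the scale of the improvement concrete: moving from LOD = 1.8 to LOD = 4.3 lowers the approximate pointwise p-value by more than two orders of magnitude.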
10.  Intra-familial tests of association between Familial Idiopathic Scoliosis and markers on 9q31.3-q34.3 and 16p12.3-q22.2 
Human heredity  2012;74(1):36-44.
Custom genotyping of markers in families with Familial Idiopathic Scoliosis (FIS) was used to fine-map candidate regions on chromosomes 9 and 16 in order to identify candidate genes that contribute to this disorder and prioritize them for next generation sequence analysis.
Candidate regions on 9q and 16p–16q, previously identified as linked to FIS in a study of 202 families, were genotyped with a high-density map of single nucleotide polymorphisms (SNPs). Tests of linkage for fine-mapping and intra-familial tests of association, including tiled regression, were performed on scoliosis as both a qualitative and quantitative trait.
Results and Conclusions
Nominally significant linkage results were found for markers in both candidate regions. Results from intra-familial tests of association and tiled regression corroborated the linkage findings and identified possible candidate genes suitable for follow-up with next generation sequencing in these same families. Candidate genes that met our prioritization criteria included FAM129B and CERCAM on chromosome 9 and SYT1, GNAO1, and CDH3 on chromosome 16.
PMCID: PMC4123546  PMID: 23154503
idiopathic scoliosis; chromosome 9q; chromosome 16; genetic heterogeneity; genetics; association; family-based association study; complex disease
11.  Joint Analysis for Integrating Two Related Studies of Different Data Types and Different Study Designs Using Hierarchical Modeling Approaches 
Human heredity  2013;74(2):83-96.
A chronic disease such as asthma is the result of a complex sequence of biological interactions involving multiple genes and pathways in response to a multitude of environmental exposures. However, methods to model jointly all factors are still evolving. Some of the current challenges include how to integrate knowledge from different data types and different disciplines, as well as how to utilize relevant external information such as gene annotation to identify novel disease genes and gene-environment interactions.
Using a Bayesian hierarchical modeling framework, we developed two alternative methods for joint analysis of an epidemiologic study of a disease endpoint and an experimental study of intermediate phenotypes, while incorporating external information.
Our simulation studies demonstrated superior performance of the proposed hierarchical models compared to separate analysis with the standard single-level regression modeling approach. The combined analyses of the Southern California Children's Health Study and challenge study data suggest that these joint analytical methods detected more significant genetic main and gene-environment interaction effects than the conventional analysis.
The proposed prior framework is very flexible and can be generalized for an integrative analysis of diverse sources of relevant biological data.
PMCID: PMC4106669  PMID: 23343600
Bayesian hierarchical modeling; Biological related studies; Data integration; Gene-environment interaction; Joint analysis; Markov-chain Monte Carlo (MCMC) methods; Prior knowledge
12.  Incorporating Prior Biologic Information for High Dimensional Rare Variant Association Studies 
Human heredity  2013;74(0):184-195.
Given the increasing scale of rare variant association studies, we introduce a method for high-dimensional studies that integrates multiple sources of data as well as allows for multiple region-specific risk indices.
Our method builds upon the previous Bayesian risk index (BRI) by integrating external biological variant-specific covariates to help guide the selection of associated variants and regions. Our extension also incorporates a second-level of uncertainty as to which regions are associated with the outcome of interest.
Using a set of study-based simulations, we show that our approach leads to an increase in power to detect true associations in comparison to several commonly used alternatives. Additionally, the method provides multi-level inference at the pathway, region and variant levels.
To demonstrate the flexibility of the method to incorporate various types of information and its applicability to high-dimensional data, we apply our method to a single region within a candidate gene study of second primary breast cancer and to multiple regions within a candidate pathway study of colon cancer.
PMCID: PMC4058572  PMID: 23594496
genetic association studies; Bayesian model uncertainty; Bayes factors; sequence analysis; rare variant analysis
13.  Obtaining Accurate P-values from a Dense SNP Linkage Scan 
Human heredity  2012;74(1):12-16.
A major concern of resequencing studies is that the pathogenicity of most mutations is difficult to predict. To address this concern, linkage (i.e. co-segregation) is often used to exclude mutations, and to better predict pathogenicity among the candidate mutations that remain. However, when linkage disequilibrium (LD) is present in the population but ignored in the analysis, unlinked regions of high LD can provide false evidence for linkage. As a result, the type 1 error of most linkage tests can be inflated, and thousands of neutral mutations may be mistakenly included in a follow-up resequencing study. To illustrate the need for concern, we simulated data on a sparsely spaced panel of SNPs (average spacing 1.27 cM) using an LD pattern estimated from real data. In our simulations, we find that the type 1 error of the maximum LOD can be as high as 14%. Therefore, to control the type 1 error of linkage tests across a wide range of study designs, we created Haplodrop—a fast and flexible simulation program that generates the haplotypes of founders with LD and then ‘drops’ these haplotypes with recombination to all non-founders in the pedigree. Haplodrop agrees well with existing software, accommodates arbitrary pedigree structures, and scales easily to the whole genome. Moreover, by correctly excluding mutations that lie in unlinked regions of high LD, Haplodrop should help reduce the multiple testing burden of resequencing studies.
PMCID: PMC4034466  PMID: 23038223
Type I error; linkage analysis; next-generation sequencing; linkage disequilibrium
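The "dropping" step described above can be sketched as follows: each non-founder receives one gamete from each parent, and each gamete is built by walking along the markers and switching between the parent's two haplotypes whenever a recombination occurs. This is a minimal sketch of generic gene dropping, not Haplodrop itself; it takes founder haplotypes as given (Haplodrop generates them with LD), and all names and the pedigree format are assumptions for illustration.

```python
import random


def transmit(hap_pair, recomb_probs, rng):
    """Form one gamete from a parent's pair of haplotypes.

    recomb_probs[k] is the recombination probability between
    marker k and marker k + 1, so it has one entry fewer than a haplotype.
    """
    strand = rng.randrange(2)  # which parental haplotype to start on
    gamete = [hap_pair[strand][0]]
    for k, theta in enumerate(recomb_probs, start=1):
        if rng.random() < theta:  # crossover between markers k-1 and k
            strand = 1 - strand
        gamete.append(hap_pair[strand][k])
    return gamete


def gene_drop(pedigree, founder_haps, recomb_probs, seed=0):
    """Drop founder haplotypes through a pedigree.

    pedigree: list of (person, father, mother), parents listed before
    their children; founders have None for both parents.
    founder_haps: dict mapping founder id -> (haplotype, haplotype).
    """
    rng = random.Random(seed)
    haps = dict(founder_haps)
    for person, father, mother in pedigree:
        if father is None and mother is None:
            continue  # founder: haplotypes supplied by the caller
        haps[person] = (
            transmit(haps[father], recomb_probs, rng),
            transmit(haps[mother], recomb_probs, rng),
        )
    return haps


# Hypothetical trio with three markers and no recombination.
trio = [("dad", None, None), ("mom", None, None), ("kid", "dad", "mom")]
founders = {"dad": ([0, 0, 0], [1, 1, 1]), "mom": ([0, 1, 0], [1, 0, 1])}
dropped = gene_drop(trio, founders, recomb_probs=[0.0, 0.0])
```

Repeating the drop many times with the observed pedigree structure and recomputing the linkage statistic on each replicate is one way to obtain an empirical null distribution for the maximum LOD.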
14.  Population genetics of rare variants and complex diseases 
Human heredity  2013;74(0):118-128.
Identifying drivers of complex traits from the noisy signals of genetic variation obtained from high throughput genome sequencing technologies is a central challenge faced by human geneticists today. We hypothesize that the variants involved in complex diseases are likely to exhibit non-neutral evolutionary signatures. Uncovering the evolutionary history of all variants is therefore of intrinsic interest for complex disease research. However, doing so necessitates the simultaneous elucidation of the targets of natural selection and population-specific demographic history.
Here we characterize the action of natural selection operating across complex disease categories, and use population genetic simulations to evaluate the expected patterns of genetic variation in large samples. We focus on populations that have experienced historical bottlenecks followed by explosive growth (consistent with most human populations), and describe the differences between evolutionarily deleterious mutations and those that are neutral.
Genes associated with several complex disease categories exhibit stronger signatures of purifying selection than non-disease genes. In addition, loci identified through genome-wide association studies of complex traits also exhibit signatures consistent with being in regions recurrently targeted by purifying selection. Through simulations, we show that population bottlenecks and rapid growth enable deleterious rare variants to persist at low frequencies just as long as neutral variants, but low frequency and common variants tend to be much younger than neutral variants. This has resulted in a large proportion of modern-day rare alleles that have a deleterious effect on function, and that potentially contribute to disease susceptibility.
The key question for sequencing-based association studies of complex traits is how to distinguish between deleterious and benign genetic variation. We used population genetic simulations to uncover patterns of genetic variation that distinguish these two categories, especially derived allele age, thereby providing inroads into novel methods for characterizing rare genetic variation driving complex diseases.
PMCID: PMC3698246  PMID: 23594490
Natural selection; deleterious; simulation; population genetics; rare variants
16.  Identification of rare variants from exome sequence in a large pedigree with autism 
Human heredity  2013;74(0):153-164.
We carried out analyses with the goal of identifying rare variants in exome sequence data that contribute to disease risk for a complex trait. We analyzed a large, 47-member multigenerational pedigree with 11 cases of autism spectrum disorder, using genotypes from three technologies representing increasing resolution: a multiallelic linkage marker panel; a dense diallelic marker panel; and variants from exome sequencing. Genome-scan marker genotypes were available on most subjects, and exome sequence data was available on 5 subjects. We used genome-scan linkage analysis to identify and prioritize the chromosome 22 region of interest, and to select subjects for exome sequencing. Inheritance vectors (IVs) generated by Markov chain Monte Carlo analysis of multilocus marker data were the foundation of most analyses. Genotype imputation used IVs to determine which sequence variants reside on the haplotype that co-segregates with the autism diagnosis. Together with a rare-allele frequency filter, we identified only one rare variant on the risk haplotype, illustrating the potential of this approach to prioritize variants. The associated gene, MYH9, is biologically unlikely, and we speculate that for this complex trait, the key variants may lie outside the exome.
PMCID: PMC3722055  PMID: 23594493
Imputation; inheritance vector; linkage analysis; haplotype; MCMC
17.  Single variant and multi-variant trend tests for genetic association with next generation sequencing that are robust to sequencing error 
Human heredity  2013;74(0):10.1159/000346824.
As with any new technology, next generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing.
Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification.
The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model, based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to that data.
We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power is approximately the same for all three statistics in the presence of non-differential error.
Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs.
Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single variant simulation results, most probably due to our specification of multi-variant SNP correlation values.
In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies: first, it allows for differential misclassification when computing the statistic; and second, it addresses the multiple-testing issue in that there is a multi-variant form of the statistic that has only one degree of freedom and provides a single p-value, no matter how many loci are tested.
PMCID: PMC3863939  PMID: 23594495
next gen; rare variant; trend test; genetic association; GWAS; allele; locus
18.  Genetic Variation at NCAN Locus is Associated with Inflammation and Fibrosis in Non-alcoholic Fatty Liver Disease in Morbid Obesity 
Human heredity  2013;75(1):10.1159/000346195.
Obesity-associated non-alcoholic fatty liver disease (NAFLD) may cause liver dysfunction and failure. In a previously reported genome-wide association meta-analysis, single nucleotide polymorphisms (SNPs) near PNPLA3, NCAN, GCKR, LYPLAL1 and PPP1R3B were associated with NAFLD and with distinctive serum lipid profiles. The present study examined the relevance of these variants to NAFLD in extreme obesity.
In 1,092 bariatric patients, the candidate SNPs were genotyped and association analyses with liver histology and serum lipids were performed.
We replicated the association of hepatosteatosis with PNPLA3 rs738409[G] and with NCAN rs2228603[T]. We also replicated the association of rs2228603[T] with hepatic inflammation and fibrosis. Rs2228603[T] was associated with lower serum LDL, total cholesterol and triglycerides. After stratification by the presence or absence of NAFLD, these associations were present predominantly in the subgroup with NAFLD.
NCAN rs2228603[T] is a risk factor for liver inflammation and fibrosis, suggesting that this locus is responsible for progression from steatosis to steatohepatitis. In this bariatric cohort, rs2228603[T] was associated with low serum lipids only in patients with NAFLD. This supports a NAFLD model in which the liver may sequester triglycerides as a result of increased triglyceride uptake and/or decreased lipolysis.
PMCID: PMC3864002  PMID: 23594525
Obesity; dyslipidemia; steatohepatitis; cirrhosis; steatosis
19.  A Rapid Association Test Procedure Robust under Different Genetic Models Accounting for Population Stratification 
Human heredity  2013;75(1):23-33.
For genome-wide association studies (GWAS) in case-control data with stratification, a commonly used association test is the generalized Armitage (GA) trend test implemented in the software EIGENSTRAT. The GA trend test uses principal component analysis to correct for population stratification. It usually assumes an additive disease model and can have high power when the underlying disease model is additive or multiplicative, but may have relatively low power when the underlying disease model is recessive or dominant. The purpose of this paper is to provide a test procedure for GWAS with increased power over the GA trend test under the recessive and dominant models while maintaining the power of the GA trend test under the additive and multiplicative models.
We extend a Hardy-Weinberg disequilibrium (HWD) trend test for a homogeneous population to account for population stratification, and then propose a robust association test procedure for GWAS that incorporates information from the extended HWD trend test into the GA trend test.
Results and Conclusions
Our simulation studies and application of our method to a GWAS data set indicate that our proposed method can achieve the purpose described above.
PMCID: PMC3786013  PMID: 23571404
generalized sequential Bonferroni procedure; genome-wide association studies; Hardy-Weinberg trend test; robust test; recessive model
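As background for the trend tests discussed above, the sketch below implements the basic Cochran-Armitage trend test for a single SNP in a homogeneous case-control sample, using additive scores (0, 1, 2). It is not the generalized Armitage test from EIGENSTRAT and does not correct for population stratification, nor is it the robust procedure proposed in the paper; the function name and the toy counts are assumptions for illustration.

```python
import math


def armitage_trend_test(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts.

    cases, controls: counts for genotypes with 0, 1, 2 risk alleles.
    Returns (chi-square statistic with 1 df, two-sided p-value).
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls
    sum_xr = sum(x * r for x, r in zip(scores, cases))
    sum_xn = sum(x * t for x, t in zip(scores, totals))
    sum_x2n = sum(x * x * t for x, t in zip(scores, totals))
    denominator = n_cases * n_controls * (n * sum_x2n - sum_xn ** 2)
    if denominator == 0:
        raise ValueError("test undefined: no genotype or phenotype variation")
    chi2 = n * (n * sum_xr - n_cases * sum_xn) ** 2 / denominator
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts with a clear additive trend in the cases.
chi2_trend, p_trend = armitage_trend_test((10, 20, 30), (30, 20, 10))
# Identical genotype distributions give a statistic of zero.
chi2_null, p_null = armitage_trend_test((10, 20, 30), (10, 20, 30))
```

Changing the scores to (0, 0, 1) or (0, 1, 1) targets a recessive or dominant model respectively, which shows why a test fixed at additive scores can lose power under those models.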
20.  Efficient simulation of epistatic interactions in case-parent trios 
Human heredity  2013;75(1):12-22.
Statistical approaches to evaluate interactions between single nucleotide polymorphisms (SNPs) and SNP-environment interactions are of great importance in genetic association studies, as susceptibility to complex disease might be related to the interaction of multiple SNPs and/or environmental factors. With these methods under active development, algorithms to simulate genomic data sets are needed, to ensure proper type I error control of newly proposed methods, and to compare power with existing methods. In this manuscript we propose an efficient method for a haplotype-based simulation of case-parent trios, when the disease risk is thought to depend on possibly higher order epistatic interactions, or gene-environment interactions with binary exposures.
PMCID: PMC3800020  PMID: 23548797
Case-parent trios; interactions; single nucleotide polymorphisms; haplotypes
21.  Imputation of rare variants in next generation association studies 
Human heredity  2013;74(0):196-204.
The role of rare variants has become a focus in the search for association with complex traits. Imputation is a powerful and cost-efficient tool to access variants that have not been directly typed, but there are several challenges when imputing rare variants, most notably reference panel selection. Extensions to rare variant association tests to incorporate genotype uncertainty from imputation are discussed, as well as the use of imputed low frequency and rare variants in the study of population isolates.
PMCID: PMC3954458  PMID: 23594497
association test; low frequency variant; reference panel; sequencing
22.  A Novel Kernel for Correcting Size Bias in the Logistic Kernel Machine Test with an Application to Rheumatoid Arthritis 
Human heredity  2013;74(2):97-108.
The logistic kernel machine test (LKMT) is a testing procedure tailored towards high-dimensional genetic data. Its use in pathway analyses of GWA case-control studies results from its computational efficiency and its flexibility in incorporating additional information via the kernel. The kernel can be any positive definite function; unfortunately, its form strongly influences the power and bias. Most authors have recommended the use of the simple linear kernel. We demonstrate via a simulation that, when this kernel is applied, the probability of rejecting the null hypothesis of no association just by chance increases with the number of SNPs or genes in the pathway.
We propose a novel kernel that includes an appropriate standardization, in order to protect against any inflation of false positive results. Moreover, our novel kernel contains information on gene membership of SNPs in the pathway.
In an application to data from the NARAC Rheumatoid Arthritis Consortium, we find that even this basic genomic structure can improve the ability of the LKMT to identify meaningful associations. We also demonstrate that the standardization effectively eliminates problems with size bias.
We recommend the use of our standardized kernel and urge caution when using non-adjusted kernels in the LKMT to conduct pathway analysis.
PMCID: PMC3779069  PMID: 23466369
Logistic Kernel Machine Regression; Size Bias; Pathway Analysis; GWAS; Rheumatoid Arthritis
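As a point of reference, the simple linear kernel mentioned in the abstract above can be sketched as follows. The sketch assumes genotypes coded as minor-allele counts (0, 1, 2), one row per individual and one column per SNP. The function name is an assumption for this example, and the authors' standardized kernel is not specified in the abstract, so it is not reproduced here.

```python
def linear_kernel(genotypes):
    """Linear kernel: K[i][j] is the inner product of the genotype vectors
    of individuals i and j (rows of `genotypes`, coded 0/1/2 per SNP)."""
    n = len(genotypes)
    return [[sum(a * b for a, b in zip(genotypes[i], genotypes[j]))
             for j in range(n)]
            for i in range(n)]


# Three individuals typed at three SNPs.
pathway = [[0, 1, 2],
           [2, 1, 0],
           [1, 1, 1]]
K = linear_kernel(pathway)

# Each kernel entry is a sum over SNPs, so it grows with pathway size:
# listing every SNP twice doubles every entry.
K_doubled = linear_kernel([row + row for row in pathway])
```

Because each entry is an unscaled sum over SNPs, the scale of the kernel matrix depends on how many SNPs the pathway contains.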
23.  Estimating the Range of Obesity Treatment Response Variability in Humans: Methods and Illustrations 
Human heredity  2013;75(0):127-135.
The rising prevalence of human obesity worldwide has focused research on a variety of interventions that result in highly varied degrees of weight loss (WL). The advent of genomic testing has quantified estimates of both the contribution of genetic factors to the development of obesity as well as racial/ethnic variation of risk alleles across sub-populations. More recent studies have examined genetic associations with effectiveness of WL interventions, but to date are unable to explain a large proportion of the variance observed.
We describe and provide two illustrations of statistical methods to estimate upper and lower bounds of WL treatment response heterogeneity (TRH) in the absence of genotypic data, using published summary statistics and a raw dataset from weight loss studies.
Thirty-two studies had some evidence of a positive mean treatment effect with respect to the control intervention. Twelve of these 32 studies exhibited WL TRH. Of these 12, three exhibited an estimated proportion of >5% of the sampled population having an outcome opposite the mean effect. In the raw dataset, bounds estimations for change in waist circumference revealed tighter ranges in men than women.
Future studies may be able to take advantage of multiple approaches, including the method we describe, to identify and quantify the presence of TRH in studies of WL or related outcomes.
PMCID: PMC3906849  PMID: 24081228
obesity; diet; exercise; human variability; treatment heterogeneity; tightening bounds
24.  Regression Modeling of Allele Frequencies and Testing Hardy Weinberg Equilibrium 
Human heredity  2013;74(2):71-82.
Tests for whether observed genotype proportions fit Hardy Weinberg Equilibrium (HWE) are widely used in population genetics analyses, as well as to evaluate quality of genotype data. To date, all methods testing for HWE require subjects to be classified into discrete categories, yet it is becoming clear that the distribution of allele frequencies tends to be smooth over geographic regions.
To evaluate the HWE assumption, we develop new approaches to model allele frequencies as functions of covariates, and use these models to test whether there is residual correlation between the two alleles of subjects; lack of residual correlation supports the null hypothesis of HWE, but conditional on how the covariates influence the allele frequencies.
By simulations, we illustrate that a simple statistical test of residual correlation of alleles adequately controls the Type-I error rate, while maintaining power that is comparable to standard tests for HWE.
Our approach can be implemented in standard software, enabling more flexible and powerful ways to evaluate the association of covariates with allele frequencies and whether these associations “explain” departures from HWE when the covariates are ignored. This opens new strategies to evaluate the quality of genotype data generated by next-generation sequencing assays.
PMCID: PMC3708318  PMID: 23328647
logistic regression; over-dispersion quasi-likelihood; residual correlation
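For comparison, the standard test on discrete genotype categories that the abstract above contrasts with its regression approach can be sketched as below. This is the classical one-degree-of-freedom chi-square test for HWE, not the authors' covariate-based method. The function name is an assumption for this example. The returned coefficient f is the usual moment estimate of the correlation between a subject's two alleles when no covariates are modelled, and the statistic equals N times f squared.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Classical chi-square test of Hardy Weinberg Equilibrium.

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    Returns (chi-square statistic with 1 df, allele correlation f, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 0.0, 0.0, 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 0.0, 0.0, 1.0  # monomorphic marker: nothing to test

    # Genotype counts expected under HWE.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # f > 0 means fewer heterozygotes than expected, f < 0 means more.
    f = 1.0 - n_ab / expected[1]
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df
    return chi2, f, p_value
```

For example, hwe_chi_square(40, 20, 40) reports a strong heterozygote deficit, while hwe_chi_square(25, 50, 25) matches the HWE expectation exactly.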
25.  The Robustness of Generalized Estimating Equations for Association Tests in Extended Family Data 
Human heredity  2012;74(1):17-26.
Variance-component analysis (VCA), the traditional method for handling correlations within families in genetic association studies, is computationally intensive for genome-wide analyses, and the computational burden of VCA, a likelihood-based test, increases with family size and the number of genetic markers. Alternative approaches that do not require the computation of familial correlations are preferable, provided that they do not inflate type I error or decrease power. We performed a simulation study to evaluate practical alternatives to VCA that use regression with generalized estimating equations (GEE) in extended family data. We compared the properties of linear regression with GEE applied to an entire extended family structure (GEE-EXT) and GEE applied to nuclear family structures split from these extended families (GEE-SPL) to variance-components likelihood-based methods (FastAssoc). GEE-EXT was evaluated with and without robust variance estimators to estimate the standard errors. We observed similar average type I error rates from GEE-EXT and FastAssoc compared to GEE-SPL. Type I error rates for the GEE-EXT method with a robust variance estimator were marginally higher than the nominal rate when the MAF was < 0.1, but were close to the nominal rate when the MAF was ≥ 0.2. All methods gave consistent effect estimates and had similar power. In summary, the GEE framework with the robust variance estimator, the computationally fastest and least data-management-intensive approach, appears to work well in extended families and thus provides a reasonable alternative to full variance components approaches for extended pedigrees in the GWAS setting.
PMCID: PMC3736986  PMID: 23038411
Generalized estimating equation; Variance components analysis; Family-based association study; Genome-wide scan
