
Results 1-11 (11)

1.  Bivariate genetic association analysis of systolic and diastolic blood pressure by copula models 
BMC Proceedings  2014;8(Suppl 1):S72.
We conduct genetic association analysis in the subset of unrelated individuals from the San Antonio Family Studies pedigrees, applying a two-stage approach to take account of the dependence between systolic and diastolic blood pressure (SBP and DBP). In the first stage, we adjust blood pressure for the effects of age, sex, smoking, and use of antihypertensive medication based on a novel modification of censored regression. In the second stage, we model the bivariate distribution of the adjusted SBP and DBP phenotypes by a copula function with interpretable SBP-DBP correlation parameters. This allows us to identify genetic variants associated with each of the adjusted blood pressures, as well as variants that explain the association between the two phenotypes. Within this framework, we define a pleiotropic variant as one that reduces the SBP-DBP correlation. Our results for whole genome sequence variants in the gene ULK4 on chromosome 3 suggest that inference obtained from a copula model can be more informative than findings from the SBP-specific and DBP-specific univariate models alone.
PMCID: PMC4143670  PMID: 25519342
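The abstract above does not say which copula family was used. As a minimal sketch only, assuming a Gaussian copula with a single correlation parameter `rho`, the density that couples two adjusted phenotypes (after each is transformed to the uniform scale) could be evaluated as follows. The function name and arguments are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def gaussian_copula_density(u, v, rho):
    """Density of the bivariate Gaussian copula at (u, v), with 0 < u, v < 1.

    Assumption: a Gaussian copula with correlation rho (-1 < rho < 1);
    the abstract does not name the copula family.
    """
    std_normal = NormalDist()
    # Map the uniform margins back to standard normal scores.
    x = std_normal.inv_cdf(u)
    y = std_normal.inv_cdf(v)
    one_minus_r2 = 1.0 - rho * rho
    exponent = -(rho * rho * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * one_minus_r2)
    return exp(exponent) / sqrt(one_minus_r2)
```

With `rho = 0` the density is 1 everywhere, which corresponds to independent phenotypes; under this reading, a variant that reduces the SBP-DBP correlation moves the fitted `rho` toward zero.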
2.  An exploration of heterogeneity in genetic analysis of complex pedigrees: linkage and association using whole genome sequencing data in the MAP4 region 
BMC Proceedings  2014;8(Suppl 1):S107.
We conduct pedigree-based linkage and association analyses of simulated systolic blood pressure data in the nonascertained large Mexican American pedigrees provided by Genetic Analysis Workshop 18, focusing on observed sequence variants in MAP4 on chromosome 3. Because pedigrees are large and sequence data have been completed by imputation, it is feasible to conduct analysis for each pedigree separately as well as for all pedigrees combined. We are interested in quantifying and explaining between-pedigree heterogeneity in linkage and association signals. To this end, we first examine minor allele frequency differences between pedigrees. In some of the pedigrees, rare and low-frequency variants occur at a higher prevalence than in all pedigrees combined. In simulation replicate 1, we conduct variance-components linkage and association analysis of all 894 MAP4 variants to compare analytic approaches in single pedigree and combined analysis. In all 200 replicates, we similarly examine the 15 causal variants in MAP4 known under the generating model. We illustrate how random allele frequency variation among pedigrees leads to heterogeneity in pedigree-specific linkage and association signals.
PMCID: PMC4143705  PMID: 25519361
3.  Multiphase analysis by linkage, quantitative transmission disequilibrium, and measured genotype: systolic blood pressure in complex Mexican American pedigrees 
BMC Proceedings  2014;8(Suppl 1):S108.
We apply a multiphase strategy for pedigree-based genetic analysis of systolic blood pressure data collected in a longitudinal study of large Mexican American pedigrees. In the first phase, we conduct variance-components linkage analysis to identify regions that may harbor quantitative trait loci. In the second phase, we carry out pedigree-based association analysis in a selected region with common and low-frequency variants from genome-wide association studies and whole genome sequencing data. Using sequencing data, we compare approaches to pedigree analysis in a 10 megabase candidate region on chromosome 3 harboring a gene previously identified by a consortium for blood pressure genome-wide association studies. We observe that, as expected, the measured genotype analysis tends to provide larger signals than the quantitative transmission disequilibrium test. We also observe that while linkage signals are contributed by common variants, strong associations are found mainly at rare variants. Multiphase analysis can improve computational efficiency and reduce the multiple testing burden.
PMCID: PMC4143726  PMID: 25519311
4.  Are quantitative trait-dependent sampling designs cost-effective for analysis of rare and common variants? 
BMC Proceedings  2011;5(Suppl 9):S111.
Use of trait-dependent sampling designs in whole-genome association studies of sequence data can reduce total sequencing costs with modest losses of statistical efficiency. In a quantitative trait (QT) analysis of data from the Genetic Analysis Workshop 17 mini-exome for unrelated individuals in the Asian subpopulation, we investigate alternative designs that sequence only 50% of the entire cohort. In addition to a simple random sampling design, we consider extreme-phenotype designs that are of increasing interest in genetic association analysis of QTs, especially in studies concerned with the detection of rare genetic variants. We also evaluate a novel sampling design in which all individuals have a nonzero probability of being selected into the sample but in which individuals with extreme phenotypes have a proportionately larger probability. We take differential sampling of individuals with informative trait values into account by inverse probability weighting using standard survey methods, so that the analysis generalizes to the source population. In replicate 1 data, we applied the designs in association analysis of Q1 with both rare and common variants in the FLT1 gene, based on knowledge of the generating model. Using all 200 replicate data sets, we similarly analyzed Q1 and Q4 (which is known to be free of association with FLT1) to evaluate relative efficiency, type I error, and power. Simulation study results suggest that the QT-dependent selection designs generally yield greater than 50% relative efficiency compared to using the entire cohort, implying cost-effectiveness of 50% sample selection and worthwhile reduction of sequencing costs.
PMCID: PMC3287835  PMID: 22373146
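The inverse probability weighting step described above can be illustrated with a weighted least-squares slope in which each sampled individual is weighted by the reciprocal of their selection probability. This is a minimal sketch, not the authors' survey-method implementation; `weighted_slope` and its arguments are hypothetical names, and the selection probabilities are assumed to be known by design.

```python
def weighted_slope(genotypes, traits, selection_probs):
    """Weighted least-squares slope of trait on genotype.

    Assumption: each sampled individual gets weight 1 / selection probability,
    so over-sampled extreme phenotypes are down-weighted.
    """
    weights = [1.0 / p for p in selection_probs]
    total = sum(weights)
    # Weighted means of genotype and trait.
    mean_g = sum(w * g for w, g in zip(weights, genotypes)) / total
    mean_t = sum(w * t for w, t in zip(weights, traits)) / total
    # Weighted covariance and variance (the common normalizer cancels).
    cov = sum(w * (g - mean_g) * (t - mean_t)
              for w, g, t in zip(weights, genotypes, traits))
    var = sum(w * (g - mean_g) ** 2 for w, g in zip(weights, genotypes))
    return cov / var
```

Standard errors under unequal-probability sampling need a design-based variance estimator, which this sketch omits.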
5.  Two-stage study designs combining genome-wide association studies, tag single-nucleotide polymorphisms, and exome sequencing: accuracy of genetic effect estimates 
BMC Proceedings  2011;5(Suppl 9):S64.
Genome-wide association studies (GWAS) test for disease-trait associations and estimate effect sizes at tag single-nucleotide polymorphisms (SNPs), which imperfectly capture variation at causal SNPs. Sequencing studies can examine potential causal SNPs directly; however, sequencing the whole genome or exome can be prohibitively expensive. Costs can be limited by using a GWAS to detect the associated region(s) at tag SNPs followed by targeted sequencing to identify and estimate the effect size of the causal variant. Genetic effect estimates obtained from association studies can be inflated because of a form of selection bias known as the winner’s curse. Conversely, estimates at tag SNPs can be attenuated compared to the causal SNP because of incomplete linkage disequilibrium. These two effects oppose each other. Analysis of rare SNPs further complicates our understanding of the winner’s curse because rare SNPs are difficult to tag and analysis can involve collapsing over multiple rare variants. In two-stage analysis of Genetic Analysis Workshop 17 simulated data sets, we find that selection at the tag SNP produces upward bias in the estimate of effect at the causal SNP, even when the tag and causal SNPs are not well correlated. The bias similarly carries through to effect estimates for rare variant summary measures. Replication studies designed with sample sizes computed using biased estimates will be under-powered to detect a disease-causing variant. Accounting for bias in the original study is critical to avoid discarding disease-associated SNPs at follow up.
PMCID: PMC3287903  PMID: 22373407
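The winner's curse mentioned above can be reproduced in a few lines of simulation. The sketch below is a simplified single-SNP illustration, not the two-stage tag/causal design of the study: effect estimates are drawn around a true value, only those passing a significance threshold are kept, and the mean of the selected estimates is compared with the mean of all estimates. All parameter values are illustrative assumptions.

```python
import random


def simulate_winners_curse(true_beta=0.10, se=0.05, z_threshold=3.0,
                           n_studies=10000, seed=1):
    """Return (mean of all estimates, mean of estimates passing selection).

    Assumption: estimates are normally distributed around true_beta with
    standard error se, and a study is 'selected' when estimate / se exceeds
    z_threshold.
    """
    rng = random.Random(seed)
    estimates = [rng.gauss(true_beta, se) for _ in range(n_studies)]
    selected = [b for b in estimates if b / se > z_threshold]
    mean_all = sum(estimates) / len(estimates)
    mean_selected = sum(selected) / len(selected)
    return mean_all, mean_selected
```

Because selection keeps only estimates above `z_threshold * se`, the selected mean necessarily exceeds that cutoff and therefore overstates the true effect whenever the true effect is below it, while the mean over all simulated studies stays close to `true_beta`.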
6.  Transmission-ratio distortion in the Framingham Heart Study 
BMC Proceedings  2009;3(Suppl 7):S51.
Transmission-ratio distortion (TRD) is a phenomenon in which the segregation of alleles does not obey Mendel's laws. As a simple example, a recessive locus that results in fetal lethality will result in live-born individuals sharing more alleles at this locus than expected under Mendel's laws. This could result in apparent linkage of the phenotype of 'being alive' to such a chromosomal region. Further, this could result in false-positive linkage when 'affected-only' parametric or non-parametric linkage analysis is performed. Similarly, loci demonstrating TRD may be detectable in family-based association tests as deviant transmission of alleles. Therefore, TRD could result in confounding of family-based association studies of diseases. The Framingham Heart Study data available for Genetic Analysis Workshop 16 is a suitable dataset to determine whether there are loci in the genome that reveal TRD because of the large number of individuals from families, the high-resolution genotyping, and the population-based nature of the study. We have used both genome-wide linkage and family-based association methods to determine whether there are loci that demonstrate TRD in the Framingham Heart Study. Family-based association analysis identified thousands of loci with apparent TRD. However, the vast majority of these are likely the result of genotyping errors. With application of strict quality control criteria to the genotype data, and automated inspection of the intensity plots, we identify a small number of loci that may show true TRD, including rs1000548 in intron 6 of S-antigen (arrestin, SAG) on chromosome 2 (p = 7 × 10⁻¹⁰).
PMCID: PMC2795951  PMID: 20018044
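Under Mendel's laws a heterozygous parent transmits either allele with probability 0.5, so deviant transmission at a single locus can be checked with an exact binomial test. The sketch below is a simple stand-in for the family-based association methods named in the abstract, not a reimplementation of them; the function name and the tail-doubling rule for the two-sided p-value are assumptions.

```python
from math import comb


def transmission_binomial_p(transmitted, informative):
    """Two-sided exact binomial p-value for H0: transmission probability = 0.5.

    transmitted: times the allele of interest was passed on by heterozygous parents.
    informative: total number of transmissions from heterozygous parents.
    """
    def tail(lo, hi):
        # P(lo <= X <= hi) for X ~ Binomial(informative, 0.5).
        return sum(comb(informative, k) for k in range(lo, hi + 1)) / 2 ** informative

    lower = tail(0, transmitted)
    upper = tail(transmitted, informative)
    # Double the smaller tail and cap at 1.
    return min(1.0, 2.0 * min(lower, upper))
```

As the abstract stresses, a small p-value from such a test can just as easily reflect genotyping error as true TRD, so quality control comes first.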
7.  Region-based analysis in genome-wide association study of Framingham Heart Study blood lipid phenotypes 
BMC Proceedings  2009;3(Suppl 7):S127.
Because of the high dimensionality of single-nucleotide polymorphism (SNP) data, region-based methods are an attractive approach to the identification of genetic variation associated with a certain phenotype. A common approach to defining regions is to identify the most significant SNPs from a single-SNP association analysis, and then use a gene database to obtain a list of genes proximal to the identified SNPs. Alternatively, regions may be defined statistically, via a scan statistic. After categorizing SNPs as significant or not (based on the single-SNP association p-values), a scan statistic is useful to identify regions that contain more significant SNPs than expected by chance. Important features of this method are that regions are defined statistically, so that there is no dependence on a gene database, and both gene and inter-gene regions can be detected. In the analysis of blood-lipid phenotypes from the Framingham Heart Study (FHS), we compared statistically defined regions with those formed from the top single-SNP tests. Although we missed a number of single SNPs, we also identified many additional regions not found as SNP-database regions and avoided issues related to region definition. In addition, analyses of candidate genes for high-density lipoprotein, low-density lipoprotein, and triglyceride levels suggested that associations detected with region-based statistics are also found using the scan statistic approach.
PMCID: PMC2795900  PMID: 20017993
8.  Genome-wide association analyses of North American Rheumatoid Arthritis Consortium and Framingham Heart Study data utilizing genome-wide linkage results 
BMC Proceedings  2009;3(Suppl 7):S103.
The power of genome-wide association studies can be improved by incorporating information from previous study findings, for example, results of genome-wide linkage analyses. Weighted false-discovery rate (FDR) control can incorporate genome-wide linkage scan results into the analysis of genome-wide association data by assigning single-nucleotide polymorphism (SNP) specific weights. Stratified FDR control can also be applied by stratifying the SNPs into high and low linkage strata. We applied these two FDR control methods to data from the North American Rheumatoid Arthritis Consortium (NARAC) study and the Framingham Heart Study (FHS), combining both association and linkage analysis results. For the NARAC study, we used linkage results from a previous genome scan of the rheumatoid arthritis (RA) phenotype. For the FHS study, we obtained genome-wide linkage scores from the same 550 k SNP data used for the association analyses of three lipid phenotypes (HDL, LDL, TG). We confirmed some genes previously reported for association with RA and lipid phenotypes. Stratified and weighted FDR methods appear to give improved ranks to some of the replicated SNPs for the RA data, suggesting linkage scan results could provide useful information to improve genome-wide association studies.
PMCID: PMC2795874  PMID: 20017967
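One common formulation of weighted FDR control, assumed here because the abstract does not spell out the procedure, divides each p-value by a SNP-specific weight (with weights scaled to average 1) and then applies the Benjamini-Hochberg step-up rule. A minimal sketch with hypothetical names:

```python
def weighted_bh(p_values, weights, alpha=0.05):
    """Indices rejected by Benjamini-Hochberg applied to weighted p-values.

    Assumption: weights are positive (for example, derived from linkage
    scores) and are rescaled here so that they average 1.
    """
    m = len(p_values)
    mean_w = sum(weights) / m
    # A weight above the average makes the SNP easier to reject.
    adjusted = [p * mean_w / w for p, w in zip(p_values, weights)]
    order = sorted(range(m), key=lambda i: adjusted[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value is under its threshold.
        if adjusted[i] <= rank * alpha / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])
```

For example, `weighted_bh([0.001, 0.02, 0.03, 0.5], [0.5, 0.5, 2.0, 1.0])` returns `[0, 2]`, whereas equal weights give `[0, 1, 2]`: up-weighting the third SNP keeps it, while down-weighting the second drops it.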
9.  Evidence of linkage to chromosome 1 for early age of onset of rheumatoid arthritis and HLA marker DRB1 genotype in NARAC data 
BMC Proceedings  2007;1(Suppl 1):S78.
Focusing on chromosome 1, a recursive partitioning linkage algorithm (RP) was applied to perform linkage analysis on the rheumatoid arthritis NARAC data, incorporating covariates such as HLA-DRB1 genotype, age at onset, severity, anti-cyclic citrullinated peptide (anti-CCP), and lifetime smoking. All 617 affected sib pairs from the ascertained families were used, and an RP linkage model was used to identify linkage possibly influenced by covariates. This algorithm includes a likelihood ratio (LR)-based splitting rule, a pruning algorithm to identify optimal tree size, and a bootstrap method for final tree selection.
The strength of the linkage signals was evaluated by empirical p-values, obtained by simulating marker data under the null hypothesis of no linkage. Two suggestive linkage regions on chromosome 1 were detected by the RP linkage model, with identified associated covariates HLA-DRB1 genotype and age at onset. These results suggest possible gene × gene and gene × environment interactions at chromosome 1 loci and provide directions for further gene mapping.
PMCID: PMC2367509  PMID: 18466580
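An empirical p-value of the kind described above is the fraction of null-simulated statistics that are at least as large as the observed one. The sketch below assumes the statistics have already been computed and uses the common add-one correction, which is an assumption rather than something stated in the abstract.

```python
def empirical_p_value(observed, null_statistics):
    """Empirical p-value of an observed statistic against simulated null values.

    Assumption: larger statistics mean stronger evidence, and the add-one
    correction keeps the estimate away from exactly zero.
    """
    exceed = sum(1 for s in null_statistics if s >= observed)
    return (exceed + 1) / (len(null_statistics) + 1)
```

The smallest attainable value is 1 / (N + 1) for N null replicates, so the number of simulations bounds how significant a signal can appear.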
10.  One-stage design is empirically more powerful than two-stage design for family-based genome-wide association studies 
BMC Proceedings  2007;1(Suppl 1):S137.
Finding a genetic marker associated with a trait is a classic problem in human genetics. Recently, two-stage approaches have gained popularity in marker-trait association studies, in part because researchers hope to reduce the multiple testing problem by testing fewer markers in the final stage. We compared one two-stage family-based approach to an analogous single-stage method, calculating the empirical type I error rates and power for both methods using fully simulated data sets modeled on nuclear families with rheumatoid arthritis, and data sets of real single-nucleotide polymorphism genotypes from Centre d'Etude du Polymorphisme Humain pedigrees with simulated traits. In these analyses performed in the absence of population stratification, the single-stage method was consistently more powerful than the two-stage method for a given type I error rate. To explore the sources of this difference, we performed a case study comparing the individual steps of two-stage designs, the two-stage design itself, and the analogous one-stage design.
PMCID: PMC2367501  PMID: 18466480
11.  Application of bivariate mixed counting process models to genetic analysis of rheumatoid arthritis severity 
BMC Proceedings  2007;1(Suppl 1):S120.
We sought to i) identify putative genetic determinants of the severity of rheumatoid arthritis in the NARAC (North American Rheumatoid Arthritis Consortium) data, ii) assess whether known candidate genes for disease status are also associated with disease severity in those affected, and iii) determine whether heterogeneity among the severity phenotypes can be explained by genetic and/or host factors. These questions are addressed by developing bivariate mixed-counting process models for numbers of tender and swollen joints to evaluate genetic association of candidate polymorphisms, such as DRB1, and selected single-nucleotide polymorphisms in known candidate genes/regions for rheumatoid arthritis, including PTPN22, and those in the regions identified by a genome-wide linkage scan of disease severity using the dense Illumina single-nucleotide polymorphism panel. The counting process framework provides a flexible approach to account for the duration of rheumatoid arthritis, an attractive feature when modeling severity of a disease. Moreover, we found a gain in efficiency when using a bivariate compared to a univariate counting process model.
PMCID: PMC2367476  PMID: 18466462
