Family-based association approaches such as the transmission-disequilibrium test (TDT) are used extensively in the study of genetic traits because they are generally robust to the presence of population structure. However, these approaches necessarily involve recruitment of families, which is more costly and time-consuming than sampling unrelated individuals in population-based approaches. A family-based approach with high power would therefore be appealing, because the smaller sample size needed to attain adequate power saves time and cost. Here we introduce a new family-based transmission test using the joint transmission status from affected sib pairs. We show that by including the transmission status of both siblings, our method gives higher power than the TDT design, while maintaining the correct type I error rate. We use the simulated data from affected sib-pair families with rheumatoid arthritis provided by Genetic Analysis Workshop 15 to illustrate our approach.
Traditional transmission disequilibrium test (TDT) based methods for genetic association analyses are robust to population stratification at the cost of a substantial loss of power. Here we describe a novel method for family-based association studies that corrects for population stratification with the use of an extension of principal component analysis (PCA). Specifically, we apply PCA to the unrelated parents in each family. We then infer principal components for children from those of their parents through a TDT-like strategy. Two test statistics within a variance-components model are proposed for association tests. Simulation results show that the proposed tests have correct type I error rates regardless of population stratification, and have greatly improved power over two popular TDT-based methods: QTDT and FBAT. The application to the Genetic Analysis Workshop 16 (GAW16) data sets attests to the feasibility of the proposed method.
Family Based Association Tests (FBATs); Transmission Disequilibrium Test (TDT); Principal Component Analysis (PCA); Variance-Components
Quantitative trait transmission/disequilibrium tests (quantitative TDTs) are commonly used in family-based genetic association studies of quantitative traits. Despite the availability of various quantitative TDTs, some users are not aware of the properties of these tests and the relationships between them. This review aims at outlining the broad features of the various quantitative TDT procedures carried out in the frequently used QTDT and FBAT packages. Specifically, we discuss the “Rabinowitz” and the “Monks-Kaplan” procedures, as well as the various “Abecasis” and “Allison” regression-based procedures. We focus on the models assumed in these tests and the relationships between them. Moreover, we discuss what hypotheses are tested by the various quantitative TDTs, what testing procedures are best suited to various forms of data, and whether the regression-based tests overcome population stratification problems. Finally, we comment on power considerations in the choice of the test to be used. We hope this brief review will shed light on the similarities and differences of the various quantitative TDTs.
The Transmission Disequilibrium Test (TDT) is a family-based test for association based on the rate of transmission of alleles from heterozygous parents to affected offspring, and has gained popularity as this test preserves the Type I error rate. Population stratification results in a decreased number of heterozygous parents compared to that expected assuming Hardy-Weinberg Equilibrium (Wahlund Effect). We show that population stratification changes the relative proportion of the informative mating types. The decrease in the number of heterozygous parents and the change in the relative proportion of the informative mating types result in significant changes to the sample sizes required to achieve the power desired. We show examples of the changes in sample sizes, and provide an easy method for estimating TDT sample sizes in the presence of population stratification. This method potentially aids in reducing the number of false negative association studies.
Transmission disequilibrium test; population stratification; power
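The quantities discussed in this abstract can be illustrated with a minimal sketch. It assumes the standard McNemar-type form of the TDT statistic and a simple mixture of subpopulations that are each in Hardy-Weinberg equilibrium; the function names and example numbers are hypothetical and are not taken from the study, and this is not the authors' sample-size method.

```python
from math import erfc, sqrt


def tdt_statistic(b: int, c: int) -> tuple[float, float]:
    """McNemar-type TDT.

    b: heterozygous parents transmitting the candidate allele to an affected child
    c: heterozygous parents transmitting the other allele
    Returns the chi-square statistic (1 df) and its p-value.
    """
    if b + c == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2))
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


def heterozygosity_mixture(freqs: list[float], weights: list[float]) -> float:
    """Heterozygote frequency when each subpopulation is in HWE (assumed)."""
    return sum(w * 2.0 * p * (1.0 - p) for p, w in zip(freqs, weights))


def heterozygosity_pooled(freqs: list[float], weights: list[float]) -> float:
    """Heterozygote frequency expected under HWE at the pooled allele frequency."""
    p_bar = sum(w * p for p, w in zip(freqs, weights))
    return 2.0 * p_bar * (1.0 - p_bar)


if __name__ == "__main__":
    print(tdt_statistic(60, 40))
    # Two equally sized subpopulations with hypothetical allele frequencies
    print(heterozygosity_mixture([0.2, 0.8], [0.5, 0.5]),
          heterozygosity_pooled([0.2, 0.8], [0.5, 0.5]))
```

In the two-subpopulation example the mixture contains fewer heterozygotes than the pooled Hardy-Weinberg expectation, which is the Wahlund effect that reduces the number of informative parents.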
As genome-wide association studies (GWAS) are becoming more popular, two approaches, among others, could be considered in order to improve statistical power for identifying genes contributing subtle to moderate effects to human diseases. The first approach is to increase sample size, which could be achieved by combining both unrelated and familial subjects together. The second approach is to jointly analyze multiple correlated traits. In this study, by extending generalized estimating equations (GEEs), we propose a simple approach for performing univariate or multivariate association tests for the combined data of unrelated subjects and nuclear families. In particular, we correct for population stratification by integrating principal component analysis and transmission disequilibrium test strategies. The proposed method allows for multiple siblings as well as missing parental information. Simulation studies show that the proposed test has improved power compared to two popular methods, EIGENSTRAT and FBAT, by analyzing the combined data, while correcting for population stratification. In addition, joint analysis of bivariate traits has improved power over univariate analysis when pleiotropic effects are present. Application to the Genetic Analysis Workshop 16 (GAW16) data sets attests to the feasibility and applicability of the proposed method.
There are two major classes of genetic association analyses: population based and family based. Population-based case–control studies have been the method of choice due to the ease of data collection. However, population stratification is one of the major limitations of case–control studies, while family-based studies are protected against stratification. In this study, we carry out extensive simulations under different disease models (both Mendelian as well as complex) to evaluate the relative powers of the two approaches in detecting association.
MATERIALS AND METHODS:
The power comparisons are based on a case–control design comprising 200 cases and 200 controls versus a Transmission Disequilibrium Test (TDT) or Pedigree Disequilibrium Test (PDT) design with 200 informative trios. We perform the allele-level test for case–control studies, which is based on the difference in allele frequencies at a single nucleotide polymorphism (SNP) between unrelated cases and controls. The TDT and the PDT are based on preferential allelic transmissions at a SNP from heterozygous parents to the affected offspring. We considered five disease modes of inheritance: (i) recessive with complete penetrance, (ii) dominant with complete penetrance, and (iii), (iv) and (v) complex diseases with varying levels of penetrance and phenocopies.
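As an illustration of the allele-level case–control test described above, the following is a minimal sketch assuming the usual Pearson chi-square on a 2×2 table of allele counts. The function name and the counts are hypothetical; the study's own simulation settings are not reproduced here.

```python
from math import erfc, sqrt


def allelic_chi2(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int) -> tuple[float, float]:
    """Pearson chi-square (1 df) comparing allele counts in cases and controls.

    case_a / case_b: counts of allele A / allele B among cases
    ctrl_a / ctrl_b: counts of allele A / allele B among controls
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if min(row_case, row_ctrl, col_a, col_b) == 0:
        raise ValueError("a row or column of the 2x2 table is empty")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # Survival function of a chi-square variable with 1 df
    return chi2, erfc(sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # 200 cases and 200 controls contribute 400 alleles each (hypothetical counts)
    print(allelic_chi2(240, 160, 200, 200))
```

Each individual contributes two alleles, so the table totals are twice the number of subjects; this test treats alleles as independent, which presumes Hardy-Weinberg equilibrium in the combined sample.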
We find that while the TDT/PDT design with 200 informative trios is in general more powerful than a case–control design with 200 cases and 200 controls (except when the heterozygosity at the marker locus is high), it may be necessary to sample a very large number of trios to obtain the requisite number of informative families.
The current study provides insights into power comparisons between population-based and family-based association studies.
Allelic association; informative trios; complex genetic disorder
To overcome the "spurious" association caused by population stratification in population-based association studies, we propose a principal-component based method that can use both family and unrelated samples at the same time. More specifically, we adapt the multivariate logistic model, which is often used in segregation analysis and can allow for the family correlation structure, for association analysis. To correct the effect of hidden population structure, the first ten principal-components calculated from the matrix of marker genotype data are incorporated as covariates in the model. To test for the association, the marker of interest is also incorporated as a covariate in the model. We applied the proposed method to the second generation (i.e., the Offspring Cohort), in the Genetic Analysis Workshop 16 Framingham Heart Study 50 k data set to evaluate the performance of the method. Although there may have been difficulty in the convergence while maximizing the likelihood function as indicated by a flat likelihood, the distribution of the empirical p-values for the test statistic does show that the method has a correct type I error rate whenever the variance-covariance matrix of the estimates can be computed.
The family-based association study (FBAS) has the advantages of controlling for population stratification and testing for linkage and association simultaneously. We propose a retrospective multilevel model (rMLM) approach to analyze sibship data by using genotypic information as the dependent variable. Simulated data sets were generated using the simulation of linkage and association (SIMLA) program. We compared rMLM to the sib transmission/disequilibrium test (S-TDT), sibling disequilibrium test (SDT), conditional logistic regression (CLR) and generalized estimating equations (GEE) on the measures of power, type I error, estimation bias and standard error. The results indicated that rMLM was a valid test of association in the presence of linkage using sibship data. The advantages of rMLM became more evident when the data contained concordant sibships. Compared to GEE, rMLM underestimated the odds ratio (OR) less. Our results support the application of rMLM to detect gene-disease associations using sibship data. However, users should be aware of the risk of an inflated type I error rate when there is association without linkage between the disease locus and the genotyped marker.
Eight candidate genes selected in this study were previously associated with gene-environment interactions in asthma in an urban area. These genes were analyzed in a familial collection from a founder and remote population (Saguenay–Lac-Saint-Jean; SLSJ) located in an area with low air levels of ozone but with localized areas of relatively high air pollutant levels, such as sulphur dioxide, when compared to many urban areas. Polymorphisms (SNPs) were extracted from the genome-wide association study (GWAS) performed on the SLSJ familial collection. A transmission disequilibrium test (TDT) was performed using the entire family sample (1,428 individuals in 254 nuclear families). Stratification according to the proximity of aluminium, pulp and paper industries was also analyzed. Two genes were associated with asthma in the entire sample before correction (CAT and NQO1) and one was associated after correction for multiple analyses (CAT). Two genes were associated when subjects were stratified according to the proximity of aluminium industries (CAT and NQO1) and one according to the proximity of pulp and paper industries (GSTP1). However, none of them resisted correction for multiple analyses. Given that the spatial pattern of environmental exposures can be complex and inadequately represented by a few stationary monitors and that exposures can also come from sources other than the standard outdoor air pollution (e.g., indoor air, occupation, residential wood smoke), a new approach and new tools are required to measure specific and individual pollutant exposures in order to estimate the real impact of gene-environment interactions on respiratory health.
asthma; gene-environment interactions; aluminium industries; pulp and paper industries; air pollution
Multimarker transmission/disequilibrium tests (TDTs) are powerful association and linkage tests used to perform genome-wide filtering in the search for disease susceptibility loci. In contrast to case/control studies, they have a low rate of false positives for population stratification and admixture. However, the length of a region found in association with a disease is usually very large because of linkage disequilibrium (LD). Here, we define a multimarker proportional TDT (mTDTP) designed to improve locus specificity in complex diseases that has good power compared to the most powerful multimarker TDTs. The test is a simple generalization of a multimarker TDT in which haplotype frequencies are used to weight the effect that each haplotype has on the whole measure. Two concepts underlie the features of the metric: the ‘common disease, common variant’ hypothesis and the decrease in LD with chromosomal distance. Because of this decrease, the frequency of haplotypes in strong LD with common disease variants decreases with increasing distance from the disease susceptibility locus. Thus, our haplotype proportional test has higher locus specificity than common multimarker TDTs that assume a uniform distribution of haplotype probabilities. Because of the common variant hypothesis, risk haplotypes at a given locus are relatively frequent and a metric that weights partial results for each haplotype by its frequency will be as powerful as the most powerful multimarker TDTs. Simulations and real data sets demonstrate that the test has good power compared with the best tests but has remarkably higher locus specificity, so that the association rate decreases at a higher rate with distance from a disease susceptibility or disease protective locus.
For genome-wide association studies in family-based designs, we propose a powerful two-stage testing strategy that can be applied in situations in which parent-offspring trio data are available and all offspring are affected with the trait or disease under study. In the first step of the testing strategy, we construct estimators of genetic effect size in the completely ascertained sample of affected offspring and their parents that are statistically independent of the family-based association/transmission disequilibrium tests (FBATs/TDTs) that are calculated in the second step of the testing strategy. For each marker, the genetic effect is estimated (without requiring an estimate of the SNP allele frequency) and the conditional power of the corresponding FBAT/TDT is computed. Based on the power estimates, a weighted Bonferroni procedure assigns an individually adjusted significance level to each SNP. In the second stage, the SNPs are tested with the FBAT/TDT statistic at the individually adjusted significance levels. Using simulation studies for scenarios with up to 1,000,000 SNPs, varying allele frequencies and genetic effect sizes, the power of the strategy is compared with standard methodology (e.g., FBATs/TDTs with Bonferroni correction). In all considered situations, the proposed testing strategy demonstrates substantial power increases over the standard approach, even when the true genetic model is unknown and must be selected based on the conditional power estimates. The practical relevance of our methodology is illustrated by an application to a genome-wide association study for childhood asthma, in which we detect two markers meeting genome-wide significance that would not have been detected using standard methodology.
The current state of genotyping technology has enabled researchers to conduct genome-wide association studies of up to 1,000,000 SNPs, allowing for systematic scanning of the genome for variants that might influence the development and progression of complex diseases. One of the largest obstacles to the successful detection of such variants is the multiple comparisons/testing problem in the genetic association analysis. For family-based designs in which all offspring are affected with the disease/trait under study, we developed a methodology that addresses this problem by partitioning the family-based data into two statistically independent components. The first component is used to screen the data and determine the most promising SNPs. The second component is used to test the SNPs for association, where information from the screening is used to weight the SNPs during testing. This methodology is more powerful than standard procedures for multiple comparisons adjustment (i.e., Bonferroni correction). Additionally, as only one data set is required for screening and testing, our testing strategy is less susceptible to study heterogeneity. Finally, as many family-based studies collect data only from affected offspring, this method addresses a major limitation of previous methodologies for multiple comparisons in family-based designs, which require variation in the disease/trait among offspring.
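The testing stage described above can be sketched as a generic weighted Bonferroni procedure. This is only an illustration under a stated assumption: each SNP's significance level is taken to be proportional to its screening-stage power estimate, which may differ from the weighting scheme actually used by the authors. The function name and the numbers are hypothetical.

```python
def weighted_bonferroni(
    p_values: list[float],
    power_estimates: list[float],
    alpha: float = 0.05,
) -> tuple[list[float], list[bool]]:
    """Assign each SNP its own significance level; the levels sum to alpha.

    Assumption: levels are proportional to the (non-negative) power estimates
    obtained in the screening step, which must be independent of the p-values.
    """
    if len(p_values) != len(power_estimates):
        raise ValueError("p_values and power_estimates must have equal length")
    total = sum(power_estimates)
    if total <= 0:
        raise ValueError("power estimates must have a positive sum")
    levels = [alpha * w / total for w in power_estimates]
    rejected = [p <= a for p, a in zip(p_values, levels)]
    return levels, rejected


if __name__ == "__main__":
    levels, rejected = weighted_bonferroni([0.001, 0.02, 0.04], [0.8, 0.1, 0.1])
    print(levels, rejected)
```

Because the individual levels add up to alpha, the family-wise error rate is controlled as in the ordinary Bonferroni correction, provided the weights are chosen independently of the test statistics.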
The FAM123 gene family comprises three members, FAM123A, the tumor suppressor WTX(FAM123B) and FAM123C. WTX is required for normal development and causally contributes to human disease, in part through its regulation of β-catenin-dependent WNT signaling. The roles of FAM123A and FAM123C in signaling, cell behavior and human disease remain less understood. We defined and compared the protein-protein interaction networks for each member of the FAM123 family by affinity purification and mass spectrometry. Protein localization and functional studies suggest that the FAM123 family members have conserved and divergent cellular roles. In contrast to WTX and FAM123C, we found that microtubule-associated proteins were enriched in the FAM123A protein interaction network. FAM123A interacted with and tracked dynamic microtubules in a plus-end direction. Domain interaction experiments revealed a ‘SKIP’ amino acid motif in FAM123A that mediated interaction with the microtubule tip tracking proteins EB1 and EB3, and therefore with microtubules. Cells depleted of FAM123A showed compartment-specific effects on microtubule dynamics, increased actomyosin contractility, larger focal adhesions and decreased cell migration. These effects required binding of FAM123A to and inhibition of the guanine nucleotide exchange factor ARHGEF2, a microtubule-associated activator of RhoA. Together, these data suggest that the ‘family-unique’ SKIP motif enables FAM123A to bind EB proteins, localize to microtubules and coordinate microtubule dynamics and actomyosin contractility.
Case-control genetic association studies in admixed populations are known to be susceptible to genetic confounding due to population stratification. The transmission/disequilibrium test (TDT) approach can avoid this problem. However, the TDT is expensive and impractical for late-onset diseases. Case-control study designs, in which cases and controls are matched by admixture, can be an appealing and suitable alternative for genetic association studies in admixed populations. In this study, we applied this matching strategy when recruiting our African American participants in the Study of African American, Asthma, Genes and Environments (SAGE). Group admixture in this cohort consists of 83% African ancestry and 17% European ancestry, which was consistent with reports from other studies. By carrying out several complementary analyses, our results show that there is substructure in the cohort, but that the admixture distributions are almost identical in cases and controls, and also in cases only. We performed association tests for asthma-related traits with ancestry, and found that only FEV1, a measure of baseline pulmonary function, was associated with ancestry after adjusting for socio-economic and environmental risk factors (P = 0.01). We did not observe an excess type I error rate in our association tests for ancestry informative markers (AIMs) and asthma-related phenotypes when ancestry was not adjusted for in the analyses. Furthermore, using the association tests between genetic variants in a known asthma candidate gene, β2 adrenergic receptor (β2AR) and ΔFEF25-75, an asthma-related phenotype, as an example, we demonstrated that population stratification was not a confounder in our genetic association. Our present work demonstrates that admixture-matched case-control strategies can efficiently control for population stratification confounding in admixed populations.
The availability of a large number of dense SNPs, high-throughput genotyping and computation methods promotes the application of family-based association tests. While most of the current family-based analyses focus only on individual traits, joint analyses of correlated traits can extract more information and potentially improve the statistical power. However, current TDT-based methods are low-powered. Here, we develop a method for tests of association for bivariate quantitative traits in families. In particular, we correct for population stratification by the use of an integration of principal component analysis and TDT. A score test statistic in the variance-components model is proposed. Extensive simulation studies indicate that the proposed method not only outperforms approaches limited to individual traits when pleiotropic effect is present, but also surpasses the power of two popular bivariate association tests termed FBAT-GEE and FBAT-PC, respectively, while correcting for population stratification. When applied to the GAW16 datasets, the proposed method successfully identifies at the genome-wide level the two SNPs that present pleiotropic effects to HDL and TG traits.
Haplotype-based approaches have been extensively studied for case-control association mapping in recent years. It has been shown that haplotype methods can provide more consistent results compared to single-locus based approaches, especially in cases where causal variants are not typed. Improved power has been observed by clustering similar or rare haplotypes into groups to reduce the degrees of freedom of association tests. For family-based association studies, one commonly used strategy is Transmission Disequilibrium Tests (TDT), which examine the imbalanced transmission of alleles/haplotypes to affected and normal children. Many extensions have been developed to deal with general pedigrees and continuous traits.
In this paper, we propose a new haplotype-based association method for family data that is different from the TDT framework. Our approach (termed F_HapMiner) is based on our previous successful experiences on haplotype inference from pedigree data and haplotype-based association mapping. It first infers diplotype pairs of each individual in each pedigree assuming no recombination within a family. A phenotype score is then defined for each founder haplotype. Finally, F_HapMiner applies a clustering algorithm on those founder haplotypes based on their similarities and identifies haplotype clusters that show significant associations with diseases/traits. We have performed extensive simulations based on realistic assumptions to evaluate the effectiveness of the proposed approach by considering different factors such as allele frequency, linkage disequilibrium (LD) structure, disease model and sample size. Comparisons with single-locus and haplotype-based TDT methods demonstrate that our approach consistently outperforms the TDT-based approaches regardless of disease models, local LD structures or allele/haplotype frequencies.
We present a novel haplotype-based association approach using family data. Experimental results demonstrate that it achieves significantly higher power than TDT-based approaches.
For many complex diseases, quantitative traits contain more information than dichotomous traits. One of the approaches used to analyse these traits in family-based association studies is the quantitative transmission disequilibrium test (QTDT). The QTDT is a regression-based approach that models simultaneously linkage and association. It splits up the association effect in a between- and a within-family genetic component to adjust and test for population stratification and includes a variance components method to model linkage. We extend this approach to detect gene–gene interactions between two unlinked QTLs by adjusting the definition of the between- and within-family component and the variance components included in the model. We simulate data to investigate the influence of the epistasis model, linkage disequilibrium patterns between the markers and the QTLs, and allele frequencies on the power and type I error rates of the approach. Results show that for some of the investigated settings, power gains are obtained in comparison with FAM-MDR. We conclude that our approach shows promising results for candidate-gene studies where too few markers are available to correct for population stratification using standard methods (for example EIGENSTRAT). The proposed method is applied to real-life data on hypertension from the FLEMENGHO study.
QTDT; epistasis; association; linkage
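The split of the association effect into between- and within-family components, as described in this abstract for the single-locus QTDT, can be sketched as follows. The sketch assumes the commonly used orthogonal decomposition (between component = mean parental genotype, or the sibship mean when parents are not genotyped) and a 0/1/2 allele-count coding; both the coding and the function name are illustrative assumptions, and the epistasis extension proposed by the authors is not shown.

```python
from typing import Optional


def between_within(
    offspring: list[float],
    parents: Optional[tuple[float, float]] = None,
) -> tuple[float, list[float]]:
    """Split offspring genotype scores into between- and within-family parts.

    offspring: genotype scores of the siblings in one family (assumed 0/1/2 coding)
    parents:   genotype scores of both parents, if available
    """
    if not offspring:
        raise ValueError("at least one offspring genotype is required")
    if parents is not None:
        between = (parents[0] + parents[1]) / 2.0  # expected offspring genotype
    else:
        between = sum(offspring) / len(offspring)  # sibship mean as a fallback
    within = [g - between for g in offspring]  # deviation from family expectation
    return between, within


if __name__ == "__main__":
    print(between_within([1, 2], parents=(1, 2)))
    print(between_within([0, 2, 1]))
```

Only the within-family deviations are free of stratification effects, which is why a test of the within component is robust, while a difference between the between and within effects signals stratification.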
Multimarker Transmission/Disequilibrium Tests (TDTs) are association tests that are very robust to population admixture and structure and may be used to identify susceptibility loci in genome-wide association studies. Multimarker TDTs using several markers may increase power by capturing high-degree associations. However, there is also a risk of spurious associations and power reduction due to the increase in degrees of freedom. In this study we show that associations found by tests built on simple null hypotheses are highly reproducible in a second independent data set regardless of the number of markers. As a test exhibiting this feature to its maximum, we introduce the multimarker 2-Groups TDT, a test which, under the hypothesis of no linkage, asymptotically follows a χ² distribution with one degree of freedom regardless of the number of markers. The statistic requires the division of parental haplotypes into two groups: disease susceptibility and disease protective haplotype groups. We assessed the test behavior by performing an extensive simulation study as well as a real-data study using several data sets of two complex diseases. We show that the test is highly efficient and achieves the highest power among all the tests used, even when the null hypothesis is tested in a second independent data set. Therefore, it turns out to be a very promising multimarker TDT for genome-wide searches for disease susceptibility loci, and it may be used as a preprocessing step in the construction of more accurate genetic models to predict individual susceptibility to complex diseases.
Despite the success of genome-wide association studies (GWASs) in detecting common variants (minor allele frequency ≥0.05), many have suggested that rare variants also contribute to the genetic architecture of diseases. Recently, researchers demonstrated that rare variants can show strong stratification, which may not be corrected by existing methods. In this paper, we focus on a case-parents study and consider methods for testing group-wise association between multiple rare (and common) variants in a gene region and a disease. All tests depend on the numbers of mutant alleles transmitted from parents to their diseased children across variants, and hence they are robust to the effect of population stratification. We use extensive simulation studies to compare the performance of four competing tests: the largest single-variant transmission disequilibrium test (TDT), a multivariable test, a combined TDT, and a likelihood ratio test based on a random-effects model. We find that the likelihood ratio test is most powerful in a wide range of settings and that its power is not reduced when common variants are also included in the analysis. If deleterious and protective variants are simultaneously analyzed, the likelihood ratio test is generally insensitive to the effect directionality, unless the effects are extremely inconsistent in one direction.
Background/Aims: Recent studies have implicated a region on chromosome 1q21-23, including the NOS1AP gene, in susceptibility to schizophrenia. However, replication studies have been inconsistent, a fact that could partly relate to the marked psychopathological heterogeneity of schizophrenia. The aim of this study is to evaluate the association of polymorphisms in the NOS1AP gene region with schizophrenia in patients from a South American population isolate, and to assess whether these variants are associated with specific clinical dimensions of the disorder. Methods: We genotyped 24 densely spaced SNPs in the NOS1AP gene region in a schizophrenia trio sample. The transmission disequilibrium test (TDT) was applied to single marker and haplotype data. Association with clinical dimensions (identified by factor analysis) was evaluated using a quantitative transmission disequilibrium test (QTDT). Results: We found significant association of eight SNPs in the NOS1AP gene region with schizophrenia (minimum p value = 0.004). The QTDT analysis of clinical dimensions revealed an association with a dimension consisting mainly of negative symptoms (minimum p value = 0.001). Conclusions: Our findings are consistent with a role for NOS1AP in susceptibility to schizophrenia, especially for the ‘negative syndrome’ of the disorder.
NOS1AP; Schizophrenia; Clinical heterogeneity; Genetic association; Psychiatric genetics
Two polymorphisms in the IL4 (G/C 3′-UTR) and IL5 (C-703T) genes were studied in a sample of families whose probands had atopic bronchial asthma (BA) (66 families, n = 183) and in a group of unrelated individuals with the severe form of the disease (n = 34). The samples were collected from the Russian population in the city of Tomsk (Russia). Using the transmission/disequilibrium test (TDT), a significant association of allele C-703 IL5 with BA was established (TDT = 4.923, p = 0.007 ± 0.0007). The analysis of 40 individuals with mild asthma and 49 patients with the severe form of the disease revealed a negative association of genotype GG IL4 (OR = 0.39, 95% CI = 0.15−0.99, p = 0.035), and also a trend towards a positive association of the GC IL4 genotype (OR = 2.52, 95% CI = 0.98−6.57, p = 0.052) with mild BA. There was concordance between the clinical classification of BA severity and the ‘genotype’ (McNemar’s χ² test with continuity correction gave 0.03, d.f. = 1, p = 0.859). These results suggest that polymorphisms in the IL4 and IL5 genes contribute to the susceptibility to atopic BA and could determine the clinical course of the disease.
Attention deficit hyperactivity disorder (ADHD) is a common and highly heritable disorder of childhood characterized by inattention, hyperactivity and impulsivity. Molecular genetic and pharmacological studies suggest the involvement of the dopaminergic, serotonergic and noradrenergic neurotransmitter systems in the pathogenesis of ADHD. Monoamine oxidase A (MAO-A) encodes an enzyme that degrades biogenic amines, including neurotransmitters such as norepinephrine, dopamine and serotonin. In this study we examined a 30 bp promoter variable number tandem repeat (VNTR) and a functional G/T single nucleotide polymorphism (SNP) at position 941 in exon 8 (941G/T) of MAO-A for association with ADHD in a Taiwanese sample of 212 ADHD probands.
Within-family transmission disequilibrium test (TDT) was used to analyse association of MAO-A polymorphisms with ADHD in a Taiwanese population.
A nominally significant association was found between the G-allele of 941G/T in MAO-A and ADHD (TDT: P = 0.034, OR = 1.57). Haplotype analysis identified increased transmission to ADHD cases of a haplotype consisting of the 3-repeat allele of the promoter VNTR and the G-allele of the 941G/T SNP (P = 0.045); this haplotype signal was driven by the strong association with the G-allele.
These findings suggest the importance of the 941G/T MAO-A polymorphism in the development of ADHD in the Taiwanese population. These results replicate previously published findings in a Caucasian sample.
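The within-family TDT referred to above can be illustrated with a minimal sketch. It assumes the standard McNemar-type form of the statistic, in which heterozygous parents transmit the tested allele b times and fail to transmit it c times, so that the statistic is (b − c)²/(b + c) with 1 d.f. The function name `tdt` and the counts are hypothetical illustrations; they are not data from any of the studies summarized here.

```python
import math

def tdt(transmitted, untransmitted):
    """Return (chi-square, asymptotic p-value) for a biallelic TDT.

    transmitted / untransmitted: number of times heterozygous parents
    did / did not pass the tested allele to an affected child.
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Upper tail of a chi-square distribution with 1 d.f.:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts, for illustration only.
chi2, p_value = tdt(60, 40)
print(f"TDT chi2 = {chi2:.3f}, p = {p_value:.4f}")
```

With small numbers of informative transmissions, an exact binomial test would usually be preferred to this asymptotic approximation.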
We propose a novel multifactor dimensionality reduction method for epistasis detection in small or extended pedigrees, FAM-MDR. It combines features of the Genome-wide Rapid Association using Mixed Model And Regression approach (GRAMMAR) with Model-Based MDR (MB-MDR). We focus on continuous traits, although the method is general and can be used for outcomes of any type, including binary and censored traits. When comparing FAM-MDR with Pedigree-based Generalized MDR (PGMDR), which is a generalization of Multifactor Dimensionality Reduction (MDR) to continuous traits and related individuals, FAM-MDR was found to outperform PGMDR in terms of power, in most of the considered simulated scenarios. Additional simulations revealed that PGMDR does not appropriately deal with multiple testing and consequently gives rise to overly optimistic results. FAM-MDR adequately deals with multiple testing in epistasis screens and is in contrast rather conservative, by construction. Furthermore, simulations show that correcting for lower order (main) effects is of utmost importance when claiming epistasis. As Type 2 Diabetes Mellitus (T2DM) is a complex phenotype likely influenced by gene-gene interactions, we applied FAM-MDR to examine data on glucose area-under-the-curve (GAUC), an endophenotype of T2DM for which multiple independent genetic associations have been observed, in the Amish Family Diabetes Study (AFDS). This application reveals that FAM-MDR makes more efficient use of the available data than PGMDR and can deal with multi-generational pedigrees more easily. In conclusion, we have validated FAM-MDR and compared it to PGMDR, the current state-of-the-art MDR method for family data, using both simulations and a practical dataset. FAM-MDR is found to outperform PGMDR in that it handles the multiple testing issue more correctly, has increased power, and efficiently uses all available information.
The purpose of this study was to investigate the contribution of the MSX1 gene to the risk of nonsyndromic cleft lip with or without cleft palate (NS-CL ± P) in the Korean population. The samples consisted of 142 NS-CL ± P families (9 with cleft lip, 26 with cleft lip and alveolus, and 107 with cleft lip and palate; 76 trios and 66 dyads). Three single nucleotide polymorphisms (SNPs: rs3821949, rs12532, and rs4464513) were tested for association with NS-CL ± P in case-parent trios using the transmission disequilibrium test (TDT) and conditional logistic regression models (CLRMs). Minor allele frequency, heterozygosity, χ2 test for Hardy-Weinberg equilibrium, and pairwise linkage disequilibrium (LD) at each SNP were computed. The family- and haplotype-based association test programs were used to perform allelic and genotypic TDTs for individual SNPs and to construct sliding windows of haplotypes. Genotypic odds ratios (GORs) were obtained from CLRMs using R software. Although the family-based TDT indicated a significant association for rs3821949 (P = 0.028), the haplotype analysis did not reveal any significant association with rs3821949, rs12532, or rs4464513. The A allele at rs3821949 was associated with a significantly increased risk of NS-CL ± P (GOR, 1.64; 95% confidence interval, 1.03-2.63; P = 0.038, additive model). A positive association is suggested between MSX1 rs3821949 and NS-CL ± P in the Korean population.
MSX1 SNP; Nonsyndromic Cleft Lip with or without Palate; Korean; Association Analysis
In this paper, we propose a sequential probability ratio test (SPRT) to overcome the problem of limited samples in studies related to complex genetic diseases. The results of this novel approach are compared with those obtained from the traditional transmission disequilibrium test (TDT) on simulated data. Whereas TDT classifies single-nucleotide polymorphisms (SNPs) into only two groups (SNPs associated with the disease and the others), SPRT has the flexibility of assigning SNPs to a third group, that is, those for which there is not yet enough evidence and for which sampling should continue. It is shown that SPRT results in smaller ratios of false positives and negatives, as well as better accuracy and sensitivity values for classifying SNPs when compared with TDT. By using SPRT, data with small sample size become usable for an accurate association analysis.
transmission disequilibrium test; sequential probability ratio test; SNPs; simulation study; family-based association study
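The three-way classification described in the SPRT abstract above can be sketched with Wald's classical sequential probability ratio test applied to transmission counts. This is an assumption-laden illustration rather than the authors' procedure: the abstract does not give the hypotheses or thresholds, so the null transmission probability of 0.5, the alternative `p1`, the error rates `alpha` and `beta`, and the function name `sprt_decision` are all hypothetical choices. In a truly sequential analysis the decision would be re-evaluated as each new family is added; the function below evaluates it for the counts accumulated so far.

```python
import math

def sprt_decision(transmitted, untransmitted, p1=0.6, alpha=0.05, beta=0.05):
    """Wald SPRT of H0: p = 0.5 against H1: p = p1 for allele transmission.

    Returns "associated", "not associated", or "keep sampling".
    """
    if not 0.5 < p1 < 1:
        raise ValueError("p1 must lie strictly between 0.5 and 1")
    # Log-likelihood ratio of H1 versus H0 for the counts observed so far.
    llr = (transmitted * math.log(p1 / 0.5)
           + untransmitted * math.log((1 - p1) / 0.5))
    upper = math.log((1 - beta) / alpha)   # at or above: accept H1
    lower = math.log(beta / (1 - alpha))   # at or below: accept H0
    if llr >= upper:
        return "associated"
    if llr <= lower:
        return "not associated"
    return "keep sampling"

# Hypothetical transmission counts, for illustration only.
for counts in [(60, 40), (90, 60), (75, 75)]:
    print(counts, sprt_decision(*counts))
```

The middle outcome, "keep sampling", corresponds to the third group of SNPs in the abstract: the evidence has crossed neither boundary, so no classification is made yet.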