Copy number variants (CNVs) may play an important part in the development of common birth defects such as oral clefts, and individual patients with multiple birth defects (including clefts) have been shown to carry small and large chromosomal deletions. In this paper we investigate de novo deletions defined as DNA segments missing in an oral cleft proband but present in both unaffected parents. We compare de novo deletion frequencies in children of European ancestry with an isolated, non-syndromic oral cleft to frequencies in children of European ancestry from randomly sampled trios.
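The definition of a de novo deletion used above can be sketched as a simple rule over per-segment copy number calls. This is only an illustration under stated assumptions: the `TrioSegment` record, the autosomal normal copy number of 2, and the example coordinates are all hypothetical and are not taken from the study.

```python
from typing import NamedTuple


class TrioSegment(NamedTuple):
    """Hypothetical record: integer copy number calls for one segment in one trio."""
    chrom: str
    start: int
    end: int
    cn_child: int
    cn_father: int
    cn_mother: int


def is_de_novo_deletion(seg: TrioSegment, normal_cn: int = 2) -> bool:
    """True when the child has lost at least one copy while both parents
    carry at least the normal (assumed autosomal) copy number."""
    return (
        seg.cn_child < normal_cn
        and seg.cn_father >= normal_cn
        and seg.cn_mother >= normal_cn
    )


# Made-up segments, for illustration only.
segments = [
    TrioSegment("chr7", 1_000, 63_000, 1, 2, 2),   # missing in child only
    TrioSegment("chr1", 5_000, 9_000, 1, 1, 2),    # also reduced in father: inherited
    TrioSegment("chr2", 2_000, 4_000, 2, 2, 2),    # no deletion
]
de_novo = [s for s in segments if is_de_novo_deletion(s)]
```

A segment that is also reduced in a parent is treated as inherited and is not flagged.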
We identified a genome-wide significant 62-kilobase (kb) non-coding region on chromosome 7p14.1 where de novo deletions occur more frequently among oral cleft cases than controls. We also observed wider de novo deletions among cleft lip and palate (CLP) cases than among cleft palate (CP) and cleft lip (CL) cases.
This study presents a region where de novo deletions appear to be involved in the etiology of oral clefts, although the underlying biological mechanisms are still unknown. Larger de novo deletions are more likely to interfere with normal craniofacial development and may result in more severe clefts. Study protocol and sample DNA source can severely affect estimates of de novo deletion frequencies. Follow-up studies are needed to further validate these findings and to potentially identify additional structural variants underlying oral clefts.
Oral clefts; DNA copy numbers; de novo deletions; Case-parent trios
Nonsyndromic cleft palate (CP) is one of the most common human birth defects and both genetic and environmental risk factors contribute to its etiology. We conducted a genome-wide association study (GWAS) using 550 CP case-parent trios ascertained in an international consortium. Stratified analysis among trios with different ancestries was performed to test for GxE interactions with common maternal exposures using conditional logistic regression models. While no single nucleotide polymorphism (SNP) achieved genome-wide significance when considered alone, markers in SLC2A9 and the neighboring WDR1 on chromosome 4p16.1 gave suggestive evidence of gene-environment interaction with environmental tobacco smoke (ETS) among 259 Asian trios when the models included a term for GxE interaction. Multiple SNPs in these two genes were associated with increased risk of nonsyndromic CP if the mother was exposed to ETS during the peri-conceptual period (3 months prior to conception through the first trimester). When maternal ETS was considered, fifteen of 135 SNPs mapping to SLC2A9 and 9 of 59 SNPs in WDR1 gave P values approaching genome-wide significance (10−6
Isolated, non-syndromic cleft lip with or without cleft palate (iCL±P) is a common human congenital malformation with a complex and heterogeneous etiology. Genes coding for fibroblast growth factors and their receptors (FGF/FGFR genes) are excellent candidate genes.
We tested single nucleotide polymorphic (SNP) markers in 10 FGF/FGFR genes (including FGFBP1, FGF2, FGF10, FGF18, FGFR1, FGFR2, FGF19, FGF4, FGF3, and FGF9) for genotypic effects, interactions with one another, and with common maternal environmental exposures in 221 Asian and 76 Maryland case-parent trios ascertained through a child with iCL±P.
Both FGFR1 and FGF19 yielded evidence of linkage and association in the transmission disequilibrium test, confirming previous evidence. Haplotypes of three SNPs in FGFR1 were nominally significant among Asian trios. Estimated ORs for individual SNPs and haplotypes of multiple markers in FGF19 ranged between 1.31 and 1.87. We also found suggestive evidence of maternal genotypic effects for markers in FGF2 and FGF10 among Asian trios. Tests for gene-environment (GxE) interaction between markers in FGFR2 and maternal smoking, and between markers in FGFR2 and multivitamin supplementation, each yielded significant evidence of GxE interaction. Tests of gene-gene (GxG) interaction using Cordell's method yielded significant evidence between SNPs in FGF9 and FGF18, which was confirmed in an independent sample of trios from an international consortium.
Our results suggest several genes in the FGF/FGFR family may influence risk to iCL±P through distinct biological mechanisms.
FGF/FGFR; oral clefts; maternal effects; gene-environment interaction; gene-gene interaction
The hypothesis that germ-line polymorphisms in DNA repair genes influence cancer risk has previously been tested primarily on a cancer site-specific basis. The purpose of this study was to test the hypothesis that DNA repair gene allelic variants contribute to globally elevated cancer risk by measuring associations with risk of all cancers that occurred within a population-based cohort. Within the CLUE II cohort study, established in 1989 in Washington County, MD, this study comprised all 3619 cancer cases ascertained through 2007, compared with a sample of 2296 participants with no cancer. Associations were measured between 759 DNA repair gene single nucleotide polymorphisms (SNPs) and risk of all cancers. A SNP in O6-methylguanine-DNA methyltransferase, MGMT, (rs2296675) was significantly associated with overall cancer risk [per minor allele odds ratio (OR) 1.30, 95% confidence interval (CI) 1.19–1.43 and P-value: 4.1 × 10−8]. The association between rs2296675 and cancer risk was stronger among those aged ≤54 years than among those aged ≥55 years at baseline (P-for-interaction = 0.021). ORs were in the direction of increased risk for all 15 categories of malignancies studied (P < 0.0001), ranging from 1.22 (P = 0.42) for ovarian cancer to 2.01 (P = 0.008) for urinary tract cancers; the smallest P-value was for breast cancer (OR 1.45, P = 0.0002). The results indicate that the minor allele of MGMT SNP rs2296675, a common genetic marker with 37% carriers, was significantly associated with increased risk of cancer across multiple tissues. Replication is needed to more definitively determine the scientific and public health significance of this observed association.
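To make the "per minor allele odds ratio with 95% CI" reporting format concrete, the sketch below computes a crude allelic odds ratio from a 2x2 table of allele counts with a Woolf (log-scale) confidence interval. This is a simplified stand-in, not the model used in the study (which is not specified here), and the allele counts are hypothetical.

```python
import math


def allelic_odds_ratio(case_minor, case_major, control_minor, control_major, z=1.96):
    """Crude allelic odds ratio with a Woolf confidence interval.

    Inputs are allele counts (two alleles per person); z=1.96 gives an
    approximate 95% interval.
    """
    odds_ratio = (case_minor * control_major) / (case_major * control_minor)
    se_log_or = math.sqrt(
        1 / case_minor + 1 / case_major + 1 / control_minor + 1 / control_major
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, for illustration only.
or_hat, ci_low, ci_high = allelic_odds_ratio(300, 700, 250, 750)
print(f"OR {or_hat:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

A regression-based estimate would additionally allow adjustment for covariates such as age, which this crude table cannot do.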
Accelerated lung function decline is a key COPD phenotype; however its genetic control remains largely unknown.
We performed a genome-wide association study using the Illumina Human660W-Quad v.1_A BeadChip. Generalized estimating equations were used to assess genetic contributions to lung function decline over a 5-year period in 4,048 European-American Lung Health Study participants with largely mild COPD. Genotype imputation was performed using reference HapMap II data. To validate regions meeting genome-wide significance, replication of top SNPs was attempted in independent cohorts. Three genes (TMEM26, ANK3 and FOXA1) within the regions of interest were selected for tissue expression studies using immunohistochemistry.
Measurements and Main Results
Two intergenic SNPs (rs10761570, rs7911302) on chromosome 10 and one SNP on chromosome 14 (rs177852) met genome-wide significance after Bonferroni correction. Further support for the chromosome 10 region was obtained by imputation, the most significantly associated imputed SNPs (rs10761571, rs7896712) being flanked by observed markers rs10761570 and rs7911302. Results were not replicated in four general population cohorts or a smaller cohort of subjects with moderate to severe COPD; however, we show novel expression of genes near regions of significantly associated SNPs, including TMEM26 and FOXA1 in airway epithelium and lung parenchyma, and ANK3 in alveolar macrophages. Levels of expression were associated with lung function and COPD status.
We identified two novel regions associated with lung function decline in mild COPD. Genes within these regions were expressed in relevant lung cells and their expression related to airflow limitation suggesting they may represent novel candidate genes for COPD susceptibility.
COPD; lung function decline; GWAS; genome wide association; genes; polymorphisms
Genome-wide association studies (GWAS) have identified thousands of DNA loci associated with a variety of traits. Statistical inference is almost always based on single marker hypothesis tests of association and the respective p-values with Bonferroni correction. Since commercially available genomic arrays interrogate hundreds of thousands or even millions of loci simultaneously, many causal yet undetected loci are believed to exist because the conditional power to achieve a genome-wide significance level can be low, in particular for markers with small effect sizes and low minor allele frequencies and in studies with modest sample size. However, the correlation between neighboring markers in the human genome due to linkage disequilibrium (LD), which results in correlated marker test statistics, can be incorporated into multi-marker hypothesis tests, thereby increasing power to detect association. Herein, we establish a theoretical benchmark by quantifying the maximum power achievable for multi-marker tests of association in case-control studies, achievable only when the causal marker is known. Using the fact that genotype correlations within an LD block translate into an asymptotically multivariate normal distribution for score test statistics, we develop a set of weights for the markers that maximize the non-centrality parameter, and assess the relative loss of power for other approaches. We find that the method of Conneely and Boehnke (2007) based on the maximum absolute test statistic observed in an LD block is a practical and powerful method in a variety of settings. We also explore the effect on the power that prior biological or functional knowledge used to narrow down the locus of the causal marker can have, and conclude that this prior knowledge has to be very strong and specific for the power to approach the maximum achievable level, or even beat the power observed for methods such as the one proposed by Conneely and Boehnke (2007).
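The idea of testing the maximum absolute statistic in an LD block against a multivariate normal null can be sketched as follows. This is a Monte Carlo stand-in written for illustration, not the computation used by Conneely and Boehnke; the function names, the correlation matrix, and the observed statistics are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a positive-definite matrix (list of lists)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def max_stat_pvalue(z_obs, corr, n_draws=20000, seed=0):
    """Monte Carlo p-value for max |Z| under a zero-mean multivariate normal
    null with correlation matrix `corr`."""
    rng = random.Random(seed)
    chol = cholesky(corr)
    m = len(corr)
    threshold = max(abs(z) for z in z_obs)
    exceed = 0
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(m)]
        # Correlated draw: row i of the Cholesky factor times the independent normals.
        draw_max = max(
            abs(sum(chol[i][k] * e[k] for k in range(i + 1))) for i in range(m)
        )
        if draw_max >= threshold:
            exceed += 1
    return exceed / n_draws


# Hypothetical block of 5 markers with pairwise correlation 0.9.
m = 5
corr_ld = [[1.0 if i == j else 0.9 for j in range(m)] for i in range(m)]
corr_indep = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
z_obs = [2.5, 0.3, -1.0, 0.8, 0.1]

p_ld = max_stat_pvalue(z_obs, corr_ld)
p_indep = max_stat_pvalue(z_obs, corr_indep)
```

With strongly correlated markers the adjusted p-value is smaller than under independence, which is the gain in power over treating every marker as a separate test.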
genome-wide association studies; linkage disequilibrium; multi-marker tests; multiplicity adjustment; single nucleotide polymorphisms
Admixture is a potential source of confounding in genetic association studies, so it is important to detect and estimate admixture in a sample of unrelated individuals. Populations of African descent in the US and the Caribbean share similar historical backgrounds but the distributions of African admixture may differ. We selected 416 ancestry informative markers (AIMs) to estimate and compare admixture proportions using STRUCTURE in 906 unrelated African Americans (AAs) and 294 Barbadians (ACs) from a study of asthma. This analysis showed AAs on average were 72.5% African, 19.6% European and 8% Asian, while ACs were 77.4% African, 15.9% European, and 6.7% Asian; these proportions differed significantly between the two groups. A principal components analysis based on these AIMs yielded one primary eigenvector that explained 54.04% of the variation and captured a gradient from West African to European admixture. This principal component was highly correlated with African and European ancestry as estimated by STRUCTURE (r² = 0.992 and r² = 0.912, respectively). To investigate other African contributions to African American and Barbadian admixture, we performed PCA on ~14,000 (14k) genome-wide SNPs in AAs, ACs, Yorubans, Luhya and Maasai African groups, and estimated genetic distances (FST). We found AAs and ACs were closest genetically (FST = 0.008), and both were closer to the Yorubans than to the East African populations (Luhya and Maasai). In our sample of individuals of African descent, ~400 well-defined AIMs were just as good for detecting substructure as ~14,000 random SNPs drawn from a genome-wide panel of markers.
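As a rough illustration of the genetic distance reported above, the sketch below computes a heterozygosity-based FST between two populations from allele frequencies at biallelic loci. The estimator used in the study is not stated here, so this particular definition (equal population weights, ratio of sums across loci) and the function name are assumptions.

```python
def fst_two_pops(freqs_a, freqs_b):
    """Heterozygosity-based FST for two equally weighted populations.

    freqs_a and freqs_b hold the frequency of the same allele at each
    biallelic locus. FST = sum(H_T - H_S) / sum(H_T) across loci.
    """
    numerator = 0.0
    denominator = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_bar = (p_a + p_b) / 2
        h_total = 2 * p_bar * (1 - p_bar)                          # pooled heterozygosity
        h_within = (2 * p_a * (1 - p_a) + 2 * p_b * (1 - p_b)) / 2  # mean within-population
        numerator += h_total - h_within
        denominator += h_total
    if denominator == 0:
        raise ValueError("all loci are monomorphic, so FST is undefined")
    return numerator / denominator


# Hypothetical allele frequencies at three loci.
print(round(fst_two_pops([0.40, 0.10, 0.55], [0.60, 0.12, 0.50]), 4))
```

Identical frequencies give 0 and fixed opposite alleles give 1, so small values indicate closely related groups.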
admixture; African Americans; African Caribbeans; African ancestry; genetic distance
A personal history of basal cell carcinoma (BCC) is associated with increased risk of other malignancies, but the reason is unknown. The hedgehog pathway is critical to the etiology of BCC, and is also believed to contribute to susceptibility to other cancers. This study tested the hypothesis that hedgehog pathway and pathway-related gene variants contribute to the increased risk of subsequent cancers among those with a history of BCC.
The study was nested within the ongoing CLUE II cohort study, established in 1989 in Washington County, Maryland, USA. The study consisted of a cancer-free control group (n=2,296) compared to three different groups of cancer cases ascertained through 2007, those diagnosed with: 1) Other (non-BCC) cancer only (n=2,349); 2) BCC only (n=534); and 3) BCC plus other cancer (n=446). The frequencies of variant alleles were compared among these four groups for 20 single nucleotide polymorphisms (SNPs) in 6 hedgehog pathway genes (SHH, IHH, PTCH2, SMO, GLI1, SUFU), and also 22 SNPs in VDR and 8 SNPs in FAS, which have cross-talk with the hedgehog pathway.
Comparing those with both BCC and other cancer versus those with no cancer, no significant associations were observed for any of the hedgehog pathway SNPs, or for the FAS SNPs. One VDR SNP was nominally significantly associated with the BCC cancer-prone phenotype, rs11574085 [per minor allele odds ratio (OR) 1.38, 95% confidence interval (CI) 1.05–1.82; p-value=0.02].
The hedgehog pathway gene SNPs studied, along with the VDR and FAS SNPs studied, are not strongly associated with the BCC cancer-prone phenotype.
skin cancer; genetics; polymorphisms; hedgehog; vitamin D receptor; fas
For unknown reasons, non-melanoma skin cancer (NMSC) is associated with increased risk of other malignancies. Focusing solely on DNA repair or DNA repair-related genes, this study tested the hypothesis that DNA repair gene variants contribute to the increased cancer risk associated with a personal history of NMSC. From the parent CLUE II cohort study, established in 1989 in Washington County, MD, the study consisted of a cancer-free control group (n = 2296) compared with three mutually exclusive groups of cancer cases ascertained through 2007: (i) Other (non-NMSC) cancer only (n = 2349); (ii) NMSC only (n = 694) and (iii) NMSC plus other cancer (n = 577). The frequency of minor alleles in 759 DNA repair gene single nucleotide polymorphisms (SNPs) was compared in these four groups. Comparing those with both NMSC and other cancer versus those with no cancer, 10 SNPs had allelic trend P-values <0.01. The two top-ranked SNPs were both within the thymine DNA glycosylase gene (TDG). One was a non-synonymous coding SNP (rs2888805) [per allele odds ratio (OR) 1.40, 95% confidence interval (CI) 1.16–1.70; P-value = 0.0006] and the other was an intronic SNP in high linkage disequilibrium with rs2888805 (rs4135150). None of the associations had a P-value <6.6 × 10−5, the threshold for statistical significance after correcting for multiple comparisons. The results pinpoint DNA repair genes most likely to contribute to the NMSC cancer-prone phenotype. A promising lead is genetic variants in TDG, important not only in base excision repair but also in regulating the epigenome and gene expression, which may contribute to the NMSC-associated increase in overall cancer risk.
We have conducted the first meta-analyses for nonsyndromic cleft lip with or without cleft palate (NSCL/P) using data from the two largest genome-wide association studies published to date. We confirmed associations with all previously identified loci and identified six additional susceptibility regions (1p36, 2p21, 3p11.1, 8q21.3, 13q31.1 and 15q22). Analysis of phenotypic variability identified the first specific genetic risk factor for NSCLP (nonsyndromic cleft lip plus palate) (rs8001641; PNSCLP = 6.51 × 10−11; homozygote relative risk = 2.41, 95% confidence interval (CI) 1.84–3.16).
In a recent genome wide association study (GWAS) from an international consortium, evidence of linkage and association in chr8q24 was much stronger among non-syndromic cleft lip/palate (CL/P) case-parent trios of European ancestry than among trios of Asian ancestry. We examined marker information content and haplotype diversity across 13 recruitment sites (from Europe, USA and Asia) separately, and conducted principal components analysis (PCA) on parents. As expected, PCA revealed large genetic distances between Europeans and Asians, and a north-south cline from Korea to Singapore in Asia, with Filipino parents forming a somewhat distinct Southeast Asian cluster. Hierarchical clustering of SNP heterozygosity revealed two major clades consistent with PCA results. All genotyped SNPs giving p<10−6 in the allelic TDT showed higher heterozygosity in Europeans than Asians. On average, European ancestry parents had higher haplotype diversity than Asians. Imputing additional variants across chr8q24 increased the strength of statistical evidence among Europeans and also revealed a significant signal among Asians (although it did not reach genome-wide significance). Tests for SNP-population interaction were negative, indicating the lack of strong signal for 8q24 in families of Asian ancestry was not due to any distinct genetic effect, but could simply reflect low power due to lower allele frequencies in Asians.
cleft lip with/without cleft palate; 8q24; genome wide association; imputation
We performed a genome wide association analysis of maternally-mediated genetic effects and parent-of-origin effects on risk of orofacial clefting using over 2,000 case-parent triads collected through an international cleft consortium. We used log-linear regression models to test individual SNPs. For SNPs with a p-value <10−5 for maternal genotypic effects, we also applied a haplotype-based method, TRIMM, to extract potential information from clusters of correlated SNPs. None of the SNPs were significant at the genome wide level. Our results suggest neither maternal genome nor parent of origin effects play major roles in the etiology of orofacial clefting in our sample. This finding is consistent with previous genetic studies and recent population-based cohort studies in Norway and Denmark, which showed no apparent difference between mother-to-offspring and father-to-offspring recurrence of clefting. We, however, cannot completely rule out maternal genome or parent of origin effects as risk factors because very small effects might not be detectable with our sample size, they may influence risk through interactions with environmental exposures or may act through a more complex network of interacting genes. Thus the most promising SNPs identified by this study may still be worth further investigation.
GWAS; CL/P; CP; maternal genes; parent-of-origin; family-based study; association study
This study examined the association between 49 markers in the Runt-related transcription factor 2 (RUNX2) gene and nonsyndromic cleft lip with/without cleft palate (CL/P) among 326 Chinese case-parent trios, while considering gene-environment (GxE) interaction and parent-of-origin effects. Five single-nucleotide polymorphisms (SNPs) showed significant evidence of linkage and association with CL/P and these results were replicated in an independent European sample of 825 case-parent trios. We also report compelling evidence for interaction between markers in RUNX2 and environmental tobacco smoke (ETS). Although most marginal SNP effects (i.e., ignoring maternal exposures) were not statistically significant, eight SNPs were significant when considering possible interaction with ETS when testing for gene (G) and GxE interaction simultaneously or when considering GxE alone. Independent samples from European populations showed consistent evidence of significant GxETS interaction at two SNPs (rs6904353 and rs7748231). Our results suggest genetic variation in RUNX2 may influence susceptibility to CL/P through interacting with ETS.
RUNX2; oral clefts; gene-environment interaction; parent-of-origin effects; imprinting
In studies of case-parent trios, we define copy number variants (CNVs) in the offspring that differ from the parental copy numbers as de novo and of interest for their potential functional role in disease. Among the leading array-based methods for discovery of de novo CNVs in case-parent trios is the joint hidden Markov model (HMM) implemented in the PennCNV software. However, the computational demands of the joint HMM are substantial and the extent to which false positive identifications occur in case-parent trios has not been well described. We evaluate these issues in a study of oral cleft case-parent trios.
Our analysis of the oral cleft trios reveals that genomic waves represent a substantial source of false positive identifications in the joint HMM, despite a wave-correction implementation in PennCNV. In addition, the noise of low-level summaries of relative copy number (log R ratios) is strongly associated with batch and correlated with the frequency of de novo CNV calls. Exploiting the trio design, we propose a univariate statistic for relative copy number referred to as the minimum distance that can reduce technical variation from probe effects and genomic waves. We use circular binary segmentation to segment the minimum distance and maximum a posteriori estimation to infer de novo CNVs from the segmented genome. Compared to PennCNV on simulated data, MinimumDistance identifies fewer false positives on average and is comparable to PennCNV with respect to false negatives. Genomic waves contribute to discordance of PennCNV and MinimumDistance for high coverage de novo calls, while highly concordant calls on chromosome 22 were validated by quantitative PCR. Computationally, MinimumDistance provides a nearly 8-fold increase in speed relative to the joint HMM in a study of oral cleft trios.
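The minimum distance statistic is not defined in this abstract. The sketch below assumes it is, at each marker, the signed offspring-minus-parent difference in log R ratio with the smaller absolute value; that definition, the function names, and the example values are assumptions made for illustration.

```python
def minimum_distance(lrr_child, lrr_father, lrr_mother):
    """Signed offspring-minus-parent log R ratio difference with the
    smaller absolute value (assumed definition)."""
    d_father = lrr_child - lrr_father
    d_mother = lrr_child - lrr_mother
    return d_father if abs(d_father) <= abs(d_mother) else d_mother


def minimum_distance_track(child, father, mother):
    """Apply the statistic marker by marker along aligned log R ratio tracks."""
    return [minimum_distance(c, f, m) for c, f, m in zip(child, father, mother)]


# Hypothetical log R ratios at four consecutive markers.
child = [0.02, -0.60, -0.55, 0.01]
father = [0.00, 0.03, -0.01, 0.02]
mother = [0.05, -0.02, 0.04, -0.03]
track = minimum_distance_track(child, father, mother)
```

Under this assumed definition, a deletion shared with either parent yields a value near zero, so only departures from both parents stand out before segmentation. Marker-specific effects common to all three samples would also cancel in the difference.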
Our results indicate that batch effects and genomic waves are important considerations for case-parent studies of de novo CNV, and that the minimum distance is an effective statistic for reducing technical variation contributing to false de novo discoveries. Coupled with segmentation and maximum a posteriori estimation, our algorithm compares favorably to the joint HMM with MinimumDistance being much faster.
Trios; Oral cleft; Copy number variants; de novo; High-throughput arrays; Segmentation; batch effects; Genomic waves
Clonal mosaicism for large chromosomal anomalies (duplications, deletions and uniparental disomy) was detected using SNP microarray data from over 50,000 subjects recruited for genome-wide association studies. This detection method requires a relatively high frequency of cells (>5–10%) with the same abnormal karyotype (presumably of clonal origin) in the presence of normal cells. The frequency of detectable clonal mosaicism in peripheral blood is low (<0.5%) from birth until 50 years of age, after which it rises rapidly to 2–3% in the elderly. Many of the mosaic anomalies are characteristic of those found in hematological cancers and identify common deleted regions that pinpoint the locations of genes previously associated with hematological cancers. Although only 3% of subjects with detectable clonal mosaicism had any record of hematological cancer prior to DNA sampling, those without a prior diagnosis have an estimated 10-fold higher risk of a subsequent hematological cancer (95% confidence interval = 6–18).
Nucleotide excision repair (NER) is responsible for protecting DNA in skin cells against ultraviolet radiation-induced damage. Using a candidate pathway approach, a matched case-control study nested within a prospective, community-based cohort was carried out to test the hypothesis that single nucleotide polymorphisms (SNPs) in NER genes are associated with susceptibility to non-melanoma skin cancer (NMSC). Histologically-confirmed cases of NMSC (n=900) were matched to controls (n=900) on age, gender, and skin type. Associations were measured between NMSC and 221 SNPs in 26 NER genes. Using the additive model, two tightly linked functional SNPs in ERCC6 were significantly associated with increased risk of NMSC: rs2228527 (odds ratio (OR) 1.57, 95% confidence interval (CI) 1.20 – 2.05), and rs2228529 (OR 1.57, 95% CI 1.20 – 2.05). These associations were confined to basal cell carcinoma of the skin (BCC) (rs2228529, OR 1.78, 95% CI 1.30 – 2.44; rs2228527 OR 1.78, 95% CI 1.31 – 2.43). These hypothesis-generating findings suggest functional variants in ERCC6 may be associated with an increased risk of NMSC that may be specific to BCC.
Case–parent trio studies concerned with children affected by a disease and their parents aim to detect single nucleotide polymorphisms (SNPs) showing a preferential transmission of alleles from the parents to their affected offspring. A popular statistical test for detecting such SNPs associated with disease in this study design is the genotypic transmission/disequilibrium test (gTDT) based on a conditional logistic regression model, which usually needs to be fitted by an iterative procedure. In this article, we derive exact closed-form solutions for the parameter estimates of the conditional logistic regression models when testing for an additive, a dominant, or a recessive effect of a SNP, and show that such analytic parameter estimates also exist when considering gene–environment interactions with binary environmental variables. Because the genetic model underlying the association between a SNP and a disease is typically unknown, it might further be beneficial to use the maximum over the gTDT statistics for the possible effects of a SNP as test statistic. We therefore propose a procedure enabling a fast computation of the test statistic and the permutation-based p-value of this MAX gTDT. All these methods are applied to whole-genome scans of the case–parent trios from the International Cleft Consortium. These applications show our procedures dramatically reduce the required computing time compared to the conventional iterative methods allowing, for example, the analysis of hundreds of thousands of SNPs in a few minutes instead of several hours.
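The notion of preferential transmission can be illustrated with the simpler allelic TDT, which compares how often heterozygous parents transmit versus do not transmit the minor allele. This is not the genotypic TDT or the closed-form estimates derived in the article; it is a minimal sketch with hypothetical function names and made-up trios.

```python
import math


def count_transmissions(trios):
    """Count minor-allele transmissions (b) and non-transmissions (c)
    from heterozygous parents.

    Each trio is (father, mother, child), coded as minor allele counts 0, 1 or 2.
    """
    b = c = 0
    for father, mother, child in trios:
        n_het = (father == 1) + (mother == 1)
        n_hom_minor = (father == 2) + (mother == 2)
        # Minor alleles the child received from heterozygous parents.
        from_het = child - n_hom_minor
        if not 0 <= from_het <= n_het:
            raise ValueError(f"Mendelian inconsistency in trio {(father, mother, child)}")
        b += from_het
        c += n_het - from_het
    return b, c


def allelic_tdt(b, c):
    """McNemar-type TDT statistic and its 1-df chi-square p-value."""
    if b + c == 0:
        raise ValueError("no heterozygous parents, so the test is undefined")
    chi2 = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Made-up trios: 40 heterozygous fathers, 30 of whom transmit the minor allele.
trios = [(1, 0, 1)] * 30 + [(1, 0, 0)] * 10
b, c = count_transmissions(trios)
chi2, p_value = allelic_tdt(b, c)
```

Because only transmissions within families are compared, the test does not depend on allele frequencies in an external control group.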
Conditional logistic regression; Family-based design; Genome-wide association studies; Genotypic transmission/disequilibrium test; International Cleft Consortium; MAX test
Long chain polyunsaturated fatty acids (LC-PUFAs) are essential for brain structure, development, and function, and adequate dietary quantities of LC-PUFAs are thought to have been necessary for both brain expansion and the increase in brain complexity observed during modern human evolution. Previous studies conducted in largely European populations suggest that humans have limited capacity to synthesize brain LC-PUFAs such as docosahexaenoic acid (DHA) from plant-based medium chain (MC) PUFAs due to limited desaturase activity. Population-based differences in LC-PUFA levels and their product-to-substrate ratios can, in part, be explained by polymorphisms in the fatty acid desaturase (FADS) gene cluster, which have been associated with increased conversion of MC-PUFAs to LC-PUFAs. Here, we show evidence that these high efficiency converter alleles in the FADS gene cluster were likely driven to near fixation in African populations by positive selection ∼85 kya. We hypothesize that selection at FADS variants, which increase LC-PUFA synthesis from plant-based MC-PUFAs, played an important role in allowing African populations obligatorily tethered to marine sources for LC-PUFAs in isolated geographic regions, to rapidly expand throughout the African continent 60–80 kya.
Non-syndromic cleft palate (CP) is a common birth defect with a complex and heterogeneous etiology involving both genetic and environmental risk factors. We conducted a genome wide association study (GWAS) using 550 case-parent trios, ascertained through a CP case collected in an international consortium. Family based association tests of single nucleotide polymorphisms (SNP) and three common maternal exposures (maternal smoking, alcohol consumption and multivitamin supplementation) were used in a combined 2 df test for gene (G) and gene-environment (G×E) interaction simultaneously, plus a separate 1 df test for G×E interaction alone. Conditional logistic regression models were used to estimate effects on risk to exposed and unexposed children. While no SNP achieved genome wide significance when considered alone, markers in several genes attained or approached genome wide significance when G×E interaction was included. Among these, MLLT3 and SMC2 on chromosome 9 showed multiple SNPs resulting in increased risk if the mother consumed alcohol during the peri-conceptual period (3 months prior to conception through the first trimester). TBK1 on chr. 12 and ZNF236 on chr. 18 showed multiple SNPs associated with higher risk of CP in the presence of maternal smoking. Additional evidence of reduced risk due to G×E interaction in the presence of multivitamin supplementation was observed for SNPs in BAALC on chr. 8. These results emphasize the need to consider G×E interaction when searching for genes influencing risk to complex and heterogeneous disorders, such as non-syndromic CP.
The receptor tyrosine kinase-like orphan receptor 2 (ROR2) gene has recently been shown to play important roles in palatal development in animal models and resides in a chromosomal region linked to non-syndromic cleft lip with or without cleft palate in humans. The aim of this study was to investigate the possible association between the ROR2 gene and non-syndromic oral clefts.
Here we tested 38 eligible single-nucleotide polymorphisms (SNPs) in the ROR2 gene in 297 non-syndromic cleft lip with or without cleft palate and in 82 non-syndromic cleft palate case-parent trios recruited from Asia and Maryland. The Family Based Association Test was used to test for deviation from Mendelian inheritance. PLINK software was used to test for potential parent-of-origin effects. Possible maternally mediated in utero effects were assessed using the TRIad Multi-Marker approach under an assumption of mating symmetry in the population.
Significant evidence of linkage and association was shown for 3 SNPs (rs7858435, rs10820914 and rs3905385) among 57 Asian non-syndromic cleft palate trios in Family Based Association Tests. P values for these 3 SNPs were 0.000068, 0.000115 and 0.000464, respectively, all below the significance level adjusted by strict Bonferroni correction (0.05/38 = 0.0013). The corresponding odds ratios for the risk allele were 3.42 (1.80–6.50), 3.45 (1.75–6.67) and 2.94 (1.56–5.56), respectively. Statistical evidence of linkage and association was not shown for study groups other than non-syndromic cleft palate. No evidence of parent-of-origin or maternal genotypic effects was shown for any of the ROR2 markers in any study group.
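The Bonferroni arithmetic above can be checked directly. The sketch uses only the numbers reported in the text (38 SNPs, a 0.05 family-wise level, and the three P values); the variable names are illustrative.

```python
alpha = 0.05
n_tests = 38
threshold = alpha / n_tests  # Bonferroni-adjusted per-SNP significance level

# P values as reported for the three SNPs.
p_values = {
    "rs7858435": 0.000068,
    "rs10820914": 0.000115,
    "rs3905385": 0.000464,
}

# Keep the SNPs whose P value falls below the adjusted threshold.
significant = {snp: p for snp, p in p_values.items() if p < threshold}
print(f"threshold = {threshold:.4f}; significant SNPs: {sorted(significant)}")
```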
Our results provided evidence of linkage and association between the ROR2 gene and a gene controlling risk to non-syndromic cleft palate.
receptor tyrosine kinase-like orphan receptor 2; cleft lip; cleft palate; association; transmission disequilibrium test
Genotyping platforms such as Affymetrix can be used to assess genotype-phenotype as well as copy number-phenotype associations at millions of markers. While genotyping algorithms are largely concordant when assessed on HapMap samples, tools to assess copy number changes are more variable and often discordant. One explanation for the discordance is that copy number estimates are susceptible to systematic differences between groups of samples that were processed at different times or by different labs. Analysis algorithms that do not adjust for batch effects are prone to spurious measures of association. The R package crlmm implements a multilevel model that adjusts for batch effects and provides allele-specific estimates of copy number. This paper illustrates a workflow for the estimation of allele-specific copy number and integration of the marker-level estimates with complementary Bioconductor software for inferring regions of copy number gain or loss. All analyses are performed in the statistical environment R.
copy number; batch effects; robust; multilevel model; high-throughput; oligonucleotide array
Motivation: Changes in the copy number of chromosomal DNA segments [copy number variants (CNVs)] have been implicated in human variation, heritable diseases and cancers. Microarray-based platforms are the current established technology of choice for studies reporting these discoveries and constitute the benchmark against which emergent sequence-based approaches will be evaluated. Research that depends on CNV analysis is rapidly increasing, and systematic platform assessments that distinguish strengths and weaknesses are needed to guide informed choice.
Results: We evaluated the sensitivity and specificity of six platforms, provided by four leading vendors, using a spike-in experiment. NimbleGen and Agilent platforms outperformed Illumina and Affymetrix in accuracy and precision of copy number dosage estimates. However, Illumina and Affymetrix algorithms that leverage single nucleotide polymorphism (SNP) information make up for this disadvantage and perform well at variant detection. Overall, the NimbleGen 2.1M platform outperformed others, but only with the use of an alternative data analysis pipeline to the one offered by the manufacturer.
Availability: The data is available from http://rafalab.jhsph.edu/cnvcomp/.
Supplementary information: Supplementary data are available at Bioinformatics online.