Obesity is an increasingly common disorder that predisposes to several medical conditions, including type 2 diabetes. We investigated whether large and rare copy-number variations (CNVs) differentiate moderate to extreme obesity from never-overweight control subjects.
RESEARCH DESIGN AND METHODS
Using single nucleotide polymorphism (SNP) arrays, we performed a genome-wide CNV survey on 430 obese case subjects (BMI >35 kg/m2) and 379 never-overweight control subjects (BMI <25 kg/m2). All subjects were of European ancestry and were genotyped on the Illumina HumanHap550 arrays with ∼550,000 SNP markers. The CNV calls were generated by PennCNV software.
CNVs >1 Mb were overrepresented in case versus control subjects (odds ratio [OR] = 1.5 [95% CI 0.5–5]), and CNVs >2 Mb were present in 1.3% of the case subjects but were absent in control subjects (OR = infinity [95% CI 1.2–infinity]). When focusing on rare deletions that disrupt genes, even more pronounced effect sizes were observed (OR = 2.7 [95% CI 0.5–27.1] for CNVs >1 Mb). Interestingly, obese case subjects who carry these large CNVs have moderately high BMI and do not appear to be extreme cases. Several CNVs disrupt known candidate genes for obesity, such as a 3.3-Mb deletion disrupting NAP1L5 and a 2.1-Mb deletion disrupting UCP1 and IL15.
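As a minimal sketch of how an odds ratio of this kind behaves, the following computes the sample OR from carrier counts and shows why it becomes infinite when no control subject carries a CNV. The carrier counts in the usage note are hypothetical, since the abstract reports only percentages, and the function names are illustrative. The abstract does not state how its confidence intervals were obtained; Woolf's log-based approximation is swapped in here for illustration and is undefined when any cell of the table is zero.

```python
import math

def two_by_two(case_carriers, case_total, control_carriers, control_total):
    """Return the 2x2 cell counts (a, b, c, d) for carriers vs. non-carriers."""
    a = case_carriers                      # cases carrying the CNV
    b = case_total - case_carriers         # cases not carrying it
    c = control_carriers                   # controls carrying the CNV
    d = control_total - control_carriers   # controls not carrying it
    return a, b, c, d

def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Sample odds ratio; infinite when the denominator b*c is zero."""
    a, b, c, d = two_by_two(case_carriers, case_total, control_carriers, control_total)
    if b * c == 0:
        return math.inf if a * d > 0 else math.nan
    return (a * d) / (b * c)

def woolf_ci(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Approximate 95% CI on the log-odds scale; None if any cell is zero."""
    a, b, c, d = two_by_two(case_carriers, case_total, control_carriers, control_total)
    if min(a, b, c, d) == 0:
        return None
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)
```

With hypothetical counts, `odds_ratio(6, 430, 0, 379)` is infinite and `woolf_ci(6, 430, 0, 379)` returns `None`, which is why a finite lower bound for such a table has to come from a method other than this approximation.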
Our results suggest that large CNVs, especially rare deletions, confer risk of obesity in patients with moderate obesity and that genes impacted by large CNVs represent intriguing candidates for obesity that warrant further study.
Copy number variations (CNVs) are genomic structural variants that are found in healthy populations and have been observed to be associated with disease susceptibility. Existing methods for CNV detection are often performed on a sample-by-sample basis, which is not ideal for large datasets where common CNVs must be estimated by comparing the frequency of CNVs in the individual samples. Here we describe a simple and novel approach to locate genome-wide CNVs common to a specific population, using human ancestry as the phenotype.
We utilized our previously published Genome Alteration Detection Analysis (GADA) algorithm to identify common ancestry CNVs (caCNVs) and built a caCNV model to predict population structure. We identified a 73 caCNV signature using a training set of 225 healthy individuals from European, Asian, and African ancestry. The signature was validated on an independent test set of 300 individuals with similar ancestral background. The error rate in predicting ancestry in this test set was 2% using the 73 caCNV signature. Among the caCNVs identified, several were previously confirmed experimentally to vary by ancestry. Our signature also contains a caCNV region with a single microRNA (MIR270), which represents the first reported variation of microRNA by ancestry.
We developed a new methodology to identify common CNVs and demonstrated its performance by building a caCNV signature to predict human ancestry with high accuracy. The utility of our approach could be extended to large case–control studies to identify CNV signatures for other phenotypes such as disease susceptibility and drug response.
Bipolar disorder (BPD) is a common psychiatric illness with a complex mode of inheritance. Besides traditional linkage and association studies, which require large sample sizes, analysis of common and rare chromosomal copy number variants (CNVs) in extended families may provide novel insights into the genetic susceptibility of complex disorders. Using the Illumina HumanHap550 BeadChip with over 550,000 SNP markers, we genotyped 46 individuals in a three-generation Old Order Amish pedigree with 19 affected (16 BPD and three major depression) and 27 unaffected subjects. Using the PennCNV algorithm, we identified 50 CNV regions that ranged in size from 12 to 885 kb and encompassed at least 10 single nucleotide polymorphisms (SNPs). Of 19 well characterized CNV regions that were available for combined genotype-expression analysis 11 (58%) were associated with expression changes of genes within, partially within or near these CNV regions in fibroblasts or lymphoblastoid cell lines at a nominal P value <0.05. To further investigate the mode of inheritance of CNVs in the large pedigree, we analyzed a set of four CNVs, located at 6q27, 9q21.11, 12p13.31 and 15q11, all of which were enriched in subjects with affective disorders. We additionally show that these variants affect the expression of neuronal genes within or near the rearrangement. Our analysis suggests that family based studies of the combined effect of common and rare CNVs at many loci may represent a useful approach in the genetic analysis of disease susceptibility of mental disorders.
The genome-wide presence of copy number variations (CNVs), which has been shown to affect the expression and function of genes, has recently been suggested to confer risk for various human disorders, including amyotrophic lateral sclerosis (ALS). We performed a genome-wide CNV analysis using the PennCNV tool and 733K GWAS data from 117 Turkish ALS patients and 109 matched healthy controls. Case-control association analyses implicated both common (>5%) and rare (<5%) CNVs in the Turkish population. In the framework of this study, we identified several common and rare loci that may have an impact on ALS pathogenesis. None of the associated CNVs has been implicated in ALS before, but some have been reported in different types of cancer and in autism. The most significant associations were observed for 41 kb and 15 kb intergenic heterozygous deletions (Chr11: 50,545,009–50,586,426 and Chr19: 20,860,930–20,875,787), both contributing to increased risk for ALS. CNVs in coding regions of the MAP4K3, HLA-B, EPHA3 and DPYD genes were detected; however, after validation by Log R Ratio (LRR) values and TaqMan CNV genotyping, only the EPHA3 deletion remained as a potential protective factor for ALS (p = 0.0065024). Based on the knowledge that EPHA4 has previously been shown to rescue SOD1 transgenic mice from the ALS phenotype and prolong survival, EPHA3 may be a promising candidate for therapeutic interventions.
Studies that analyzed single nucleotide polymorphisms (SNP) in various genes have shown that genetic factors are strongly associated with age-related macular degeneration (AMD) susceptibility. Copy number variation (CNV) may be an additional type of genetic variation that contributes to AMD pathogenesis. This study investigated CNV in 4 AMD-relevant genes in Korean AMD patients and control subjects.
Four CNV candidate regions located in AMD-relevant genes (VEGFA, ARMS2/HTRA1, CFH and VLDLR), were selected based on the outcomes of our previous study which elucidated common CNVs in the Asian populations. Real-time PCR based TaqMan Copy Number Assays were performed on CNV candidates in 273 AMD patients and 257 control subjects.
The predicted copy number (PCN, 0, 1, 2 or 3+) of each region was called using the CopyCaller program. All candidate genes except ARMS2/HTRA1 showed CNV in at least one individual, in which losses of VEGFA and VLDLR represent novel findings in the Asian population. When the frequencies of PCN were compared, only the gain in VLDLR showed significant differences between AMD patients and control subjects (p = 0.025). Comparisons of the raw copy values (RCV) revealed that 3 of 4 candidate genes showed significant differences (2.03 vs. 1.92 for VEGFA, p<0.01; 2.01 vs. 1.97 for CFH, p<0.01; 1.97 vs. 2.01, p<0.01 for ARMS2/HTRA1).
CNVs located in AMD-relevant genes may be associated with AMD susceptibility. Further investigations encompassing larger patient cohorts are needed to elucidate the role of CNV in AMD pathogenesis.
Structural variations such as copy number variants (CNV) influence the expression of different phenotypic traits. Algorithms to identify CNVs through SNP-array platforms are available. The ability to evaluate well-characterized CNVs such as GSTM1 (1p13.3) deletion provides an important opportunity to assess their performance.
773 cases and 759 controls from the SBC/EPICURO Study were genotyped in the GSTM1 region using TaqMan, Multiplex Ligation-dependent Probe Amplification (MLPA), and Illumina Infinium 1 M SNP-array platforms. CNV callings provided by TaqMan and MLPA were highly concordant and replicated the association between GSTM1 and bladder cancer. This was not the case when CNVs were called using Illumina 1 M data through available algorithms since no deletion was detected across the study samples. In contrast, when the Log R Ratio (LRR) was used as a continuous measure for the 5 probes contained in this locus, we were able to detect their association with bladder cancer using simple regression models or more sophisticated methods such as the ones implemented in the CNVtools package.
This study highlights an important limitation in the CNV calling from SNP-array data in regions of common aberrations and suggests that there may be added advantage for using LRR as a continuous measure in association tests rather than relying on calling algorithms.
Keywords: Bladder cancer risk; Glutathione S-transferase mu 1 (GSTM1); Copy number variation (CNV); SNP-array
A major motivation for seeking disease-associated genetic variation is to identify novel risk processes. Although rare copy number variants (CNVs) appear to contribute to attention deficit hyperactivity disorder (ADHD), common risk variants (single-nucleotide polymorphisms [SNPs]) have not yet been detected using genome-wide association studies (GWAS). This raises the concern as to whether future larger-scale, adequately powered GWAS will be worthwhile. The authors undertook a GWAS of ADHD and examined whether associated SNPs, including those below conventional levels of significance, influenced the same biological pathways affected by CNVs.
The authors analyzed genome-wide SNP frequencies in 727 children with ADHD and 5,081 comparison subjects. The gene sets that were enriched in a pathway analysis of the GWAS data (the top 5% of SNPs) were tested for an excess of genes spanned by large, rare CNVs in the children with ADHD.
No SNP achieved genome-wide significance levels. As previously reported in a subsample of the present study, large, rare CNVs were significantly more common in case subjects than comparison subjects. Thirteen biological pathways enriched for SNP association significantly overlapped with those enriched for rare CNVs. These included cholesterol-related and CNS development pathways. At the level of individual genes, CHRNA7, which encodes a nicotinic receptor subunit previously implicated in neuropsychiatric disorders, was affected by six large duplications in case subjects (none in comparison subjects), and SNPs in the gene had a gene-wide p value of 0.0002 for association in the GWAS.
Both common and rare genetic variants appear to be relevant to ADHD and index-shared biological pathways.
Alcohol dependence (AD) is a complex disorder characterized by psychiatric and physiological dependence on alcohol. AD is reflected by regular alcohol drinking, which is highly heritable. In this study, to identify susceptibility genes associated with alcohol drinking, we performed a genome-wide association study of copy number variants (CNVs) in 2,286 Caucasian subjects using the Affymetrix SNP6.0 genotyping array. We replicated our findings in 1,627 Chinese subjects with the same genotyping array. We identified two CNVs, CNV207 (combined p-value 1.91E-03) and CNV1836 (combined p-value 3.05E-03), that were associated with alcohol drinking. CNV207 and CNV1836 are located downstream of the genes LTBP1 (870 kb) and FGD4 (400 kb), respectively. LTBP1, by interacting with TGFB1, may down-regulate enzymes directly participating in alcohol metabolism. FGD4 plays a role in clustering and trafficking of the GABAA receptor and subsequently influences alcohol drinking by activating CDC42. Our results provide suggestive evidence that the newly identified CNV regions and relevant genes may contribute to the genetic mechanism of alcohol dependence.
Variation in human intelligence is approximately 50% heritable, but understanding of the genes involved is limited. Several forms of genetic variation remain under-studied in relation to intelligence, one of which is copy number variation (CNV). Using single-nucleotide polymorphism (SNP) -based microarrays, we genotyped CNVs genome-wide in a birth cohort of 723 New Zealanders, and correlated them with four intelligence-related phenotypes. We found no significant association for any common CNV after false discovery correction, which is consistent with previous work. In contrast to a previous study, however, we found no effect on any cognitive measure of rare CNV burden, defined as total number of bases inserted or deleted in CNVs rarer than 5%. We discuss possible reasons for this failure to replicate, including interaction between CNV and aging in determining the effects of rare CNVs. While our results suggest that no CNV assayable by SNP chips contributes more than a very small amount to variation in human intelligence, it remains possible that common CNVs in segmental duplication arrays, which are not well covered by SNP chips, are important contributors.
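The burden definition above (total bases in CNVs rarer than 5%) can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: each call is assumed to arrive already assigned to a CNV locus, the `(sample, locus, length_bp)` tuple layout and the function name are hypothetical, and frequency is taken as the fraction of genotyped individuals carrying a call at that locus.

```python
from collections import defaultdict

def rare_cnv_burden(calls, n_samples, max_freq=0.05):
    """Sum CNV lengths per sample over loci whose carrier frequency is below max_freq.

    calls: iterable of (sample, locus, length_bp) tuples (hypothetical layout).
    n_samples: number of genotyped individuals, including those with no calls.
    """
    calls = list(calls)

    # Count distinct carriers per locus to estimate its frequency.
    carriers = defaultdict(set)
    for sample, locus, _length in calls:
        carriers[locus].add(sample)
    rare_loci = {
        locus for locus, samples in carriers.items()
        if len(samples) / n_samples < max_freq
    }

    # Add up the bases gained or lost at rare loci for each sample.
    burden = defaultdict(int)
    for sample, locus, length in calls:
        if locus in rare_loci:
            burden[sample] += length
    return dict(burden)
```

Samples absent from the returned dictionary have a burden of zero. Whether gains and losses should be summed together or analyzed separately is a design choice that this sketch leaves to the caller.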
Aging is a biological process strongly determined by genetics. However, only a few single nucleotide polymorphisms (SNPs) have been reported to be consistently associated with aging. While investigating whether copy number variations (CNVs) could fill this gap, we focused on CNVs that have not been studied in previous SNP-based searches via tagging SNPs.
TaqMan qPCR assays were developed to quantify 20 common CNVs in 222 senior American Caucasians in order to reveal possible association with longevity. The replication study was comprised of 1283 community-dwelling senior European Caucasians. Replicated CNVs were further investigated for association with healthy aging and aging-related diseases, while association with longevity was additionally tested in Caenorhabditis elegans.
In the discovery study of seniors aged ≥80 vs. <80 years, a homozygous intronic CNV deletion in the CNTNAP4 gene was inversely associated with survival to the age of 80 (OR=0.51, 95%CI 0.29-0.87, p=0.015 before correction for multiple testing). After stratification by sex, the association remained significant in females (OR=0.41, 95%CI 0.21-0.77, p=0.007), but not in males (OR=0.97, 95%CI 0.33-2.79, p=1). The finding was validated in a replication study (OR=0.66, 95%CI 0.48-0.90, p=0.011 for females). The CNTNAP4 association with longevity was supported by a marked 25% lifespan change in C. elegans after knocking down the orthologous gene. An inverse association of the CNV del/del variant with female healthy aging was observed (OR=0.39, 95%CI 0.19-0.76, p=0.006). A corresponding positive association with aging-related diseases was revealed for cognitive impairment (OR=2.17, 95%CI 1.11-4.22, p=0.024) and, in independent studies, for Alzheimer’s (OR=4.07, 95%CI 1.17-14.14, p=0.036) and Parkinson’s (OR=1.59, 95%CI 1.03-2.42, p=0.041) diseases.
This is the first demonstration for association of the CNTNAP4 gene and one of its intronic CNV polymorphisms with aging. Association with particular aging-related diseases awaits replication and independent validation.
Accurate and efficient genome-wide detection of copy number variants (CNVs) is essential for understanding human genomic variation, genome-wide CNV association type studies, cytogenetics research and diagnostics, and independent validation of CNVs identified from sequencing based technologies. Numerous, array-based platforms for CNV detection exist utilizing array Comparative Genome Hybridization (aCGH), Single Nucleotide Polymorphism (SNP) genotyping or both. We have quantitatively assessed the abilities of twelve leading genome-wide CNV detection platforms to accurately detect Gold Standard sets of CNVs in the genome of HapMap CEU sample NA12878, and found significant differences in performance. The technologies analyzed were the NimbleGen 4.2 M, 2.1 M and 3×720 K Whole Genome and CNV focused arrays, the Agilent 1×1 M CGH and High Resolution and 2×400 K CNV and SNP+CGH arrays, the Illumina Human Omni1Quad array and the Affymetrix SNP 6.0 array. The Gold Standards used were a 1000 Genomes Project sequencing-based set of 3997 validated CNVs and an ultra high-resolution aCGH-based set of 756 validated CNVs. We found that sensitivity, total number, size range and breakpoint resolution of CNV calls were highest for CNV focused arrays. Our results are important for cost effective CNV detection and validation for both basic and clinical applications.
DNA copy number variations (CNVs) are an important component of genetic variation, affecting a greater fraction of the genome than single nucleotide polymorphisms (SNPs). The advent of high-resolution SNP arrays has made it possible to identify CNVs. Characterization of widespread constitutional (germline) CNVs has provided insight into their role in susceptibility to a wide spectrum of diseases, and somatic CNVs can be used to identify regions of the genome involved in disease phenotypes. The role of CNVs as risk factors for cancer is currently underappreciated. However, the genomic instability and structural dynamism that characterize cancer cells would seem to make this form of genetic variation particularly intriguing to study in cancer. Here, we provide a detailed overview of the current understanding of the CNVs that arise in the human genome and explore the emerging literature that reveals associations of both constitutional and somatic CNVs with a wide variety of human cancers.
Copy number variations (CNVs), a major source of human genetic polymorphism, have been suggested to have an important role in genetic susceptibility to common diseases such as cancer, immune diseases and neurological disorders. Nasopharyngeal carcinoma (NPC) is a multifactorial tumor closely associated with genetic background and with a male preponderance over female (3:1). Previous genome-wide association studies have identified single-nucleotide polymorphisms (SNPs) that are associated with NPC susceptibility. Here, we sought to explore the possible association of CNVs with NPC predisposition. Utilizing genome-wide SNP-based arrays and five CNV-prediction algorithms, we identified eight regions with CNV that were significantly overrepresented in NPC patients compared with healthy controls. These CNVs included six deletions (on chromosomes 3, 6, 7, 8 and 19), and two duplications (on chromosomes 7 and 12). Among them, the CNV located at chromosome 6p21.3, with single-copy deletion of the MICA and HCP5 genes, showed the highest association with NPC. Interestingly, it was more specifically associated with an increased NPC risk among males. This gender-specific association was replicated in an independent case–control sample using a self-established deletion-specific polymerase chain reaction strategy. To the best of our knowledge, this is the first study to explore the role of constitutional CNVs in NPC, using a genome-wide platform. Moreover, we identified eight novel candidate regions with CNV that merit future investigation, and our results suggest that similar to neuroblastoma and prostate cancer, genetic structural variations might contribute to NPC predisposition.
Extensive studies are currently being performed to associate disease susceptibility with one form of genetic variation, namely single nucleotide polymorphisms (SNPs). In recent years another type of common genetic variation has been characterised, namely structural variation, including copy number variations (CNVs). To determine the overall contribution of CNVs to complex phenotypes we have performed association analyses of expression levels of 14,925 transcripts with SNPs and CNVs in individuals who are part of the International HapMap project. SNPs and CNVs captured 83.6% and 17.7% of the total detected genetic variation in gene expression, respectively, but the signals from the two types of variation had little overlap. Interrogation of the genome for both types of variants may be an effective way to elucidate the causes of complex phenotypes and disease in humans.
The detection of copy number variants (CNVs) and the results of CNV-disease association studies rely on how CNVs are defined, and because array-based technologies can only infer CNVs, CNV-calling algorithms can produce vastly different findings. Several authors have noted the large-scale variability between CNV-detection methods, as well as the substantial false positive and false negative rates associated with those methods. In this study, we use variations of four common algorithms for CNV detection (PennCNV, QuantiSNP, HMMSeg, and cnvPartition) and two definitions of overlap (any overlap and an overlap of at least 40% of the smaller CNV) to illustrate the effects of varying algorithms and definitions of overlap on CNV discovery.
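The two overlap definitions named above can be written down in a few lines. This is a minimal sketch: the half-open `(start, end)` coordinate convention, the restriction to intervals on the same chromosome, and the function names are assumptions made for illustration.

```python
def overlap_bp(a, b):
    """Number of bases shared by two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def any_overlap(a, b):
    """Definition 1: the two CNVs share at least one base."""
    return overlap_bp(a, b) > 0

def overlaps_smaller_by(a, b, min_fraction=0.40):
    """Definition 2: the shared bases cover at least min_fraction of the smaller CNV."""
    smaller = min(a[1] - a[0], b[1] - b[0])
    if smaller <= 0:
        return False
    return overlap_bp(a, b) / smaller >= min_fraction
```

For example, the CNVs `(100, 200)` and `(180, 400)` share 20 bases, which is 20% of the smaller one, so they match under the first definition but not under the second. This is one way the same set of calls can yield different counts of concordant or novel CNVs.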
Methodology and Principal Findings
We used a 56 K Illumina genotyping array enriched for CNV regions to generate hybridization intensities and allele frequencies for 48 Caucasian schizophrenia cases and 48 age-, ethnicity-, and gender-matched control subjects. No algorithm found a difference in CNV burden between the two groups. However, the total number of CNVs called ranged from 102 to 3,765 across algorithms. The mean CNV size ranged from 46 kb to 787 kb, and the average number of CNVs per subject ranged from 1 to 39. The number of novel CNVs not previously reported in normal subjects ranged from 0 to 212.
Conclusions and Significance
Motivated by the availability of multiple publicly available genome-wide SNP arrays, investigators are conducting numerous analyses to identify putative additional CNVs in complex genetic disorders. However, the number of CNVs identified in array-based studies, and whether these CNVs are novel or valid, will depend on the algorithm(s) used. Thus, given the variety of methods used, there will be many false positives and false negatives. Both guidelines for the identification of CNVs inferred from high-density arrays and the establishment of a gold standard for validation of CNVs are needed.
Genome-wide association studies (GWAS) based on single nucleotide polymorphisms (SNPs) revolutionized our perception of the genetic regulation of complex traits and diseases. Copy number variations (CNVs) promise to shed additional light on the genetic basis of monogenic as well as complex diseases and phenotypes. Indeed, the number of detected associations between CNVs and certain phenotypes is constantly increasing. However, while several software packages support the determination of CNVs from SNP chip data, the downstream statistical inference of CNV-phenotype associations is still subject to complicated and inefficient in-house solutions, thus strongly limiting the performance of GWAS based on CNVs.
CONAN is a freely available client-server software solution which provides an intuitive graphical user interface for categorizing, analyzing and associating CNVs with phenotypes. Moreover, CONAN assists the evaluation process by visualizing detected associations via Manhattan plots in order to enable a rapid identification of genome-wide significant CNV regions. Various file formats including the information on CNVs in population samples are supported as input data.
CONAN facilitates the performance of GWAS based on CNVs and the visual analysis of calculated results. CONAN provides a rapid, valid and straightforward software solution to identify genetic variation underlying the 'missing' heritability for complex traits that remains unexplained by recent GWAS. The freely available software can be downloaded at http://genepi-conan.i-med.ac.at.
Genome-wide association studies (GWAS) using Copy Number Variation (CNV) are becoming a central focus of genetic research. CNVs have successfully provided target genome regions for some disease conditions where simple genetic variation (i.e., SNPs) has previously failed to provide a clear association.
Here we present a new R package that integrates: (i) data import from the most common formats of Affymetrix, Illumina and aCGH arrays; (ii) a fast and accurate segmentation algorithm to call CNVs based on Genome Alteration Detection Analysis (GADA); and (iii) functions for displaying and exporting the copy number calls, identification of recurrent CNVs, multivariate analysis of population structure, and tools for performing association studies. Using a large dataset containing 270 HapMap individuals (Affymetrix Human SNP Array 6.0 Sample Dataset), we demonstrate a flexible pipeline implemented with the package. It requires less than one minute per sample (3 million probe arrays) on a single-core computer, and provides flexible parallelization for very large datasets. Case-control data were generated from the HapMap dataset to demonstrate a GWAS analysis.
The package provides the tools for creating a complete integrated pipeline from data normalization to statistical association. It can efficiently handle a massive volume of data consisting of millions of genetic markers and hundreds or thousands of samples with very accurate results.
The recent discovery of copy number variation (CNV) in normal individuals has widened our understanding of genomic variation. However, most of the reported CNVs have been identified in Caucasians, and these may not be directly applicable to people of different ethnicities. To profile CNVs in an East Asian population, we screened CNVs in 3578 healthy, unrelated Korean individuals, using the Affymetrix Genome-Wide Human SNP array 5.0. We identified 144 207 CNVs using a pooled data set of 100 randomly chosen Korean females as a reference. The average number of CNVs per genome was 40.3, which is higher than that of CNVs previously reported using lower resolution platforms. The median size of CNVs was 18.9 kb (range 0.2–5406 kb). Copy number losses were 4.7 times more frequent than copy number gains. CNV regions (CNVRs) were defined by merging overlapping CNVs identified in two or more samples. In total, 4003 CNVRs were defined encompassing 241.9 Mb accounting for ∼8% of the human genome. A total of 2077 CNVRs (51.9%) were potentially novel. Known CNVRs were larger and more frequent than novel CNVRs. Sixteen percent of the CNVRs were observed in ≥1% of study subjects and 24% overlapped with the OMIM genes. A total of 476 (11.9%) CNVRs were associated with segmental duplications. The CNVs/CNVRs identified in this study will be valuable resources for studying human genome diversity and its association with disease.
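One possible reading of the CNVR definition above is sketched here: overlapping CNV calls are merged into regions, and a region is kept only if calls from at least two distinct samples contribute to it. This is an interpretation for illustration, not the study's exact procedure. The `(chrom, start, end, sample)` tuple layout, the half-open coordinates, and the function name are assumptions.

```python
def merge_cnvrs(calls, min_samples=2):
    """Merge overlapping CNV calls into regions supported by >= min_samples samples.

    calls: iterable of (chrom, start, end, sample) with half-open coordinates.
    Returns a list of (chrom, start, end, n_supporting_samples).
    """
    regions = []
    current = None  # [chrom, start, end, set_of_samples]
    for chrom, start, end, sample in sorted(calls, key=lambda c: (c[0], c[1], c[2])):
        if current is not None and chrom == current[0] and start < current[2]:
            # The call overlaps the open region: extend it and record the sample.
            current[2] = max(current[2], end)
            current[3].add(sample)
        else:
            if current is not None:
                regions.append(current)
            current = [chrom, start, end, {sample}]
    if current is not None:
        regions.append(current)

    return [
        (chrom, start, end, len(samples))
        for chrom, start, end, samples in regions
        if len(samples) >= min_samples
    ]
```

Because merging is transitive, a chain of partially overlapping calls collapses into one long region, which can inflate CNVR size; stricter reciprocal-overlap criteria are a common alternative.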
Genome-wide association (GWA) studies have identified common variants that are associated with a variety of traits and diseases, but most studies have been performed in European-derived populations. Here, we describe the first genome-wide analyses of imputed genotype and copy number variants (CNVs) for anthropometric measures in African-derived populations: 1188 Nigerians from Igbo-Ora and Ibadan, Nigeria, and 743 African-Americans from Maywood, IL. To improve the reach of our study, we used imputation to estimate genotypes at ∼2.1 million single-nucleotide polymorphisms (SNPs) and also tested CNVs for association. No SNPs or common CNVs reached a genome-wide significance level for association with height or body mass index (BMI), and the best signals from a meta-analysis of the two cohorts did not replicate in ∼3700 African-Americans and Jamaicans. However, several loci previously confirmed in European populations showed evidence of replication in our GWA panel of African-derived populations, including variants near IHH and DLEU7 for height and MC4R for BMI. Analysis of global burden of rare CNVs suggested that lean individuals possess greater total burden of CNVs, but this finding was not supported in an independent European population. Our results suggest that there are not multiple loci with strong effects on anthropometric traits in African-derived populations and that sample sizes comparable to those needed in European GWA studies will be required to identify replicable associations. Meta-analysis of this data set with additional studies in African-ancestry populations will be helpful to improve power to detect novel associations.
Autism spectrum disorders (ASDs) are childhood neurodevelopmental disorders with complex genetic origins1–4. Previous studies focusing on candidate genes or genomic regions have identified several copy number variations (CNVs) that are associated with an increased risk of ASDs5–9. Here we present the results from a whole-genome CNV study on a cohort of 859 ASD cases and 1,409 healthy children of European ancestry who were genotyped with ~550,000 single nucleotide polymorphism markers, in an attempt to comprehensively identify CNVs conferring susceptibility to ASDs. Positive findings were evaluated in an independent cohort of 1,336 ASD cases and 1,110 controls of European ancestry. Besides previously reported ASD candidate genes, such as NRXN1 (ref. 10) and CNTN4 (refs 11, 12), several new susceptibility genes encoding neuronal cell-adhesion molecules, including NLGN1 and ASTN2, were enriched with CNVs in ASD cases compared to controls (P = 9.5 × 10−3). Furthermore, CNVs within or surrounding genes involved in the ubiquitin pathways, including UBE3A, PARK2, RFWD2 and FBXO40, were affected by CNVs not observed in controls (P = 3.3 × 10−3). We also identified duplications 55 kilobases upstream of complementary DNA AK123120 (P = 3.6 × 10−6). Although these variants may be individually rare, they target genes involved in neuronal cell-adhesion or ubiquitin degradation, indicating that these two important gene networks expressed within the central nervous system may contribute to the genetic susceptibility of ASD.
Genetic factors predisposing individuals to cancer remain elusive in the majority of patients with a familial or clinical history suggestive of hereditary breast cancer. Germline DNA copy number variation (CNV) has recently been implicated in predisposition to cancers such as neuroblastomas as well as prostate and colorectal cancer. We evaluated the role of germline CNVs in breast cancer susceptibility, in particular those with low population frequencies (rare CNVs), which are more likely to cause disease.
Using whole-genome comparative genomic hybridization on microarrays, we screened a cohort of women fulfilling criteria for hereditary breast cancer who did not carry BRCA1/BRCA2 mutations.
The median numbers of total and rare CNVs per genome were not different between controls and patients. However, a total of 26 rare germline CNVs were identified in 68 cancer patients, a proportion that was significantly different (P = 0.0311) from that in the control group (23 rare CNVs in 100 individuals). Several of the genes affected by CNV in patients and controls had already been implicated in cancer.
This study is the first to explore the contribution of germline CNVs to BRCA1/2-negative familial and early-onset breast cancer. The data suggest that rare CNVs may contribute to cancer predisposition in this small cohort of patients, and this trend needs to be confirmed in larger population samples.
The genetic determinants for aggressiveness of prostate cancer (PCa) are poorly understood. Copy-number variations (CNVs) are one of the major sources for genetic diversity and critically modulate cellular biology and human diseases. We hypothesized that CNVs may be associated with PCa aggressiveness. To test this hypothesis, we conducted a genome-wide common CNVs analysis in 448 aggressive and 500 nonaggressive PCa cases recruited from Johns Hopkins Hospital (JHH1) using Affymetrix 6.0 arrays. Suggestive associations were further confirmed using single-nucleotide polymorphisms (SNPs) that tagged the CNVs of interest in an additional 2895 aggressive and 3094 nonaggressive cases, including those from the remaining case subjects of the JHH study (JHH2), the NCI Cancer Genetic Markers of Susceptibility (CGEMS) Study, and the CAncer of the Prostate in Sweden (CAPS) Study. We found that CNP2454, a 32.3 kb deletion polymorphism at 20p13, was significantly associated with aggressiveness of PCa in JHH1 [odds ratio (OR) = 1.30, 95% confidence interval (CI): 1.01–1.68; P = 0.045]. The best-tagging SNP for CNP2454, rs2209313, was used to confirm this finding in both JHH1 (P = 0.045) and all confirmation study populations combined (P = 1.77 × 10−3). Pooled analysis using all 3353 aggressive and 3584 nonaggressive cases showed the T allele of rs2209313 was significantly associated with an increased risk of aggressive PCa (OR = 1.17, 95% CI: 1.07–1.27; P = 2.75 × 10−4). Our results indicate that genetic variations at 20p13 may be responsible for the progression of PCa.
To date, hundreds of thousands of copy-number variations (CNVs) have been reported using various platforms. The proportion of Asians in these data is, however, relatively small compared with that of other ethnic groups, such as Caucasians and Yorubas. Because of limitations in platform resolution and the high noise level in signal intensity, in most CNV studies (particularly those using single nucleotide polymorphism arrays), the average number of CNVs in an individual is less than the number of known CNVs. In this study, we ascertained reliable, common CNV regions (CNVRs) and identified actual frequency rates in the Korean population to provide more CNV information. We performed two-stage analyses for detecting structural variations with two platforms. We discovered 576 common CNVRs (88 CNV segments on average in an individual), and 87% (501 of 576) of these CNVRs overlapped by ≥1 bp with previously validated CNV events. Interestingly, the frequency analysis of CNV profiles showed that 52 of the 576 CNVRs had a frequency of <1% in the 8842 individuals. This study also found six common CNVRs that were not reported in previous common CNV studies. In conclusion, we propose a data-driven detection approach to discover common CNVRs, including those unreported in the previous Korean CNV study, while minimizing false positives. Through our approach, we discovered more common CNVRs than the previous Korean CNV study and conducted a frequency analysis. These results will be a valuable resource for assessing the effective level of CNVs in the Korean population.
common copy-number variation; CNV profile; Asian CNV; structural variation
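The "overlapped by ≥1 bp" criterion used in the abstract above can be illustrated with a small interval check. This is a minimal sketch, assuming 1-based inclusive coordinates and a brute-force comparison; the names `overlap_bp` and `fraction_overlapping` and the toy coordinates are hypothetical and are not taken from the study.

```python
from typing import List, Tuple

# (chromosome, start, end) with 1-based inclusive coordinates (an assumption).
Interval = Tuple[str, int, int]

def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of base pairs shared by two intervals (0 if none or different chromosomes)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]) + 1)

def fraction_overlapping(cnvrs: List[Interval], known: List[Interval], min_bp: int = 1) -> float:
    """Fraction of CNVRs sharing at least min_bp bases with any known CNV event."""
    if not cnvrs:
        return 0.0
    hits = sum(1 for r in cnvrs if any(overlap_bp(r, k) >= min_bp for k in known))
    return hits / len(cnvrs)

# Toy data only: the first region touches a known event at a single base (position 200).
toy_cnvrs = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
toy_known = [("chr1", 200, 300), ("chr2", 300, 400)]
print(fraction_overlapping(toy_cnvrs, toy_known))
```

With the counts reported in the abstract, the same ratio is 501/576, or about 87%. For genome-scale inputs, a sorted sweep or an interval tree would be a more efficient choice than the pairwise loop shown here.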
Copy number variations (CNVs) are important causal genetic variations for human disease; however, the lack of a statistical model has impeded the systematic testing of CNVs associated with disease in large-scale cohorts.
Here, we developed a novel integrated strategy to test CNV association in genome-wide case-control studies. We converted the single-nucleotide polymorphism (SNP) signal to copy number states using a well-trained hidden Markov model. We mapped the susceptible CNV loci through SNP site-specific testing to cope with the physiological complexity of CNVs. We also ensured the credibility of the associated CNVs through further window-based CNV-pattern clustering. Genome-wide data for seven diseases were used to test our strategy and, in total, we identified 36 new susceptible loci that are associated with CNVs for the seven diseases: 5 with bipolar disorder, 4 with coronary artery disease, 1 with Crohn's disease, 7 with hypertension, 9 with rheumatoid arthritis, 7 with type 1 diabetes and 3 with type 2 diabetes. Fifteen of these identified loci were validated through genotype association and physiological function reported in previous studies, which provides further confidence in our results. Notably, the genes associated with bipolar disorder converged on phosphoinositide/calcium signaling, a pathway well known to be affected in bipolar disorder, which further supports an impact of CNVs on bipolar disorder.
Our results demonstrated the effectiveness and robustness of our CNV-association analysis and provided an alternative avenue for discovering new associated loci of human diseases.
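The step of converting SNP signal to copy number states with a hidden Markov model can be sketched with a small Viterbi decoder. This is a simplified illustration, not the trained model from the study: it assumes a single signal (a log R ratio-like intensity), three states, Gaussian emissions, and hand-picked parameters (`MEANS`, `SD`, `STAY`), all of which are hypothetical.

```python
import math

STATES = (1, 2, 3)                  # copy-number states: loss, normal, gain
MEANS = {1: -0.6, 2: 0.0, 3: 0.4}   # assumed mean signal per state
SD = 0.2                            # assumed shared standard deviation
STAY = 0.98                         # assumed probability of staying in the same state

def log_gauss(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))

def log_trans(i, j):
    """Log transition probability; switches are spread evenly over the other states."""
    return math.log(STAY) if i == j else math.log((1.0 - STAY) / (len(STATES) - 1))

def viterbi(signal):
    """Most likely copy-number state per marker for a list of intensities."""
    if not signal:
        return []
    # Uniform prior over states at the first marker.
    score = {s: -math.log(len(STATES)) + log_gauss(signal[0], MEANS[s], SD) for s in STATES}
    back = []
    for x in signal[1:]:
        prev, score, ptr = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: prev[r] + log_trans(r, s))
            score[s] = prev[best] + log_trans(best, s) + log_gauss(x, MEANS[s], SD)
            ptr[s] = best
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

toy_signal = [0.0, 0.05, -0.02, -0.6, -0.55, -0.65, -0.6, 0.02, 0.0]
print(viterbi(toy_signal))
```

With these assumed parameters, a run of several consecutive low-intensity markers is called as a loss, while a single outlying marker is absorbed as noise because the transition penalty outweighs its emission gain. Array-based CNV callers typically use more states and a second signal (B allele frequency), which this sketch omits.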
Copy-number variants (CNVs) are a source of genetic variation that is increasingly associated with human disease. However, the role of CNVs in human lifespan is to date unknown. To identify CNVs that influence mortality at old age, we analyzed genome-wide CNV data in 5178 participants of the Rotterdam Study (RS1), and positive findings were evaluated in 1714 participants of the second cohort of the Rotterdam Study (RS2) and in 4550 participants of the Framingham Heart Study (FHS). First, we assessed the total burden of rare (frequency <1%) and common (frequency >1%) CNVs for association with mortality during follow-up. These analyses were repeated after stratifying CNVs by type and size. Second, we assessed individual common CNV regions (CNVRs) for association with mortality. We observed that the burden of common but not of rare CNVs influences mortality. A higher burden of large (≥500 kb) common deletions was associated with 4% higher mortality [hazard ratio (HR) per CNV 1.04, 95% confidence interval (CI) 1.02–1.07, P = 5.82 × 10⁻⁵] in the 11 442 participants of RS1, RS2 and FHS. In the analysis of 312 individual common CNVRs, we identified two regions (11p15.5; 14q21.3) that were associated with higher mortality in these cohorts. The 11p15.5 region (combined HR 1.59, 95% CI 1.31–1.93, P = 2.87 × 10⁻⁶) encompasses 41 genes, some of which have previously been related to longevity, whereas the 14q21.3 region (combined HR 1.57, 95% CI 1.19–2.07, P = 1.53 × 10⁻³) does not encompass any genes. In conclusion, the burden of large common deletions, as well as common CNVs in the 11p15.5 and 14q21.3 regions, is associated with higher mortality.
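To make the "HR per CNV" figure above concrete, the sketch below shows how a per-CNV hazard ratio compounds with burden. It assumes the burden entered the survival model as a continuous covariate with a log-linear effect, as in a standard Cox proportional hazards model; the abstract does not spell out the model, and the function name `hazard_ratio_for_burden` is hypothetical.

```python
def hazard_ratio_for_burden(hr_per_cnv: float, n_cnvs: int) -> float:
    """Hazard ratio for carrying n_cnvs extra CNVs, assuming a log-linear per-CNV effect."""
    return hr_per_cnv ** n_cnvs

# Point estimate and CI bounds quoted in the abstract (HR per large common deletion).
for n in (1, 2, 5):
    point = hazard_ratio_for_burden(1.04, n)
    low = hazard_ratio_for_burden(1.02, n)
    high = hazard_ratio_for_burden(1.07, n)
    print(f"{n} extra deletion(s): HR ~ {point:.2f} (from CI bounds: {low:.2f} to {high:.2f})")
```

Under this assumption, the 4% increase applies multiplicatively per additional deletion. Extrapolating to burdens outside the range observed in the cohorts would not be supported by the reported estimate.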