Obesity is an increasingly common disorder that predisposes to several medical conditions, including type 2 diabetes. We investigated whether large and rare copy-number variations (CNVs) differentiate moderate to extreme obesity from never-overweight control subjects.
RESEARCH DESIGN AND METHODS
Using single nucleotide polymorphism (SNP) arrays, we performed a genome-wide CNV survey on 430 obese case subjects (BMI >35 kg/m2) and 379 never-overweight control subjects (BMI <25 kg/m2). All subjects were of European ancestry and were genotyped on the Illumina HumanHap550 arrays with ∼550,000 SNP markers. The CNV calls were generated by PennCNV software.
CNVs >1 Mb were found to be overrepresented in case versus control subjects (odds ratio [OR] = 1.5 [95% CI 0.5–5]), and CNVs >2 Mb were present in 1.3% of the case subjects but were absent in control subjects (OR = infinity [95% CI 1.2–infinity]). When the analysis was restricted to rare deletions that disrupt genes, even more pronounced effect sizes were observed (OR = 2.7 [95% CI 0.5–27.1] for CNVs >1 Mb). Interestingly, obese case subjects who carry these large CNVs have moderately high BMI and do not appear to be extreme cases. Several CNVs disrupt known candidate genes for obesity, such as a 3.3-Mb deletion disrupting NAP1L5 and a 2.1-Mb deletion disrupting UCP1 and IL15.
Our results suggest that large CNVs, especially rare deletions, confer risk of obesity in patients with moderate obesity and that genes impacted by large CNVs represent intriguing candidates for obesity that warrant further study.
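The odds ratios reported above, including the infinite estimate that arises when no control subject carries a CNV, can be illustrated with a small 2x2-table calculation. The following is a minimal sketch, not the authors' analysis: the function names are hypothetical, the counts in the usage note are invented for illustration and are not taken from the study, and a one-sided Fisher exact test is assumed as the accompanying significance test.

```python
from math import comb, inf


def odds_ratio(case_yes, case_no, ctrl_yes, ctrl_no):
    """Sample odds ratio for a 2x2 carrier table.

    Returns infinity when the denominator is zero (for example, when no
    control subject carries the CNV) and the numerator is positive.
    """
    num = case_yes * ctrl_no
    den = case_no * ctrl_yes
    if den == 0:
        return inf if num > 0 else float("nan")
    return num / den


def fisher_one_sided(case_yes, case_no, ctrl_yes, ctrl_no):
    """One-sided Fisher exact p-value: P(carriers among cases >= observed).

    Uses the hypergeometric distribution with all table margins fixed.
    """
    n_cases = case_yes + case_no
    n_ctrls = ctrl_yes + ctrl_no
    carriers = case_yes + ctrl_yes
    denom = comb(n_cases + n_ctrls, carriers)
    upper = min(carriers, n_cases)
    tail = sum(
        comb(n_cases, k) * comb(n_ctrls, carriers - k)
        for k in range(case_yes, upper + 1)
    )
    return tail / denom
```

As a usage example with hypothetical counts, `odds_ratio(5, 395, 0, 400)` returns infinity because no control is a carrier, while `fisher_one_sided(5, 395, 0, 400)` still gives a finite p-value. This is why an infinite point estimate is usually reported together with a finite lower confidence bound.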
The genome-wide presence of copy number variations (CNVs), which has been shown to affect the expression and function of genes, has recently been suggested to confer risk for various human disorders, including Amyotrophic Lateral Sclerosis (ALS). We have performed a genome-wide CNV analysis using the PennCNV tool and 733K GWAS data of 117 Turkish ALS patients and 109 matched healthy controls. Case-control association analyses have implicated the presence of both common (>5%) and rare (<5%) CNVs in the Turkish population. In the framework of this study, we identified several common and rare loci that may have an impact on ALS pathogenesis. None of the associated CNVs has been implicated in ALS before, but some have been reported in different types of cancers and autism. The most significant associations were shown for 41 kb and 15 kb intergenic heterozygous deletions (Chr11: 50,545,009–50,586,426 and Chr19: 20,860,930–20,875,787), both contributing to increased risk for ALS. CNVs in coding regions of the MAP4K3, HLA-B, EPHA3 and DPYD genes were detected; however, after validation by Log R Ratio (LRR) values and TaqMan CNV genotyping, only the EPHA3 deletion remained as a potential protective factor for ALS (p = 0.0065024). Based on the knowledge that EPHA4 has been previously shown to rescue SOD1 transgenic mice from the ALS phenotype and prolong survival, EPHA3 may be a promising candidate for therapeutic interventions.
Copy number variations (CNVs) are genomic structural variants that are found in healthy populations and have been observed to be associated with disease susceptibility. Existing methods for CNV detection are often performed on a sample-by-sample basis, which is not ideal for large datasets where common CNVs must be estimated by comparing the frequency of CNVs in the individual samples. Here we describe a simple and novel approach to locate genome-wide CNVs common to a specific population, using human ancestry as the phenotype.
We utilized our previously published Genome Alteration Detection Analysis (GADA) algorithm to identify common ancestry CNVs (caCNVs) and built a caCNV model to predict population structure. We identified a 73 caCNV signature using a training set of 225 healthy individuals from European, Asian, and African ancestry. The signature was validated on an independent test set of 300 individuals with similar ancestral background. The error rate in predicting ancestry in this test set was 2% using the 73 caCNV signature. Among the caCNVs identified, several were previously confirmed experimentally to vary by ancestry. Our signature also contains a caCNV region with a single microRNA (MIR270), which represents the first reported variation of microRNA by ancestry.
We developed a new methodology to identify common CNVs and demonstrated its performance by building a caCNV signature to predict human ancestry with high accuracy. The utility of our approach could be extended to large case–control studies to identify CNV signatures for other phenotypes such as disease susceptibility and drug response.
Skeletal muscle is a major component of the human body. Age-related loss of muscle mass and function contributes to some public health problems such as sarcopenia and osteoporosis. Skeletal muscle, mainly composed of appendicular lean mass (ALM), is a heritable trait. Copy number variation (CNV) is a common type of human genome variant which may play an important role in the etiology of many human diseases. In this study, we performed genome-wide association analyses of CNV for ALM in 2,286 Caucasian subjects. We then replicated the major findings in 1,627 Chinese subjects. Two CNVs, CNV1191 and CNV2580, were detected to be associated with ALM (p = 2.26×10−2 and 3.34×10−3, respectively). In the Chinese replication sample, the two CNVs achieved p-values of 3.26×10−2 and 0.107, respectively. CNV1191 covers a gene, GTPase of the immunity-associated protein family (GIMAP1), which is important for skeletal muscle cell survival/death in humans. CNV2580 is located in the Serine hydrolase-like protein (SERHL) gene, which plays an important role in normal peroxisome function and skeletal muscle growth in response to mechanical stimuli. In summary, our study suggested two novel CNVs and the related genes that may contribute to variation in ALM.
Studies that analyzed single nucleotide polymorphisms (SNP) in various genes have shown that genetic factors are strongly associated with age-related macular degeneration (AMD) susceptibility. Copy number variation (CNV) may be an additional type of genetic variation that contributes to AMD pathogenesis. This study investigated CNV in 4 AMD-relevant genes in Korean AMD patients and control subjects.
Four CNV candidate regions located in AMD-relevant genes (VEGFA, ARMS2/HTRA1, CFH and VLDLR) were selected based on the outcomes of our previous study, which elucidated common CNVs in the Asian populations. Real-time PCR based TaqMan Copy Number Assays were performed on CNV candidates in 273 AMD patients and 257 control subjects.
The predicted copy number (PCN, 0, 1, 2 or 3+) of each region was called using the CopyCaller program. All candidate genes except ARMS2/HTRA1 showed CNV in at least one individual, in which losses of VEGFA and VLDLR represent novel findings in the Asian population. When the frequencies of PCN were compared, only the gain in VLDLR showed significant differences between AMD patients and control subjects (p = 0.025). Comparisons of the raw copy values (RCV) revealed that 3 of 4 candidate genes showed significant differences (2.03 vs. 1.92 for VEGFA, p<0.01; 2.01 vs. 1.97 for CFH, p<0.01; 1.97 vs. 2.01 for ARMS2/HTRA1, p<0.01).
CNVs located in AMD-relevant genes may be associated with AMD susceptibility. Further investigations encompassing larger patient cohorts are needed to elucidate the role of CNV in AMD pathogenesis.
Autism spectrum disorders (ASDs) are childhood neurodevelopmental disorders with complex genetic origins1–4. Previous studies focusing on candidate genes or genomic regions have identified several copy number variations (CNVs) that are associated with an increased risk of ASDs5–9. Here we present the results from a whole-genome CNV study on a cohort of 859 ASD cases and 1,409 healthy children of European ancestry who were genotyped with ~550,000 single nucleotide polymorphism markers, in an attempt to comprehensively identify CNVs conferring susceptibility to ASDs. Positive findings were evaluated in an independent cohort of 1,336 ASD cases and 1,110 controls of European ancestry. Besides previously reported ASD candidate genes, such as NRXN1 (ref. 10) and CNTN4 (refs 11, 12), several new susceptibility genes encoding neuronal cell-adhesion molecules, including NLGN1 and ASTN2, were enriched with CNVs in ASD cases compared to controls (P = 9.5 × 10−3). Furthermore, CNVs within or surrounding genes involved in the ubiquitin pathways, including UBE3A, PARK2, RFWD2 and FBXO40, were affected by CNVs not observed in controls (P = 3.3 × 10−3). We also identified duplications 55 kilobases upstream of complementary DNA AK123120 (P = 3.6 × 10−6). Although these variants may be individually rare, they target genes involved in neuronal cell-adhesion or ubiquitin degradation, indicating that these two important gene networks expressed within the central nervous system may contribute to the genetic susceptibility of ASD.
Genetic factors predisposing individuals to cancer remain elusive in the majority of patients with a familial or clinical history suggestive of hereditary breast cancer. Germline DNA copy number variation (CNV) has recently been implicated in predisposition to cancers such as neuroblastomas as well as prostate and colorectal cancer. We evaluated the role of germline CNVs in breast cancer susceptibility, in particular those with low population frequencies (rare CNVs), which are more likely to cause disease.
Using whole-genome comparative genomic hybridization on microarrays, we screened a cohort of women fulfilling criteria for hereditary breast cancer who did not carry BRCA1/BRCA2 mutations.
The median numbers of total and rare CNVs per genome were not different between controls and patients. However, a total of 26 rare germline CNVs were identified in 68 cancer patients, a proportion that was significantly different (P = 0.0311) from that of the control group (23 rare CNVs in 100 individuals). Several of the genes affected by CNV in patients and controls had already been implicated in cancer.
This study is the first to explore the contribution of germline CNVs to BRCA1/2-negative familial and early-onset breast cancer. The data suggest that rare CNVs may contribute to cancer predisposition in this small cohort of patients, and this trend needs to be confirmed in larger population samples.
Structural variations such as copy number variants (CNV) influence the expression of different phenotypic traits. Algorithms to identify CNVs through SNP-array platforms are available. The ability to evaluate well-characterized CNVs such as GSTM1 (1p13.3) deletion provides an important opportunity to assess their performance.
773 cases and 759 controls from the SBC/EPICURO Study were genotyped in the GSTM1 region using TaqMan, Multiplex Ligation-dependent Probe Amplification (MLPA), and Illumina Infinium 1 M SNP-array platforms. CNV callings provided by TaqMan and MLPA were highly concordant and replicated the association between GSTM1 and bladder cancer. This was not the case when CNVs were called using Illumina 1 M data through available algorithms since no deletion was detected across the study samples. In contrast, when the Log R Ratio (LRR) was used as a continuous measure for the 5 probes contained in this locus, we were able to detect their association with bladder cancer using simple regression models or more sophisticated methods such as the ones implemented in the CNVtools package.
This study highlights an important limitation in the CNV calling from SNP-array data in regions of common aberrations and suggests that there may be added advantage for using LRR as a continuous measure in association tests rather than relying on calling algorithms.
Bladder cancer risk; Glutathione S-transferase mu 1 (GSTM1); Copy number variation (CNV); SNP-array
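The idea of testing the Log R Ratio (LRR) as a continuous measure, rather than relying on discrete CNV calls, can be sketched as a single-predictor logistic regression of case status on each subject's mean LRR across the locus probes. This is a minimal sketch under stated assumptions and does not reproduce the study's models or the CNVtools package: the function name `fit_logistic`, the Newton-Raphson fitting routine, and the LRR values in the usage note are all hypothetical.

```python
from math import exp


def fit_logistic(x, y, iterations=25):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson.

    x: continuous predictor (here, an assumed per-subject mean LRR).
    y: 0/1 outcome (1 = case, 0 = control).
    Returns the fitted (b0, b1). Intended for small, non-separable data.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            # Gradient of the log-likelihood.
            g0 += yi - p
            g1 += (yi - p) * xi
            # Entries of the (positive-definite) information matrix.
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0.0:
            break
        # Newton step: beta <- beta + H^{-1} g, using the 2x2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1
```

As a usage example with invented values, `fit_logistic([-0.6, -0.5, -0.4, 0.0, 0.1, -0.5, -0.1, 0.0, 0.1, 0.2], [1, 1, 1, 1, 1, 0, 0, 0, 0, 0])` yields a negative slope, meaning that lower LRR (consistent with copy loss) goes with higher odds of being a case in this toy data. A real analysis would also need standard errors and covariate adjustment.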
Bipolar disorder (BPD) is a common psychiatric illness with a complex mode of inheritance. Besides traditional linkage and association studies, which require large sample sizes, analysis of common and rare chromosomal copy number variants (CNVs) in extended families may provide novel insights into the genetic susceptibility of complex disorders. Using the Illumina HumanHap550 BeadChip with over 550,000 SNP markers, we genotyped 46 individuals in a three-generation Old Order Amish pedigree with 19 affected (16 BPD and three major depression) and 27 unaffected subjects. Using the PennCNV algorithm, we identified 50 CNV regions that ranged in size from 12 to 885 kb and encompassed at least 10 single nucleotide polymorphisms (SNPs). Of 19 well-characterized CNV regions that were available for combined genotype-expression analysis, 11 (58%) were associated with expression changes of genes within, partially within or near these CNV regions in fibroblasts or lymphoblastoid cell lines at a nominal P value <0.05. To further investigate the mode of inheritance of CNVs in the large pedigree, we analyzed a set of four CNVs, located at 6q27, 9q21.11, 12p13.31 and 15q11, all of which were enriched in subjects with affective disorders. We additionally show that these variants affect the expression of neuronal genes within or near the rearrangement. Our analysis suggests that family-based studies of the combined effect of common and rare CNVs at many loci may represent a useful approach in the genetic analysis of disease susceptibility of mental disorders.
Brain arteriovenous malformations (BAVM) are clusters of abnormal blood vessels, with shunting of blood from the arterial to venous circulation and a high risk of rupture and intracranial hemorrhage. Most BAVMs are sporadic, but also occur in patients with Hereditary Hemorrhagic Telangiectasia, a Mendelian disorder caused by mutations in genes in the transforming growth factor beta (TGFβ) signaling pathway.
To investigate whether copy number variations (CNVs) contribute to risk of sporadic BAVM, we performed a genome-wide association study in 371 sporadic BAVM cases and 563 healthy controls, all Caucasian. Cases and controls were genotyped using the Affymetrix 6.0 array. CNVs were called using the PennCNV and Birdsuite algorithms and analyzed via segment-based and gene-based approaches. Common and rare CNVs were evaluated for association with BAVM.
A CNV region on 1p36.13, containing the neuroblastoma breakpoint family, member 1 gene (NBPF1), was significantly enriched with duplications in BAVM cases compared to controls (P = 2.2×10−9); NBPF1 was also significantly associated with BAVM in gene-based analysis using both PennCNV and Birdsuite. We experimentally validated the 1p36.13 duplication; however, the association did not replicate in an independent cohort of 184 sporadic BAVM cases and 182 controls (OR = 0.81, P = 0.8). Rare CNV analysis did not identify genes significantly associated with BAVM.
We did not identify common CNVs associated with sporadic BAVM that replicated in an independent cohort. Replication in larger cohorts is required to elucidate the possible role of common or rare CNVs in BAVM pathogenesis.
Copy number variations (CNVs), a major source of human genetic polymorphism, have been suggested to have an important role in genetic susceptibility to common diseases such as cancer, immune diseases and neurological disorders. Nasopharyngeal carcinoma (NPC) is a multifactorial tumor closely associated with genetic background and with a male preponderance over female (3:1). Previous genome-wide association studies have identified single-nucleotide polymorphisms (SNPs) that are associated with NPC susceptibility. Here, we sought to explore the possible association of CNVs with NPC predisposition. Utilizing genome-wide SNP-based arrays and five CNV-prediction algorithms, we identified eight regions with CNV that were significantly overrepresented in NPC patients compared with healthy controls. These CNVs included six deletions (on chromosomes 3, 6, 7, 8 and 19), and two duplications (on chromosomes 7 and 12). Among them, the CNV located at chromosome 6p21.3, with single-copy deletion of the MICA and HCP5 genes, showed the highest association with NPC. Interestingly, it was more specifically associated with an increased NPC risk among males. This gender-specific association was replicated in an independent case–control sample using a self-established deletion-specific polymerase chain reaction strategy. To the best of our knowledge, this is the first study to explore the role of constitutional CNVs in NPC, using a genome-wide platform. Moreover, we identified eight novel candidate regions with CNV that merit future investigation, and our results suggest that similar to neuroblastoma and prostate cancer, genetic structural variations might contribute to NPC predisposition.
We assessed the role of rare copy number variants (CNVs) in Alzheimer's disease (AD) using intensity data from 3260 AD cases and 1290 age-matched controls from the genome-wide association study (GWAS) conducted by the Genetic and Environmental Risk for Alzheimer's disease Consortium (GERAD). We did not observe a significant excess of rare CNVs in cases, although we did identify duplications overlapping APP and CR1 which may be pathogenic. We looked for an excess of CNVs in loci which have been highlighted in previous AD CNV studies, but did not replicate previous findings. Through pathway analyses, we observed suggestive evidence for biological overlap between single nucleotide polymorphisms and CNVs in AD susceptibility. We also identified that our sample of elderly controls harbours significantly fewer deletions >1 Mb than younger control sets in previous CNV studies on schizophrenia and bipolar disorder (P = 8.9 × 10−4 and 0.024, respectively), raising the possibility that healthy elderly individuals have a reduced rate of large deletions. Thus, in contrast to diseases such as schizophrenia, autism and attention deficit/hyperactivity disorder, CNVs do not appear to make a significant contribution to the development of AD.
Although family history is a risk factor for pancreatic adenocarcinoma, much of the genetic etiology of this disease remains unknown. While genome-wide association studies have identified some common single nucleotide polymorphisms (SNPs) associated with pancreatic cancer risk, these SNPs do not explain all the heritability of this disease. We hypothesized that copy number variations (CNVs) in the genome may play a role in genetic predisposition to pancreatic adenocarcinoma. Here, we report a genome-wide analysis of CNVs in a small hospital-based, European ancestry cohort of pancreatic cancer cases and controls. Germline CNV discovery was performed using the Illumina Human CNV370 platform in 223 pancreatic cancer cases (both sporadic and familial) and 169 controls. Following stringent quality control, we asked if global CNV burden was a risk factor for pancreatic cancer. Finally, we performed in silico CNV genotyping and association testing to discover novel CNV risk loci. When we examined the global CNV burden, we found no strong evidence that CNV burden plays a role in pancreatic cancer risk either overall or specifically in individuals with a family history of the disease. Similarly, we saw no significant evidence that any particular CNV is associated with pancreatic cancer risk. Taken together, these data suggest that CNVs do not contribute substantially to the genetic etiology of pancreatic cancer, though the results are tempered by small sample size and large experimental variability inherent in array-based CNV studies.
pancreatic cancer; copy number variation; cancer risk; SNP microarrays; CNVs
Alcohol dependence (AD) is a complex disorder characterized by psychiatric and physiological dependence on alcohol. AD is reflected by regular alcohol drinking, which is highly heritable. In this study, to identify susceptibility genes associated with alcohol drinking, we performed a genome-wide association study of copy number variants (CNVs) in 2,286 Caucasian subjects with the Affymetrix SNP6.0 genotyping array. We replicated our findings in 1,627 Chinese subjects with the same genotyping array. We identified two CNVs, CNV207 (combined p-value 1.91E-03) and CNV1836 (combined p-value 3.05E-03), that were associated with alcohol drinking. CNV207 and CNV1836 are located downstream of the genes LTBP1 (870 kb) and FGD4 (400 kb), respectively. LTBP1, by interacting with TGFB1, may down-regulate enzymes directly participating in alcohol metabolism. FGD4 plays a role in clustering and trafficking of the GABAA receptor and subsequently influences alcohol drinking through activating CDC42. Our results provide suggestive evidence that the newly identified CNV regions and relevant genes may contribute to the genetic mechanism of alcohol dependence.
Although copy number variations (CNVs) are expected to affect various diseases, little is known about the association between CNVs and breast cancer susceptibility. Therefore, we investigated this relation. Array comparative genomic hybridization was performed to search for candidate CNVs related to breast cancer susceptibility. Subsequent quantitative real-time polymerase chain reaction was carried out for confirmation. We found seven CNV markers associated with breast cancer risk. The means of the relative copy numbers of patients with a history of breast cancer and women in the control group were 0.8 and 1.8 for Hs06535529_cn on 1p36.12 (P < 0.0001), 2.9 and 2.2 for Hs03103056_cn on 3q26.1 (P < 0.0001), 1.2 and 1.8 for Hs03899300_cn on 15q26.3 (P < 0.0001), 1.0 and 1.5 for Hs03908783_cn on 15q26.3 (P < 0.0001), and 1.1 and 1.7 for Hs03898338_cn on 15q26.3 (P < 0.0001), respectively. Interestingly, nine or more copies of Hs04093415_cn on 22q12.3 were found only in 8/193 (4.1 %) patients with a history of breast cancer and in none of the controls (P = 0.0081). Similarly, 12 or more copies of Hs040908898_cn on 22q12.3 were found only in 7/193 (3.6 %) patients with a history of breast cancer and in none of the controls (P = 0.016). A combination of two CNVs resulted in 80.3 % sensitivity, 80.6 % specificity, 82.4 % positive predictive value, and 78.3 % negative predictive value for the prediction of breast cancer susceptibility. These findings may lead to a new means of risk assessment for breast cancer. Confirmatory studies using independent data sets are needed to support our findings.
CNV; Breast cancer susceptibility; CGH; Real-time PCR; Digital PCR
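The sensitivity, specificity, and predictive values quoted above for the two-CNV combination all derive from a single 2x2 table of predicted versus true status. The following is a minimal sketch of those standard definitions. The function name is hypothetical and the counts in the usage note are invented, since the abstract does not give the underlying table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic metrics from a 2x2 confusion table.

    tp: patients flagged by the marker combination (true positives)
    fp: controls flagged by the marker combination (false positives)
    fn: patients not flagged (false negatives)
    tn: controls not flagged (true negatives)
    """
    return {
        "sensitivity": tp / (tp + fn),  # flagged fraction of patients
        "specificity": tn / (tn + fp),  # unflagged fraction of controls
        "ppv": tp / (tp + fp),          # patients among those flagged
        "npv": tn / (tn + fn),          # controls among those not flagged
    }
```

For example, with hypothetical counts `diagnostic_metrics(tp=80, fp=10, fn=20, tn=90)`, sensitivity is 0.80 and specificity is 0.90. Unlike sensitivity and specificity, the predictive values depend on the case-to-control ratio of the sample, so values from a case-control design do not transfer directly to population screening.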
Variation in human intelligence is approximately 50% heritable, but understanding of the genes involved is limited. Several forms of genetic variation remain under-studied in relation to intelligence, one of which is copy number variation (CNV). Using single-nucleotide polymorphism (SNP)-based microarrays, we genotyped CNVs genome-wide in a birth cohort of 723 New Zealanders, and correlated them with four intelligence-related phenotypes. We found no significant association for any common CNV after false discovery correction, which is consistent with previous work. In contrast to a previous study, however, we found no effect on any cognitive measure of rare CNV burden, defined as total number of bases inserted or deleted in CNVs rarer than 5%. We discuss possible reasons for this failure to replicate, including interaction between CNV and aging in determining the effects of rare CNVs. While our results suggest that no CNV assayable by SNP chips contributes more than a very small amount to variation in human intelligence, it remains possible that common CNVs in segmental duplication arrays, which are not well covered by SNP chips, are important contributors.
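The rare CNV burden defined above (total bases inserted or deleted in CNVs rarer than 5%) can be sketched as a per-individual sum over CNV calls. This is a minimal sketch, not the study's pipeline: the function name, the tuple layout of a call, the frequency lookup, and the half-open coordinate convention are all assumptions made for illustration.

```python
def rare_cnv_burden(calls, frequency, max_freq=0.05):
    """Total base pairs covered by rare CNVs in one individual.

    calls: iterable of (cnv_id, start, end) for this individual, with
           half-open coordinates so that length = end - start (assumed).
    frequency: dict mapping cnv_id to its population frequency (assumed
               to be computed beforehand across the whole cohort).
    max_freq: CNVs with frequency strictly below this count as rare.
    """
    return sum(
        end - start
        for cnv_id, start, end in calls
        if frequency[cnv_id] < max_freq
    )
```

As a usage example with invented values, an individual with calls `[("cnvA", 1000, 51000), ("cnvB", 200, 700)]` and frequencies `{"cnvA": 0.01, "cnvB": 0.20}` has a burden of 50,000 bp, because only `cnvA` falls under the 5% threshold.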
Graves’ disease (GD) and Graves’ ophthalmopathy (GO) are autoimmune disorders, which might be influenced by genetic factors. Copy number variation (CNV) is an important source of genomic diversity in humans, and influences disease susceptibility. This study investigated the association between CNV in the TSHR and TLR7 genes and the development of GD and GO in a Chinese population in Taiwan.
For this case-control study, samples from 196 healthy controls and 484 GD patients, including 203 patients with GO, were studied. CNV was detected by real-time polymerase chain reaction (PCR) using TaqMan™ probes, and the relative copy number (CN) was estimated by using the comparative Ct method.
The differences in the distribution of TSHR CNV in healthy controls and GD patients were statistically significant (p value = 0.01). However, the difference in the distribution of TSHR CNV in the control group and the GO group was not statistically significant (p value = 0.06). For TLR7 CNV, the results were not significantly different when we compared the distribution in healthy controls and GD patients and in healthy controls and GO patients (p values for Fisher’s exact test were 0.13 and 0.09, respectively). However, a lower than normal CNV for TLR7 (CNV < 2 for female and CNV < 1 for male) was found to have a protective effect against the development of GD (odds ratio (OR) = 0.24; 95% confidence interval (CI), 0.07-0.75) after adjusting for age and gender.
These results suggested that TSHR and TLR7 CNV might be associated with susceptibility to GD.
Graves’ disease; Graves’ ophthalmopathy; Copy number variation; TSHR; TLR7; Taiwan
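The comparative Ct method mentioned above estimates relative copy number from the difference in threshold cycles between a target assay and a reference assay, normalized to a calibrator sample of known copy number. The following is a minimal sketch of that calculation, assuming roughly 100% amplification efficiency (a doubling per cycle). The function names and Ct values are hypothetical, and the below-normal thresholds for TLR7 are the ones stated in the abstract (fewer than 2 copies in females, fewer than 1 in males).

```python
def relative_copy_number(ct_target, ct_ref, cal_ct_target, cal_ct_ref,
                         calibrator_copies=2):
    """Relative copy number by the comparative Ct (delta-delta Ct) method.

    ct_target, ct_ref: Ct values of the target and reference assays in
                       the test sample.
    cal_ct_target, cal_ct_ref: the same two Ct values in a calibrator
                       sample known to carry `calibrator_copies` copies.
    Assumes each PCR cycle doubles the product.
    """
    delta_ct_sample = ct_target - ct_ref
    delta_ct_calibrator = cal_ct_target - cal_ct_ref
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return calibrator_copies * 2 ** (-delta_delta_ct)


def tlr7_below_normal(copy_number, sex):
    """True if the copy number is below the sex-specific normal count."""
    normal = 2 if sex == "female" else 1
    return copy_number < normal
```

For example, if the target assay crosses threshold one cycle later than in a two-copy calibrator (with identical reference Ct values), `relative_copy_number(26.0, 24.0, 25.0, 24.0)` gives 1.0, consistent with a single-copy loss.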
A major motivation for seeking disease-associated genetic variation is to identify novel risk processes. Although rare copy number variants (CNVs) appear to contribute to attention deficit hyperactivity disorder (ADHD), common risk variants (single-nucleotide polymorphisms [SNPs]) have not yet been detected using genome-wide association studies (GWAS). This raises the concern as to whether future larger-scale, adequately powered GWAS will be worthwhile. The authors undertook a GWAS of ADHD and examined whether associated SNPs, including those below conventional levels of significance, influenced the same biological pathways affected by CNVs.
The authors analyzed genome-wide SNP frequencies in 727 children with ADHD and 5,081 comparison subjects. The gene sets that were enriched in a pathway analysis of the GWAS data (the top 5% of SNPs) were tested for an excess of genes spanned by large, rare CNVs in the children with ADHD.
No SNP achieved genome-wide significance levels. As previously reported in a subsample of the present study, large, rare CNVs were significantly more common in case subjects than comparison subjects. Thirteen biological pathways enriched for SNP association significantly overlapped with those enriched for rare CNVs. These included cholesterol-related and CNS development pathways. At the level of individual genes, CHRNA7, which encodes a nicotinic receptor subunit previously implicated in neuropsychiatric disorders, was affected by six large duplications in case subjects (none in comparison subjects), and SNPs in the gene had a gene-wide p value of 0.0002 for association in the GWAS.
Both common and rare genetic variants appear to be relevant to ADHD and index shared biological pathways.
Aging is a biological process strongly determined by genetics. However, only a few single nucleotide polymorphisms (SNPs) have been reported to be consistently associated with aging. While investigating whether copy number variations (CNVs) could fill this gap, we focused on CNVs that have not been studied in previous SNP-based searches via tagging SNPs.
TaqMan qPCR assays were developed to quantify 20 common CNVs in 222 senior American Caucasians in order to reveal possible association with longevity. The replication study was comprised of 1283 community-dwelling senior European Caucasians. Replicated CNVs were further investigated for association with healthy aging and aging-related diseases, while association with longevity was additionally tested in Caenorhabditis elegans.
In the discovery study of seniors aged ≥80 vs. <80 years, a homozygous intronic CNV deletion in the CNTNAP4 gene was inversely associated with survival to the age of 80 (OR=0.51, 95%CI 0.29-0.87, p=0.015 before correction for multiple testing). After stratification by sex, the association remained significant in females (OR=0.41, 95%CI 0.21-0.77, p=0.007), but not in males (OR=0.97, 95%CI 0.33-2.79, p=1). The finding was validated in a replication study (OR=0.66, 95%CI 0.48-0.90, p=0.011 for females). CNTNAP4 association with longevity was supported by a marked 25% lifespan change in C. elegans after knocking down the ortholog gene. An inverse association of the CNV del/del variant with female healthy aging was observed (OR=0.39, 95%CI 0.19-0.76, p=0.006). A corresponding positive association with aging-related diseases was revealed for cognitive impairment (OR=2.17, 95%CI 1.11-4.22, p=0.024) and, in independent studies, for Alzheimer’s (OR=4.07, 95%CI 1.17-14.14, p=0.036) and Parkinson’s (OR=1.59, 95%CI 1.03-2.42, p=0.041) diseases.
This is the first demonstration for association of the CNTNAP4 gene and one of its intronic CNV polymorphisms with aging. Association with particular aging-related diseases awaits replication and independent validation.
To date, hundreds of thousands of copy-number variations (CNVs) have been reported using various platforms. The proportion of Asians in these data is, however, relatively small as compared with that of other ethnic groups, such as Caucasians and Yorubas. Because of limitations in platform resolution and the high noise level in signal intensity, in most CNV studies (particularly those using single nucleotide polymorphism arrays), the average number of CNVs in an individual is less than the number of known CNVs. In this study, we ascertained reliable, common CNV regions (CNVRs) and identified actual frequency rates in the Korean population to provide more CNV information. We performed two-stage analyses for detecting structural variations with two platforms. We discovered 576 common CNVRs (88 CNV segments on average in an individual), and 87% (501 of 576) of these CNVRs overlapped by ≥1 bp with previously validated CNV events. Interestingly, from the frequency analysis of CNV profiles, 52 of 576 CNVRs had a frequency rate of <1% in the 8842 individuals. Compared with other common CNV studies, this study found six common CNVRs that were not reported in previous CNV studies. In conclusion, we propose a data-driven detection approach to discover common CNVRs, including those unreported in the previous Korean CNV study, while minimizing false positives. Through our approach, we discovered more common CNVRs than the previous Korean CNV study and conducted a frequency analysis. These results will be a valuable resource for the effective level of CNVs in the Korean population.
common copy-number variation; CNV profile; Asian CNV; structural variation
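The overlap criterion used above (a CNVR counts as previously known if it shares at least 1 bp with a validated CNV event) can be sketched with a simple interval comparison. This is a minimal sketch under stated assumptions, not the study's implementation: the function names and tuple layout are hypothetical, and coordinates are assumed to be closed, 1-based intervals on a named chromosome.

```python
def shared_bp(a_start, a_end, b_start, b_end):
    """Number of base pairs shared by two closed, 1-based intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def fraction_overlapping(regions, known, min_bp=1):
    """Fraction of regions sharing at least min_bp with any known event.

    regions, known: lists of (chrom, start, end) tuples.
    A simple quadratic scan; adequate for a few hundred regions.
    """
    hits = 0
    for chrom, start, end in regions:
        if any(
            chrom == k_chrom and shared_bp(start, end, k_start, k_end) >= min_bp
            for k_chrom, k_start, k_end in known
        ):
            hits += 1
    return hits / len(regions)
```

With this definition, two intervals that merely touch at one coordinate, such as 100-200 and 200-300, share exactly 1 bp and therefore count as overlapping. The reported proportion follows the same arithmetic: 501 of 576 is about 87%.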
Copy-number variants (CNVs) are a source of genetic variation that increasingly are associated with human disease. However, the role of CNVs in human lifespan is to date unknown. To identify CNVs that influence mortality at old age, we analyzed genome-wide CNV data in 5178 participants of Rotterdam Study (RS1) and positive findings were evaluated in 1714 participants of the second cohort of the Rotterdam Study (RS2) and in 4550 participants of Framingham Heart Study (FHS). First, we assessed the total burden of rare (frequency <1%) and common (frequency >1%) CNVs for association with mortality during follow-up. These analyses were repeated by stratifying CNVs by type and size. Secondly, we assessed individual common CNV regions (CNVR) for association with mortality. We observed that the burden of common but not of rare CNVs influences mortality. A higher burden of large (≥500 kb) common deletions associated with 4% higher mortality [hazard ratio (HR) per CNV 1.04, 95% confidence interval (CI) 1.02–1.07, P = 5.82 × 10−5] in the 11 442 participants of RS1, RS2 and FHS. In the analysis of 312 individual common CNVRs, we identified two regions (11p15.5; 14q21.3) that associated with higher mortality in these cohorts. The 11p15.5 region (combined HR 1.59, 95% CI 1.31–1.93, P = 2.87 × 10−6) encompasses 41 genes, of which some have previously been related to longevity, whereas the 14q21.3 region (combined HR 1.57, 95% CI 1.19–2.07, P = 1.53 × 10−3) does not encompass any genes. In conclusion, the burden of large common deletions, as well as common CNVs in 11p15.5 and 14q21.3 region, associate with higher mortality.
Accurate and efficient genome-wide detection of copy number variants (CNVs) is essential for understanding human genomic variation, genome-wide CNV association studies, cytogenetics research and diagnostics, and independent validation of CNVs identified from sequencing-based technologies. Numerous array-based platforms for CNV detection exist, utilizing array Comparative Genome Hybridization (aCGH), Single Nucleotide Polymorphism (SNP) genotyping or both. We have quantitatively assessed the abilities of twelve leading genome-wide CNV detection platforms to accurately detect Gold Standard sets of CNVs in the genome of HapMap CEU sample NA12878, and found significant differences in performance. The technologies analyzed were the NimbleGen 4.2 M, 2.1 M and 3×720 K Whole Genome and CNV-focused arrays, the Agilent 1×1 M CGH and High Resolution and 2×400 K CNV and SNP+CGH arrays, the Illumina Human Omni1Quad array and the Affymetrix SNP 6.0 array. The Gold Standards used were a 1000 Genomes Project sequencing-based set of 3997 validated CNVs and an ultra-high-resolution aCGH-based set of 756 validated CNVs. We found that sensitivity, total number, size range and breakpoint resolution of CNV calls were highest for CNV-focused arrays. Our results are important for cost-effective CNV detection and validation for both basic and clinical applications.
Attention-deficit/hyperactivity disorder (ADHD) is a common, highly heritable neurodevelopmental disorder. Genetic loci have not yet been identified by genome-wide association studies. Rare copy number variations (CNVs), such as chromosomal deletions or duplications, have been implicated in ADHD and other neurodevelopmental disorders. To identify rare (frequency ⩽1%) CNVs that increase the risk of ADHD, we performed a whole-genome CNV analysis based on 489 young ADHD patients and 1285 adult population-based controls and identified one significantly associated CNV region. In tests for a global burden of large (>500 kb) rare CNVs, we observed a nonsignificant (P=0.271) 1.126-fold enriched rate of subjects carrying at least one such CNV in the group of ADHD cases. Locus-specific tests of association were used to assess if there were more rare CNVs in cases compared with controls. Detected CNVs, which were significantly enriched in the ADHD group, were validated by quantitative (q)PCR. Findings were replicated in an independent sample of 386 young patients with ADHD and 781 young population-based healthy controls. We identified rare CNVs within the parkinson protein 2 gene (PARK2) with a significantly higher prevalence in ADHD patients than in controls (P=2.8 × 10⁻⁴ after empirical correction for genome-wide testing). In total, the PARK2 locus (chr 6: 162 659 756–162 767 019) harboured three deletions and nine duplications in the ADHD patients and two deletions and two duplications in the controls. By qPCR analysis, we validated 11 of the 12 CNVs in ADHD patients (P=1.2 × 10⁻³ after empirical correction for genome-wide testing). In the replication sample, CNVs at the PARK2 locus were found in four additional ADHD patients and one additional control (P=4.3 × 10⁻²). Our results suggest that copy number variants at the PARK2 locus contribute to the genetic susceptibility of ADHD. Mutations and CNVs in PARK2 are known to be associated with Parkinson disease.
ADHD; children; CNVs; GWAS; PARK2
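A locus-specific case-control comparison such as the PARK2 test above can be sketched with a one-sided Fisher's exact test on a 2×2 table of carriers and non-carriers. This is an illustrative stand-in, not the authors' procedure: the reported P values come from an empirical correction for genome-wide testing, which this sketch does not reproduce. The function names are hypothetical, and the example counts assume that each of the 12 CNVs in patients and 4 CNVs in controls belongs to a different individual.

```python
from math import comb


def odds_ratio(case_carriers: int, case_non: int,
               ctrl_carriers: int, ctrl_non: int) -> float:
    """Sample odds ratio for a 2x2 carrier table (inf if undefined)."""
    denom = case_non * ctrl_carriers
    if denom == 0:
        return float("inf")
    return (case_carriers * ctrl_non) / denom


def fisher_one_sided(case_carriers: int, case_non: int,
                     ctrl_carriers: int, ctrl_non: int) -> float:
    """P(at least this many carriers among cases) under the null.

    Uses the hypergeometric distribution with all margins fixed.
    """
    n_cases = case_carriers + case_non
    n_carriers = case_carriers + ctrl_carriers
    total = n_cases + ctrl_carriers + ctrl_non
    upper = min(n_cases, n_carriers)
    tail = sum(
        comb(n_carriers, k) * comb(total - n_carriers, n_cases - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / comb(total, n_cases)


# Discovery-sample counts from the abstract, under the one-CNV-per-person
# assumption: 12 of 489 patients and 4 of 1285 controls carry a PARK2 CNV.
park2_or = odds_ratio(12, 489 - 12, 4, 1285 - 4)
park2_p = fisher_one_sided(12, 489 - 12, 4, 1285 - 4)
```

The resulting nominal P value is uncorrected for multiple testing and should not be compared directly with the empirically corrected values quoted in the abstract.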
Neural tube defects (NTDs) are common birth defects of complex etiology. Family and population-based studies have confirmed a genetic component to NTDs. However, despite more than three decades of research, the genes involved in human NTDs remain largely unknown. We tested the hypothesis that rare copy number variants (CNVs), especially de novo germline CNVs, are a significant risk factor for NTDs. We used array-based comparative genomic hybridization (aCGH) to identify rare CNVs in 128 Caucasian and 61 Hispanic patients with non-syndromic lumbar-sacral myelomeningocele. We also performed aCGH analysis on the parents of affected individuals with rare CNVs where parental DNA was available (42 sets). Among the eight de novo CNVs that we identified, three generated copy number changes of entire genes. One large heterozygous deletion removed 27 genes, including PAX3, a known spina bifida-associated gene. A second CNV altered genes (PGPD8, ZC3H6) for which little is known regarding function or expression. A third heterozygous deletion removed GPC5 and part of GPC6, genes encoding glypicans. Glypicans are proteoglycans that modulate the activity of morphogens such as Sonic Hedgehog (SHH) and bone morphogenetic proteins (BMPs), both of which have been implicated in NTDs. Additionally, glypicans function in the planar cell polarity (PCP) pathway, and several PCP genes have been associated with NTDs. Here, we show that GPC5 orthologs are expressed in the neural tube, and that inhibiting their expression in frog and fish embryos results in NTDs. These results implicate GPC5 as a gene required for normal neural tube development.
Copy number variations (CNVs) are important causal genetic variations for human disease; however, the lack of a statistical model has impeded the systematic testing of CNVs associated with disease in large-scale cohorts.
Here, we developed a novel integrated strategy to test CNV association in genome-wide case-control studies. We converted the single-nucleotide polymorphism (SNP) signal to copy number states using a well-trained hidden Markov model. We mapped the susceptible CNV loci through SNP site-specific testing to cope with the physiological complexity of CNVs. We also ensured the credibility of the associated CNVs through further window-based CNV-pattern clustering. Genome-wide data for seven diseases were used to test our strategy and, in total, we identified 36 new susceptible loci that are associated with CNVs for the seven diseases: 5 with bipolar disorder, 4 with coronary artery disease, 1 with Crohn's disease, 7 with hypertension, 9 with rheumatoid arthritis, 7 with type 1 diabetes and 3 with type 2 diabetes. Fifteen of these identified loci were validated through genotype association and physiological function from previous studies, which provides further confidence in our results. Notably, the genes associated with bipolar disorder converged in phosphoinositide/calcium signaling, a pathway well known to be affected in bipolar disorder, which further supports that CNVs have an impact on bipolar disorder.
Our results demonstrated the effectiveness and robustness of our CNV-association analysis and provided an alternative avenue for discovering new associated loci of human diseases.
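The step of converting SNP signal to copy number states with a hidden Markov model can be illustrated with a toy Viterbi decoder. This is a sketch under stated assumptions, not the model described above: it uses only a log R ratio (LRR) per SNP, three copy number states, Gaussian emissions, and hand-picked (hypothetical, untrained) means, standard deviation, and transition probabilities.

```python
import math
from typing import Dict, List

# Hypothetical, untrained parameters chosen only for illustration.
STATES = (1, 2, 3)                        # copy number: loss, normal, gain
MEAN_LRR = {1: -0.6, 2: 0.0, 3: 0.4}      # assumed mean LRR per state
SD_LRR = 0.2                              # assumed common standard deviation
P_STAY = 0.98                             # assumed self-transition probability
P_START = {1: 0.01, 2: 0.98, 3: 0.01}     # assumed initial distribution


def log_emission(lrr: float, state: int) -> float:
    """Log density of a Gaussian emission for the given state."""
    z = (lrr - MEAN_LRR[state]) / SD_LRR
    return -0.5 * z * z - math.log(SD_LRR * math.sqrt(2 * math.pi))


def log_transition(prev: int, cur: int) -> float:
    """Log transition probability; switches share the leftover mass."""
    if prev == cur:
        return math.log(P_STAY)
    return math.log((1 - P_STAY) / (len(STATES) - 1))


def viterbi(lrr_values: List[float]) -> List[int]:
    """Most likely copy number state per SNP given its LRR value."""
    if not lrr_values:
        return []
    score: Dict[int, float] = {
        s: math.log(P_START[s]) + log_emission(lrr_values[0], s)
        for s in STATES
    }
    backpointers: List[Dict[int, int]] = []
    for x in lrr_values[1:]:
        new_score: Dict[int, float] = {}
        pointer: Dict[int, int] = {}
        for cur in STATES:
            best_prev = max(
                STATES, key=lambda p: score[p] + log_transition(p, cur)
            )
            new_score[cur] = (score[best_prev]
                              + log_transition(best_prev, cur)
                              + log_emission(x, cur))
            pointer[cur] = best_prev
        score = new_score
        backpointers.append(pointer)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointer in reversed(backpointers):
        state = pointer[state]
        path.append(state)
    path.reverse()
    return path
```

With these parameters, a run of three consecutive SNPs near LRR −0.6 is decoded as a one-copy segment, whereas a single isolated dip is absorbed as noise. The high self-transition probability is what lets the model favor contiguous segments over per-SNP thresholding.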