Advances in high-throughput genotyping and the International HapMap Project have enabled association studies at the whole-genome level. We have constructed whole-genome genotyping panels of over 550,000 (HumanHap550) and 650,000 (HumanHap650Y) SNP loci by choosing tag SNPs from all populations genotyped by the International HapMap Project. These panels also contain additional SNP content in regions that have historically been overrepresented in diseases, such as nonsynonymous sites, the MHC region, copy number variant regions and mitochondrial DNA. We estimate that the tag SNP loci in these panels cover the majority of all common variation in the genome as measured by coverage of both all common HapMap SNPs and an independent set of SNPs derived from complete resequencing of genes obtained from SeattleSNPs. We also estimate that, given a sample size of 1,000 cases and 1,000 controls, these panels have the power to detect single disease loci of moderate risk (λ ∼ 1.8–2.0). Relative risks as low as λ ∼ 1.1–1.3 can be detected using 10,000 cases and 10,000 controls depending on the sample population and disease model. If multiple loci are involved, the power increases significantly to detect at least one locus such that relative risks 20%–35% lower can be detected with 80% power if between two and four independent loci are involved. Although our SNP selection was based on HapMap data, which is a subset of all common SNPs, these panels effectively capture the majority of all common variation and provide high power to detect risk alleles that are not represented in the HapMap data.
Advances in high-throughput genotyping technology and the International HapMap Project have enabled genetic association studies at the whole-genome level. Our paper describes two genome-wide SNP panels that contain tag SNPs derived from the International HapMap Project. Tag SNPs are proxies for groups of highly correlated SNPs. Information can be captured for the entire group of correlated SNPs by genotyping only one representative SNP, the tag SNP. These whole-genome SNP panels also contain additional content thought to be overrepresented in disease, such as amino acid–changing nonsynonymous SNPs and mitochondrial SNPs. We show that these panels cover the genome with very high efficiency as measured by coverage of all HapMap SNPs and a set of SNPs derived from completely resequenced genes from the Seattle SNPs database. We also show that these panels have high power to detect disease risk alleles for both HapMap and non-HapMap SNPs. In complex disease where multiple risk alleles are believed to be involved, we show that the ability to detect at least one risk allele with the tag SNP panels is also high.
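The tag SNP idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 0/1/2 minor-allele coding, the toy genotype vectors, the function names `r_squared` and `coverage`, and the r² ≥ 0.8 threshold are all hypothetical choices, not details of the panels described above. Genotype-based r² is used here as a simple stand-in for haplotype-based linkage disequilibrium.

```python
from statistics import mean

def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 coding)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    return sxy * sxy / (sxx * syy)

def coverage(tags, snps, threshold=0.8):
    """Fraction of SNPs captured by at least one tag at r^2 >= threshold (assumed cutoff)."""
    captured = sum(
        1 for s in snps if any(r_squared(t, s) >= threshold for t in tags)
    )
    return captured / len(snps)

# Hypothetical genotypes for eight individuals.
tag   = [0, 1, 2, 1, 0, 2, 1, 0]
snp_a = [0, 1, 2, 1, 0, 2, 1, 0]  # identical to the tag
snp_b = [2, 1, 0, 1, 2, 0, 1, 2]  # same information with alleles flipped
snp_c = [0, 0, 1, 2, 1, 0, 2, 1]  # essentially uncorrelated with the tag

panel_coverage = coverage([tag], [snp_a, snp_b, snp_c])
```

In this toy example the single tag captures two of the three SNPs, which is the sense in which genotyping one representative SNP recovers information about a correlated group.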
New technologies have enabled genome-wide association studies to be conducted with hundreds of thousands of genotyped SNPs. Several different first-generation genome-wide panels of SNPs have been commercialized. The total amount of common genetic variation is still unknown; however, the coverage of commercial panels can be evaluated against reference population samples genotyped by the International HapMap Project. Less information is available about coverage in samples from other populations.
In this study we compare four commercial panels: the HumanHap 300 and HumanHap 550 Array Sets from the Illumina Infinium series and the Mapping 100 K and Mapping 500 K Array Sets from the Affymetrix GeneChip series. Tagging performance is compared among HapMap CEPH (CEU), Asian (JPT, CHB) and Yoruba (YRI) population samples. It is also evaluated in an Estonian population sample with more than 1000 individuals genotyped in two 500-kbp ENCODE regions of chromosome 2: ENr112 on 2p16.3 and ENr131 on 2q37.1.
We found that in a non-reference Caucasian population, commercial SNP panels provide levels of coverage similar to those in the HapMap CEPH population sample. We present the proportions of universal and population-specific SNPs in all the commercial platforms studied.
Obesity is an increasingly common disorder that predisposes to several medical conditions, including type 2 diabetes. We investigated whether large and rare copy-number variations (CNVs) differentiate moderate to extreme obesity from never-overweight control subjects.
RESEARCH DESIGN AND METHODS
Using single nucleotide polymorphism (SNP) arrays, we performed a genome-wide CNV survey on 430 obese case subjects (BMI >35 kg/m2) and 379 never-overweight control subjects (BMI <25 kg/m2). All subjects were of European ancestry and were genotyped on the Illumina HumanHap550 arrays with ∼550,000 SNP markers. The CNV calls were generated by PennCNV software.
CNVs >1 Mb were found to be overrepresented in case versus control subjects (odds ratio [OR] = 1.5 [95% CI 0.5–5]), and CNVs >2 Mb were present in 1.3% of the case subjects but were absent in control subjects (OR = infinity [95% CI 1.2–infinity]). When focusing on rare deletions that disrupt genes, even more pronounced effect sizes are observed (OR = 2.7 [95% CI 0.5–27.1] for CNVs >1 Mb). Interestingly, obese case subjects who carry these large CNVs have moderately high BMI and do not appear to be extreme cases. Several CNVs disrupt known candidate genes for obesity, such as a 3.3-Mb deletion disrupting NAP1L5 and a 2.1-Mb deletion disrupting UCP1 and IL15.
Our results suggest that large CNVs, especially rare deletions, confer risk of obesity in patients with moderate obesity and that genes impacted by large CNVs represent intriguing candidates for obesity that warrant further study.
High-density single-nucleotide polymorphism (SNP) genotyping technology enables extensive genotyping as well as the detection of increasingly smaller chromosomal aberrations. In this study, we assess molecular karyotyping as first-round analysis of patients with mental retardation and/or multiple congenital abnormalities (MR/MCA). We used different commercially available SNP array platforms, the Affymetrix GeneChip 262K NspI, the GeneChip 238K StyI, the Illumina HumanHap 300 and HumanCNV 370 BeadChip, to detect copy number variants (CNVs) in 318 patients with unexplained MR/MCA. We found abnormalities in 22.6% of the patients, including six CNVs that overlap known microdeletion/duplication syndromes, eight CNVs that overlap recently described syndromes, 63 potentially pathogenic CNVs (in 52 patients), four large segments of homozygosity and two mosaic trisomies for an entire chromosome. This study shows that high-density SNP array analysis reveals a much higher diagnostic yield than that of conventional karyotyping. SNP arrays have the potential to detect CNVs, mosaics, uniparental disomies and loss of heterozygosity in one experiment. We, therefore, propose a novel diagnostic approach to all MR/MCA patients by first analyzing every patient with an SNP array instead of conventional karyotyping.
SNP array; mental retardation; copy number variants; diagnostic workflow
Bipolar disorder (BPD) is a common psychiatric illness with a complex mode of inheritance. Besides traditional linkage and association studies, which require large sample sizes, analysis of common and rare chromosomal copy number variants (CNVs) in extended families may provide novel insights into the genetic susceptibility of complex disorders. Using the Illumina HumanHap550 BeadChip with over 550,000 SNP markers, we genotyped 46 individuals in a three-generation Old Order Amish pedigree with 19 affected (16 BPD and three major depression) and 27 unaffected subjects. Using the PennCNV algorithm, we identified 50 CNV regions that ranged in size from 12 to 885 kb and encompassed at least 10 single nucleotide polymorphisms (SNPs). Of 19 well-characterized CNV regions that were available for combined genotype-expression analysis, 11 (58%) were associated with expression changes of genes within, partially within or near these CNV regions in fibroblasts or lymphoblastoid cell lines at a nominal P value <0.05. To further investigate the mode of inheritance of CNVs in the large pedigree, we analyzed a set of four CNVs, located at 6q27, 9q21.11, 12p13.31 and 15q11, all of which were enriched in subjects with affective disorders. We additionally show that these variants affect the expression of neuronal genes within or near the rearrangement. Our analysis suggests that family-based studies of the combined effect of common and rare CNVs at many loci may represent a useful approach in the genetic analysis of disease susceptibility of mental disorders.
In the present study, DNA from 28 pediatric low-grade astrocytomas was analyzed using Illumina HumanHap550K single-nucleotide polymorphism oligonucleotide arrays. A novel duplication in chromosome band 7q34 was identified in 17 of 22 juvenile pilocytic astrocytomas and three of six fibrillary astrocytomas. The 7q34 duplication spans 2.6 Mb of genomic sequence and contains approximately 20 genes, including two candidate tumor genes, HIPK2 and BRAF. There were no abnormalities in HIPK2, and analysis of two mutation hot-spots in BRAF revealed a V600E mutation in only one tumor without the duplication. Fluorescence in situ hybridization confirmed the 7q34 copy number change and was suggestive of a tandem duplication. Reverse transcription polymerase chain reaction-based sequencing revealed a fusion product between KIAA1549 and BRAF. The predicted fusion product includes the BRAF kinase domain and lacks the auto-inhibitory N-terminus. Western blot analysis revealed phosphorylated mitogen-activated protein kinase (MAPK) protein in tumors with the duplication, consistent with BRAF-induced activation of the pathway. Further studies are required to determine the role of this fusion gene in downstream MAPK signaling and its role in development of pediatric low-grade astrocytomas.
astrocytoma; BRAF; glioma; HIPK2; SNP array; 7q34
Studies of structural genomic variation, together with the development of microarray technology, have provided many genomic resources on the architecture of the human genome and have shown that its structure is considerably more complicated than previously thought.
For the International HapMap Project, Epstein-Barr virus-immortalized cell lines were used in preference to blood in order to obtain larger amounts of genomic DNA. However, genomic aberrations stemming from the immortalization process, biased representation of the donor tissue, and the culture process may influence the accuracy of SNP genotypes. To identify chromosome aberrations, including loss of heterozygosity (LOH) and large-scale and small-scale copy number variations (CNVs), we used the Illumina HumanHap550 BeadChip (555,352 markers) on Korean HapMap individuals (n = 90) to obtain log R ratio and B allele frequency information, and then analyzed the data with several programs, including Illumina ChromoZone, cnvPartition and PennCNV. As a result, we identified 28 LOHs (>3 Mb) and 35 large-scale CNVs (>1 Mb), with 4 samples having a completely duplicated chromosome. In addition, after checking the sample quality (standard deviation of log R ratio <0.30), we selected 79 samples and used signal intensity and B allele frequency simultaneously to identify small-scale CNVs (<1 Mb), discovering 4,989 small-scale CNVs. The CNVs identified in this study were successfully validated by visual examination of the genoplot images, overlap analysis with previously reported CNVs in DGV, and quantitative PCR.
In this study, we describe the chromosome aberrations identified in Korean HapMap individuals, and expect that these findings will provide more meaningful information on the human genome.
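The sample-quality step mentioned above (keeping samples whose log R ratio standard deviation is below 0.30) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the sample names and log R ratio values are invented, the function names are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) is an assumption, since the text does not say how the standard deviation was computed.

```python
from statistics import stdev

LRR_SD_THRESHOLD = 0.30  # cutoff quoted in the text

def passes_qc(lrr_values, threshold=LRR_SD_THRESHOLD):
    """True if the sample's log R ratio standard deviation is below the cutoff."""
    return stdev(lrr_values) < threshold

def filter_samples(samples):
    """Return the names of samples that pass the log R ratio QC filter."""
    return [name for name, lrr in samples.items() if passes_qc(lrr)]

# Hypothetical per-marker log R ratios for two samples.
samples = {
    "sample_clean": [0.05, -0.10, 0.02, 0.08, -0.04, 0.00],
    "sample_noisy": [0.60, -0.55, 0.40, -0.70, 0.35, -0.45],
}

kept = filter_samples(samples)
```

Only the low-noise sample is retained here; in a real analysis the standard deviation would be computed over all autosomal markers of each array.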
Genome-wide association (GWA) studies to map genes for complex traits are powerful yet costly. DNA-pooling strategies have the potential to dramatically reduce the cost of GWA studies. Pooling using Affymetrix arrays has been proposed and used but the efficiency of these arrays has not been quantified. We compared and contrasted Affymetrix GeneChip HindIII and Illumina HumanHap300 arrays on the same DNA pools and showed that the HumanHap300 arrays are substantially more efficient. In terms of effective sample size, HumanHap300-based pooling extracts >80% of the information available with individual genotyping (IG). In contrast, GeneChip HindIII-based pooling only extracts ∼30% of the available information. With HumanHap300 arrays concordance with IG data is excellent. Guidance is given on best study design and it is shown that even after taking into account pooling error, one-stage scans can be performed for >100-fold reduced cost compared with IG. With appropriately designed two-stage studies, IG can provide confirmation of pooling results whilst still providing ∼20-fold reduction in total cost compared with IG-based alternatives. The large cost savings with Illumina HumanHap300-based pooling imply that future studies need only be limited by the availability of samples and not cost.
Genotype imputation, used in genome-wide association studies to expand coverage of single nucleotide polymorphisms (SNPs), has performed poorly in African Americans compared to less admixed populations. Overall, imputation has typically relied on HapMap reference haplotype panels from Africans (YRI), European Americans (CEU), and Asians (CHB/JPT). The 1000 Genomes project offers a wider range of reference populations, such as African Americans (ASW), but their imputation performance has had limited evaluation. Using 595 African Americans genotyped on Illumina’s HumanHap550v3 BeadChip, we compared imputation results from four software programs (IMPUTE2, BEAGLE, MaCH, and MaCH-Admix) and three reference panels consisting of different combinations of 1000 Genomes populations (February 2012 release): (1) 3 specifically selected populations (YRI, CEU, and ASW); (2) 8 populations of diverse African (AFR) or European (EUR) descent; and (3) all 14 available populations (ALL). Based on chromosome 22, we calculated three performance metrics: (1) concordance (percentage of masked genotyped SNPs with imputed and true genotype agreement); (2) imputation quality score (IQS; concordance adjusted for chance agreement, which is particularly informative for low minor allele frequency [MAF] SNPs); and (3) average r2hat (estimated correlation between the imputed and true genotypes, for all imputed SNPs). Across the reference panels, IMPUTE2 and MaCH had the highest concordance (91%–93%), but IMPUTE2 had the highest IQS (81%–83%) and average r2hat (0.68 using YRI+ASW+CEU, 0.62 using AFR+EUR, and 0.55 using ALL). Imputation quality for most programs was reduced by the addition of more distantly related reference populations, due entirely to the introduction of low frequency SNPs (MAF≤2%) that are monomorphic in the more closely related panels.
While imputation was optimized by using IMPUTE2 with reference to the ALL panel (average r2hat = 0.86 for SNPs with MAF>2%), use of the ALL panel for African American studies requires careful interpretation of the population specificity and imputation quality of low frequency SNPs.
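To illustrate why a chance-adjusted score is more informative than raw concordance for low-MAF SNPs, the following Python sketch computes both for best-guess genotypes. It assumes a kappa-style adjustment, (observed agreement − chance agreement) / (1 − chance agreement); the abstract describes IQS only as "concordance adjusted for chance agreement", so this exact formula, the function names and the toy data are assumptions for illustration.

```python
from collections import Counter

def concordance(true_g, imputed_g):
    """Proportion of genotypes where the imputed call equals the true call."""
    matches = sum(t == i for t, i in zip(true_g, imputed_g))
    return matches / len(true_g)

def iqs(true_g, imputed_g):
    """Kappa-style chance-adjusted agreement (assumed form of the IQS)."""
    n = len(true_g)
    p_obs = concordance(true_g, imputed_g)
    true_counts = Counter(true_g)
    imp_counts = Counter(imputed_g)
    # Agreement expected by chance from the marginal genotype counts.
    p_chance = sum(true_counts[g] * imp_counts[g] for g in (0, 1, 2)) / (n * n)
    if p_chance == 1.0:
        return float("nan")  # undefined when both sets show a single genotype
    return (p_obs - p_chance) / (1.0 - p_chance)

# Hypothetical rare SNP: 2 of 20 individuals are heterozygous,
# but the imputation calls everyone homozygous for the major allele.
true_rare = [0] * 18 + [1] * 2
imputed_rare = [0] * 20

rare_concordance = concordance(true_rare, imputed_rare)
rare_iqs = iqs(true_rare, imputed_rare)
```

In this toy case 18 of 20 calls agree, yet the chance-adjusted score is zero because always calling the major homozygote achieves the same agreement without any information.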
Familial aggregation of ischemic stroke derives from shared genetic and environmental factors. We present a meta-analysis of genome-wide association scans (GWAS) from 3 cohorts to identify the contribution of common variants to ischemic stroke risk.
This study involved 1464 ischemic stroke cases and 1932 controls. Cases were genotyped using the Illumina 610 or 660 genotyping arrays; controls, with Illumina HumanHap 550Kv1 or 550Kv3 genotyping arrays. Imputation was performed with the 1000 Genomes European ancestry haplotypes (August 2010 release) as a reference. A total of 5,156,597 single-nucleotide polymorphisms (SNPs) were incorporated into the fixed effects meta-analysis. All SNPs associated with ischemic stroke (P<1×10−5) were incorporated into a multivariate risk profile model.
No SNP reached genome-wide significance for ischemic stroke (P<5×10−8). Secondary analysis identified a significant cumulative effect for age at onset of stroke (first versus fifth quintile of cumulative profiles based on SNPs associated with late onset, β = 14.77 [10.85,18.68], P = 5.5×10−12), as well as a strong effect showing increased risk across samples with a high propensity for stroke among samples with enriched counts of suggestive risk alleles (P<5×10−6). Risk profile scores based only on genomic information offered little incremental prediction.
There is little evidence of a common genetic variant contributing to moderate risk of ischemic stroke. Quintiles based on genetic loading of alleles associated with a younger age at onset of ischemic stroke revealed a significant difference in age at onset between those in the upper and lower quintiles. Using common variants from GWAS and imputation, genomic profiling remains inferior to family history of stroke for defining risk. Inclusion of genomic (rare variant) information may be required to improve clinical risk profiling.
The use of genome-wide single nucleotide polymorphism (SNP) data has recently proven useful in the study of human population structure. We have studied the internal genetic structure of the Swedish population using more than 350,000 SNPs from 1525 Swedes from all over the country genotyped on the Illumina HumanHap550 array. We have also compared them to 3212 worldwide reference samples, including Finns, northern Germans, British and Russians, based on the more than 29,000 SNPs that overlap between the Illumina and Affymetrix 250K Sty arrays. The Swedes - especially southern Swedes - were genetically close to the Germans and British, while their genetic distance to Finns was substantially longer. The overall structure within Sweden appeared clinal, and the substructure in the southern and middle parts was subtle. In contrast, the northern part of Sweden, Norrland, exhibited pronounced genetic differences both within the area and relative to the rest of the country. These distinctive genetic features of Norrland probably result mainly from isolation by distance and genetic drift caused by low population density. The internal structure within Sweden (FST = 0.0005 between provinces) was stronger than that in many Central European populations, although smaller than what has been observed for instance in Finland; importantly, it is of the magnitude that may hamper association studies with a moderate number of markers if cases and controls are not properly matched geographically. Overall, our results underline the potential of genome-wide data in analyzing substructure in populations that might otherwise appear relatively homogeneous, such as the Swedes.
Chromosome 15q24 microdeletion syndrome is a rare genomic disorder characterised by intellectual disability, growth retardation, unusual facial morphology and other anomalies. To date, 20 patients have been reported; 18 have had detailed breakpoint analysis.
To further delineate the features of the 15q24 microdeletion syndrome, the clinical and molecular characterisation of fifteen patients with deletions in the 15q24 region was performed, nearly doubling the number of reported patients.
Breakpoints were characterised using a custom, high-density array comparative hybridisation platform, and detailed phenotype information was collected for each patient.
Nine distinct deletions with different breakpoints ranging in size from 266 kb to 3.75 Mb were identified. The majority of breakpoints lie within segmental duplication (SD) blocks. Low sequence identity and large intervals of unique sequence between SD blocks likely contribute to the rarity of 15q24 deletions, which occur 8–10 times less frequently than 1q21 or 15q13 microdeletions in our series. Two small, atypical deletions were identified within the region that help delineate the critical region for the core phenotype in the 15q24 microdeletion syndrome.
The molecular characterisation of these patients suggests that the core cognitive features of the 15q24 microdeletion syndrome, including developmental delays and severe speech problems, are largely due to deletion of genes in a 1.1–Mb critical region. However, genes just distal to the critical region also play an important role in cognition and in the development of characteristic facial features associated with 15q24 deletions. Clearly, deletions in the 15q24 region are variable in size and extent. Knowledge of the breakpoints and size of deletion combined with the natural history and medical problems of our patients provide insights that will inform management guidelines. Based on common phenotypic features, all patients with 15q24 microdeletions should receive a thorough neurodevelopmental evaluation, physical, occupational and speech therapies, and regular audiologic and ophthalmologic screening.
Academic medicine; clinical genetics; epilepsy and seizures; cytogenetics; molecular genetics; genetics; copy-number; developmental; neurology; neuroophthalmology; cancer: breast; cancer: colon; genetic screening/counselling; obstetrics and gynaecology
African Americans have increased susceptibility to non-diabetic (non-DM) forms of end-stage renal disease (ESRD) and extensive evidence supports a genetic contribution. A genome-wide association study (GWAS) using pooled DNA was performed in 1,000 African Americans to detect associated genes. DNA from 500 non-DM ESRD cases and 500 non-nephropathy controls was quantified using gel electrophoresis and spectrophotometric analysis and pools of 50 case and 50 control DNA samples were created. DNA pools were genotyped in duplicate on the Illumina HumanHap550-Duo BeadChip. Normalization methods were developed and applied to array intensity values to reduce inter-array variance. Allele frequencies were calculated from normalized channel intensities and compared between case and control pools. Three SNPs had p values of <1.0E–6: rs4462445 (ch 13), rs4821469 (ch 22) and rs8077346 (ch 17). After normalization, top scoring SNPs (n = 65) were genotyped individually in 464 of the original cases and 478 of the controls, with replication in 336 non-DM ESRD cases and 363 non-nephropathy controls. Sixteen SNPs were associated with non-DM ESRD (p < 7.7E–4, Bonferroni corrected). Twelve of these SNPs are in or near the MYH9 gene. The four non-MYH9 SNPs that were associated with non-DM ESRD in the pooled samples were not associated in the replication set. Five SNPs that were modestly associated in the pooled samples were more strongly associated in the replication and/or combined samples. This is the first GWAS for non-DM ESRD in African Americans using pooled DNA. We demonstrate strong association between non-DM ESRD in African Americans with MYH9, and have identified additional candidate loci.
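Two of the numerical steps in this abstract can be sketched in Python. The Bonferroni threshold reproduces the quoted 7.7E–4 under the assumption of a family-wise alpha of 0.05 divided over the 65 individually genotyped SNPs (the alpha level is not stated in the text). The pooled allele frequency function is only a naive intensity ratio with hypothetical intensity values; it is not the normalization method the authors developed, which the abstract does not describe.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests

def pooled_allele_frequency(intensity_a, intensity_b):
    """Naive allele A frequency estimate from two channel intensities (assumed form)."""
    return intensity_a / (intensity_a + intensity_b)

# Assumed alpha = 0.05 over the 65 SNPs taken forward to individual genotyping.
threshold = bonferroni_threshold(0.05, 65)
print(f"{threshold:.1e}")  # 7.7e-04

# Hypothetical normalized intensities for one SNP in a case pool and a control pool.
case_freq = pooled_allele_frequency(3000.0, 1000.0)
control_freq = pooled_allele_frequency(2000.0, 2000.0)
freq_difference = case_freq - control_freq
```

In practice, unequal amplification of the two alleles usually calls for a correction factor estimated from heterozygous individuals, which this sketch omits.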
Neuroblastoma is a malignancy of the developing sympathetic nervous system that most commonly affects young children and is often lethal. The etiology of this embryonal cancer is not known.
We performed a genome-wide association study by first genotyping 1,032 neuroblastoma patients and 2,043 controls of European descent using the Illumina HumanHap550 BeadChip. Three independent groups of neuroblastoma cases (N=720) and controls (N=2128) were then genotyped to replicate significant associations.
We observed highly significant association between neuroblastoma and the common minor alleles of three single nucleotide polymorphisms (SNPs) within a 94.2 kilobase (Kb) linkage disequilibrium block at chromosome band 6p22 containing the predicted genes FLJ22536 and FLJ44180 (P-value range = 1.71×10−9 to 7.01×10−10; allelic odds ratio range 1.39–1.40). Homozygosity for the at-risk G allele of the most significantly associated SNP, rs6939340, resulted in an increased likelihood of developing neuroblastoma of 1.97 (95% CI 1.58–2.44). Subsequent genotyping of these 6p22 SNPs in the three independent case series confirmed our observation of association (P=9.33×10−15 at rs6939340 for joint analysis). Furthermore, neuroblastoma patients homozygous for the risk alleles at 6p22 were more likely to develop metastatic (Stage 4) disease (P=0.02), show amplification of the MYCN oncogene in the tumor cells (P=0.006), and to have disease relapse (P=0.01).
Common genetic variation at chromosome band 6p22 is associated with susceptibility to neuroblastoma.
Inflammatory bowel disease, including Crohn's disease (CD) and ulcerative colitis (UC), and type 1 diabetes (T1D) are autoimmune diseases that may share common susceptibility pathways. We examined known susceptibility loci for these diseases in a cohort of 1689 CD cases, 777 UC cases, 989 T1D cases and 6197 shared control subjects of European ancestry, who were genotyped by the Illumina HumanHap550 SNP arrays. We identified multiple previously unreported or unconfirmed disease associations, including known CD loci (ICOSLG and TNFSF15) and T1D loci (TNFAIP3) that confer UC risk, known UC loci (HERC2 and IL26) that confer T1D risk and known UC loci (IL10 and CCNY) that confer CD risk. Additionally, we show that T1D risk alleles residing at the PTPN22, IL27, IL18RAP and IL10 loci protect against CD. Furthermore, the strongest risk alleles for T1D within the major histocompatibility complex (MHC) confer strong protection against CD and UC; however, given the multi-allelic nature of the MHC haplotypes, sequencing of the MHC locus will be required to interpret this observation. These results extend our current knowledge on genetic variants that predispose to autoimmunity, and suggest that many loci involved in autoimmunity may be under a balancing selection due to antagonistic pleiotropic effect. Our analysis implies that variants with opposite effects on different diseases may facilitate the maintenance of common susceptibility alleles in human populations, making autoimmune diseases especially amenable to genetic dissection by genome-wide association studies.
The combination of megalencephaly, perisylvian polymicrogyria, polydactyly and hydrocephalus (MPPH) is a rare syndrome of unknown cause. We observed two first cousins affected by an MPPH-like phenotype with a submicroscopic chromosome 5q35 deletion as a result of an unbalanced der(5)t(5;20)(q35.2;q13.3) translocation, including the NSD1 Sotos syndrome locus. We describe the phenotype and the deletion breakpoints of the two MPPH-like patients and compare these with five unrelated MPPH and Sotos patients harboring a 5q35 microdeletion. Mapping of the breakpoints in the two cousins was performed by MLPA, FISH, high density SNP-arrays and Q-PCR for the 5q35 deletion and 20q13 duplication. The 5q35 deletion area of the two cousins almost completely overlaps with earlier described patients with an atypical Sotos microdeletion, except for the DRD1 gene. The five unrelated MPPH patients neither showed submicroscopic chromosomal aberrations nor DRD1 mutations. We reviewed the brain MRI of 10 Sotos patients and did not detect polymicrogyria in any of them. In our two cousins, the MPPH-like phenotype is probably caused by the contribution of genes on both chromosome 5q35 and 20q13. Some patients with MPPH may harbor a submicroscopic chromosomal aberration and therefore high-resolution array analysis should be part of the diagnostic workup.
Megalencephaly; Polymicrogyria; Polydactyly; Hydrocephalus; microdeletion; 5q35.2; 20q13.3
Single nucleotide polymorphism (SNP) genotyping has emerged as a technology to incorporate copy-number variants (CNVs) into genetic analyses of human traits. However, the extent to which SNP platforms accurately capture CNVs remains unclear. Using independent, sequence-based CNV maps, we find that commonly used SNP platforms have limited or no probe coverage for a large fraction of CNVs. Despite this, in nine samples we inferred 368 CNVs using Illumina SNP genotyping data and experimentally validated over two-thirds of these. We also developed a method (SCIMM) to robustly genotype deletions using as few as two SNP probes. We find that HapMap SNPs are strongly correlated with 82% of common deletions, but the newest SNP platforms effectively tag about 50%. We conclude that currently available genome-wide SNP assays can capture CNVs accurately, but improvements in array designs, particularly in duplicated sequences, are necessary to facilitate more comprehensive analyses of genomic variation.
Genome-wide studies on autism spectrum disorders (ASDs) have mostly focused on large-scale population samples, but examination of rare variations in isolated populations may provide additional insights into the disease pathogenesis.
As a first step in the genetic analysis of ASD in Croatia, we characterized genetic variation in a sample of 103 subjects with ASD and 203 control individuals, who were genotyped using the Illumina HumanHap550 BeadChip. We analyzed the genetic diversity of the Croatian population and its relationship to other populations, the degree of relatedness via Runs of Homozygosity (ROHs), and the distribution of large (>500 Kb) copy number variations.
Combining the Croatian cohort with several previously published populations in the FastME analysis (an alternative to Neighbor Joining) revealed that Croatian subjects cluster, as expected, with Southern Europeans; in addition, individuals from the same geographic region within Europe cluster together. Whereas Croatian subjects could be separated from a sample of healthy control subjects of European origin from North America, Croatian ASD cases and controls are well mixed. A comparison of runs of homozygosity indicated that the number and the median length of regions of homozygosity are higher for ASD subjects than for controls (p = 6 × 10−3). Furthermore, analysis of copy number variants found a higher frequency of large chromosomal rearrangements (>2 Mb) in ASD cases (5/103) than in ethnically matched control subjects (1/197, p = 0.019).
Our findings illustrate the remarkable utility of high-density genotype data for subjects from a limited geographic area in dissecting genetic heterogeneity with respect to population and disease related variation.
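The comparison of large rearrangements in cases (5/103) and controls (1/197) can be checked with a small Python sketch of Fisher's exact test. The abstract does not say which test produced p = 0.019, so the choice of a two-sided Fisher's exact test is an assumption, as is the function name; the counts are taken from the text.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    The p-value is the total probability of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Carriers vs non-carriers of >2 Mb rearrangements: 5 of 103 cases, 1 of 197 controls.
p_value = fisher_exact_two_sided(5, 98, 1, 196)
```

Under these assumptions the result is close to the reported p = 0.019, which suggests that an exact test on the 2x2 table is at least consistent with the published value.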
Chromosomal imbalances, recognized as the major cause of mental retardation, are often due to submicroscopic deletions or duplications not evidenced by conventional cytogenetic methods. To date, interstitial deletions of the long arm of chromosome 2 have been reported in more than 100 cases, although studies reporting small interstitial deletions involving the 2q24.1q24.2 region are rare. With the widespread clinical use of comparative genomic hybridization chromosomal microarray technology, several cryptic chromosome imbalances have outlined new genotype-phenotype correlations and isolated a number of distinctive clinical conditions.
Here we report on a girl with mental retardation and generalized hypotonia. A genome-wide screen for copy number variations (CNVs) using a single nucleotide polymorphism (SNP) array revealed a 7.5 Mb interstitial deletion of chromosome region 2q24.1q24.2 encompassing 59 genes, which was absent in the parents. Gene content analysis of the deleted region and a review of the literature revealed several genes that may be good candidates for generating the main clinical features of the patient.
The present case represents a further patient described in the literature with an interstitial deletion of chromosome 2q24.1q24.2. Our patient shares some clinical features with previously reported patients carrying overlapping 2q24 deletions. Although more cases are needed to delineate the full-blown phenotype of the 2q24.1q24.2 deletion syndrome, published data and the present observation suggest that hemizygosity of this region results in a clinically recognizable phenotype. Considering these clinical and cytogenetic similarities, we suggest the existence of an emerging syndrome associated with the 2q24.1q24.2 region.
mental retardation; 2q24.1q24.2; array comparative genomic hybridization
Over two hundred asthma candidate genes have been examined in human association studies or identified with knockout mouse approaches. However, many have not been systematically replicated in human populations, especially those containing a large number of tagging single nucleotide polymorphisms (SNPs).
We comprehensively evaluated the association of previously implicated asthma candidate genes with childhood asthma in a Mexico City population.
We identified, from the literature, candidate genes with at least one positive report of association with asthma phenotypes in humans or implicated in asthma pathogenesis by knockout mouse experiments. We performed a genome-wide association study in 492 asthmatic children aged 5 to 17 years and both parents using the Illumina HumanHap 550v3 BeadChip. Separate candidate gene analyses were performed for 2,933 autosomal SNPs in the 237 selected genes using the log-linear method with a log-additive risk model.
Sixty-one of the 237 genes had at least one SNP with p < 0.05 for association with asthma. The nine most significant results were observed for rs2241715 in TGFB1 (p=3.3×10−5), rs13431828 and rs1041973 in IL1RL1 (p=2×10−4 and 3.5×10−4), five SNPs in DPP10 (p=1.6×10−4 to 4.5×10−4), and rs17599222 in CYFIP2 (p=4.1×10−4). False discovery rates were <0.1 for all 9 SNPs. Multimarker analysis identified TGFB1, IL1RL1, IL18R1, and DPP10 as the genes most significantly associated with asthma.
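The abstract reports false discovery rates without naming the procedure used. As a minimal sketch, assuming the common Benjamini-Hochberg step-up procedure (an assumption, not confirmed by the source), adjusted p-values could be computed as follows. The function name and the toy p-values are illustrative and are not study data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the study's FDR method is not stated in the abstract;
    this is one widely used procedure, shown for illustration only.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy input for illustration only (not taken from the study).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.60]
toy_q = benjamini_hochberg(toy_p)
```

In the study's setting the input would be the p-values for all 2,933 tested SNPs, and a SNP would be reported when its adjusted value falls below the chosen threshold (0.1 in the abstract).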
This comprehensive analysis of literature-based candidate genes suggests that SNPs in several candidate genes including TGFB1, IL1RL1, IL18R1 and DPP10 may contribute to childhood asthma susceptibility in a Mexican population.
Allergy; asthma; genetic predisposition to disease; genome-wide association study (GWAS); single nucleotide polymorphism (SNP)
Tracheal agenesis (TA) is a rare congenital anomaly of the respiratory tract. Many patients have associated anomalies, suggesting a syndromal phenotype. In a cohort of 12 patients, we aimed to detect copy number variations. In addition to routine cytogenetic analysis, we applied oligonucleotide array comparative genomic hybridization. Our patient cohort showed various copy number variations, many of which were parentally inherited variants. One patient had, in addition to an inherited 16p12.1 deletion, a 3.6 Mb deletion on chromosomal locus 5q11.2. This patient had a syndromic phenotype, including vertebral, anal, cardiovascular and tracheo-oesophageal associated anomalies, and other foregut-related anomalies, such as cartilage rings in the oesophagus and an aberrant right bronchus. No common deletions or duplications were found in our cohort, suggesting that TA is a genetically heterogeneous disorder.
tracheal agenesis; array comparative genomic hybridization; 5q11; deletion; VACTERL; TACRD
Background: Array comparative genomic hybridisation is a powerful tool for the detection of copy number changes in the genome.
Methods: A human X and Y chromosome tiling path array was developed for the analysis of sex chromosome aberrations.
Results: Normal X and Y chromosome profiles were established by analysis with DNA from normal fertile males and females. Detection of infertile males with known Y deletions confirmed the competence of the array to detect AZFa, AZFb and AZFc deletions and to distinguish between different AZFc lesions. Examples of terminal and interstitial deletions of Xp (previously characterised through cytogenetic and microsatellite analysis) have been assessed using the arrays, thus both confirming and refining the established deletion breakpoints. Breakpoints in iso‐Yq, iso‐Yp and X–Y translocation chromosomes and X–Y interchanges in XX males are also amenable to analysis.
Discussion: The resolution of the tiling path clone set used allows breakpoints to be placed within 100–200 kb, permitting more precise genotype/phenotype correlations. These data indicate that the combined X and Y tiling path arrays provide an effective tool for the investigation and diagnosis of sex chromosome copy number aberrations and rearrangements.
Using single-nucleotide polymorphisms (SNPs), we sought to predict classical class I and class II human leukocyte antigen (HLA) alleles, and to test for their associations with rheumatoid arthritis (RA) in the North American Rheumatoid Arthritis Consortium sample of cases and controls, genotyped on the Illumina HumanHap550 BeadChip. We used publicly available databases of SNP data and HLA data to find SNPs or SNP-haplotypes to be used as surrogates for each HLA allele. To reduce the confounding effects of linkage disequilibrium with the HLA-DRB1 locus, we tested for association conditional on the presence or absence of a shared epitope allele on the same haplotype as the target HLA allele. Using SNP surrogates, we found that components of the DQ8 serotype (DQA1*0301:DQB1*0302) are associated with RA, irrespective of the presence or absence of a shared epitope allele on their respective haplotypes. Knowledge of the haplotype structure in the HLA region is still necessary for better interpretation of the results.
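The abstract does not say how a SNP was judged to be a good surrogate for an HLA allele. One plausible criterion, assumed here purely for illustration, is the squared correlation (r²) between the SNP allele and the HLA allele across phased reference haplotypes. The function name and inputs below are hypothetical.

```python
def r_squared(hla_indicator, snp_indicator):
    """Squared correlation (r^2) between two 0/1 indicators over haplotypes.

    hla_indicator[i] is 1 if haplotype i carries the target HLA allele.
    snp_indicator[i] is 1 if haplotype i carries the candidate SNP allele.
    Assumption: r^2 is used as the surrogate criterion; the source does
    not specify one.
    """
    n = len(hla_indicator)
    p_a = sum(hla_indicator) / n          # frequency of the HLA allele
    p_b = sum(snp_indicator) / n          # frequency of the SNP allele
    # Frequency of haplotypes carrying both alleles.
    p_ab = sum(1 for h, s in zip(hla_indicator, snp_indicator) if h and s) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                        # a monomorphic marker carries no information
    d = p_ab - p_a * p_b                  # linkage disequilibrium coefficient D
    return d * d / denom


# Illustrative haplotypes only (not study data).
perfect_tag = r_squared([1, 1, 0, 0], [1, 1, 0, 0])
no_tag = r_squared([1, 1, 0, 0], [1, 0, 1, 0])
```

A SNP with r² near 1 would track the HLA allele almost perfectly, while a value near 0 would make it useless as a surrogate.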
Including previously genotyped controls in a genome-wide association study can provide cost savings, but can also create design biases. When cases and controls are genotyped on different platforms, the imputation needed to provide genome-wide coverage will introduce differential measurement error and may lead to false positives. We compared genotype frequencies of two healthy control groups from the Nurses’ Health Study genotyped on different platforms (Affymetrix 6.0 [n=1,672] and Illumina HumanHap550 [n=1,038]). Using standard imputation quality filters, we observed 9,841 SNPs out of 2,347,809 (0.4%) significant at the 5 × 10−8 level. We explored three methods for controlling for this Type I error inflation. One method was to remove platform effects using principal components; another was to restrict to SNPs of highest quality imputation; and a third was to genotype some controls alongside cases to exclude SNPs that are statistical artifacts. The first method could not reduce the Type I error rate; the other two could dramatically reduce the error rate, although both required that a portion of SNPs be excluded from analysis. Ideally, the biases we describe would be eliminated at the design stage, by genotyping sufficient numbers of cases and controls on each platform. Researchers using imputation to combine samples genotyped on different platforms with severely unbalanced case-control ratios should be aware of the potential for inflated Type I error rates and apply appropriate quality filters. Every SNP found with genome-wide significance should be validated on another platform to verify that its significance is not an artifact of study design.
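To put the reported inflation in perspective, the sketch below compares the observed number of significant SNPs with the number expected by chance at the 5 × 10−8 threshold, and shows one simple way to compare allele counts between two control groups. The abstract does not state which test was used; the 1-df Pearson chi-square on allele counts is an assumption, and the allele counts in the example are hypothetical.

```python
import math

GENOME_WIDE_ALPHA = 5e-8


def allele_chi2_pvalue(a1, a2, b1, b2):
    """Pearson chi-square p-value (1 df) for a 2x2 table of allele counts.

    a1, a2: counts of allele 1 and allele 2 in group A.
    b1, b2: counts of allele 1 and allele 2 in group B.
    Assumption: a simple allelic test stands in for the study's analysis.
    """
    n = a1 + a2 + b1 + b2
    row_a, row_b = a1 + a2, b1 + b2
    col_1, col_2 = a1 + b1, a2 + b2
    if 0 in (row_a, row_b, col_1, col_2):
        return 1.0  # degenerate table: no evidence of a difference
    chi2 = n * (a1 * b2 - a2 * b1) ** 2 / (row_a * row_b * col_1 * col_2)
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


# Figures reported in the abstract.
n_snps = 2_347_809
n_significant = 9_841
observed_fraction = n_significant / n_snps          # about 0.4%
expected_by_chance = n_snps * GENOME_WIDE_ALPHA     # well under one SNP

# Hypothetical allele counts for groups of 1,672 and 1,038 people
# (3,344 and 2,076 alleles), with allele frequencies near 0.30 and 0.40.
p_example = allele_chi2_pvalue(1003, 2341, 830, 1246)
```

With no platform effect, fewer than one SNP would be expected to reach the threshold by chance, which is why 9,841 hits between two healthy control groups points to an artifact.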
Genome-wide association study; Imputation; GWAS quality control
Many candidate genes have been studied for asthma, but replication has varied. Novel candidate genes have been identified for various complex diseases using genome-wide association studies (GWASs). We conducted a GWAS in 492 Mexican children with asthma, predominantly atopic by skin prick test, and their parents using the Illumina HumanHap 550 K BeadChip to identify novel genetic variation for childhood asthma. The 520,767 autosomal single nucleotide polymorphisms (SNPs) passing quality control were tested for association with childhood asthma using log-linear regression with a log-additive risk model. Eleven of the most significantly associated GWAS SNPs were tested for replication in an independent study of 177 Mexican case–parent trios with childhood-onset asthma and atopy using log-linear analysis. The chromosome 9q21.31 SNP rs2378383 (p = 7.10×10−6 in the GWAS), located upstream of transducin-like enhancer of split 4 (TLE4), gave a p-value of 0.03 and the same direction and magnitude of association in the replication study (combined p = 6.79×10−7). Ancestry analysis on chromosome 9q supported an inverse association between the rs2378383 minor allele (G) and childhood asthma. This work identifies chromosome 9q21.31 as a novel susceptibility locus for childhood asthma in Mexicans. Further, analysis of genome-wide expression data in 51 human tissues from the Novartis Research Foundation showed that median GWAS significance levels for SNPs in genes expressed in the lung differed most significantly from genes not expressed in the lung when compared to 50 other tissues, supporting the biological plausibility of our overall GWAS findings and the multigenic etiology of childhood asthma.
Asthma is a leading chronic childhood disease with a presumed strong genetic component, but no genes have been definitely shown to influence asthma development. Few genetic studies of asthma have included Hispanic populations. Here, we conducted a genome-wide association study of asthma in 492 Mexican children with asthma, predominantly atopic by skin prick test, and their parents to identify novel genetic variation for childhood asthma. We implicated several polymorphisms in or near TLE4 on chromosome 9q21.31 (a novel candidate region for childhood asthma) and replicated one polymorphism in an independent study of childhood-onset asthmatics with atopy and their parents of Mexican ethnicity. Hispanics have differing proportions of Native American, European, and African ancestries, and we found less Native American ancestry than expected at chromosome 9q21.31. This suggests that chromosome 9q21.31 may underlie ethnic differences in childhood asthma and that future replication would be most effective in populations with Native American ancestry. Analysis of publicly available genome-wide expression data revealed that association signals in genes expressed in the lung differed most significantly from genes not expressed in the lung when compared to 50 other tissues, supporting the biological plausibility of the overall GWAS findings and the multigenic etiology of asthma.