1.  A genome wide association study of plasma uric acid levels in obese cases and never-overweight controls 
Obesity (Silver Spring, Md.)  2013;21(9):E490-E494.
Objective
To identify plasma uric acid related genes in extremely obese and normal weight individuals using genome wide association studies (GWAS).
Design and Methods
Using genotypes from a GWAS focusing on obesity and thinness, we performed quantitative trait association analyses (PLINK) for plasma uric acid levels in 1,060 extremely obese individuals [body mass index (BMI) >35 kg/m2] and normal-weight controls (BMI <25 kg/m2). Of the 961 samples with uric acid data, 924 were from females.
Results
Significant associations were found between SLC2A9 gene SNPs and plasma uric acid levels (rs6449213, P=3.15×10−12). DIP2C gene SNP rs877282 also reached genome wide significance (P=4.56×10−8). Weaker associations (P<1×10−5) were found in the F5, PXDNL, FRAS1, LCORL, and MICAL2 genes. Besides SLC2A9, 3 previously identified uric acid related genes, ABCG2 (rs2622605, P=0.0026), SLC17A1 (rs3799344, P=0.0017), and RREB1 (rs1615495, P=0.00055), received marginal support in our study.
Conclusions
Two genes/chromosome regions reached genome wide association significance (P<1×10−7, 550K SNPs) in our GWAS: SLC2A9, the chromosome 2 60.1 Mb region (rs6723995), and the DIP2C gene region. Five other genes (F5, PXDNL, FRAS1, LCORL, and MICAL2) yielded P<1×10−5. Four previously reported associations were replicated in our study, including SLC2A9, ABCG2, RREB1, and SLC17A1.
doi:10.1002/oby.20303
PMCID: PMC3762924  PMID: 23703922
uric acid; genome wide association study; obesity
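The genome wide significance cutoff quoted in entry 1 (P<1×10−7 for roughly 550K SNPs) is close to what a Bonferroni correction would give. The following is a minimal sketch of that arithmetic, assuming a family-wise error rate of 0.05 and exactly 550,000 independent tests. Both values are assumptions; the abstract states only the cutoff and the approximate SNP count, and the function name is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumed values: alpha = 0.05 and 550,000 SNPs (the abstract says "550K SNPs").
threshold = bonferroni_threshold(0.05, 550_000)
print(f"{threshold:.2e}")  # about 9.09e-08, close to the quoted 1e-07 cutoff

# P-values reported in the abstract for the two named SNPs.
reported = {"rs6449213": 3.15e-12, "rs877282": 4.56e-8}
passing = {snp: p < threshold for snp, p in reported.items()}
print(passing)
```

Under these assumptions both reported SNPs fall below the computed threshold, which agrees with the abstract describing them as genome wide significant.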
2.  The missense variation landscape of FTO, MC4R and TMEM18 in obese children of African ancestry 
Obesity (Silver Spring, Md.)  2013;21(1):159-163.
Common variation at the loci harboring FTO, MC4R and TMEM18 is consistently reported as being statistically the most strongly associated with obesity. We investigated if these loci also harbor rarer missense variants that confer substantially higher risk of common childhood obesity in African American (AA) children. We sequenced the exons of FTO, MC4R and TMEM18 in an initial subset of our cohort, i.e., 200 obese (BMI≥95th percentile) and 200 lean AA children (BMI≤5th percentile). Any missense exonic variants that were uncovered were then genotyped in a further 768 obese and 768 lean (BMI≤50th percentile) children of the same ethnicity. A number of exonic variants were observed from our sequencing effort: seven in FTO, of which four were non-synonymous (A163T, G182A, M400V and A405V), thirteen in MC4R, of which six were non-synonymous (V103I, N123S, S136A, F202L, N240S and I251L) and four in TMEM18, of which two were non-synonymous (P2S and V113L). Follow-up genotyping of these missense variants revealed only one significant difference in allele frequency between cases and controls, namely with N240S in MC4R (Fisher's exact P = 0.0001). In summary, moderately rare missense variants within the FTO, MC4R and TMEM18 genes observed in our study did not confer risk of common childhood obesity in African Americans except for a degree of evidence for one known loss-of-function variant in MC4R.
doi:10.1002/oby.20147
PMCID: PMC3605748  PMID: 23505181
Obesity; Pediatrics; Genomics
3.  Integrative genomics identifies LMO1 as a neuroblastoma oncogene 
Nature  2010;469(7329):216-220.
Neuroblastoma is a childhood cancer of the sympathetic nervous system that accounts for approximately 10% of all paediatric oncology deaths1,2. To identify genetic risk factors for neuroblastoma, we performed a genome-wide association study (GWAS) on 2,251 patients and 6,097 control subjects of European ancestry from four case series. Here we report a significant association within LIM domain only 1 (LMO1) at 11p15.4 (rs110419, combined P = 5.2 × 10−16, odds ratio of risk allele = 1.34 (95% confidence interval 1.25–1.44)). The signal was enriched in the subset of patients with the most aggressive form of the disease. LMO1 encodes a cysteine-rich transcriptional regulator, and its paralogues (LMO2, LMO3 and LMO4) have each been previously implicated in cancer. In parallel, we analysed genome-wide DNA copy number alterations in 701 primary tumours. We found that the LMO1 locus was aberrant in 12.4% through a duplication event, and that this event was associated with more advanced disease (P < 0.0001) and survival (P = 0.041). The germline single nucleotide polymorphism (SNP) risk alleles and somatic copy number gains were associated with increased LMO1 expression in neuroblastoma cell lines and primary tumours, consistent with a gain-of-function role in tumorigenesis. Short hairpin RNA (shRNA)-mediated depletion of LMO1 inhibited growth of neuroblastoma cells with high LMO1 expression, whereas forced expression of LMO1 in neuroblastoma cells with low LMO1 expression enhanced proliferation. These data show that common polymorphisms at the LMO1 locus are strongly associated with susceptibility to developing neuroblastoma, but also may influence the likelihood of further somatic alterations at this locus, leading to malignant progression.
doi:10.1038/nature09609
PMCID: PMC3320515  PMID: 21124317
4.  Correction: A Genome-Wide Association Study on Obesity and Obesity-Related Traits 
PLoS ONE  2012;7(2):10.1371/annotation/a34ee94e-3e6a-48bd-a19e-398a4bb88580.
doi:10.1371/annotation/a34ee94e-3e6a-48bd-a19e-398a4bb88580
PMCID: PMC3293772
5.  Large Copy-Number Variations Are Enriched in Cases With Moderate to Extreme Obesity 
Diabetes  2010;59(10):2690-2694.
OBJECTIVE
Obesity is an increasingly common disorder that predisposes to several medical conditions, including type 2 diabetes. We investigated whether large and rare copy-number variations (CNVs) differentiate moderate to extreme obesity from never-overweight control subjects.
RESEARCH DESIGN AND METHODS
Using single nucleotide polymorphism (SNP) arrays, we performed a genome-wide CNV survey on 430 obese case subjects (BMI >35 kg/m2) and 379 never-overweight control subjects (BMI <25 kg/m2). All subjects were of European ancestry and were genotyped on the Illumina HumanHap550 arrays with ∼550,000 SNP markers. The CNV calls were generated by PennCNV software.
RESULTS
CNVs >1 Mb were found to be overrepresented in case versus control subjects (odds ratio [OR] = 1.5 [95% CI 0.5–5]), and CNVs >2 Mb were present in 1.3% of the case subjects but were absent in control subjects (OR = infinity [95% CI 1.2–infinity]). When focusing on rare deletions that disrupt genes, even more pronounced effect sizes are observed (OR = 2.7 [95% CI 0.5–27.1] for CNVs >1 Mb). Interestingly, obese case subjects who carry these large CNVs have moderately high BMI and do not appear to be extreme cases. Several CNVs disrupt known candidate genes for obesity, such as a 3.3-Mb deletion disrupting NAP1L5 and a 2.1-Mb deletion disrupting UCP1 and IL15.
CONCLUSIONS
Our results suggest that large CNVs, especially rare deletions, confer risk of obesity in patients with moderate obesity and that genes impacted by large CNVs represent intriguing candidates for obesity that warrant further study.
doi:10.2337/db10-0192
PMCID: PMC3279563  PMID: 20622171
6.  Pathway-Wide Association Study Implicates Multiple Sterol Transport and Metabolism Genes in HDL Cholesterol Regulation 
Pathway-based association methods have been proposed to be an effective approach in identifying disease genes, when single-marker association tests do not have sufficient power. The analysis of quantitative traits may benefit from these approaches, by sampling from two extreme tails of the distribution. Here we tested a pathway association approach on a small genome-wide association study (GWAS) on 653 subjects with extremely high high-density lipoprotein cholesterol (HDL-C) levels and 784 subjects with low HDL-C levels. We identified 102 genes in the sterol transport and metabolism pathways that collectively associate with HDL-C levels, and replicated these association signals in an independent GWAS. Interestingly, the pathways include 18 genes implicated in previous GWAS on lipid traits, suggesting that genuine HDL-C genes are highly enriched in these pathways. Additionally, multiple biologically relevant loci in the pathways were not detected by previous GWAS, including genes implicated in previous candidate gene association studies (such as LEPR, APOA2, HDLBP, SOAT2), genes that cause Mendelian forms of lipid disorders (such as DHCR24), and genes expressing dyslipidemia phenotypes in knockout mice (such as SOAT1, PON1). Our study suggests that sampling from two extreme tails of a quantitative trait and examining genetic pathways may yield biological insights from smaller samples than are generally required using single-marker analysis in large-scale GWAS. Our results also imply that functionally related genes work together to regulate complex quantitative traits, and that future large-scale studies may benefit from pathway-association approaches to identify novel pathways regulating HDL-C levels.
doi:10.3389/fgene.2011.00041
PMCID: PMC3268595  PMID: 22303337
GWAS; lipid; HDL-C; pathway analysis; cholesterol; sterol transport; sterol metabolism; genetic association
7.  A Genome-Wide Association Study on Obesity and Obesity-Related Traits 
PLoS ONE  2011;6(4):e18939.
Large-scale genome-wide association studies (GWAS) have identified many loci associated with body mass index (BMI), but few studies focused on obesity as a binary trait. Here we report the results of a GWAS and candidate SNP genotyping study of obesity, including extremely obese cases and never overweight controls as well as families segregating extreme obesity and thinness. We first performed a GWAS on 520 cases (BMI>35 kg/m2) and 540 control subjects (BMI<25 kg/m2), on measures of obesity and obesity-related traits. We subsequently followed up obesity-associated signals by genotyping the top ∼500 SNPs from GWAS in the combined sample of cases, controls and family members totaling 2,256 individuals. For the binary trait of obesity, we found 16 genome-wide significant signals within the FTO gene (strongest signal at rs17817449, P = 2.5×10−12). We next examined obesity-related quantitative traits (such as total body weight, waist circumference and waist to hip ratio), and detected genome-wide significant signals between waist to hip ratio and NRXN3 (rs11624704, P = 2.67×10−9), previously associated with body weight and fat distribution. Our study demonstrated how a relatively small sample ascertained through extreme phenotypes can detect genuine associations in a GWAS.
doi:10.1371/journal.pone.0018939
PMCID: PMC3084240  PMID: 21552555
8.  Examination of All Type 2 Diabetes GWAS Loci Reveals HHEX-IDE as a Locus Influencing Pediatric BMI 
Diabetes  2009;59(3):751-755.
OBJECTIVE
A number of studies have found that BMI in early life influences the risk of developing type 2 diabetes later in life. Our goal was to investigate if any type 2 diabetes variants uncovered through genome-wide association studies (GWAS) impact BMI in childhood.
RESEARCH DESIGN AND METHODS
Using data from an ongoing GWAS of pediatric BMI in our cohort, we investigated the association of pediatric BMI with 20 single nucleotide polymorphisms at 18 type 2 diabetes loci uncovered through GWAS, consisting of ADAMTS9, CDC123-CAMK1D, CDKAL1, CDKN2A/B, EXT2, FTO, HHEX-IDE, IGF2BP2, the intragenic region on 11p12, JAZF1, KCNQ1, LOC387761, MTNR1B, NOTCH2, SLC30A8, TCF7L2, THADA, and TSPAN8-LGR5. We randomly partitioned our cohort exactly in half in order to have a discovery cohort (n = 3,592) and a replication cohort (n = 3,592).
RESULTS
Our data show that the major type 2 diabetes risk–conferring G allele of rs7923837 at the HHEX-IDE locus was associated with higher pediatric BMI in both the discovery (P = 0.0013 and survived correction for 20 tests) and replication (P = 0.023) sets (combined P = 1.01 × 10−4). Association was not detected with any other known type 2 diabetes loci uncovered to date through GWAS except for the well-established FTO.
CONCLUSIONS
Our data show that the same genetic HHEX-IDE variant, which is associated with type 2 diabetes from previous studies, also influences pediatric BMI.
doi:10.2337/db09-0972
PMCID: PMC2828649  PMID: 19933996
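The discovery result in entry 8 is described as surviving correction for 20 tests. The following is a minimal sketch of that arithmetic, assuming the correction was a plain Bonferroni adjustment judged at a 0.05 level. The abstract does not name the method or the level, so both are assumptions, and the function name is hypothetical.

```python
def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted P-value, capped at 1.0."""
    return min(1.0, p_value * n_tests)


# Discovery-set P-value and number of tests as stated in the abstract.
adjusted = bonferroni_adjust(0.0013, 20)
print(round(adjusted, 4))  # 0.026
print(adjusted < 0.05)     # True under the assumed 0.05 level
```

With these inputs the adjusted value stays below 0.05, which is consistent with the abstract's statement that the discovery association survived correction.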
9.  Association Between a High-Risk Autism Locus on 5p14 and Social Communication Spectrum Phenotypes in the General Population 
The American journal of psychiatry  2010;167(11):1364-1372.
Objective
Recent genome-wide analysis identified a genetic variant on 5p14.1 (rs4307059), which is associated with risk for autism spectrum disorder. This study investigated whether rs4307059 also operates as a quantitative trait locus underlying a broader autism phenotype in the general population, focusing specifically on the social communication aspect of the spectrum.
Method
Study participants were 7,313 children from the Avon Longitudinal Study of Parents and Children. Single-trait and joint-trait genotype associations were investigated for 29 measures related to language and communication, verbal intelligence, social interaction, and behavioral adjustment, assessed between ages 3 and 12 years. Analyses were performed in one-sided or directed mode and adjusted for multiple testing, trait interrelatedness, and random genotype dropout.
Results
Single phenotype analyses showed that an increased load of rs4307059 risk allele is associated with stereotyped conversation and lower pragmatic communication skills, as measured by the Children's Communication Checklist (at a mean age of 9.7 years). In addition, a trend toward a higher frequency of identification of special educational needs (at a mean age of 11.8 years) was observed. Variation at rs4307059 was also associated with the phenotypic profile of studied traits. This joint signal was fully explained neither by single-trait associations nor by overall behavioral adjustment problems but suggested a combined effect, which manifested through multiple subthreshold social, communicative, and cognitive impairments.
Conclusions
Our results suggest that common variation at 5p14.1 is associated with social communication spectrum phenotypes in the general population and support the role of rs4307059 as a quantitative trait locus for autism spectrum disorder.
doi:10.1176/appi.ajp.2010.09121789
PMCID: PMC3008767  PMID: 20634369
10.  Examination of Type 2 Diabetes Loci Implicates CDKAL1 as a Birth Weight Gene 
Diabetes  2009;58(10):2414-2418.
OBJECTIVE
A number of studies have found that reduced birth weight is associated with type 2 diabetes later in life; however, the underlying mechanism for this correlation remains unresolved. Recently, association has been demonstrated between low birth weight and single nucleotide polymorphisms (SNPs) at the CDKAL1 and HHEX-IDE loci, regions that were previously implicated in the pathogenesis of type 2 diabetes. In order to investigate whether type 2 diabetes risk–conferring alleles associate with low birth weight in our Caucasian childhood cohort, we examined the effects of 20 such loci on this trait.
RESEARCH DESIGN AND METHODS
Using data from an ongoing genome-wide association study in our cohort of 5,465 Caucasian children with recorded birth weights, we investigated the association of the previously reported type 2 diabetes–associated variation at 20 loci including TCF7L2, HHEX-IDE, PPARG, KCNJ11, SLC30A8, IGF2BP2, CDKAL1, CDKN2A/2B, and JAZF1 with birth weight.
RESULTS
Our data show that the minor allele of rs7756992 (P = 8 × 10−5) at the CDKAL1 locus is strongly associated with lower birth weight, whereas a perfect surrogate for variation previously implicated for the trait at the same locus only yielded nominally significant association (P = 0.01; r2 rs7756992 = 0.677). However, association was not detected with any of the other type 2 diabetes loci studied.
CONCLUSIONS
We observe association between lower birth weight and type 2 diabetes risk–conferring alleles at the CDKAL1 locus. Our data show that the same genetic locus that has been identified as a marker for type 2 diabetes in previous studies also influences birth weight.
doi:10.2337/db09-0506
PMCID: PMC2750235  PMID: 19592620
11.  The role of obesity-associated loci identified in genome wide association studies in the determination of pediatric BMI 
Obesity (Silver Spring, Md.)  2009;17(12):2254-2257.
The prevalence of obesity in children and adults in the United States has increased dramatically over the past decade. Besides environmental factors, genetic factors are known to play an important role in the pathogenesis of obesity. A number of genetic determinants of adult BMI have already been established through genome wide association studies. In this study, we examined 25 single nucleotide polymorphisms (SNPs) corresponding to thirteen previously reported genomic loci in 6,078 children with measures of BMI. Fifteen of these SNPs yielded at least nominally significant association to BMI, representing nine different loci including INSIG2, FTO, MC4R, TMEM18, GNPDA2, NEGR1, BDNF, KCTD15 and 1q25. Other loci revealed no evidence for association, namely at MTCH2, SH2B1, 12q13 and 3q27. For the 15 associated variants, the genotype score explained 1.12% of the total variation for BMI z-score. We conclude that among thirteen loci that have been reported to associate with adult BMI, at least nine also contribute to the determination of BMI in childhood as demonstrated by their associations in our pediatric cohort.
doi:10.1038/oby.2009.159
PMCID: PMC2860782  PMID: 19478790
12.  Investigation of the locus near MC4R with childhood obesity in Americans of European and African ancestry 
Obesity (Silver Spring, Md.)  2009;17(7):1461-1465.
Recently, a modest but consistently replicated association was demonstrated between obesity and the single nucleotide polymorphism (SNP) rs17782313, 3’ of the MC4R locus, as a consequence of a meta-analysis of genome wide association (GWA) studies of the disease in Caucasian populations. We investigated the association in the context of the childhood form of the disease utilizing data from our ongoing GWA study in a cohort of 728 European American (EA) obese children (BMI ≥ 95th percentile) and 3,960 EA controls (BMI < 95th percentile), as well as 1,008 African American (AA) obese children and 2,715 AA controls. rs571312, rs10871777 and rs476828 (perfect surrogates for rs17782313) yielded odds ratios in the EA cohort of 1.142 (P = 0.045), 1.137 (P = 0.054) and 1.145 (P = 0.042); however, there was no significant association with these SNPs in the AA cohort. When investigating all thirty SNPs present on the Illumina BeadChip at this locus, again there was no evidence for association in AA cases when correcting for the number of tests employed. As such, variants 3’ to the MC4R locus present on the genotyping platform utilized confer a similar magnitude of risk of obesity in Caucasian children as in their adult Caucasian counterparts, but this observation did not extend to African Americans.
doi:10.1038/oby.2009.53
PMCID: PMC2860794  PMID: 19265794
13.  ATOM: a powerful gene-based association test by combining optimally weighted markers 
Bioinformatics  2008;25(4):497-503.
Background: Large-scale candidate-gene and genome-wide association studies genotype multiple SNPs within or surrounding a gene, including both tag and functional SNPs. The immense amount of data generated in these studies poses new challenges to analysis. One particularly challenging yet important question is how to best use all genetic information to test whether a gene or a region is associated with the trait of interest.
Methods: Here we propose a powerful gene-based Association Test by combining Optimally Weighted Markers (ATOM) within a genomic region. Due to variation in linkage disequilibrium, different markers often associate with the trait of interest at different levels. To appropriately apportion their contributions, we assign a weight to each marker that is proportional to the amount of information it captures about the trait locus. We analytically derive the optimal weights for both quantitative and binary traits, and describe a procedure for estimating the weights from a reference database such as the HapMap. Compared with existing approaches, our method has several distinct advantages, including (i) the ability to borrow information from an external database to increase power, (ii) the theoretical derivation of optimal marker weights and (iii) the scalability to simultaneous analysis of all SNPs in candidate genes and pathways.
Results: Through extensive simulations and analysis of the FTO gene in our ongoing genome-wide association study on childhood obesity, we demonstrate that ATOM increases the power to detect genetic association as compared with several commonly used multi-marker association tests.
Contact: mingyao@mail.med.upenn.edu; chun.li@vanderbilt.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btn641
PMCID: PMC2642636  PMID: 19074959
14.  Modeling genetic inheritance of copy number variations 
Nucleic Acids Research  2008;36(21):e138.
Copy number variations (CNVs) are being used as genetic markers or functional candidates in gene-mapping studies. However, unlike single nucleotide polymorphism or microsatellite genotyping techniques, most CNV detection methods are limited to detecting total copy numbers, rather than copy number in each of the two homologous chromosomes. To address this issue, we developed a statistical framework for intensity-based CNV detection platforms using family data. Our algorithm identifies CNVs for a family simultaneously, thus avoiding the generation of calls with Mendelian inconsistency while maintaining the ability to detect de novo CNVs. Applications to simulated data and real data indicate that our method significantly improves both call rates and accuracy of boundary inference, compared to existing approaches. We further illustrate the use of Mendelian inheritance to infer SNP allele compositions in each of the two homologous chromosomes in CNV regions using real data. Finally, we applied our method to a set of families genotyped using both the Illumina HumanHap550 and Affymetrix genome-wide 5.0 arrays to demonstrate its performance on both inherited and de novo CNVs. In conclusion, our method produces accurate CNV calls, gives probabilistic estimates of CNV transmission and builds a solid foundation for the development of linkage and association tests utilizing CNVs.
doi:10.1093/nar/gkn641
PMCID: PMC2588508  PMID: 18832372
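Entry 14 refers to CNV calls with Mendelian inconsistency in family data. The following is a simplified illustration of what such an inconsistency means when only total copy numbers are known. It is not the statistical framework described in the abstract, which works on intensity data and gives probabilistic estimates. The function name and the example trios are hypothetical.

```python
def is_mendelian_consistent(father_cn: int, mother_cn: int, child_cn: int) -> bool:
    """Check whether a child's total copy number could be inherited.

    A parent with total copy number c carries (a, c - a) copies on the two
    homologous chromosomes for some a in 0..c, and transmits one of them.
    The child's total must equal one transmitted value from each parent.
    """
    for from_father in range(father_cn + 1):
        from_mother = child_cn - from_father
        if 0 <= from_mother <= mother_cn:
            return True
    return False


# Hypothetical trios given as (father, mother, child) total copy numbers.
print(is_mendelian_consistent(2, 2, 2))  # True: one copy from each parent
print(is_mendelian_consistent(0, 2, 3))  # False: father has no copy to transmit
print(is_mendelian_consistent(0, 0, 1))  # False: a de novo gain or a calling error
```

A check on total copy numbers alone is permissive, because many allele-specific splits are compatible with the same total. That limitation is the issue the abstract raises about methods that detect only total copy number.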
15.  A Genome-Wide Meta-Analysis of Six Type 1 Diabetes Cohorts Identifies Multiple Associated Loci 
PLoS Genetics  2011;7(9):e1002293.
Diabetes impacts approximately 200 million people worldwide, of whom approximately 10% are affected by type 1 diabetes (T1D). The application of genome-wide association studies (GWAS) has robustly revealed dozens of genetic contributors to the pathogenesis of T1D, with the most recent meta-analysis identifying in excess of 40 loci. To identify additional genetic loci for T1D susceptibility, we examined associations in the largest meta-analysis to date between the disease and ∼2.54 million SNPs in a combined cohort of 9,934 cases and 16,956 controls. Targeted follow-up of 53 SNPs in 1,120 affected trios uncovered three new loci associated with T1D that reached genome-wide significance. The most significantly associated SNP (rs539514, P = 5.66×10−11) resides in an intronic region of the LMO7 (LIM domain only 7) gene on 13q22. The second most significantly associated SNP (rs478222, P = 3.50×10−9) resides in an intronic region of the EFR3B (protein EFR3 homolog B) gene on 2p23; however, the region of linkage disequilibrium is approximately 800 kb and harbors multiple additional genes, including NCOA1, C2orf79, CENPO, ADCY3, DNAJC27, POMC, and DNMT3A. The third most significantly associated SNP (rs924043, P = 8.06×10−9) lies in an intergenic region on 6q27, where the region of association is approximately 900 kb and harbors multiple genes including WDR27, C6orf120, PHF10, TCTE3, C6orf208, LOC154449, DLL1, FAM120B, PSMB1, TBP, and PCD2. These latest associated regions add to the growing repertoire of gene networks predisposing to T1D.
Author Summary
Despite the fact that there is clearly a large genetic component to type 1 diabetes (T1D), uncovering the genes contributing to this disease has proven challenging. However, in the past three years there has been relatively major progress in this regard, with advances in genetic screening technologies allowing investigators to scan the genome for variants conferring risk for disease without prior hypotheses. Such genome-wide association studies have revealed multiple regions of the genome to be robustly and consistently associated with T1D. More recent findings have been a consequence of combining multiple datasets from independent investigators in meta-analyses, which have more power to pick up additional variants contributing to the trait. In the current study, we describe the largest meta-analysis of T1D genome-wide genotyped datasets to date, which combines six large studies. As a consequence, we have uncovered three new signals residing at the chromosomal locations 13q22, 2p23, and 6q27, which went on to be replicated in independent sample sets. These latest associated regions add to the growing repertoire of gene networks predisposing to T1D.
doi:10.1371/journal.pgen.1002293
PMCID: PMC3183083  PMID: 21980299
16.  Comparative genetic analysis of inflammatory bowel disease and type 1 diabetes implicates multiple loci with opposite effects 
Human Molecular Genetics  2010;19(10):2059-2067.
Inflammatory bowel disease, including Crohn's disease (CD) and ulcerative colitis (UC), and type 1 diabetes (T1D) are autoimmune diseases that may share common susceptibility pathways. We examined known susceptibility loci for these diseases in a cohort of 1689 CD cases, 777 UC cases, 989 T1D cases and 6197 shared control subjects of European ancestry, who were genotyped by the Illumina HumanHap550 SNP arrays. We identified multiple previously unreported or unconfirmed disease associations, including known CD loci (ICOSLG and TNFSF15) and T1D loci (TNFAIP3) that confer UC risk, known UC loci (HERC2 and IL26) that confer T1D risk and known UC loci (IL10 and CCNY) that confer CD risk. Additionally, we show that T1D risk alleles residing at the PTPN22, IL27, IL18RAP and IL10 loci protect against CD. Furthermore, the strongest risk alleles for T1D within the major histocompatibility complex (MHC) confer strong protection against CD and UC; however, given the multi-allelic nature of the MHC haplotypes, sequencing of the MHC locus will be required to interpret this observation. These results extend our current knowledge on genetic variants that predispose to autoimmunity, and suggest that many loci involved in autoimmunity may be under a balancing selection due to antagonistic pleiotropic effect. Our analysis implies that variants with opposite effects on different diseases may facilitate the maintenance of common susceptibility alleles in human populations, making autoimmune diseases especially amenable to genetic dissection by genome-wide association studies.
doi:10.1093/hmg/ddq078
PMCID: PMC2860894  PMID: 20176734
17.  Duplication of the SLIT3 Locus on 5q35.1 Predisposes to Major Depressive Disorder 
PLoS ONE  2010;5(12):e15463.
Major depressive disorder (MDD) is a common psychiatric and behavioral disorder. To discover novel variants conferring risk to MDD, we conducted a whole-genome scan of copy number variation (CNV), including 1,693 MDD cases and 4,506 controls genotyped on the Perlegen 600K platform. The most significant locus was observed on 5q35.1, harboring the SLIT3 gene (P = 2×10−3). Extending the controls with 30,000 subjects typed on the Illumina 550 k array, we found the CNV to remain exclusive to MDD cases (P = 3.2×10−9). Duplication was observed in 5 unrelated MDD cases encompassing 646 kb with highly similar breakpoints. SLIT3 is integral to repulsive axon guidance based on binding to Roundabout receptors. Duplication of 5q35.1 is a highly penetrant variation accounting for 0.7% of the subset of 647 cases harboring large CNVs, using a threshold of a minimum of 10 SNPs and 100 kb. This study leverages a large dataset of MDD cases and controls for the analysis of CNVs with matched platform and ethnicity. SLIT3 duplication is a novel association which explains a definitive proportion of the largely unknown etiology of MDD.
doi:10.1371/journal.pone.0015463
PMCID: PMC2995745  PMID: 21152026
18.  From Disease Association to Risk Assessment: An Optimistic View from Genome-Wide Association Studies on Type 1 Diabetes 
PLoS Genetics  2009;5(10):e1000678.
Genome-wide association studies (GWAS) have been fruitful in identifying disease susceptibility loci for common and complex diseases. A remaining question is whether we can quantify individual disease risk based on genotype data, in order to facilitate personalized prevention and treatment for complex diseases. Previous studies have typically failed to achieve satisfactory performance, primarily due to the use of only a limited number of confirmed susceptibility loci. Here we propose that sophisticated machine-learning approaches with a large ensemble of markers may improve the performance of disease risk assessment. We applied a Support Vector Machine (SVM) algorithm on a GWAS dataset generated on the Affymetrix genotyping platform for type 1 diabetes (T1D) and optimized a risk assessment model with hundreds of markers. We subsequently tested this model on an independent Illumina-genotyped dataset with imputed genotypes (1,008 cases and 1,000 controls), as well as a separate Affymetrix-genotyped dataset (1,529 cases and 1,458 controls), resulting in area under ROC curve (AUC) of ∼0.84 in both datasets. In contrast, poor performance was achieved when limited to dozens of known susceptibility loci in the SVM model or logistic regression model. Our study suggests that improved disease risk assessment can be achieved by using algorithms that take into account interactions between a large ensemble of markers. We are optimistic that genotype-based disease risk assessment may be feasible for diseases where a notable proportion of the risk has already been captured by SNP arrays.
Author Summary
An often touted utility of genome-wide association studies (GWAS) is that the resulting discoveries can facilitate implementation of personalized medicine, in which preventive and therapeutic interventions for complex diseases can be tailored to individual genetic profiles. However, recent studies using whole-genome SNP genotype data for disease risk assessment have generally failed to achieve satisfactory results, leading to a pessimistic view of the utility of genotype data for such purposes. Here we propose that sophisticated machine-learning approaches on a large ensemble of markers, which contain both confirmed and as yet unconfirmed disease susceptibility variants, may improve the performance of disease risk assessment. We tested an algorithm called Support Vector Machine (SVM) on three large-scale datasets for type 1 diabetes and demonstrated that risk assessment can be highly accurate for the disease. Our results suggest that individualized disease risk assessment using whole-genome data may be more successful for some diseases (such as T1D) than other diseases. However, the predictive accuracy will depend on the heritability of the disease under study, the proportion of the genetic risk that is known, and whether the right set of markers and the right algorithms are used.
doi:10.1371/journal.pgen.1000678
PMCID: PMC2748686  PMID: 19816555
19.  Copy number variation at 1q21.1 associated with neuroblastoma 
Nature  2009;459(7249):987-991.
Common copy number variations (CNVs) represent a significant source of genetic diversity, yet their influence on phenotypic variability, including disease susceptibility, remains poorly understood. To address this problem in cancer, we performed a genome-wide association study (GWAS) of CNVs in the childhood cancer neuroblastoma, a disease where SNP variations are known to influence susceptibility1,2. We first genotyped 846 Caucasian neuroblastoma patients and 803 healthy Caucasian controls at 550,000 single nucleotide polymorphisms, and performed a CNV-based test for association. We then replicated significant observations in two independent sample sets comprising a total of 595 cases and 3,357 controls. We identified a common CNV at 1q21.1 associated with neuroblastoma in the discovery set, which was confirmed in both replication sets (Pcombined = 2.97 × 10−17; OR = 2.49, 95% CI: 2.02 to 3.05). This CNV was validated by quantitative PCR, fluorescent in situ hybridization, and analysis of matched tumor specimens, and was shown to be heritable in an independent set of 713 cancer-free trios. We identified a novel transcript within the CNV which showed high sequence similarity to several “Neuroblastoma breakpoint family” (NBPF) genes3,4 and represents a new member of this gene family (NBPFX). This transcript was preferentially expressed in fetal brain and fetal sympathetic nervous tissues, and expression level was strictly correlated with CNV state in neuroblastoma cells. These data demonstrate that inherited copy number variation at 1q21.1 is associated with neuroblastoma and implicate a novel NBPF gene in early tumorigenesis of this childhood cancer.
doi:10.1038/nature08035
PMCID: PMC2755253  PMID: 19536264
20.  Genome-Wide Analyses of Exonic Copy Number Variants in a Family-Based Study Point to Novel Autism Susceptibility Genes 
PLoS Genetics  2009;5(6):e1000536.
The genetics underlying the autism spectrum disorders (ASDs) is complex and remains poorly understood. Previous work has demonstrated an important role for structural variation in a subset of cases, but has lacked the resolution necessary to move beyond detection of large regions of potential interest to identification of individual genes. To pinpoint genes likely to contribute to ASD etiology, we performed high density genotyping in 912 multiplex families from the Autism Genetics Resource Exchange (AGRE) collection and contrasted results to those obtained for 1,488 healthy controls. Through prioritization of exonic deletions (eDels), exonic duplications (eDups), and whole gene duplication events (gDups), we identified more than 150 loci harboring rare variants in multiple unrelated probands, but no controls. Importantly, 27 of these were confirmed on examination of an independent replication cohort comprising 859 cases and an additional 1,051 controls. Rare variants at known loci, including exonic deletions at NRXN1 and whole gene duplications encompassing UBE3A and several other genes in the 15q11–q13 region, were observed in the course of these analyses. Strong support was likewise observed for previously unreported genes such as BZRAP1, an adaptor molecule known to regulate synaptic transmission, with eDels or eDups observed in twelve unrelated cases but no controls (p = 2.3×10−5). Less is known about MDGA2, likewise observed to be case-specific (p = 1.3×10−4). It is notable, however, that the encoded protein shows an unexpectedly high similarity to Contactin 4 (BLAST E-value = 3×10−39), which has also been linked to disease. That hundreds of distinct rare variants were each seen only once further highlights complexity in the ASDs and points to the continued need for larger cohorts.
Author Summary
Autism spectrum disorders (ASDs) are common neurodevelopmental syndromes with a strong genetic component. ASDs are characterized by disturbances in social behavior, impaired verbal and nonverbal communication, as well as repetitive behaviors and/or a restricted range of interests. To identify genes likely to contribute to ASD etiology, we performed high density genotyping in 912 multiplex families from the Autism Genetics Resource Exchange (AGRE) collection and contrasted results to those obtained for 1,488 healthy controls. To enrich for variants most likely to interfere with gene function, we restricted our analyses to deletions and gains encompassing exons. Of the many genomic regions highlighted, 27 were seen to harbor rare variants in cases and not controls, both in the first phase of our analysis and in an independent replication cohort comprising 859 cases and 1,051 controls. More work in a larger number of individuals will be required to determine which of the rare alleles highlighted here are indeed related to the ASDs and how they act to shape risk.
doi:10.1371/journal.pgen.1000536
PMCID: PMC2695001  PMID: 19557195
