author:("Hou, cupping")
1.  Genome-Wide Association Study of Serum Minerals Levels in Children of Different Ethnic Background 
PLoS ONE  2015;10(4):e0123499.
Calcium, magnesium, potassium, sodium, chloride and phosphorus are the major dietary minerals involved in various biological functions and are commonly measured in the blood serum. Sufficient mineral intake is especially important for children due to their rapid growth. Currently, the genetic mechanisms influencing serum mineral levels are poorly understood, especially for children. We carried out a genome-wide association (GWA) study on 5,602 European-American children and 4,706 African-American children who had mineral measures available in their electronic medical records (EMR). While no locus met the criteria for genome-wide significant association, our results demonstrated a nominal association of total serum calcium levels with a missense variant in the calcium-sensing receptor (CASR) gene on 3q13 (rs1801725, P = 1.96 × 10−3) in the African-American pediatric cohort, a locus previously reported in Caucasians. We also confirmed the association result in our pediatric European-American cohort (P = 1.38 × 10−4). We further replicated two other loci associated with serum calcium levels in the European-American cohort (rs780094, GCKR, P = 4.26 × 10−3; rs10491003, GATA3, P = 0.02). In addition, we replicated a previously reported locus on 1q21, demonstrating association of serum magnesium levels with MUC1 (rs4072037, P = 2.04 × 10−6). Moreover, in an extended gene-based association analysis we uncovered evidence for association of calcium levels with the previously reported gene locus DGKD in both European-American children and African-American children. Taken together, our results support a role for CASR- and DGKD-mediated calcium regulation in both African-American and European-American children, and corroborate the association of calcium levels with GCKR and GATA3, and the association of magnesium levels with MUC1 in the European-American children.
PMCID: PMC4401557  PMID: 25886283
2.  Genome-wide copy number variation study associates metabotropic glutamate receptor gene networks with attention deficit hyperactivity disorder 
Nature Genetics  2011;44(1):78-84.
Attention deficit hyperactivity disorder (ADHD) is a common, heritable neuropsychiatric disorder of unknown etiology. We performed a whole-genome copy number variation (CNV) study on 1,013 cases with ADHD and 4,105 healthy children of European ancestry using 550,000 SNPs. We evaluated statistically significant findings in multiple independent cohorts, with a total of 2,493 cases with ADHD and 9,222 controls of European ancestry, using matched platforms. CNVs affecting metabotropic glutamate receptor genes were enriched across all cohorts (P = 2.1 × 10−9). We saw GRM5 (encoding glutamate receptor, metabotropic 5) deletions in ten cases and one control (P = 1.36 × 10−6). We saw GRM7 deletions in six cases, and we saw GRM8 deletions in eight cases and no controls. GRM1 was duplicated in eight cases. We experimentally validated the observed variants using quantitative RT-PCR. A gene network analysis showed that genes interacting with the genes in the GRM family are enriched for CNVs in ~10% of the cases (P = 4.38 × 10−10) after correction for occurrence in the controls. We identified rare recurrent CNVs affecting glutamatergic neurotransmission genes that were overrepresented in multiple ADHD cohorts.
PMCID: PMC4310555  PMID: 22138692
4.  GWAS of blood cell traits identifies novel associated loci and epistatic interactions in Caucasian and African-American children 
Human Molecular Genetics  2012;22(7):1457-1464.
Hematological traits are important clinical indicators, the genetic determinants of which have not been fully investigated. Common measures of hematological traits include red blood cell (RBC) count, hemoglobin concentration (HGB), hematocrit (HCT), mean corpuscular hemoglobin (MCH), MCH concentration (MCHC), mean corpuscular volume (MCV), platelet count (PLT) and white blood cell (WBC) count. We carried out a genome-wide association study of the eight common hematological traits among 7943 African-American children and 6234 Caucasian children. In African Americans, we report five novel associations of HBE1 variants with HCT and MCHC, the alpha-globin gene cluster variants with RBC and MCHC, and a variant at the ARHGEF3 locus with PLT, as well as replication of four previously reported loci at genome-wide significance. In Caucasians, we report a novel association of variants at the COPZ1 locus with PLT as well as replication of four previously reported loci at genome-wide significance. Extended analysis of an association observed between MCH and the alpha-globin gene cluster variants demonstrated independent effects and epistatic interaction at the locus, impacting the risk of iron deficiency anemia in African Americans with specific genotype states. In summary, we extend the understanding of genetic variants underlying hematological traits based on analyses in African-American children.
PMCID: PMC3657475  PMID: 23263863
5.  AGC1 Deficiency Causes Infantile Epilepsy, Abnormal Myelination, and Reduced N-Acetylaspartate 
JIMD Reports  2014;14:77-85.
Background: Whole exome sequencing (WES) offers a powerful diagnostic tool to rapidly and efficiently sequence all coding genes in individuals presenting for consideration of phenotypically and genetically heterogeneous disorders such as suspected mitochondrial disease. Here, we report results of WES and functional validation in a consanguineous Indian kindred where two siblings presented with profound developmental delay, congenital hypotonia, refractory epilepsy, abnormal myelination, fluctuating basal ganglia changes, cerebral atrophy, and reduced N-acetylaspartate (NAA).
Methods: Whole blood DNA from one affected and one unaffected sibling was captured by Agilent SureSelect Human All Exon kit and sequenced on the Illumina HiSeq2000. Mutations were validated by Sanger sequencing in all family members. Protein from wild-type and mutant fibroblasts was isolated to assess mutation effects on protein expression and enzyme activity.
Results: A novel SLC25A12 homozygous missense mutation, c.1058G>A; p.Arg353Gln, segregated with disease in this kindred. SLC25A12 encodes the neuronal aspartate-glutamate carrier 1 (AGC1) protein, an essential component of the neuronal malate/aspartate shuttle that transfers NADH and H+ reducing equivalents from the cytosol to mitochondria. AGC1 activity enables neuronal export of aspartate, the glial substrate necessary for proper neuronal myelination. Recombinant mutant p.Arg353Gln AGC1 activity was reduced to 15% of wild type. One prior reported SLC25A12 mutation caused complete loss of AGC1 activity in a child with epilepsy, hypotonia, hypomyelination, and reduced brain NAA.
Conclusions: These data strongly suggest that SLC25A12 disease impairs neuronal AGC1 activity. SLC25A12 sequencing should be considered in children with infantile epilepsy, congenital hypotonia, global delay, abnormal myelination, and reduced brain NAA.
Electronic supplementary material
The online version of this chapter (doi:10.1007/8904_2013_287) contains supplementary material, which is available to authorized users.
PMCID: PMC4213337  PMID: 24515575
6.  Whole-genome DNA/RNA sequencing identifies truncating mutations in RBCK1 in a novel Mendelian disease with neuromuscular and cardiac involvement 
Genome Medicine  2013;5(7):67.
Whole-exome sequencing has identified the causes of several Mendelian diseases by analyzing multiple unrelated cases, but it is more challenging to resolve the cause of extremely rare and suspected Mendelian diseases from individual families. We identified a family quartet with two children, both affected with a previously unreported disease, characterized by progressive muscular weakness and cardiomyopathy, with normal intelligence. During the course of the study, we identified one additional unrelated patient with a comparable phenotype.
We performed whole-genome sequencing (Complete Genomics platform), whole-exome sequencing (Agilent SureSelect exon capture and Illumina Genome Analyzer II platform), SNP genotyping (Illumina HumanHap550 SNP array) and Sanger sequencing on blood samples, as well as RNA-Seq (Illumina HiSeq platform) on transformed lymphoblastoid cell lines.
From whole-genome sequence data, we identified RBCK1, a gene encoding an E3 ubiquitin-protein ligase, as the most likely candidate gene, with two protein-truncating mutations in probands in the first family. However, exome data failed to nominate RBCK1 as a candidate gene, due to poor regional coverage. Sanger sequencing identified a private homozygous splice variant in RBCK1 in the proband in the second family, yet SNP genotyping revealed a 1.2Mb copy-neutral region of homozygosity covering RBCK1. RNA-Seq confirmed aberrant splicing of RBCK1 transcripts, resulting in truncated protein products.
While the exact mechanism by which these mutations cause disease is unknown, our study represents an example of how the combined use of whole-genome DNA and RNA sequencing can identify a disease-predisposing gene for a novel and extremely rare Mendelian disease.
PMCID: PMC3971341  PMID: 23889995
7.  The missense variation landscape of FTO, MC4R and TMEM18 in obese children of African ancestry 
Obesity (Silver Spring, Md.)  2013;21(1):159-163.
Common variation at the loci harboring FTO, MC4R and TMEM18 is consistently reported as being statistically the most strongly associated with obesity. We investigated whether these loci also harbor rarer missense variants that confer substantially higher risk of common childhood obesity in African American (AA) children. We sequenced the exons of FTO, MC4R and TMEM18 in an initial subset of our cohort, i.e. 200 obese (BMI ≥ 95th percentile) and 200 lean (BMI ≤ 5th percentile) AA children. Any missense exonic variants that were uncovered were then genotyped in a further 768 obese and 768 lean (BMI ≤ 50th percentile) children of the same ethnicity. A number of exonic variants were observed from our sequencing effort: seven in FTO, of which four were non-synonymous (A163T, G182A, M400V and A405V), thirteen in MC4R, of which six were non-synonymous (V103I, N123S, S136A, F202L, N240S and I251L) and four in TMEM18, of which two were non-synonymous (P2S and V113L). Follow-up genotyping of these missense variants revealed only one significant difference in allele frequency between cases and controls, namely with N240S in MC4R (Fisher's exact P = 0.0001). In summary, moderately rare missense variants within the FTO, MC4R and TMEM18 genes observed in our study did not confer risk of common childhood obesity in African Americans except for a degree of evidence for one known loss-of-function variant in MC4R.
PMCID: PMC3605748  PMID: 23505181
Obesity; Pediatrics; Genomics
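The abstract above compares allele frequencies between cases and controls with Fisher's exact test but does not give the underlying counts. The following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table, assuming the common "sum all tables no more likely than the observed one" definition; the function name and the allele counts in the usage lines are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at most as likely as the observed one
    # (a small relative tolerance guards against floating-point ties).
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical allele counts (variant, reference) in 768 cases and 768 controls.
p_value = fisher_exact_two_sided(12, 1524, 1, 1535)
print(f"Fisher's exact P = {p_value:.4g}")
```

An exact test is a natural choice for rare variants because the expected cell counts are too small for a chi-squared approximation to be reliable.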
8.  Whole-genome sequencing in an autism multiplex family 
Molecular Autism  2013;4:8.
Autism spectrum disorders (ASDs) represent a group of childhood neurodevelopmental disorders that affect 1 in 88 children in the US. Previous exome sequencing studies on family trios have implicated a role for rare, de-novo mutations in the pathogenesis of autism.
To examine the utility of whole-genome sequencing to identify inherited disease candidate variants and genes, we sequenced two probands from a large pedigree, including two parents and eight children. We evaluated multiple analytical strategies to identify a prioritized list of candidate genes.
By assuming a recessive model of inheritance, we identified seven candidate genes shared by the two probands. We also evaluated a different analytical strategy that does not require the assumption of disease model, and identified a list of 59 candidate variants that may increase susceptibility to autism. Manual examination of this list identified ANK3 as the most likely candidate gene. Finally, we identified 33 prioritized non-coding variants such as those near SMG6 and COQ5, based on evolutionary constraint and experimental evidence from ENCODE. Although we were unable to confirm rigorously whether any of these genes indeed contribute to the disease, our analysis provides a prioritized shortlist for further validation studies.
Our study represents one of the first whole-genome sequencing studies in autism leveraging a large family-based pedigree. These results provide for a discussion on the relative merits of finding de-novo mutations in sporadic cases versus finding inherited mutations in large pedigrees, in the context of neuropsychiatric and neurodevelopmental diseases.
PMCID: PMC3642023  PMID: 23597238
9.  Common variation at 6q16 within HACE1 and LIN28B influences susceptibility to neuroblastoma 
Nature Genetics  2012;44(10):1126-1130.
Neuroblastoma is a cancer of the sympathetic nervous system that accounts for approximately 10% of all pediatric oncology deaths (ref. 1). Here we report on a genome-wide association study of 2,817 neuroblastoma cases and 7,473 controls. We identified two new associations at 6q16, the first within HACE1 (rs4336470; combined P = 2.7 × 10−11, odds ratio 1.26, 95% CI: 1.18–1.35) and the second within LIN28B (rs17065417; combined P = 1.2 × 10−8, odds ratio 1.38, 95% CI: 1.23–1.54). Expression of LIN28B and let-7 miRNA correlated with rs17065417 genotype in neuroblastoma cell lines, and we observed significant growth inhibition upon depletion of LIN28B specifically in neuroblastoma cells homozygous for the risk allele. Low HACE1 and high LIN28B expression in diagnostic primary neuroblastomas were associated with worse overall survival (P = 0.008 and 0.014, respectively). Taken together, we show that common variants in HACE1 and LIN28B influence neuroblastoma susceptibility and that both genes likely play a role in disease progression.
PMCID: PMC3459292  PMID: 22941191
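The abstract above reports odds ratios with 95% confidence intervals. As a rough illustration of how such figures relate to allele counts, here is a minimal sketch using the Wald (Woolf) approximation on a single 2x2 table. This is an assumption for illustration only: the study reports combined estimates across cohorts, its exact method is not stated here, and the counts below are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(case_risk, case_other, control_risk, control_other, z=1.96):
    """Allelic odds ratio and Wald (Woolf) confidence interval from 2x2 counts."""
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts, not taken from the study.
or_estimate, ci_lower, ci_upper = odds_ratio_with_ci(20, 80, 10, 90)
print(f"OR = {or_estimate:.2f}, 95% CI: {ci_lower:.2f} to {ci_upper:.2f}")
```

An interval that excludes 1, as both reported intervals do, indicates that the risk allele is more frequent in cases than in controls at the chosen confidence level.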
10.  Common Variation at BARD1 Results in the Expression of an Oncogenic Isoform that Influences Neuroblastoma Susceptibility and Oncogenicity 
Cancer Research  2012;72(8):2068-2078.
The mechanisms underlying genetic susceptibility at loci discovered by genome-wide association study (GWAS) approaches in human cancer remain largely undefined. In this study we characterized the high-risk neuroblastoma association at the BRCA1-related locus, BARD1, showing that disease-associated variations correlate with increased expression of the oncogenically activated isoform, BARD1β. In neuroblastoma cells, silencing of BARD1β showed genotype-specific cytotoxic effects, including decreased substrate-adherent, anchorage-independent, and foci growth. In established murine fibroblasts, overexpression of BARD1β was sufficient for neoplastic transformation. BARD1β stabilized the Aurora family of kinases in neuroblastoma cells, suggesting both a mechanism for the observed effect and a potential therapeutic strategy. Together, our findings identify BARD1β as an oncogenic driver of high-risk neuroblastoma tumorigenesis, and more generally, they illustrate how robust GWAS signals offer genomic landmarks to identify molecular mechanisms involved in both tumor initiation and malignant progression. The interaction of BARD1β with the Aurora family of kinases lends strong support to the ongoing work to develop Aurora kinase inhibitors for clinically aggressive neuroblastoma.
PMCID: PMC3328617  PMID: 22350409
genome-wide association; neuroblastoma; BARD1; cancer susceptibility genes; functional genomics; oncogenes; genotype-phenotype correlations
11.  Integrative genomics identifies LMO1 as a neuroblastoma oncogene 
Nature  2010;469(7329):216-220.
Neuroblastoma is a childhood cancer of the sympathetic nervous system that accounts for approximately 10% of all paediatric oncology deaths (refs 1, 2). To identify genetic risk factors for neuroblastoma, we performed a genome-wide association study (GWAS) on 2,251 patients and 6,097 control subjects of European ancestry from four case series. Here we report a significant association within LIM domain only 1 (LMO1) at 11p15.4 (rs110419, combined P = 5.2 × 10−16, odds ratio of risk allele = 1.34 (95% confidence interval 1.25–1.44)). The signal was enriched in the subset of patients with the most aggressive form of the disease. LMO1 encodes a cysteine-rich transcriptional regulator, and its paralogues (LMO2, LMO3 and LMO4) have each been previously implicated in cancer. In parallel, we analysed genome-wide DNA copy number alterations in 701 primary tumours. We found that the LMO1 locus was aberrant in 12.4% through a duplication event, and that this event was associated with more advanced disease (P < 0.0001) and survival (P = 0.041). The germline single nucleotide polymorphism (SNP) risk alleles and somatic copy number gains were associated with increased LMO1 expression in neuroblastoma cell lines and primary tumours, consistent with a gain-of-function role in tumorigenesis. Short hairpin RNA (shRNA)-mediated depletion of LMO1 inhibited growth of neuroblastoma cells with high LMO1 expression, whereas forced expression of LMO1 in neuroblastoma cells with low LMO1 expression enhanced proliferation. These data show that common polymorphisms at the LMO1 locus are strongly associated with susceptibility to developing neuroblastoma, but also may influence the likelihood of further somatic alterations at this locus, leading to malignant progression.
PMCID: PMC3320515  PMID: 21124317
12.  Common variants at five new loci associated with early-onset inflammatory bowel disease 
Nature Genetics  2009;41(12):1335-1340.
The inflammatory bowel diseases (IBD) Crohn’s disease and ulcerative colitis are common causes of morbidity in children and young adults in the western world. Here we report the results of a genome-wide association study in early-onset IBD involving 3,426 affected individuals and 11,963 genetically matched controls recruited through international collaborations in Europe and North America, thereby extending the results from a previous study of 1,011 individuals with early-onset IBD (ref. 1). We have identified five new regions associated with early-onset IBD susceptibility, including 16p11 near the cytokine gene IL27 (rs8049439, P = 2.41 × 10−9), 22q12 (rs2412973, P = 1.55 × 10−9), 10q22 (rs1250550, P = 5.63 × 10−9), 2q37 (rs4676410, P = 3.64 × 10−8) and 19q13.11 (rs10500264, P = 4.26 × 10−10). Our scan also detected associations at 23 of 32 loci previously implicated in adult-onset Crohn’s disease and at 8 of 17 loci implicated in adult-onset ulcerative colitis, highlighting the close pathogenetic relationship between early- and adult-onset IBD.
PMCID: PMC3267927  PMID: 19915574
13.  Comparative genetic analysis of inflammatory bowel disease and type 1 diabetes implicates multiple loci with opposite effects 
Human Molecular Genetics  2010;19(10):2059-2067.
Inflammatory bowel disease, including Crohn's disease (CD) and ulcerative colitis (UC), and type 1 diabetes (T1D) are autoimmune diseases that may share common susceptibility pathways. We examined known susceptibility loci for these diseases in a cohort of 1689 CD cases, 777 UC cases, 989 T1D cases and 6197 shared control subjects of European ancestry, who were genotyped by the Illumina HumanHap550 SNP arrays. We identified multiple previously unreported or unconfirmed disease associations, including known CD loci (ICOSLG and TNFSF15) and T1D loci (TNFAIP3) that confer UC risk, known UC loci (HERC2 and IL26) that confer T1D risk and known UC loci (IL10 and CCNY) that confer CD risk. Additionally, we show that T1D risk alleles residing at the PTPN22, IL27, IL18RAP and IL10 loci protect against CD. Furthermore, the strongest risk alleles for T1D within the major histocompatibility complex (MHC) confer strong protection against CD and UC; however, given the multi-allelic nature of the MHC haplotypes, sequencing of the MHC locus will be required to interpret this observation. These results extend our current knowledge on genetic variants that predispose to autoimmunity, and suggest that many loci involved in autoimmunity may be under a balancing selection due to antagonistic pleiotropic effect. Our analysis implies that variants with opposite effects on different diseases may facilitate the maintenance of common susceptibility alleles in human populations, making autoimmune diseases especially amenable to genetic dissection by genome-wide association studies.
PMCID: PMC2860894  PMID: 20176734
14.  Examination of All Type 2 Diabetes GWAS Loci Reveals HHEX-IDE as a Locus Influencing Pediatric BMI 
Diabetes  2009;59(3):751-755.
A number of studies have found that BMI in early life influences the risk of developing type 2 diabetes later in life. Our goal was to investigate if any type 2 diabetes variants uncovered through genome-wide association studies (GWAS) impact BMI in childhood.
Using data from an ongoing GWAS of pediatric BMI in our cohort, we investigated the association of pediatric BMI with 20 single nucleotide polymorphisms at 18 type 2 diabetes loci uncovered through GWAS, consisting of ADAMTS9, CDC123-CAMK1D, CDKAL1, CDKN2A/B, EXT2, FTO, HHEX-IDE, IGF2BP2, the intragenic region on 11p12, JAZF1, KCNQ1, LOC387761, MTNR1B, NOTCH2, SLC30A8, TCF7L2, THADA, and TSPAN8-LGR5. We randomly partitioned our cohort exactly in half in order to have a discovery cohort (n = 3,592) and a replication cohort (n = 3,592).
Our data show that the major type 2 diabetes risk–conferring G allele of rs7923837 at the HHEX-IDE locus was associated with higher pediatric BMI in both the discovery (P = 0.0013 and survived correction for 20 tests) and replication (P = 0.023) sets (combined P = 1.01 × 10−4). Association was not detected with any other known type 2 diabetes loci uncovered to date through GWAS except for the well-established FTO.
Our data show that the same genetic HHEX-IDE variant, which is associated with type 2 diabetes from previous studies, also influences pediatric BMI.
PMCID: PMC2828649  PMID: 19933996
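The design described in the abstract above (a random split of the cohort into equal discovery and replication halves, then a multiple-testing correction for 20 tests) can be sketched as follows. This is a sketch under stated assumptions: the abstract does not name the correction method, so Bonferroni at a family-wise alpha of 0.05 is assumed here, and the function names, integer sample identifiers and seed are hypothetical.

```python
import random


def split_in_half(sample_ids, seed=0):
    """Randomly partition sample identifiers into two equal-sized halves."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # seeded so the split is reproducible
    middle = len(ids) // 2
    return ids[:middle], ids[middle:]


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# 2 x 3,592 children, as reported in the abstract; the IDs are placeholders.
discovery, replication = split_in_half(range(7184))
print(len(discovery), len(replication))  # prints "3592 3592"

# Discovery P value for rs7923837 from the abstract, tested against 20 SNPs.
threshold = bonferroni_threshold(20)
print(0.0013 < threshold)  # prints "True"
```

Under this assumption the per-test threshold is 0.05 / 20 = 0.0025, so the discovery P value of 0.0013 passes, which is consistent with the abstract's statement that it survived correction for 20 tests.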
15.  Examination of Type 2 Diabetes Loci Implicates CDKAL1 as a Birth Weight Gene 
Diabetes  2009;58(10):2414-2418.
A number of studies have found that reduced birth weight is associated with type 2 diabetes later in life; however, the underlying mechanism for this correlation remains unresolved. Recently, association has been demonstrated between low birth weight and single nucleotide polymorphisms (SNPs) at the CDKAL1 and HHEX-IDE loci, regions that were previously implicated in the pathogenesis of type 2 diabetes. In order to investigate whether type 2 diabetes risk–conferring alleles associate with low birth weight in our Caucasian childhood cohort, we examined the effects of 20 such loci on this trait.
Using data from an ongoing genome-wide association study in our cohort of 5,465 Caucasian children with recorded birth weights, we investigated the association of the previously reported type 2 diabetes–associated variation at 20 loci including TCF7L2, HHEX-IDE, PPARG, KCNJ11, SLC30A8, IGF2BP2, CDKAL1, CDKN2A/2B, and JAZF1 with birth weight.
Our data show that the minor allele of rs7756992 (P = 8 × 10−5) at the CDKAL1 locus is strongly associated with lower birth weight, whereas a perfect surrogate for variation previously implicated for the trait at the same locus only yielded nominally significant association (P = 0.01; r² with rs7756992 = 0.677). However, association was not detected with any of the other type 2 diabetes loci studied.
We observe association between lower birth weight and type 2 diabetes risk–conferring alleles at the CDKAL1 locus. Our data show that the same genetic locus that has been identified as a marker for type 2 diabetes in previous studies also influences birth weight.
PMCID: PMC2750235  PMID: 19592620
16.  Common genetic variants on 5p14.1 associate with autism spectrum disorders 
Nature  2009;459(7246):528-533.
Autism spectrum disorders (ASDs) represent a group of childhood neurodevelopmental and neuropsychiatric disorders characterized by deficits in verbal communication, impairment of social interaction, and restricted and repetitive patterns of interests and behaviour. To identify common genetic risk factors underlying ASDs, here we present the results of genome-wide association studies on a cohort of 780 families (3,101 subjects) with affected children, and a second cohort of 1,204 affected subjects and 6,491 control subjects, all of whom were of European ancestry. Six single nucleotide polymorphisms between cadherin 10 (CDH10) and cadherin 9 (CDH9)—two genes encoding neuronal cell-adhesion molecules—revealed strong association signals, with the most significant SNP being rs4307059 (P = 3.4 × 10−8, odds ratio = 1.19). These signals were replicated in two independent cohorts, with combined P values ranging from 7.4 × 10−8 to 2.1 × 10−10. Our results implicate neuronal cell-adhesion molecules in the pathogenesis of ASDs, and represent, to our knowledge, the first demonstration of genome-wide significant association of common variants with susceptibility to ASDs.
PMCID: PMC2943511  PMID: 19404256
17.  Autism genome-wide copy number variation reveals ubiquitin and neuronal genes 
Nature  2009;459(7246):569-573.
Autism spectrum disorders (ASDs) are childhood neurodevelopmental disorders with complex genetic origins (refs 1–4). Previous studies focusing on candidate genes or genomic regions have identified several copy number variations (CNVs) that are associated with an increased risk of ASDs (refs 5–9). Here we present the results from a whole-genome CNV study on a cohort of 859 ASD cases and 1,409 healthy children of European ancestry who were genotyped with ~550,000 single nucleotide polymorphism markers, in an attempt to comprehensively identify CNVs conferring susceptibility to ASDs. Positive findings were evaluated in an independent cohort of 1,336 ASD cases and 1,110 controls of European ancestry. Besides previously reported ASD candidate genes, such as NRXN1 (ref. 10) and CNTN4 (refs 11, 12), several new susceptibility genes encoding neuronal cell-adhesion molecules, including NLGN1 and ASTN2, were enriched with CNVs in ASD cases compared to controls (P = 9.5 × 10−3). Furthermore, CNVs within or surrounding genes involved in the ubiquitin pathways, including UBE3A, PARK2, RFWD2 and FBXO40, were affected by CNVs not observed in controls (P = 3.3 × 10−3). We also identified duplications 55 kilobases upstream of complementary DNA AK123120 (P = 3.6 × 10−6). Although these variants may be individually rare, they target genes involved in neuronal cell-adhesion or ubiquitin degradation, indicating that these two important gene networks expressed within the central nervous system may contribute to the genetic susceptibility of ASD.
PMCID: PMC2925224  PMID: 19404257
18.  The role of height-associated loci identified in genome wide association studies in the determination of pediatric stature 
BMC Medical Genetics  2010;11:96.
Human height is considered highly heritable and correlated with certain disorders, such as type 2 diabetes and cancer. Despite environmental influences, genetic factors are known to play an important role in stature determination. A number of genetic determinants of adult height have already been established through genome wide association studies.
We examined 51 single nucleotide polymorphisms (SNPs) corresponding to the 46 previously reported genomic loci for height in 8,184 European American children with height measurements. We leveraged genotyping data from our ongoing GWA study of height variation in children in order to query the 51 SNPs in this pediatric cohort.
Sixteen of these SNPs yielded at least nominally significant association to height, representing fifteen different loci including EFEMP1-PNPT1, GPR126, C6orf173, SPAG17, Histone class 1, HLA class III and GDF5-UQCC. Other loci revealed no evidence for association, including HMGA1 and HMGA2. For the 16 associated variants, the genotype score explained 1.64% of the total variation for height z-score.
Among 46 loci that have been reported to associate with adult height to date, at least 15 also contribute to the determination of height in childhood.
PMCID: PMC2894790  PMID: 20546612
19.  Common variations in BARD1 influence susceptibility to high-risk neuroblastoma 
Nature Genetics  2009;41(6):718-723.
We conducted a SNP-based genome-wide association study (GWAS) focused on the high-risk subset of neuroblastoma (ref. 1). As our previous unbiased GWAS showed strong association of common 6p22 SNP alleles with aggressive neuroblastoma (ref. 2), we now restricted our analysis to 397 high-risk cases compared to 2,043 controls. We detected new significant association of six SNPs at 2q35 within the BARD1 gene locus (P_allelic = 2.35 × 10−9 to 2.25 × 10−8). Each SNP association was confirmed in a second series of 189 high-risk cases and 1,178 controls (P_allelic = 7.90 × 10−7 to 2.77 × 10−4). The two most significant SNPs (rs6435862, rs3768716) were also tested in two additional independent high-risk neuroblastoma case series, yielding combined allelic odds-ratios of 1.68 each (P = 8.65 × 10−18 and 2.74 × 10−16, respectively). Significant association was also found with known BARD1 nsSNPs. These data show that common variation in BARD1 contributes to the etiology of the aggressive and most clinically relevant subset of human neuroblastoma.
PMCID: PMC2753610  PMID: 19412175
20.  From Disease Association to Risk Assessment: An Optimistic View from Genome-Wide Association Studies on Type 1 Diabetes 
PLoS Genetics  2009;5(10):e1000678.
Genome-wide association studies (GWAS) have been fruitful in identifying disease susceptibility loci for common and complex diseases. A remaining question is whether we can quantify individual disease risk based on genotype data, in order to facilitate personalized prevention and treatment for complex diseases. Previous studies have typically failed to achieve satisfactory performance, primarily due to the use of only a limited number of confirmed susceptibility loci. Here we propose that sophisticated machine-learning approaches with a large ensemble of markers may improve the performance of disease risk assessment. We applied a Support Vector Machine (SVM) algorithm on a GWAS dataset generated on the Affymetrix genotyping platform for type 1 diabetes (T1D) and optimized a risk assessment model with hundreds of markers. We subsequently tested this model on an independent Illumina-genotyped dataset with imputed genotypes (1,008 cases and 1,000 controls), as well as a separate Affymetrix-genotyped dataset (1,529 cases and 1,458 controls), resulting in area under ROC curve (AUC) of ∼0.84 in both datasets. In contrast, poor performance was achieved when limited to dozens of known susceptibility loci in the SVM model or logistic regression model. Our study suggests that improved disease risk assessment can be achieved by using algorithms that take into account interactions between a large ensemble of markers. We are optimistic that genotype-based disease risk assessment may be feasible for diseases where a notable proportion of the risk has already been captured by SNP arrays.
Author Summary
An often touted utility of genome-wide association studies (GWAS) is that the resulting discoveries can facilitate implementation of personalized medicine, in which preventive and therapeutic interventions for complex diseases can be tailored to individual genetic profiles. However, recent studies using whole-genome SNP genotype data for disease risk assessment have generally failed to achieve satisfactory results, leading to a pessimistic view of the utility of genotype data for such purposes. Here we propose that sophisticated machine-learning approaches on a large ensemble of markers, which contain both confirmed and as yet unconfirmed disease susceptibility variants, may improve the performance of disease risk assessment. We tested an algorithm called Support Vector Machine (SVM) on three large-scale datasets for type 1 diabetes and demonstrated that risk assessment can be highly accurate for the disease. Our results suggest that individualized disease risk assessment using whole-genome data may be more successful for some diseases (such as T1D) than for others. However, the predictive accuracy will depend on the heritability of the disease under study, the proportion of the genetic risk that is known, and whether the right set of markers and the right algorithms are used.
PMCID: PMC2748686  PMID: 19816555
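The AUC figure quoted in the entry above can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control. The following is a minimal sketch of that rank-based calculation, not the authors' pipeline; the function name `roc_auc` and the scores and labels are hypothetical and chosen only for illustration.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: fraction of (case, control) pairs in which the case
    has the higher score; tied pairs count as half a win."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical risk scores (e.g. classifier outputs); 1 = case, 0 = control.
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation of cases from controls. The pairwise loop is quadratic in sample size, which is acceptable for a sketch but would normally be replaced by a sort-based computation for datasets of thousands of subjects.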
21.  Copy number variation at 1q21.1 associated with neuroblastoma 
Nature  2009;459(7249):987-991.
Common copy number variations (CNVs) represent a significant source of genetic diversity, yet their influence on phenotypic variability, including disease susceptibility, remains poorly understood. To address this problem in cancer, we performed a genome-wide association study (GWAS) of CNVs in the childhood cancer neuroblastoma, a disease where SNP variations are known to influence susceptibility [1,2]. We first genotyped 846 Caucasian neuroblastoma patients and 803 healthy Caucasian controls at 550,000 single nucleotide polymorphisms, and performed a CNV-based test for association. We then replicated significant observations in two independent sample sets comprising a total of 595 cases and 3,357 controls. We identified a common CNV at 1q21.1 associated with neuroblastoma in the discovery set, which was confirmed in both replication sets (P_combined = 2.97 × 10⁻¹⁷; OR = 2.49, 95% CI: 2.02 to 3.05). This CNV was validated by quantitative PCR, fluorescent in situ hybridization, and analysis of matched tumor specimens, and was shown to be heritable in an independent set of 713 cancer-free trios. We identified a novel transcript within the CNV which showed high sequence similarity to several “Neuroblastoma breakpoint family” (NBPF) genes [3,4] and represents a new member of this gene family (NBPFX). This transcript was preferentially expressed in fetal brain and fetal sympathetic nervous tissues, and expression level was strictly correlated with CNV state in neuroblastoma cells. These data demonstrate that inherited copy number variation at 1q21.1 is associated with neuroblastoma and implicate a novel NBPF gene in early tumorigenesis of this childhood cancer.
PMCID: PMC2755253  PMID: 19536264
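The entry above reports an odds ratio with a 95% confidence interval. As a minimal sketch of how such a figure is commonly derived from a 2×2 table of carriers and non-carriers, the following uses the standard Wald interval on the log odds ratio. The abstract does not state which interval method the authors used, so that choice is an assumption, and the counts are hypothetical and not taken from the study.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    z=1.96 gives an approximate 95% interval. All four cells must be > 0.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f}, 95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1.0 indicates an association at roughly the 5% level. With small cell counts the Wald approximation is unreliable and an exact method would be preferable.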
22.  A genome-wide association study identifies a susceptibility locus to clinically aggressive neuroblastoma at 6p22 
The New England journal of medicine  2008;358(24):2585-2593.
Neuroblastoma is a malignancy of the developing sympathetic nervous system that most commonly affects young children and is often lethal. The etiology of this embryonal cancer is not known.
We performed a genome-wide association study by first genotyping 1,032 neuroblastoma patients and 2,043 controls of European descent using the Illumina HumanHap550 BeadChip. Three independent groups of neuroblastoma cases (N=720) and controls (N=2128) were then genotyped to replicate significant associations.
We observed highly significant association between neuroblastoma and the common minor alleles of three single nucleotide polymorphisms (SNPs) within a 94.2 kilobase (Kb) linkage disequilibrium block at chromosome band 6p22 containing the predicted genes FLJ22536 and FLJ44180 (P-value range = 1.71×10⁻⁹ to 7.01×10⁻¹⁰; allelic odds ratio range 1.39-1.40). Homozygosity for the at-risk G allele of the most significantly associated SNP, rs6939340, resulted in an increased likelihood of developing neuroblastoma of 1.97 (95% CI 1.58-2.44). Subsequent genotyping of these 6p22 SNPs in the three independent case series confirmed our observation of association (P=9.33×10⁻¹⁵ at rs6939340 for joint analysis). Furthermore, neuroblastoma patients homozygous for the risk alleles at 6p22 were more likely to develop metastatic (Stage 4) disease (P=0.02), to show amplification of the MYCN oncogene in the tumor cells (P=0.006), and to have disease relapse (P=0.01).
Common genetic variation at chromosome band 6p22 is associated with susceptibility to neuroblastoma.
PMCID: PMC2742373  PMID: 18463370
23.  Adjustment of genomic waves in signal intensities from whole-genome SNP genotyping platforms 
Nucleic Acids Research  2008;36(19):e126.
Whole-genome microarrays with large-insert clones designed to determine DNA copy number often show variation in hybridization intensity that is related to the genomic position of the clones. We found these ‘genomic waves’ to be present in Illumina and Affymetrix SNP genotyping arrays, confirming that they are not platform-specific. The causes of genomic waves are not well understood, and they may prevent accurate inference of copy number variations (CNVs). By measuring DNA concentration for 1,444 samples and by genotyping the same sample multiple times with varying DNA quantity, we demonstrated that DNA quantity correlates with the magnitude of waves. We further showed that wavy signal patterns correlate best with GC content, among multiple genomic features considered. To measure the magnitude of waves, we proposed a GC-wave factor (GCWF) measure, which is a reliable predictor of DNA quantity (correlation coefficient = 0.994 based on samples with serial dilution). Finally, we developed a computational approach by fitting regression models with GC content included as a predictor variable, and we showed that this approach improves the accuracy of CNV detection. With the wide application of whole-genome SNP genotyping techniques, our wave adjustment method will be important for taking full advantage of genotyped samples for CNV analysis.
PMCID: PMC2577347  PMID: 18784189
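The wave adjustment described in the entry above fits a regression with GC content as a predictor and removes the fitted component from the signal. The sketch below illustrates that idea with a single-predictor ordinary least-squares fit. The exact model form used by the authors is not given in the abstract, so the one-variable linear model is an assumption, and the function names and per-marker values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def adjust_for_gc(intensity, gc_content):
    """Return the residual signal after removing the linear GC trend."""
    slope, intercept = fit_line(gc_content, intensity)
    return [yi - (intercept + slope * gi)
            for yi, gi in zip(intensity, gc_content)]


# Hypothetical per-marker values: local GC fraction and a signal intensity
# that drifts with GC content plus small marker-level deviations.
gc_content = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55]
deviation = [0.02, -0.01, 0.00, 0.01, -0.02, 0.00]
intensity = [0.8 * (g - 0.4) + d for g, d in zip(gc_content, deviation)]

adjusted = adjust_for_gc(intensity, gc_content)
residual_slope, _ = fit_line(gc_content, adjusted)
print(f"GC slope after adjustment: {residual_slope:.2e}")
```

After adjustment the residual signal has no remaining linear dependence on GC content, so any regional shifts left in it are less likely to be wave artifacts when passed to a CNV caller.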
24.  DTNBP1 (Dystrobrevin Binding Protein 1) and Schizophrenia: Association Evidence in the 3′ End of the Gene 
Human Heredity  2007;64(2):97-106.
Dysbindin (DTNBP1) has been identified as a susceptibility gene for schizophrenia (SZ) through a positional approach. However, a variety of single nucleotide polymorphisms (SNPs) and haplotypes, in different parts of the gene, have been reported to be associated in different samples, and a precise molecular mechanism of disease remains to be defined. We have performed an association study with two well-characterized family samples not previously investigated at the DTNBP1 locus.
We examined 646 subjects in 136 families with SZ, largely of European ancestry (EA), genotyping 26 SNPs in DTNBP1.
Three correlated markers (rs875462, rs760666, and rs7758659) at the 3′ region of DTNBP1 showed evidence for association to SZ (p = 0.004), observed in both the EA (p = 0.031) and the African American (AA) subset (p = 0.045) with the same over-transmitted allele. The most significant haplotype in our study was rs7758659-rs3213207 (global p = 0.0015), with rs3213207 being the most frequently reported associated marker in previous studies. A non-conservative missense variant (Pro272Ser) in the 3′ region of DTNBP1 that may impair DTNBP1 function was more common in SZ probands (8.2%) than in founders (5%) and in dbSNP (2.1%), but did not reach statistical significance.
Our results provide evidence for an association of SZ with SNPs at the 3′ end of DTNBP1 in the samples studied.
PMCID: PMC2861529  PMID: 17476109
Single nucleotide polymorphism; Haplotype; Linkage disequilibrium; Complex disorder; Dystrobrevin binding protein 1; Schizophrenia; Association
