26.  SNP array mapping of 20p deletions: Genotypes, Phenotypes and Copy Number Variation 
Human mutation  2009;30(3):371-378.
The use of array technology to define chromosome deletions and duplications is bringing us closer to establishing a genotype/phenotype map of genomic copy number alterations. We studied 21 patients and 5 relatives with deletions of the short arm of chromosome 20 using the Illumina HumanHap550 SNP array to 1) more accurately determine the deletion sizes, 2) identify and compare breakpoints, 3) establish genotype/phenotype correlations and 4) investigate the use of the HumanHap550 platform for analysis of chromosome deletions. Deletions ranged from 95 kb to 14.62 Mb, and all of the breakpoints were unique. Eleven patients had deletions between 95 kb and 4 Mb and these individuals had normal development, with no anomalies outside of those associated with Alagille syndrome. The proximal and distal boundaries of these eleven deletions constitute a 5.4 Mb region, and we propose that haploinsufficiency for only 1 of the 12 genes in this region causes phenotypic abnormalities. This defines the JAG1 associated critical region, in which deletions do not confer findings other than those associated with Alagille syndrome. The other 10 patients had deletions between 3.28 Mb and 14.62 Mb, which extended outside the critical region, and notably, all of these patients had developmental delay. This group had other findings such as autism, scoliosis and bifid uvula. We identified 47 additional polymorphic genome-wide copy number variants (>20 SNPs), with 0–5 variants called per patient. Deletions of the short arm of chromosome 20 are associated with relatively mild and limited clinical anomalies. The use of SNP arrays provides accurate high-resolution definition of genomic abnormalities.
PMCID: PMC2650004  PMID: 19058200
SNP array analysis; 20p deletion; copy number variants; Alagille syndrome; haploinsufficiency; JAG1
27.  ATOM: a powerful gene-based association test by combining optimally weighted markers 
Bioinformatics  2008;25(4):497-503.
Background: Large-scale candidate-gene and genome-wide association studies genotype multiple SNPs within or surrounding a gene, including both tag and functional SNPs. The immense amount of data generated in these studies poses new challenges to analysis. One particularly challenging yet important question is how to best use all genetic information to test whether a gene or a region is associated with the trait of interest.
Methods: Here we propose a powerful gene-based Association Test by combining Optimally Weighted Markers (ATOM) within a genomic region. Due to variation in linkage disequilibrium, different markers often associate with the trait of interest at different levels. To appropriately apportion their contributions, we assign a weight to each marker that is proportional to the amount of information it captures about the trait locus. We analytically derive the optimal weights for both quantitative and binary traits, and describe a procedure for estimating the weights from a reference database such as the HapMap. Compared with existing approaches, our method has several distinct advantages, including (i) the ability to borrow information from an external database to increase power, (ii) the theoretical derivation of optimal marker weights and (iii) the scalability to simultaneous analysis of all SNPs in candidate genes and pathways.
Results: Through extensive simulations and analysis of the FTO gene in our ongoing genome-wide association study on childhood obesity, we demonstrate that ATOM increases the power to detect genetic association as compared with several commonly used multi-marker association tests.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2642636  PMID: 19074959
28.  Follow-Up Analysis of Genome-Wide Association Data Identifies Novel Loci for Type 1 Diabetes 
Diabetes  2009;58(1):290-295.
OBJECTIVE—Two recent genome-wide association (GWA) studies have revealed novel loci for type 1 diabetes, a common multifactorial disease with a strong genetic component. To fully utilize the GWA data that we had obtained by genotyping 563 type 1 diabetes probands and 1,146 control subjects, as well as 483 case subject–parent trios, using the Illumina HumanHap550 BeadChip, we designed a full stage 2 study to capture other possible association signals.
RESEARCH DESIGN AND METHODS—From our existing datasets, we selected 982 markers with P < 0.05 in both GWA cohorts. Genotyping these in an independent set of 636 nuclear families with 974 affected offspring revealed 75 markers that also had P < 0.05 in this third cohort. Among these, six single nucleotide polymorphisms in five novel loci also had P < 0.05 in the Wellcome Trust Case-Control Consortium dataset and were further tested in 1,303 type 1 diabetes probands from the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications (DCCT/EDIC) plus 1,673 control subjects.
RESULTS—Two markers (rs9976767 and rs3757247) remained significant after adjusting for the number of tests in this last cohort; they reside in UBASH3A (OR 1.16; combined P = 2.33 × 10⁻⁸) and BACH2 (OR 1.13; combined P = 1.25 × 10⁻⁶).
CONCLUSIONS—Evaluation of a large number of statistical GWA candidates in several independent cohorts has revealed additional loci that are associated with type 1 diabetes. The two genes at these respective loci, UBASH3A and BACH2, are both biologically relevant to autoimmunity.
PMCID: PMC2606889  PMID: 18840781
29.  Association Analysis of Type 2 Diabetes Loci in Type 1 Diabetes 
Diabetes  2008;57(7):1983-1986.
OBJECTIVE—To search for a possible association of type 1 diabetes with 10 validated type 2 diabetes loci, i.e., PPARG, KCNJ11, WFS1, HNF1B, IDE/HHEX, SLC30A8, CDKAL1, CDKN2A/B, IGF2BP2, and FTO/RPGRIP1L.
RESEARCH DESIGN AND METHODS—Two European population samples were studied: 1) one case-control cohort of 514 type 1 diabetic subjects and 2,027 control subjects and 2) one family cohort of 483 complete type 1 diabetic case-parent trios (total 997 affected). A total of 13 tag single nucleotide polymorphisms (SNPs) from the 10 type 2 diabetes loci were analyzed for type 1 diabetes association.
RESULTS—No association of type 1 diabetes was found with any of the 10 type 2 diabetes loci, and no age-at-onset effect was detected. By combined analysis using the Wellcome Trust Case-Control Consortium type 1 diabetes data, SNP rs1412829 in the CDKN2A/B locus bordered on significance (P = 0.039) (odds ratio 0.929 [95% CI 0.867–0.995]), which did not reach the statistical significance threshold adjusted for 13 tests (α = 0.00385).
CONCLUSIONS—This study suggests that the type 2 diabetes loci do not play any obvious role in type 1 diabetes genetic susceptibility. The distinct molecular mechanisms of the two diseases highlight the importance of differential diagnosis and different treatment principles.
PMCID: PMC2453613  PMID: 18426861
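The adjusted threshold quoted in entry 29 (α = 0.00385 for 13 tests) is consistent with a Bonferroni correction of a nominal 0.05 level. The abstract does not name the correction method, so the sketch below is an assumption; the function names are illustrative and not taken from the paper.

```python
def bonferroni_alpha(nominal_alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return nominal_alpha / n_tests


def passes_threshold(p_value: float, nominal_alpha: float, n_tests: int) -> bool:
    """True if p_value is below the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_alpha(nominal_alpha, n_tests)


# Values from the abstract: 13 tag SNPs, combined P = 0.039 for rs1412829.
# The nominal level of 0.05 is assumed, not stated in the abstract.
threshold = bonferroni_alpha(0.05, 13)
rs1412829_significant = passes_threshold(0.039, 0.05, 13)
```

Under this assumption the threshold rounds to 0.00385, and P = 0.039 falls well short of it, which agrees with the abstract's statement that the CDKN2A/B signal did not reach adjusted significance.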
30.  Concept, Design and Implementation of a Cardiovascular Gene-Centric 50 K SNP Array for Large-Scale Genomic Association Studies 
PLoS ONE  2008;3(10):e3583.
A wealth of genetic associations for cardiovascular and metabolic phenotypes in humans has been accumulating over the last decade, in particular a large number of loci derived from recent genome wide association studies (GWAS). True complex disease-associated loci often exert modest effects, so their delineation currently requires integration of diverse phenotypic data from large studies to ensure robust meta-analyses. We have designed a gene-centric 50 K single nucleotide polymorphism (SNP) array to assess potentially relevant loci across a range of cardiovascular, metabolic and inflammatory syndromes. The array utilizes a “cosmopolitan” tagging approach to capture the genetic diversity across ∼2,000 loci in populations represented in the HapMap and SeattleSNPs projects. The array content is informed by GWAS of vascular and inflammatory disease, expression quantitative trait loci implicated in atherosclerosis, pathway based approaches and comprehensive literature searching. The custom flexibility of the array platform facilitated interrogation of loci at differing stringencies, according to a gene prioritization strategy that allows saturation of high priority loci with a greater density of markers than the existing GWAS tools, particularly in African HapMap samples. We also demonstrate that the IBC array can be used to complement GWAS, increasing coverage in high priority CVD-related loci across all major HapMap populations. DNA from over 200,000 extensively phenotyped individuals will be genotyped with this array with a significant portion of the generated data being released into the academic domain facilitating in silico replication attempts, analyses of rare variants and cross-cohort meta-analyses in diverse populations. These datasets will also facilitate more robust secondary analyses, such as explorations with alternative genetic models, epistasis and gene-environment interactions.
PMCID: PMC2571995  PMID: 18974833
31.  Modeling genetic inheritance of copy number variations 
Nucleic Acids Research  2008;36(21):e138.
Copy number variations (CNVs) are being used as genetic markers or functional candidates in gene-mapping studies. However, unlike single nucleotide polymorphism or microsatellite genotyping techniques, most CNV detection methods are limited to detecting total copy numbers, rather than copy number in each of the two homologous chromosomes. To address this issue, we developed a statistical framework for intensity-based CNV detection platforms using family data. Our algorithm identifies CNVs for a family simultaneously, thus avoiding the generation of calls with Mendelian inconsistency while maintaining the ability to detect de novo CNVs. Applications to simulated data and real data indicate that our method significantly improves both call rates and accuracy of boundary inference, compared to existing approaches. We further illustrate the use of Mendelian inheritance to infer SNP allele compositions in each of the two homologous chromosomes in CNV regions using real data. Finally, we applied our method to a set of families genotyped using both the Illumina HumanHap550 and Affymetrix genome-wide 5.0 arrays to demonstrate its performance on both inherited and de novo CNVs. In conclusion, our method produces accurate CNV calls, gives probabilistic estimates of CNV transmission and builds a solid foundation for the development of linkage and association tests utilizing CNVs.
PMCID: PMC2588508  PMID: 18832372
32.  Association Analysis of the FTO Gene with Obesity in Children of Caucasian and African Ancestry Reveals a Common Tagging SNP 
PLoS ONE  2008;3(3):e1746.
Recently an association was demonstrated between the single nucleotide polymorphism (SNP), rs9939609, within the FTO locus and obesity as a consequence of a genome wide association (GWA) study of type 2 diabetes in adults. We examined the effects of two perfect surrogates for this SNP plus 11 other SNPs at this locus with respect to our childhood obesity cohort, consisting of both Caucasians and African Americans (AA). Utilizing data from our ongoing GWA study in our cohort of 418 Caucasian obese children (BMI≥95th percentile), 2,270 Caucasian controls (BMI<95th percentile), 578 AA obese children and 1,424 AA controls, we investigated the association of the previously reported variation at the FTO locus with the childhood form of this disease in both ethnicities. The minor allele frequencies (MAF) of rs8050136 and rs3751812 (perfect surrogates for rs9939609, i.e. both r² = 1) in the Caucasian cases were 0.448 and 0.443 respectively while they were 0.391 and 0.386 in Caucasian controls respectively, yielding for both an odds ratio (OR) of 1.27 (95% CI 1.08–1.47; P = 0.0022). Furthermore, the MAFs of rs8050136 and rs3751812 in the AA cases were 0.449 and 0.115 respectively while they were 0.436 and 0.090 in AA controls respectively, yielding an OR of 1.05 (95% CI 0.91–1.21; P = 0.49) and of 1.31 (95% CI 1.050–1.643; P = 0.017) respectively. Investigating all 13 SNPs present on the Illumina HumanHap550 BeadChip in this region of linkage disequilibrium, rs3751812 was the only SNP conferring significant risk in AA. We have therefore replicated and refined the association in an AA cohort and distilled a tag-SNP, rs3751812, which captures the ancestral origin of the actual mutation. As such, variants in the FTO gene confer a similar magnitude of risk of obesity to children as to their adult counterparts and appear to have a global impact.
PMCID: PMC2262153  PMID: 18335027
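The odds ratios in entry 32 can be approximated from the reported minor allele frequencies alone. The following is a rough sketch of an allelic odds ratio, not the authors' analysis: the published estimates were presumably derived from genotype counts, so small rounding differences are expected. The function name is illustrative.

```python
def allelic_odds_ratio(maf_cases: float, maf_controls: float) -> float:
    """Odds of carrying the minor allele in cases divided by the odds in controls."""
    for freq in (maf_cases, maf_controls):
        if not 0.0 < freq < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = maf_cases / (1.0 - maf_cases)
    odds_controls = maf_controls / (1.0 - maf_controls)
    return odds_cases / odds_controls


# Minor allele frequencies as reported in the abstract.
or_caucasian_rs8050136 = allelic_odds_ratio(0.448, 0.391)
or_aa_rs8050136 = allelic_odds_ratio(0.449, 0.436)
or_aa_rs3751812 = allelic_odds_ratio(0.115, 0.090)
```

These frequency-based values land close to the reported odds ratios of 1.27, 1.05 and 1.31, which is a useful sanity check on the figures in the abstract but no substitute for the count-based estimates and confidence intervals.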
33.  A COL1A1 Sp1 binding site polymorphism predisposes to osteoporotic fracture by affecting bone density and quality 
Journal of Clinical Investigation  2001;107(7):899-907.
Osteoporosis is a common disease with a strong genetic component. We previously described a polymorphic Sp1 binding site in the COL1A1 gene that has been associated with osteoporosis in several populations. Here we explore the molecular mechanisms underlying this association. A meta-analysis showed significant associations between COL1A1 “s” alleles and bone mineral density (BMD), body mass index (BMI), and osteoporotic fractures. The association with fracture was stronger than expected on the basis of the observed differences in BMD and BMI, suggesting an additional effect on bone strength. Gel shift assays showed increased binding affinity of the “s” allele for Sp1 protein, and primary RNA transcripts derived from the “s” allele were approximately three times more abundant than “S” allele–derived transcripts in “Ss” heterozygotes. Collagen produced from osteoblasts cultured from “Ss” heterozygotes had an increased ratio of α1(I) protein relative to α2(I), and this was accompanied by an increased ratio of COL1A1 mRNA relative to COL1A2. Finally, the yield strength of bone derived from “Ss” individuals was reduced when compared with bone derived from “SS” subjects. We conclude that the COL1A1 Sp1 polymorphism is a functional genetic variant that predisposes to osteoporosis by complex mechanisms involving changes in bone mass and bone quality.
PMCID: PMC199568  PMID: 11285309
34.  Linkage of Osteoporosis to Chromosome 20p12 and Association to BMP2 
PLoS Biology  2003;1(3):e69.
Osteoporotic fractures are a major cause of morbidity and mortality in ageing populations. Osteoporosis, defined as low bone mineral density (BMD) and associated fractures, has significant genetic components that are largely unknown. Linkage analysis in a large number of extended osteoporosis families in Iceland, using a phenotype that combines osteoporotic fractures and BMD measurements, showed linkage to Chromosome 20p12.3 (multipoint allele-sharing LOD, 5.10; p value, 6.3 × 10⁻⁷), results that are statistically significant after adjusting for the number of phenotypes tested and the genome-wide search. A follow-up association analysis using closely spaced polymorphic markers was performed. Three variants in the bone morphogenetic protein 2 (BMP2) gene, a missense polymorphism and two anonymous single nucleotide polymorphism haplotypes, were determined to be associated with osteoporosis in the Icelandic patients. The association is seen with many definitions of an osteoporotic phenotype, including osteoporotic fractures as well as low BMD, both before and after menopause. A replication study with a Danish cohort of postmenopausal women was conducted to confirm the contribution of the three identified variants. In conclusion, we find that a region on the short arm of Chromosome 20 contains a gene or genes that appear to be a major risk factor for osteoporosis and osteoporotic fractures, and our evidence supports the view that BMP2 is at least one of these genes.
Genetic analysis of Icelandic families and a replication study in a Danish population provide evidence that variation in the gene BMP2 might contribute to osteoporosis
PMCID: PMC270020  PMID: 14691541
35.  Transferability and Fine Mapping of Type 2 Diabetes Loci in African Americans 
Diabetes  2013;62(3):965-976.
Type 2 diabetes (T2D) disproportionally affects African Americans (AfA) but, to date, genetic variants identified from genome-wide association studies (GWAS) are primarily from European and Asian populations. We examined the single nucleotide polymorphism (SNP) and locus transferability of 40 reported T2D loci in six AfA GWAS consisting of 2,806 T2D case subjects with or without end-stage renal disease and 4,265 control subjects from the Candidate Gene Association Resource Plus Study. Our results revealed that seven index SNPs at the TCF7L2, KLF14, KCNQ1, ADCY5, CDKAL1, JAZF1, and GCKR loci were significantly associated with T2D (P < 0.05). The strongest association was observed at TCF7L2 rs7903146 (odds ratio [OR] 1.30; P = 6.86 × 10⁻⁸). Locus-wide analysis demonstrated significant associations (P_emp < 0.05) at regional best SNPs in the TCF7L2, KLF14, and HMGA2 loci as well as suggestive signals in KCNQ1 after correction for the effective number of SNPs at each locus. Of these loci, the regional best SNPs were in differential linkage disequilibrium (LD) with the index and adjacent SNPs. Our findings suggest that some loci discovered in prior reports affect T2D susceptibility in AfA with similar effect sizes. The reduced and differential LD pattern in AfA compared with European and Asian populations may facilitate fine mapping of causal variants at loci shared across populations.
PMCID: PMC3581206  PMID: 23193183
36.  Common variants at 6q22 and 17q21 are associated with intracranial volume 
Nature genetics  2012;44(5):539-544.
During aging, intracranial volume remains unchanged and represents maximally attained brain size, while various interacting biological phenomena lead to brain volume loss. Consequently, intracranial volume and brain volume in late life reflect different genetic influences. Our genome-wide association study in 8,175 community-dwelling elderly did not reveal any genome-wide significant associations (p < 5×10⁻⁸) for brain volume. In contrast, intracranial volume was significantly associated with two loci: rs4273712 (p = 3.4×10⁻¹¹), a known height locus on chromosome 6q22, and rs9915547, tagging the inversion on chromosome 17q21 (p = 1.5×10⁻¹²). We replicated the associations of these loci with intracranial volume in a separate sample of 1,752 older persons (p = 1.1×10⁻³ for 6q22 and p = 1.2×10⁻³ for 17q21). Furthermore, we also found suggestive associations of the 17q21 locus with head circumference in 10,768 children (mean age 14.5 months). Our data identify two loci associated with head size, with the inversion on 17q21 also likely involved in attaining maximal brain size.
PMCID: PMC3618290  PMID: 22504418
37.  Copy Number Variations in Alternative Splicing Gene Networks Impact Lifespan 
PLoS ONE  2013;8(1):e53846.
Longevity has a strong genetic component evidenced by family-based studies. Lipoprotein metabolism, FOXO proteins, and insulin/IGF-1 signaling pathways in model systems have shown polygenic variations predisposing to shorter lifespan. To test the hypothesis that rare variants could influence lifespan, we compared the rates of CNVs in healthy children (0–18 years of age) with individuals 67 years or older. CNVs at a significantly higher frequency in the pediatric cohort were considered risk variants impacting lifespan, while those enriched in the geriatric cohort were considered longevity protective variants. We performed a whole-genome CNV analysis on 7,313 children and 2,701 adults of European ancestry genotyped with 302,108 SNP probes. Positive findings were evaluated in an independent cohort of 2,079 pediatric and 4,692 geriatric subjects. We detected 8 deletions and 10 duplications that were enriched in the pediatric group (P = 3.33×10⁻⁸ to 1.6×10⁻² unadjusted), while only one duplication was enriched in the geriatric cohort (P = 6.3×10⁻⁴). Population stratification correction resulted in 5 deletions and 3 duplications remaining significant (P = 5.16×10⁻⁵ to 4.26×10⁻²) in the replication cohort. Three deletions and four duplications were significant combined (combined P = 3.7×10⁻⁴ to 3.9×10⁻²). All associated loci were experimentally validated using qPCR. Evaluation of these genes for pathway enrichment demonstrated ∼50% are involved in alternative splicing (P = 0.0077 Benjamini and Hochberg corrected). We conclude that genetic variations disrupting RNA splicing could have long-term biological effects impacting lifespan.
PMCID: PMC3559729  PMID: 23382853
38.  Genetic association analysis highlights new loci that modulate hematological trait variation in Caucasians and African Americans 
Human genetics  2010;129(3):307-317.
Red blood cell, white blood cell, and platelet measures, including their count, sub-type and volume, are important diagnostic and prognostic clinical parameters for several human diseases. To identify novel loci associated with hematological traits, and compare the architecture of these phenotypes between ethnic groups, the CARe Project genotyped 49,094 single nucleotide polymorphisms (SNPs) that capture variation in ~2,100 candidate genes in DNA of 23,439 Caucasians and 7,112 African Americans from five population-based cohorts. We found strong novel associations between erythrocyte phenotypes and the glucose-6-phosphate dehydrogenase (G6PD) A-allele in African Americans (rs1050828, P < 2.0 × 10⁻¹³, T-allele associated with lower red blood cell count, hemoglobin, and hematocrit, and higher mean corpuscular volume), and between platelet count and a SNP at the tropomyosin-4 (TPM4) locus (rs8109288, P = 3.0 × 10⁻⁷ in Caucasians; P = 3.0 × 10⁻⁷ in African Americans, T-allele associated with lower platelet count). We strongly replicated many genetic associations to blood cell phenotypes previously established in Caucasians. A common variant of the α-globin (HBA2-HBA1) locus was associated with red blood cell traits in African Americans, but not in Caucasians (rs1211375, P < 7 × 10⁻⁸, A-allele associated with lower hemoglobin, mean corpuscular hemoglobin, and mean corpuscular volume). Our results show similarities but also differences in the genetic regulation of hematological traits in European- and African-derived populations, and highlight the role of natural selection in shaping these differences.
PMCID: PMC3442357  PMID: 21153663
39.  A Genome-Wide Meta-Analysis of Six Type 1 Diabetes Cohorts Identifies Multiple Associated Loci 
PLoS Genetics  2011;7(9):e1002293.
Diabetes impacts approximately 200 million people worldwide, of whom approximately 10% are affected by type 1 diabetes (T1D). The application of genome-wide association studies (GWAS) has robustly revealed dozens of genetic contributors to the pathogenesis of T1D, with the most recent meta-analysis identifying in excess of 40 loci. To identify additional genetic loci for T1D susceptibility, we examined associations in the largest meta-analysis to date between the disease and ∼2.54 million SNPs in a combined cohort of 9,934 cases and 16,956 controls. Targeted follow-up of 53 SNPs in 1,120 affected trios uncovered three new loci associated with T1D that reached genome-wide significance. The most significantly associated SNP (rs539514, P = 5.66×10⁻¹¹) resides in an intronic region of the LMO7 (LIM domain only 7) gene on 13q22. The second most significantly associated SNP (rs478222, P = 3.50×10⁻⁹) resides in an intronic region of the EFR3B (protein EFR3 homolog B) gene on 2p23; however, the region of linkage disequilibrium is approximately 800 kb and harbors multiple additional genes, including NCOA1, C2orf79, CENPO, ADCY3, DNAJC27, POMC, and DNMT3A. The third most significantly associated SNP (rs924043, P = 8.06×10⁻⁹) lies in an intergenic region on 6q27, where the region of association is approximately 900 kb and harbors multiple genes including WDR27, C6orf120, PHF10, TCTE3, C6orf208, LOC154449, DLL1, FAM120B, PSMB1, TBP, and PCD2. These latest associated regions add to the growing repertoire of gene networks predisposing to T1D.
Author Summary
Despite the fact that there is clearly a large genetic component to type 1 diabetes (T1D), uncovering the genes contributing to this disease has proven challenging. However, in the past three years there has been relatively major progress in this regard, with advances in genetic screening technologies allowing investigators to scan the genome for variants conferring risk for disease without prior hypotheses. Such genome-wide association studies have revealed multiple regions of the genome to be robustly and consistently associated with T1D. More recent findings have been a consequence of combining multiple datasets from independent investigators in meta-analyses, which have more power to pick up additional variants contributing to the trait. In the current study, we describe the largest meta-analysis of T1D genome-wide genotyped datasets to date, which combines six large studies. As a consequence, we have uncovered three new signals residing at the chromosomal locations 13q22, 2p23, and 6q27, which went on to be replicated in independent sample sets. These latest associated regions add to the growing repertoire of gene networks predisposing to T1D.
PMCID: PMC3183083  PMID: 21980299
40.  Genome-Wide Association Study of White Blood Cell Count in 16,388 African Americans: the Continental Origins and Genetic Epidemiology Network (COGENT) 
PLoS Genetics  2011;7(6):e1002108.
Total white blood cell (WBC) and neutrophil counts are lower among individuals of African descent due to the common African-derived “null” variant of the Duffy Antigen Receptor for Chemokines (DARC) gene. Additional common genetic polymorphisms were recently associated with total WBC and WBC sub-type levels in European and Japanese populations. No additional loci that account for WBC variability have been identified in African Americans. In order to address this, we performed a large genome-wide association study (GWAS) of total WBC and cell subtype counts in 16,388 African-American participants from 7 population-based cohorts available in the Continental Origins and Genetic Epidemiology Network. In addition to the DARC locus on chromosome 1q23, we identified two other regions (chromosomes 4q13 and 16q22) associated with WBC in African Americans (P < 2.5×10⁻⁸). The lead SNP (rs9131) on chromosome 4q13 is located in the CXCL2 gene, which encodes a chemotactic cytokine for polymorphonuclear leukocytes. Independent evidence of the novel CXCL2 association with WBC was present in 3,551 Hispanic Americans, 14,767 Japanese, and 19,509 European Americans. The index SNP (rs12149261) on chromosome 16q22 associated with WBC count is located in a large inter-chromosomal segmental duplication encompassing part of the hydrocephalus inducing homolog (HYDIN) gene. We demonstrate that the chromosome 16q22 association finding is most likely due to a genotyping artifact as a consequence of sequence similarity between duplicated regions on chromosomes 16q22 and 1q21. Among the WBC loci recently identified in European or Japanese populations, replication was observed in our African-American meta-analysis for rs445 of CDK6 on chromosome 7q21 and rs4065321 of PSMD3-CSF3 region on chromosome 17q21. In summary, the CXCL2, CDK6, and PSMD3-CSF3 regions are associated with WBC count in African American and other populations. We also demonstrate that large inter-chromosomal duplications can result in false positive associations in GWAS.
Author Summary
Although recent genome-wide association studies have identified common genetic variants associated with total white blood cell (WBC) and WBC sub-type counts in European and Japanese ancestry populations, whether these or other loci account for differences in WBC count among African Americans is unknown. By examining >16,000 African Americans, we show that, in addition to the previously identified Duffy Antigen Receptor for Chemokines (DARC) locus on chromosome 1, another variant, rs9131, and other nearby variants on human chromosome 4 are associated with total WBC count in African Americans. The variants span the CXCL2 gene, which encodes an inflammatory mediator involved in WBC production and migration. We show that the association is not restricted to African Americans but is also present in independent samples of European Americans, Hispanic Americans, and Japanese. This finding is potentially important because WBC mediate or have altered counts in a variety of acute and chronic disorders.
PMCID: PMC3128101  PMID: 21738479
41.  Comparative genetic analysis of inflammatory bowel disease and type 1 diabetes implicates multiple loci with opposite effects 
Human Molecular Genetics  2010;19(10):2059-2067.
Inflammatory bowel disease, including Crohn's disease (CD) and ulcerative colitis (UC), and type 1 diabetes (T1D) are autoimmune diseases that may share common susceptibility pathways. We examined known susceptibility loci for these diseases in a cohort of 1689 CD cases, 777 UC cases, 989 T1D cases and 6197 shared control subjects of European ancestry, who were genotyped by the Illumina HumanHap550 SNP arrays. We identified multiple previously unreported or unconfirmed disease associations, including known CD loci (ICOSLG and TNFSF15) and T1D loci (TNFAIP3) that confer UC risk, known UC loci (HERC2 and IL26) that confer T1D risk and known UC loci (IL10 and CCNY) that confer CD risk. Additionally, we show that T1D risk alleles residing at the PTPN22, IL27, IL18RAP and IL10 loci protect against CD. Furthermore, the strongest risk alleles for T1D within the major histocompatibility complex (MHC) confer strong protection against CD and UC; however, given the multi-allelic nature of the MHC haplotypes, sequencing of the MHC locus will be required to interpret this observation. These results extend our current knowledge on genetic variants that predispose to autoimmunity, and suggest that many loci involved in autoimmunity may be under a balancing selection due to antagonistic pleiotropic effect. Our analysis implies that variants with opposite effects on different diseases may facilitate the maintenance of common susceptibility alleles in human populations, making autoimmune diseases especially amenable to genetic dissection by genome-wide association studies.
PMCID: PMC2860894  PMID: 20176734
42.  Duplication of the SLIT3 Locus on 5q35.1 Predisposes to Major Depressive Disorder 
PLoS ONE  2010;5(12):e15463.
Major depressive disorder (MDD) is a common psychiatric and behavioral disorder. To discover novel variants conferring risk to MDD, we conducted a whole-genome scan of copy number variation (CNV), including 1,693 MDD cases and 4,506 controls genotyped on the Perlegen 600K platform. The most significant locus was observed on 5q35.1, harboring the SLIT3 gene (P = 2×10⁻³). Extending the controls with 30,000 subjects typed on the Illumina 550 k array, we found the CNV to remain exclusive to MDD cases (P = 3.2×10⁻⁹). Duplication was observed in 5 unrelated MDD cases encompassing 646 kb with highly similar breakpoints. SLIT3 is integral to repulsive axon guidance based on binding to Roundabout receptors. Duplication of 5q35.1 is a highly penetrant variation accounting for 0.7% of the subset of 647 cases harboring large CNVs, using a threshold of a minimum of 10 SNPs and 100 kb. This study leverages a large dataset of MDD cases and controls for the analysis of CNVs with matched platform and ethnicity. SLIT3 duplication is a novel association which explains a definitive proportion of the largely unknown etiology of MDD.
PMCID: PMC2995745  PMID: 21152026
43.  Common variations in BARD1 influence susceptibility to high-risk neuroblastoma 
Nature genetics  2009;41(6):718-723.
We conducted a SNP-based genome-wide association study (GWAS) focused on the high-risk subset of neuroblastoma [1]. As our previous unbiased GWAS showed strong association of common 6p22 SNP alleles with aggressive neuroblastoma [2], we now restricted our analysis to 397 high-risk cases compared to 2,043 controls. We detected new significant association of six SNPs at 2q35 within the BARD1 gene locus (P_allelic = 2.35×10−9 to 2.25×10−8). Each SNP association was confirmed in a second series of 189 high-risk cases and 1,178 controls (P_allelic = 7.90×10−7 to 2.77×10−4). The two most significant SNPs (rs6435862, rs3768716) were also tested in two additional independent high-risk neuroblastoma case series, yielding combined allelic odds-ratios of 1.68 each (P = 8.65×10−18 and 2.74×10−16, respectively). Significant association was also found with known BARD1 nsSNPs. These data show that common variation in BARD1 contributes to the etiology of the aggressive and most clinically relevant subset of human neuroblastoma.
PMCID: PMC2753610  PMID: 19412175
44.  Loci on 20q13 and 21q22 are associated with pediatric-onset inflammatory bowel disease 
Nature genetics  2008;40(10):1211-1215.
Inflammatory bowel disease (IBD) is a common inflammatory disorder with complex etiology that involves both genetic and environmental triggers, including but not limited to defects in bacterial clearance, defective mucosal barrier and persistent dysregulation of the immune response to commensal intestinal bacteria. IBD is characterized by two distinct phenotypes: Crohn’s disease (CD) and ulcerative colitis (UC). Previously reported GWA studies have identified genetic variation accounting for a small portion of the overall genetic susceptibility to CD and an even smaller contribution to UC pathogenesis. We hypothesized that stratification of IBD by age of onset might identify additional genes associated with IBD. To that end, we carried out a GWA analysis in a cohort of 1,011 individuals with pediatric-onset IBD and 4,250 matched controls. We identified and replicated significantly associated, previously unreported loci on chromosomes 20q13 (rs2315008[T] and rs4809330[A]; P = 6.30 × 10−8 and 6.95 × 10−8, respectively; odds ratio (OR) = 0.74 for both) and 21q22 (rs2836878[A]; P = 6.01 × 10−8; OR = 0.73), located close to the TNFRSF6B and PSMG1 genes, respectively.
PMCID: PMC2770437  PMID: 18758464
45.  From Disease Association to Risk Assessment: An Optimistic View from Genome-Wide Association Studies on Type 1 Diabetes 
PLoS Genetics  2009;5(10):e1000678.
Genome-wide association studies (GWAS) have been fruitful in identifying disease susceptibility loci for common and complex diseases. A remaining question is whether we can quantify individual disease risk based on genotype data, in order to facilitate personalized prevention and treatment for complex diseases. Previous studies have typically failed to achieve satisfactory performance, primarily due to the use of only a limited number of confirmed susceptibility loci. Here we propose that sophisticated machine-learning approaches with a large ensemble of markers may improve the performance of disease risk assessment. We applied a Support Vector Machine (SVM) algorithm on a GWAS dataset generated on the Affymetrix genotyping platform for type 1 diabetes (T1D) and optimized a risk assessment model with hundreds of markers. We subsequently tested this model on an independent Illumina-genotyped dataset with imputed genotypes (1,008 cases and 1,000 controls), as well as a separate Affymetrix-genotyped dataset (1,529 cases and 1,458 controls), resulting in area under ROC curve (AUC) of ∼0.84 in both datasets. In contrast, poor performance was achieved when limited to dozens of known susceptibility loci in the SVM model or logistic regression model. Our study suggests that improved disease risk assessment can be achieved by using algorithms that take into account interactions between a large ensemble of markers. We are optimistic that genotype-based disease risk assessment may be feasible for diseases where a notable proportion of the risk has already been captured by SNP arrays.
Author Summary
An often touted utility of genome-wide association studies (GWAS) is that the resulting discoveries can facilitate implementation of personalized medicine, in which preventive and therapeutic interventions for complex diseases can be tailored to individual genetic profiles. However, recent studies using whole-genome SNP genotype data for disease risk assessment have generally failed to achieve satisfactory results, leading to a pessimistic view of the utility of genotype data for such purposes. Here we propose that sophisticated machine-learning approaches on a large ensemble of markers, which contain both confirmed and as yet unconfirmed disease susceptibility variants, may improve the performance of disease risk assessment. We tested an algorithm called Support Vector Machine (SVM) on three large-scale datasets for type 1 diabetes and demonstrated that risk assessment can be highly accurate for the disease. Our results suggest that individualized disease risk assessment using whole-genome data may be more successful for some diseases (such as T1D) than other diseases. However, the predictive accuracy will depend on the heritability of the disease under study, the proportion of the genetic risk that is known, and whether the right set of markers and the right algorithms are used.
PMCID: PMC2748686  PMID: 19816555
46.  Copy number variation at 1q21.1 associated with neuroblastoma 
Nature  2009;459(7249):987-991.
Common copy number variations (CNVs) represent a significant source of genetic diversity, yet their influence on phenotypic variability, including disease susceptibility, remains poorly understood. To address this problem in cancer, we performed a genome-wide association study (GWAS) of CNVs in the childhood cancer neuroblastoma, a disease where SNP variations are known to influence susceptibility [1,2]. We first genotyped 846 Caucasian neuroblastoma patients and 803 healthy Caucasian controls at 550,000 single nucleotide polymorphisms, and performed a CNV-based test for association. We then replicated significant observations in two independent sample sets comprised of a total of 595 cases and 3,357 controls. We identified a common CNV at 1q21.1 associated with neuroblastoma in the discovery set, which was confirmed in both replication sets (P_combined = 2.97 × 10−17; OR = 2.49, 95% CI: 2.02 to 3.05). This CNV was validated by quantitative PCR, fluorescent in situ hybridization, and analysis of matched tumor specimens, and was shown to be heritable in an independent set of 713 cancer-free trios. We identified a novel transcript within the CNV which showed high sequence similarity to several “Neuroblastoma breakpoint family” (NBPF) genes [3,4] and represents a new member of this gene family (NBPFX). This transcript was preferentially expressed in fetal brain and fetal sympathetic nervous tissues, and expression level was strictly correlated with CNV state in neuroblastoma cells. These data demonstrate that inherited copy number variation at 1q21.1 is associated with neuroblastoma and implicate a novel NBPF gene in early tumorigenesis of this childhood cancer.
PMCID: PMC2755253  PMID: 19536264
47.  A genome-wide association study identifies a susceptibility locus to clinically aggressive neuroblastoma at 6p22 
The New England journal of medicine  2008;358(24):2585-2593.
Neuroblastoma is a malignancy of the developing sympathetic nervous system that most commonly affects young children and is often lethal. The etiology of this embryonal cancer is not known.
We performed a genome-wide association study by first genotyping 1,032 neuroblastoma patients and 2,043 controls of European descent using the Illumina HumanHap550 BeadChip. Three independent groups of neuroblastoma cases (N=720) and controls (N=2128) were then genotyped to replicate significant associations.
We observed highly significant association between neuroblastoma and the common minor alleles of three single nucleotide polymorphisms (SNPs) within a 94.2 kilobase (Kb) linkage disequilibrium block at chromosome band 6p22 containing the predicted genes FLJ22536 and FLJ44180 (P-value range = 1.71×10−9 to 7.01×10−10; allelic odds ratio range 1.39-1.40). Homozygosity for the at-risk G allele of the most significantly associated SNP, rs6939340, resulted in an increased likelihood of developing neuroblastoma of 1.97 (95% CI 1.58-2.44). Subsequent genotyping of these 6p22 SNPs in the three independent case series confirmed our observation of association (P=9.33×10−15 at rs6939340 for joint analysis). Furthermore, neuroblastoma patients homozygous for the risk alleles at 6p22 were more likely to develop metastatic (Stage 4) disease (P=0.02), show amplification of the MYCN oncogene in the tumor cells (P=0.006), and to have disease relapse (P=0.01).
Common genetic variation at chromosome band 6p22 is associated with susceptibility to neuroblastoma.
PMCID: PMC2742373  PMID: 18463370
48.  Genome-Wide Analyses of Exonic Copy Number Variants in a Family-Based Study Point to Novel Autism Susceptibility Genes 
PLoS Genetics  2009;5(6):e1000536.
The genetics underlying the autism spectrum disorders (ASDs) is complex and remains poorly understood. Previous work has demonstrated an important role for structural variation in a subset of cases, but has lacked the resolution necessary to move beyond detection of large regions of potential interest to identification of individual genes. To pinpoint genes likely to contribute to ASD etiology, we performed high density genotyping in 912 multiplex families from the Autism Genetics Resource Exchange (AGRE) collection and contrasted results to those obtained for 1,488 healthy controls. Through prioritization of exonic deletions (eDels), exonic duplications (eDups), and whole gene duplication events (gDups), we identified more than 150 loci harboring rare variants in multiple unrelated probands, but no controls. Importantly, 27 of these were confirmed on examination of an independent replication cohort comprised of 859 cases and an additional 1,051 controls. Rare variants at known loci, including exonic deletions at NRXN1 and whole gene duplications encompassing UBE3A and several other genes in the 15q11–q13 region, were observed in the course of these analyses. Strong support was likewise observed for previously unreported genes such as BZRAP1, an adaptor molecule known to regulate synaptic transmission, with eDels or eDups observed in twelve unrelated cases but no controls (p = 2.3×10−5). Less is known about MDGA2, likewise observed to be case-specific (p = 1.3×10−4). But, it is notable that the encoded protein shows an unexpectedly high similarity to Contactin 4 (BLAST E-value = 3×10−39), which has also been linked to disease. That hundreds of distinct rare variants were each seen only once further highlights complexity in the ASDs and points to the continued need for larger cohorts.
Author Summary
Autism spectrum disorders (ASDs) are common neurodevelopmental syndromes with a strong genetic component. ASDs are characterized by disturbances in social behavior, impaired verbal and nonverbal communication, as well as repetitive behaviors and/or a restricted range of interests. To identify genes likely to contribute to ASD etiology, we performed high density genotyping in 912 multiplex families from the Autism Genetics Resource Exchange (AGRE) collection and contrasted results to those obtained for 1,488 healthy controls. To enrich for variants most likely to interfere with gene function, we restricted our analyses to deletions and gains encompassing exons. Of the many genomic regions highlighted, 27 were seen to harbor rare variants in cases and not controls, both in the first phase of our analysis, and also in an independent replication cohort comprised of 859 cases and 1,051 controls. More work in a larger number of individuals will be required to determine which of the rare alleles highlighted here are indeed related to the ASDs and how they act to shape risk.
PMCID: PMC2695001  PMID: 19557195
