1.  Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci 
Nature genetics  2010;42(6):508-514.
To identify novel genetic risk factors for rheumatoid arthritis (RA), we conducted a genome-wide association study (GWAS) meta-analysis of 5,539 autoantibody-positive RA cases and 20,169 controls of European descent, followed by replication in an independent set of 6,768 RA cases and 8,806 controls. Of 34 SNPs selected for replication, 7 novel RA risk alleles were identified at genome-wide significance (P<5×10−8) in analysis of all 41,282 samples. The associated SNPs are near genes of known immune function, including IL6ST, SPRED2, RBPJ, CCR6, IRF5, and PXK. We also refined the risk alleles at two established RA risk loci (IL2RA and CCL21) and confirmed the association at AFF3. These new associations bring the total number of confirmed RA risk loci to 31 among individuals of European ancestry. An additional 11 SNPs replicated at P<0.05, many of which are validated autoimmune risk alleles, suggesting that most represent bona fide RA risk alleles.
PMCID: PMC4243840  PMID: 20453842
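As a quick consistency check on the sample sizes quoted in the abstract above, the discovery and replication counts can be summed and compared with the stated total of 41,282. This is only an illustrative sketch; the dictionary layout and the helper name `total_samples` are assumptions made here, not anything taken from the paper.

```python
# Sample counts exactly as reported in the abstract.
discovery = {"cases": 5539, "controls": 20169}
replication = {"cases": 6768, "controls": 8806}


def total_samples(*stages):
    """Sum cases and controls over any number of study stages (hypothetical helper)."""
    return sum(stage["cases"] + stage["controls"] for stage in stages)


total = total_samples(discovery, replication)
print(f"combined samples: {total:,}")
```

The sum matches the figure given for the analysis of all samples, so the four counts and the total are mutually consistent.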
2.  Assessing the phenotypic effects in the general population of rare variants in genes for a dominant mendelian form of diabetes 
Nature genetics  2013;45(11):1380-1385.
Genome sequencing can identify individuals in the general population who harbor rare coding variants in genes for Mendelian disorders [1–7] and who consequently may have increased disease risk. However, previous studies of rare variants in phenotypically extreme individuals have ascertainment bias and may demonstrate inflated effect size estimates [8–12]. We sequenced seven genes for maturity-onset diabetes of the young (MODY) [13] in well-phenotyped population samples [14,15] (n=4,003). Rare variants were filtered according to prediction criteria used to identify disease-causing mutations: i) previously reported in MODY, and ii) stringent de novo thresholds satisfied (rare, conserved, protein damaging). Approximately 1.5% and 0.5% of randomly selected Framingham and Jackson Heart Study individuals carried variants from these two classes, respectively. However, the vast majority of carriers remained euglycemic through middle age. Accurate estimates of variant effect sizes from population-based sequencing are needed to avoid falsely predicting a significant fraction of individuals as at risk for MODY or other Mendelian diseases.
PMCID: PMC4051627  PMID: 24097065
3.  Association of Blood Lipids with Common DNA Sequence Variants at Nineteen Genetic Loci in the Multiethnic United States National Health and Nutrition Examination Survey III 
Background
Using the genome-wide association approach in individuals of European ancestry, we and others recently identified single nucleotide polymorphisms (SNPs) at 19 loci as associated with blood lipids; eight of these loci were novel. Whether these same SNPs associate with lipids in a broader range of ethnicities is unknown.
Methods and Results
We genotyped index SNPs at 19 loci in the Third United States National Health and Nutrition Examination Survey (n=7159), a population-based probability sample of the U.S. comprised primarily of non-Hispanic blacks, Mexican Americans, and non-Hispanic whites. We constructed ethnic-specific residual blood lipid levels after adjusting for age and gender. Ethnic-specific linear regression was used to test the association of genotype with blood lipids. To summarize the statistical evidence across three racial groups, we conducted a fixed-effects variance-weighted meta-analysis.
After exclusions, there were 1627 non-Hispanic blacks, 1659 Mexican Americans, and 2230 non-Hispanic whites. At five loci (1p13 near CELSR2/PSRC1/SORT1, HMGCR, CETP, LPL, and APOA5), the index SNP was associated with low-density lipoprotein cholesterol, high-density lipoprotein cholesterol, or triglycerides in all three ethnic groups. At the remaining loci, there was mixed evidence by ethnic group. In meta-analysis, we found that, at 14 of the 19 loci, SNPs exceeded a nominal P < 0.05.
Conclusions
At five loci including the recently discovered region on 1p13 near CELSR2/PSRC1/SORT1, the same SNP discovered in whites associates with blood lipids in non-Hispanic blacks and Mexican Americans. For the remaining loci, fine-mapping and resequencing will be required to definitively evaluate the relevance of each locus in individuals of African and Hispanic ancestries.
PMCID: PMC3561731  PMID: 20031591
lipids; genetics; epidemiology; risk factors
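The fixed-effects variance-weighted meta-analysis mentioned in the abstract above can be sketched as inverse-variance pooling of per-group regression estimates. This is a minimal illustration under stated assumptions, not the authors' code: the function name `fixed_effects_meta`, the normal approximation for the two-sided P value, and the three effect sizes and standard errors below are all hypothetical and do not come from the study.

```python
import math


def fixed_effects_meta(betas, ses):
    """Pool per-group effect estimates with inverse-variance (fixed-effects) weights.

    betas: per-group regression coefficients
    ses:   matching standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]  # weight = 1 / variance
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P value under a standard normal approximation (an assumption).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical per-group estimates for one SNP (illustrative values only).
betas = [0.10, 0.20, 0.15]
ses = [0.05, 0.10, 0.05]

pooled_beta, pooled_se, z, p = fixed_effects_meta(betas, ses)
print(f"pooled beta={pooled_beta:.4f} se={pooled_se:.4f} z={z:.2f} p={p:.2e}")
```

Groups with smaller standard errors receive larger weights, so the pooled estimate is driven mostly by the most precisely measured groups.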
4.  Genetic variants at CD28, PRDM1, and CD2/CD58 are associated with rheumatoid arthritis risk 
Nature genetics  2009;41(12):1313-1318.
To discover novel RA risk loci, we systematically examined 370 SNPs from 179 independent loci with p<0.001 in a published meta-analysis of RA GWAS of 3,393 cases and 12,462 controls [1]. We used GRAIL [2], a computational method that applies statistical text mining to PubMed abstracts, to score these 179 loci for functional relationships to genes in 16 established RA disease loci [1,3-11]. We identified 22 loci with a significant degree of functional connectivity. We genotyped 22 representative SNPs in an independent set of 7,957 cases and 11,958 matched controls. Three validate convincingly: CD2/CD58 (rs11586238, p=1×10−6 replication, p=1×10−9 overall), CD28 (rs1980422, p=5×10−6 replication, p=1×10−9 overall), and PRDM1 (rs548234, p=1×10−5 replication, p=2×10−8 overall). An additional four replicate (p<0.0023): TAGAP (rs394581, p=0.0002 replication, p=4×10−7 overall), PTPRC (rs10919563, p=0.0003 replication, p=7×10−7 overall), TRAF6/RAG1 (rs540386, p=0.0008 replication, p=4×10−6 overall), and FCGR2A (rs12746613, p=0.0022 replication, p=2×10−5 overall). Many of these loci are also associated with other immunologic diseases.
PMCID: PMC3142887  PMID: 19898481
5.  A common variant of HMGA2 is associated with adult and childhood height in the general population 
Nature genetics  2007;39(10):1245-1250.
Human height is a classic, highly heritable quantitative trait. To begin to identify genetic variants influencing height, we examined genome-wide association data from 4,921 individuals. Common variants in the HMGA2 oncogene, exemplified by rs1042725, were associated with height (P = 4 × 10−8). HMGA2 is also a strong biological candidate for height, as rare, severe mutations in this gene alter body size in mice and humans, so we tested rs1042725 in additional samples. We confirmed the association in 19,064 adults from four further studies (P = 3 × 10−11, overall P = 4 × 10−16, including the genome-wide association data). We also observed the association in children (P = 1 × 10−6, N = 6,827) and a tall/short case-control study (P = 4 × 10−6, N = 3,207). We estimate that rs1042725 explains ~0.3% of population variation in height (~0.4 cm increased adult height per C allele). There are few examples of common genetic variants reproducibly associated with human quantitative traits; these results represent, to our knowledge, the first consistently replicated association with adult and childhood height.
PMCID: PMC3086278  PMID: 17767157
6.  High-throughput, pooled sequencing identifies mutations in NUBPL and FOXRED1 in human complex I deficiency 
Nature genetics  2010;42(10):851-858.
Discovering the molecular basis of mitochondrial respiratory chain disease is challenging given the large number of both mitochondrial and nuclear genes involved. We report a strategy of focused candidate gene prediction, high-throughput sequencing, and experimental validation to uncover the molecular basis of mitochondrial complex I (CI) disorders. We created five pools of DNA from a cohort of 103 patients and then performed deep sequencing of 103 candidate genes to spotlight 151 rare variants predicted to impact protein function. We used confirmatory experiments to establish genetic diagnoses in 22% of previously unsolved cases, and discovered that defects in NUBPL and FOXRED1 can cause CI deficiency. Our study illustrates how large-scale sequencing, coupled with functional prediction and experimental validation, can reveal novel disease-causing mutations in individual patients.
PMCID: PMC2977978  PMID: 20818383
7.  Biological, Clinical, and Population Relevance of 95 Loci for Blood Lipids 
Teslovich, Tanya M. | Musunuru, Kiran | Smith, Albert V. | Edmondson, Andrew C. | Stylianou, Ioannis M. | Koseki, Masahiro | Pirruccello, James P. | Ripatti, Samuli | Chasman, Daniel I. | Willer, Cristen J. | Johansen, Christopher T. | Fouchier, Sigrid W. | Isaacs, Aaron | Peloso, Gina M. | Barbalic, Maja | Ricketts, Sally L. | Bis, Joshua C. | Aulchenko, Yurii S. | Thorleifsson, Gudmar | Feitosa, Mary F. | Chambers, John | Orho-Melander, Marju | Melander, Olle | Johnson, Toby | Li, Xiaohui | Guo, Xiuqing | Li, Mingyao | Cho, Yoon Shin | Go, Min Jin | Kim, Young Jin | Lee, Jong-Young | Park, Taesung | Kim, Kyunga | Sim, Xueling | Ong, Rick Twee-Hee | Croteau-Chonka, Damien C. | Lange, Leslie A. | Smith, Joshua D. | Song, Kijoung | Zhao, Jing Hua | Yuan, Xin | Luan, Jian'an | Lamina, Claudia | Ziegler, Andreas | Zhang, Weihua | Zee, Robert Y.L. | Wright, Alan F. | Witteman, Jacqueline C.M. | Wilson, James F. | Willemsen, Gonneke | Wichmann, H-Erich | Whitfield, John B. | Waterworth, Dawn M. | Wareham, Nicholas J. | Waeber, Gérard | Vollenweider, Peter | Voight, Benjamin F. | Vitart, Veronique | Uitterlinden, Andre G. | Uda, Manuela | Tuomilehto, Jaakko | Thompson, John R. | Tanaka, Toshiko | Surakka, Ida | Stringham, Heather M. | Spector, Tim D. | Soranzo, Nicole | Smit, Johannes H. | Sinisalo, Juha | Silander, Kaisa | Sijbrands, Eric J.G. | Scuteri, Angelo | Scott, James | Schlessinger, David | Sanna, Serena | Salomaa, Veikko | Saharinen, Juha | Sabatti, Chiara | Ruokonen, Aimo | Rudan, Igor | Rose, Lynda M. | Roberts, Robert | Rieder, Mark | Psaty, Bruce M. | Pramstaller, Peter P. | Pichler, Irene | Perola, Markus | Penninx, Brenda W.J.H. | Pedersen, Nancy L. | Pattaro, Cristian | Parker, Alex N. | Pare, Guillaume | Oostra, Ben A. | O'Donnell, Christopher J. | Nieminen, Markku S. | Nickerson, Deborah A. | Montgomery, Grant W. | Meitinger, Thomas | McPherson, Ruth | McCarthy, Mark I. | McArdle, Wendy | Masson, David | Martin, Nicholas G. 
| Marroni, Fabio | Mangino, Massimo | Magnusson, Patrik K.E. | Lucas, Gavin | Luben, Robert | Loos, Ruth J. F. | Lokki, Maisa | Lettre, Guillaume | Langenberg, Claudia | Launer, Lenore J. | Lakatta, Edward G. | Laaksonen, Reijo | Kyvik, Kirsten O. | Kronenberg, Florian | König, Inke R. | Khaw, Kay-Tee | Kaprio, Jaakko | Kaplan, Lee M. | Johansson, Åsa | Jarvelin, Marjo-Riitta | Janssens, A. Cecile J.W. | Ingelsson, Erik | Igl, Wilmar | Hovingh, G. Kees | Hottenga, Jouke-Jan | Hofman, Albert | Hicks, Andrew A. | Hengstenberg, Christian | Heid, Iris M. | Hayward, Caroline | Havulinna, Aki S. | Hastie, Nicholas D. | Harris, Tamara B. | Haritunians, Talin | Hall, Alistair S. | Gyllensten, Ulf | Guiducci, Candace | Groop, Leif C. | Gonzalez, Elena | Gieger, Christian | Freimer, Nelson B. | Ferrucci, Luigi | Erdmann, Jeanette | Elliott, Paul | Ejebe, Kenechi G. | Döring, Angela | Dominiczak, Anna F. | Demissie, Serkalem | Deloukas, Panagiotis | de Geus, Eco J.C. | de Faire, Ulf | Crawford, Gabriel | Collins, Francis S. | Chen, Yii-der I. | Caulfield, Mark J. | Campbell, Harry | Burtt, Noel P. | Bonnycastle, Lori L. | Boomsma, Dorret I. | Boekholdt, S. Matthijs | Bergman, Richard N. | Barroso, Inês | Bandinelli, Stefania | Ballantyne, Christie M. | Assimes, Themistocles L. | Quertermous, Thomas | Altshuler, David | Seielstad, Mark | Wong, Tien Y. | Tai, E-Shyong | Feranil, Alan B. | Kuzawa, Christopher W. | Adair, Linda S. | Taylor, Herman A. | Borecki, Ingrid B. | Gabriel, Stacey B. | Wilson, James G. | Stefansson, Kari | Thorsteinsdottir, Unnur | Gudnason, Vilmundur | Krauss, Ronald M. | Mohlke, Karen L. | Ordovas, Jose M. | Munroe, Patricia B. | Kooner, Jaspal S. | Tall, Alan R. | Hegele, Robert A. | Kastelein, John J.P. | Schadt, Eric E. | Rotter, Jerome I. | Boerwinkle, Eric | Strachan, David P. | Mooser, Vincent | Holm, Hilma | Reilly, Muredach P. | Samani, Nilesh J | Schunkert, Heribert | Cupples, L. Adrienne | Sandhu, Manjinder S. 
| Ridker, Paul M | Rader, Daniel J. | van Duijn, Cornelia M. | Peltonen, Leena | Abecasis, Gonçalo R. | Boehnke, Michael | Kathiresan, Sekar
Nature  2010;466(7307):707-713.
Serum concentrations of total cholesterol, low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglycerides (TG) are among the most important risk factors for coronary artery disease (CAD) and are targets for therapeutic intervention. We screened the genome for common variants associated with serum lipids in >100,000 individuals of European ancestry. Here we report 95 significantly associated loci (P < 5 × 10-8), with 59 showing genome-wide significant association with lipid traits for the first time. The newly reported associations include single nucleotide polymorphisms (SNPs) near known lipid regulators (e.g., CYP7A1, NPC1L1, and SCARB1) as well as in scores of loci not previously implicated in lipoprotein metabolism. The 95 loci contribute not only to normal variation in lipid traits but also to extreme lipid phenotypes and impact lipid traits in three non-European populations (East Asians, South Asians, and African Americans). Our results identify several novel loci associated with serum lipids that are also associated with CAD. Finally, we validated three of the novel genes—GALNT2, PPP1R3B, and TTC39B—with experiments in mouse models. Taken together, our findings provide the foundation to develop a broader biological understanding of lipoprotein metabolism and to identify new therapeutic opportunities for the prevention of CAD.
PMCID: PMC3039276  PMID: 20686565
8.  Genetic Variants Near TNFAIP3 on 6q23 are Associated with Systemic Lupus Erythematosus (SLE) 
Nature genetics  2008;40(9):1059-1061.
SLE is an autoimmune disease influenced by genetic and environmental components. We performed a genome-wide association scan (GWAS) and observed novel association evidence with a variant in TNFAIP3 (rs5029939, P = 2.89×10−12, OR = 2.29). We also found evidence of two independent signals of association to SLE risk, including one described in Rheumatoid Arthritis. These results establish that genetic variation in TNFAIP3 contributes to differential risk for SLE and RA.
PMCID: PMC2772171  PMID: 19165918
9.  Two independent alleles at 6q23 associated with risk of rheumatoid arthritis 
Nature genetics  2007;39(12):1477-1482.
To identify susceptibility alleles associated with rheumatoid arthritis, we genotyped 397 individuals with rheumatoid arthritis for 116,204 SNPs and carried out an association analysis in comparison to publicly available genotype data for 1,211 related individuals from the Framingham Heart Study [1]. After evaluating and adjusting for technical and population biases, we identified a SNP at 6q23 (rs10499194, ∼150 kb from TNFAIP3 and OLIG3) that was reproducibly associated with rheumatoid arthritis both in the genome-wide association (GWA) scan and in 5,541 additional case-control samples (P = 10−3, GWA scan; P < 10−6, replication; P = 10−9, combined). In a concurrent study, the Wellcome Trust Case Control Consortium (WTCCC) has reported strong association of rheumatoid arthritis susceptibility to a different SNP located 3.8 kb from rs10499194 (rs6920220; P = 5 × 10−6 in WTCCC) [2]. We show that these two SNP associations are statistically independent, are each reproducible in the comparison of our data and WTCCC data, and define risk and protective haplotypes for rheumatoid arthritis at 6q23.
PMCID: PMC2652744  PMID: 17982456
10.  A comprehensive analysis of common genetic variation in prolactin (PRL) and PRL receptor (PRLR) genes in relation to plasma prolactin levels and breast cancer risk: the Multiethnic Cohort 
BMC Medical Genetics  2007;8:72.
Studies in animals and humans clearly indicate a role for prolactin (PRL) in breast epithelial proliferation, differentiation, and tumorigenesis. Prospective epidemiological studies have also shown that women with higher circulating PRL levels have an increase in risk of breast cancer, suggesting that variability in PRL may also be important in determining a woman's risk.
We evaluated genetic variation in the PRL and PRL receptor (PRLR) genes as predictors of plasma PRL levels and breast cancer risk among African-American, Native Hawaiian, Japanese-American, Latina, and White women in the Multiethnic Cohort Study (MEC). We selected single nucleotide polymorphisms (SNPs) from both the public (dbSNP) and private (Celera) databases to construct high density SNP maps that included up to 20 kilobases (kb) upstream of the transcription initiation site and 10 kb downstream of the last exon of each gene, for a total coverage of 59 kb in PRL and 210 kb in PRLR. We genotyped 80 SNPs in PRL and 173 SNPs in PRLR in a multiethnic panel of 349 unaffected subjects to characterize linkage disequilibrium (LD) and haplotype patterns. We sequenced the coding regions of PRL and PRLR in 95 advanced breast cancer cases (19 of each racial/ethnic group) to uncover putative functional variation. A total of 33 and 60 haplotype "tag" SNPs (tagSNPs) that allowed for high predictability (R_h^2 ≥ 0.70) of the common haplotypes in PRL and PRLR, respectively, were then genotyped in a multiethnic breast cancer case-control study of 1,615 invasive breast cancer cases and 1,962 controls in the MEC. We also assessed the association of common genetic variation with circulating PRL levels in 362 postmenopausal controls without a history of hormone therapy use at blood draw. Because of the large number of comparisons being performed, we used a relatively stringent type I error criterion (p < 0.0005) for evaluating the significance of any single association to correct for performing approximately 100 independent tests, close to the number of tagSNPs genotyped for both genes.
We observed no significant associations between PRL and PRLR haplotypes or individual SNPs in relation to breast cancer risk. A nominally significant association was noted between prolactin levels and a tagSNP (tagSNP 44, rs2244502) in intron 1 of PRL. This SNP showed approximately a 50% increase in levels between minor allele homozygotes vs. major allele homozygotes. However, this association was not significant (p = 0.002) using our type I error criterion to correct for multiple testing, nor was this SNP associated with breast cancer risk (p = 0.58).
In this comprehensive analysis covering 59 kb of the PRL locus and 210 kb of the PRLR locus, we found no significant association between common variation in these candidate genes and breast cancer risk or plasma PRL levels. The LD characterization of PRL and PRLR in this multiethnic population provides a framework for studying these genes in relation to other disease outcomes that have been associated with PRL, as well as for larger studies of plasma PRL levels.
PMCID: PMC2219987  PMID: 18053149
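The multiple-testing threshold used in the abstract above (p < 0.0005 for approximately 100 independent tests) is what a Bonferroni-style correction would give. The sketch below illustrates that arithmetic and applies it to the two reported P values. The family-wise alpha of 0.05 is an assumption made here (the abstract states only the threshold and the approximate number of tests), and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def is_significant(p_value, alpha=0.05, n_tests=100):
    """True if p_value passes the corrected threshold (alpha=0.05 is assumed)."""
    return p_value < bonferroni_threshold(alpha, n_tests)


threshold = bonferroni_threshold(0.05, 100)

# P values reported in the abstract for tagSNP 44 (rs2244502).
p_prolactin_levels = 0.002
p_breast_cancer = 0.58

print(f"corrected threshold: {threshold:.4f}")
print("prolactin levels significant:", is_significant(p_prolactin_levels))
print("breast cancer risk significant:", is_significant(p_breast_cancer))
```

Under these assumptions the prolactin-level association would pass a nominal 0.05 cutoff but not the corrected one, in line with the abstract's description of it as nominally significant only.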
11.  Genetic Variation in the HSD17B1 Gene and Risk of Prostate Cancer 
PLoS Genetics  2005;1(5):e68.
Steroid hormones are believed to play an important role in prostate carcinogenesis, but epidemiological evidence linking prostate cancer and steroid hormone genes has been inconclusive, in part due to small sample sizes or incomplete characterization of genetic variation at the locus of interest. Here we report on the results of a comprehensive study of the association between HSD17B1 and prostate cancer by the Breast and Prostate Cancer Cohort Consortium, a large collaborative study. HSD17B1 encodes 17β-hydroxysteroid dehydrogenase 1, an enzyme that converts dehydroepiandrosterone to the testosterone precursor Δ5-androstene-3β,17β-diol and converts estrone to estradiol. The Breast and Prostate Cancer Cohort Consortium researchers systematically characterized variation in HSD17B1 by targeted resequencing and dense genotyping; selected haplotype-tagging single nucleotide polymorphisms (htSNPs) that efficiently predict common variants in U.S. and European whites, Latinos, Japanese Americans, and Native Hawaiians; and genotyped these htSNPs in 8,290 prostate cancer cases and 9,367 study-, age-, and ethnicity-matched controls. We found no evidence that HSD17B1 htSNPs (including the nonsynonymous coding SNP S312G) or htSNP haplotypes were associated with risk of prostate cancer or tumor stage in the pooled multiethnic sample or in U.S. and European whites. Analyses stratified by age, body mass index, and family history of disease found no subgroup-specific associations between these HSD17B1 htSNPs and prostate cancer. We found significant evidence of heterogeneity in associations between HSD17B1 haplotypes and prostate cancer across ethnicity: one haplotype had a significant (p < 0.002) inverse association with risk of prostate cancer in Latinos and Japanese Americans but showed no evidence of association in African Americans, Native Hawaiians, or whites. However, the smaller numbers of Latinos and Japanese Americans in this study make these subgroup analyses less reliable. These results suggest that the germline variants in HSD17B1 characterized by these htSNPs do not substantially influence the risk of prostate cancer in U.S. and European whites.
Steroid hormones such as estrogen and testosterone are hypothesized to play a role in the development of cancer. This is the first substantive paper from the Breast and Prostate Cancer Cohort Consortium, a large, international study designed to assess the effect of variation in genes that influence hormone production and activity on the risk of breast and prostate cancer. The investigators first constructed a detailed map of genetic variation spanning HSD17B1, a gene involved in the production of estrogen and testosterone. This enabled them to efficiently measure common variation across the whole gene, capturing information about both known variants with a plausible function and unknown variants with an unknown function. Because of the results with a large number of study participants, the investigators could rule out strong associations between common HSD17B1 variants and risk of prostate cancer among U.S. and European whites. While this sheds some light on the carcinogenic effects of one enzyme involved in the complex process of steroid hormone production, it remains to be determined whether variants in other genes play a more important role or if the combined effects of several genes within these pathways have a larger impact.
PMCID: PMC1287955  PMID: 16311626

