1.  DGIdb - Mining the druggable genome 
Nature methods  2013;10(12):10.1038/nmeth.2689.
The Drug-Gene Interaction database (DGIdb) mines existing resources that generate hypotheses about how mutated genes might be targeted therapeutically or prioritized for drug development. It provides an interface for searching lists of genes against a compendium of drug-gene interactions and potentially druggable genes. DGIdb can be accessed at
PMCID: PMC3851581  PMID: 24122041
Nature genetics  2013;45(3):242-252.
The genetic basis of hypodiploid acute lymphoblastic leukemia (ALL), a subtype of ALL characterized by aneuploidy and poor outcome, is unknown. Genomic profiling of 124 hypodiploid ALL cases, including whole genome and exome sequencing of 40 cases, identified two subtypes that differ in severity of aneuploidy, transcriptional profile and submicroscopic genetic alterations. Near haploid cases with 24–31 chromosomes harbor alterations targeting receptor tyrosine kinase- and Ras signaling (71%) and the lymphoid transcription factor IKZF3 (AIOLOS; 13%). In contrast, low hypodiploid ALL with 32–39 chromosomes are characterized by TP53 alterations (91.2%) which are commonly present in non-tumor cells, and alterations of IKZF2 (HELIOS; 53%) and RB1 (41%). Both near haploid and low hypodiploid tumors exhibit activation of Ras- and PI3K signaling pathways, and are sensitive to PI3K inhibitors, indicating that these drugs should be explored as a new therapeutic strategy for this aggressive form of leukemia.
PMCID: PMC3919793  PMID: 23334668
3.  RB1 gene inactivation by chromothripsis in human retinoblastoma 
Oncotarget  2014;5(2):438-450.
Retinoblastoma is a rare childhood cancer of the developing retina. Most retinoblastomas initiate with biallelic inactivation of the RB1 gene through diverse mechanisms including point mutations, nucleotide insertions, deletions, loss of heterozygosity and promoter hypermethylation. Recently, a novel mechanism of retinoblastoma initiation was proposed. Gallie and colleagues discovered that a small proportion of retinoblastomas lacked RB1 mutations and had MYCN amplification [1]. In this study, we identified recurrent chromosomal, regional and focal genomic lesions in 94 primary retinoblastomas with their matched normal DNA using SNP 6.0 chips. We also analyzed the RB1 gene mutations and compared the mechanism of RB1 inactivation to the recurrent copy number variations in the retinoblastoma genome. In addition to the previously described focal amplification of MYCN and deletions in RB1 and BCOR, we also identified recurrent focal amplification of OTX2, a transcription factor required for retinal photoreceptor development. We identified 10 retinoblastomas in our cohort that lacked RB1 point mutations or indels. We performed whole genome sequencing on those 10 tumors and their corresponding germline DNA. In one of the tumors, the RB1 gene was unaltered, the MYCN gene was amplified and RB1 protein was expressed in the nuclei of the tumor cells. In addition, several tumors had complex patterns of structural variations and we identified 3 tumors with chromothripsis at the RB1 locus. This is the first report of chromothripsis as a mechanism for RB1 gene inactivation in cancer.
PMCID: PMC3964219  PMID: 24509483
chromothripsis; retinoblastoma; RB1; MYCN
4.  Whole-genome sequencing identifies genetic alterations in pediatric low-grade gliomas 
Nature genetics  2013;45(6):602-612.
The commonest pediatric brain tumors are low-grade gliomas (LGGs). We utilized whole genome sequencing to discover multiple novel genetic alterations involving BRAF, RAF1, FGFR1, MYB, MYBL1 and genes with histone-related functions, including H3F3A and ATRX, in 39 LGGs and low-grade glioneuronal tumors (LGGNTs). Only a single non-silent somatic alteration was detected in 24/39 (62%) tumors. Intragenic duplications of the FGFR1 tyrosine kinase domain (TKD) and rearrangements of MYB were recurrent and mutually exclusive in 53% of grade II diffuse LGGs. Transplantation of Trp53-null neonatal astrocytes containing TKD-duplicated FGFR1 into brains of nude mice generated high-grade astrocytomas with short latency and 100% penetrance. TKD-duplicated FGFR1 induced FGFR1 autophosphorylation and upregulation of the MAPK/ERK and PI3K pathways, which could be blocked by specific inhibitors. Focusing on the therapeutically challenging diffuse LGGs, our study of 151 tumors has discovered genetic alterations and potential therapeutic targets across the entire range of pediatric LGGs/LGGNTs.
PMCID: PMC3727232  PMID: 23583981
Cell  2012;150(6):1121-1134.
We report the results of whole genome and transcriptome sequencing of tumor and adjacent normal tissue samples from 17 patients with non-small cell lung carcinoma (NSCLC). We identified 3,726 point mutations and over 90 indels in the coding sequence, with an average mutation frequency more than 10-fold higher in smokers than in never-smokers. Novel alterations in genes involved in chromatin modification and DNA repair pathways were identified along with DACH1, CFTR, RELN, ABCB5, and HGF. Deep digital sequencing revealed diverse clonality patterns in both never-smokers and smokers. All validated EGFR and KRAS mutations were present in the founder clones, suggesting possible roles in cancer initiation. Analysis revealed 14 fusions including ROS1 and ALK as well as novel metabolic enzymes. Cell cycle and JAK-STAT pathways are significantly altered in lung cancer along with perturbations in 54 genes that are potentially targetable with currently available drugs.
PMCID: PMC3656590  PMID: 22980976
6.  The origin and evolution of mutations in Acute Myeloid Leukemia 
Cell  2012;150(2):264-278.
Most mutations in cancer genomes are thought to be acquired after the initiating event, which may cause genomic instability, driving clonal evolution. However, for acute myeloid leukemia (AML), normal karyotypes are common, and genomic instability is unusual. To better understand clonal evolution in AML, we sequenced the genomes of AML samples with a known initiating event (PML-RARA) vs. normal karyotype AML samples, and the exomes of hematopoietic stem/progenitor cells (HSPCs) from healthy people. Collectively, the data suggest that most of the mutations found in AML genomes are actually random events that occurred in HSPCs before they acquired the initiating mutation; the mutational history of that cell is “captured” as the clone expands. In many cases, only one or two additional, cooperating mutations are needed to generate the malignant founding clone. Cells from the founding clone can acquire additional cooperating mutations, yielding subclones that can contribute to disease progression and/or relapse.
PMCID: PMC3407563  PMID: 22817890
7.  Novel mutations target distinct subgroups of medulloblastoma 
Nature  2012;488(7409):43-48.
Medulloblastoma is a malignant childhood brain tumour comprising four discrete subgroups. To identify mutations that drive medulloblastoma we sequenced the entire genomes of 37 tumours and matched normal blood. One hundred and thirty-six genes harbouring somatic mutations in this discovery set were sequenced in an additional 56 medulloblastomas. Recurrent mutations were detected in 41 genes not yet implicated in medulloblastoma: several target distinct components of the epigenetic machinery in different disease subgroups, e.g., regulators of H3K27 and H3K4 trimethylation in subgroup-3 and 4 (e.g., KDM6A and ZMYM3), and CTNNB1-associated chromatin remodellers in WNT-subgroup tumours (e.g., SMARCA4 and CREBBP). Modelling of mutations in mouse lower rhombic lip progenitors that generate WNT-subgroup tumours, identified genes that maintain this cell lineage (DDX3X) as well as mutated genes that initiate (CDH1) or cooperate (PIK3CA) in tumourigenesis. These data provide important new insights into the pathogenesis of medulloblastoma subgroups and highlight targets for therapeutic development.
PMCID: PMC3412905  PMID: 22722829
8.  SomaticSniper: identification of somatic point mutations in whole genome sequencing data 
Bioinformatics  2011;28(3):311-317.
Motivation: The sequencing of tumors and their matched normals is frequently used to study the genetic composition of cancer. Despite this fact, there remains a dearth of available software tools designed to compare sequences in pairs of samples and identify sites that are likely to be unique to one sample.
Results: In this article, we describe the mathematical basis of our SomaticSniper software for comparing tumor and normal pairs. We estimate its sensitivity and precision, and present several common sources of error resulting in miscalls.
Availability and implementation: Binaries are freely available for download at, implemented in C and supported on Linux and Mac OS X.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3268238  PMID: 22155872
9.  Whole Genome Analysis Informs Breast Cancer Response to Aromatase Inhibition 
Nature  2012;486(7403):353-360.
To correlate the variable clinical features of estrogen receptor positive (ER+) breast cancer with somatic alterations, we studied pre-treatment tumour biopsies accrued from patients in a study of neoadjuvant aromatase inhibitor (AI) therapy by massively parallel sequencing and analysis. Eighteen significantly mutated genes were identified, including five genes (RUNX1, CBFB, MYH9, MLL3 and SF3B1) previously linked to hematopoietic disorders. Mutant MAP3K1 was associated with Luminal A status, low grade histology and low proliferation rates, whereas mutant TP53 was associated with the opposite pattern. Moreover, mutant GATA3 correlated with suppression of proliferation upon AI treatment. Pathway analysis demonstrated that mutations in MAP2K4, a MAP3K1 substrate, produced similar perturbations as MAP3K1 loss. Distinct phenotypes in ER+ breast cancer are associated with specific patterns of somatic mutations that map into cellular pathways linked to tumor biology, but most recurrent mutations are relatively infrequent. Prospective clinical trials based on these findings will require comprehensive genome sequencing.
PMCID: PMC3383766  PMID: 22722193
10.  A framework for human microbiome research 
Methé, Barbara A. | Nelson, Karen E. | Pop, Mihai | Creasy, Heather H. | Giglio, Michelle G. | Huttenhower, Curtis | Gevers, Dirk | Petrosino, Joseph F. | Abubucker, Sahar | Badger, Jonathan H. | Chinwalla, Asif T. | Earl, Ashlee M. | FitzGerald, Michael G. | Fulton, Robert S. | Hallsworth-Pepin, Kymberlie | Lobos, Elizabeth A. | Madupu, Ramana | Magrini, Vincent | Martin, John C. | Mitreva, Makedonka | Muzny, Donna M. | Sodergren, Erica J. | Versalovic, James | Wollam, Aye M. | Worley, Kim C. | Wortman, Jennifer R. | Young, Sarah K. | Zeng, Qiandong | Aagaard, Kjersti M. | Abolude, Olukemi O. | Allen-Vercoe, Emma | Alm, Eric J. | Alvarado, Lucia | Andersen, Gary L. | Anderson, Scott | Appelbaum, Elizabeth | Arachchi, Harindra M. | Armitage, Gary | Arze, Cesar A. | Ayvaz, Tulin | Baker, Carl C. | Begg, Lisa | Belachew, Tsegahiwot | Bhonagiri, Veena | Bihan, Monika | Blaser, Martin J. | Bloom, Toby | Vivien Bonazzi, J. | Brooks, Paul | Buck, Gregory A. | Buhay, Christian J. | Busam, Dana A. | Campbell, Joseph L. | Canon, Shane R. | Cantarel, Brandi L. | Chain, Patrick S. | Chen, I-Min A. | Chen, Lei | Chhibba, Shaila | Chu, Ken | Ciulla, Dawn M. | Clemente, Jose C. | Clifton, Sandra W. | Conlan, Sean | Crabtree, Jonathan | Cutting, Mary A. | Davidovics, Noam J. | Davis, Catherine C. | DeSantis, Todd Z. | Deal, Carolyn | Delehaunty, Kimberley D. | Dewhirst, Floyd E. | Deych, Elena | Ding, Yan | Dooling, David J. | Dugan, Shannon P. | Dunne, Wm. Michael | Durkin, A. Scott | Edgar, Robert C. | Erlich, Rachel L. | Farmer, Candace N. | Farrell, Ruth M. | Faust, Karoline | Feldgarden, Michael | Felix, Victor M. | Fisher, Sheila | Fodor, Anthony A. | Forney, Larry | Foster, Leslie | Di Francesco, Valentina | Friedman, Jonathan | Friedrich, Dennis C. | Fronick, Catrina C. | Fulton, Lucinda L. | Gao, Hongyu | Garcia, Nathalia | Giannoukos, Georgia | Giblin, Christina | Giovanni, Maria Y. | Goldberg, Jonathan M. 
| Goll, Johannes | Gonzalez, Antonio | Griggs, Allison | Gujja, Sharvari | Haas, Brian J. | Hamilton, Holli A. | Harris, Emily L. | Hepburn, Theresa A. | Herter, Brandi | Hoffmann, Diane E. | Holder, Michael E. | Howarth, Clinton | Huang, Katherine H. | Huse, Susan M. | Izard, Jacques | Jansson, Janet K. | Jiang, Huaiyang | Jordan, Catherine | Joshi, Vandita | Katancik, James A. | Keitel, Wendy A. | Kelley, Scott T. | Kells, Cristyn | Kinder-Haake, Susan | King, Nicholas B. | Knight, Rob | Knights, Dan | Kong, Heidi H. | Koren, Omry | Koren, Sergey | Kota, Karthik C. | Kovar, Christie L. | Kyrpides, Nikos C. | La Rosa, Patricio S. | Lee, Sandra L. | Lemon, Katherine P. | Lennon, Niall | Lewis, Cecil M. | Lewis, Lora | Ley, Ruth E. | Li, Kelvin | Liolios, Konstantinos | Liu, Bo | Liu, Yue | Lo, Chien-Chi | Lozupone, Catherine A. | Lunsford, R. Dwayne | Madden, Tessa | Mahurkar, Anup A. | Mannon, Peter J. | Mardis, Elaine R. | Markowitz, Victor M. | Mavrommatis, Konstantinos | McCorrison, Jamison M. | McDonald, Daniel | McEwen, Jean | McGuire, Amy L. | McInnes, Pamela | Mehta, Teena | Mihindukulasuriya, Kathie A. | Miller, Jason R. | Minx, Patrick J. | Newsham, Irene | Nusbaum, Chad | O’Laughlin, Michelle | Orvis, Joshua | Pagani, Ioanna | Palaniappan, Krishna | Patel, Shital M. | Pearson, Matthew | Peterson, Jane | Podar, Mircea | Pohl, Craig | Pollard, Katherine S. | Priest, Margaret E. | Proctor, Lita M. | Qin, Xiang | Raes, Jeroen | Ravel, Jacques | Reid, Jeffrey G. | Rho, Mina | Rhodes, Rosamond | Riehle, Kevin P. | Rivera, Maria C. | Rodriguez-Mueller, Beltran | Rogers, Yu-Hui | Ross, Matthew C. | Russ, Carsten | Sanka, Ravi K. | Pamela Sankar, J. | Sathirapongsasuti, Fah | Schloss, Jeffery A. | Schloss, Patrick D. | Schmidt, Thomas M. | Scholz, Matthew | Schriml, Lynn | Schubert, Alyxandria M. | Segata, Nicola | Segre, Julia A. | Shannon, William D. | Sharp, Richard R. | Sharpton, Thomas J. | Shenoy, Narmada | Sheth, Nihar U. | Simone, Gina A. 
| Singh, Indresh | Smillie, Chris S. | Sobel, Jack D. | Sommer, Daniel D. | Spicer, Paul | Sutton, Granger G. | Sykes, Sean M. | Tabbaa, Diana G. | Thiagarajan, Mathangi | Tomlinson, Chad M. | Torralba, Manolito | Treangen, Todd J. | Truty, Rebecca M. | Vishnivetskaya, Tatiana A. | Walker, Jason | Wang, Lu | Wang, Zhengyuan | Ward, Doyle V. | Warren, Wesley | Watson, Mark A. | Wellington, Christopher | Wetterstrand, Kris A. | White, James R. | Wilczek-Boney, Katarzyna | Wu, Yuan Qing | Wylie, Kristine M. | Wylie, Todd | Yandava, Chandri | Ye, Liang | Ye, Yuzhen | Yooseph, Shibu | Youmans, Bonnie P. | Zhang, Lan | Zhou, Yanjiao | Zhu, Yiming | Zoloth, Laurie | Zucker, Jeremy D. | Birren, Bruce W. | Gibbs, Richard A. | Highlander, Sarah K. | Weinstock, George M. | Wilson, Richard K. | White, Owen
Nature  2012;486(7402):215-221.
A variety of microbial communities and their genes (microbiome) exist throughout the human body, playing fundamental roles in human health and disease. The NIH-funded Human Microbiome Project (HMP) Consortium has established a population-scale framework which catalyzed significant development of metagenomic protocols, resulting in a broad range of quality-controlled resources and data including standardized methods for creating, processing and interpreting distinct types of high-throughput metagenomic data available to the scientific community. Here we present resources from a population of 242 healthy adults sampled at 15 to 18 body sites up to three times, which, to date, have generated 5,177 microbial taxonomic profiles from 16S rRNA genes and over 3.5 Tb of metagenomic sequence. In parallel, approximately 800 human-associated reference genomes have been sequenced. Collectively, these data represent the largest resource to date describing the abundance and variety of the human microbiome, while providing a platform for current and future studies.
PMCID: PMC3377744  PMID: 22699610
11.  Clonal Architecture of Secondary Acute Myeloid Leukemia 
The New England Journal of Medicine  2012;366(12):1090-1098.
The myelodysplastic syndromes are a group of hematologic disorders that often evolve into secondary acute myeloid leukemia (AML). The genetic changes that underlie progression from the myelodysplastic syndromes to secondary AML are not well understood.
We performed whole-genome sequencing of seven paired samples of skin and bone marrow in seven subjects with secondary AML to identify somatic mutations specific to secondary AML. We then genotyped a bone marrow sample obtained during the antecedent myelodysplastic-syndrome stage from each subject to determine the presence or absence of the specific somatic mutations. We identified recurrent mutations in coding genes and defined the clonal architecture of each pair of samples from the myelodysplastic-syndrome stage and the secondary-AML stage, using the allele burden of hundreds of mutations.
Approximately 85% of bone marrow cells were clonal in the myelodysplastic-syndrome and secondary-AML samples, regardless of the myeloblast count. The secondary-AML samples contained mutations in 11 recurrently mutated genes, including 4 genes that have not been previously implicated in the myelodysplastic syndromes or AML. In every case, progression to acute leukemia was defined by the persistence of an antecedent founding clone containing 182 to 660 somatic mutations and the outgrowth or emergence of at least one subclone, harboring dozens to hundreds of new mutations. All founding clones and subclones contained at least one mutation in a coding gene.
Nearly all the bone marrow cells in patients with myelodysplastic syndromes and secondary AML are clonally derived. Genetic evolution of secondary AML is a dynamic process shaped by multiple cycles of mutation acquisition and clonal selection. Recurrent gene mutations are found in both founding clones and daughter subclones. (Funded by the National Institutes of Health and others.)
PMCID: PMC3320218  PMID: 22417201
12.  Clonal evolution in relapsed acute myeloid leukemia revealed by whole genome sequencing 
Nature  2012;481(7382):506-510.
Most patients with acute myeloid leukemia (AML) die from progressive disease after relapse, which is associated with clonal evolution at the cytogenetic level [1,2]. To determine the mutational spectrum associated with relapse, we sequenced the primary tumor and relapse genomes from 8 AML patients, and validated hundreds of somatic mutations using deep sequencing; this allowed us to precisely define clonality and clonal evolution patterns at relapse. Besides discovering novel, recurrently mutated genes (e.g. WAC, SMC3, DIS3, DDX41, and DAXX) in AML, we found two major clonal evolution patterns during AML relapse: 1) the founding clone in the primary tumor gained mutations and evolved into the relapse clone, or 2) a subclone of the founding clone survived initial therapy, gained additional mutations, and expanded at relapse. In all cases, chemotherapy failed to eradicate the founding clone. The comparison of relapse-specific vs. primary tumor mutations in all 8 cases revealed an increase in transversions, probably due to DNA damage caused by cytotoxic chemotherapy. These data demonstrate that AML relapse is associated with the addition of new mutations and clonal evolution, which is shaped in part by the chemotherapy that the patients receive to establish and maintain remissions.
PMCID: PMC3267864  PMID: 22237025
13.  A Novel Retinoblastoma Therapy from Genomic and Epigenetic Analyses 
Nature  2012;481(7381):329-334.
Retinoblastoma is an aggressive childhood cancer of the developing retina that is initiated by the biallelic loss of the RB1 gene. To identify the mutations that cooperate with RB1 loss, we performed whole-genome sequencing of retinoblastomas. The overall mutational rate was very low; RB1 was the only known cancer gene mutated. We then evaluated RB1’s role in genome stability and considered nongenetic mechanisms of cancer pathway deregulation. Here we show that the retinoblastoma genome is stable, but multiple cancer pathways can be epigenetically deregulated. For example, the proto-oncogene SYK is upregulated in retinoblastoma and is required for tumor cell survival. Targeting SYK with a small-molecule inhibitor induced retinoblastoma tumor cell death in vitro and in vivo. Thus, RB1 inactivation may allow preneoplastic cells to acquire multiple hallmarks of cancer through epigenetic mechanisms, resulting directly or indirectly from RB1 loss. These data provide novel targets for chemotherapeutic interventions of retinoblastoma.
PMCID: PMC3289956  PMID: 22237022
14.  The genetic basis of early T-cell precursor acute lymphoblastic leukaemia 
Nature  2012;481(7380):157-163.
Early T-cell precursor acute lymphoblastic leukaemia (ETP ALL) is an aggressive malignancy of unknown genetic basis. We performed whole-genome sequencing of 12 ETP ALL cases and assessed the frequency of the identified somatic mutations in 94 T-cell acute lymphoblastic leukaemia cases. ETP ALL was characterized by activating mutations in genes regulating cytokine receptor and RAS signalling (67% of cases; NRAS, KRAS, FLT3, IL7R, JAK3, JAK1, SH2B3 and BRAF), inactivating lesions disrupting haematopoietic development (58%; GATA3, ETV6, RUNX1, IKZF1 and EP300) and histone-modifying genes (48%; EZH2, EED, SUZ12, SETD2 and EP300). We also identified new targets of recurrent mutation including DNM2, ECT2L and RELN. The mutational spectrum is similar to myeloid tumours, and moreover, the global transcriptional profile of ETP ALL was similar to that of normal and myeloid leukaemia haematopoietic stem cells. These findings suggest that addition of myeloid-directed therapies might improve the poor outcome of ETP ALL.
PMCID: PMC3267575  PMID: 22237106
Nature Genetics  2011;44(1):53-57.
Myelodysplastic syndromes (MDS) are hematopoietic stem cell disorders that often progress to chemotherapy-resistant secondary acute myeloid leukemia (sAML). We used whole genome sequencing to perform an unbiased comprehensive screen to discover all the somatic mutations in a sAML sample and genotyped these loci in the matched MDS sample. Here we show that a missense mutation affecting the serine at codon 34 (S34) in U2AF1 was recurrently mutated in 13/150 (8.7%) de novo MDS patients, with suggestive evidence of an associated increased risk of progression to sAML. U2AF1 is a U2 auxiliary factor protein that recognizes the AG splice acceptor dinucleotide at the 3′ end of introns and mutations are located in highly conserved zinc fingers in U2AF1 [1,2]. Mutant U2AF1 promotes enhanced splicing and exon skipping in reporter assays in vitro. This novel, recurrent mutation in U2AF1 implicates altered pre-mRNA splicing as a potential mechanism for MDS pathogenesis.
PMCID: PMC3247063  PMID: 22158538
16.  Recurring Mutations Found by Sequencing an Acute Myeloid Leukemia Genome 
The New England journal of medicine  2009;361(11):1058-1066.
The full complement of DNA mutations that are responsible for the pathogenesis of acute myeloid leukemia (AML) is not yet known.
We used massively parallel DNA sequencing to obtain a very high level of coverage (approximately 98%) of a primary, cytogenetically normal, de novo genome for AML with minimal maturation (AML-M1) and a matched normal skin genome.
We identified 12 acquired (somatic) mutations within the coding sequences of genes and 52 somatic point mutations in conserved or regulatory portions of the genome. All mutations appeared to be heterozygous and present in nearly all cells in the tumor sample. Four of the 64 mutations occurred in at least 1 additional AML sample in 188 samples that were tested. Mutations in NRAS and NPM1 had been identified previously in patients with AML, but two other mutations had not been identified. One of these mutations, in the IDH1 gene, was present in 15 of 187 additional AML genomes tested and was strongly associated with normal cytogenetic status; it was present in 13 of 80 cytogenetically normal samples (16%). The other was a nongenic mutation in a genomic region with regulatory potential and conservation in higher mammals; we detected it in one additional AML tumor. The AML genome that we sequenced contains approximately 750 point mutations, of which only a small fraction are likely to be relevant to pathogenesis.
By comparing the sequences of tumor and skin genomes of a patient with AML-M1, we have identified recurring mutations that may be relevant for pathogenesis.
PMCID: PMC3201812  PMID: 19657110
17.  DNMT3A Mutations in Acute Myeloid Leukemia 
The New England journal of medicine  2010;363(25):2424-2433.
The genetic alterations responsible for an adverse outcome in most patients with acute myeloid leukemia (AML) are unknown.
Using massively parallel DNA sequencing, we identified a somatic mutation in DNMT3A, encoding a DNA methyltransferase, in the genome of cells from a patient with AML with a normal karyotype. We sequenced the exons of DNMT3A in 280 additional patients with de novo AML to define recurring mutations.
A total of 62 of 281 patients (22.1%) had mutations in DNMT3A that were predicted to affect translation. We identified 18 different missense mutations, the most common of which was predicted to affect amino acid R882 (in 37 patients). We also identified six frameshift, six nonsense, and three splice-site mutations and a 1.5-Mbp deletion encompassing DNMT3A. These mutations were highly enriched in the group of patients with an intermediate-risk cytogenetic profile (56 of 166 patients, or 33.7%) but were absent in all 79 patients with a favorable-risk cytogenetic profile (P<0.001 for both comparisons). The median overall survival among patients with DNMT3A mutations was significantly shorter than that among patients without such mutations (12.3 months vs. 41.1 months, P<0.001). DNMT3A mutations were associated with adverse outcomes among patients with an intermediate-risk cytogenetic profile or FLT3 mutations, regardless of age, and were independently associated with a poor outcome in Cox proportional-hazards analysis.
DNMT3A mutations are highly recurrent in patients with de novo AML with an intermediate-risk cytogenetic profile and are independently associated with a poor outcome. (Funded by the National Institutes of Health and others.)
PMCID: PMC3201818  PMID: 21067377
18.  The identification of a novel TP53 cancer susceptibility mutation through whole genome sequencing of a patient with therapy-related AML 
The identification of patients with inherited cancer susceptibility syndromes facilitates early diagnosis, prevention, and treatment. However, in many cases of suspected cancer susceptibility, the family history is unclear and genetic testing of common cancer susceptibility genes is unrevealing.
To apply whole-genome sequencing to a patient with suspected cancer susceptibility (and lacking a clear family history of cancer and no BRCA1 and BRCA2 mutations) to identify rare or novel germline variants in cancer susceptibility genes.
Design, Setting, and Participant
Skin (normal) and bone marrow (leukemia) DNA were obtained from a patient with early-onset breast and ovarian cancer and therapy-related acute myeloid leukemia (t-AML), and analyzed with: 1) whole genome sequencing using paired end reads; 2) SNP genotyping; 3) RNA expression profiling; and 4) spectral karyotyping.
Main Outcome Measures
Structural variants, copy number alterations, single nucleotide variants and small insertions and deletions (indels) were detected and validated using the above platforms.
Whole genome sequencing revealed a novel, heterozygous 3 Kb deletion removing exons 7-9 of TP53 in the patient’s normal skin DNA, which was homozygous in the leukemia DNA as a result of uniparental disomy. In addition, a total of 28 validated somatic single nucleotide variations or indels in coding genes, 8 somatic structural variants, and 12 somatic copy number alterations were detected in the patient’s leukemia genome.
Whole genome sequencing can identify novel, cryptic variants in cancer susceptibility genes in addition to providing unbiased information on the spectrum of mutations in a cancer genome.
PMCID: PMC3170052  PMID: 21505135
19.  A vertebrate case study of the quality of assemblies derived from next-generation sequences 
Genome Biology  2011;12(3):R31.
The unparalleled efficiency of next-generation sequencing (NGS) has prompted widespread adoption, but significant problems remain in the use of NGS data for whole genome assembly. We explore the advantages and disadvantages of chicken genome assemblies generated using a variety of sequencing and assembly methodologies. NGS assemblies are equivalent in some ways to a Sanger-based assembly yet deficient in others. Nonetheless, these assemblies are sufficient for the identification of the majority of genes and can reveal novel sequences when compared to existing assembly references.
PMCID: PMC3129681  PMID: 21453517
20.  Genome Remodeling in a Basal-like Breast Cancer Metastasis and Xenograft 
Nature  2010;464(7291):999-1005.
Massively parallel DNA sequencing technologies provide an unprecedented ability to screen entire genomes for genetic changes associated with tumor progression. Here we describe the genomic analyses of four DNA samples from an African-American patient with basal-like breast cancer: peripheral blood, the primary tumor, a brain metastasis, and a xenograft derived from the primary tumor. The metastasis contained two de novo mutations and a large deletion not present in the primary tumor, and was significantly enriched for 20 shared mutations. The xenograft retained all primary tumor mutations, and displayed a mutation enrichment pattern that paralleled the metastasis (16 of 20 genes). Two overlapping large deletions, encompassing CTNNA1, were present in all three tumor samples. The differential mutation frequencies and structural variation patterns in metastasis and xenograft compared to the primary tumor suggest that secondary tumors may arise from a minority of cells within the primary.
PMCID: PMC2872544  PMID: 20393555
21.  Somatic mutations affect key pathways in lung adenocarcinoma 
Ding, Li | Getz, Gad | Wheeler, David A. | Mardis, Elaine R. | McLellan, Michael D. | Cibulskis, Kristian | Sougnez, Carrie | Greulich, Heidi | Muzny, Donna M. | Morgan, Margaret B. | Fulton, Lucinda | Fulton, Robert S. | Zhang, Qunyuan | Wendl, Michael C. | Lawrence, Michael S. | Larson, David E. | Chen, Ken | Dooling, David J. | Sabo, Aniko | Hawes, Alicia C. | Shen, Hua | Jhangiani, Shalini N. | Lewis, Lora R. | Hall, Otis | Zhu, Yiming | Mathew, Tittu | Ren, Yanru | Yao, Jiqiang | Scherer, Steven E. | Clerc, Kerstin | Metcalf, Ginger A. | Ng, Brian | Milosavljevic, Aleksandar | Gonzalez-Garay, Manuel L. | Osborne, John R. | Meyer, Rick | Shi, Xiaoqi | Tang, Yuzhu | Koboldt, Daniel C. | Lin, Ling | Abbott, Rachel | Miner, Tracie L. | Pohl, Craig | Fewell, Ginger | Haipek, Carrie | Schmidt, Heather | Dunford-Shore, Brian H. | Kraja, Aldi | Crosby, Seth D. | Sawyer, Christopher S. | Vickery, Tammi | Sander, Sacha | Robinson, Jody | Winckler, Wendy | Baldwin, Jennifer | Chirieac, Lucian R. | Dutt, Amit | Fennell, Tim | Hanna, Megan | Johnson, Bruce E. | Onofrio, Robert C. | Thomas, Roman K. | Tonon, Giovanni | Weir, Barbara A. | Zhao, Xiaojun | Ziaugra, Liuda | Zody, Michael C. | Giordano, Thomas | Orringer, Mark B. | Roth, Jack A. | Spitz, Margaret R. | Wistuba, Ignacio I. | Ozenberger, Bradley | Good, Peter J. | Chang, Andrew C. | Beer, David G. | Watson, Mark A. | Ladanyi, Marc | Broderick, Stephen | Yoshizawa, Akihiko | Travis, William D. | Pao, William | Province, Michael A. | Weinstock, George M. | Varmus, Harold E. | Gabriel, Stacey B. | Lander, Eric S. | Gibbs, Richard A. | Meyerson, Matthew | Wilson, Richard K.
Nature  2008;455(7216):1069-1075.
Determining the genetic basis of cancer requires comprehensive analyses of large collections of histopathologically well-classified primary tumours. Here we report the results of a collaborative study to discover somatic mutations in 188 human lung adenocarcinomas. DNA sequencing of 623 genes with known or potential relationships to cancer revealed more than 1,000 somatic mutations across the samples. Our analysis identified 26 genes that are mutated at significantly high frequencies and thus are probably involved in carcinogenesis. The frequently mutated genes include tyrosine kinases, among them the EGFR homologue ERBB4; multiple ephrin receptor genes, notably EPHA3; vascular endothelial growth factor receptor KDR; and NTRK genes. These data provide evidence of somatic mutations in primary lung adenocarcinoma for several tumour suppressor genes involved in other cancers—including NF1, APC, RB1 and ATM—and for sequence changes in PTPRD as well as the frequently deleted gene LRP1B. The observed mutational profiles correlate with clinical features, smoking status and DNA repair defects. These results are reinforced by data integration including single nucleotide polymorphism array and gene expression array. Our findings shed further light on several important signalling pathways involved in lung adenocarcinoma, and suggest new molecular targets for treatment.
PMCID: PMC2694412  PMID: 18948947
22.  DNA sequencing of a cytogenetically normal acute myeloid leukemia genome 
Nature  2008;456(7218):66-72.
Lay Summary
Acute myeloid leukemia is a highly malignant hematopoietic tumor that affects about 13,000 adults yearly in the United States. The treatment of this disease has changed little in the past two decades, since most of the genetic events that initiate the disease remain undiscovered. Whole genome sequencing is now possible at a cost and on a timeframe reasonable enough to use it for unbiased discovery of tumor-specific somatic mutations that alter protein-coding genes. Here we show the results obtained by sequencing a typical acute myeloid leukemia genome and its matched normal counterpart, obtained from the patient’s skin. We discovered 10 genes with acquired mutations; two were previously described mutations thought to contribute to tumor progression, and 8 were novel mutations present in virtually all tumor cells at presentation and relapse, whose function is not yet known. Our study establishes whole genome sequencing as an unbiased method for discovering initiating mutations in cancer genomes, and for identifying novel genes that may respond to targeted therapies.
We used massively parallel sequencing technology to sequence the genomic DNA of tumor and normal skin cells obtained from a patient with a typical presentation of FAB M1 Acute Myeloid Leukemia (AML) with normal cytogenetics. 32.7-fold ‘haploid’ coverage (98 billion bases) was obtained for the tumor genome, and 13.9-fold coverage (41.8 billion bases) was obtained for the normal sample. Of 2,647,695 well-supported Single Nucleotide Variants (SNVs) found in the tumor genome, 2,588,486 (97.7%) also were detected in the patient’s skin genome, limiting the number of variants that required further study. For the purposes of this initial study, we restricted our downstream analysis to the coding sequences of annotated genes: we found only eight heterozygous, non-synonymous somatic SNVs in the entire genome. All were novel, including mutations in protocadherin/cadherin family members (CDH24 and PCLKC), G-protein coupled receptors (GPR123 and EBI2), a protein phosphatase (PTPRT), a potential guanine nucleotide exchange factor (KNDC1), a peptide/drug transporter (SLC15A1), and a glutamate receptor gene (GRINL1B). We also detected previously described, recurrent somatic insertions in the FLT3 and NPM1 genes. Based on deep readcount data, we determined that all of these mutations (except FLT3) were present in nearly all tumor cells at presentation, and again at relapse 11 months later, suggesting that the patient had a single dominant clone containing all of the mutations. These results demonstrate the power of whole genome sequencing to discover novel cancer-associated mutations.
PMCID: PMC2603574  PMID: 18987736
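The coverage and variant counts quoted in the abstract for entry 22 can be cross-checked with simple arithmetic. The sketch below is only an illustration and is not taken from the paper: it assumes the usual relation fold coverage = total sequenced bases / genome size, and the function and constant names are hypothetical.

```python
# Figures quoted in the abstract for entry 22.
TUMOR_BASES = 98e9        # bases sequenced for the tumor genome
TUMOR_COVERAGE = 32.7     # reported fold coverage, tumor
NORMAL_BASES = 41.8e9     # bases sequenced for the normal (skin) genome
NORMAL_COVERAGE = 13.9    # reported fold coverage, normal
TUMOR_SNVS = 2_647_695    # well-supported SNVs found in the tumor genome
SHARED_SNVS = 2_588_486   # of those, also detected in the skin genome


def implied_genome_size(total_bases, fold_coverage):
    """Genome size implied by coverage = total bases / genome size (assumed relation)."""
    return total_bases / fold_coverage


def shared_fraction(shared, total):
    """Fraction of tumor SNVs that were also seen in the normal sample."""
    return shared / total


tumor_size = implied_genome_size(TUMOR_BASES, TUMOR_COVERAGE)
normal_size = implied_genome_size(NORMAL_BASES, NORMAL_COVERAGE)
tumor_only = TUMOR_SNVS - SHARED_SNVS  # candidate variants left for further study

print(f"implied genome size (tumor):  {tumor_size:.3e} bases")
print(f"implied genome size (normal): {normal_size:.3e} bases")
print(f"shared SNV fraction: {shared_fraction(SHARED_SNVS, TUMOR_SNVS):.4f}")
print(f"tumor-only SNVs: {tumor_only}")
```

Both coverage figures imply a genome of roughly 3 billion bases, and subtracting the shared variants leaves 59,209 tumor-only SNVs genome-wide, before the restriction to coding sequence described in the abstract.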
23.  Design and implementation of a generalized laboratory data model 
BMC Bioinformatics  2007;8:362.
Investigators in the biological sciences continue to exploit laboratory automation methods and have dramatically increased the rates at which they can generate data. In many environments, the methods themselves also evolve in a rapid and fluid manner. These observations point to the importance of robust information management systems in the modern laboratory. Designing and implementing such systems is non-trivial and it appears that in many cases a database project ultimately proves unserviceable.
We describe a general modeling framework for laboratory data and its implementation as an information management system. The model utilizes several abstraction techniques, focusing especially on the concepts of inheritance and meta-data. Traditional approaches commingle event-oriented data with regular entity data in ad hoc ways. Instead, we define distinct regular entity and event schemas, but fully integrate these via a standardized interface. The design allows straightforward definition of a "processing pipeline" as a sequence of events, obviating the need for separate workflow management systems. A layer above the event-oriented schema integrates events into a workflow by defining "processing directives", which act as automated project managers of items in the system. Directives can be added or modified in an almost trivial fashion, i.e., without the need for schema modification or re-certification of applications. Association between regular entities and events is managed via simple "many-to-many" relationships. We describe the programming interface, as well as techniques for handling input/output, process control, and state transitions.
The implementation described here has served as the Washington University Genome Sequencing Center's primary information system for several years. It handles all transactions underlying a throughput rate of about 9 million sequencing reactions of various kinds per month and has handily weathered a number of major pipeline reconfigurations. The basic data model can be readily adapted to other high-volume processing environments.
PMCID: PMC2194795  PMID: 17897463
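The separation of regular entities from events in entry 23, joined by a many-to-many association and driven by "processing directives", can be pictured with a small sketch. This is not the system's actual schema or programming interface; every class, field, and method name below is hypothetical, and an in-memory Python model stands in for the database.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A regular (non-event) item tracked by the system, e.g. a sample."""
    entity_id: int
    kind: str


@dataclass
class Event:
    """A processing step that was applied to one or more entities."""
    event_id: int
    step: str


@dataclass
class Directive:
    """An ordered list of step names; plain data, so adding one needs no schema change."""
    name: str
    steps: list


@dataclass
class Lab:
    entities: dict = field(default_factory=dict)
    events: dict = field(default_factory=dict)
    # (entity_id, event_id) pairs: the many-to-many link between the two schemas.
    links: set = field(default_factory=set)

    def add_entity(self, entity):
        self.entities[entity.entity_id] = entity

    def record(self, step, entity_ids):
        """Record one event and associate it with every listed entity."""
        event = Event(len(self.events) + 1, step)
        self.events[event.event_id] = event
        for entity_id in entity_ids:
            self.links.add((entity_id, event.event_id))
        return event

    def history(self, entity_id):
        """Steps applied to an entity, in the order they were recorded."""
        ids = sorted(ev for ent, ev in self.links if ent == entity_id)
        return [self.events[i].step for i in ids]

    def next_step(self, entity_id, directive):
        """First step of the directive not yet applied to the entity, or None."""
        done = set(self.history(entity_id))
        for step in directive.steps:
            if step not in done:
                return step
        return None


lab = Lab()
lab.add_entity(Entity(1, "sample"))
lab.add_entity(Entity(2, "sample"))
pipeline = Directive("example-pipeline", ["prep", "sequence", "analyze"])

lab.record("prep", [1, 2])   # one event linked to two entities
lab.record("sequence", [1])  # only entity 1 has advanced

print(lab.next_step(1, pipeline))
print(lab.next_step(2, pipeline))
```

Because the directive is ordinary data rather than part of the schema, a pipeline could be reconfigured by editing its list of steps, which is the property the abstract emphasizes.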
