1.  Using VarScan 2 for Germline Variant Calling and Somatic Mutation Detection 
The identification of small sequence variants remains a challenging but critical step in the analysis of next-generation sequencing data. Our variant calling tool, VarScan 2, employs heuristic and statistical thresholds based on user-defined criteria to call variants using SAMtools mpileup data as input. Here, we provide guidelines for generating that input, and describe protocols for using VarScan 2 to (1) identify germline variants in individual samples; (2) call somatic mutations, copy number alterations, and LOH events in tumor-normal pairs; and (3) identify germline variants, de novo mutations, and Mendelian inheritance errors in family trios. Further, we describe a strategy for variant filtering that removes likely false positives associated with common sequencing- and alignment-related artifacts.
doi:10.1002/0471250953.bi1504s44
PMCID: PMC4278659  PMID: 25553206
variant calling; mutation detection; trio calling; snvs; indels; varscan 2; next-generation sequencing
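The threshold-based calling described in the VarScan 2 abstract above can be illustrated with a minimal Python sketch. It is not VarScan 2's implementation: the function name, the `Thresholds` class, and the default values are hypothetical and chosen only for illustration. It covers only the heuristic part (coverage, supporting reads, variant allele frequency) and omits the statistical test.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    # Illustrative values only; these are assumptions, not VarScan 2 defaults.
    min_coverage: int = 8      # minimum total reads at the position
    min_var_reads: int = 2     # minimum reads supporting the variant allele
    min_var_freq: float = 0.2  # minimum variant allele frequency


def passes_thresholds(ref_reads: int, var_reads: int,
                      t: Thresholds = Thresholds()) -> bool:
    """Return True if a candidate variant meets all user-defined criteria."""
    depth = ref_reads + var_reads
    if depth < t.min_coverage:
        return False
    if var_reads < t.min_var_reads:
        return False
    # Variant allele frequency = supporting reads / total reads.
    return var_reads / depth >= t.min_var_freq


if __name__ == "__main__":
    # 10 of 30 reads support the variant: passes all three checks.
    print(passes_thresholds(ref_reads=20, var_reads=10))
    # Only 5 reads in total: fails the coverage check.
    print(passes_thresholds(ref_reads=3, var_reads=2))
```

A real caller would read these counts from SAMtools mpileup output per position; here they are passed in directly to keep the sketch self-contained.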
2.  The Next-Generation Sequencing Revolution and Its Impact on Genomics 
Cell  2013;155(1):27-38.
Genomics is a relatively new scientific discipline, having DNA sequencing as its core technology. As technology has improved the cost and scale of genome characterization over sequencing’s 40-year history, the scope of inquiry has commensurately broadened. Massively parallel sequencing has proven revolutionary, shifting the paradigm of genomics to address biological questions at a genome-wide scale. Sequencing now empowers clinical diagnostics and other aspects of medical care, including disease risk, therapeutic identification, and prenatal testing. This Review explores the current state of genomics in the massively parallel sequencing era.
doi:10.1016/j.cell.2013.09.006
PMCID: PMC3969849  PMID: 24074859
3.  Integrated Analysis of Germline and Somatic Variants in Ovarian Cancer 
Nature communications  2014;5:3156.
We report the first large-scale exome-wide analysis of the combined germline-somatic landscape in ovarian cancer. Here we analyze germline and somatic alterations in 429 ovarian carcinoma cases and 557 controls. We identify 3,635 high confidence, rare truncation and 22,953 missense variants with predicted functional impact. We find germline truncation variants and large deletions across Fanconi pathway genes in 20% of cases. Enrichment of rare truncations is shown in BRCA1, BRCA2, and PALB2. Additionally, we observe germline truncation variants in genes not previously associated with ovarian cancer susceptibility (NF1, MAP3K4, CDKN2B, and MLL3). Evidence for loss of heterozygosity was found in 100% and 76% of cases with germline BRCA1 and BRCA2 truncations respectively. Germline-somatic interaction analysis combined with extensive bioinformatics annotation identifies 237 candidate functional germline truncation and missense variants, including 2 pathogenic BRCA1 and 1 TP53 deleterious variants. Finally, integrated analyses of germline and somatic variants identify significantly altered pathways, including the Fanconi, MAPK, and MLL pathways.
doi:10.1038/ncomms4156
PMCID: PMC4025965  PMID: 24448499
4.  DGIdb - Mining the druggable genome 
Nature methods  2013;10(12):10.1038/nmeth.2689.
The Drug-Gene Interaction database (DGIdb) mines existing resources that generate hypotheses about how mutated genes might be targeted therapeutically or prioritized for drug development. It provides an interface for searching lists of genes against a compendium of drug-gene interactions and potentially druggable genes. DGIdb can be accessed at dgidb.org.
doi:10.1038/nmeth.2689
PMCID: PMC3851581  PMID: 24122041
5.  Identification of a Rare Coding Variant in Complement 3 Associated with Age-related Macular Degeneration 
Nature genetics  2013;45(11):10.1038/ng.2758.
Macular degeneration is a common cause of blindness in the elderly. To identify rare coding variants associated with a large increase in risk of age-related macular degeneration (AMD), we sequenced 2,335 cases and 789 controls in 10 candidate loci (57 genes). To increase power, we augmented our control set with ancestry-matched exome sequenced controls. An analysis of coding variation in 2,268 AMD cases and 2,268 ancestry-matched controls revealed two large-effect rare variants: the previously described R1210C in the CFH gene (f_case = 0.51%, f_control = 0.02%, OR = 23.11) and the newly identified K155Q in the C3 gene (f_case = 1.06%, f_control = 0.39%, OR = 2.68). The variants suggest decreased inhibition of C3 by Factor H, resulting in increased activation of the alternative complement pathway, as a key component of disease biology.
doi:10.1038/ng.2758
PMCID: PMC3812337  PMID: 24036949
6.  Re-sequencing Expands Our Understanding of the Phenotypic Impact of Variants at GWAS Loci 
PLoS Genetics  2014;10(1):e1004147.
Genome-wide association studies (GWAS) have identified >500 common variants associated with quantitative metabolic traits, but in aggregate such variants explain at most 20–30% of the heritable component of population variation in these traits. To further investigate the impact of genotypic variation on metabolic traits, we conducted re-sequencing studies in >6,000 members of a Finnish population cohort (The Northern Finland Birth Cohort of 1966 [NFBC]) and a type 2 diabetes case-control sample (The Finland-United States Investigation of NIDDM Genetics [FUSION] study). By sequencing the coding sequence and 5′ and 3′ untranslated regions of 78 genes at 17 GWAS loci associated with one or more of six metabolic traits (serum levels of fasting HDL-C, LDL-C, total cholesterol, triglycerides, plasma glucose, and insulin), and conducting both single-variant and gene-level association tests, we obtained a more complete understanding of phenotype-genotype associations at eight of these loci. At all eight of these loci, the identification of new associations provides significant evidence for multiple genetic signals to one or more phenotypes, and at two loci, in the genes ABCA1 and CETP, we found significant gene-level evidence of association to non-synonymous variants with MAF<1%. Additionally, two potentially deleterious variants that demonstrated significant associations (rs138726309, a missense variant in G6PC2, and rs28933094, a missense variant in LIPC) were considerably more common in these Finnish samples than in European reference populations, supporting our prior hypothesis that deleterious variants could attain high frequencies in this isolated population, likely due to the effects of population bottlenecks. Our results highlight the value of large, well-phenotyped samples for rare-variant association analysis, and the challenge of evaluating the phenotypic impact of such variants.
Author Summary
Abnormal serum levels of various metabolites, including measures relevant to cholesterol, other fats, and sugars, are known to be risk factors for cardiovascular disease and type 2 diabetes. Identification of the genes that play a role in generating such abnormalities could advance the development of new treatment and prevention strategies for these disorders. Investigations of common genetic variants carried out in large sets of research subjects have successfully pinpointed such genes within many regions of the human genome. However, these studies often have not led to the identification of the specific genetic variations affecting metabolic traits. To attempt to detect such causal variations, we sequenced genes in 17 genomic regions implicated in metabolic traits in >6,000 people from Finland. By conducting statistical analyses relating specific variations (individually and grouped by gene) to the measures for these metabolic traits observed in the study subjects, we added to our understanding of how genotypes affect these traits. Our findings support a long-held hypothesis that the unique history of the Finnish population provides important advantages for analyzing the relationship between genetic variations and biomedically important traits.
doi:10.1371/journal.pgen.1004147
PMCID: PMC3907339  PMID: 24497850
7.  The origin and evolution of mutations in Acute Myeloid Leukemia 
Cell  2012;150(2):264-278.
Summary
Most mutations in cancer genomes are thought to be acquired after the initiating event, which may cause genomic instability, driving clonal evolution. However, for acute myeloid leukemia (AML), normal karyotypes are common, and genomic instability is unusual. To better understand clonal evolution in AML, we sequenced the genomes of AML samples with a known initiating event (PML-RARA) vs. normal karyotype AML samples, and the exomes of hematopoietic stem/progenitor cells (HSPCs) from healthy people. Collectively, the data suggest that most of the mutations found in AML genomes are actually random events that occurred in HSPCs before they acquired the initiating mutation; the mutational history of that cell is “captured” as the clone expands. In many cases, only one or two additional, cooperating mutations are needed to generate the malignant founding clone. Cells from the founding clone can acquire additional cooperating mutations, yielding subclones that can contribute to disease progression and/or relapse.
doi:10.1016/j.cell.2012.06.023
PMCID: PMC3407563  PMID: 22817890
8.  Massively Parallel Sequencing Approaches for Characterization of Structural Variation 
The emergence of next-generation sequencing (NGS) technologies offers an incredible opportunity to comprehensively study DNA sequence variation in human genomes. Commercially available platforms from Roche (454), Illumina (Genome Analyzer and HiSeq 2000), and Applied Biosystems (SOLiD) have the capability to completely sequence individual genomes to high levels of coverage. NGS data is particularly advantageous for the study of structural variation (SV) because it offers the sensitivity to detect variants of various sizes and types, as well as the precision to characterize their breakpoints at base pair resolution. In this chapter, we present methods and software algorithms that have been developed to detect SVs and copy number changes using massively parallel sequencing data. We describe visualization and de novo assembly strategies for characterizing SV breakpoints and removing false positives.
doi:10.1007/978-1-61779-507-7_18
PMCID: PMC3679911  PMID: 22228022
Next-generation sequencing; Paired-end sequencing; 454; Illumina; Solexa; ABI SOLiD; Insertions; Deletions; Duplications; Inversions; Translocations; Indels; Copy number variants
9.  BreakDancer: An algorithm for high resolution mapping of genomic structural variation 
Nature methods  2009;6(9):677-681.
Detection and characterization of genomic structural variation are important for understanding the landscape of genetic variation in human populations and in complex diseases such as cancer. Recent studies demonstrate the feasibility of detecting structural variation using next-generation, short-insert, paired-end sequencing reads. However, the utility of these reads is not entirely clear, nor are the analysis methods under which accurate detection can be achieved. The algorithm BreakDancer predicts a wide variety of structural variants including indels, inversions, and translocations. We examined BreakDancer's performance in simulation, comparison with other methods, analysis of an acute myeloid leukemia sample, and the 1,000 Genomes trio individuals. We found that it substantially improved the detection of small and intermediate size indels from 10 bp to 1 Mbp that are difficult to detect via a single conventional approach.
doi:10.1038/nmeth.1363
PMCID: PMC3661775  PMID: 19668202
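The general idea behind detecting structural variation from paired-end reads, as described in the BreakDancer abstract above, can be sketched in Python. This is a simplified illustration of anomalous read-pair classification, not BreakDancer's algorithm: the `ReadPair` class, the function name, and the cutoff of three standard deviations are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"


def classify_pair(pair: ReadPair, mean_insert: float, sd_insert: float,
                  n_sd: float = 3.0) -> str:
    """Label a read pair by the kind of rearrangement it may support.

    The n_sd cutoff is a hypothetical choice for this sketch.
    """
    if pair.chrom1 != pair.chrom2:
        return "translocation"  # mates map to different chromosomes
    if pair.strand1 == pair.strand2:
        return "inversion"      # mates share an orientation
    span = abs(pair.pos2 - pair.pos1)
    if span > mean_insert + n_sd * sd_insert:
        return "deletion"       # mates map farther apart than expected
    if span < mean_insert - n_sd * sd_insert:
        return "insertion"      # mates map closer together than expected
    return "normal"


if __name__ == "__main__":
    pair = ReadPair("chr1", 1000, "+", "chr1", 6000, "-")
    # Library with mean insert size 300 bp and SD 30 bp (made-up values).
    print(classify_pair(pair, mean_insert=300, sd_insert=30))
```

In practice a caller would cluster many such anomalous pairs in the same region before reporting a variant; a single pair is weak evidence on its own.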
10.  Background mutations in parental cells account for most of the genetic heterogeneity of Induced Pluripotent Stem Cells 
Cell Stem Cell  2012;10(5):570-582.
Summary
To assess the genetic consequences of induced Pluripotent Stem Cell (iPSC) reprogramming, we sequenced the genomes of ten murine iPSC clones derived from three independent reprogramming experiments, and compared them to their parental cell genomes. We detected hundreds of single nucleotide variants (SNVs) in every clone, with an average of 11 in coding regions. In two experiments, all SNVs were unique for each clone and did not cluster in pathways, but in the third, all four iPSC clones contained 157 shared genetic variants, which could also be detected in rare cells (<1 in 500) within the parental MEF pool. This data suggests that most of the genetic variation in iPSC clones is not caused by reprogramming per se, but is rather a consequence of cloning individual cells, which “captures” their mutational history. These findings have implications for the development and therapeutic use of cells that are reprogrammed by any method.
doi:10.1016/j.stem.2012.03.002
PMCID: PMC3348423  PMID: 22542160
11.  Sequencing a mouse acute promyelocytic leukemia genome reveals genetic events relevant for disease progression 
The Journal of Clinical Investigation  2011;121(4):1445-1455.
Acute promyelocytic leukemia (APL) is a subtype of acute myeloid leukemia (AML). It is characterized by the t(15;17)(q22;q11.2) chromosomal translocation that creates the promyelocytic leukemia–retinoic acid receptor α (PML-RARA) fusion oncogene. Although this fusion oncogene is known to initiate APL in mice, other cooperating mutations, as yet ill defined, are important for disease pathogenesis. To identify these, we used a mouse model of APL, whereby PML-RARA expressed in myeloid cells leads to a myeloproliferative disease that ultimately evolves into APL. Sequencing of a mouse APL genome revealed 3 somatic, nonsynonymous mutations relevant to APL pathogenesis, of which 1 (Jak1 V657F) was found to be recurrent in other affected mice. This mutation was identical to the JAK1 V658F mutation previously found in human APL and acute lymphoblastic leukemia samples. Further analysis showed that JAK1 V658F cooperated in vivo with PML-RARA, causing a rapidly fatal leukemia in mice. We also discovered a somatic 150-kb deletion involving the lysine (K)-specific demethylase 6A (Kdm6a, also known as Utx) gene, in the mouse APL genome. Similar deletions were observed in 3 out of 14 additional mouse APL samples and 1 out of 150 human AML samples. In conclusion, whole genome sequencing of mouse cancer genomes can provide an unbiased and comprehensive approach for discovering functionally relevant mutations that are also present in human leukemias.
doi:10.1172/JCI45284
PMCID: PMC3069786  PMID: 21436584
12.  SomaticSniper: identification of somatic point mutations in whole genome sequencing data 
Bioinformatics  2011;28(3):311-317.
Motivation: The sequencing of tumors and their matched normals is frequently used to study the genetic composition of cancer. Despite this fact, there remains a dearth of available software tools designed to compare sequences in pairs of samples and identify sites that are likely to be unique to one sample.
Results: In this article, we describe the mathematical basis of our SomaticSniper software for comparing tumor and normal pairs. We estimate its sensitivity and precision, and present several common sources of error resulting in miscalls.
Availability and implementation: Binaries are freely available for download at http://gmt.genome.wustl.edu/somatic-sniper/current/, implemented in C and supported on Linux and Mac OS X.
Contact: delarson@wustl.edu; lding@wustl.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btr665
PMCID: PMC3268238  PMID: 22155872
13.  Clonal Architecture of Secondary Acute Myeloid Leukemia 
The New England Journal of Medicine  2012;366(12):1090-1098.
BACKGROUND
The myelodysplastic syndromes are a group of hematologic disorders that often evolve into secondary acute myeloid leukemia (AML). The genetic changes that underlie progression from the myelodysplastic syndromes to secondary AML are not well understood.
METHODS
We performed whole-genome sequencing of seven paired samples of skin and bone marrow in seven subjects with secondary AML to identify somatic mutations specific to secondary AML. We then genotyped a bone marrow sample obtained during the antecedent myelodysplastic-syndrome stage from each subject to determine the presence or absence of the specific somatic mutations. We identified recurrent mutations in coding genes and defined the clonal architecture of each pair of samples from the myelodysplastic-syndrome stage and the secondary-AML stage, using the allele burden of hundreds of mutations.
RESULTS
Approximately 85% of bone marrow cells were clonal in the myelodysplastic-syndrome and secondary-AML samples, regardless of the myeloblast count. The secondary-AML samples contained mutations in 11 recurrently mutated genes, including 4 genes that have not been previously implicated in the myelodysplastic syndromes or AML. In every case, progression to acute leukemia was defined by the persistence of an antecedent founding clone containing 182 to 660 somatic mutations and the outgrowth or emergence of at least one subclone, harboring dozens to hundreds of new mutations. All founding clones and subclones contained at least one mutation in a coding gene.
CONCLUSIONS
Nearly all the bone marrow cells in patients with myelodysplastic syndromes and secondary AML are clonally derived. Genetic evolution of secondary AML is a dynamic process shaped by multiple cycles of mutation acquisition and clonal selection. Recurrent gene mutations are found in both founding clones and daughter subclones. (Funded by the National Institutes of Health and others.)
doi:10.1056/NEJMoa1106968
PMCID: PMC3320218  PMID: 22417201
14.  Clonal evolution in relapsed acute myeloid leukemia revealed by whole genome sequencing 
Nature  2012;481(7382):506-510.
Summary
Most patients with acute myeloid leukemia (AML) die from progressive disease after relapse, which is associated with clonal evolution at the cytogenetic level [1,2]. To determine the mutational spectrum associated with relapse, we sequenced the primary tumor and relapse genomes from 8 AML patients, and validated hundreds of somatic mutations using deep sequencing; this allowed us to precisely define clonality and clonal evolution patterns at relapse. Besides discovering novel, recurrently mutated genes (e.g. WAC, SMC3, DIS3, DDX41, and DAXX) in AML, we found two major clonal evolution patterns during AML relapse: 1) the founding clone in the primary tumor gained mutations and evolved into the relapse clone, or 2) a subclone of the founding clone survived initial therapy, gained additional mutations, and expanded at relapse. In all cases, chemotherapy failed to eradicate the founding clone. The comparison of relapse-specific vs. primary tumor mutations in all 8 cases revealed an increase in transversions, probably due to DNA damage caused by cytotoxic chemotherapy. These data demonstrate that AML relapse is associated with the addition of new mutations and clonal evolution, which is shaped in part by the chemotherapy that the patients receive to establish and maintain remissions.
doi:10.1038/nature10738
PMCID: PMC3267864  PMID: 22237025
15.  Recurrent Mutations in the U2AF1 Splicing Factor in Myelodysplastic Syndromes 
Nature Genetics  2011;44(1):53-57.
Myelodysplastic syndromes (MDS) are hematopoietic stem cell disorders that often progress to chemotherapy-resistant secondary acute myeloid leukemia (sAML). We used whole genome sequencing to perform an unbiased comprehensive screen to discover all the somatic mutations in a sAML sample and genotyped these loci in the matched MDS sample. Here we show that a missense mutation affecting the serine at codon 34 (S34) in U2AF1 was recurrently mutated in 13/150 (8.7%) de novo MDS patients, with suggestive evidence of an associated increased risk of progression to sAML. U2AF1 is a U2 auxiliary factor protein that recognizes the AG splice acceptor dinucleotide at the 3′ end of introns and mutations are located in highly conserved zinc fingers in U2AF1 [1,2]. Mutant U2AF1 promotes enhanced splicing and exon skipping in reporter assays in vitro. This novel, recurrent mutation in U2AF1 implicates altered pre-mRNA splicing as a potential mechanism for MDS pathogenesis.
doi:10.1038/ng.1031
PMCID: PMC3247063  PMID: 22158538
16.  Recurring Mutations Found by Sequencing an Acute Myeloid Leukemia Genome 
The New England journal of medicine  2009;361(11):1058-1066.
BACKGROUND
The full complement of DNA mutations that are responsible for the pathogenesis of acute myeloid leukemia (AML) is not yet known.
METHODS
We used massively parallel DNA sequencing to obtain a very high level of coverage (approximately 98%) of a primary, cytogenetically normal, de novo genome for AML with minimal maturation (AML-M1) and a matched normal skin genome.
RESULTS
We identified 12 acquired (somatic) mutations within the coding sequences of genes and 52 somatic point mutations in conserved or regulatory portions of the genome. All mutations appeared to be heterozygous and present in nearly all cells in the tumor sample. Four of the 64 mutations occurred in at least 1 additional AML sample in 188 samples that were tested. Mutations in NRAS and NPM1 had been identified previously in patients with AML, but two other mutations had not been identified. One of these mutations, in the IDH1 gene, was present in 15 of 187 additional AML genomes tested and was strongly associated with normal cytogenetic status; it was present in 13 of 80 cytogenetically normal samples (16%). The other was a nongenic mutation in a genomic region with regulatory potential and conservation in higher mammals; we detected it in one additional AML tumor. The AML genome that we sequenced contains approximately 750 point mutations, of which only a small fraction are likely to be relevant to pathogenesis.
CONCLUSIONS
By comparing the sequences of tumor and skin genomes of a patient with AML-M1, we have identified recurring mutations that may be relevant for pathogenesis.
doi:10.1056/NEJMoa0903840
PMCID: PMC3201812  PMID: 19657110
17.  DNMT3A Mutations in Acute Myeloid Leukemia 
The New England journal of medicine  2010;363(25):2424-2433.
BACKGROUND
The genetic alterations responsible for an adverse outcome in most patients with acute myeloid leukemia (AML) are unknown.
METHODS
Using massively parallel DNA sequencing, we identified a somatic mutation in DNMT3A, encoding a DNA methyltransferase, in the genome of cells from a patient with AML with a normal karyotype. We sequenced the exons of DNMT3A in 280 additional patients with de novo AML to define recurring mutations.
RESULTS
A total of 62 of 281 patients (22.1%) had mutations in DNMT3A that were predicted to affect translation. We identified 18 different missense mutations, the most common of which was predicted to affect amino acid R882 (in 37 patients). We also identified six frameshift, six nonsense, and three splice-site mutations and a 1.5-Mbp deletion encompassing DNMT3A. These mutations were highly enriched in the group of patients with an intermediate-risk cytogenetic profile (56 of 166 patients, or 33.7%) but were absent in all 79 patients with a favorable-risk cytogenetic profile (P<0.001 for both comparisons). The median overall survival among patients with DNMT3A mutations was significantly shorter than that among patients without such mutations (12.3 months vs. 41.1 months, P<0.001). DNMT3A mutations were associated with adverse outcomes among patients with an intermediate-risk cytogenetic profile or FLT3 mutations, regardless of age, and were independently associated with a poor outcome in Cox proportional-hazards analysis.
CONCLUSIONS
DNMT3A mutations are highly recurrent in patients with de novo AML with an intermediate-risk cytogenetic profile and are independently associated with a poor outcome. (Funded by the National Institutes of Health and others.)
doi:10.1056/NEJMoa1005143
PMCID: PMC3201818  PMID: 21067377
18.  Use of whole genome sequencing to diagnose a cryptic fusion oncogene 
Context
Whole genome sequencing (WGS) is becoming increasingly available for research purposes, but it has not yet been routinely used for clinical diagnosis.
Objective
To determine whether whole genome sequencing can identify cryptic, actionable mutations in a clinically relevant time frame.
Design, Setting, and Patient
We were referred a difficult diagnostic case of acute promyelocytic leukemia with no pathogenic X-RARA fusion identified by routine metaphase cytogenetics or interphase FISH. The patient was enrolled in an IRB approved protocol, with consent specifically tailored to the implications of whole genome sequencing. The protocol employs a ‘movable firewall,’ which maintains patient anonymity within the entire research team, but allows the research team to communicate medically relevant information to the treating physician.
Main Outcome Measure
Clinical relevance of whole genome sequencing and time to communicate validated results to the treating physician.
Results
Massively parallel paired-end sequencing allowed us to identify a cytogenetically cryptic event: 77 kilobases from chromosome 15 was inserted en bloc into the second intron of the RARA gene on chromosome 17, resulting in a classic bcr3 PML-RARA fusion gene. RT-PCR subsequently validated the expression of the fusion transcript. Novel FISH probes identified two additional cases of t(15;17)-negative acute promyelocytic leukemia that had cytogenetically invisible insertions. Whole genome sequencing and validation were completed in seven weeks, and changed the treatment plan for the patient.
Conclusions
Whole genome sequencing can identify cytogenetically invisible oncogenes in a clinically relevant timeframe.
doi:10.1001/jama.2011.497
PMCID: PMC3156695  PMID: 21505136
19.  CMDS: a population-based method for identifying recurrent DNA copy number aberrations in cancer from high-resolution data 
Bioinformatics  2009;26(4):464-469.
Motivation: DNA copy number aberration (CNA) is a hallmark of genomic abnormality in tumor cells. Recurrent CNA (RCNA) occurs in multiple cancer samples across the same chromosomal region and has greater implication in tumorigenesis. Current commonly used methods for RCNA identification require CNA calling for individual samples before cross-sample analysis. This two-step strategy may result in a heavy computational burden, as well as a loss of the overall statistical power due to segmentation and discretization of individual sample's data. We propose a population-based approach for RCNA detection with no need of single-sample analysis, which is statistically powerful, computationally efficient and particularly suitable for high-resolution and large-population studies.
Results: Our approach, correlation matrix diagonal segmentation (CMDS), identifies RCNAs based on a between-chromosomal-site correlation analysis. Directly using the raw intensity ratio data from all samples and adopting a diagonal transformation strategy, CMDS substantially reduces computational burden and can obtain results very quickly from large datasets. Our simulation indicates that the statistical power of CMDS is higher than that of single-sample CNA calling based two-step approaches. We applied CMDS to two real datasets of lung cancer and brain cancer from Affymetrix and Illumina array platforms, respectively, and successfully identified known regions of CNA associated with EGFR, KRAS and other important oncogenes. CMDS provides a fast, powerful and easily implemented tool for the RCNA analysis of large-scale data from cancer genomes.
Availability: The R and C programs implementing our method are available at https://dsgweb.wustl.edu/qunyuan/software/cmds.
Contact: qunyuan@wustl.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btp708
PMCID: PMC2852218  PMID: 20031968
20.  Cellular behavior in the developing Drosophila pupal retina 
Mechanisms of development  2007;125(3-4):223-232.
Correct patterning of cells within an epithelium is key to establishing their normal function. However, the precise mechanisms by which individual cells arrive at their final developmental niche remain poorly understood. We developed an optimized system for imaging the developing Drosophila retina, an ideal tissue for the study of cell positioning. Using this technique, we characterized the cellular dynamics of developing wild-type pupal retinas. We also analyzed two mutants affecting eye patterning and demonstrate that cells mutant for Notch or Roughest signaling were aberrantly dynamic in their cell movements. Finally, we establish a role for the adherens junction regulator P120-Catenin in retinal patterning through its regulation of normal adherens junction integrity. Our results indicate a requirement for P120-Catenin in the developing retina, the first reported developmental function of this protein in the epithelia of lower metazoa. Based upon our live visualization of the P120-Catenin mutant as well as genetic data, we conclude that P120-Catenin is acting to stabilize E-cadherin and adherens junction integrity during eye development.
doi:10.1016/j.mod.2007.11.007
PMCID: PMC2965056  PMID: 18166433
Live visualization; p120Catenin; Drosophila; Retina; Tissue patterning
21.  Dynamic Decapentaplegic signaling regulates patterning and adhesion in the Drosophila pupal retina 
Development (Cambridge, England)  2007;134(10):1861-1871.
The correct organization of cells within an epithelium is essential for proper tissue and organ morphogenesis. The role of Decapentaplegic/Bone morphogenetic protein (Dpp/BMP) signaling in cellular morphogenesis during epithelial development is poorly understood. In this paper, we used the developing Drosophila pupal retina – looking specifically at the reorganization of glial-like support cells that lie between the retinal ommatidia – to better understand the role of Dpp signaling during epithelial patterning. Our results indicate that Dpp pathway activity is tightly regulated across time in the pupal retina and that epithelial cells in this tissue require Dpp signaling to achieve their correct shape and position within the ommatidial hexagon. These results point to the Dpp pathway as a third component and functional link between two adhesion systems, Hibris-Roughest and DE-cadherin. A balanced interplay between these three systems is essential for epithelial patterning during morphogenesis of the pupal retina. Importantly, we identify a similar functional connection between Dpp activity and DE-cadherin and Rho1 during cell fate determination in the wing, suggesting a broader link between Dpp function and junctional integrity during epithelial development.
doi:10.1242/dev.002972
PMCID: PMC2957290  PMID: 17428827
Adhesion; BMP; Dpp; Epithelia; Patterning
22.  Genome Remodeling in a Basal-like Breast Cancer Metastasis and Xenograft 
Nature  2010;464(7291):999-1005.
Massively parallel DNA sequencing technologies provide an unprecedented ability to screen entire genomes for genetic changes associated with tumor progression. Here we describe the genomic analyses of four DNA samples from an African-American patient with basal-like breast cancer: peripheral blood, the primary tumor, a brain metastasis, and a xenograft derived from the primary tumor. The metastasis contained two de novo mutations and a large deletion not present in the primary tumor, and was significantly enriched for 20 shared mutations. The xenograft retained all primary tumor mutations, and displayed a mutation enrichment pattern that paralleled the metastasis (16 of 20 genes). Two overlapping large deletions, encompassing CTNNA1, were present in all three tumor samples. The differential mutation frequencies and structural variation patterns in metastasis and xenograft compared to the primary tumor suggest that secondary tumors may arise from a minority of cells within the primary.
doi:10.1038/nature08989
PMCID: PMC2872544  PMID: 20393555
23.  VarScan: variant detection in massively parallel sequencing of individual and pooled samples 
Bioinformatics  2009;25(17):2283-2285.
Summary: Massively parallel sequencing technologies hold incredible promise for the study of DNA sequence variation, particularly the identification of variants affecting human disease. The unprecedented throughput and relatively short read lengths of Roche/454, Illumina/Solexa, and other platforms have spurred development of a new generation of sequence alignment algorithms. Yet detection of sequence variants based on short read alignments remains challenging, and most currently available tools are limited to a single platform or aligner type. We present VarScan, an open source tool for variant detection that is compatible with several short read aligners. We demonstrate VarScan's ability to detect SNPs and indels with high sensitivity and specificity, in both Roche/454 sequencing of individuals and deep Illumina/Solexa sequencing of pooled samples.
Availability and Implementation: Source code and documentation freely available at http://genome.wustl.edu/tools/cancer-genomics implemented as a Perl package and supported on Linux/UNIX, MS Windows and Mac OSX.
Contact: dkoboldt@genome.wustl.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
doi:10.1093/bioinformatics/btp373
PMCID: PMC2734323  PMID: 19542151
24.  Computer Simulation of Cellular Patterning Within the Drosophila Pupal Eye 
PLoS Computational Biology  2010;6(7):e1000841.
We present a computer simulation and associated experimental validation of assembly of glial-like support cells into the interweaving hexagonal lattice that spans the Drosophila pupal eye. This process of cell movements organizes the ommatidial array into a functional pattern. Unlike earlier simulations that focused on the arrangements of cells within individual ommatidia, here we examine the local movements that lead to large-scale organization of the emerging eye field. Simulations based on our experimental observations of cell adhesion, cell death, and cell movement successfully patterned a tracing of an emerging wild-type pupal eye. Surprisingly, altering cell adhesion had only a mild effect on patterning, contradicting our previous hypothesis that the patterning was primarily the result of preferential adhesion between IRM-class surface proteins. Instead, our simulations highlighted the importance of programmed cell death (PCD) as well as a previously unappreciated variable: the expansion of cells' apical surface areas, which promoted rearrangement of neighboring cells. We tested this prediction experimentally by preventing expansion in the apical area of individual cells: patterning was disrupted in a manner predicted by our simulations. Our work demonstrates the value of combining computer simulation with in vivo experiments to uncover novel mechanisms that are perpetuated throughout the eye field. It also demonstrates the utility of the Glazier–Graner–Hogeweg model (GGH) for modeling the links between local cellular interactions and emergent properties of developing epithelia as well as predicting unanticipated results in vivo.
Author Summary
During development, organs are assembled through a complex combination of cell proliferation, programmed cell death, cell movements, etc. These aspects of tissue maturation must be achieved with a limited gene set—to achieve complexity, tissues utilize patterning mechanisms. That is, “rules” are used to create heterogeneity in initially homogeneous cell populations. A large number of genes and cell biology mechanisms have been uncovered that mediate this process but we have a limited understanding of how these factors act together to generate the large-scale patterns necessary to create a useful organ. Here, we combine computational modeling with in situ experiments in the developing Drosophila eye to explore these issues. Computer modeling is often criticized for describing known outcomes. We demonstrate how the Glazier–Graner–Hogeweg model can successfully predict surprising outcomes contradictory to models that emerged from our previous studies. We then validated these predictions in the developing eye. These mechanisms, which include the importance of dynamic nuclear movements, may prove generally important in directing cells into their proper niches as developing epithelia mature.
doi:10.1371/journal.pcbi.1000841
PMCID: PMC2895643  PMID: 20617161
25.  Somatic mutations affect key pathways in lung adenocarcinoma 
Ding, Li | Getz, Gad | Wheeler, David A. | Mardis, Elaine R. | McLellan, Michael D. | Cibulskis, Kristian | Sougnez, Carrie | Greulich, Heidi | Muzny, Donna M. | Morgan, Margaret B. | Fulton, Lucinda | Fulton, Robert S. | Zhang, Qunyuan | Wendl, Michael C. | Lawrence, Michael S. | Larson, David E. | Chen, Ken | Dooling, David J. | Sabo, Aniko | Hawes, Alicia C. | Shen, Hua | Jhangiani, Shalini N. | Lewis, Lora R. | Hall, Otis | Zhu, Yiming | Mathew, Tittu | Ren, Yanru | Yao, Jiqiang | Scherer, Steven E. | Clerc, Kerstin | Metcalf, Ginger A. | Ng, Brian | Milosavljevic, Aleksandar | Gonzalez-Garay, Manuel L. | Osborne, John R. | Meyer, Rick | Shi, Xiaoqi | Tang, Yuzhu | Koboldt, Daniel C. | Lin, Ling | Abbott, Rachel | Miner, Tracie L. | Pohl, Craig | Fewell, Ginger | Haipek, Carrie | Schmidt, Heather | Dunford-Shore, Brian H. | Kraja, Aldi | Crosby, Seth D. | Sawyer, Christopher S. | Vickery, Tammi | Sander, Sacha | Robinson, Jody | Winckler, Wendy | Baldwin, Jennifer | Chirieac, Lucian R. | Dutt, Amit | Fennell, Tim | Hanna, Megan | Johnson, Bruce E. | Onofrio, Robert C. | Thomas, Roman K. | Tonon, Giovanni | Weir, Barbara A. | Zhao, Xiaojun | Ziaugra, Liuda | Zody, Michael C. | Giordano, Thomas | Orringer, Mark B. | Roth, Jack A. | Spitz, Margaret R. | Wistuba, Ignacio I. | Ozenberger, Bradley | Good, Peter J. | Chang, Andrew C. | Beer, David G. | Watson, Mark A. | Ladanyi, Marc | Broderick, Stephen | Yoshizawa, Akihiko | Travis, William D. | Pao, William | Province, Michael A. | Weinstock, George M. | Varmus, Harold E. | Gabriel, Stacey B. | Lander, Eric S. | Gibbs, Richard A. | Meyerson, Matthew | Wilson, Richard K.
Nature  2008;455(7216):1069-1075.
Determining the genetic basis of cancer requires comprehensive analyses of large collections of histopathologically well-classified primary tumours. Here we report the results of a collaborative study to discover somatic mutations in 188 human lung adenocarcinomas. DNA sequencing of 623 genes with known or potential relationships to cancer revealed more than 1,000 somatic mutations across the samples. Our analysis identified 26 genes that are mutated at significantly high frequencies and thus are probably involved in carcinogenesis. The frequently mutated genes include tyrosine kinases, among them the EGFR homologue ERBB4; multiple ephrin receptor genes, notably EPHA3; vascular endothelial growth factor receptor KDR; and NTRK genes. These data provide evidence of somatic mutations in primary lung adenocarcinoma for several tumour suppressor genes involved in other cancers—including NF1, APC, RB1 and ATM—and for sequence changes in PTPRD as well as the frequently deleted gene LRP1B. The observed mutational profiles correlate with clinical features, smoking status and DNA repair defects. These results are reinforced by data integration including single nucleotide polymorphism array and gene expression array. Our findings shed further light on several important signalling pathways involved in lung adenocarcinoma, and suggest new molecular targets for treatment.
doi:10.1038/nature07423
PMCID: PMC2694412  PMID: 18948947
