1.  Using genomic information to guide weight management: From universal to precision treatment 
Obesity (Silver Spring, Md.)  2016;24(1):14-22.
Precision medicine utilizes genomic and other data to optimize and personalize treatment. While more than 2,500 genetic tests are currently available, largely for extreme and/or rare phenotypes, the question remains whether this approach can be used for the treatment of common, complex conditions like obesity, inflammation, and insulin resistance, which underlie a host of metabolic diseases.
This review, developed from a Trans-NIH Conference titled, “Genes, Behaviors, and Response to Weight Loss Interventions,” provides an overview of the state of genetic and genomic research in the area of weight change and identifies key areas for future research.
While many loci have been identified that are associated with cross-sectional measures of obesity/body size, relatively little is known regarding the genes/loci that influence dynamic measures of weight change over time. Successful short-term weight loss has been achieved using many different strategies, but sustainable weight loss has proven elusive for many, and there are important gaps in our understanding of energy balance regulation.
Elucidating the molecular basis of variability in weight change has the potential to improve treatment outcomes and inform innovative approaches that can simultaneously take into account information from genomic and other sources in devising individualized treatment plans.
PMCID: PMC4689320  PMID: 26692578
genomics; precision medicine; weight loss
2.  Concerted genomic targeting of H3K27 demethylase REF6 and chromatin-remodeling ATPase BRM in Arabidopsis 
Nature genetics  2016;48(6):687-693.
SWI/SNF-type chromatin remodelers, such as BRAHMA (BRM), and H3K27 demethylases both have active roles in regulating gene expression at the chromatin level [1–5], but how they are recruited to specific genomic sites remains largely unknown. Here we show that RELATIVE OF EARLY FLOWERING 6 (REF6), a plant-unique H3K27 demethylase [6], targets genomic loci containing a CTCTGYTY motif via its zinc-finger (ZnF) domains and facilitates the recruitment of BRM. Genome-wide analyses showed that REF6 colocalizes with BRM at many genomic sites with the CTCTGYTY motif. Loss of REF6 results in decreased BRM occupancy at BRM–REF6 co-targets. Furthermore, REF6 directly binds to the CTCTGYTY motif in vitro, and deletion of the motif from a target gene renders it inaccessible to REF6 in vivo. Finally, we show that, when its ZnF domains are deleted, REF6 loses its genomic targeting ability. Thus, our work identifies a new genomic targeting mechanism for an H3K27 demethylase and demonstrates its key role in recruiting the BRM chromatin remodeler.
PMCID: PMC5134324  PMID: 27111034
3.  Identification of Human Neuronal Protein Complexes Reveals Biochemical Activities and Convergent Mechanisms of Action in Autism Spectrum Disorders 
Cell systems  2015;1(5):361-374.
The prevalence of autism spectrum disorders (ASDs) is rapidly growing, yet their molecular basis is poorly understood. We used a systems approach in which ASD candidate genes were mapped onto the ubiquitous human protein complexes and the resulting complexes were characterized. The studies revealed the role of histone deacetylases (HDAC1/2) in regulating the expression of ASD orthologs in the embryonic mouse brain. Proteome-wide screens for the co-complexed subunits with HDAC1 and six other key ASD proteins in neuronal cells revealed a protein interaction network, which displayed preferential expression in fetal brain development, exhibited increased deleterious mutations in ASD cases, and was strongly regulated by FMRP and MECP2, causal for Fragile X and Rett syndromes, respectively. Overall, our study reveals molecular components in ASD, suggests a shared mechanism between the syndromic and idiopathic forms of ASDs, and provides a systems framework for analyzing complex human diseases.
PMCID: PMC4776331  PMID: 26949739
4.  Can heavy isotopes increase lifespan? Studies of relative abundance in various organisms reveal chemical perspectives on aging 
Bioessays  2016;38(11):1093-1101.
Stable heavy isotopes co‐exist with their lighter counterparts in all elements commonly found in biology. These heavy isotopes represent a low natural abundance in isotopic composition but impose great retardation effects in chemical reactions because of kinetic isotopic effects (KIEs). Previous isotope analyses have recorded pervasive enrichment or depletion of heavy isotopes in various organisms, strongly supporting the capability of biological systems to distinguish different isotopes. This capability has recently been found to lead to a general decline of heavy isotopes in metabolites during yeast aging. Conversely, supplementing heavy isotopes in growth medium promotes longevity. Whether this observation prevails in other organisms is not known, but it potentially bears promise in promoting human longevity.
PMCID: PMC5108472  PMID: 27554342
aging; kinetic isotopic effects; longevity; stable isotope
5.  Nat1 deficiency is associated with mitochondrial dysfunction and exercise intolerance in mice 
Cell reports  2016;17(2):527-540.
We recently identified human N-acetyltransferase 2 (NAT2) as an insulin resistance (IR) gene. Here we examine the cellular mechanism linking NAT2 to IR and find that Nat1 (mouse ortholog of NAT2) is co-regulated with key regulators of mitochondria. RNA-interference mediated silencing of Nat1 led to mitochondrial dysfunction characterized by increased intracellular reactive oxygen species and mitochondrial fragmentation as well as decreased mitochondrial membrane potential, biogenesis, mass, cellular respiration and ATP generation. These effects were consistent in 3T3-L1 adipocytes, C2C12 myoblasts and in tissues from Nat1 deficient mice including white adipose tissue, heart and skeletal muscle. Nat1 deficient mice had changes in plasma metabolites and lipids consistent with a decreased ability to utilize fats for energy and a decrease in basal metabolic rate and exercise capacity without altered thermogenesis. Collectively, our results suggest that Nat1 deficiency results in mitochondrial dysfunction, which may constitute a mechanistic link between this gene and IR.
PMCID: PMC5097870  PMID: 27705799
NAT2; Nat1; Insulin resistance; Mitochondrial dysfunction; Adipose tissue
6.  Transcriptome profiling of patient-specific human iPSC-cardiomyocytes predicts individual drug safety and efficacy responses in vitro 
Cell stem cell  2016;19(3):311-325.
Understanding individual susceptibility to drug-induced cardiotoxicity is key to improving patient safety and preventing drug attrition. Human induced pluripotent stem cells (hiPSCs) enable the study of pharmacological and toxicological responses in patient-specific cardiomyocytes (CMs), and may serve as preclinical platforms for precision medicine. Transcriptome profiling in hiPSC-CMs from seven individuals lacking known cardiovascular disease-associated mutations, and in three isogenic human heart tissue and hiPSC-CM pairs, showed greater inter-patient variation than intra-patient variation, verifying that reprogramming and differentiation preserve patient-specific gene expression, particularly in metabolic and stress-response genes. Transcriptome-based toxicology analysis predicted and risk-stratified patient-specific susceptibility to cardiotoxicity, and functional assays in hiPSC-CMs using tacrolimus and rosiglitazone, drugs targeting pathways predicted to produce cardiotoxicity, validated inter-patient differential responses. CRISPR/Cas9-mediated pathway correction prevented drug-induced cardiotoxicity. Our data suggest that hiPSC-CMs can be used in vitro to predict and validate patient-specific drug safety and efficacy, potentially enabling future clinical approaches to precision medicine.
PMCID: PMC5087997  PMID: 27545504
Induced pluripotent stem cells; cardiomyocytes; personalized drug safety and efficacy; precision medicine
7.  Mango: a bias-correcting ChIA-PET analysis pipeline 
Bioinformatics  2015;31(19):3092-3098.
Motivation: Chromatin Interaction Analysis by Paired-End Tag sequencing (ChIA-PET) is an established method for detecting genome-wide looping interactions at high resolution. Current ChIA-PET analysis software packages either fail to correct for non-specific interactions due to genomic proximity or only address a fraction of the steps required for data processing. We present Mango, a complete ChIA-PET data analysis pipeline that provides statistical confidence estimates for interactions and corrects for major sources of bias including differential peak enrichment and genomic proximity.
Results: Comparison to the existing software packages, ChIA-PET Tool and ChiaSig, revealed that Mango interactions exhibit much better agreement with high-resolution Hi-C data. Importantly, Mango executes all steps required for processing ChIA-PET datasets, whereas ChiaSig only completes 20% of the required steps. Application of Mango to multiple available ChIA-PET datasets permitted the independent rediscovery of known trends in chromatin loops including enrichment of CTCF, RAD21, SMC3 and ZNF143 at the anchor regions of interactions and strong bias for convergent CTCF motifs.
Availability and implementation: Mango is open source and distributed through github at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC4592333  PMID: 26034063
8.  Multiple Pairwise Analysis of Non-homologous Centromere Coupling Reveals Preferential Chromosome Size-Dependent Interactions and a Role for Bouquet Formation in Establishing the Interaction Pattern 
PLoS Genetics  2016;12(10):e1006347.
During meiosis, chromosomes undergo a homology search in order to locate their homolog to form stable pairs and exchange genetic material. Early in prophase, chromosomes associate in mostly non-homologous pairs, tethered only at their centromeres. This phenomenon, conserved through higher eukaryotes, is termed centromere coupling in budding yeast. Both initiation of recombination and the presence of homologs are dispensable for centromere coupling (occurring in spo11 mutants and haploids induced to undergo meiosis) but the presence of the synaptonemal complex (SC) protein Zip1 is required. The nature and mechanism of coupling have yet to be elucidated. Here we present the first pairwise analysis of centromere coupling in an effort to uncover underlying rules that may exist within these non-homologous interactions. We designed a novel chromosome conformation capture (3C)-based assay to detect all possible interactions between non-homologous yeast centromeres during early meiosis. Using this variant of 3C-qPCR, we found a size-dependent interaction pattern, in which chromosomes assort preferentially with chromosomes of similar sizes, in haploid and diploid spo11 cells, but not in a coupling-defective mutant (spo11 zip1 haploid and diploid yeast). This pattern is also observed in wild-type diploids early in meiosis but disappears as meiosis progresses and homologous chromosomes pair. We found no evidence to support the notion that ancestral centromere homology plays a role in pattern establishment in S. cerevisiae post-genome duplication. Moreover, we found a role for the meiotic bouquet in establishing the size dependence of centromere coupling, as abolishing bouquet (using the bouquet-defective spo11 ndj1 mutant) reduces it. Coupling in spo11 ndj1 rather follows telomere clustering preferences. We propose that a chromosome size preference for centromere coupling helps establish efficient homolog recognition.
Author Summary
Meiosis enables sexual reproduction in eukaryotes by producing gametes. In the process, it increases genetic diversity through recombination of homologous chromosomes from the parents. Genetic diversity constitutes an evolutionary advantage. Prior to finding their unique pairing partner (homolog), chromosomes associate non-homologously with other chromosomes through their centromeres, a process termed centromere coupling. Little is known about the nature and mechanism of centromere coupling. In this study, we present the first pairwise characterization of this process conserved amongst eukaryotes, using the budding yeast as a model. We quantitatively analyzed the interactions between centromeres for each pair of chromosomes. We observed an interaction pattern based on chromosome size, where centromeres from smaller chromosomes frequently associated with those from other small chromosomes, and a similar association for large chromosomes. This pattern appears ubiquitous, since recombination-defective diploid cells, haploid cells forced to undergo meiosis, and wild-type yeast early in meiosis, until homologous chromosomes become paired, all undergo non-homologous centromere coupling. Centromeres derived from a common ancestor, prior to genome duplication, do not associate more often, excluding ancestral homology as the mechanism. Data from mutants affecting the meiotic bouquet, where all chromosome ends become embedded and clustered in the nuclear envelope prior to coupling, suggest a potential mechanism to generate interactions. Deciphering the mechanisms for proper pairing of homologous chromosomes helps us to understand and prevent chromosomal abnormalities in pregnancy.
PMCID: PMC5074576  PMID: 27768699
9.  Full-Length Isoform Sequencing Reveals Novel Transcripts and Substantial Transcriptional Overlaps in a Herpesvirus 
PLoS ONE  2016;11(9):e0162868.
Whole transcriptome studies have become essential for understanding the complexity of genetic regulation. However, the conventionally applied short-read sequencing platforms cannot be used to reliably distinguish between many transcript isoforms. The Pacific Biosciences (PacBio) RS II platform is capable of reading long nucleic acid stretches in a single sequencing run. The pseudorabies virus (PRV) is an excellent system to study herpesvirus gene expression and potential interactions between the transcriptional units. In this work, non-amplified and amplified isoform sequencing protocols were used to characterize the poly(A+) fraction of the lytic transcriptome of PRV, with the aim of a complete transcriptional annotation of the viral genes. The analyses revealed a previously unrecognized complexity of the PRV transcriptome including the discovery of novel protein-coding and non-coding genes, novel mono- and polycistronic transcription units, as well as extensive transcriptional overlaps between neighboring and distal genes. This study identified non-coding transcripts overlapping all three replication origins of the PRV, which might play a role in the control of DNA synthesis. We additionally established the relative expression levels of gene products. Our investigations revealed that the whole PRV genome is utilized for transcription, including both DNA strands in all coding and intergenic regions. The genome-wide occurrence of transcript overlaps suggests a crosstalk between genes through a network formed by interacting transcriptional machineries with a potential function in the control of gene expression.
PMCID: PMC5042381  PMID: 27685795
10.  ChIA-PET2: a versatile and flexible pipeline for ChIA-PET data analysis 
Nucleic Acids Research  2016;45(1):e4.
ChIA-PET2 is a versatile and flexible pipeline for analyzing different types of ChIA-PET data from raw sequencing reads to chromatin loops. ChIA-PET2 integrates all steps required for ChIA-PET data analysis, including linker trimming, read alignment, duplicate removal, peak calling and chromatin loop calling. It supports different kinds of ChIA-PET data generated from different ChIA-PET protocols and also provides quality controls for different steps of ChIA-PET analysis. In addition, ChIA-PET2 can use phased genotype data to call allele-specific chromatin interactions. We applied ChIA-PET2 to different ChIA-PET datasets, demonstrating its significantly improved performance as well as its ability to easily process ChIA-PET raw data. ChIA-PET2 is available at
PMCID: PMC5224499  PMID: 27625391
11.  Exome Sequencing of Neonatal Blood Spots and the Identification of Genes Implicated in Bronchopulmonary Dysplasia 
Rationale: Bronchopulmonary dysplasia (BPD), a prevalent severe lung disease of premature infants, has a strong genetic component. Large-scale genome-wide association studies for common variants have not revealed its genetic basis.
Objectives: Given the historical high mortality rate of extremely preterm infants who now survive and develop BPD, we hypothesized that risk loci underlying this disease are under severe purifying selection during evolution; thus, rare variants likely explain greater risk of the disease.
Methods: We performed exome sequencing on 50 BPD-affected and unaffected twin pairs using DNA isolated from neonatal blood spots and identified genes affected by extremely rare nonsynonymous mutations. Functional genomic approaches were then used to systematically compare these affected genes.
Measurements and Main Results: We identified 258 genes with rare nonsynonymous mutations in patients with BPD. These genes were highly enriched for processes involved in pulmonary structure and function including collagen fibril organization, morphogenesis of embryonic epithelium, and regulation of Wnt signaling pathway; displayed significantly elevated expression in fetal and adult lungs; and were substantially up-regulated in a murine model of BPD. Analyses of mouse mutants revealed their phenotypic enrichment for embryonic development and the cyanosis phenotype, a clinical manifestation of BPD.
Conclusions: Our study supports the role of rare variants in BPD, in contrast with the role of common variants targeted by genome-wide association studies. Overall, our study is the first to sequence BPD exomes from newborn blood spot samples and identify with high confidence genes implicated in BPD, thereby providing important insights into its biology and molecular etiology.
PMCID: PMC4595691  PMID: 26030808
exome sequencing; chronic lung disease; bronchopulmonary dysplasia; genetic predisposition to disease; premature
12.  Genetic Control of Chromatin States in Humans Involves Local and Distal Chromosomal Interactions 
Cell  2015;162(5):1051-1065.
Deciphering the impact of genetic variants on gene regulation is fundamental to understanding human disease. Although gene regulation often involves long-range interactions, it is unknown to what extent non-coding genetic variants influence distal molecular phenotypes. Here, we integrate chromatin profiling for three histone marks in lymphoblastoid cell lines (LCLs) from 75 sequenced individuals with LCL-specific Hi-C and ChIA-PET-based chromatin contact maps to uncover one of the largest collections of local and distal histone quantitative trait loci (hQTLs). Distal QTLs are enriched within topologically associated domains and exhibit largely concordant variation of chromatin state coordinated by proximal and distal non-coding genetic variants. Histone QTLs are enriched for common variants associated with autoimmune diseases and enable identification of putative target genes of disease-associated variants from genome-wide association studies. These analyses provide insights into how genetic variation can affect human disease phenotypes by coordinated changes in chromatin at interacting regulatory elements.
PMCID: PMC4556133  PMID: 26300125
13.  Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features 
Nature Communications  2016;7:12474.
Lung cancer is the most prevalent cancer worldwide, and histopathological assessment is indispensable for its diagnosis. However, human evaluation of pathology slides cannot accurately predict patients' prognoses. In this study, we obtain 2,186 haematoxylin and eosin stained histopathology whole-slide images of lung adenocarcinoma and squamous cell carcinoma patients from The Cancer Genome Atlas (TCGA), and 294 additional images from Stanford Tissue Microarray (TMA) Database. We extract 9,879 quantitative image features and use regularized machine-learning methods to select the top features and to distinguish shorter-term survivors from longer-term survivors with stage I adenocarcinoma (P<0.003) or squamous cell carcinoma (P=0.023) in the TCGA data set. We validate the survival prediction framework with the TMA cohort (P<0.036 for both tumour types). Our results suggest that automatically derived image features can predict the prognosis of lung cancer patients and thereby contribute to precision oncology. Our methods are extensible to histopathology images of other organs.
Diagnosis of lung cancer through manual histopathology evaluation is insufficient to predict patient survival. Here, the authors use computerized image processing to identify diagnostically relevant image features and use these features to distinguish lung cancer patients with different prognoses.
PMCID: PMC4990706  PMID: 27527408
14.  RNA Sequencing Analysis Detection of a Novel Pathway of Endothelial Dysfunction in Pulmonary Arterial Hypertension 
Rationale: Pulmonary arterial hypertension is characterized by endothelial dysregulation, but global changes in gene expression have not been related to perturbations in function.
Objectives: RNA sequencing was used to discriminate changes in transcriptomes of endothelial cells cultured from lungs of patients with idiopathic pulmonary arterial hypertension versus control subjects and to assess the functional significance of major differentially expressed transcripts.
Methods: The endothelial transcriptomes from the lungs of seven control subjects and six patients with idiopathic pulmonary arterial hypertension were analyzed. Differentially expressed genes were related to bone morphogenetic protein type 2 receptor (BMPR2) signaling. Those down-regulated were assessed for function in cultured cells and in a transgenic mouse.
Measurements and Main Results: Fold differences in 10 genes were significant (P < 0.05), four increased and six decreased in patients versus control subjects. No patient was mutant for BMPR2. However, knockdown of BMPR2 by siRNA in control pulmonary arterial endothelial cells recapitulated 6 of 10 patient-related gene changes, including decreased collagen IV (COL4A1, COL4A2) and ephrinA1 (EFNA1). Reduction of BMPR2-regulated transcripts was related to decreased β-catenin. Reducing COL4A1, COL4A2, and EFNA1 by siRNA inhibited pulmonary endothelial adhesion, migration, and tube formation. In mice null for the EFNA1 receptor, EphA2, versus control animals, vascular endothelial growth factor receptor blockade and hypoxia caused more severe pulmonary hypertension, judged by elevated right ventricular systolic pressure, right ventricular hypertrophy, and loss of small arteries.
Conclusions: The novel relationship between BMPR2 dysfunction and reduced expression of endothelial COL4 and EFNA1 may underlie vulnerability to injury in pulmonary arterial hypertension.
PMCID: PMC4584250  PMID: 26030479
pulmonary hypertension; vascular endothelium; transcriptome; ephrin; collagen IV
15.  EPHB4 kinase–inactivating mutations cause autosomal dominant lymphatic-related hydrops fetalis 
The Journal of Clinical Investigation  126(8):3080-3088.
Hydrops fetalis describes fluid accumulation in at least 2 fetal compartments, including abdominal cavities, pleura, and pericardium, or in body tissue. The majority of hydrops fetalis cases are nonimmune conditions that present with generalized edema of the fetus, and approximately 15% of these nonimmune cases result from a lymphatic abnormality. Here, we have identified an autosomal dominant, inherited form of lymphatic-related (nonimmune) hydrops fetalis (LRHF). Independent exome sequencing projects on 2 families with a history of in utero and neonatal deaths associated with nonimmune hydrops fetalis uncovered 2 heterozygous missense variants in the gene encoding Eph receptor B4 (EPHB4). Biochemical analysis determined that the mutant EPHB4 proteins are devoid of tyrosine kinase activity, indicating that loss of EPHB4 signaling contributes to LRHF pathogenesis. Further, inactivation of Ephb4 in lymphatic endothelial cells of developing mouse embryos led to defective lymphovenous valve formation and consequent subcutaneous edema. Together, these findings identify EPHB4 as a critical regulator of early lymphatic vascular development and demonstrate that mutations in the gene can cause an autosomal dominant form of LRHF that is associated with a high mortality rate.
PMCID: PMC4966301  PMID: 27400125
16.  Effects of cellular origin on differentiation of human induced pluripotent stem cell–derived endothelial cells 
JCI insight  2016;1(8):e85558.
Human induced pluripotent stem cells (iPSCs) can be derived from various types of somatic cells by transient overexpression of 4 Yamanaka factors (OCT4, SOX2, C-MYC, and KLF4). Patient-specific iPSC derivatives (e.g., neuronal, cardiac, hepatic, muscular, and endothelial cells [ECs]) hold great promise in drug discovery and regenerative medicine. In this study, we aimed to evaluate whether the cellular origin can affect the differentiation, in vivo behavior, and single-cell gene expression signatures of human iPSC–derived ECs. We derived human iPSCs from 3 types of somatic cells of the same individuals: fibroblasts (FB-iPSCs), ECs (EC-iPSCs), and cardiac progenitor cells (CPC-iPSCs). We then differentiated them into ECs by sequential administration of Activin, BMP4, bFGF, and VEGF. EC-iPSCs at early passage (10 < P < 20) showed higher EC differentiation propensity and gene expression of EC-specific markers (PECAM1 and NOS3) than FB-iPSCs and CPC-iPSCs. In vivo transplanted EC-iPSC–ECs were recovered with a higher percentage of CD31+ population and expressed higher EC-specific gene expression markers (PECAM1, KDR, and ICAM) as revealed by microfluidic single-cell quantitative PCR (qPCR). In vitro EC-iPSC–ECs maintained a higher CD31+ population than FB-iPSC–ECs and CPC-iPSC–ECs with long-term culturing and passaging. These results indicate that cellular origin may influence lineage differentiation propensity of human iPSCs; hence, the somatic memory carried by early passage iPSCs should be carefully considered before clinical translation.
PMCID: PMC4937999  PMID: 27398408
17.  Synthetic long read sequencing reveals the composition and intraspecies diversity of the human microbiome 
Nature biotechnology  2015;34(1):64-69.
Identifying bacterial strains in metagenome and microbiome samples using computational analyses of short-read sequence remains a difficult problem. Here, we present an analysis of a human gut microbiome using Tru-seq synthetic long reads combined with new computational tools for metagenomic long-read assembly, variant-calling and haplotyping (Nanoscope and Lens). Our analysis identifies 178 bacterial species of which 51 were not found using short sequence reads alone. We recover bacterial contigs that comprise multiple operons, including 22 contigs of >1Mbp. Extensive intraspecies variation among microbial strains in the form of haplotypes that span up to hundreds of Kbp can be observed using our approach. Our method incorporates synthetic long-read sequencing technology with standard shotgun approaches to move towards rapid, precise and comprehensive analyses of metagenome and microbiome samples.
PMCID: PMC4884093  PMID: 26655498
18.  A Critical Role of the PTEN/PDGF Signaling Network for the Regulation of Radiosensitivity in Adenocarcinoma of the Prostate 
Loss or mutation of the phosphatase and tensin homologue (PTEN) is a common genetic abnormality in prostate cancer (PCa) and induces platelet-derived growth factor D (PDGF D) signaling. We examined the role of the PTEN/PDGF axis on radioresponse using a murine PTEN null prostate epithelial cell model.
Methods and Materials
PTEN wild-type (PTEN+/+) and PTEN knockout (PTEN−/−) murine prostate epithelial cell lines were used to examine the relationship between the PTEN status and radiosensitivity and also to modulate the PDGF D expression levels. PTEN−/− cells were transduced with a small hairpin RNA (shRNA) lentiviral vector containing either scrambled nucleotides (SCRM) or sequences targeted to PDGF D (shPDGF D). Tumorigenesis and morphogenesis of these cell lines were evaluated in vivo via subcutaneous injection of male nude mice and in vitro using Matrigel 3-dimensional (3D) culture. Effects of irradiation on clonogenic survival, cell migration, and invasion were measured with respect to the PTEN status and the PDGF D expression level. In addition, apoptosis and cell cycle redistribution were examined as potential mechanisms for differences seen.
PTEN−/− cells were highly tumorigenic in animals and effectively formed foci in 3D culture. Importantly, loss of PDGF D in these cell lines drastically diminished these phenotypes. Furthermore, PTEN−/− cells demonstrated increased clonogenic survival in vitro compared to PTEN+/+, and attenuation of PDGF D significantly reversed this radioresistant phenotype. PTEN−/− cells displayed greater migratory and invasive potential at baseline as well as after irradiation. Both the basal and radiation-induced migratory and invasive phenotypes in PTEN−/− cells required PDGF D expression. Interestingly, these differences were independent of apoptosis and cell cycle redistribution, as they showed no significant difference.
We propose that PDGF D represents a potentially promising target for PCa treatment resistance in the absence of PTEN function, and warrants further laboratory evaluation and clinical study.
PMCID: PMC4920083  PMID: 24331662
19.  Identification of significantly mutated regions across cancer types highlights a rich landscape of functional molecular alterations 
Nature genetics  2015;48(2):117-125.
Cancer sequencing studies have primarily identified cancer-driver genes by the accumulation of protein-altering mutations. An improved method would be annotation-independent, sensitive to unknown distributions of functions within proteins, and inclusive of non-coding drivers. We employed density-based clustering methods in 21 tumor types to detect variably-sized significantly mutated regions (SMRs). SMRs reveal recurrent alterations across a spectrum of coding and non-coding elements, including transcription factor binding sites and untranslated regions mutated in up to ∼15% of specific tumor types. SMRs reveal spatial clustering of mutations at molecular domains and interfaces, often with associated changes in signaling. Mutation frequencies in SMRs demonstrate that distinct protein regions are differentially mutated among tumor types, as exemplified by a linker region of PIK3CA in which biophysical simulations suggest mutations affect regulatory interactions. The functional diversity of SMRs underscores both the varied mechanisms of oncogenic misregulation and the advantage of functionally-agnostic driver identification.
PMCID: PMC4731297  PMID: 26691984
Significantly Mutated Regions (SMRs); cancer; exome sequencing; non-coding variation
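The idea of detecting variably-sized mutated regions by density can be illustrated with a minimal sketch. This is a simplified gap-based grouping in the spirit of density-based clustering, not the method used in the study above; the function name `cluster_positions`, the parameters `eps` and `min_pts`, and the mutation coordinates are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def cluster_positions(
    positions: List[int], eps: int, min_pts: int
) -> List[Tuple[int, int, int]]:
    """Group 1D positions into dense regions.

    Consecutive sorted positions are chained together while the gap between
    neighbours is at most `eps`; a chain is reported as a region only if it
    contains at least `min_pts` positions.
    Returns (start, end, count) tuples.
    """
    regions: List[Tuple[int, int, int]] = []
    current: List[int] = []
    for pos in sorted(positions):
        # A gap larger than eps closes the current chain.
        if current and pos - current[-1] > eps:
            if len(current) >= min_pts:
                regions.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    # Flush the final chain.
    if len(current) >= min_pts:
        regions.append((current[0], current[-1], len(current)))
    return regions


# Hypothetical mutation coordinates along one chromosome (illustrative only).
mutations = [100, 104, 109, 500, 2000, 2003, 2005, 2011, 9000]
regions = cluster_positions(mutations, eps=10, min_pts=3)
print(regions)
```

Because region boundaries are set by where the mutations actually fall, region sizes vary with the data instead of being fixed windows, and no gene annotation is needed to define them. A real analysis would additionally need a statistical test against a background mutation rate, which this sketch omits.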
20.  Genome assembly from synthetic long read clouds 
Bioinformatics  2016;32(12):i216-i224.
Motivation: Despite rapid progress in sequencing technology, assembling de novo the genomes of new species as well as reconstructing complex metagenomes remain major technological challenges. New synthetic long read (SLR) technologies promise significant advances towards these goals; however, their applicability is limited by high sequencing requirements and the inability of current assembly paradigms to cope with combinations of short and long reads.
Results: Here, we introduce Architect, a new de novo scaffolder aimed at SLR technologies. Unlike previous assembly strategies, Architect does not require a costly subassembly step; instead it assembles genomes directly from the SLR’s underlying short reads, which we refer to as read clouds. This enables a 4- to 20-fold reduction in sequencing requirements and a 5-fold increase in assembly contiguity on both genomic and metagenomic datasets relative to state-of-the-art assembly strategies aimed directly at fully subassembled long reads.
Availability and Implementation: Our source code is freely available at
PMCID: PMC4908351  PMID: 27307620
21.  Effects of cellular origin on differentiation of human induced pluripotent stem cell–derived endothelial cells 
JCI Insight  2016;1(8):e85558.
Human induced pluripotent stem cells (iPSCs) can be derived from various types of somatic cells by transient overexpression of 4 Yamanaka factors (OCT4, SOX2, C-MYC, and KLF4). Patient-specific iPSC derivatives (e.g., neuronal, cardiac, hepatic, muscular, and endothelial cells [ECs]) hold great promise in drug discovery and regenerative medicine. In this study, we aimed to evaluate whether the cellular origin can affect the differentiation, in vivo behavior, and single-cell gene expression signatures of human iPSC–derived ECs. We derived human iPSCs from 3 types of somatic cells of the same individuals: fibroblasts (FB-iPSCs), ECs (EC-iPSCs), and cardiac progenitor cells (CPC-iPSCs). We then differentiated them into ECs by sequential administration of Activin, BMP4, bFGF, and VEGF. EC-iPSCs at early passage (10 < P < 20) showed higher EC differentiation propensity and gene expression of EC-specific markers (PECAM1 and NOS3) than FB-iPSCs and CPC-iPSCs. In vivo transplanted EC-iPSC–ECs were recovered with a higher percentage of CD31+ population and showed higher expression of EC-specific markers (PECAM1, KDR, and ICAM) as revealed by microfluidic single-cell quantitative PCR (qPCR). In vitro EC-iPSC–ECs maintained a higher CD31+ population than FB-iPSC–ECs and CPC-iPSC–ECs with long-term culturing and passaging. These results indicate that cellular origin may influence lineage differentiation propensity of human iPSCs; hence, the somatic memory carried by early passage iPSCs should be carefully considered before clinical translation.
Cellular origin influences the endothelial cell differentiation propensity of human iPSCs, suggesting that the source of iPSCs should be carefully considered before clinical translation.
PMCID: PMC4937999  PMID: 27398408
22.  High-Throughput Sequencing Technologies 
Molecular cell  2015;58(4):586-597.
The human genome sequence has profoundly altered our understanding of biology, human diversity and disease. The path from the first draft sequence to our nascent era of personal genomes and genomic medicine has been made possible only because of the extraordinary advancements in DNA sequencing technologies over the past ten years. Here, we discuss commonly used high-throughput sequencing platforms, the growing array of sequencing assays developed around them as well as the challenges facing current sequencing platforms and their clinical application.
PMCID: PMC4494749  PMID: 26000844
High-throughput sequencing; next-generation sequencing; genomics; sequencing applications; personalized medicine
23.  Comprehensive transcriptome analysis using synthetic long-read sequencing reveals molecular co-association of distant splicing events 
Nature biotechnology  2015;33(7):736-742.
Alternative splicing shapes mammalian transcriptomes, with many RNA molecules undergoing multiple distant alternative splicing events. Comprehensive transcriptome analysis, including analysis of exon co-association in the same molecule, requires deep, long-read sequencing. Here we introduce an RNA sequencing method, synthetic long-read RNA sequencing (SLR-RNA-seq), in which small pools (≤1,000 molecules/pool, ≤1 molecule/gene for most genes) of full-length cDNAs are amplified, fragmented and short-read-sequenced. We demonstrate that these RNA sequences reconstructed from the short reads from each of the pools are mostly close to full length and contain few insertion and deletion errors. We report many previously undescribed isoforms (human brain: ∼13,800 affected genes, 14.5% of molecules; mouse brain ∼8,600 genes, 18% of molecules) and up to 165 human distant molecularly associated exon pairs (dMAPs) and distant molecularly and mutually exclusive pairs (dMEPs). Of 16 associated pairs detected in the mouse brain, 9 are conserved in human. Our results indicate conserved mechanisms that can produce distant but phased features on transcript and proteome isoforms.
PMCID: PMC4832928  PMID: 25985263
24.  Impact of allele-specific peptides in proteome quantification 
Mass spectrometry-based proteome technologies have greatly improved our ability to detect and quantify proteomes across various biological samples. High-throughput bottom-up proteome profiling in combination with targeted mass spectrometry methods, e.g., selected reaction monitoring (SRM) assays, is emerging as a powerful approach in the field of biomarker discovery. In the past few years, an increasing number of studies have attempted to integrate genomic and proteomic data for biomarker discovery. Here we describe how allele-specific peptides can be applied in biomarker discovery and their impact on protein quantification.
PMCID: PMC4448739  PMID: 25676416
The decreasing cost of genotyping and genome sequencing has ushered in an era of genomic personalized medicine. More than 100,000 individuals have been genotyped by direct-to-consumer genetic testing services, which offer a glimpse into the interpretation and exploration of a personal genome. However, these interpretations, which require extensive manual curation, are subject to the preferences of the company and are not customizable by the individual. Academic institutions teaching personalized medicine, as well as genetic hobbyists, may prefer to customize their analysis and have full control over the content and method of interpretation. We present the Interpretome, a system for private genome interpretation, which contains all genotype information in client-side interpretation scripts, supported by server-side databases. We provide state-of-the-art analyses for teaching clinical implications of personal genomics, including disease risk assessment and pharmacogenomics. Additionally, we have implemented client-side algorithms for ancestry inference, demonstrating the power of these methods without excessive computation. Finally, the modular nature of the system allows for plugin capabilities for custom analyses. This system will allow for personal genome exploration without compromising privacy, facilitating hands-on courses in genomics and personalized medicine.
PMCID: PMC4809242  PMID: 22174289
