Results 1-16 (16)
 

1.  Hemogenic endocardium contributes to transient definitive hematopoiesis 
Nature communications  2013;4:1564.
Hematopoietic cells arise from spatiotemporally restricted domains in the developing embryo. Although studies of non-mammalian animal and in vitro embryonic stem cell models suggest a close relationship among cardiac, endocardial, and hematopoietic lineages, it remains unknown whether the mammalian heart tube serves as a hemogenic organ akin to the dorsal aorta. Here we examine the hemogenic activity of the developing endocardium. Mouse heart explants generate myeloid and erythroid colonies in the absence of circulation. Hemogenic activity arises from a subset of endocardial cells in the outflow cushion and atria earlier than in the aorta-gonad-mesonephros region, and is transient and definitive in nature. Interestingly, key cardiac transcription factors, Nkx2-5 and Isl1, are expressed in and required for the hemogenic population of the endocardium. Together, these data suggest that a subset of endocardial/endothelial cells expressing cardiac markers serve as a de novo source for transient definitive hematopoietic progenitors.
doi:10.1038/ncomms2569
PMCID: PMC3612528  PMID: 23463007
2.  Scl represses cardiomyogenesis in prospective hemogenic endothelium and endocardium 
Cell  2012;150(3):590-605.
Summary
Endothelium in embryonic hematopoietic tissues generates hematopoietic stem/progenitor cells; however, it is unknown how its unique potential is specified. We show that transcription factor Scl/Tal1 is essential for both establishing the hematopoietic transcriptional program in hemogenic endothelium and preventing its misspecification to a cardiomyogenic fate. Scl−/− embryos activated a cardiac transcriptional program in yolk sac endothelium, leading to the emergence of CD31+Pdgfrα+ cardiogenic precursors that generated spontaneously beating cardiomyocytes. Ectopic cardiogenesis was also observed in Scl−/− hearts, where the disorganized endocardium precociously differentiated into cardiomyocytes. Induction of mosaic deletion of Scl in Sclfl/fl Rosa26Cre-ERT2 embryos revealed a cell-intrinsic, temporal requirement for Scl to prevent cardiomyogenesis from endothelium. Scl−/− endothelium also upregulated the expression of Wnt antagonists, which promoted rapid cardiomyocyte differentiation of ectopic cardiogenic cells. These results reveal unexpected plasticity in embryonic endothelium such that loss of a single master regulator can induce ectopic cardiomyogenesis from endothelial cells.
doi:10.1016/j.cell.2012.06.026
PMCID: PMC3624753  PMID: 22863011
3.  Lymphoid Priming in Human Bone Marrow Begins Prior to CD10 Expression with Up-Regulation of L-selectin 
Nature immunology  2012;13(10):963-971.
The expression of CD10 has long been used to define human lymphoid commitment. We report a unique lymphoid-primed population in human bone marrow that was generated from hematopoietic stem cells (HSCs) before the onset of CD10 expression and B cell commitment. This subset was identified by high expression of the homing molecule L-selectin (CD62L). CD10−CD62Lhi progenitors possessed full lymphoid and monocytic potential, but lacked erythroid potential. Gene expression profiling placed the CD10−CD62Lhi population at an intermediate stage of differentiation between HSCs and lineage-negative (Lin−) CD34+CD10+ progenitors. L-selectin was expressed on immature thymocytes and its ligands were expressed at the cortico-medullary junction, suggesting a possible role in thymic homing. These studies identify the earliest stage of lymphoid priming in human bone marrow.
doi:10.1038/ni.2405
PMCID: PMC3448017  PMID: 22941246
4.  Expansion on Stromal Cells Preserves the Undifferentiated State of Human Hematopoietic Stem Cells Despite Compromised Reconstitution Ability 
PLoS ONE  2013;8(1):e53912.
Lack of HLA-matched hematopoietic stem cells (HSC) limits the number of patients with life-threatening blood disorders that can be treated by HSC transplantation. So far, insufficient understanding of the regulatory mechanisms governing human HSC has precluded the development of effective protocols for culturing HSC for therapeutic use and molecular studies. We defined a culture system using OP9M2 mesenchymal stem cell (MSC) stroma that protects human hematopoietic stem/progenitor cells (HSPC) from differentiation and apoptosis. In addition, it facilitates a dramatic expansion of multipotent progenitors that retain the immunophenotype (CD34+CD38−CD90+) characteristic of human HSPC and proliferative potential over several weeks in culture. In contrast, transplantable HSC could be maintained, but not significantly expanded, during 2-week culture. Temporal analysis of the transcriptome of the ex vivo expanded CD34+CD38−CD90+ cells documented remarkable stability of most transcriptional regulators known to govern the undifferentiated HSC state. Nevertheless, it revealed dynamic fluctuations in transcriptional programs that associate with HSC behavior and may compromise HSC function, such as dysregulation of PBX1 regulated genetic networks. This culture system serves now as a platform for modeling human multilineage hematopoietic stem/progenitor cell hierarchy and studying the complex regulation of HSC identity and function required for successful ex vivo expansion of transplantable HSC.
doi:10.1371/journal.pone.0053912
PMCID: PMC3547050  PMID: 23342037
5.  GFam: a platform for automatic annotation of gene families 
Nucleic Acids Research  2012;40(19):e152.
We have developed GFam, a platform for automatic annotation of gene/protein families. GFam provides a framework for genome initiatives and model organism resources to build domain-based families, derive meaningful functional labels, and seamlessly propagate functional annotation across periodic genome updates. GFam is a hybrid approach: it uses a greedy algorithm to chain component domains from the InterPro annotation provided by its 12 member resources, then applies a sequence-based connected component analysis to un-annotated sequence regions to derive a consensus domain architecture for each sequence, and finally generates families based on common architectures. For the proteome of Arabidopsis, our integrated approach increases sequence coverage by 7.2 percentage points and residue coverage by 14.6 percentage points relative to the best single-constituent database within InterPro. The true power of GFam lies in maximizing annotation provided by the different InterPro data sources that offer resource-specific coverage for different regions of a sequence. GFam’s capability to capture higher sequence and residue coverage can be useful for genome annotation, comparative genomics and functional studies. GFam is general-purpose software and can be used for any collection of protein sequences. The software is open source and can be obtained from http://www.paccanarolab.org/software/gfam/.
doi:10.1093/nar/gks631
PMCID: PMC3479161  PMID: 22790981
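The greedy domain-chaining and family-grouping steps described in this abstract can be illustrated with a small sketch. This is not GFam's actual code: the hit format, the "longest hit first" priority rule, and the function names `greedy_architecture` and `group_by_architecture` are assumptions made purely for illustration, and the connected component analysis of un-annotated regions is omitted.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical hit format: (domain_id, start, end), 1-based inclusive coordinates.
Hit = Tuple[str, int, int]


def greedy_architecture(hits: List[Hit]) -> List[str]:
    """Greedily keep non-overlapping hits (longest first, an assumed priority)
    and return the kept domain ids in sequence order."""
    chosen: List[Hit] = []
    # start - end is negative length, so ascending sort puts the longest hit first;
    # ties are broken by start position.
    for hit in sorted(hits, key=lambda h: (h[1] - h[2], h[1])):
        _, start, end = hit
        # Accept the hit only if it does not overlap any hit already chosen.
        if all(end < s or start > e for _, s, e in chosen):
            chosen.append(hit)
    chosen.sort(key=lambda h: h[1])
    return [domain_id for domain_id, _, _ in chosen]


def group_by_architecture(annotations: Dict[str, List[Hit]]) -> Dict[Tuple[str, ...], List[str]]:
    """Group sequence ids into families that share the same domain architecture."""
    families: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for seq_id, hits in annotations.items():
        families[tuple(greedy_architecture(hits))].append(seq_id)
    return dict(families)
```

Under these assumptions, two sequences whose best non-overlapping hits resolve to the same ordered list of domains end up in the same family, regardless of which member database reported each hit.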
6.  The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools 
Nucleic Acids Research  2011;40(D1):D1202-D1210.
The Arabidopsis Information Resource (TAIR, http://arabidopsis.org) is a genome database for Arabidopsis thaliana, an important reference organism for many fundamental aspects of biology as well as basic and applied plant biology research. TAIR serves as a central access point for Arabidopsis data, annotates gene function and expression patterns using controlled vocabulary terms, and maintains and updates the A. thaliana genome assembly and annotation. TAIR also provides researchers with an extensive set of visualization and analysis tools. Recent developments include several new genome releases (TAIR8, TAIR9 and TAIR10) in which the A. thaliana assembly was updated, pseudogenes and transposon genes were re-annotated, and new data from proteomics and next generation transcriptome sequencing were incorporated into gene models and splice variants. Other highlights include progress on functional annotation of the genome and the release of several new tools including Textpresso for Arabidopsis which provides the capability to carry out full text searches on a large body of research literature.
doi:10.1093/nar/gkr1090
PMCID: PMC3245047  PMID: 22140109
7.  Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project 
Gerstein, Mark B. | Lu, Zhi John | Van Nostrand, Eric L. | Cheng, Chao | Arshinoff, Bradley I. | Liu, Tao | Yip, Kevin Y. | Robilotto, Rebecca | Rechtsteiner, Andreas | Ikegami, Kohta | Alves, Pedro | Chateigner, Aurelien | Perry, Marc | Morris, Mitzi | Auerbach, Raymond K. | Feng, Xin | Leng, Jing | Vielle, Anne | Niu, Wei | Rhrissorrakrai, Kahn | Agarwal, Ashish | Alexander, Roger P. | Barber, Galt | Brdlik, Cathleen M. | Brennan, Jennifer | Brouillet, Jeremy Jean | Carr, Adrian | Cheung, Ming-Sin | Clawson, Hiram | Contrino, Sergio | Dannenberg, Luke O. | Dernburg, Abby F. | Desai, Arshad | Dick, Lindsay | Dosé, Andréa C. | Du, Jiang | Egelhofer, Thea | Ercan, Sevinc | Euskirchen, Ghia | Ewing, Brent | Feingold, Elise A. | Gassmann, Reto | Good, Peter J. | Green, Phil | Gullier, Francois | Gutwein, Michelle | Guyer, Mark S. | Habegger, Lukas | Han, Ting | Henikoff, Jorja G. | Henz, Stefan R. | Hinrichs, Angie | Holster, Heather | Hyman, Tony | Iniguez, A. Leo | Janette, Judith | Jensen, Morten | Kato, Masaomi | Kent, W. James | Kephart, Ellen | Khivansara, Vishal | Khurana, Ekta | Kim, John K. | Kolasinska-Zwierz, Paulina | Lai, Eric C. | Latorre, Isabel | Leahey, Amber | Lewis, Suzanna | Lloyd, Paul | Lochovsky, Lucas | Lowdon, Rebecca F. | Lubling, Yaniv | Lyne, Rachel | MacCoss, Michael | Mackowiak, Sebastian D. | Mangone, Marco | McKay, Sheldon | Mecenas, Desirea | Merrihew, Gennifer | Miller, David M. | Muroyama, Andrew | Murray, John I. | Ooi, Siew-Loon | Pham, Hoang | Phippen, Taryn | Preston, Elicia A. | Rajewsky, Nikolaus | Rätsch, Gunnar | Rosenbaum, Heidi | Rozowsky, Joel | Rutherford, Kim | Ruzanov, Peter | Sarov, Mihail | Sasidharan, Rajkumar | Sboner, Andrea | Scheid, Paul | Segal, Eran | Shin, Hyunjin | Shou, Chong | Slack, Frank J. | Slightam, Cindie | Smith, Richard | Spencer, William C. | Stinson, E. O. | Taing, Scott | Takasaki, Teruaki | Vafeados, Dionne | Voronina, Ksenia | Wang, Guilin | Washington, Nicole L. | Whittle, Christina M. 
| Wu, Beijing | Yan, Koon-Kiu | Zeller, Georg | Zha, Zheng | Zhong, Mei | Zhou, Xingliang | Ahringer, Julie | Strome, Susan | Gunsalus, Kristin C. | Micklem, Gos | Liu, X. Shirley | Reinke, Valerie | Kim, Stuart K. | Hillier, LaDeana W. | Henikoff, Steven | Piano, Fabio | Snyder, Michael | Stein, Lincoln | Lieb, Jason D. | Waterston, Robert H.
Science (New York, N.Y.)  2010;330(6012):1775-1787.
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
doi:10.1126/science.1196914
PMCID: PMC3142569  PMID: 21177976
8.  Comparison and calibration of transcriptome data from RNA-Seq and tiling arrays 
BMC Genomics  2010;11:383.
Background
Tiling arrays have been the tool of choice for probing an organism's transcriptome without prior assumptions about the transcribed regions, but RNA-Seq is becoming a viable alternative as the costs of sequencing continue to decrease. Understanding the relative merits of these technologies will help researchers select the appropriate technology for their needs.
Results
Here, we compare these two platforms using a matched sample of poly(A)-enriched RNA isolated from the second larval stage of C. elegans. We find that the raw signals from these two technologies are reasonably well correlated but that RNA-Seq outperforms tiling arrays in several respects, notably in exon boundary detection and dynamic range of expression. By exploring the accuracy of sequencing as a function of depth of coverage, we found that about 4 million reads are required to match the sensitivity of two tiling array replicates. The effects of cross-hybridization were analyzed using a "nearest neighbor" classifier applied to array probes; we describe a method for determining potential "black list" regions whose signals are unreliable. Finally, we propose a strategy for using RNA-Seq data as a gold standard set to calibrate tiling array data. All tiling array and RNA-Seq data sets have been submitted to the modENCODE Data Coordinating Center.
Conclusions
Tiling arrays effectively detect transcript expression levels at a low cost for many species while RNA-Seq provides greater accuracy in several regards. Researchers will need to carefully select the technology appropriate to the biological investigations they are undertaking. It will also be important to reconsider a comparison such as ours as sequencing technologies continue to evolve.
doi:10.1186/1471-2164-11-383
PMCID: PMC3091629  PMID: 20565764
9.  SCPS: a fast implementation of a spectral method for detecting protein families on a genome-wide scale 
BMC Bioinformatics  2010;11:120.
Background
An important problem in genomics is the automatic inference of groups of homologous proteins from pairwise sequence similarities. Several approaches have been proposed for this task which are "local" in the sense that they assign a protein to a cluster based only on the distances between that protein and the other proteins in the set. It was shown recently that global methods such as spectral clustering have better performance on a wide variety of datasets. However, currently available implementations of spectral clustering methods mostly consist of a few loosely coupled Matlab scripts that assume a fair amount of familiarity with Matlab programming and hence they are inaccessible for large parts of the research community.
Results
SCPS (Spectral Clustering of Protein Sequences) is an efficient and user-friendly implementation of a spectral method for inferring protein families. The method uses only pairwise sequence similarities, and is therefore practical when only sequence information is available. SCPS was tested on difficult sets of proteins whose relationships were extracted from the SCOP database, and its results were extensively compared with those obtained using other popular protein clustering algorithms such as TribeMCL, hierarchical clustering and connected component analysis. We show that SCPS is able to identify many of the family/superfamily relationships correctly and that the quality of the obtained clusters as indicated by their F-scores is consistently better than all the other methods we compared it with. We also demonstrate the scalability of SCPS by clustering the entire SCOP database (14,183 sequences) and the complete genome of the yeast Saccharomyces cerevisiae (6,690 sequences).
Conclusions
Besides the spectral method, SCPS implements connected component analysis and hierarchical clustering, integrates TribeMCL, provides different cluster quality tools, can extract human-readable protein descriptions using GI numbers from NCBI, interfaces with external tools such as BLAST and Cytoscape, and can produce publication-quality graphical representations of the clusters obtained. It thus constitutes a comprehensive and effective tool for practical research in computational biology. Source code and precompiled executables for Windows, Linux and Mac OS X are freely available at http://www.paccanarolab.org/software/scps.
doi:10.1186/1471-2105-11-120
PMCID: PMC2841596  PMID: 20214776
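Connected component analysis, one of the baseline methods named in this abstract, can be sketched in a few lines. This is a minimal illustration and not SCPS code: the input format (a dictionary of pairwise similarity scores), the `threshold` parameter, and the function name `connected_components` are assumptions.

```python
from typing import Dict, Iterable, List, Set, Tuple


def connected_components(
    ids: Iterable[str],
    similarities: Dict[Tuple[str, str], float],
    threshold: float,
) -> List[Set[str]]:
    """Cluster proteins by linking every pair whose similarity is >= threshold
    and returning the connected components (union-find)."""
    ids = list(ids)
    parent = {i: i for i in ids}

    def find(x: str) -> str:
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarities.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters: Dict[str, Set[str]] = {}
    for i in ids:
        clusters.setdefault(find(i), set()).add(i)
    # Sort clusters by their sorted members so the output order is deterministic.
    return sorted(clusters.values(), key=lambda c: sorted(c))
```

This kind of clustering is "local" in the sense the abstract describes: a single strong pairwise link is enough to merge two groups, which is one reason a global method such as spectral clustering can behave differently on the same similarity data.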
10.  An approach to compare genome tiling microarray and MPSS sequencing data for transcript mapping 
BMC Research Notes  2009;2:211.
We are correcting the abstract of our published article ([1]). The sentence that starts "We observe that 4.5% of MPSS tags...." was not scientifically complete in the original abstract, having only two of the four numbers required to describe a comparison of two technologies in two different organisms. The abstract below more accurately describes our findings, as documented in Figure 1 of the manuscript.
doi:10.1186/1756-0500-2-211
PMCID: PMC2770075
11.  An approach to comparing tiling array and high throughput sequencing technologies for genomic transcript mapping 
BMC Research Notes  2009;2:150.
Background
There are two main technologies for transcriptome profiling, namely, tiling microarrays and high-throughput sequencing. Recently there has been a tremendous amount of excitement about the latter because of the advent of next-generation sequencing technologies and its promises. Consequently, the question of the moment is how these two technologies compare. Here we attempt to develop an approach to do a fair comparison of transcripts identified from tiling microarray and MPSS sequencing data.
Findings
This comparison is a challenging task because the sequencing data is discrete while the tiling array data is continuous. We use the published rice and Arabidopsis datasets, which provide the best currently available matched sets of array and sequencing experiments, using a slightly earlier generation of sequencing, the MPSS tag sequencing technology. After scoring the arrays consistently in both organisms, a first-pass comparison reveals a surprisingly small overlap in transcripts: 22% in rice and 66% in Arabidopsis. However, when we do the analysis in detail, we find that this is an underestimate. In particular, when we map the probe intensities onto the sequencing tags and then look at their intensity distribution, we see that they are very similar to exons. Furthermore, restricting our comparison to only protein-coding gene loci revealed a very good overlap between the two technologies.
Conclusion
Our approach to compare genome tiling microarray and MPSS sequencing data suggests that there is actually a reasonable overlap in transcripts identified by the two technologies. This overlap is distorted by the scoring and thresholding in the tiling array scoring procedure.
doi:10.1186/1756-0500-2-150
PMCID: PMC2764720  PMID: 19630981
12.  Domain Insertions in Protein Structures 
Journal of molecular biology  2004;338(4):633-641.
Domains are the structural, functional or evolutionary units of proteins. Proteins can comprise a single domain or a combination of domains. In multi-domain proteins, the domains almost always occur end-to-end, i.e., one domain follows the C-terminal end of another domain. However, there are exceptions to this common pattern, where multi-domain proteins are formed by insertion of one domain (insert) into another domain (parent). Here, we provide a quantitative description of known insertions in the Protein Data Bank (PDB). We found that 9% of domain combinations observed in non-redundant PDB are insertions. Although 90% of all insertions involve only one insert, proteins can clearly have multiple (nested, two-domain and three-domain) inserts. We also observed correlations between the structure and function of a domain and its tendency to be found as a parent or an insert. There is a bias in insert position towards the C terminus of parents. We observed that the atomic distance between the N and C terminus of an insert is significantly smaller when compared to the N-to-C distance in a parent context or a single domain context. Insertions are found always to occur in loop regions of parent domains. Our observations regarding the relationship between domain insertions and the structure, function and evolution of proteins have implications for protein engineering.
doi:10.1016/j.jmb.2004.03.039
PMCID: PMC2665287  PMID: 15099733
domain insertion; inserted domain; discontinuous domains; non-contiguous domains; protein engineering
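The parent/insert relationship defined in this abstract can be made concrete with a short sketch. It assumes each domain is given as a list of residue segments and treats a domain as an insert when it lies entirely inside a gap between two segments of another (discontinuous) domain. The data format and the function name `find_insertions` are illustrative assumptions, not the authors' procedure.

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, int]  # (start, end) residue numbers, inclusive


def find_insertions(domains: Dict[str, List[Segment]]) -> List[Tuple[str, str]]:
    """Return (parent, insert) pairs where the insert domain sits wholly inside
    a gap between two consecutive segments of the parent domain."""
    pairs: List[Tuple[str, str]] = []
    for parent, segments in domains.items():
        ordered = sorted(segments)
        # Gaps between consecutive parent segments: (end of one, start of next).
        gaps = [(a_end, b_start) for (_, a_end), (b_start, _) in zip(ordered, ordered[1:])]
        for insert, insert_segments in domains.items():
            if insert == parent:
                continue
            lo = min(start for start, _ in insert_segments)
            hi = max(end for _, end in insert_segments)
            if any(gap_lo < lo and hi < gap_hi for gap_lo, gap_hi in gaps):
                pairs.append((parent, insert))
    return pairs
```

Two domains arranged end-to-end produce no pairs under this definition, because a single-segment domain has no internal gap in which another domain could sit.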
13.  Transmembrane Protein Oxygen Content and Compartmentalization of Cells 
PLoS ONE  2008;3(7):e2726.
Recently, there was a report that explored the oxygen content of transmembrane proteins over macroevolutionary time scales, in which the authors observed a correlation between the geological time of appearance of compartmentalized cells and atmospheric oxygen concentration. The authors predicted, characterized and correlated the differences in the structure and composition of transmembrane proteins from the three kingdoms of life with atmospheric oxygen concentrations on a geological timescale. They hypothesized that transmembrane proteins in ancient taxa selectively excluded oxygen and that, as this constraint relaxed over time with the increase in atmospheric oxygen levels, the size and number of communication-related transmembrane proteins increased. In summary, they concluded that compartmentalized and non-compartmentalized cells can be distinguished by how oxygen is partitioned at the proteome level. They derived this conclusion from an analysis of 19 taxa. We extended their analysis to a larger sample of taxa comprising 309 eubacterial, 34 archaeal, and 30 eukaryotic complete proteomes and observed that one cannot absolutely separate the two groups of cells based on the partition of oxygen in their membrane proteins. In addition, the origin of compartmentalized cells is likely to have been driven by an innovation that happened 2700 million years ago in the membrane composition of cells and that led to the evolution of endocytosis and exocytosis, rather than by the rise in concentration of atmospheric oxygen.
doi:10.1371/journal.pone.0002726
PMCID: PMC2443287  PMID: 18628944
15.  Global Identification and Characterization of Transcriptionally Active Regions in the Rice Genome 
PLoS ONE  2007;2(3):e294.
Genome tiling microarray studies have consistently documented rich transcriptional activity beyond the annotated genes. However, systematic characterization and transcriptional profiling of the putative novel transcripts on the genome scale are still lacking. We report here the identification of 25,352 and 27,744 transcriptionally active regions (TARs) not encoded by annotated exons in the rice (Oryza sativa) subspecies japonica and indica, respectively. The non-exonic TARs account for approximately two thirds of the total TARs detected by tiling arrays and represent transcripts likely conserved between japonica and indica. Transcription of 21,018 (83%) japonica non-exonic TARs was verified through expression profiling in 10 tissue types using a re-array in which annotated genes and TARs were each represented by five independent probes. Subsequent analyses indicate that about 80% of the japonica TARs that were not assigned to annotated exons can be assigned to various putatively functional or structural elements of the rice genome, including splice variants, uncharacterized portions of incompletely annotated genes, antisense transcripts, duplicated gene fragments, and potential non-coding RNAs. These results provide a systematic characterization of non-exonic transcripts in rice and thus expand the current view of the complexity and dynamics of the rice transcriptome.
doi:10.1371/journal.pone.0000294
PMCID: PMC1808428  PMID: 17372628
16.  DomIns: a web resource for domain insertions in known protein structures 
Nucleic Acids Research  2004;32(Database issue):D193-D195.
Proteins can be formed by single or multiple domains. The process of recombination at the molecular level has generated a wide variety of multi-domain proteins with specific domain organization to cater to the functional requirements of an organism. The functional and structural costs of inserting a domain into another mean that multi-domain proteins are usually formed by covalently linking the N-terminus of one domain to the C-terminus of the preceding domain. While this is true in a large proportion of multi-domain proteins, we find a significant fraction of proteins that are the result of domain insertion. The inserted domain breaks the sequence contiguity of the domain into which it is inserted, leading to a novel domain organization. This web resource aims to document domain insertions in known protein structures that are classified in the SCOP database. The web server can be accessed from http://stash.mrc-lmb.cam.ac.uk/DomIns/.
doi:10.1093/nar/gkh047
PMCID: PMC308781  PMID: 14681392
