1.  Epigenetic repression of miR-31 disrupts androgen receptor homeostasis and contributes to prostate cancer progression 
Cancer research  2012;73(3):1232-1244.
Androgen receptor (AR) signaling plays a critical role in prostate cancer (PCA) pathogenesis. Yet, the regulation of AR signaling remains elusive. Even with stringent androgen deprivation therapy, AR signaling persists. Here, our data suggest that there is a complex interaction between the expression of the tumor suppressor miRNA miR-31 and AR signaling. We examined primary and metastatic PCA and found that miR-31 expression was reduced as a result of promoter hypermethylation and, importantly, that the level of miR-31 expression was inversely correlated with the aggressiveness of the disease. As the expression of AR and miR-31 was inversely correlated in the cell lines, our study further suggested that miR-31 and AR could mutually repress each other. Upregulation of miR-31 effectively suppressed AR expression through multiple mechanisms and inhibited PCA growth in vivo. Notably, we found that miR-31 targeted AR directly at a site located in the coding region, which was commonly mutated in PCA. Additionally, miR-31 suppressed cell cycle regulators, including E2F1, E2F2, EXO1, FOXM1, and MCM2. Together, our findings suggest a novel AR regulatory mechanism mediated through miR-31 expression. The downregulation of miR-31 may disrupt cellular homeostasis and contribute to the evolution and progression of PCA. We provide implications for epigenetic treatment and support clinical development of detecting miR-31 promoter methylation as a novel biomarker.
PMCID: PMC3563734  PMID: 23233736
prostate cancer; androgen receptor; miR-31; DNA hypermethylation; biomarker
2.  MicroRNA-223 controls susceptibility to tuberculosis by regulating lung neutrophil recruitment 
The Journal of Clinical Investigation  2013;123(11):4836-4848.
The molecular mechanisms that control innate immune cell trafficking during chronic infection and inflammation, such as in tuberculosis (TB), are incompletely understood. During active TB, myeloid cells infiltrate the lung and sustain local inflammation. While the chemoattractants that orchestrate these processes are increasingly recognized, the posttranscriptional events that dictate their availability are unclear. We identified microRNA-223 (miR-223) as an upregulated small noncoding RNA in blood and lung parenchyma of TB patients and during murine TB. Deletion of miR-223 rendered TB-resistant mice highly susceptible to acute lung infection. The lethality of miR-223–/– mice was apparently not due to defects in antimycobacterial T cell responses. Exacerbated TB in miR-223–/– animals could be partially reversed by neutralization of CXCL2, CCL3, and IL-6, by mAb depletion of neutrophils, and by genetic deletion of Cxcr2. We found that miR-223 controlled lung recruitment of myeloid cells, and consequently, neutrophil-driven lethal inflammation. We conclude that miR-223 directly targets the chemoattractants CXCL2, CCL3, and IL-6 in myeloid cells. Our study not only reveals an essential role for a single miRNA in TB, it also identifies new targets for, and assigns biological functions to, miR-223. By regulating leukocyte chemotaxis via chemoattractants, miR-223 is critical for the control of TB and potentially other chronic inflammatory diseases.
PMCID: PMC3809781  PMID: 24084739
3.  Prediction of Cross-Recognition of Peptide-HLA A2 by Melan-A-Specific Cytotoxic T Lymphocytes Using Three-Dimensional Quantitative Structure-Activity Relationships 
PLoS ONE  2013;8(7):e65590.
The cross-recognition of peptides by cytotoxic T lymphocytes is a key element in immunology and in particular in peptide based immunotherapy. Here we develop three-dimensional (3D) quantitative structure-activity relationships (QSARs) to predict cross-recognition by Melan-A-specific cytotoxic T lymphocytes of peptides bound to HLA A*0201 (hereafter referred to as HLA A2). First, we predict the structure of a set of self- and pathogen-derived peptides bound to HLA A2 using a previously developed ab initio structure prediction approach [Fagerberg et al., J. Mol. Biol., 521–46 (2006)]. Second, shape and electrostatic energy calculations are performed on a 3D grid to produce similarity matrices which are combined with a genetic neural network method [So et al., J. Med. Chem., 4347–59 (1997)] to generate 3D-QSAR models. The models are extensively validated using several different approaches. During the model generation, the leave-one-out cross-validated correlation coefficient (q2) is used as the fitness criterion and all obtained models are evaluated based on their q2 values. Moreover, the best model obtained for a partitioned data set is evaluated by its correlation coefficient (r = 0.92 for the external test set). The physical relevance of all models is tested using a functional dependence analysis and the robustness of the models obtained for the entire data set is confirmed using y-randomization. Finally, the validated models are tested for their utility in the setting of rational peptide design: their ability to discriminate between peptides that only contain side chain substitutions in a single secondary anchor position is evaluated. In addition, the predicted cross-recognition of the mono-substituted peptides is confirmed experimentally in chromium-release assays. 
These results underline the utility of 3D-QSARs in peptide mimetic design and suggest that the properties of the unbound epitope are sufficient to capture most of the information to determine the cross-recognition.
PMCID: PMC3713012  PMID: 23874382
4.  Beyond Fifty Years of Millard's Rotation-Advancement Technique in Cleft Lip Closure: Are There Many “Millards”? 
Plastic Surgery International  2012;2012:731029.
In 1955, Millard developed the concept of the rotation-advancement flap to treat cleft lip. Almost 6 decades later, it remains the most popular technique worldwide. Since the technique evolved and Millard published many technical variations, we asked 10 experienced cleft surgeons how they would mark Millard's 7 points in two unilateral cleft lip patient photos and compared the results. In both pictures, points 1 and 2 were marked identically among surgeons. Points 3 were located adjacent to each other, but not coincident, and the largest distances between points 3 were 4.95 mm and 4.03 mm on pictures 1 and 2, respectively. A similar pattern was obtained for points 4: eight of them were adjacent, and the greatest distance between the points was 4.39 mm. Points 5 showed the greatest divergence among evaluators, which was responsible for the different shapes of the C-flap. Points 6 also had dissimilar markings, and this difference accounts for the varying resection areas among evaluators. The largest distances observed were 11.66 mm and 7 mm on pictures 1 and 2, respectively. In summary, much has changed since Millard's initial procedure, but his basic principles have survived the inexorable test of time, proving that his idea has found a place among the greatest concepts of modern plastic surgery.
PMCID: PMC3523606  PMID: 23304488
5.  Construction and Analysis of an Integrated Regulatory Network Derived from High-Throughput Sequencing Data 
PLoS Computational Biology  2011;7(11):e1002190.
We present a network framework for analyzing multi-level regulation in higher eukaryotes based on systematic integration of various high-throughput datasets. The network, namely the integrated regulatory network, consists of three major types of regulation: TF→gene, TF→miRNA and miRNA→gene. We identified the target genes and target miRNAs for a set of TFs based on the ChIP-Seq binding profiles, and the predicted targets of miRNAs using annotated 3′UTR sequences and conservation information. Making use of the system-wide RNA-Seq profiles, we classified transcription factors into positive and negative regulators and assigned a sign for each regulatory interaction. Other types of edges such as protein-protein interactions and potential intra-regulations between miRNAs based on the embedding of miRNAs in their host genes were further incorporated. We examined the topological structures of the network, including its hierarchical organization and motif enrichment. We found that transcription factors downstream in the hierarchy distinguish themselves by being expressed more uniformly across tissues, having more interacting partners, and being more likely to be essential. We found an over-representation of notable network motifs, including a feed-forward loop (FFL) in which a miRNA cost-effectively shuts down a transcription factor and its target. We used data of C. elegans from the modENCODE project as a primary model to illustrate our framework, but further verified the results using two other data sets. As more and more genome-wide ChIP-Seq and RNA-Seq data become available in the near future, our methods of data integration have various potential applications.
Author Summary
The precise control of gene expression lies at the heart of many biological processes. In eukaryotes, the regulation is performed at multiple levels, mediated by different regulators such as transcription factors and miRNAs, each distinguished by different spatial and temporal characteristics. These regulators are further integrated to form a complex regulatory network responsible for the orchestration. The construction and analysis of such networks is essential for understanding the general design principles. Recent advances in high-throughput techniques like ChIP-Seq and RNA-Seq provide an opportunity by offering a huge amount of binding and expression data. We present a general framework to combine these types of data into an integrated network and perform various topological analyses, including its hierarchical organization and motif enrichment. We find that the integrated network possesses an intrinsic hierarchical organization and is enriched in several network motifs that include both transcription factors and miRNAs. We further demonstrate that the framework can be easily applied to other species like human and mouse. As more and more genome-wide ChIP-Seq and RNA-Seq data are going to be generated in the near future, our methods of data integration have various potential applications.
PMCID: PMC3219617  PMID: 22125477
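The miRNA-mediated feed-forward loop mentioned in the abstract above (a miRNA that targets both a transcription factor and that factor's target gene) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `find_mirna_ffls`, the dictionary-of-sets input layout, and the example identifiers are all hypothetical, chosen only to show the three edges (miRNA→TF, TF→gene, miRNA→gene) being checked.

```python
def find_mirna_ffls(mirna_targets, tf_targets):
    """Return (miRNA, TF, gene) triples forming a feed-forward loop.

    mirna_targets: dict mapping each miRNA to the set of genes it targets
                   (hypothetical layout; TFs appear here as ordinary genes).
    tf_targets:    dict mapping each TF to the set of genes it regulates.
    """
    loops = []
    for mirna, targets in mirna_targets.items():
        for tf in sorted(targets):
            # Only targets that are themselves TFs can sit in the middle.
            if tf not in tf_targets:
                continue
            # Genes regulated by the TF and also targeted by the miRNA.
            for gene in sorted(tf_targets[tf] & targets):
                if gene != tf:
                    loops.append((mirna, tf, gene))
    return loops


# Toy network with made-up identifiers.
example_mirna_targets = {"mir-1": {"tfA", "geneX", "geneY"}}
example_tf_targets = {"tfA": {"geneX", "geneZ"}}
print(find_mirna_ffls(example_mirna_targets, example_tf_targets))
# prints [('mir-1', 'tfA', 'geneX')]
```

Counting such triples in the observed network and in randomized networks is one common way to assess motif enrichment; the randomization scheme is left out here because the abstract does not specify it.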
6.  AlleleSeq: analysis of allele-specific expression and binding in a network framework 
A computational pipeline for constructing a personal diploid genome and determining sites of allele-specific activity is developed. Using a regulatory network framework, allele-specific binding and expression are found to be significantly coordinated across the genome.
Software was developed for building a personal diploid genome sequence, and determining sites of allele-specific binding and expression (AlleleSeq). This computational pipeline was used to analyze variation data, and deeply sequenced RNA-Seq and ChIP-Seq datasets, for individual NA12878 from the 1000 Genomes Project. The interaction between allele-specific binding and allele-specific expression is investigated, revealing clear coordination.
To study allele-specific expression (ASE) and binding (ASB), that is, differences between the maternally and paternally derived alleles, we have developed a computational pipeline (AlleleSeq). Our pipeline initially constructs a diploid personal genome sequence (and corresponding personalized gene annotation) using genomic sequence variants (SNPs, indels, and structural variants), and then identifies allele-specific events with significant differences in the number of mapped reads between maternal and paternal alleles. There are many technical challenges in the construction and alignment of reads to a personal diploid genome sequence that we address, for example, bias of reads mapping to the reference allele. We have applied AlleleSeq to variation data for NA12878 from the 1000 Genomes Project as well as matched, deeply sequenced RNA-Seq and ChIP-Seq data sets generated for this purpose. In addition to observing fairly widespread allele-specific behavior within individual functional genomic data sets (including results consistent with X-chromosome inactivation), we can study the interaction between ASE and ASB. Furthermore, we investigate the coordination between ASE and ASB events from multiple transcription factors using a regulatory network framework. Correlation analyses and network motifs show mostly coordinated ASB and ASE.
PMCID: PMC3208341  PMID: 21811232
allele-specific; ChIP-Seq; networks; RNA-Seq
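The abstract above describes calling allele-specific events from "significant differences in the number of mapped reads between maternal and paternal alleles" but does not name the statistical test. As a minimal sketch, assuming an exact two-sided binomial test against an equal 50/50 expectation (an assumption made here for illustration, not a statement of the AlleleSeq implementation), the per-site check could look like this. The function name `binomial_two_sided_p` is hypothetical.

```python
from math import comb


def binomial_two_sided_p(maternal, paternal):
    """Exact two-sided binomial p-value for an allelic read imbalance.

    Assumes the null hypothesis that each read comes from either allele
    with probability 0.5 (illustrative assumption).
    """
    n = maternal + paternal
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    k = min(maternal, paternal)
    # Probability of a count this small or smaller under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    # The distribution is symmetric at p = 0.5, so double the tail, capped at 1.
    return min(1.0, 2 * tail)


print(binomial_two_sided_p(10, 0))  # prints 0.001953125
print(binomial_two_sided_p(5, 5))   # prints 1.0
```

In a genome-wide setting the resulting p-values would still need multiple-testing correction, which this sketch omits.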
7.  Integrative Analysis of the Caenorhabditis elegans Genome by the modENCODE Project 
Gerstein, Mark B. | Lu, Zhi John | Van Nostrand, Eric L. | Cheng, Chao | Arshinoff, Bradley I. | Liu, Tao | Yip, Kevin Y. | Robilotto, Rebecca | Rechtsteiner, Andreas | Ikegami, Kohta | Alves, Pedro | Chateigner, Aurelien | Perry, Marc | Morris, Mitzi | Auerbach, Raymond K. | Feng, Xin | Leng, Jing | Vielle, Anne | Niu, Wei | Rhrissorrakrai, Kahn | Agarwal, Ashish | Alexander, Roger P. | Barber, Galt | Brdlik, Cathleen M. | Brennan, Jennifer | Brouillet, Jeremy Jean | Carr, Adrian | Cheung, Ming-Sin | Clawson, Hiram | Contrino, Sergio | Dannenberg, Luke O. | Dernburg, Abby F. | Desai, Arshad | Dick, Lindsay | Dosé, Andréa C. | Du, Jiang | Egelhofer, Thea | Ercan, Sevinc | Euskirchen, Ghia | Ewing, Brent | Feingold, Elise A. | Gassmann, Reto | Good, Peter J. | Green, Phil | Gullier, Francois | Gutwein, Michelle | Guyer, Mark S. | Habegger, Lukas | Han, Ting | Henikoff, Jorja G. | Henz, Stefan R. | Hinrichs, Angie | Holster, Heather | Hyman, Tony | Iniguez, A. Leo | Janette, Judith | Jensen, Morten | Kato, Masaomi | Kent, W. James | Kephart, Ellen | Khivansara, Vishal | Khurana, Ekta | Kim, John K. | Kolasinska-Zwierz, Paulina | Lai, Eric C. | Latorre, Isabel | Leahey, Amber | Lewis, Suzanna | Lloyd, Paul | Lochovsky, Lucas | Lowdon, Rebecca F. | Lubling, Yaniv | Lyne, Rachel | MacCoss, Michael | Mackowiak, Sebastian D. | Mangone, Marco | McKay, Sheldon | Mecenas, Desirea | Merrihew, Gennifer | Miller, David M. | Muroyama, Andrew | Murray, John I. | Ooi, Siew-Loon | Pham, Hoang | Phippen, Taryn | Preston, Elicia A. | Rajewsky, Nikolaus | Rätsch, Gunnar | Rosenbaum, Heidi | Rozowsky, Joel | Rutherford, Kim | Ruzanov, Peter | Sarov, Mihail | Sasidharan, Rajkumar | Sboner, Andrea | Scheid, Paul | Segal, Eran | Shin, Hyunjin | Shou, Chong | Slack, Frank J. | Slightam, Cindie | Smith, Richard | Spencer, William C. | Stinson, E. O. | Taing, Scott | Takasaki, Teruaki | Vafeados, Dionne | Voronina, Ksenia | Wang, Guilin | Washington, Nicole L. | Whittle, Christina M. 
| Wu, Beijing | Yan, Koon-Kiu | Zeller, Georg | Zha, Zheng | Zhong, Mei | Zhou, Xingliang | Ahringer, Julie | Strome, Susan | Gunsalus, Kristin C. | Micklem, Gos | Liu, X. Shirley | Reinke, Valerie | Kim, Stuart K. | Hillier, LaDeana W. | Henikoff, Steven | Piano, Fabio | Snyder, Michael | Stein, Lincoln | Lieb, Jason D. | Waterston, Robert H.
Science (New York, N.Y.)  2010;330(6012):1775-1787.
We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor–binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor–binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.
PMCID: PMC3142569  PMID: 21177976
8.  An shRNA-Based Screen of Splicing Regulators Identifies SFRS3 as a Negative Regulator of IL-1β Secretion 
PLoS ONE  2011;6(5):e19829.
The generation of diversity and plasticity of transcriptional programs are key components of effective vertebrate immune responses. The role of alternative splicing has been recognized, but it remains underappreciated and poorly understood as a critical mechanism for the regulation and fine-tuning of physiological immune responses. Here we report the generation of loss-of-function phenotypes for a large collection of genes known or predicted to be involved in the splicing reaction and the identification of 19 novel regulators of IL-1β secretion in response to E. coli challenge of THP-1 cells. Twelve of these genes are required for IL-1β secretion, while seven are negative regulators of this process. Silencing of SFRS3 increased IL-1β secretion due to elevation of IL-1β and caspase-1 mRNA in addition to active caspase-1 levels. This study points to the relevance of splicing in the regulation of auto-inflammatory diseases.
PMCID: PMC3096647  PMID: 21611201
9.  Nested-multiplex PCR detection of Orthopoxvirus and Parapoxvirus directly from exanthematic clinical samples 
Virology Journal  2009;6:140.
Orthopoxvirus (OPV) and Parapoxvirus (PPV) have been associated with worldwide exanthematic outbreaks. Some species of these genera are able to infect humans and domestic animals, causing serious economic losses and public health impact. Rapid, useful and highly specific methods are required to detect and epidemiologically monitor such poxviruses. In the present paper, we describe the development of a nested-multiplex PCR method for the simultaneous detection of OPV and PPV species directly from exanthematic lesions, with no previous viral isolation or DNA extraction.
Methods and Results
The OPV/PPV nested-multiplex PCR was developed based on the evaluation and combination of published primer sets, and was applied to the detection of the target pathogens. The method showed high sensitivity, and the specificity was confirmed by amplicon sequencing. Exanthematic lesion samples collected during bovine vaccinia or contagious ecthyma outbreaks were submitted to OPV/PPV nested-multiplex PCR and confirmed its applicability.
These results suggest that the presented multiplex PCR provides a highly robust and sensitive method to detect OPV and PPV directly from clinical samples. The method can be used for viral identification and monitoring, especially in areas where OPV and PPV co-circulate.
PMCID: PMC2749831  PMID: 19747382
10.  mRNA expression profiles show differential regulatory effects of microRNAs between estrogen receptor-positive and estrogen receptor-negative breast cancer 
Genome Biology  2009;10(9):R90.
Most microRNAs have a stronger inhibitory effect in estrogen receptor-negative than in estrogen receptor-positive breast cancers
Recent studies have shown that the regulatory effect of microRNAs can be investigated by examining expression changes of their target genes. Given this, it is useful to define an overall metric of regulatory effect for a specific microRNA and see how this changes across different conditions.
Here, we define a regulatory effect score (RE-score) to measure the inhibitory effect of a microRNA in a sample, essentially the average difference in expression of its targets versus non-targets. Then we compare the RE-scores of various microRNAs between two breast cancer subtypes: estrogen receptor positive (ER+) and negative (ER-). We applied this approach to five microarray breast cancer datasets and found that the expression of target genes of most microRNAs was more repressed in ER- than ER+; that is, microRNAs appear to have higher RE-scores in ER- breast cancer. These results are robust to the microRNA target prediction method. To interpret these findings, we analyzed the level of microRNA expression in previous studies and found that higher microRNA expression was not always accompanied by higher inhibitory effects. However, several key microRNA processing genes, especially Ago2 and Dicer, were differentially expressed between ER- and ER+ breast cancer, which may explain the different regulatory effects of microRNAs in these two breast cancer subtypes.
The RE-score is a promising indicator to measure microRNAs' inhibitory effects. Most microRNAs exhibit higher RE-scores in ER- than in ER+ samples, suggesting that they have stronger inhibitory effects in ER- breast cancers.
PMCID: PMC2768979  PMID: 19723326
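The RE-score in the abstract above is described as "essentially the average difference in expression of its targets versus non-targets". A minimal sketch of that idea follows. The sign convention (non-target mean minus target mean, so that stronger repression of targets gives a higher score) is an assumption inferred from the abstract's statement that more repressed targets correspond to higher RE-scores; the function name `re_score` and the input layout are hypothetical, and the published score may include normalization steps not shown here.

```python
from statistics import mean


def re_score(expression, targets):
    """Mean expression of non-targets minus mean expression of targets.

    expression: dict mapping gene name to its expression value in one sample.
    targets:    set of gene names predicted to be targets of one microRNA.
    """
    target_vals = [v for g, v in expression.items() if g in targets]
    other_vals = [v for g, v in expression.items() if g not in targets]
    if not target_vals or not other_vals:
        raise ValueError("need at least one target and one non-target gene")
    return mean(other_vals) - mean(target_vals)


# Toy sample with made-up genes: targets A and B are expressed at lower levels.
sample = {"A": 1.0, "B": 2.0, "C": 5.0, "D": 6.0}
print(re_score(sample, {"A", "B"}))  # prints 4.0
```

Computing this score per sample and comparing its distribution between ER+ and ER- groups mirrors the comparison described in the abstract.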
11.  Systematic identification of transcription factors associated with patient survival in cancers 
BMC Genomics  2009;10:225.
Aberrant activation or expression of transcription factors has been implicated in the tumorigenesis of various types of cancer. In spite of the prevalent application of microarray experiments for profiling gene expression in cancer samples, they provide limited information regarding the activities of transcription factors. However, the association between transcription factors and cancers is largely dependent on the transcription regulatory activities rather than mRNA expression levels.
In this paper, we propose a computational approach that integrates microarray expression data with the transcription factor binding site information to systematically identify transcription factors associated with patient survival given a specific cancer type. This approach was applied to two gene expression data sets for breast cancer and acute myeloid leukemia. We found that two transcription factor families, the steroid nuclear receptor family and the ATF/CREB family, are significantly correlated with the survival of patients with breast cancer; and that a transcription factor named T-cell acute lymphocytic leukemia 1 is significantly correlated with acute myeloid leukemia patient survival.
Our analysis identifies transcription factors associated with patient survival and provides insight into the regulatory mechanisms underlying breast cancer and leukemia. The transcription factors identified by our method are biologically meaningful and consistent with prior knowledge. This approach can also be applied to other microarray cancer data sets to help researchers better understand the intricate relationship between transcription factors and diseases.
PMCID: PMC2686740  PMID: 19442316
12.  Enriching PubMed Related Article Search with Sentence Level Co-citations 
PubMed related article links identify closely related articles and enhance our ability to navigate the biomedical literature. They are derived by calculating the word similarity between two articles, relating articles with overlapping word content. In this paper, we propose to enrich PubMed with a new type of related article link based on citations within a single sentence (i.e. sentence level co-citations or SLCs). Using different similarity metrics, we demonstrated that articles linked by SLCs are highly related. We also showed that only half of SLCs are found among PubMed related article links. Additionally, we discuss how the citing sentence of an SLC explains the connection between two articles.
PMCID: PMC2815371  PMID: 20351935
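Sentence level co-citations (SLCs), as defined in the abstract above, link two articles that are cited within the same sentence. A minimal sketch of extracting such pairs is given below. It assumes numeric bracketed citation markers such as "[3, 7]" and pre-split sentences; both are illustrative assumptions, as are the names `CITATION` and `sentence_cocitations`, and the abstract does not describe how the authors parsed citations.

```python
import re
from collections import Counter
from itertools import combinations

# Matches bracketed numeric citations such as "[4]" or "[3, 7]"
# (hypothetical citation style, chosen for illustration).
CITATION = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")


def sentence_cocitations(sentences):
    """Count how often each pair of references is cited in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        cited = set()
        for group in CITATION.findall(sentence):
            cited.update(int(n) for n in group.split(","))
        # Every unordered pair of references in one sentence is one SLC.
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


example = [
    "Method A [1, 2] extends the model of B [3].",
    "Both [2] and [3] report similar results.",
    "An unrelated remark [4].",
]
pairs = sentence_cocitations(example)
print(pairs[(2, 3)])  # prints 2
```

The citing sentence itself can be stored alongside each pair to explain why the two articles are connected.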
13.  Virulence in Murine Model Shows the Existence of Two Distinct Populations of Brazilian Vaccinia virus Strains 
PLoS ONE  2008;3(8):e3043.
Brazilian Vaccinia virus has been isolated from sentinel mice, rodents and recently from humans, cows and calves during outbreaks on dairy farms in several rural areas in Brazil, leading to high economic and social impact. Some phylogenetic studies have demonstrated the existence of two different populations of Brazilian Vaccinia virus strains circulating in nature, but little is known about their biological characteristics. Therefore, our goal was to study the virulence pattern of seven Brazilian Vaccinia virus strains. Infected BALB/c mice were monitored for morbidity, mortality and viral replication in organs such as the trachea, lungs, heart, kidneys, liver, brain and spleen. Based on their virulence potential, the Brazilian Vaccinia virus strains were divided into two groups. One group contained GP1V, VBH, SAV and BAV, which caused disease and death in infected mice, and the second included ARAV, GP2V and PSTV, which did not cause any clinical signs or death in infected BALB/c mice. The subdivision of Brazilian Vaccinia virus strains into two groups is in agreement with previous genetic studies. These data reinforce the existence of different populations circulating in Brazil with respect to genetic and virulence characteristics.
PMCID: PMC2518622  PMID: 18725979
14.  Identification of tumor-associated antigens by large-scale analysis of genes expressed in human colorectal cancer 
Despite the high prevalence of colon cancer in the world and the great interest in targeted anti-cancer therapy, only a few tumor-specific gene products have been identified that could serve as targets for the immunological treatment of colorectal cancers. The aim of our study was therefore to identify frequently expressed colon cancer-specific antigens. We performed a large-scale analysis of genes expressed in normal colon and colon cancer tissues isolated from colorectal cancer patients using massively parallel signature sequencing (MPSS). Candidates were additionally subjected to experimental evaluation by semi-quantitative RT-PCR on a cohort of colorectal cancer patients. From a pool of more than 6000 genes identified unambiguously in the analysis, we found 2124 genes that were selectively expressed in colon cancer tissue and 147 genes that were differentially expressed to a significant degree between normal and cancer cells. Differential expression of many genes was confirmed by RT-PCR on a cohort of patients. Although deregulated genes were involved in many different cellular pathways, we found that genes expressed in the extracellular space were significantly over-represented in colorectal cancer. Strikingly, we identified a transcript from a chromosome X-linked member of the human endogenous retrovirus (HERV) H family that was frequently and selectively expressed in colon cancer but not in normal tissues. Our data suggest that this sequence should be considered as a target of immunological interventions against colorectal cancer.
PMCID: PMC2935784  PMID: 18581998
human; colorectal cancer; gene expression profiling; massively parallel signature sequencing; HERVs
15.  High vaccination efficiency of low-affinity epitopes in antitumor immunotherapy 
Journal of Clinical Investigation  2004;113(3):425-433.
Most of the human tumor-associated antigens (TAAs) characterized thus far are derived from nonmutated “self”-proteins. Numerous strategies have been developed to break tolerance to TAAs, combining various forms of antigens with different vectors and adjuvants. However, no study has yet determined how to select epitopes within a given TAA to induce the highest antitumor effector response. We addressed this question by evaluating in HLA-A*0201-transgenic HHD mice the antitumor vaccination efficacy of high- and low-affinity epitopes from the naturally expressed murine telomerase reverse transcriptase (mTERT). Immunity against low-affinity epitopes was induced with heteroclitic variants. We show here that the CTL repertoire against high-affinity epitopes is partially tolerized, while that against low-affinity epitopes is composed of frequent CTLs with high avidity. The high-affinity p797 and p545 mTERT epitopes are not able to protect mice from a lethal challenge with the mTERT-expressing EL4-HHD tumor. In contrast, mice developing CTL responses against the p572 and p988 low-affinity epitopes exhibit potent antitumor immunity and no sign of autoimmune reactivity against TERT-expressing normal tissues. Our results strongly argue for new TAA epitope selection and modification strategies in antitumor immunotherapy applications in humans.
PMCID: PMC324537  PMID: 14755339

