1.  Interpreting the regulatory genome: the genomics of transcription factor function in Drosophila melanogaster 
Briefings in Functional Genomics  2012;11(5):336-346.
Researchers have now had access to the fully sequenced Drosophila melanogaster genome for over a decade, and the sequenced genomes of 11 additional Drosophila species have been available for almost 5 years, with more species’ genomes becoming available every year [Adams MD, Celniker SE, Holt RA, et al. The genome sequence of Drosophila melanogaster. Science 2000;287:2185–95; Clark AG, Eisen MB, Smith DR, et al. Evolution of genes and genomes on the Drosophila phylogeny. Nature 2007;450:203–18]. Although the best studied of the D. melanogaster transcription factors (TFs) were cloned before sequencing of the genome, the availability of sequence data promised to transform our understanding of TFs and gene regulatory networks. Sequenced genomes have allowed researchers to generate tools for high-throughput characterization of gene expression levels, genome-wide TF localization and analyses of evolutionary constraints on DNA elements across multiple species. With an estimated 700 DNA-binding proteins in the Drosophila genome, it will be many years before each potential sequence-specific TF is studied in detail, yet the last decade of functional genomics research has already impacted our view of gene regulatory networks and TF DNA recognition.
PMCID: PMC3459015  PMID: 23023663
Drosophila; transcription factor; genomics; enhancer; Zelda
2.  Comparison of the Genome Sequences of “Candidatus Portiera aleyrodidarum” Primary Endosymbionts of the Whitefly Bemisia tabaci B and Q Biotypes 
“Candidatus Portiera aleyrodidarum” is the primary endosymbiont of whiteflies. We report two complete genome sequences of this bacterium from the worldwide invasive B and Q biotypes of the whitefly Bemisia tabaci. Differences in the two genome sequences may add insights into the complex differences in the biology of both biotypes.
PMCID: PMC3591977  PMID: 23315735
3.  Genome Sequences of the Primary Endosymbiont “Candidatus Portiera aleyrodidarum” in the Whitefly Bemisia tabaci B and Q Biotypes 
Journal of Bacteriology  2012;194(23):6678-6679.
“Candidatus Portiera aleyrodidarum” is the obligate primary endosymbiotic bacterium of whiteflies, including the sweet potato whitefly Bemisia tabaci, and provides essential nutrients to its host. Here we report two complete genome sequences of this bacterium from the B and Q biotypes of B. tabaci.
PMCID: PMC3497481  PMID: 23144417
4.  Systematic evaluation of factors influencing ChIP-seq fidelity 
Nature methods  2012;9(6):609-614.
We performed a systematic evaluation of how variations in sequencing depth and other parameters influence interpretation of chromatin immunoprecipitation (ChIP) followed by sequencing (ChIP-seq) experiments. Using Drosophila S2 cells, we generated ChIP-seq datasets for a site-specific transcription factor (Suppressor of Hairy-wing) and a histone modification (H3K36me3). We detected a chromatin state bias: open chromatin regions yielded higher coverage, which led to false positives if not corrected and had a greater effect on detection specificity than any base-composition bias. Paired-end sequencing revealed that single-end data underestimated ChIP library complexity at high coverage. The removal of reads originating at the same base reduced false positives while having little effect on detection sensitivity. Even at a depth of ~1 read/bp coverage of mappable genome, ~1% of the narrow peaks detected on a tiling array were missed by ChIP-seq. Evaluation of widely used ChIP-seq analysis tools suggests that adjustments or algorithm improvements are required to handle datasets with deep coverage.
PMCID: PMC3477507  PMID: 22522655
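The duplicate-read filter mentioned in the abstract above (removing reads that originate at the same base) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Read` record, the function name `drop_same_start_reads`, and the choice to treat strand as part of the duplicate key are all assumptions made for illustration.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class Read(NamedTuple):
    """Hypothetical minimal read record (not a real library type)."""
    chrom: str
    start: int   # position where the read originates
    strand: str  # "+" or "-"


def drop_same_start_reads(reads: Iterable[Read]) -> List[Read]:
    """Keep only the first read seen at each (chrom, start, strand).

    Assumption: reads on opposite strands at the same position are
    treated as distinct; the abstract does not specify this detail.
    """
    seen: Set[Tuple[str, int, str]] = set()
    kept: List[Read] = []
    for read in reads:
        key = (read.chrom, read.start, read.strand)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


# Made-up example reads: two exact duplicates, one opposite-strand read,
# and one read at a different position.
example_reads = [
    Read("chr2L", 100, "+"),
    Read("chr2L", 100, "+"),
    Read("chr2L", 100, "-"),
    Read("chr2L", 250, "+"),
]
unique_reads = drop_same_start_reads(example_reads)
```

In practice this step is usually done with an established duplicate-marking tool on aligned reads; the sketch only shows the idea of collapsing reads that share a start position.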
5.  Yorkie promotes transcription by recruiting a Histone methyltransferase complex 
Cell reports  2014;8(2):449-459.
Hippo signaling limits organ growth by inhibiting the transcriptional coactivator Yorkie. Despite the key role of Yorkie in both normal and oncogenic growth, the mechanism by which it activates transcription has not been defined. We report that Yorkie binding to chromatin correlates with histone H3K4 methylation, and is sufficient to locally increase it. We show that Yorkie can recruit a histone methyltransferase complex, through binding between WW domains of Yorkie and PPxY sequence motifs of NcoA6, a subunit of the Trithorax-related (Trr) methyltransferase complex. Cell culture and in vivo assays establish that this recruitment of NcoA6 contributes to Yorkie’s ability to activate transcription. Mammalian NcoA6, a subunit of Trr-homologous methyltransferase complexes, can similarly interact with Yorkie’s mammalian homologue YAP. Our results implicate direct recruitment of a histone methyltransferase complex as central to transcriptional activation by Yorkie, linking the control of cell proliferation by Hippo signaling to chromatin modification.
PMCID: PMC4152371  PMID: 25017066
6.  miR-9a minimizes the phenotypic impact of genomic diversity by buffering a transcription factor 
Cell  2013;155(7):1556-1567.
Gene expression has to withstand stochastic, environmental and genomic perturbations. For example, in the latter case, 0.5–1% of the human genome is typically variable between any two unrelated individuals. Such diversity might create problematic variability in the activity of gene regulatory networks, and ultimately, in cell behaviors. Using multigenerational selection experiments, we find that for the Drosophila proneural network, the effect of genomic diversity is dampened by miR-9a-mediated regulation of senseless expression. Reducing miR-9a regulation of the Senseless transcription factor frees the genomic landscape to exert greater phenotypic influence. Whole genome sequencing identified genomic loci that potentially exert such effects. A larger set of sequence variants, including variants within proneural network genes, exhibit these characteristics when miR-9a concentration is reduced. These findings reveal that microRNA-target interactions may be a key mechanism by which the impact of genomic diversity on cell behavior is dampened.
PMCID: PMC3891883  PMID: 24360277
8.  A Non-Degenerate Code of Deleterious Variants in Mendelian Loci Contributes to Complex Disease Risk 
Cell  2013;155(1):10.1016/j.cell.2013.08.030.
Whereas countless highly penetrant variants have been associated with Mendelian disorders, the genetic etiologies underlying complex diseases remain largely unresolved. Here, we examine the extent to which Mendelian variation contributes to complex disease risk by mining the medical records of over 110 million patients. We detect thousands of associations between Mendelian and complex diseases, revealing a non-degenerate, phenotypic code that links each complex disorder to a unique collection of Mendelian loci. Using genome-wide association results, we demonstrate that common variants associated with complex diseases are enriched in the genes indicated by this “Mendelian code.” Finally, we detect hundreds of comorbidity associations among Mendelian disorders, and we use probabilistic genetic modeling to demonstrate that Mendelian variants likely contribute non-additively to the risk for a subset of complex diseases. Overall, this study illustrates a complementary approach for mapping complex disease loci and provides unique predictions concerning the etiologies of specific diseases.
PMCID: PMC3844554  PMID: 24074861
9.  Assessment of Copy Number Status of Chromosomes 6 and 11 by FISH Provides Independent Prognostic Information in Primary Melanoma 
Melanoma incidence has been rising steadily for decades, while mortality rates have remained flat. This type of discordant pattern between incidence and mortality has been linked to diagnostic drift in cancers of the thyroid, breast, and prostate. Ancillary tests such as fluorescent in situ hybridization (FISH) are now being used to help differentiate melanomas from melanocytic nevi. Multicolor FISH has been shown to distinguish between these two with 86.7% sensitivity and 95.4% specificity. To assess the ability of FISH to differentiate melanomas with metastatic or lethal potential from those with an indolent disease course, we performed FISH with probes targeting 6p25, centromere 6, 6q23, and 11q13 on 144 primary melanomas with a minimal tumor thickness of 2 mm and compared the development of metastatic disease and melanoma-specific mortality as well as relapse-free and disease-specific survival between FISH-positive and negative cases. 82% of melanomas were positive by FISH according to previously defined criteria. The percentage was significantly higher (93%) in cases that developed systemic metastases (n=43) than in patients that did not (77%, n=101). FISH-positive primaries had a significantly increased risk of metastasis or melanoma-related death compared to FISH-negative cases (odds ratio 4.11, confidence interval (CI) 1.14-22.7 and 7.0, CI 1.03-300.4, respectively). FISH status remained an independent parameter when controlling for known prognostic factors. This data indicates that the group of melanomas diagnosed with routine histopathology that lack aberrations detected by FISH is enriched for melanomas with a more indolent disease course. This suggests that molecular techniques can assist in a more accurate identification of tumors with metastatic potential.
PMCID: PMC4153784  PMID: 21716079
10.  Protein Quantitative Trait Loci Identify Novel Candidates Modulating Cellular Response to Chemotherapy 
PLoS Genetics  2014;10(4):e1004192.
Annotating and interpreting the results of genome-wide association studies (GWAS) remains challenging. Assigning function to genetic variants as expression quantitative trait loci is an expanding and useful approach, but focuses exclusively on mRNA rather than protein levels. Many variants remain without annotation. To address this problem, we measured the steady state abundance of 441 human signaling and transcription factor proteins from 68 Yoruba HapMap lymphoblastoid cell lines to identify novel relationships between inter-individual protein levels, genetic variants, and sensitivity to chemotherapeutic agents. Proteins were measured using micro-western and reverse phase protein arrays from three independent cell line thaws to permit mixed effect modeling of protein biological replicates. We observed enrichment of protein quantitative trait loci (pQTLs) for cellular sensitivity to two commonly used chemotherapeutics: cisplatin and paclitaxel. We functionally validated the target protein of a genome-wide significant trans-pQTL for its relevance in paclitaxel-induced apoptosis. GWAS overlap results of drug-induced apoptosis and cytotoxicity for paclitaxel and cisplatin revealed unique SNPs associated with the pharmacologic traits (at p<0.001). Interestingly, GWAS SNPs from various regions of the genome implicated the same target protein (p<0.0001) that correlated with drug induced cytotoxicity or apoptosis (p≤0.05). Two genes were functionally validated for association with drug response using siRNA: SMC1A with cisplatin response and ZNF569 with paclitaxel response. This work allows pharmacogenomic discovery to progress from the transcriptome to the proteome and offers potential for identification of new therapeutic targets. This approach, linking targeted proteomic data to variation in pharmacologic response, can be generalized to other studies evaluating genotype-phenotype relationships and provide insight into chemotherapeutic mechanisms.
Author Summary
The central dogma of biology explains that DNA is transcribed to mRNA that is further translated into protein. Many genome-wide studies have implicated genetic variation that influences gene expression and that ultimately affects downstream complex traits, including response to drugs. However, because of technical limitations, few studies have evaluated the contribution of genetic variation to protein expression and the ensuing effects on downstream phenotypes. To overcome this challenge, we used a novel technology to simultaneously measure the baseline expression of 441 proteins in lymphoblastoid cell lines and compared them with publicly available genetic data. To further illustrate the utility of this approach, we compared protein-level measurements with chemotherapeutic-induced apoptosis and cell-growth inhibition data. This study demonstrates the importance of using protein information to understand the functional consequences of genetic variants identified in genome-wide association studies. This protein data set will also have broad utility for understanding the relationship between other genome-wide studies of complex traits.
PMCID: PMC3974641  PMID: 24699359
11.  Renewable, recombinant antibodies to histone post-translational modifications 
Nature methods  2013;10(10):10.1038/nmeth.2605.
Variability in the quality of antibodies to histone post-translational modifications (PTMs) presents a widely recognized hindrance in epigenetics research. Here, by using antibody engineering technologies, we produced recombinant antibodies directed to the trimethylated lysine residues of histone H3 with high specificity and affinity and no lot-to-lot variation. These recombinant antibodies performed well in common epigenetics applications, and their high specificity enabled us to identify positive and negative correlations among histone PTMs.
PMCID: PMC3828030  PMID: 23955773
12.  Bionimbus: a cloud for managing, analyzing and sharing large genomics datasets 
As large genomics and phenotypic datasets are becoming more common, it is increasingly difficult for most researchers to access, manage, and analyze them. One possible approach is to provide the research community with several petabyte-scale cloud-based computing platforms containing these data, along with tools and resources to analyze them.
Bionimbus is an open source cloud-computing platform that is based primarily upon OpenStack, which manages on-demand virtual machines that provide the required computational resources, and GlusterFS, which is a high-performance clustered file system. Bionimbus also includes Tukey, which is a portal, and associated middleware that provides a single entry point and a single sign on for the various Bionimbus resources; and Yates, which automates the installation, configuration, and maintenance of the software infrastructure required.
Bionimbus is used by a variety of projects to process genomics and phenotypic data. For example, it is used by an acute myeloid leukemia resequencing project at the University of Chicago. The project requires several computational pipelines, including pipelines for quality control, alignment, variant calling, and annotation. For each sample, the alignment step requires eight CPUs for about 12 h. BAM file sizes ranged from 5 GB to 10 GB for each sample.
Most members of the research community have difficulty downloading large genomics datasets and obtaining sufficient storage and computer resources to manage and analyze the data. Cloud computing platforms, such as Bionimbus, with data commons that contain large genomics datasets, are one choice for broadening access to research data in genomics.
PMCID: PMC4215034  PMID: 24464852
cloud computing; biomedical clouds; genomic clouds
13.  A conserved eEF2 coding variant in SCA26 leads to loss of translational fidelity and increased susceptibility to proteostatic insult 
Human Molecular Genetics  2012;21(26):5472-5483.
The autosomal dominant spinocerebellar ataxias (SCAs) are a genetically heterogeneous group of disorders exhibiting cerebellar atrophy and Purkinje cell degeneration whose subtypes arise from 31 distinct genetic loci. Our group previously published the locus for SCA26 on chromosome 19p13.3. In this study, we performed targeted deep sequencing of the critical interval in order to identify candidate causative variants in individuals from the SCA26 family. We identified a single variant that co-segregates with the disease phenotype that produces a single amino acid substitution in eukaryotic elongation factor 2. This substitution, P596H, sits in a domain critical for maintaining reading frame during translation. The yeast equivalent, P580H EF2, demonstrated impaired translocation, detected as an increased rate of −1 programmed ribosomal frameshift read-through in a dual-luciferase assay for observing translational recoding. This substitution also results in a greater susceptibility to proteostatic disruption, as evidenced by a more robust activation of a reporter gene driven by unfolded protein response activation upon challenge with dithiothreitol or heat shock in our yeast model system. Our results present a compelling candidate mutation and mechanism for the pathogenesis of SCA26 and further support the role of proteostatic disruption in neurodegenerative diseases.
PMCID: PMC3516132  PMID: 23001565
14.  Response 
Canadian Family Physician  1994;40:1089-1090.
PMCID: PMC2380226
15.  Divergent Transcriptional Regulatory Logic at the Intersection of Tissue Growth and Developmental Patterning 
PLoS Genetics  2013;9(9):e1003753.
The Yorkie/Yap transcriptional coactivator is a well-known regulator of cellular proliferation in both invertebrates and mammals. As a coactivator, Yorkie (Yki) lacks a DNA binding domain and must partner with sequence-specific DNA binding proteins in the nucleus to regulate gene expression; in Drosophila, the developmental regulators Scalloped (Sd) and Homothorax (Hth) are two such partners. To determine the range of target genes regulated by these three transcription factors, we performed genome-wide chromatin immunoprecipitation experiments for each factor in both the wing and eye-antenna imaginal discs. Strong, tissue-specific binding patterns are observed for Sd and Hth, while Yki binding is remarkably similar across both tissues. Binding events common to the eye and wing are also present for Sd and Hth; these are associated with genes regulating cell proliferation and “housekeeping” functions, and account for the majority of Yki binding. In contrast, tissue-specific binding events for Sd and Hth significantly overlap enhancers that are active in the given tissue, are enriched in Sd and Hth DNA binding sites, respectively, and are associated with genes that are consistent with each factor's previously established tissue-specific functions. Tissue-specific binding events are also significantly associated with Polycomb targeted chromatin domains. To provide mechanistic insights into tissue-specific regulation, we identify and characterize eye and wing enhancers of the Yki-targeted bantam microRNA gene and demonstrate that they are dependent on direct binding by Hth and Sd, respectively. Overall these results suggest that both Sd and Hth use distinct strategies – one shared between tissues and associated with Yki, the other tissue-specific, generally Yki-independent and associated with developmental patterning – to regulate distinct gene sets during development.
Author Summary
The Hippo tumor suppressor pathway controls proliferation in a tissue-nonspecific fashion in Drosophila epithelial progenitor tissues via the transcriptional coactivator Yorkie (Yki). However, despite the tissue-nonspecific role that Yki plays in tissue growth, the transcription factors that recruit Yki to DNA, most notably Scalloped (Sd) and Homothorax (Hth), are important regulators of developmental patterning with many tissue-specific functions. Thus, these three transcriptional regulators – Yki, Sd, and Hth – provide a model for exploring the properties of protein-DNA interactions that regulate both tissue-shared and tissue-specific functions. With this goal in mind, we identified the positions in the fly genome that are bound by Yki, Sd, and Hth in the progenitors of the wing and eye-antenna structures of the fly. These data not only provide a global view of the Yki gene regulatory network, they reveal an unusual amount of tissue specificity in the genomic regions targeted by Sd and Hth, but not Yki. The data also reveal that tissue-specific binding is very likely to overlap tissue-specific enhancer regions, provide important clues for how tissue-specific Sd and Hth binding occurs, and support the idea that gene regulatory networks are plastic, with spatial differences in binding significantly impacting network structures.
PMCID: PMC3764184  PMID: 24039600
16.  Genome-wide analyses of Shavenbaby target genes reveals distinct features of enhancer organization 
Genome Biology  2013;14(8):R86.
Developmental programs are implemented by regulatory interactions between Transcription Factors (TFs) and their target genes, which remain poorly understood. While recent studies have focused on regulatory cascades of TFs that govern early development, little is known about how the ultimate effectors of cell differentiation are selected and controlled. We addressed this question during late Drosophila embryogenesis, when the finely tuned expression of the TF Ovo/Shavenbaby (Svb) triggers the morphological differentiation of epidermal trichomes.
We defined a sizeable set of genes downstream of Svb and used in vivo assays to delineate 14 enhancers driving their specific expression in trichome cells. Coupling computational modeling to functional dissection, we investigated the regulatory logic of these enhancers. Extending the repertoire of epidermal effectors using genome-wide approaches showed that the regulatory models learned from this first sample are representative of the whole set of trichome enhancers. These enhancers harbor remarkable features with respect to their functional architectures, including a weak or non-existent clustering of Svb binding sites. The in vivo function of each site relies on its intimate context, notably the flanking nucleotides. Two additional cis-regulatory motifs, present in a broad diversity of composition and positioning among trichome enhancers, critically contribute to enhancer activity.
Our results show that Svb directly regulates a large set of terminal effectors of the remodeling of epidermal cells. Further, these data reveal that trichome formation is underpinned by unexpectedly diverse modes of regulation, providing fresh insights into the functional architecture of enhancers governing a terminal differentiation program.
PMCID: PMC4053989  PMID: 23972280
17.  Genome-wide association of Yorkie with chromatin and chromatin remodeling complexes 
Cell reports  2013;3(2):309-318.
The Hippo pathway regulates growth through the transcriptional co-activator Yorkie, but how Yorkie promotes transcription remains poorly understood. We address this by characterizing Yorkie’s association with chromatin, and by identifying nuclear partners that effect transcriptional activation. Co-immunoprecipitation and mass spectrometry identify GAGA Factor (GAF), Brahma complex, and Mediator complex as Yorkie-associated nuclear protein complexes. All three are required for Yorkie’s transcriptional activation of downstream genes, and GAF and the Brahma complex subunit Moira interact directly with Yorkie. Genome-wide chromatin binding experiments identify thousands of Yorkie sites, most of which are associated with elevated transcription, based on genome-wide analysis of mRNA and histone H3K4Me3 modification. Chromatin binding also supports extensive functional overlap between Yorkie and GAF. Our studies suggest a widespread role for Yorkie as a regulator of transcription, and identify recruitment of the chromatin modifying GAF protein and BRM complex as a molecular mechanism for transcriptional activation by Yorkie.
PMCID: PMC3633442  PMID: 23395637
18.  Discovering transcription factor regulatory targets using gene expression and binding data 
Bioinformatics  2011;28(2):206-213.
Motivation: Identifying the target genes regulated by transcription factors (TFs) is the most basic step in understanding gene regulation. Recent advances in high-throughput sequencing technology, together with chromatin immunoprecipitation (ChIP), enable mapping TF binding sites genome wide, but it is not possible to infer function from binding alone. This is especially true in mammalian systems, where regulation often occurs through long-range enhancers in gene-rich neighborhoods, rather than proximal promoters, preventing straightforward assignment of a binding site to a target gene.
Results: We present EMBER (Expectation Maximization of Binding and Expression pRofiles), a method that integrates high-throughput binding data (e.g. ChIP-chip or ChIP-seq) with gene expression data (e.g. DNA microarray) via an unsupervised machine learning algorithm for inferring the gene targets of sets of TF binding sites. Genes selected are those that match overrepresented expression patterns, which can be used to provide information about multiple TF regulatory modes. We apply the method to genome-wide human breast cancer data and demonstrate that EMBER confirms a role for the TFs estrogen receptor alpha, retinoic acid receptors alpha and gamma in breast cancer development, whereas the conventional approach of assigning regulatory targets based on proximity does not. Additionally, we compare several predicted target genes from EMBER to interactions inferred previously, examine combinatorial effects of TFs on gene regulation and illustrate the ability of EMBER to discover multiple modes of regulation.
Availability: All code used for this work is available at
Supplementary Information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3259433  PMID: 22084256
19.  Next-Generation Sequencing of Disseminated Tumor Cells 
Frontiers in Oncology  2013;3:320.
Disseminated tumor cells (DTCs) detected in the bone marrow have been shown to be an independent prognostic factor for women with breast cancer. However, the mechanisms behind the tumor cell dissemination are still unclear and more detailed knowledge is needed to fully understand why some cells remain dormant and others metastasize. Sequencing of single cells has opened up the possibility of dissecting the genetic content of subclones of a primary tumor, as well as DTCs. Previous studies of genetic changes in DTCs have employed single-cell array comparative genomic hybridization, which provides information about larger aberrations. Next-generation sequencing now provides the possibility to discover new, smaller, and copy-neutral genetic changes. In this study, we performed whole-genome amplification and subsequently next-generation sequencing to analyze DTCs from two breast cancer patients. We compared copy-number profiles of the DTCs and the corresponding primary tumor generated from sequencing and SNP-comparative genomic hybridization (CGH) data, respectively. While one tumor revealed mostly whole-arm gains and losses, the other had more complex alterations, as well as subclonal amplification and deletions. Whole-arm gains or losses in the primary tumor were in general also observed in the corresponding DTC. Both primary tumors showed amplification of chromosome 1q and deletion of parts of chromosome 16q, which were recaptured in the corresponding DTCs. Interestingly, clear differences were also observed, indicating that the DTC underwent further evolution at the copy-number level. This study provides a proof-of-principle for sequencing of DTCs and correlation with primary copy-number profiles. The analyses allow insight into tumor cell dissemination and show ongoing copy-number evolution in DTCs compared to the primary tumors.
PMCID: PMC3876274  PMID: 24427740
single tumor cell sequencing; disseminating tumor cells; circulating tumor cells; tumor heterogeneity; clonal evolution
20.  Adaptive Evolution and the Birth of CTCF Binding Sites in the Drosophila Genome 
PLoS Biology  2012;10(11):e1001420.
Comparative ChIP-seq data reveal adaptive evolution of insulator protein CTCF binding in multiple Drosophila species.
Changes in the physical interaction between cis-regulatory DNA sequences and proteins drive the evolution of gene expression. However, it has proven difficult to accurately quantify evolutionary rates of such binding change or to estimate the relative effects of selection and drift in shaping the binding evolution. Here we examine the genome-wide binding of CTCF in four species of Drosophila separated by between ∼2.5 and 25 million years. CTCF is a highly conserved protein known to be associated with insulator sequences in the genomes of human and Drosophila. Although the binding preference for CTCF is highly conserved, we find that CTCF binding itself is highly evolutionarily dynamic and has adaptively evolved. Between species, binding divergence increased linearly with evolutionary distance, and CTCF binding profiles are diverging rapidly at the rate of 2.22% per million years (Myr). At least 89 new CTCF binding sites have originated in the Drosophila melanogaster genome since the most recent common ancestor with Drosophila simulans. Comparing these data to genome sequence data from 37 different strains of Drosophila melanogaster, we detected signatures of selection in both newly gained and evolutionarily conserved binding sites. Newly evolved CTCF binding sites show a significantly stronger signature for positive selection than older sites. Comparative gene expression profiling revealed that expression divergence of genes adjacent to CTCF binding sites is significantly associated with the gain and loss of CTCF binding. Further, the birth of new genes is associated with the birth of new CTCF binding sites. Our data indicate that binding of Drosophila CTCF protein has evolved under natural selection, and CTCF binding evolution has shaped both the evolution of gene expression and genome evolution during the birth of new genes.
Author Summary
A large proportion of the diversity of living organisms results from differential regulation of gene transcription. Transcriptional regulation is thought to differ between species because of evolutionary changes in the physical interactions between regulatory DNA elements and DNA-binding proteins; these can generate variation in the spatial and temporal patterns of gene expression. The mechanism by which these protein–DNA interactions evolve is therefore an important question in evolutionary biology. Does adaptive evolution play a role, or is the process dominated by neutral genetic drift? Insulator proteins are a special group of DNA-binding proteins: instead of directly serving to activate or repress genes, they can function to coordinate the interactions between other regulatory elements (such as enhancers and promoters). Additionally, insulator proteins can limit the spreading of chromatin condensation and help to demarcate the boundaries of regulatory domains in the genome. In spite of their critical role in genome regulation, little is known about the evolution of interactions between insulator proteins and DNA. Here, we use ChIP-seq to examine the distribution of binding sites for CTCF, a highly conserved insulator protein, in four closely related Drosophila species. We find that genome-wide binding profiles of CTCF are highly dynamic across evolutionary time, with frequent births of new CTCF-DNA interactions, and we demonstrate that this evolutionary process is driven by natural selection. By comparing these with RNA-seq data, we find that gain or loss of CTCF binding impacts the expression levels of nearby genes and correlates with structural evolution of the genome. Together these results suggest a potential mechanism of regulatory re-wiring through adaptive evolution of CTCF binding.
PMCID: PMC3491045  PMID: 23139640
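The linear divergence rate quoted in the abstract above (2.22% per Myr) can be turned into a small worked calculation. This is only a sketch: it assumes a straight line through the origin, which the abstract does not state, and the function name `expected_divergence_percent` is hypothetical. The 2.5 and 25 Myr inputs are the separation bounds given in the abstract.

```python
# Rate reported in the abstract: CTCF binding divergence per million years.
RATE_PERCENT_PER_MYR = 2.22


def expected_divergence_percent(separation_myr: float) -> float:
    """Expected binding divergence (%) under a zero-intercept linear model.

    Assumption: divergence = rate * time, with no intercept or saturation.
    """
    if separation_myr < 0:
        raise ValueError("separation time must be non-negative")
    return RATE_PERCENT_PER_MYR * separation_myr


# Bounds of the species separations mentioned in the abstract.
closest = expected_divergence_percent(2.5)
farthest = expected_divergence_percent(25.0)
print(f"{closest:.2f}% at 2.5 Myr, {farthest:.2f}% at 25 Myr")
```

Under this simplified model the four species would span roughly 5.5% to 55% binding divergence; the paper's fitted values may differ.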
21.  Evidence for Autoregulation and Cell Signaling Pathway Regulation From Genome-Wide Binding of the Drosophila Retinoblastoma Protein 
G3: Genes|Genomes|Genetics  2012;2(11):1459-1472.
The retinoblastoma (RB) tumor suppressor protein is a transcriptional cofactor with essential roles in cell cycle and development. Physical and functional targets of RB and its paralogs p107/p130 have been studied largely in cultured cells, but the full biological context of this family of proteins’ activities will likely be revealed only in whole organismal studies. To identify direct targets of the major Drosophila RB counterpart in a developmental context, we carried out ChIP-Seq analysis of Rbf1 in the embryo. The association of the protein with promoters is developmentally controlled; early promoter access is globally inhibited, whereas later in development Rbf1 is found to associate with promoter-proximal regions of approximately 2000 genes. In addition to conserved cell-cycle–related genes, a wholly unexpected finding was that Rbf1 targets many components of the insulin, Hippo, JAK/STAT, Notch, and other conserved signaling pathways. Rbf1 may thus directly affect output of these essential growth-control and differentiation pathways by regulation of expression of receptors, kinases and downstream effectors. Rbf1 was also found to target multiple levels of its own regulatory hierarchy. Bioinformatic analysis indicates that different classes of genes exhibit distinct constellations of motifs associated with the Rbf1-bound regions, suggesting that the context of Rbf1 recruitment may vary within the Rbf1 regulon. Many of these targeted genes are bound by Rbf1 homologs in human cells, indicating that a conserved role of RB proteins may be to adjust the set point of interlinked signaling networks essential for growth and development.
PMCID: PMC3484676  PMID: 23173097
retinoblastoma; Rbf1; cell-cycle; Drosophila
22.  Ultrabithorax confers spatial identity in a context-specific manner in the Drosophila postembryonic ventral nervous system 
Neural Development  2012;7:31.
In holometabolous insects such as Drosophila melanogaster, neuroblasts produce an initial population of diverse neurons during embryogenesis and a much larger set of adult-specific neurons during larval life. In the ventral CNS, many of these secondary neuronal lineages differ significantly from one body segment to another, suggesting a role for anteroposterior patterning genes.
Here we systematically characterize the expression pattern and function of the Hox gene Ultrabithorax (Ubx) in all 25 postembryonic lineages. We find that Ubx is expressed in a segment-, lineage-, and hemilineage-specific manner in the thoracic and anterior abdominal segments. When Ubx is removed from neuroblasts via mitotic recombination, neurons in these segments exhibit the morphologies and survival patterns of their anterior thoracic counterparts. Conversely, when Ubx is ectopically expressed in anterior thoracic segments, neurons exhibit complementary posterior transformation phenotypes.
Our findings demonstrate that Ubx plays a critical role in conferring segment-appropriate morphology and survival on individual neurons in the adult-specific ventral CNS. Moreover, while always conferring spatial identity in some sense, Ubx has been co-opted during evolution for distinct and even opposite functions in different neuronal hemilineages.
PMCID: PMC3520783  PMID: 22967828
Hox; Programmed cell death; CNS; Neuroblast lineages
23.  A Genomic Mechanism for Antagonism Between Retinoic Acid and Estrogen Signaling in Breast Cancer 
Cell  2009;137(7):1259-1271.
Retinoic acid (RA) triggers growth-suppressive effects in tumor cells, and therefore RA and its synthetic analogs have great potential as anti-carcinogenic agents. RA effects are mediated by Retinoic Acid Receptors (RARs), which regulate gene expression in an RA-dependent manner. To define the genetic network regulated by RARs in breast cancer, we identified RAR genomic targets using chromatin immunoprecipitation and expression analysis. We found that RAR binding throughout the genome is highly co-incident with estrogen receptor α (ERα) binding, and identified a widespread crosstalk of RA and estrogen signaling to antagonistically regulate breast cancer-associated genes. ERα and RAR binding sites appear to be co-evolved on a large scale throughout the human genome, allowing for competitive binding between these transcription factors via nearby or overlapping cis-regulatory elements. Together these data indicate the existence of a highly coordinated intersection between these two critical nuclear hormone receptor signaling pathways, providing a global mechanism for balancing gene expression output via local regulatory interactions dispersed throughout the genome.
PMCID: PMC3374131  PMID: 19563758
24.  The Transcription Factor Ultraspiracle Influences Honey Bee Social Behavior and Behavior-Related Gene Expression 
PLoS Genetics  2012;8(3):e1002596.
Behavior is among the most dynamic animal phenotypes, modulated by a variety of internal and external stimuli. Behavioral differences are associated with large-scale changes in gene expression, but little is known about how these changes are regulated. Here we show how a transcription factor (TF), ultraspiracle (usp; the insect homolog of the Retinoid X Receptor), working in complex transcriptional networks, can regulate behavioral plasticity and associated changes in gene expression. We first show that RNAi knockdown of USP in honey bee abdominal fat bodies delayed the transition from working in the hive (primarily “nursing” brood) to foraging outside. We then demonstrate through transcriptomics experiments that USP induced many maturation-related transcriptional changes in the fat bodies by mediating transcriptional responses to juvenile hormone (JH). These maturation-related transcriptional responses to USP occurred without changes in USP's genomic binding sites, as revealed by ChIP–chip. Instead, behaviorally related gene expression is likely determined by combinatorial interactions between USP and other TFs whose cis-regulatory motifs were enriched at USP's binding sites. Many modules of JH- and maturation-related genes were co-regulated in both the fat body and brain, predicting that usp and cofactors influence shared transcriptional networks in both of these maturation-related tissues. Our findings demonstrate how “single gene effects” on behavioral plasticity can involve complex transcriptional networks, in both brain and peripheral tissues.
Author Summary
Animals use behavior as one of the principal means of meeting their basic needs and responding flexibly to changes in their environment. An emerging insight is that changes in behavior are associated with massive changes in gene expression in the brain, but we know relatively little about how these changes are regulated. One important class of gene regulators is the transcription factors (TFs), proteins that orchestrate the expression of tens to thousands of genes. We discovered that ultraspiracle (USP), a TF previously known primarily for its role in development, regulates behavioral change in the honey bee; and we show that USP causes behaviorally related changes in gene expression by mediating responses to an endocrine regulator, juvenile hormone. We present evidence that these effects on gene expression occur through combinatorial interactions between USP and other TFs, and that these hormonally related transcriptional networks are preserved between two tissues with causal roles in behavioral plasticity: the brain and the fat body, a peripheral nutrient-sensing organ. These results suggest that behavior is subserved by complex interactions between genes and gene networks, occurring both in the brain and in peripheral tissues. More generally, our results suggest that molecular systems biology is a promising paradigm by which to understand the mechanistic basis for behavior.
PMCID: PMC3315457  PMID: 22479195
25.  CARM1 is an important determinant of ERα-dependent breast cancer cell differentiation and proliferation in breast cancer cells 
Cancer research  2011;71(6):2118-2128.
Breast cancers expressing estrogen receptor α (ERα) are often more differentiated histologically than ERα-negative tumors, but the reasons for this difference are poorly understood. One possible explanation is that transcriptional cofactors associated with ERα determine the expression of genes which promote a more differentiated phenotype. In this study, we identify one such cofactor as coactivator associated arginine methyltransferase 1 (CARM1), a unique co-activator of ERα that can simultaneously block cell proliferation and induce differentiation through global regulation of ERα-regulated genes. CARM1 was shown to act as an ERα co-activator in cell-based assays, gene expression microarrays, and mouse xenograft models. In human breast tumors, CARM1 expression positively correlated with ERα levels in ER+ tumors but was inversely correlated with tumor grade. Our findings suggest that co-expression of CARM1 and ERα may provide a better biomarker of well-differentiated breast cancer. Further, our findings define an important functional role of this histone arginine methyltransferase in re-programming ERα-regulated cellular processes, implicating CARM1 as a putative epigenetic target in ER-positive breast cancers.
PMCID: PMC3076802  PMID: 21282336
CARM1; histone methylation; breast cancer; differentiation; epigenetics
