1.  Coordinated genomic control of ciliogenesis and cell movement by RFX2 
eLife  2014;3:e01439.
The mechanisms linking systems-level programs of gene expression to discrete cell biological processes in vivo remain poorly understood. In this study, we have defined such a program for multi-ciliated epithelial cells (MCCs), a cell type critical for proper development and homeostasis of the airway, brain and reproductive tracts. Starting from genomic analysis of the cilia-associated transcription factor Rfx2, we used bioinformatics and in vivo cell biological approaches to gain insights into the molecular basis of cilia assembly and function. Moreover, we discovered a previously unrecognized role for an Rfx factor in cell movement, finding that Rfx2 cell-autonomously controls apical surface expansion in nascent MCCs. Thus, Rfx2 coordinates multiple, distinct gene expression programs in MCCs, regulating genes that control cell movement, ciliogenesis, and cilia function. As such, the work serves as a paradigm for understanding genomic control of cell biological processes that span from early cell morphogenetic events to terminally differentiated cellular functions.
eLife digest
Cells that have hundreds of tiny hair-like structures called cilia on their surface have important roles in our airways and also in the brain and reproductive system. By beating in a coordinated manner, the cilia cause fluid to flow in a particular direction. The development of these multiciliated cells is a complex process in which genes are expressed as proteins, with this gene expression being regulated by other proteins called transcription factors.
In invertebrates the development of the cilia is controlled by transcription factors from the RFX family, which also appear to be important for development of cilia in vertebrates. However, the details of this process—in particular, the identities of the genes that are involved and how their functions are related—are not well understood in vertebrates.
Chung et al. have sought to remedy this by analyzing the network of genes whose expression is controlled by the transcription factor Rfx2 in vertebrates. The results showed that the genes controlled by Rfx2 were involved in all aspects of cilia, including several genes that are known to be mutated in diseases caused by abnormal cilia. Chung et al. also identified genes that were not previously thought to be relevant to cilia.
As multiciliated cells are developing, but before they can generate cilia, they must first migrate from the bottom of the epithelium, the layer of tissue in which they function, to the top of this layer. Chung et al. found that Rfx2 was also involved in this process.
The approach taken by Chung et al.—which involved a combination of RNA sequence analysis, examination of Rfx2 binding sites on chromosomes, computational predictions of protein interactions and in vivo cellular imaging—could be used to perform similar systems-level analyses of other developmental and biological processes.
PMCID: PMC3889689  PMID: 24424412
cilia; multiciliated cells; mucociliary epithelium; cilia beating; Rfx2; genomics; ttc29; ribc2; nme5; protofilament ribbon; Xenopus
2.  Identifying direct targets of transcription factor Rfx2 that coordinate ciliogenesis and cell movement 
Genomics data  2014;2:192-194.
Recently, using the frog Xenopus laevis as a model system, we showed that transcription factor Rfx2 coordinates many genes involved in ciliogenesis and cell movement in multiciliated cells (Chung et al., 2014). To our knowledge, it was the first paper to utilize the genomic resources, including genome sequences and interim gene annotations, from the ongoing Xenopus laevis genome project. For researchers who are interested in the application of genomics and systems biology approaches in Xenopus studies, here we provide additional details about our dataset (NCBI GEO accession number GSE50593) and describe how we analyzed RNA-seq and ChIP-seq data to identify direct targets of Rfx2.
PMCID: PMC4236849  PMID: 25419512
Xenopus laevis; Rfx2; Ciliogenesis
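The idea of combining RNA-seq and ChIP-seq evidence to call direct targets can be sketched as a simple set intersection. This is a minimal illustration, not the authors' pipeline: the gene names, the fold-change cutoff, the distance window, and the function `direct_targets` are all hypothetical assumptions introduced here.

```python
from typing import Dict, List


def direct_targets(log2_fold_change: Dict[str, float],
                   peak_distance_to_tss: Dict[str, int],
                   min_abs_log2fc: float = 1.0,
                   max_peak_distance: int = 5000) -> List[str]:
    """Return genes that both respond to the factor and carry a nearby peak.

    log2_fold_change: per-gene expression change after perturbing the factor.
    peak_distance_to_tss: per-gene distance (bp) from the nearest binding
        peak to the transcription start site; genes without a peak are absent.
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    # Genes whose expression changes strongly enough (RNA-seq evidence).
    responsive = {gene for gene, lfc in log2_fold_change.items()
                  if abs(lfc) >= min_abs_log2fc}
    # Genes with a binding peak close to the start site (ChIP-seq evidence).
    bound = {gene for gene, dist in peak_distance_to_tss.items()
             if abs(dist) <= max_peak_distance}
    # A candidate direct target needs both lines of evidence.
    return sorted(responsive & bound)


# Hypothetical placeholder values, for illustration only.
rna_seq = {"geneA": -2.1, "geneB": -0.2, "geneC": -1.5, "geneD": 1.3}
chip_seq = {"geneA": 350, "geneB": 120, "geneC": 48000}

targets = direct_targets(rna_seq, chip_seq)
print(targets)
```

Requiring both lines of evidence trades sensitivity for specificity: a gene that is bound but unresponsive, or responsive but unbound, is excluded as a likely indirect or non-functional hit.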
3.  Intrinsic Antimicrobial Resistance Determinants in the Superbug Pseudomonas aeruginosa 
mBio  2015;6(6):e01603-15.
Antimicrobial-resistant bacteria pose a serious threat in the clinic. This is particularly true for opportunistic pathogens that possess high intrinsic resistance. Though many studies have focused on understanding the acquisition of bacterial resistance upon exposure to antimicrobials, the mechanisms controlling intrinsic resistance are not well understood. In this study, we subjected the model opportunistic superbug Pseudomonas aeruginosa to 14 antimicrobials under highly controlled conditions and assessed its response using expression- and fitness-based genomic approaches. Our results reveal that gene expression changes and mutant fitness in response to sub-MIC antimicrobials do not correlate on a genomewide scale, indicating that gene expression is not a good predictor of fitness determinants. In general, fewer fitness determinants were identified for antiseptics and disinfectants than for antibiotics. Analysis of gene expression and fitness data together allowed the prediction of antagonistic interactions between antimicrobials and insight into the molecular mechanisms controlling these interactions.
Infections involving multidrug-resistant pathogens are difficult to treat because the therapeutic options are limited. These infections impose a significant financial burden on infected patients and on health care systems. Despite years of antimicrobial resistance research, we lack a comprehensive understanding of the intrinsic mechanisms controlling antimicrobial resistance. This work uses two fine-scale genomic approaches to identify genetic loci important for antimicrobial resistance of the opportunistic pathogen Pseudomonas aeruginosa. Our results reveal that antibiotics have more resistance determinants than antiseptics/disinfectants and that gene expression upon exposure to antimicrobials is not a good predictor of these resistance determinants. In addition, we show that when used together, genomewide gene expression and fitness profiling can provide mechanistic insights into multidrug resistance mechanisms.
PMCID: PMC4626858  PMID: 26507235
4.  Protein-to-mRNA Ratios Are Conserved between Pseudomonas aeruginosa Strains 
Journal of Proteome Research  2014;13(5):2370-2380.
Recent studies have shown that the concentrations of proteins expressed from orthologous genes are often conserved across organisms and to a greater extent than the abundances of the corresponding mRNAs. However, such studies have not distinguished between evolutionary (e.g., sequence divergence) and environmental (e.g., growth condition) effects on the regulation of steady-state protein and mRNA abundances. Here, we systematically investigated the transcriptome and proteome of two closely related Pseudomonas aeruginosa strains, PAO1 and PA14, under identical experimental conditions, thus controlling for environmental effects. For 703 genes observed by both shotgun proteomics and microarray experiments, we found that the protein-to-mRNA ratios are highly correlated between orthologous genes in the two strains to an extent comparable to protein and mRNA abundances. In spite of this high molecular similarity between PAO1 and PA14, we found that several metabolic, virulence, and antibiotic resistance genes are differentially expressed between the two strains, mostly at the protein but not at the mRNA level. Our data demonstrate that the magnitude and direction of the effect of protein abundance regulation occurring after the setting of mRNA levels is conserved between bacterial strains and is important for explaining the discordance between mRNA and protein abundances.
PMCID: PMC4012837  PMID: 24742327
Transcriptomics; proteomics; Pseudomonas aeruginosa
6.  Pseudomonas aeruginosa Enhances Production of a Non-Alginate Exopolysaccharide during Long-Term Colonization of the Cystic Fibrosis Lung 
PLoS ONE  2013;8(12):e82621.
The gram-negative opportunistic pathogen Pseudomonas aeruginosa is the primary cause of chronic respiratory infections in individuals with the heritable disease cystic fibrosis (CF). These infections can last for decades, during which time P. aeruginosa has been proposed to acquire beneficial traits via adaptive evolution. Because CF lacks an animal model that can acquire chronic P. aeruginosa infections, identifying genes important for long-term in vivo fitness remains difficult. However, since clonal, chronological samples can be obtained from chronically infected individuals, traits undergoing adaptive evolution can be identified. Recently we identified 24 P. aeruginosa gene expression traits undergoing parallel evolution in vivo in multiple individuals, suggesting they are beneficial to the bacterium. The goal of this study was to determine if these genes impact P. aeruginosa phenotypes important for survival in the CF lung. By using a gain-of-function genetic screen, we found that 4 genes and 2 operons undergoing parallel evolution in vivo promote P. aeruginosa biofilm formation. These genes/operons promote biofilm formation by increasing levels of the non-alginate exopolysaccharide Psl. One of these genes, phaF, enhances Psl production via a post-transcriptional mechanism, while the other 5 genes/operons do not act on either psl transcription or translation. Together, these data demonstrate that P. aeruginosa has evolved at least two pathways to over-produce a non-alginate exopolysaccharide during long-term colonization of the CF lung. More broadly, this approach allowed us to attribute a biological significance to genes with unknown function, demonstrating the power of using evolution as a guide for targeted genetic studies.
PMCID: PMC3855792  PMID: 24324811
7.  Id2a functions to limit Notch pathway activity and thereby influence the transition from proliferation to differentiation of retinoblasts during zebrafish retinogenesis 
Developmental biology  2012;371(2):280-292.
During vertebrate retinogenesis, the precise balance between retinoblast proliferation and differentiation is spatially and temporally regulated through a number of intrinsic factors and extrinsic signaling pathways. Moreover, there are complex gene regulatory network interactions between these intrinsic factors and extrinsic pathways, which ultimately function to determine when retinoblasts exit the cell cycle and terminally differentiate. We recently uncovered a cell non-autonomous role for the intrinsic HLH factor, Id2a, in regulating retinoblast proliferation and differentiation, with Id2a-deficient retinae containing an abundance of proliferative retinoblasts and an absence of terminally differentiated retinal neurons and glia. Here, we report that Id2a function is necessary and sufficient to limit Notch pathway activity during retinogenesis. Id2a-deficient retinae possess elevated levels of Notch pathway component gene expression, while retinae overexpressing id2a possess reduced expression of Notch pathway component genes. Attenuation of Notch signaling activity by DAPT or by morpholino knockdown of Notch1a is sufficient to rescue both the proliferative and differentiation defects in Id2a-deficient retinae. In addition to regulating Notch pathway activity, through a novel RNA-Seq and differential gene expression analysis of Id2a-deficient retinae, we identify a number of additional intrinsic and extrinsic regulatory pathway components whose expression is regulated by Id2a. These data highlight the integral role played by Id2a in the gene regulatory network governing the transition from retinoblast proliferation to terminal differentiation during vertebrate retinogenesis.
PMCID: PMC3477674  PMID: 22981606
Id2a; Id2; retinogenesis; Notch; zebrafish
8.  Role of Pseudomonas aeruginosa Peptidoglycan-Associated Outer Membrane Proteins in Vesicle Formation 
Journal of Bacteriology  2013;195(2):213-219.
Gram-negative bacteria produce outer membrane vesicles (OMVs) that package and deliver proteins, small molecules, and DNA to prokaryotic and eukaryotic cells. The molecular details of OMV biogenesis have not been fully elucidated, but peptidoglycan-associated outer membrane proteins that tether the outer membrane to the underlying peptidoglycan have been shown to be critical for OMV formation in multiple Enterobacteriaceae. In this study, we demonstrate that the peptidoglycan-associated outer membrane proteins OprF and OprI, but not OprL, impact production of OMVs by the opportunistic pathogen Pseudomonas aeruginosa. Interestingly, OprF does not appear to be important for tethering the outer membrane to peptidoglycan but instead impacts OMV formation through modulation of the levels of the Pseudomonas quinolone signal (PQS), a quorum signal previously shown by our laboratory to be critical for OMV formation. Thus, the mechanism by which OprF impacts OMV formation is distinct from that for other peptidoglycan-associated outer membrane proteins, including OprI.
PMCID: PMC3553829  PMID: 23123904
9.  Protein abundances are more conserved than mRNA abundances across diverse taxa 
Proteomics  2010;10(23):4209-4212.
Proteins play major roles in most biological processes; as a consequence, protein expression levels are highly regulated. While extensive post-transcriptional, translational and protein degradation controls clearly influence protein concentration and functionality, it is often thought that protein abundances are primarily determined by the abundances of the corresponding mRNAs. Surprisingly, then, a recent study showed that abundances of orthologous nematode and fly proteins correlate better than their corresponding mRNA abundances. We tested if this phenomenon is general by collecting and testing matching large-scale protein and mRNA expression datasets from seven different species: two bacteria, yeast, nematode, fly, human, and plant. We find that steady-state abundances of proteins show significantly higher correlation across these diverse phylogenetic taxa than the abundances of their corresponding mRNAs (p=0.0008, paired Wilcoxon). These data support the presence of strong selective pressure to maintain protein abundances during evolution, even when mRNA abundances diverge.
PMCID: PMC3113407  PMID: 21089048
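The paired Wilcoxon comparison mentioned in the abstract can be illustrated with a small exact signed-rank test. This is a sketch under stated assumptions: each pair is taken to be the protein-level and mRNA-level correlation for one species pair, the numbers are invented placeholders rather than data from the study, and the helper names are hypothetical. Exact enumeration is practical only for a small number of pairs.

```python
from itertools import product
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank_exact(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided exact p-value for paired samples; zero differences are dropped."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = abs(w_plus - total / 2)
    # Under the null, each difference is equally likely to be + or -,
    # so enumerate every sign pattern and count those at least as extreme.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - total / 2) >= observed - 1e-12:
            extreme += 1
    return extreme / 2 ** len(ranks)


# Hypothetical correlations for six species pairs (illustrative only).
protein_corr = [0.62, 0.55, 0.70, 0.48, 0.66, 0.59]
mrna_corr = [0.41, 0.39, 0.45, 0.40, 0.35, 0.30]

p_value = wilcoxon_signed_rank_exact(protein_corr, mrna_corr)
print(p_value)
```

Because the test uses only the signs and ranks of the paired differences, it makes no assumption that the correlations are normally distributed, which suits a small set of bounded values.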
10.  MSblender: a probabilistic approach for integrating peptide identifications from multiple database search engines 
Journal of proteome research  2011;10(7):2949-2958.
Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers from limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for all possible PSMs and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for all detected proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improved sensitivity in differential expression analyses.
PMCID: PMC3128686  PMID: 21488652
integrative analysis; database search; peptide identification
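Once every PSM carries a probability of being correct, a false discovery rate can be estimated directly from those probabilities. The sketch below shows this generic technique, in which the estimated FDR of an accepted set is the mean of (1 - probability) over its members. It is not MSblender's implementation: the function `accept_at_fdr`, the PSM identifiers, and the probabilities are hypothetical.

```python
from typing import List, Tuple


def accept_at_fdr(psm_probs: List[Tuple[str, float]],
                  max_fdr: float) -> List[str]:
    """Accept the largest top-ranked set of PSMs within an FDR budget.

    psm_probs: (identifier, probability that the match is correct) pairs.
    The estimated FDR of a set is the mean of (1 - probability) over it.
    """
    ranked = sorted(psm_probs, key=lambda item: item[1], reverse=True)
    accepted_count = 0
    expected_false = 0.0
    for n, (_, prob) in enumerate(ranked, start=1):
        # Each accepted PSM adds its own chance of being wrong.
        expected_false += 1.0 - prob
        if expected_false / n <= max_fdr:
            accepted_count = n
    return [psm_id for psm_id, _ in ranked[:accepted_count]]


# Hypothetical probabilities, for illustration only.
psms = [("psm1", 0.99), ("psm2", 0.97), ("psm3", 0.90),
        ("psm4", 0.60), ("psm5", 0.20)]

accepted = accept_at_fdr(psms, max_fdr=0.05)
print(accepted)
```

Because the list is sorted by decreasing probability, the running estimate never decreases, so the last position that fits the budget marks the largest acceptable set.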
11.  Parallel Evolution in Pseudomonas aeruginosa over 39,000 Generations In Vivo 
mBio  2010;1(4):e00199-10.
The Gram-negative bacterium Pseudomonas aeruginosa is a common cause of chronic airway infections in individuals with the heritable disease cystic fibrosis (CF). After prolonged colonization of the CF lung, P. aeruginosa becomes highly resistant to host clearance and antibiotic treatment; therefore, understanding how this bacterium evolves during chronic infection is important for identifying beneficial adaptations that could be targeted therapeutically. To identify potential adaptive traits of P. aeruginosa during chronic infection, we carried out global transcriptomic profiling of chronological clonal isolates obtained from 3 individuals with CF. Isolates were collected sequentially over periods ranging from 3 months to 8 years, representing up to 39,000 in vivo generations. We identified 24 genes that were commonly regulated by all 3 P. aeruginosa lineages, including several genes encoding traits previously shown to be important for in vivo growth. Our results reveal that parallel evolution occurs in the CF lung and that at least a proportion of the traits identified are beneficial for P. aeruginosa chronic colonization of the CF lung.
Deadly diseases like AIDS, malaria, and tuberculosis are the result of long-term chronic infections. Pathogens that cause chronic infections adapt to the host environment, avoiding the immune response and resisting antimicrobial agents. Studies of pathogen adaptation are therefore important for understanding how the efficacy of current therapeutics may change upon prolonged infection. One notorious chronic pathogen is Pseudomonas aeruginosa, a bacterium that causes long-term infections in individuals with the heritable disease cystic fibrosis (CF). We used gene expression profiles to identify 24 genes that commonly changed expression over time in 3 P. aeruginosa lineages, indicating that these changes occur in parallel in the lungs of individuals with CF. Several of these genes have previously been shown to encode traits critical for in vivo-relevant processes, suggesting that they are likely beneficial adaptations important for chronic colonization of the CF lung.
PMCID: PMC2939680  PMID: 20856824
12.  Mining gene functional networks to improve mass-spectrometry-based protein identification 
Bioinformatics  2009;25(22):2955-2961.
Motivation: High-throughput protein identification experiments based on tandem mass spectrometry (MS/MS) often suffer from low sensitivity and low-confidence protein identifications. In a typical shotgun proteomics experiment, it is assumed that all proteins are equally likely to be present. However, there is often other evidence to suggest that a protein is present and confidence in individual protein identification can be updated accordingly.
Results: We develop a method that analyzes MS/MS experiments in the larger context of the biological processes active in a cell. Our method, MSNet, improves protein identification in shotgun proteomics experiments by considering information on functional associations from a gene functional network. MSNet substantially increases the number of proteins identified in the sample at a given error rate. Applied to yeast grown in different experimental conditions and analyzed on different MS/MS instruments, it identifies 8–29% more proteins than the original MS experiment, and 37% more proteins in a human sample. We validate up to 94% of our identifications in yeast by presence in ground-truth reference sets.
Availability and Implementation: Software and datasets are available at
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC2773251  PMID: 19633097

