Syntrophies are metabolic cooperations, whereby two organisms co-metabolize a substrate in an interdependent manner. Many of the observed natural syntrophic interactions are mandatory in the absence of strong electron acceptors, such that one species in the syntrophy has to assume the role of electron sink for the other. While this presents an ecological setting for syntrophy to be beneficial, the potential genetic drivers of syntrophy remain unknown to date. Here, we show that the syntrophic sulfate-reducing species Desulfovibrio vulgaris displays a stable genetic polymorphism, where only a specific genotype is able to engage in syntrophy with the hydrogenotrophic methanogen Methanococcus maripaludis. This 'syntrophic' genotype is characterized by two genetic alterations, one of which is an in-frame deletion in the gene encoding the ion-translocating subunit cooK of the membrane-bound COO hydrogenase. We show that this genotype presents a specific physiology, in which reshaping of energy conservation in the lactate oxidation pathway enables it to produce sufficient intermediate hydrogen for sustained M. maripaludis growth and thus syntrophy. To our knowledge, these findings provide for the first time a genetic basis for syntrophy in nature and bring us closer to the rational engineering of syntrophy in synthetic microbial communities.
In Primula vulgaris, outcrossing is promoted through reciprocal herkogamy with insect‐mediated cross‐pollination between pin and thrum form flowers. Development of heteromorphic flowers is coordinated by genes at the S locus. To underpin construction of a genetic map facilitating isolation of these S locus genes, we have characterised Oakleaf, a novel S locus‐linked mutant phenotype. We combine phenotypic observation of flower and leaf development with classical genetic analysis and next‐generation sequencing to address the molecular basis of Oakleaf.
Oakleaf is a dominant mutation that affects both leaf and flower development; plants produce distinctive lobed leaves, with occasional ectopic meristems on the veins. This phenotype is reminiscent of overexpression of Class I KNOX‐homeodomain transcription factors. We describe the structure and expression of all eight P. vulgaris PvKNOX genes in both wild‐type and Oakleaf plants, and present comparative transcriptome analysis of leaves and flowers from Oakleaf and wild‐type plants.
Oakleaf provides a new phenotypic marker for genetic analysis of the Primula S locus. We show that none of the Class I PvKNOX genes are strongly upregulated in Oakleaf leaves and flowers, and identify cohorts of 507 upregulated and 314 downregulated genes in the Oakleaf mutant.
heterostyly; KNOX genes; Oakleaf; Primula vulgaris; S locus
We previously identified and characterized an intramolecular trans-sialidase (IT-sialidase) in the gut symbiont Ruminococcus gnavus ATCC 29149, which is associated with the ability of the strain to grow on mucins. In this work we have obtained and analyzed the draft genome sequence of another R. gnavus mucin-degrader, ATCC 35913, isolated from a healthy individual. Transcriptomics analyses of both ATCC 29149 and ATCC 35913 strains confirmed that the strategy utilized by R. gnavus for mucin degradation is focused on the utilization of terminal mucin glycans. R. gnavus ATCC 35913 also encodes a predicted IT-sialidase and harbors a Nan cluster dedicated to sialic acid utilization. We showed that the Nan cluster was upregulated when the strains were grown in the presence of mucin. In addition, we demonstrated that both R. gnavus strains were able to grow on 2,7-anhydro-Neu5Ac, the IT-sialidase transglycosylation product, as a sole carbon source. Taken together, these data further support the hypothesis that IT-sialidase-expressing gut microbes provide commensal bacteria such as R. gnavus with a nutritional competitive advantage, by accessing and transforming a source of nutrient to their own benefit.
gut bacteria; glycoside hydrolase; intestinal mucin; intramolecular trans-sialidase; mucin glycans; Ruminococcus gnavus; sialic acid
The domestic dog, Canis familiaris, is a valuable model for studying human diseases. The publication of the latest canine genome build and annotation, CanFam3.1, provides an opportunity to enhance our understanding of gene regulation across tissues in the dog model system. In this study, we used the latest dog genome assembly and small RNA sequencing data from 9 different dog tissues to predict novel miRNAs in the dog genome, as well as to annotate conserved miRNAs from the miRBase database that were missing from the current dog annotation. We used both miRCat and miRDeep2 algorithms to computationally predict miRNA loci. The resulting putative hairpin sequences were analysed in order to discard false positives, based on predicted secondary structures and patterns of small RNA read alignments. Results were further divided into high and low confidence miRNAs, using the same criteria. We generated tissue-specific expression profiles for the resulting set of 811 loci: 720 conserved miRNAs (207 of which had not been previously annotated in the dog genome) and 91 novel miRNA loci. Comparative analyses revealed 8 putative homologues of some novel miRNAs in ferret, and one in microbat. All miRNAs were also classified into the genic and intergenic categories, based on the Ensembl RefSeq gene annotation for CanFam3.1. This additionally allowed us to identify four previously undescribed mirtrons among our total set of miRNAs. We additionally annotated piRNAs, using proTRAC on the same input data. We thus identified 263 putative clusters, most of which (211 clusters) were found to be expressed in testis. Our results represent an important improvement of the dog genome annotation, paving the way to further research on the evolution of gene regulation, as well as on the contribution of post-transcriptional regulation to pathological conditions.
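The genic/intergenic classification mentioned above can be illustrated with a minimal sketch. It assumes a locus counts as genic when it overlaps any annotated gene interval on the same chromosome; the abstract does not state the exact rule (for example, whether strand is considered), and the function name, locus names and coordinates below are hypothetical.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end), 1-based inclusive. All coordinates are made up.
Interval = Tuple[str, int, int]


def classify_locus(locus: Interval, genes: List[Interval]) -> str:
    """Label a locus 'genic' if it overlaps any gene interval, else 'intergenic'.

    Assumption: strand is ignored and any overlap of at least one base counts.
    """
    chrom, start, end = locus
    for g_chrom, g_start, g_end in genes:
        # Two closed intervals overlap when each starts before the other ends.
        if g_chrom == chrom and start <= g_end and end >= g_start:
            return "genic"
    return "intergenic"


# Hypothetical gene annotation and miRNA loci, for illustration only.
toy_genes: List[Interval] = [("chr1", 1000, 5000), ("chr2", 200, 900)]
toy_loci: Dict[str, Interval] = {
    "mir-a": ("chr1", 1200, 1280),   # inside the chr1 gene
    "mir-b": ("chr1", 6000, 6075),   # downstream of the chr1 gene
    "mir-c": ("chr2", 850, 930),     # partially overlaps the chr2 gene
}

labels = {name: classify_locus(loc, toy_genes) for name, loc in toy_loci.items()}
print(labels)
```

A linear scan is enough for a toy example; for a full annotation an interval tree or a sorted sweep would be the usual choice.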
Lactobacillus reuteri is a gut symbiont of a wide variety of vertebrate species that has diversified into distinct phylogenetic clades which are to a large degree host-specific. Previous work demonstrated host specificity in mice and began to determine the mechanisms by which gut colonisation and host restriction are achieved. However, how L. reuteri strains colonise the gastrointestinal (GI) tract of pigs is unknown.
To gain insight into the ecology of L. reuteri in the pig gut, the genome sequence of the porcine small intestinal isolate L. reuteri ATCC 53608 was completed and consisted of a chromosome of 1.94 Mbp and two plasmids of 138.5 kbp and 9.09 kbp, respectively. Furthermore, we generated draft genomes of four additional L. reuteri strains isolated from pig faeces or lower GI tract, lp167-67, pg-3b, 20-2 and 3c6, and subjected all five genomes to a comparative genomic analysis together with the previously completed genome of strain I5007. A phylogenetic analysis based on whole genomes showed that porcine L. reuteri strains fall into two distinct clades, as previously suggested by multi-locus sequence analysis. These six pig L. reuteri genomes contained a core set of 1364 orthologous gene clusters, as determined by OrthoMCL analysis, that contributed to a pan-genome totalling 3373 gene clusters. Genome comparisons of the six pig L. reuteri strains with 14 L. reuteri strains from other host origins gave a total pan-genome of 5225 gene clusters that included a core genome of 851 gene clusters but revealed that there were no pig-specific genes per se. However, genes specific for and conserved among strains of the two pig phylogenetic lineages were detected, some of which encoded cell surface proteins that could contribute to the diversification of the two lineages and their observed host specificity.
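The core-genome, pan-genome and lineage-specific counts reported above follow from simple set operations once each strain is represented by the set of orthologous gene clusters it contains. The sketch below is illustrative only: it assumes cluster membership has already been computed (for example by OrthoMCL), and the strain names, cluster IDs and function names are hypothetical toy values, not data from the study.

```python
from typing import Dict, Set, Tuple


def core_and_pan_genome(clusters_by_strain: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (core, pan): clusters found in every strain, and in at least one."""
    strain_sets = list(clusters_by_strain.values())
    if not strain_sets:
        return set(), set()
    pan = set().union(*strain_sets)
    core = set.intersection(*strain_sets)
    return core, pan


def lineage_specific(clusters_by_strain: Dict[str, Set[str]], lineage: Set[str]) -> Set[str]:
    """Clusters conserved in all strains of a lineage and absent from all others."""
    inside = [c for s, c in clusters_by_strain.items() if s in lineage]
    outside = [c for s, c in clusters_by_strain.items() if s not in lineage]
    if not inside:
        return set()
    conserved = set.intersection(*inside)
    return conserved - set().union(*outside)


# Hypothetical toy data: three strains, six gene clusters.
toy = {
    "strain_A": {"c1", "c2", "c3", "c6"},
    "strain_B": {"c1", "c2", "c4", "c6"},
    "strain_C": {"c1", "c2", "c5"},
}

core, pan = core_and_pan_genome(toy)
specific = lineage_specific(toy, {"strain_A", "strain_B"})
print(len(core), len(pan), sorted(specific))
```

In this toy case the core genome is `{c1, c2}`, the pan-genome has six clusters, and `c6` is the only cluster specific to the lineage made up of strains A and B.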
This study extends the phylogenetic analysis of L. reuteri strains at a genome-wide level, pointing to distinct evolutionary trajectories of porcine L. reuteri lineages, and providing new insights into the genomic events in L. reuteri that occurred during specialisation to their hosts. The occurrence of two distinct pig-derived clades may reflect differences in host genotype, environmental factors such as dietary components, or evolution from ancestral strains of human and rodent origin following contact with pig populations.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-2216-7) contains supplementary material, which is available to authorized users.
Lactobacillus reuteri; Pig; Host-specificity; Comparative genomics; Clade-specific genes; Surface adhesins; Serine-rich repeat proteins; Auxiliary secretion system
In 2013, in response to an epidemic of ash dieback disease in England the previous year, we launched a Facebook-based game called Fraxinus to enable non-scientists to contribute to genomics studies of the pathogen that causes the disease and the ash trees that are devastated by it. Over a period of 51 weeks, players were able to match computational alignments of genetic sequences in 78% of cases, and to improve them in 15% of cases. We also found that most players were only transiently interested in the game, and that the majority of the work done was performed by a small group of dedicated players. Based on our experiences, we have built a linear model for the length of time that contributors are likely to donate to a crowd-sourced citizen science project. This model could serve as a guide for the design and implementation of future crowd-sourced citizen science initiatives.
Hymenoscyphus fraxinea; ash; crowdsourcing; citizen science; other
Motivation: The de novo assembly of genomes from whole-genome shotgun sequence data is a computationally intensive, multi-stage task and it is not known a priori which methods and parameter settings will produce optimal results. In current de novo assembly projects, a popular strategy involves trying many approaches, using different tools and settings, and then comparing and contrasting the results in order to select a final assembly for publication.
Results: Herein, we present RAMPART, a configurable workflow management system for de novo genome assembly, which helps the user identify combinations of third-party tools and settings that provide good results for their particular genome and sequenced reads. RAMPART is designed to exploit high-performance computing environments, such as clusters and shared memory systems, where available.
Availability and implementation: RAMPART is available under the GPLv3 license at: https://github.com/TGAC/RAMPART.
Supplementary data are available at Bioinformatics online. In addition, the user manual is available online at: http://rampart.readthedocs.org/en/latest.
Plant–microbe interactions in the rhizosphere have important roles in biogeochemical cycling, and maintenance of plant health and productivity, yet remain poorly understood. Using RNA-based metatranscriptomics, the global active microbiomes were analysed in soil and rhizospheres of wheat, oat, pea and an oat mutant (sad1) deficient in production of anti-fungal avenacins. Rhizosphere microbiomes differed from bulk soil and between plant species. Pea (a legume) had a much stronger effect on the rhizosphere than wheat and oat (cereals), resulting in a dramatically different rhizosphere community. The relative abundance of eukaryotes in the oat and pea rhizospheres was more than fivefold higher than in the wheat rhizosphere or bulk soil. Nematodes and bacterivorous protozoa were enriched in all rhizospheres, whereas the pea rhizosphere was highly enriched for fungi. Metabolic capabilities for rhizosphere colonisation were selected, including cellulose degradation (cereals), H2 oxidation (pea) and methylotrophy (all plants). Avenacins had little effect on the prokaryotic community of oat, but the eukaryotic community was strongly altered in the sad1 mutant, suggesting that avenacins have a broader role than protecting from fungal pathogens. Profiling microbial communities with metatranscriptomics allows comparison of relative abundance, from multiple samples, across all domains of life, without polymerase chain reaction bias. This revealed profound differences in the rhizosphere microbiome, particularly at the kingdom level between plants.
rhizosphere; metatranscriptomics; microbiome; wheat; oat; pea
Lactobacillus salivarius is part of the vertebrate indigenous microbiota of the gastrointestinal tract, oral cavity, and milk. The properties associated with some L. salivarius strains have led to their use as probiotics. Here we describe the draft genome of the pig isolate L. salivarius cp400, providing insights into host-niche specialization.
Chronic polymicrobial infections of the lung are the foremost cause of morbidity and mortality in cystic fibrosis (CF) patients. The composition of the microbial flora of the airway alters considerably during infection, particularly during patient exacerbation. An understanding of which organisms are growing, their environment and their behaviour in the airway is of importance for designing antibiotic treatment regimes and for patient prognosis. To this end, we have analysed sputum samples taken from separate cohorts of CF and non-CF subjects for metabolites and, in parallel, have examined both isolated DNA and RNA for the presence of 16S rRNA genes and transcripts by high-throughput sequencing of amplicon or cDNA libraries. This analysis revealed that although the population sizes of all dominant orders of bacteria as measured by DNA- and RNA-based methods are similar, greater discrepancies are seen with less prevalent organisms, some of which we associated with CF for the first time. Additionally, we identified a strong relationship between the abundance of specific anaerobes and fluctuations in several metabolites including lactate and putrescine during patient exacerbation. This study has hence identified organisms whose occurrence within the CF microbiome has been hitherto unreported and has revealed potential metabolic biomarkers for exacerbation.
Bdellovibrio bacteriovorus are facultatively predatory bacteria that grow within gram-negative prey, using pili to invade their periplasmic niche. They also grow prey-independently on organic nutrients after undergoing a reversible switch. The nature of the growth switching mechanism has been elusive, but several independent reports suggested mutations in the hit (host-interaction) locus on the Bdellovibrio genome were associated with the transition to prey-independent growth. Pili are essential for prey entry by Bdellovibrio and sequence analysis of the hit locus predicted that it was part of a cluster of Type IVb pilus-associated genes, containing bd0108 and bd0109. In this study we have deleted the whole bd0108 gene, which is unique to Bdellovibrio, and compared its phenotype to strains containing spontaneous mutations in bd0108 and the common natural 42 bp deletion variant of bd0108. We find that deletion of the whole bd0108 gene greatly reduced the extrusion of pili, whereas the 42 bp deletion caused greater pilus extrusion than wild-type. The pili isolated from these strains were composed of the Type IVa pilin protein, PilA. Attempts to similarly delete gene bd0109, which like bd0108 encodes a periplasmic/secreted protein, were not successful, suggesting that it is likely to be essential for Bdellovibrio viability in any growth mode. Bd0109 has a sugar-binding YD-repeat motif and an N-terminus with a putative pilin-like fold and was found to interact directly with Bd0108. These results lead us to propose that the Bd0109/Bd0108 interaction regulates pilus production in Bdellovibrio (possibly by interaction with the pilus fibre at the cell wall), and that the presence (and possibly retraction state) of the pilus feeds back to alter the growth state of the Bdellovibrio cell. We further identify a novel small RNA encoded by the hit locus, the transcription of which is altered in different bd0108 mutation backgrounds.
Cyclic guanosine 3′,5′-monophosphate (cyclic GMP) is a second messenger whose role in bacterial signalling is poorly understood. A genetic screen in the plant pathogen Xanthomonas campestris (Xcc) identified that XC_0250, which encodes a protein with a class III nucleotidyl cyclase domain, is required for cyclic GMP synthesis. Purified XC_0250 was active in cyclic GMP synthesis in vitro. The linked gene XC_0249 encodes a protein with a cyclic mononucleotide-binding (cNMP) domain and a GGDEF diguanylate cyclase domain. The activity of XC_0249 in cyclic di-GMP synthesis was enhanced by addition of cyclic GMP. The isolated cNMP domain of XC_0249 bound cyclic GMP and a structure–function analysis, directed by determination of the crystal structure of the holo-complex, demonstrated the site of cyclic GMP binding that modulates cyclic di-GMP synthesis. Mutation of either XC_0250 or XC_0249 led to a reduced virulence to plants and reduced biofilm formation in vitro. These findings describe a regulatory pathway in which cyclic GMP regulates virulence and biofilm formation through interaction with a novel effector that directly links cyclic GMP and cyclic di-GMP signalling.
A cyclic GMP-dependent signalling pathway regulates bacterial phytopathogenesis
In the plant pathogen X. campestris, the second messenger cGMP controls bacterial virulence and biofilm formation through direct regulation of XC_0249, a novel diguanylate cyclase that synthesises the signalling molecule cyclic di-GMP.
biofilm; cyclic di-GMP; signal transduction; virulence; Xanthomonas campestris
Ash dieback is a devastating fungal disease of ash trees that has swept across Europe and recently reached the UK. This emergent pathogen has received little study in the past and its effect threatens to overwhelm the ash population. In response to this we have produced some initial genomics datasets and taken the unusual step of releasing them to the scientific community for analysis without first performing our own. In this manner we hope to ‘crowdsource’ analyses and bring the expertise of the community to bear on this problem as quickly as possible. Our data has been released through our website at oadb.tsl.ac.uk and a public GitHub repository.
Crowdsource; Genomics; Ash dieback; Open source; Altmetrics
We have developed GFam, a platform for automatic annotation of gene/protein families. GFam provides a framework for genome initiatives and model organism resources to build domain-based families and derive meaningful functional labels, and offers a seamless approach to propagate functional annotation across periodic genome updates. GFam is a hybrid approach that uses a greedy algorithm to chain component domains from InterPro annotation provided by its 12 member resources, followed by a sequence-based connected component analysis of un-annotated sequence regions, to derive a consensus domain architecture for each sequence and subsequently generate families based on common architectures. For the proteome of Arabidopsis, our integrated approach increases sequence coverage by 7.2 percentage points and residue coverage by 14.6 percentage points relative to the best single-constituent database within InterPro. The true power of GFam lies in maximizing annotation provided by the different InterPro data sources that offer resource-specific coverage for different regions of a sequence. GFam’s capability to capture higher sequence and residue coverage can be useful for genome annotation, comparative genomics and functional studies. GFam is general-purpose software and can be used for any collection of protein sequences. The software is open source and can be obtained from http://www.paccanarolab.org/software/gfam/.
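To make the idea of greedily chaining domains and measuring residue coverage concrete, here is a minimal sketch. It is not GFam's implementation: it assumes one plausible greedy rule (take the longest domain hits first and skip any hit that overlaps one already chosen), whereas the real tool's ordering, overlap tolerance and source priorities may differ. The domain IDs, coordinates and function names are hypothetical.

```python
from typing import List, Tuple

# (domain_id, start, end), 1-based inclusive residue positions.
Domain = Tuple[str, int, int]


def greedy_chain(domains: List[Domain]) -> List[Domain]:
    """Pick non-overlapping domains, longest first (an assumed greedy rule)."""
    chosen: List[Domain] = []
    # Sort by descending length, breaking ties by start position.
    for dom in sorted(domains, key=lambda d: (-(d[2] - d[1] + 1), d[1])):
        _, start, end = dom
        # Keep the hit only if it is disjoint from every domain chosen so far.
        if all(end < s or start > e for _, s, e in chosen):
            chosen.append(dom)
    # Report the architecture in N- to C-terminal order.
    return sorted(chosen, key=lambda d: d[1])


def residue_coverage(architecture: List[Domain], seq_len: int) -> float:
    """Fraction of residues covered by the (non-overlapping) chosen domains."""
    covered = sum(end - start + 1 for _, start, end in architecture)
    return covered / seq_len


# Hypothetical hits from different member databases on a 250-residue protein.
hits: List[Domain] = [
    ("IPR_A", 1, 100),
    ("IPR_B", 80, 120),   # overlaps IPR_A, so it is skipped
    ("IPR_C", 130, 200),
    ("IPR_D", 150, 160),  # nested in IPR_C, so it is skipped
]

arch = greedy_chain(hits)
print([d[0] for d in arch], residue_coverage(arch, 250))
```

Combining hits from several sources in this way can cover regions that no single database annotates, which is the intuition behind the coverage gain described above.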
Storage triacylglycerols in castor bean seeds are enriched in the hydroxylated fatty acid ricinoleate. Extensive tissue-specific RNA-Seq transcriptome and lipid analysis will help identify components important for its biosynthesis.
Storage triacylglycerols (TAGs) in the endosperm of developing castor (Ricinus communis) seeds are highly enriched in ricinoleic acid (18:1-OH). We have analysed neutral lipid fractions from other castor tissues using TLC, GLC and mass spectrometry. Cotyledons, like the endosperm, contain high levels of 18:1-OH in TAG. Pollen and male developing flowers accumulate TAG but do not contain 18:1-OH, and leaves do not contain TAG or 18:1-OH. Analysis of acyl-CoAs in developing endosperm shows that ricinoleoyl-CoA is not the dominant acyl-CoA, indicating that either metabolic channelling or enzyme substrate selectivity is important in the synthesis of tri-ricinolein in this tissue. RNA-Seq transcriptomic analysis, using Illumina sequencing by synthesis technology, has been performed on mRNA isolated from two stages of developing seeds, germinating seeds, leaf and pollen-producing male flowers in order to identify differences in lipid-metabolic pathways and enzyme isoforms which could be important in the biosynthesis of TAG enriched in 18:1-OH. This study gives comprehensive coverage of gene expression in a variety of different castor tissues. The potential role of differentially expressed genes is discussed against a background of proteins identified in the endoplasmic reticulum, which is the site of TAG biosynthesis, and transgenic studies aimed at increasing the ricinoleic acid content of TAG.
Several of the genes identified in this tissue-specific whole transcriptome study have been used in transgenic plant research aimed at increasing the level of ricinoleic acid in TAG. New candidate genes have been identified which might further improve the level of ricinoleic acid in transgenic crops.
The Arabidopsis Information Resource (TAIR, http://arabidopsis.org) is a genome database for Arabidopsis thaliana, an important reference organism for many fundamental aspects of biology as well as basic and applied plant biology research. TAIR serves as a central access point for Arabidopsis data, annotates gene function and expression patterns using controlled vocabulary terms, and maintains and updates the A. thaliana genome assembly and annotation. TAIR also provides researchers with an extensive set of visualization and analysis tools. Recent developments include several new genome releases (TAIR8, TAIR9 and TAIR10) in which the A. thaliana assembly was updated, pseudogenes and transposon genes were re-annotated, and new data from proteomics and next generation transcriptome sequencing were incorporated into gene models and splice variants. Other highlights include progress on functional annotation of the genome and the release of several new tools including Textpresso for Arabidopsis which provides the capability to carry out full text searches on a large body of research literature.
The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.
Chromosome 17 is unusual among the human chromosomes in many respects. It is the largest human autosome with orthology to only a single mouse chromosome [1], mapping entirely to the distal half of mouse chromosome 11. Chromosome 17 is rich in protein-coding genes, having the second highest gene density in the genome [2,3]. It is also enriched in segmental duplications, ranking third in density among the autosomes [4]. Here we report a finished sequence for human chromosome 17, as well as a structural comparison with the finished sequence for mouse chromosome 11, the first finished mouse chromosome. Comparison of the orthologous regions reveals striking differences. In contrast to the typical pattern seen in mammalian evolution [5,6], the human sequence has undergone extensive intrachromosomal rearrangement, whereas the mouse sequence has been remarkably stable. Moreover, although the human sequence has a high density of segmental duplication, the mouse sequence has a very low density. Notably, these segmental duplications correspond closely to the sites of structural rearrangement, demonstrating a link between duplication and rearrangement. Examination of the main classes of duplicated segments provides insight into the dynamics underlying expansion of chromosome-specific, low-copy repeats in the human genome.
The Arabidopsis Information Resource (TAIR, http://arabidopsis.org) is the model organism database for the fully sequenced and intensively studied model plant Arabidopsis thaliana. Data in TAIR is derived in large part from manual curation of the Arabidopsis research literature and direct submissions from the research community. New developments at TAIR include the addition of the GBrowse genome viewer to the TAIR site; a redesigned home page, navigation structure and portal pages to make the site more intuitive and easier to use; the launch of several TAIR web services; and a new genome annotation release (TAIR7) in April 2007. A combination of manual and computational methods was used to generate this release, which contains 27 029 protein-coding genes, 3889 pseudogenes or transposable elements and 1123 ncRNAs (32 041 genes in all, 37 019 gene models). A total of 681 new genes and 1002 new splice variants were added. Overall, 10 098 loci (one-third of all loci from the previous TAIR6 release) were updated for the TAIR7 release.
The GENCODE consortium was formed to identify and map all protein-coding genes within the ENCODE regions. This was achieved by a combination of initial manual annotation by the HAVANA team, experimental validation by the GENCODE consortium and a refinement of the annotation based on these experimental results.
The GENCODE gene features are divided into eight different categories of which only the first two (known and novel coding sequence) are confidently predicted to be protein-coding genes. 5' rapid amplification of cDNA ends (RACE) and RT-PCR were used to experimentally verify the initial annotation. Of the 420 coding loci tested, 229 RACE products have been sequenced. They supported 5' extensions of 30 loci and new splice variants in 50 loci. In addition, 46 loci without evidence for a coding sequence were validated, consisting of 31 novel and 15 putative transcripts. We assessed the comprehensiveness of the GENCODE annotation by attempting to validate all the predicted exon boundaries outside the GENCODE annotation. Out of 1,215 tested in a subset of the ENCODE regions, 14 novel exon pairs were validated, only two of them in intergenic regions.
In total, 487 loci, of which 434 are coding, have been annotated as part of the GENCODE reference set available from the UCSC browser. Comparison of the GENCODE annotation with RefSeq and ENSEMBL shows that only 40% of GENCODE exons are contained within the two sets, which is a reflection of the high number of alternative splice forms with unique exons annotated. Over 50% of coding loci have been experimentally verified by 5' RACE for EGASP, and the GENCODE collaboration is continuing to refine its annotation of 1% of the human genome with the aid of experimental validation.
The bacterium Xanthomonas campestris is an economically important pathogen of many crop species and a model for the study of bacterial phytopathogenesis. In X. campestris, a regulatory system mediated by the signal molecule DSF controls virulence to plants. The synthesis and recognition of the DSF signal depend upon different Rpf proteins. DSF signal generation requires RpfF, whereas signal perception and transduction depend upon a system comprising the sensor RpfC and regulator RpfG. Here we have addressed the action and role of Rpf/DSF signalling in phytopathogenesis by high-resolution transcriptional analysis coupled to functional genomics. We detected transcripts for many genes that were unidentified by previous computational analysis of the genome sequence. Novel transcribed regions included intergenic transcripts predicted as coding or non-coding as well as those that were antisense to coding sequences. In total, mutation of rpfF, rpfG and rpfC led to alteration in transcript levels (more than fourfold) of approximately 480 genes. The regulatory influence of RpfF and RpfC demonstrated considerable overlap. Contrary to expectation, the regulatory influence of RpfC and RpfG had limited overlap, indicating complexities of the Rpf signalling system. Importantly, functional analysis revealed over 160 new virulence factors within the group of Rpf-regulated genes.