Advances in sequencing technologies have driven an explosion in our knowledge of the non-coding genetic repertoire of bacterial species. This study presents the first example of a global approach to both sRNA identification and pathogenesis profiling, combining RNA-seq and Tn-seq. The RNA-seq approach identified 89 putative pneumococcal sRNAs, capturing both sRNAs previously detected by sequencing and tiling arrays and many additional, previously unknown sRNAs. RNA-seq has certain advantages for the identification of sRNAs. Mean sequence coverage was over 100-fold on both the forward and reverse strands, and each sRNA was supported by at least 10x coverage, allowing for high confidence in the data. It should be noted that low-abundance sRNAs identified in other studies from a single read will likely be missed by our analysis. Unlike tiling arrays, RNA-seq identifies the origin of transcription. This permits the precise mapping of sRNAs that contain highly repetitive regions, such as the more than 100 BOX elements found in intergenic regions of the pneumococcal genome. BOX elements are short AT-rich repeats that are highly transcribed and were also detected in sRNA searches using tiling arrays, though their precise locations could not be mapped. Eighteen BOX element-containing sRNAs were mapped, a finding of particular importance because the Tn-seq analysis implicated a subset of four BOX-element sRNAs in pathogenesis. Although BOX elements have traditionally been thought to be parasitic sequences mobilized by transposases, recent evidence supporting their placement in sRNAs indicates that they can form RNA structures with riboswitches. In addition, BOX elements can stimulate expression of downstream genes by increasing the half-lives of the mRNA.
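The 10x coverage criterion described above can be illustrated with a minimal sketch. This is not the pipeline used in this study: it assumes that per-base read depth for one strand is available as a simple list and that a candidate is retained when its mean depth reaches the threshold. The function names, the choice of the mean as the summary statistic, and the toy data are all hypothetical.

```python
from typing import List, Tuple

# Threshold taken from the text; summarizing depth by the mean is an assumption.
MIN_COVERAGE = 10

Candidate = Tuple[str, int, int]  # (name, start, end), half-open interval


def mean_depth(depth: List[int], start: int, end: int) -> float:
    """Return the mean per-base read depth over [start, end)."""
    window = depth[start:end]
    if not window:
        raise ValueError("interval is empty or outside the depth array")
    return sum(window) / len(window)


def filter_candidates(
    depth: List[int],
    candidates: List[Candidate],
    min_cov: float = MIN_COVERAGE,
) -> List[Candidate]:
    """Keep candidate sRNAs whose mean depth is at least min_cov."""
    return [c for c in candidates if mean_depth(depth, c[1], c[2]) >= min_cov]


if __name__ == "__main__":
    # Toy depth profile: a well-covered region flanked by low coverage.
    toy_depth = [0] * 5 + [12] * 10 + [3] * 5
    toy_candidates = [("cand_a", 5, 15), ("cand_b", 15, 20)]
    print(filter_candidates(toy_depth, toy_candidates))
```

A stricter variant would require every base in the interval to reach the threshold; which summary is appropriate depends on how the coverage criterion was applied in practice.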
Another important aspect of this study was the identification of five novel shared sRNA sequence motifs that were conserved at multiple locations in the pneumococcal genome. Upon closer examination of the sequence read depth in the areas surrounding these motifs, we identified 17 predicted sRNAs with increased signal relative to the surrounding region. All 17 were subsequently validated by expression analysis, underscoring the robustness of the predictions. While members of the T-box, Pyr, TPP, and tmRNA sRNA families described in other bacteria were also found in pneumococcus, a majority of the predicted pneumococcal sRNAs could not be assigned to a functional family. These data indicate that the pneumococcus is a rich source of new motifs that can expand sRNA prediction algorithms in Gram-positive bacteria.
Although numerous sRNAs have been identified in the pneumococcus, none have been implicated in pathogenesis and, more broadly, there have been no attempts to apply transposon-mediated mutagenesis to determine the role of sRNAs in bacterial virulence in specific host tissues. This study represents the first use of transposon-mediated mutagenesis to address the global role of sRNAs in discrete host tissues during disease. Using a comprehensive list of sRNAs identified in this study together with those found by others, we identified a number of sRNAs that played distinct roles in pathogenesis in the nasopharynx, the lung, or the bloodstream. The lungs provided the most comprehensive analysis of the contribution of sRNAs to virulence, since bottleneck constraints in the nasopharynx and the blood, imposed by a limitation of bacterial binding sites and clearance by the spleen, respectively, may have impaired detection in these sites. A number of sRNAs had no inserts in the Tn-seq deletion library (n.i. in Table S4 in Text S1), and it is tempting to speculate that there is a selective pressure against the loss of these sRNAs; however, this observation could be due to chance given their small size. All three body sites had a distinct list of sRNA candidates that were involved in pathogenesis. The Tn-seq analysis proved to be robust, as mutants predicted to be attenuated in their respective host niches were confirmed in in vivo competition experiments pitting each sRNA mutant individually against wild type. Thus, the multi-organ Tn-seq approach captured this diversity, as exemplified by R12, which did not have a significant virulence defect in overall survival in our initial studies but was attenuated both during colonization of the nasopharynx and in the blood following intraperitoneal infection. The Tn-seq analysis also provides insight into the organ-specific defects of the sRNA mutants found to have reduced virulence. Both the ΔF41 and ΔF25 strains had greatly reduced fitness in the blood, in agreement with their inability to progress to sepsis. The ΔF7 and ΔF32/tmRNA strains were both defective in lung infection, indicating that this might be the most crucial site for clearance of these mutants. This comprehensive analysis of the contribution of all the identified sRNAs to pneumococcal pathogenesis in discrete host sites can provide a framework for future investigations elucidating the precise functions of these sRNAs. These data add to the growing understanding of the contribution of sRNAs to the virulence of bacterial pathogens.
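For readers unfamiliar with competition experiments, the conventional competitive index (CI) compares the mutant-to-wild-type ratio recovered from the host with the same ratio in the inoculum. The sketch below shows that conventional calculation only; whether the competition experiments in this study used exactly this formula is an assumption, and the function names and example counts are hypothetical.

```python
import math


def competitive_index(
    mutant_in: float, wt_in: float, mutant_out: float, wt_out: float
) -> float:
    """Conventional competitive index.

    CI = (mutant_out / wt_out) / (mutant_in / wt_in)

    Inputs are bacterial counts (e.g. CFU) in the inoculum ("in") and
    recovered from the host tissue ("out"). CI < 1 indicates that the
    mutant is less fit than wild type in that tissue.
    """
    if min(mutant_in, wt_in, mutant_out, wt_out) <= 0:
        raise ValueError("all counts must be positive")
    return (mutant_out / wt_out) / (mutant_in / wt_in)


def log10_ci(
    mutant_in: float, wt_in: float, mutant_out: float, wt_out: float
) -> float:
    """log10 of the competitive index; negative values indicate attenuation."""
    return math.log10(competitive_index(mutant_in, wt_in, mutant_out, wt_out))


if __name__ == "__main__":
    # Hypothetical counts: a 1:1 inoculum and a 100-fold mutant deficit on recovery.
    print(competitive_index(1e5, 1e5, 1e3, 1e5))
    print(log10_ci(1e5, 1e5, 1e3, 1e5))
```

Normalizing to the input ratio means that a slightly uneven inoculum does not by itself produce an apparent fitness defect.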
The sRNA mutants displaying defects in virulence exhibited a number of characteristics that could potentially explain an inability to cause disease. Several of the attenuated sRNA mutants had defects in adhesion and invasion of nasopharyngeal or endothelial cells, capabilities important to the progression of invasive disease. ΔF20 and ΔF32/tmRNA showed decreased adhesion/invasion of nasopharyngeal or endothelial cells, respectively, in concert with Tn-seq and competitive index data indicating lack of fitness in the nasopharynx and lung. F32 encodes a tmRNA, and tmRNAs have been implicated in the pathogenesis of other bacteria. The central role of tmRNA in rescuing ribosomes stalled on mRNA, as well as in targeting defective mRNA for degradation, is consistent with the strong defect in pathogenesis observed in the ΔF32 strain. In the case of the ΔF20 mutant, proteomic analysis indicated that proteins responsible for purine metabolism were strongly downregulated, whereas DNA synthesis and repair pathways were greatly increased. Thus, deletion of F20 had pleiotropic effects on DNA metabolism that could explain attenuation of the mutant. Taken together, these data provide compelling evidence that sRNAs play important roles in virulence, that their effects can arise at several levels of control of virulence gene/protein expression, and that these roles can be restricted to specific host tissues.
Our study expanded the search for sRNAs and their role in gene regulation to three mutants in TCSs. Control over gene networks by TCSs is typically mediated by a direct interaction of the response regulator with a target sequence shared by many genes dispersed over a genome. However, TCSs have also been found to control the expression of sRNAs in pneumococcus and other bacteria. For example, control of porin expression in E. coli involves multiple sRNAs that exert posttranscriptional control over the targets of TCSs. An sRNA functioning as an intermediary, fine-tuning the control exerted by a TCS and expanding its regulatory scope, would provide another layer of control for more precise regulation. Our observation that the abundance of sRNAs was altered when each of the three TCSs was disrupted is consistent with TCSs acting through sRNAs to broadly control gene expression. This is further supported by the observed alterations of the global transcriptome as well as the abundance of multiple protein targets upon deletion of an individual sRNA (Tables S5 and S6 in Text S1). These data suggest that the impact of sRNAs on multiple aspects of pneumococcal biology and pathogenesis could potentially be exerted by an additional layer of posttranscriptional control over the gene networks controlled by TCSs.
The widespread utilization of RNA-mediated regulation of diverse processes has a number of potential advantages for bacteria. Protein regulators incur greater metabolic costs to the cell, being encoded by larger segments of the genome and requiring translation. In contrast, sRNAs do not require translation and occupy a very limited amount of the genome. The additional layer of regulation conferred by sRNAs may also allow for more precise control of gene expression, as evidenced by the fact that sRNAs can have multiple targets and that multiple sRNAs can regulate a single target under different conditions. Additionally, sRNAs can have dramatically different half-lives in the cell, ranging from under 2 minutes to greater than 30 minutes. Such differences in stability could potentially determine the duration of control exerted by sRNAs. The challenging task that remains following the identification and characterization of sRNAs in pathogenesis is assigning discrete functional roles to these molecules. We have shown the feasibility of applying Tn-seq to identify changes in bacterial fitness in various host tissues in response to deletion of individual sRNAs. The feasibility of this approach for investigating the gene networks and functional roles of sRNAs suggests that the combination of RNA-seq and Tn-seq will be a unique and powerful tool for future investigations of the precise functional roles of these sRNAs in the pneumococcus.