Genome structural annotation, i.e., the identification and demarcation of the boundaries of all functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems-level analysis. Current genome annotation programs do not identify all of the functional elements of a genome, especially small non-coding RNAs (sRNAs). Whole-genome transcriptome analysis is a complementary method for identifying “novel” genes, small RNAs, regulatory regions, and operon structures, and thus improves structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and their functional importance in gene regulation, stress response, and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single-nucleotide-resolution transcriptome map of H. somni strain 2336 obtained using RNA-Seq.
The RNA-Seq-based transcriptome map identified 94 sRNAs in the H. somni genome, of which 82 had not been predicted or reported in earlier studies. We also identified 38 novel potential protein-coding open reading frames that were absent from the current genome annotation. The transcriptome map allowed the identification of 278 operon structures (730 genes in total) in the genome. When compared with the genome sequence of the non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic regions unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations.
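The word "disproportionate" above can be made concrete with a short calculation. The sketch below is only an illustration and is not the analysis performed in this study: it assumes a simple null model in which sRNAs are distributed in proportion to genome length, and it reconstructs the sRNA count in the strain-specific regions from the rounded percentages reported above (30% of 94, about 28). The function name `binomial_upper_tail` and all variable names are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Values taken from the text; both fractions are reported only approximately.
n_srnas = 94               # sRNAs identified in strain 2336
frac_srnas_unique = 0.30   # ~30% of sRNAs lie in strain-2336-specific regions
frac_genome_unique = 0.18  # ~18% of the genome is unique to strain 2336

# ASSUMPTION: the count is reconstructed from a rounded percentage.
k_unique = round(frac_srnas_unique * n_srnas)

# Fold enrichment relative to a length-proportional expectation.
fold = frac_srnas_unique / frac_genome_unique

# ASSUMPTION: one-sided binomial test under a uniform-placement null model.
p_value = binomial_upper_tail(k_unique, n_srnas, frac_genome_unique)

print(f"sRNAs in unique regions (approx.): {k_unique}")
print(f"fold enrichment: {fold:.2f}")
print(f"one-sided binomial tail probability: {p_value:.4g}")
```

Under these assumptions the enrichment is roughly 1.7-fold (0.30 / 0.18). The tail probability depends on the reconstructed count and on the null model, so it should be read as indicative only.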