To our knowledge, we have presented here the first analysis showing i) a representation of whole-genome transcriptional data in
Buchnera and ii) the links between mRNA abundances and chromosomal organization in a highly reduced bacterial genome, on the basis of experimental data. The large number of prokaryotic genome sequences available in databases has made it possible to study these links in many bacterial chromosomes. Using codon adaptation index (CAI) values computed from ribosomal proteins, Rocha and Danchin have shown that, in
B. subtilis and in
E. coli, the importance of the expressiveness in determining the localization of the genes on the leading strand is negligible, or even absent, when essentiality is taken into account [
36]. They have confirmed these results for other sequenced genomes, with the exception of some non-free-living bacteria among which
Buchnera was one of the most important [
4]. However, the authors underlined the difficulty of obtaining correct CAI values in the genomes of intracellular bacteria, which generally do not show sufficient codon usage bias. Moreover, the exceptions they found for
Buchnera may also be attributable to the authors' assignment of gene essentiality, which was based exclusively on homology with
E. coli. Indeed, while Rocha
et al. [
4] found no strand distribution bias for essential genes in
Buchnera, we found a bias of 60% using the minimal gene set proposed by Gil
et al. [
25]. More recently, Price
et al. [
37] have revisited the question of gene-strand bias in bacteria, using gene expression microarray data, and they have shown that, in
B. subtilis and
E. coli, the genes in operons on the leading strand DNA are more highly expressed than genes in operons on the lagging strand. This observation was true for both essential and non-essential genes.
In Buchnera, we found that highly expressed genes are generally the essential genes within pTU. Independently of essentiality, the genes within pTU are more highly expressed than singletons. We also found that, for genes within pTU, the essential genes are more highly expressed than the non-essential ones, whereas this effect was not observed for singletons. These results underline the conservation of a coherent relationship between mRNA abundances and essential gene functions in the reduced genome of Buchnera.
In this work, we also analysed the relationship between mRNA abundances and the GC ratio of genes (taken as an indicator of gene evolution rate) in
Buchnera. By combining transcription analysis, evolution rates and comparative genomics, we were able to define new candidates for the essential gene set of
Buchnera. In bacteria, genes encoding ribosomal, cell division and chaperone/protease proteins are considered essential and are also known to be constitutively highly expressed. As we expected, our data showed that almost all genes encoding ribosomal proteins, and genes encoding chaperonins, are relatively well conserved and also highly expressed in
Buchnera. Interestingly, among the most highly conserved and expressed genes, we found genes involved in the biosynthesis of EAAs and, in particular, all the genes of the isoleucine and valine pathway. Also remarkable is that the genes encoding the aminoacyl-tRNA synthetases for 8 out of the 10 aphid EAAs are weakly expressed, whereas genes encoding aminoacyl-tRNA synthetases for non-EAAs are either moderately or highly expressed. A possible explanation for this observation is that
Buchnera, which is a relatively slow-growing bacterium, does not require high rates of protein production and constitutively synthesizes EAAs in order to supply them to the aphid. Reducing the abundance of specific aminoacyl-tRNA synthetases might increase the concentration of free EAAs in
Buchnera cells, facilitating the transport of these amino acids to the aphid host cell. This observation is reminiscent of a similar putative adaptive response of
Buchnera, which selectively underexpresses
pheT under a shortage of aromatic EAAs [
26].
In our study, we also found that the orphan genes
yba3 and
yba4 seem to evolve rapidly and are highly expressed,
yba4 being one of the ten most highly expressed genes in
Buchnera. The conservation of these genes in BAp and in the
Buchnera harboured by the aphid
Schizaphis graminum (BSg), coupled with their high expression level, suggests that they could be of particular relevance in the symbiosis of
Buchnera with the aphids
S. graminum and
A. pisum, as has also been proposed by Shimomura
et al. [
38] for
A. pisum.
Among the rapidly evolving genes (this work, Tamas
et al. [
19] and Reymond
et al. [
26]) that are also highly expressed in
Buchnera, we found most of the flagellar genes. These data, taken together with recent experimental evidence that the
Buchnera incomplete flagellar apparatus can function as a "protein transporter" [
39], support the idea that flagellar genes are taking on new important functions in the symbiotic context. The flagellar gene set of BAp is composed of 26 genes, of which the
fliEFGHIJKMNPQR and
flhAB genes are located on the leading strand (except
fliE), and
flgNABCDEFGHIJK genes are located on the lagging strand DNA (except
flgN and
flgA). These flagellar genes are also conserved in BSg. However,
Buchnera from a distant aphid lineage,
Baizongia pistaciae (BBp), has lost five flagellar genes (
flgA,
flgD,
flgE,
flgK,
flgN) but none of the
fli ones. Finally,
Buchnera with the most dramatically reduced genome, from the aphid
Cinara cedri (BCc), has lost the same genes as BBp but also four other
flg genes (
flgB,
flgC,
flgG,
flgJ) and four
fli genes (
fliE,
fliJ,
fliK,
fliM), hence preserving only a minimal type III virulence secretion system [
21]. The evolutionary selection of the majority of the
fli genes also suggests the possible importance of the new putative transport function of these genes in
Buchnera. It is worth noting that the most conserved
fli genes among the
Buchnera lineages, probably involved in the new function, are located on the leading strand, whereas most of the gene losses occurred on the lagging strand. Previous comparative genomics analyses [
40,
41] had attempted to dissect the evolutionary forces driving the genome organization in several
Buchnera lineages (i.e., gene strand bias, gene and protein composition, gene expressivity, gene evolution rate and gene loss). Our results are partly consistent with these previous analyses, as we found that (1) highly conserved genes are highly expressed, (2) essential genes in pTU are highly expressed and probably preserved from mutations by purifying selection, and (3) positive selection may shape new "symbiotic" functions for some highly expressed and rapidly evolving genes. However, we reject the earlier idea, based on CAI analyses, that expressiveness is a factor driving gene strand bias in
Buchnera.
Finally, an important result of this study was the discovery of spatial patterns of transcriptional activity in the chromosome of Buchnera: i.e., the transcription of the genes along the chromosome is determined according to spatial constraints. From autocorrelation and spectral analysis, four groups of spatial patterns can be defined: (i) autocorrelated short-range (between 2 and 8 genes), (ii) periodic short-range (up to 17 genes), (iii) periodic medium-range (between 23 and 61 genes) and (iv) periodic long-range (over 87 genes) structural components.
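As an illustration of the two tools named above, the following minimal Python sketch computes a sample autocorrelation function and a periodogram for a series of expression values ordered by gene position. It is not the procedure used in this study: the data are synthetic, the function names (`autocorrelation`, `periodogram`) and the 20-gene period are hypothetical choices made for the example, and the chromosome is treated as a linear rather than a circular sequence.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag (lag = gene-to-gene distance)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

def periodogram(x):
    """Return (periods in genes, spectral power) from the discrete Fourier transform of x."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    power = np.abs(np.fft.rfft(x)) ** 2 / n
    freqs = np.fft.rfftfreq(n, d=1.0)
    # Skip the zero frequency, whose period is infinite.
    return 1.0 / freqs[1:], power[1:]

# Synthetic data (assumption): 600 genes with a 20-gene periodic signal plus noise.
rng = np.random.default_rng(0)
n_genes = 600
positions = np.arange(n_genes)
expression = np.sin(2 * np.pi * positions / 20) + 0.3 * rng.standard_normal(n_genes)

acf = autocorrelation(expression, max_lag=40)
periods, power = periodogram(expression)
dominant_period = periods[np.argmax(power)]
print(f"dominant period: {dominant_period:.1f} genes")
```

In this toy series the autocorrelation is positive at lags that are multiples of the period and negative at half-period lags, and the periodogram peaks at the imposed period. With real expression data, peaks would need to be assessed against a suitable null model before being interpreted.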
Autocorrelated short-range patterns, determined by the autocorrelation function of gene transcription levels, showed that genes separated by a gene-to-gene distance of less than 8 have highly correlated expression levels. As has been suggested for
E. coli and
B. subtilis, we propose for
Buchnera that these correlations reflect the co-ordinated transcription of genes within operons [
14,
15]. This observation reinforces the result mentioned above concerning the conservation of functional transcription units in
Buchnera. However, by permuting gene position along the chromosome, we observed that the organization of genes into putative operons is not sufficient to fully explain the observed periodic spatial patterns of transcription. Indeed, if the presence of spatial periodic components in
Buchnera gene expression were only due to the conservation of operon structures, the modification of the order and/or the number of the singletons located between the pTU should not affect the periodicities and the autocorrelation values. We have shown, however, that these modifications reduced the spatial patterns, highlighting the importance of a higher-order arrangement of all the genes along the chromosome of the endosymbiont.
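The logic of testing a spatial pattern by permuting gene positions can be sketched as follows. This example uses a full random shuffle of gene order on synthetic data, which is a simpler scheme than the targeted permutations of singletons and pTU described here. The function names (`lag_autocorr`, `permutation_null`), the block size of 6 genes and the number of permutations are assumptions made for illustration only.

```python
import numpy as np

def lag_autocorr(x, lag=1):
    """Sample autocorrelation of x at a single lag (lag must be >= 1)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

def permutation_null(x, lag=1, n_perm=1000, seed=0):
    """Autocorrelation values obtained after randomly shuffling gene order."""
    rng = np.random.default_rng(seed)
    return np.array([lag_autocorr(rng.permutation(x), lag)
                     for _ in range(n_perm)])

# Synthetic data (assumption): 100 blocks of 6 neighbouring genes that share
# a common expression level, mimicking co-expressed neighbours.
rng = np.random.default_rng(1)
block_means = rng.standard_normal(100)
expression = np.repeat(block_means, 6) + 0.3 * rng.standard_normal(600)

observed = lag_autocorr(expression, lag=1)
null = permutation_null(expression, lag=1, n_perm=500)
# One-sided empirical p-value with the usual +1 correction.
p_value = (np.sum(null >= observed) + 1) / (len(null) + 1)
print(f"observed = {observed:.2f}, empirical p = {p_value:.3f}")
```

If the observed autocorrelation lies far outside the values obtained after shuffling, the pattern depends on gene order rather than on the distribution of expression values alone. Restricting the shuffle to a subset of genes, such as singletons, asks the narrower question of whether that subset contributes to the pattern.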
Jeong
et al. [
14] have classified the transcriptional periods that they have found in
E. coli into three categories: short-range (up to 16 genes); medium-range (100–125 genes); and long-range (600–800 genes). The existence of short periods, up to 17 genes in
Buchnera, allows us to corroborate the hypothesis, proposed in a previous study on
B. subtilis and on
E. coli, that this short-range element could be a property of the structural nucleoid common to other bacteria, corresponding to large DNA spirals on the nucleoid surface [
15]. The medium- and long-range periods are shorter in
Buchnera than those identified for free-living bacteria, which is probably due to the greatly reduced size of its genome. However, these two kinds of periods are not yet understood and do not correspond to the domains identified so far in the nucleoid [
15]. The effect of the second and third simulated gene permutations (Perm. 2 and Perm. 3 on Figure ) on the medium- and long-range periods, respectively, could be explained by the importance of the spatial location of the operons along the chromosome and by the neighbourhood of singletons that form "supra-operonic" structures in
Buchnera. Moreover, the decrease of the maximum size of the autocorrelated groups of genes (Table ), for the different permutations of gene positions, corroborates the hypothesis of "supra-operonic" structures in
Buchnera. However, this hypothesis requires further study and experimental confirmation. Finally, the observation of transcriptional periodic patterns, coupled with the conservation in its genome of some nucleoid-associated proteins (NAPs) such as H-NS, IHF and Fis, suggests that
Buchnera has maintained a nucleoid structure responsible for the differences in gene transcription levels in basal conditions. These three NAPs were previously found to be differentially expressed in
Buchnera [
26] facing nutritional constraints. However, their role in transcriptional regulation in the aphid endosymbiont remains speculative.