Prokaryote genomes are considered compact, with only a small fraction of their genomic DNA assigned to intergenic regions [1]. The percentage of these non-coding regions varies across prokaryote species and does not depend on genome size or gene content, even though the latter two variables correlate strongly [2]. The spacers between pairs of adjacent genes have been divided into three types according to the transcriptional direction of the flanking genes: i) unidirectional, ii) convergent and iii) divergent [1]. Here we use the term co-directional rather than unidirectional. These three types of spacers differ in the regulatory signals they contain. In prokaryotes, most co-directional genes are organized in operons [3,4]. The spacers between these genes may contain translational signals such as the Shine-Dalgarno (SD) sequence. The intergenic spacers between convergent genes may contain terminators for both genes, while the divergent ones have only promoters and other upstream transcriptional signals. The different types of intergenic regions in prokaryotes, including the convergent and divergent ones (all of which are inter-operonic) and the co-directional ones (which are mainly intra-operonic), evolve under the same evolutionary pressures. Selection pressure [1] and deletional bias [2] have been proposed as the main forces responsible for minimizing the amount of non-functional DNA in prokaryote genomes. Deletional bias is the mechanism that shapes prokaryote genomes, while selection pressure may establish an equilibrium with deletional bias in order to maintain the minimally required amount of non-coding DNA. This minimal amount of non-coding sequence is needed to accommodate essential regulatory signals [1] and DNA replication sequences [5,6]. Consistent with this compactness, prokaryote genomes have intergenic distances that are much shorter than gene lengths and shorter than those in eukaryote genomes [7]. Eukaryote genomes have a much wider range of genome sizes, contain protein-coding genes that are typically interrupted by introns, and have longer intergenic regions.
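The three spacer types described above can be told apart from strand annotations alone. The following is a minimal sketch, not the method used in this study: the `Gene` record, its field names and the 1-based inclusive coordinate convention are assumptions made for illustration only.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    # Hypothetical annotation record; field names are illustrative assumptions.
    name: str
    start: int   # leftmost coordinate on the reference strand (1-based, inclusive)
    end: int     # rightmost coordinate on the reference strand (1-based, inclusive)
    strand: str  # "+" or "-"


def classify_spacer(left: Gene, right: Gene) -> str:
    """Classify the spacer between two adjacent genes.

    `left` is the gene at lower coordinates and `right` the one at higher coordinates.
    """
    if left.strand == right.strand:
        return "co-directional"   # -> ->  or  <- <-
    if left.strand == "+":
        return "convergent"       # -> <-  (the 3' ends face each other)
    return "divergent"            # <- ->  (the 5' ends face each other)


def intergenic_distance(left: Gene, right: Gene) -> int:
    """Number of bases between the two genes; negative when the genes overlap."""
    return right.start - left.end - 1


if __name__ == "__main__":
    a = Gene("geneA", 1, 100, "+")
    b = Gene("geneB", 120, 300, "-")
    print(classify_spacer(a, b), intergenic_distance(a, b))
```

Allowing the distance to be negative keeps overlapping gene pairs in the same framework as separated ones, which matters for compact genomes where the end of one gene can run into the next.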
One of the regulatory sequences affected by the short distances between prokaryote genes is the SD sequence. In 1974, Shine and Dalgarno found a sequence (5'-GGAGGU-3') located 5' of the initiation codons in several messenger RNAs (mRNAs) of Escherichia coli that was complementary to the 3'-CCUCCA-5' sequence at the 3'-terminal tail of the 16S ribosomal RNA (rRNA) [8]. It has been suggested that a strong SD sequence, though not mandatory for translation initiation, may compensate for a weak start codon and counteract mRNA secondary structures that hinder access to the start codon [9,10]. Although genes with a SD sequence are widely found in prokaryote genomes, previous studies have also shown that there is a significant and previously underestimated population of genes without a SD sequence [11-14]. Moreover, the exponential increase in the number of fully sequenced genomes has provided thousands of examples of leaderless genes or genes without a SD sequence in prokaryote genomes [15]. It has been suggested that leaderless genes could use an independent pathway for translation, while leadered genes without a SD sequence must use alternative, as yet unknown, mechanisms for translation initiation [16,17].
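The complementarity between the SD sequence and the 16S rRNA tail quoted above can be verified base by base. The short sketch below is only an illustration of Watson-Crick pairing; the function names are hypothetical, and G-U wobble pairs are deliberately ignored.

```python
# Watson-Crick partners for RNA bases (wobble pairing is not modelled here).
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def complement_3to5(rna_5to3: str) -> str:
    """Return the pairing partner of an RNA sequence, written 3'->5'.

    Position i of the result pairs with position i of the input.
    """
    return rna_5to3.upper().translate(RNA_COMPLEMENT)


def reverse_complement(rna_5to3: str) -> str:
    """Return the pairing partner written in the conventional 5'->3' direction."""
    return complement_3to5(rna_5to3)[::-1]


if __name__ == "__main__":
    sd = "GGAGGU"                  # SD sequence, 5'->3'
    print(complement_3to5(sd))     # partner read 3'->5'
    print(reverse_complement(sd))  # same partner read 5'->3'
```

For the sequence 5'-GGAGGU-3', `complement_3to5` gives CCUCCA, matching the 3'-CCUCCA-5' rRNA sequence given in the text.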
Among the genes that have a SD motif, the ribosome does not need a perfect distance between the SD sequence and the start codon for the initiation of translation. However, when the SD sequence is located within four nucleotides of the start codon, or as far as 13 nucleotides from it, gene expression decreases dramatically [18-20]. Therefore, there are apparently structural constraints that require an optimal spacing between the SD motif and the start codon. The SD motif has mainly been found 7 to 12 nucleotides upstream of the start codon [12,21,22]. Taking this into account, intergenic distances are an important feature of prokaryote genomes that may correlate with the presence of SD [12]. Many genes are sufficiently close together that the end of one gene may overlap the SD sequence or the coding sequence of the next gene. Eyre-Walker and Bulmer showed that there is a change in composition at the end of genes, which is consistent with selection against the formation of mRNA secondary structures around the SD sequence [23]. Eyre-Walker also showed that the strength and location of the SD sequence are not significantly affected by the close proximity of prokaryote genes [24]. It seems, therefore, that spacing lengths and stop codon usage adapt themselves to the presence of SD. Recently, in the fusellovirus SSV4, which has a compactly organized genome, a preference for the TGA stop codon has been found in genes whose stop codon overlaps the SD sequence of the next gene, forming the pattern GGTGA as a SD motif [25]. In prokaryotes, it seems that some intergenic distances are less favored because of the presence of the SD sequence. A particular stop codon usage may therefore be required to form the SD motifs, as has been described in viruses. In this paper we assess how the presence of the SD sequence affects the spacing lengths between adjacent genes and the stop codon usage.
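To make the notion of SD spacing concrete, the sketch below searches the region upstream of a start codon for an SD-like core and reports the gap between them. It is an illustration under stated assumptions, not the procedure of this paper: the exact-match core GGAGG (DNA alphabet), the 20-nucleotide search window, and the definition of spacing as the number of nucleotides between the last base of the motif and the first base of the start codon are all choices made for this example, and spacing conventions differ between studies.

```python
from typing import Optional


def find_sd_spacing(
    seq: str,
    start_idx: int,
    motif: str = "GGAGG",    # assumed SD-like core; real SD motifs are degenerate
    max_upstream: int = 20,  # assumed search window, in nucleotides
) -> Optional[int]:
    """Return the spacing between the nearest upstream motif and the start codon.

    `seq` is the coding-strand DNA and `start_idx` is the 0-based index of the
    first base of the start codon. Returns None when no exact match is found.
    """
    window_start = max(0, start_idx - max_upstream)
    upstream = seq[window_start:start_idx].upper()
    pos = upstream.rfind(motif)  # rightmost match, i.e. closest to the start codon
    if pos == -1:
        return None
    motif_end = window_start + pos + len(motif)  # index just past the motif
    return start_idx - motif_end


if __name__ == "__main__":
    # Toy sequence: 3 nt filler, the motif, a 7 nt spacer, then ATG.
    seq = "TTT" + "GGAGG" + "AAAAAAA" + "ATG" + "AAATAA"
    print(find_sd_spacing(seq, seq.index("ATG")))
```

An exact string match keeps the example short; a realistic scan would instead score partial complementarity to the 16S rRNA tail, since many functional SD sequences deviate from the consensus.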