Prokaryotes influence our daily lives in countless ways. For example, they mediate the chemical cycles that convert key elements of life into biologically accessible forms; they make certain nutrients, metals and vitamins available to their biological hosts; and they can be used to break down hydrocarbons and treat crude oil spills. Conversely, the environment has left its footprints on the morphological, physiological and biochemical diversity of prokaryotes over the course of evolution. The close interactions between prokaryotes and the environment, driven especially by horizontal gene transfer and homologous recombination, have made prokaryotes the most genetically diverse superkingdoms of life [1].
Temperature is one of the factors that characterize the ecological context of a prokaryotic organism. The National Center for Biotechnology Information (NCBI) Microbial Genome Project Database (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi) uses five terms to categorize the temperature range at which an organism grows: cryophilic (-30° to -2°C), psychrophilic (-1° to +10°C), mesophilic (+11° to +45°C), thermophilic (+46° to +75°C) and hyperthermophilic (above +75°C). An organism whose growth range overlaps more than one category is assigned the category with the largest overlap. Temperature has been reported to influence the ecological, physiological and genomic properties of prokaryotic organisms in multiple ways. For instance, at the population level, temperature has been shown to cause compositional and functional shifts in microbial communities [2]; at the cellular level, it has been shown to have a significant effect on a variety of growth parameters (e.g., optical density, viable cell numbers and cell dry mass) [3], the structure and ion permeability of cell membranes [4], affinity for substrates (e.g., glycerol and nitrate) [5], circadian rhythms [7] and virulence functions [8]; and, at the molecular level, it has been shown to be correlated with nucleotide content, codon usage and amino acid composition [9], the structure, function and stability of proteins [13], the topological properties of metabolic networks [15] and the expression of certain genes (e.g., heat- and cold-shock response genes) [16].
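The largest-overlap rule for the five temperature categories can be illustrated with a minimal sketch. The category bounds are those quoted above; everything else is an assumption made for illustration: the function names are hypothetical, the open-ended hyperthermophilic category is given an infinite upper bound, ranges are treated as continuous intervals, and ties are resolved in favor of the colder category. This is not the database's own implementation.

```python
import math

# Category bounds in degrees Celsius, as quoted in the text.
# Assumption: "above +75" is modelled as the interval (75, +inf).
CATEGORIES = [
    ("cryophilic", -30.0, -2.0),
    ("psychrophilic", -1.0, 10.0),
    ("mesophilic", 11.0, 45.0),
    ("thermophilic", 46.0, 75.0),
    ("hyperthermophilic", 75.0, math.inf),
]


def overlap_length(lo, hi, cat_lo, cat_hi):
    """Length of the intersection of [lo, hi] and [cat_lo, cat_hi]; 0 if disjoint."""
    return max(0.0, min(hi, cat_hi) - max(lo, cat_lo))


def classify_growth_range(lo, hi):
    """Return the category with the largest overlap with the growth range.

    Hypothetical helper: returns None when the range overlaps no category
    (for example, a range lying entirely between -2 and -1 degrees).
    Ties go to the first (coldest) category, which is an assumption.
    """
    if lo >= hi:
        raise ValueError("expected lo < hi")
    best_name, best_overlap = None, 0.0
    for name, cat_lo, cat_hi in CATEGORIES:
        current = overlap_length(lo, hi, cat_lo, cat_hi)
        if current > best_overlap:
            best_name, best_overlap = name, current
    return best_name


if __name__ == "__main__":
    # A range of 40-70 degrees overlaps mesophilic by 5 and thermophilic by 24.
    print(classify_growth_range(40, 70))
```

Under these assumptions, an organism growing between 40° and 70°C would be labeled thermophilic, because its range overlaps that category far more than the mesophilic one.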
Guanine-cytosine (GC) content is one of the genomic traits hypothesized to be correlated with the temperature condition of an organism. Because a GC pair is held together by three hydrogen bonds whereas an adenine-thymine (AT) pair is held together by two, organisms growing at higher temperatures have long been expected to have a higher proportion of GC than AT pairs. However, it remains unclear whether the whole-genome GC content of an organism is correlated with its temperature condition. An analysis of 368 bacterial species seemed to confirm a positive correlation between whole-genome GC content and the optimal growth temperature of prokaryotes [19]. Later analyses, however, indicated that the sample size, the presence of outliers and other factors that may affect whole-genome GC content (e.g., mutational bias, genome size, oxygen requirement, nitrogen utilization, habitat, salinity and alkalinity) could bias the association analysis and lead to questionable conclusions [20]. Indeed, organisms living at high temperatures have mechanisms other than increased GC content for maintaining the double-stranded structure of DNA, such as thermophile-specific enzymes (e.g., reverse gyrase) [23] or certain dinucleotides that may contribute to thermostability [24].
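As a concrete illustration of the quantities discussed above, the following sketch computes the GC content of a sequence and the number of Watson-Crick hydrogen bonds in the corresponding duplex (three per GC pair, two per AT pair). The function names are hypothetical, and excluding ambiguous symbols such as N from the denominator is an assumption, not a convention stated in the text.

```python
def count_pairs(sequence):
    """Return (gc, at): counts of G/C and A/T bases on one strand.

    Assumption: any other symbol (e.g., N) is ignored.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc, at


def gc_content(sequence):
    """Fraction of unambiguous bases that are G or C."""
    gc, at = count_pairs(sequence)
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return gc / (gc + at)


def hydrogen_bonds(sequence):
    """Hydrogen bonds in the duplex formed by the strand and its complement."""
    gc, at = count_pairs(sequence)
    return 3 * gc + 2 * at


if __name__ == "__main__":
    print(gc_content("ATGC"))      # 0.5
    print(hydrogen_bonds("ATGC"))  # 10
```

For two sequences of equal length, the one with the higher GC content has more hydrogen bonds, which is the intuition behind the expected link between GC content and growth temperature.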
It is worth noting that GC content can vary substantially within a single genome. For instance, the GC content of coding regions is often higher than that of the whole genome [25]. Moreover, a DNA fragment acquired through a recent horizontal gene transfer event tends to show a GC content pattern different from that of the native parts of the genome. Despite the lack of an obvious correlation between whole-genome GC content and optimal growth temperature, studies have shown that the GC content of certain genes (e.g., ribosomal and transfer RNA genes) is significantly correlated with optimal growth temperature [26]. Also, as shown in Fig. , the GC content of the non-coding region surrounding the gene menB (naphthoate synthase) can differ drastically between mesophilic and thermophilic/hyperthermophilic organisms. Inspired by these preliminary investigations, we focus here on the correlation between genic GC content and the temperature range conditions of prokaryotic organisms.
The distribution of the GC content level of the non-coding region surrounding the menB gene (K01661: naphthoate synthase) for mesophilic (dashed, red) and thermophilic/hyperthermophilic organisms (solid, blue).
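The within-genome variation of GC content described in this section can be made visible with a sliding-window profile. The sketch below is only an illustration on a toy sequence: the function names, window size and step are hypothetical choices, and it is not the procedure used to produce the figure.

```python
def gc_fraction(segment):
    """GC fraction of a segment, counting only A, C, G and T (an assumption)."""
    seg = segment.upper()
    gc = seg.count("G") + seg.count("C")
    total = gc + seg.count("A") + seg.count("T")
    return gc / total if total else float("nan")


def windowed_gc(sequence, window, step):
    """Return (start, GC fraction) for each full window along the sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(sequence) - window
    return [
        (start, gc_fraction(sequence[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: an AT-rich stretch followed by a GC-rich stretch.
    toy = "ATATATATGCGCGCGC"
    for start, value in windowed_gc(toy, window=8, step=4):
        print(start, value)
```

On the toy sequence the profile rises from 0.0 to 0.5 to 1.0 across the three windows, whereas the whole-sequence GC content is 0.5. A single genome-wide value can hide this kind of regional difference, such as the one between a gene and its surrounding non-coding region.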