Organic-solvent-tolerant bacteria are a remarkable group of extremophilic microorganisms that can thrive in the presence of high concentrations of organic solvents (5, 10). Because of their high adaptability to toxic solvents, they are considered to have great potential in industry, especially in bioremediation and biphasic catalysis (10–12). To make better use of these bacteria, much research has been performed to investigate the mechanisms of their organic-solvent tolerance (3, 7, 9). However, the tolerance mechanisms of many strains remain unclear, and several important questions are unanswered, for instance, how the signals (the presence of organic solvents) are sensed and transmitted (3, 7, 9). Genome sequencing would accelerate studies in this field (4). In addition, useful industrial enzymes, such as organic-solvent-tolerant lipases and proteases, could be identified more readily by analyzing the genome sequences (4, 13).
Pseudomonas putida Idaho is an organic-solvent-tolerant bacterium that can grow in the presence of more than 50% toluene, m-xylene, p-xylene, 1,2,4-trimethylbenzene, and 3-ethyltoluene (2). This strain has been successfully used in developing efficient catalysts for biotechnological applications (8, 12, 14). Strain Idaho increases its phospholipid content after exposure to xylene, whereas an organic-solvent-sensitive strain exhibited a decrease in total phospholipid content (2, 6). It has therefore been suggested that the unique tolerance of strain Idaho to solvents is due to its ability to synthesize membranes rapidly (2, 6). However, no genetic component responsible for solvent resistance in strain Idaho has been identified until now.
Here, we present, for the first time, the draft genome sequence of P. putida Idaho. Sequencing was performed by the Helmholtz Center for Infection Research in Germany using the Illumina GA system with a paired-end library. The reads were assembled with VELVET (15). The draft genome sequence of strain Idaho was annotated using the RAST annotation server (1). The G+C mole percent was calculated from the genome sequence.
The draft genome sequence comprises 6,363,067 bases in 839 large contigs (>500 bp in size), with a G+C content of 61.6% and 5,766 predicted coding sequences (CDSs). We predicted 192 CDSs responsible for the metabolism of aromatic compounds, which is consistent with the degradation ability of P. putida Idaho. About 40 CDSs involved in phospholipid synthesis were annotated; their catalytic ability and regulation should be carefully investigated to uncover the molecular mechanisms underlying the solvent tolerance of strain Idaho. In addition, 212 CDSs were annotated as related to stress response. It is reasonable to suggest that these stress response genes contribute to the high adaptability of strain Idaho. Moreover, we annotated 24 CDSs related to efflux pump systems, which may also contribute to the organic solvent tolerance of strain Idaho. About 11 lipases and 50 proteases were also annotated and merit investigation for industrial applications.
Nucleotide sequence accession numbers. This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under accession number AGFJ00000000. The version described in this paper is the first version, AGFJ01000000.