Here, we present the complete genome of the extreme thermophile, Dictyoglomus thermophilum H-6-12 (phylum Dictyoglomi), which consists of 1,959,987 bp.
Here, we present the complete 2,003,803-bp genome of a sulfate-reducing thermophilic bacterium, Thermodesulfovibrio yellowstonii strain DSM 11347T.
Here, we present the complete genome sequence of Thermodesulfobacterium commune DSM 2178T of the phylum Thermodesulfobacteria.
Here we present the draft genome of Synergistes jonesii 78-1, ATCC 49833, a member of the Synergistes phylum. This organism was isolated from the rumen of a Hawaiian goat and ferments pyridinediols. The assembly contains 2,747,397 bp in 61 contigs.
Here we present the complete 1,424,912-bp genome sequence of Coprothermobacter proteolyticus DSM 5265, isolated from a thermophilic digester fermenting tannery wastes and cattle manure.
Bacterial community composition and functional potential change subtly across gradients in the surface ocean. In contrast, while there are significant phylogenetic divergences between communities from freshwater and marine habitats, the underlying mechanisms to this phylogenetic structuring yet remain unknown. We hypothesized that the functional potential of natural bacterial communities is linked to this striking divide between microbiomes. To test this hypothesis, metagenomic sequencing of microbial communities along a 1,800 km transect in the Baltic Sea area, encompassing a continuous natural salinity gradient from limnic to fully marine conditions, was explored. Multivariate statistical analyses showed that salinity is the main determinant of dramatic changes in microbial community composition, but also of large scale changes in core metabolic functions of bacteria. Strikingly, genetically and metabolically different pathways for key metabolic processes, such as respiration, biosynthesis of quinones and isoprenoids, glycolysis and osmolyte transport, were differentially abundant at high and low salinities. These shifts in functional capacities were observed at multiple taxonomic levels and within dominant bacterial phyla, while bacteria, such as SAR11, were able to adapt to the entire salinity gradient. We propose that the large differences in central metabolism required at high and low salinities dictate the striking divide between freshwater and marine microbiomes, and that the ability to inhabit different salinity regimes evolved early during bacterial phylogenetic differentiation. These findings significantly advance our understanding of microbial distributions and stress the need to incorporate salinity in future climate change models that predict increased levels of precipitation and a reduction in salinity.
Whole genome amplification by the multiple displacement amplification (MDA) method allows sequencing of genomes from single cells of bacteria that cannot be cultured. However, genome assembly is challenging because of highly non-uniform read coverage generated by MDA. We describe an improved assembly approach tailored for single cell Illumina sequences that incorporates a progressively increasing coverage cutoff. This allows variable coverage datasets to be utilized effectively with assembly of E. coli and S. aureus single cell reads capturing >91% of genes within contigs, approaching the 95% captured from a multi-cell E. coli assembly. We apply this method to assemble a single cell genome of the uncultivated SAR324 clade of Deltaproteobacteria, a cosmopolitan bacterial lineage in the global ocean. Metabolic reconstruction suggests that SAR324 is aerobic, motile and chemotaxic. These new methods enable acquisition of genome assemblies for individual uncultivated bacteria, providing cell-specific genetic information absent from metagenomic studies.
A variety of microbial communities and their genes (microbiome) exist throughout the human body, playing fundamental roles in human health and disease. The NIH funded Human Microbiome Project (HMP) Consortium has established a population-scale framework which catalyzed significant development of metagenomic protocols resulting in a broad range of quality-controlled resources and data including standardized methods for creating, processing and interpreting distinct types of high-throughput metagenomic data available to the scientific community. Here we present resources from a population of 242 healthy adults sampled at 15 to 18 body sites up to three times, which to date, have generated 5,177 microbial taxonomic profiles from 16S rRNA genes and over 3.5 Tb of metagenomic sequence. In parallel, approximately 800 human-associated reference genomes have been sequenced. Collectively, these data represent the largest resource to date describing the abundance and variety of the human microbiome, while providing a platform for current and future studies.
The goal of the Human Microbiome Project (HMP) is to generate a comprehensive catalog of human-associated microorganisms including reference genomes representing the most common species. Toward this goal, the HMP has characterized the microbial communities at 18 body habitats in a cohort of over 200 healthy volunteers using 16S rRNA gene (16S) sequencing and has generated nearly 1,000 reference genomes from human-associated microorganisms. To determine how well current reference genome collections capture the diversity observed among the healthy microbiome and to guide isolation and future sequencing of microbiome members, we compared the HMP’s 16S data sets to several reference 16S collections to create a ‘most wanted’ list of taxa for sequencing. Our analysis revealed that the diversity of commonly occurring taxa within the HMP cohort microbiome is relatively modest, few novel taxa are represented by these OTUs and many common taxa among HMP volunteers recur across different populations of healthy humans. Taken together, these results suggest that it should be possible to perform whole-genome sequencing on a large fraction of the human microbiome, including the ‘most wanted’, and that these sequences should serve to support microbiome studies across multiple cohorts. Also, in stark contrast to other taxa, the ‘most wanted’ organisms are poorly represented among culture collections suggesting that novel culture- and single-cell-based methods will be required to isolate these organisms for sequencing.
Metagenomic data sets were generated from samples collected along a coastal to open ocean transect between Southern California Bight and California Current waters during a seasonal upwelling event, providing an opportunity to examine the impact of episodic pulses of cold nutrient-rich water into surface ocean microbial communities. The data set consists of ∼5.8 million predicted proteins across seven sites, from three different size classes: 0.1–0.8, 0.8–3.0 and 3.0–200.0 μm. Taxonomic and metabolic analyses suggest that sequences from the 0.1–0.8 μm size class correlated with their position along the upwelling mosaic. However, taxonomic profiles of bacteria from the larger size classes (0.8–200 μm) were less constrained by habitat and characterized by an increase in Cyanobacteria, Bacteroidetes, Flavobacteria and double-stranded DNA viral sequences. Functional annotation of transmembrane proteins indicate that sites comprised of organisms with small genomes have an enrichment of transporters with substrate specificities for amino acids, iron and cadmium, whereas organisms with larger genomes have a higher percentage of transporters for ammonium and potassium. Eukaryotic-type glutamine synthetase (GS) II proteins were identified and taxonomically classified as viral, most closely related to the GSII in Mimivirus, suggesting that marine Mimivirus-like particles may have played a role in the transfer of GSII gene functions. Additionally, a Planctomycete bloom was sampled from one upwelling site providing a rare opportunity to assess the genomic composition of a marine Planctomycete population. The significant correlations observed between genomic properties, community structure and nutrient availability provide insights into habitat-driven dynamics among oligotrophic versus upwelled marine waters adjoining each other spatially.
marine; metagenomics; upwelling; California Current
New bioinformatic tools are needed to analyze the growing volume of DNA sequence data. This is especially true in the case of secondary metabolite biosynthesis, where the highly repetitive nature of the associated genes creates major challenges for accurate sequence assembly and analysis. Here we introduce the web tool Natural Product Domain Seeker (NaPDoS), which provides an automated method to assess the secondary metabolite biosynthetic gene diversity and novelty of strains or environments. NaPDoS analyses are based on the phylogenetic relationships of sequence tags derived from polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) genes, respectively. The sequence tags correspond to PKS-derived ketosynthase domains and NRPS-derived condensation domains and are compared to an internal database of experimentally characterized biosynthetic genes. NaPDoS provides a rapid mechanism to extract and classify ketosynthase and condensation domains from PCR products, genomes, and metagenomic datasets. Close database matches provide a mechanism to infer the generalized structures of secondary metabolites while new phylogenetic lineages provide targets for the discovery of new enzyme architectures or mechanisms of secondary metabolite assembly. Here we outline the main features of NaPDoS and test it on four draft genome sequences and two metagenomic datasets. The results provide a rapid method to assess secondary metabolite biosynthetic gene diversity and richness in organisms or environments and a mechanism to identify genes that may be associated with uncharacterized biochemistry.
Ichthyophthirius multifiliis, commonly known as Ich, is a highly pathogenic ciliate responsible for 'white spot', a disease causing significant economic losses to the global aquaculture industry. Options for disease control are extremely limited, and Ich's obligate parasitic lifestyle makes experimental studies challenging. Unlike most well-studied protozoan parasites, Ich belongs to a phylum composed primarily of free-living members. Indeed, it is closely related to the model organism Tetrahymena thermophila. Genomic studies represent a promising strategy to reduce the impact of this disease and to understand the evolutionary transition to parasitism.
We report the sequencing, assembly and annotation of the Ich macronuclear genome. Compared with its free-living relative T. thermophila, the Ich genome is reduced approximately two-fold in length and gene density and three-fold in gene content. We analyzed in detail several gene classes with diverse functions in behavior, cellular function and host immunogenicity, including protein kinases, membrane transporters, proteases, surface antigens and cytoskeletal components and regulators. We also mapped by orthology Ich's metabolic pathways in comparison with other ciliates and a potential host organism, the zebrafish Danio rerio.
Knowledge of the complete protein-coding and metabolic potential of Ich opens avenues for rational testing of therapeutic drugs that target functions essential to this parasite but not to its fish hosts. Also, a catalog of surface protein-encoding genes will facilitate development of more effective vaccines. The potential to use T. thermophila as a surrogate model offers promise toward controlling 'white spot' disease and understanding the adaptation to a parasitic lifestyle.
Vibrio parahaemolyticus is a common cause of foodborne disease. Beginning in 1996, a more virulent strain having serotype O3:K6 caused major outbreaks in India and other parts of the world, resulting in the emergence of a pandemic. Other serovariants of this strain emerged during its dissemination and together with the original O3:K6 were termed strains of the pandemic clone. Two genomes, one of this virulent strain and one pre-pandemic strain have been sequenced. We sequenced four additional genomes of V. parahaemolyticus in this study that were isolated from different geographical regions and time points. Comparative genomic analyses of six strains of V. parahaemolyticus isolated from Asia and Peru were performed in order to advance knowledge concerning the evolution of V. parahaemolyticus; specifically, the genetic changes contributing to serotype conversion and virulence. Two pre-pandemic strains and three pandemic strains, isolated from different geographical regions, were serotype O3:K6 and either toxin profiles (tdh+, trh-) or (tdh-, trh+). The sixth pandemic strain sequenced in this study was serotype O4:K68.
Genomic analyses revealed that the trh+ and tdh+ strains had different types of pathogenicity islands and mobile elements as well as major structural differences between the tdh pathogenicity islands of the pre-pandemic and pandemic strains. In addition, the results of single nucleotide polymorphism (SNP) analysis showed that 94% of the SNPs between O3:K6 and O4:K68 pandemic isolates were within a 141 kb region surrounding the O- and K-antigen-encoding gene clusters. The "core" genes of V. parahaemolyticus were also compared to those of V. cholerae and V. vulnificus, in order to delineate differences between these three pathogenic species. Approximately one-half (49-59%) of each species' core genes were conserved in all three species, and 14-24% of the core genes were species-specific and in different functional categories.
Our data support the idea that the pandemic strains are closely related and that recent South American outbreaks of foodborne disease caused by V. parahaemolyticus are closely linked to outbreaks in India. Serotype conversion from O3:K6 to O4:K68 was likely due to a recombination event involving a region much larger than the O-antigen- and K-antigen-encoding gene clusters. Major differences between pathogenicity islands and mobile elements are also likely driving the evolution of V. parahaemolyticus. In addition, our analyses categorized genes that may be useful in differentiating pathogenic Vibrios at the species level.
The Yellowstone caldera contains the most numerous and diverse geothermal systems on Earth, yielding an extensive array of unique high-temperature environments that host a variety of deeply-rooted and understudied Archaea, Bacteria and Eukarya. The combination of extreme temperature and chemical conditions encountered in geothermal environments often results in considerably less microbial diversity than other terrestrial habitats and offers a tremendous opportunity for studying the structure and function of indigenous microbial communities and for establishing linkages between putative metabolisms and element cycling. Metagenome sequence (14–15,000 Sanger reads per site) was obtained for five high-temperature (>65°C) chemotrophic microbial communities sampled from geothermal springs (or pools) in Yellowstone National Park (YNP) that exhibit a wide range in geochemistry including pH, dissolved sulfide, dissolved oxygen and ferrous iron. Metagenome data revealed significant differences in the predominant phyla associated with each of these geochemical environments. Novel members of the Sulfolobales are dominant in low pH environments, while other Crenarchaeota including distantly-related Thermoproteales and Desulfurococcales populations dominate in suboxic sulfidic sediments. Several novel archaeal groups are well represented in an acidic (pH 3) Fe-oxyhydroxide mat, where a higher O2 influx is accompanied with an increase in archaeal diversity. The presence or absence of genes and pathways important in S oxidation-reduction, H2-oxidation, and aerobic respiration (terminal oxidation) provide insight regarding the metabolic strategies of indigenous organisms present in geothermal systems. Multiple-pathway and protein-specific functional analysis of metagenome sequence data corroborated results from phylogenetic analyses and clearly demonstrate major differences in metabolic potential across sites. The distribution of functional genes involved in electron transport is consistent with the hypothesis that geochemical parameters (e.g., pH, sulfide, Fe, O2) control microbial community structure and function in YNP geothermal springs.
Haloferax volcanii is an easily culturable moderate halophile that grows on simple defined media, is readily transformable, and has a relatively stable genome. This, in combination with its biochemical and genetic tractability, has made Hfx. volcanii a key model organism, not only for the study of halophilicity, but also for archaeal biology in general.
We report here the sequencing and analysis of the genome of Hfx. volcanii DS2, the type strain of this species. The genome contains a main 2.848 Mb chromosome, three smaller chromosomes pHV1, 3, 4 (85, 438, 636 kb, respectively) and the pHV2 plasmid (6.4 kb).
The completed genome sequence, presented here, provides an invaluable tool for further in vivo and in vitro studies of Hfx. volcanii.
Here we report the complete genome sequence of Teredinibacter turnerae T7901. T. turnerae is a marine gamma proteobacterium that occurs as an intracellular endosymbiont in the gills of wood-boring marine bivalves of the family Teredinidae (shipworms). This species is the sole cultivated member of an endosymbiotic consortium thought to provide the host with enzymes, including cellulases and nitrogenase, critical for digestion of wood and supplementation of the host's nitrogen-deficient diet. T. turnerae is closely related to the free-living marine polysaccharide degrading bacterium Saccharophagus degradans str. 2–40 and to as yet uncultivated endosymbionts with which it coexists in shipworm cells. Like S. degradans, the T. turnerae genome encodes a large number of enzymes predicted to be involved in complex polysaccharide degradation (>100). However, unlike S. degradans, which degrades a broad spectrum (>10 classes) of complex plant, fungal and algal polysaccharides, T. turnerae primarily encodes enzymes associated with deconstruction of terrestrial woody plant material. Also unlike S. degradans and many other eubacteria, T. turnerae dedicates a large proportion of its genome to genes predicted to function in secondary metabolism. Despite its intracellular niche, the T. turnerae genome lacks many features associated with obligate intracellular existence (e.g. reduced genome size, reduced %G+C, loss of genes of core metabolism) and displays evidence of adaptations common to free-living bacteria (e.g. defense against bacteriophage infection). These results suggest that T. turnerae is likely a facultative intracellular ensosymbiont whose niche presently includes, or recently included, free-living existence. As such, the T. turnerae genome provides insights into the range of genomic adaptations associated with intracellular endosymbiosis as well as enzymatic mechanisms relevant to the recycling of plant materials in marine environments and the production of cellulose-derived biofuels.
In order to enrich the phylogenetic diversity represented in the available sequenced bacterial genomes and as part of an “Assembling the Tree of Life” project, we determined the genome sequence of Thermomicrobium roseum DSM 5159. T. roseum DSM 5159 is a red-pigmented, rod-shaped, Gram-negative extreme thermophile isolated from a hot spring that possesses both an atypical cell wall composition and an unusual cell membrane that is composed entirely of long-chain 1,2-diols. Its genome is composed of two circular DNA elements, one of 2,006,217 bp (referred to as the chromosome) and one of 919,596 bp (referred to as the megaplasmid). Strikingly, though few standard housekeeping genes are found on the megaplasmid, it does encode a complete system for chemotaxis including both chemosensory components and an entire flagellar apparatus. This is the first known example of a complete flagellar system being encoded on a plasmid and suggests a straightforward means for lateral transfer of flagellum-based motility. Phylogenomic analyses support the recent rRNA-based analyses that led to T. roseum being removed from the phylum Thermomicrobia and assigned to the phylum Chloroflexi. Because T. roseum is a deep-branching member of this phylum, analysis of its genome provides insights into the evolution of the Chloroflexi. In addition, even though this species is not photosynthetic, analysis of the genome provides some insight into the origins of photosynthesis in the Chloroflexi. Metabolic pathway reconstructions and experimental studies revealed new aspects of the biology of this species. For example, we present evidence that T. roseum oxidizes CO aerobically, making it the first thermophile known to do so. In addition, we propose that glycosylation of its carotenoids plays a crucial role in the adaptation of the cell membrane to this bacterium's thermophilic lifestyle. Analyses of published metagenomic sequences from two hot springs similar to the one from which this strain was isolated, show that close relatives of T. roseum DSM 5159 are present but have some key differences from the strain sequenced.
The availability of complete genomic sequences for hundreds of organisms promises to make obtaining genome-wide estimates of substitution rates, selective constraints and other molecular evolution variables of interest an increasingly important approach to addressing broad evolutionary questions. Two of the programs most widely used for this purpose are codeml and baseml, parts of the PAML (Phylogenetic Analysis by Maximum Likelihood) suite. A significant drawback of these programs is their lack of a graphical user interface, which can limit their user base and considerably reduce their efficiency.
We have developed IDEA (Interactive Display for Evolutionary Analyses), an intuitive graphical input and output interface which interacts with PHYLIP for phylogeny reconstruction and with codeml and baseml for molecular evolution analyses. IDEA's graphical input and visualization interfaces eliminate the need to edit and parse text input and output files, reducing the likelihood of errors and improving processing time. Further, its interactive output display gives the user immediate access to results. Finally, IDEA can process data in parallel on a local machine or computing grid, allowing genome-wide analyses to be completed quickly.
IDEA provides a graphical user interface that allows the user to follow a codeml or baseml analysis from parameter input through to the exploration of results. Novel options streamline the analysis process, and post-analysis visualization of phylogenies, evolutionary rates and selective constraint along protein sequences simplifies the interpretation of results. The integration of these functions into a single tool eliminates the need for lengthy data handling and parsing, significantly expediting access to global patterns in the data.
We report here the sequencing and analysis of the genome of the nitrogen-fixing endophyte, Klebsiella pneumoniae 342. Although K. pneumoniae 342 is a member of the enteric bacteria, it serves as a model for studies of endophytic, plant-bacterial associations due to its efficient colonization of plant tissues (including maize and wheat, two of the most important crops in the world), while maintaining a mutualistic relationship that encompasses supplying organic nitrogen to the host plant. Genomic analysis examined K. pneumoniae 342 for the presence of previously identified genes from other bacteria involved in colonization of, or growth in, plants. From this set, approximately one-third were identified in K. pneumoniae 342, suggesting additional factors most likely contribute to its endophytic lifestyle. Comparative genome analyses were used to provide new insights into this question. Results included the identification of metabolic pathways and other features devoted to processing plant-derived cellulosic and aromatic compounds, and a robust complement of transport genes (15.4%), one of the highest percentages in bacterial genomes sequenced. Although virulence and antibiotic resistance genes were predicted, experiments conducted using mouse models showed pathogenicity to be attenuated in this strain. Comparative genomic analyses with the presumed human pathogen K. pneumoniae MGH78578 revealed that MGH78578 apparently cannot fix nitrogen, and the distribution of genes essential to surface attachment, secretion, transport, and regulation and signaling varied between each genome, which may indicate critical divergences between the strains that influence their preferred host ranges and lifestyles (endophytic plant associations for K. pneumoniae 342 and presumably human pathogenesis for MGH78578). Little genome information is available concerning endophytic bacteria. The K. pneumoniae 342 genome will drive new research into this less-understood, but important category of bacterial-plant host relationships, which could ultimately enhance growth and nutrition of important agricultural crops and development of plant-derived products and biofuels.
Bacterial endophytes are capable of inhabiting the living tissues of plants without causing them significant harm. Klebsiella pneumoniae 342 (Kp342) is a model for this plant host-bacterial association, in part due to its capacity to colonize in high numbers the interior of plants including wheat and maize, two of the most important crops in the world. Kp342 possesses the ability to capture atmospheric nitrogen gas and turn it into an organic form (a process known as nitrogen fixation), of which part may be used as fertilizer by its plant host. Here, we describe the genome sequence and analysis of this model endophyte. When the Kp342 genome is compared to the genome of a closely related pathogenic relative, we can begin to surmise that its preference to engage in a harmonious relationship with plants is a result of many interacting factors. These include differences in its protein secretion systems, the manner in which its genes are regulated, and its ability to sense and respond to its environment. The study of endophytes is increasing in intensity due to the roles they may play in multiple biotechnological applications, including enhancing crop growth and nutrition, bioremediation, and development of plant-derived products and biofuels.
We present the genome sequences of a new clinical isolate of the important human pathogen, Aspergillus fumigatus, A1163, and two closely related but rarely pathogenic species, Neosartorya fischeri NRRL181 and Aspergillus clavatus NRRL1. Comparative genomic analysis of A1163 with the recently sequenced A. fumigatus isolate Af293 has identified core, variable and up to 2% unique genes in each genome. While the core genes are 99.8% identical at the nucleotide level, identity for variable genes can be as low 40%. The most divergent loci appear to contain heterokaryon incompatibility (het) genes associated with fungal programmed cell death such as developmental regulator rosA. Cross-species comparison has revealed that 8.5%, 13.5% and 12.6%, respectively, of A. fumigatus, N. fischeri and A. clavatus genes are species-specific. These genes are significantly smaller in size than core genes, contain fewer exons and exhibit a subtelomeric bias. Most of them cluster together in 13 chromosomal islands, which are enriched for pseudogenes, transposons and other repetitive elements. At least 20% of A. fumigatus-specific genes appear to be functional and involved in carbohydrate and chitin catabolism, transport, detoxification, secondary metabolism and other functions that may facilitate the adaptation to heterogeneous environments such as soil or a mammalian host. Contrary to what was suggested previously, their origin cannot be attributed to horizontal gene transfer (HGT), but instead is likely to involve duplication, diversification and differential gene loss (DDL). The role of duplication in the origin of lineage-specific genes is further underlined by the discovery of genomic islands that seem to function as designated “gene dumps” and, perhaps, simultaneously, as “gene factories”.
Aspergillus is an extremely diverse genus of filamentous ascomycetous fungi (molds) found ubiquitously in soil and decomposing vegetation. Being supreme opportunists, aspergilli have adapted to overcome various chemical, physical, and biological stresses found in heterogeneous environments. While most species in the genus are saprophytes, a surprising number are able to infect wounded plants and animals. Remarkably, the allergic human host also responds abnormally to the aspergilli with lung and sinus disease. The advent of immunosuppressive agents and other medical advances have created a large worldwide pool of human hosts susceptible to some Aspergillus species, including the world's most harmful mold and the causative agent of invasive aspergillosis, Aspergillus fumigatus. In this study, we have used the power of comparative genomics to gain insight into genetic mechanisms that may contribute to the metabolic versatility and pathogenicity of this important human pathogen. Comparison of the genomes of two A. fumigatus clinical isolates and two closely related, but rarely pathogenic species showed that their genomes contain several large isolate- and species-specific chromosomal islands. The metabolic capabilities encoded by these highly labile regions are likely to contribute to their rapid adaptation to heterogeneous environments such as soil or a living host.
The dimorphic prosthecate bacteria (DPB) are α-proteobacteria that reproduce in an asymmetric manner rather than by binary fission and are of interest as simple models of development. Prior to this work, the only member of this group for which genome sequence was available was the model freshwater organism Caulobacter crescentus. Here we describe the genome sequence of Hyphomonas neptunium, a marine member of the DPB that differs from C. crescentus in that H. neptunium uses its stalk as a reproductive structure. Genome analysis indicates that this organism shares more genes with C. crescentus than it does with Silicibacter pomeroyi (a closer relative according to 16S rRNA phylogeny), that it relies upon a heterotrophic strategy utilizing a wide range of substrates, that its cell cycle is likely to be regulated in a similar manner to that of C. crescentus, and that the outer membrane complements of H. neptunium and C. crescentus are remarkably similar. H. neptunium swarmer cells are highly motile via a single polar flagellum. With the exception of cheY and cheR, genes required for chemotaxis were absent in the H. neptunium genome. Consistent with this observation, H. neptunium swarmer cells did not respond to any chemotactic stimuli that were tested, which suggests that H. neptunium motility is a random dispersal mechanism for swarmer cells rather than a stimulus-controlled navigation system for locating specific environments. In addition to providing insights into bacterial development, the H. neptunium genome will provide an important resource for the study of other interesting biological processes including chromosome segregation, polar growth, and cell aging.
The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct nuclei within a single cell. The germline-like micronucleus (MIC) has its genome held in reserve for sexual reproduction. The soma-like macronucleus (MAC), which possesses a genome processed from that of the MIC, is the center of gene expression and does not directly contribute DNA to sexual progeny. We report here the shotgun sequencing, assembly, and analysis of the MAC genome of T. thermophila, which is approximately 104 Mb in length and composed of approximately 225 chromosomes. Overall, the gene set is robust, with more than 27,000 predicted protein-coding genes, 15,000 of which have strong matches to genes in other organisms. The functional diversity encoded by these genes is substantial and reflects the complexity of processes required for a free-living, predatory, single-celled organism. This is highlighted by the abundance of lineage-specific duplications of genes with predicted roles in sensing and responding to environmental conditions (e.g., kinases), using diverse resources (e.g., proteases and transporters), and generating structural complexity (e.g., kinesins and dyneins). In contrast to the other lineages of alveolates (apicomplexans and dinoflagellates), no compelling evidence could be found for plastid-derived genes in the genome. UGA, the only T. thermophila stop codon, is used in some genes to encode selenocysteine, thus making this organism the first known with the potential to translate all 64 codons in nuclear genes into amino acids. We present genomic evidence supporting the hypothesis that the excision of DNA from the MIC to generate the MAC specifically targets foreign DNA as a form of genome self-defense. The combination of the genome sequence, the functional diversity encoded therein, and the presence of some pathways missing from other model organisms makes T. thermophila an ideal model for functional genomic studies to address biological, biomedical, and biotechnological questions of fundamental importance.
The macronuclear genome ofTetrahymena thermophila is sequenced and analyzed. Conservation in this single-celled ciliate of some features normally observed in only multicellular organisms sheds light on early eukaryotic evolution.
Fungi can undergo autophagic- or apoptotic-type programmed cell death (PCD) on exposure to antifungal agents, developmental signals, and stress factors. Filamentous fungi can also exhibit a form of cell death called heterokaryon incompatibility (HI) triggered by fusion between two genetically incompatible individuals. With the availability of recently sequenced genomes of Aspergillus fumigatus and several related species, we were able to define putative components of fungi-specific death pathways and the ancestral core apoptotic machinery shared by all fungi and metazoa.
Phylogenetic profiling of HI-associated proteins from four Aspergilli and seven other fungal species revealed lineage-specific protein families, orphan genes, and core genes conserved across all fungi and metazoa. The Aspergilli-specific domain architectures include NACHT family NTPases, which may function as key integrators of stress and nutrient availability signals. They are often found fused to putative effector domains such as Pfs, SesB/LipA, and a newly identified domain, HET-s/LopB. Many putative HI inducers and mediators are specific to filamentous fungi and not found in unicellular yeasts. In addition to their role in HI, several of them appear to be involved in regulation of cell cycle, development and sexual differentiation. Finally, the Aspergilli possess many putative downstream components of the mammalian apoptotic machinery including several proteins not found in the model yeast, Saccharomyces cerevisiae.
Our analysis identified more than 100 putative PCD associated genes in the Aspergilli, which may help expand the range of currently available treatments for aspergillosis and other invasive fungal diseases. The list includes species-specific protein families as well as conserved core components of the ancestral PCD machinery shared by fungi and metazoa.
Sequencing and comparative genome analysis of four strains of Campylobacter including C. lari RM2100, C. upsaliensis RM3195, and C. coli RM2228 has revealed major structural differences that are associated with the insertion of phage- and plasmid-like genomic islands, as well as major variations in the lipooligosaccharide complex. Poly G tracts are longer, are greater in number, and show greater variability in C. upsaliensis than in the other species. Many genes involved in host colonization, including racR/S, cadF, cdt, ciaB, and flagellin genes, are conserved across the species, but variations that appear to be species specific are evident for a lipooligosaccharide locus, a capsular (extracellular) polysaccharide locus, and a novel Campylobacter putative licABCD virulence locus. The strains also vary in their metabolic profiles, as well as their resistance profiles to a range of antibiotics. It is evident that the newly identified hypothetical and conserved hypothetical proteins, as well as uncharacterized two-component regulatory systems and membrane proteins, may hold additional significant information on the major differences in virulence among the species, as well as the specificity of the strains for particular hosts.
Although the single genome sequence of the human Campylobacter jejuni yielded some insight, comparison of multiple species of the same genus elucidated greater understanding of virulence mechanisms