Results 1-11 (11)

1.  Complete Genome Sequence of Pelosinus sp. Strain UFO1 Assembled Using Single-Molecule Real-Time DNA Sequencing Technology 
Genome Announcements  2014;2(5):e00881-14.
Pelosinus species can reduce metals such as Fe(III), U(VI), and Cr(VI) and have been isolated from diverse geographical regions. Five draft genome sequences have been published. We report the complete genome sequence for Pelosinus sp. strain UFO1 using only PacBio DNA sequence data and without manual finishing.
PMCID: PMC4155594  PMID: 25189589
2.  Characterization of Ten Heterotetrameric NDP-Dependent Acyl-CoA Synthetases of the Hyperthermophilic Archaeon Pyrococcus furiosus 
Archaea  2014;2014:176863.
The hyperthermophilic archaeon Pyrococcus furiosus grows by fermenting peptides and carbohydrates to organic acids. In the terminal step, acyl-CoA synthetase (ACS) isoenzymes convert acyl-CoA derivatives to the corresponding acid and conserve energy in the form of ATP. ACS1 and ACS2 were previously purified from P. furiosus and have α2β2 structures but the genome contains genes encoding three additional α-subunits. The ten possible combinations of α and β genes were expressed in E. coli and each resulted in stable and active α2β2 isoenzymes. The α-subunit of each isoenzyme determined CoA-based substrate specificity and between them they accounted for the CoA derivatives of fourteen amino acids. The β-subunit determined preference for adenine or guanine nucleotides. The GTP-generating isoenzymes are proposed to play a role in gluconeogenesis by producing GTP for GTP-dependent phosphoenolpyruvate carboxykinase and for other GTP-dependent processes. Transcriptional and proteomic data showed that all ten isoenzymes are constitutively expressed indicating that both ATP and GTP are generated from the metabolism of most of the amino acids. A phylogenetic analysis showed that the ACSs of P. furiosus and other members of the Thermococcales are evolutionarily distinct from those found throughout the rest of biology, including those of other hyperthermophilic archaea.
PMCID: PMC3942289  PMID: 24669200
3.  Caldicellulosiruptor Core and Pangenomes Reveal Determinants for Noncellulosomal Thermophilic Deconstruction of Plant Biomass 
Journal of Bacteriology  2012;194(15):4015-4028.
Extremely thermophilic bacteria of the genus Caldicellulosiruptor utilize carbohydrate components of plant cell walls, including cellulose and hemicellulose, facilitated by a diverse set of glycoside hydrolases (GHs). From a biofuel perspective, this capability is crucial for deconstruction of plant biomass into fermentable sugars. While all species from the genus grow on xylan and acid-pretreated switchgrass, growth on crystalline cellulose is variable. The basis for this variability was examined using microbiological, genomic, and proteomic analyses of eight globally diverse Caldicellulosiruptor species. The open Caldicellulosiruptor pangenome (4,009 open reading frames [ORFs]) encodes 106 GHs, representing 43 GH families, but only 26 GHs from 17 families are included in the core (noncellulosic) genome (1,543 ORFs). Differentiating the strongly cellulolytic Caldicellulosiruptor species from the others is a specific genomic locus that encodes multidomain cellulases from GH families 9 and 48, which are associated with cellulose-binding modules. This locus also encodes a novel adhesin associated with type IV pili, which was identified in the exoproteome bound to crystalline cellulose. Taking into account the core genomes, pangenomes, and individual genomes, the ancestral Caldicellulosiruptor was likely cellulolytic and evolved, in some cases, into species that lost the ability to degrade crystalline cellulose while maintaining the capacity to hydrolyze amorphous cellulose and hemicellulose.
PMCID: PMC3416521  PMID: 22636774
4.  Genome Sequencing of a Genetically Tractable Pyrococcus furiosus Strain Reveals a Highly Dynamic Genome 
Journal of Bacteriology  2012;194(15):4097-4106.
The model archaeon Pyrococcus furiosus grows optimally near 100°C on carbohydrates and peptides. Its genome sequence (NCBI) was determined 12 years ago. A genetically tractable strain, COM1, was very recently reported, and here we describe its genome sequence. Of 1,909,827 bp in size, it is 1,571 bp longer (0.1%) than the reference NCBI sequence. The COM1 genome contains numerous chromosomal rearrangements, deletions, and single base changes. COM1 also has 45 full or partial insertion sequences (ISs) compared to 35 in the reference NCBI strain, and these have resulted in the direct deletion or insertional inactivation of 13 genes. Another seven genes were affected by chromosomal deletions and are predicted to be nonfunctional. In addition, the amino acid sequences of another 102 of the 2,134 predicted gene products are different in COM1. These changes potentially impact various cellular functions, including carbohydrate, peptide, and nucleotide metabolism; DNA repair; CRISPR-associated defense; transcriptional regulation; membrane transport; and growth at 72°C. For example, the IS-mediated inactivation of riboflavin synthase in COM1 resulted in a riboflavin requirement for growth. Nevertheless, COM1 grew on cellobiose, malto-oligosaccharides, and peptides in complex and minimal media at 98 and 72°C to the same extent as did both its parent strain and a new culture collection strain (DSMZ 3638). This was in spite of COM1 lacking several metabolic enzymes, including nonphosphorylating glyceraldehyde-3-phosphate dehydrogenase and beta-glucosidase. The P. furiosus genome is therefore of high plasticity, and the availability of the COM1 sequence will be critical for the future studies of this model hyperthermophile.
PMCID: PMC3416535  PMID: 22636780
5.  Robust, high-throughput solution structural analyses by small angle X-ray scattering (SAXS) 
Nature methods  2009;6(8):606-612.
We present an efficient pipeline enabling high-throughput analysis of protein structure in solution with small angle X-ray scattering (SAXS). Our SAXS pipeline combines automated sample handling of microliter volumes, temperature and anaerobic control, rapid data collection, data analysis, and couples structural analysis with automated archiving. We subjected 50 representative proteins, mostly from Pyrococcus furiosus, to this pipeline, revealing that 30 were multimeric structures in solution. SAXS analysis allowed us to distinguish aggregated and unfolded proteins, define global structural parameters and oligomeric states for most samples, identify shapes and similar structures for 25 unknown structures, and determine envelopes for 41 proteins. We believe that high throughput SAXS is an enabling technology that may change the way that structural genomics research is done.
PMCID: PMC3094553  PMID: 19620974
6.  A Computational Framework for Proteome-Wide Pursuit and Prediction of Metalloproteins using ICP-MS and MS/MS Data 
BMC Bioinformatics  2011;12:64.
Metal-containing proteins comprise a diverse and sizable category within the proteomes of organisms, ranging from proteins that use metals to catalyze reactions to proteins in which metals play key structural roles. Unfortunately, reliably predicting that a protein will contain a specific metal from its amino acid sequence is not currently possible. We recently developed a generally-applicable experimental technique for finding metalloproteins on a genome-wide scale. Applying this metal-directed protein purification approach (ICP-MS and MS/MS based) to the prototypical microbe Pyrococcus furiosus conclusively demonstrated the extent and diversity of the uncharacterized portion of microbial metalloproteomes since a majority of the observed metal peaks could not be assigned to known or predicted metalloproteins. However, even using this technique, it is not technically feasible to purify to homogeneity all metalloproteins in an organism. In order to address these limitations and complement the metal-directed protein purification, we developed a computational infrastructure and statistical methodology to aid in the pursuit and identification of novel metalloproteins.
We demonstrate that our methodology enables predictions of metal-protein interactions using an experimental data set derived from a chromatography fractionation experiment in which 870 proteins and 10 metals were measured over 2,589 fractions. For each of the 10 metals, cobalt, iron, manganese, molybdenum, nickel, lead, tungsten, uranium, vanadium, and zinc, clusters of proteins frequently occurring in metal peaks (of a specific metal) within the fractionation space were defined. This resulted in predictions that there are from 5 undiscovered vanadium- to 13 undiscovered cobalt-containing proteins in Pyrococcus furiosus. Molybdenum and nickel were chosen for additional assessment producing lists of genes predicted to encode metalloproteins or metalloprotein subunits, 22 for nickel including seven from known nickel-proteins, and 20 for molybdenum including two from known molybdo-proteins. The uncharacterized proteins are prime candidates for metal-based purification or recombinant approaches to validate these predictions.
We conclude that the largely uncharacterized extent of native metalloproteomes can be revealed through analysis of the co-occurrence of metals and proteins across a fractionation space. This can significantly impact our understanding of metallobiochemistry, disease mechanisms, and metal toxicity, with implications for bioremediation, medicine and other fields.
PMCID: PMC3058030  PMID: 21356119
7.  Insights into plant biomass conversion from the genome of the anaerobic thermophilic bacterium Caldicellulosiruptor bescii DSM 6725 
Nucleic Acids Research  2011;39(8):3240-3254.
Caldicellulosiruptor bescii DSM 6725 utilizes various polysaccharides and grows efficiently on untreated high-lignin grasses and hardwood at an optimum temperature of ∼80°C. It is a promising anaerobic bacterium for studying high-temperature biomass conversion. Its genome contains 2666 protein-coding sequences organized into 1209 operons. Expression of 2196 genes (83%) was confirmed experimentally. At least 322 genes appear to have been obtained by lateral gene transfer (LGT). Putative functions were assigned to 364 conserved/hypothetical protein (C/HP) genes. The genome contains 171 and 88 genes related to carbohydrate transport and utilization, respectively. Growth on cellulose led to the up-regulation of 32 carbohydrate-active (CAZy), 61 sugar transport, 25 transcription factor and 234 C/HP genes. Some C/HPs were overproduced on cellulose or xylan, suggesting their involvement in polysaccharide conversion. A unique feature of the genome is enrichment with genes encoding multi-modular, multi-functional CAZy proteins organized into one large cluster, the products of which are proposed to act synergistically on different components of plant cell walls and to aid the ability of C. bescii to convert plant biomass. The high duplication of CAZy domains coupled with the ability to acquire foreign genes by LGT may have allowed the bacterium to rapidly adapt to changing plant biomass-rich environments.
PMCID: PMC3082886  PMID: 21227922
8.  Genome Sequence of the Anaerobic, Thermophilic, and Cellulolytic Bacterium “Anaerocellum thermophilum” DSM 6725
Journal of Bacteriology  2009;191(11):3760-3761.
“Anaerocellum thermophilum” DSM 6725 is a strictly anaerobic bacterium that grows optimally at 75°C. It uses a variety of polysaccharides, including crystalline cellulose and untreated plant biomass, and has potential utility in biomass conversion. Here we report its complete genome sequence of 2.97 Mb, which is contained within one chromosome and two plasmids (of 8.3 and 3.6 kb). The genome encodes a broad set of cellulolytic enzymes, transporters, and pathways for sugar utilization and compared to those of other saccharolytic, anaerobic thermophiles is most similar to that of Caldicellulosiruptor saccharolyticus DSM 8903.
PMCID: PMC2681903  PMID: 19346307
9.  Structure of the hypothetical protein PF0899 from Pyrococcus furiosus at 1.85 Å resolution 
Acta Crystallographica Section F  2007;63(Pt 7):549-552.
The crystal structure of the hypothetical protein PF0899 from P. furiosus has been determined to 1.85 Å resolution.
The hypothetical protein PF0899 is a 95-residue peptide from the hyperthermophilic archaeon Pyrococcus furiosus that represents a gene family with six members. P. furiosus ORF PF0899 has been cloned, expressed and crystallized and its structure has been determined by the Southeast Collaboratory for Structural Genomics. The structure was solved using the SCA2Structure pipeline from multiple data sets and has been refined to 1.85 Å against the highest resolution data set collected (a presumed gold derivative), with a crystallographic R factor of 21.0% and R free of 24.0%. The refined structure shows some structural similarity to a wedge-shaped domain observed in the structure of the major capsid protein from bacteriophage HK97, suggesting that PF0899 may be a structural protein.
PMCID: PMC2335137  PMID: 17620707
structural genomics; SECSG; Pfu-871755; PF0899; high-throughput structure
10.  Operon prediction in Pyrococcus furiosus 
Nucleic Acids Research  2006;35(1):11-20.
Identification of operons in the hyperthermophilic archaeon Pyrococcus furiosus represents an important step to understanding the regulatory mechanisms that enable the organism to adapt and thrive in extreme environments. We have predicted operons in P. furiosus by combining the results from three existing algorithms using a neural network (NN). These algorithms use intergenic distances, phylogenetic profiles, functional categories and gene-order conservation in their operon prediction. Our method takes as inputs the confidence scores of the three programs, and outputs a prediction of whether adjacent genes on the same strand belong to the same operon. In addition, we have applied Gene Ontology (GO) and KEGG pathway information to improve the accuracy of our algorithm. The parameters of this NN predictor are trained on a subset of all experimentally verified operon gene pairs of Bacillus subtilis. It subsequently achieved 86.5% prediction accuracy when applied to a subset of gene pairs for Escherichia coli, which is substantially better than any of the three prediction programs. Using this new algorithm, we predicted 470 operons in the P. furiosus genome. Of these, 349 were validated using DNA microarray data.
PMCID: PMC1761436  PMID: 17148478
11.  Defining Genes in the Genome of the Hyperthermophilic Archaeon Pyrococcus furiosus: Implications for All Microbial Genomes
Journal of Bacteriology  2005;187(21):7325-7332.
The original genome annotation of the hyperthermophilic archaeon Pyrococcus furiosus contained 2,065 open reading frames (ORFs). The genome was subsequently automatically annotated in two public databases by the Institute for Genomic Research (TIGR) and the National Center for Biotechnology Information (NCBI). Remarkably, more than 500 of the originally annotated ORFs differ in size in the two databases, many very significantly. For example, more than 170 of the predicted proteins differ at their N termini by more than 25 amino acids. Similar discrepancies were observed in the TIGR and NCBI databases with the other archaeal and bacterial genomes examined. In addition, the two databases contain 60 (NCBI) and 221 (TIGR) ORFs not present in the original annotation of P. furiosus. In the present study we have experimentally assessed the validity of 88 previously unannotated ORFs. Transcriptional analyses showed that 11 of 61 ORFs examined were expressed in P. furiosus when grown at either 95 or 72°C. In addition, 7 of 54 ORFs examined yielded heat-stable recombinant proteins when they were expressed in Escherichia coli, although only one of the seven ORFs was expressed in P. furiosus under the growth conditions tested. It is concluded that the P. furiosus genome contains at least 17 ORFs not previously recognized in the original annotation. This study serves to highlight the discrepancies in the public databases and the problems of accurately defining the number and sizes of ORFs within any microbial genome.
PMCID: PMC1272981  PMID: 16237015
