1.  Setting the Stage: The History, Chemistry, and Geobiology behind RNA 
No community-accepted scientific methods are available today to guide studies on what role RNA played in the origin and early evolution of life on Earth. Further, a definition-theory for life is needed to develop hypotheses relating to the “RNA First” model for the origin of life. Four approaches are currently at various stages of development of such a definition-theory to guide these studies. These are (a) paleogenetics, in which inferences about the structure of past life are drawn from the structure of present life; (b) prebiotic chemistry, in which hypotheses with experimental support are sought that get RNA from organic and inorganic species possibly present on early Earth; (c) exploration, hoping to encounter life independent of terran life, which might contain RNA; and (d) synthetic biology, in which laboratories attempt to reproduce biological behavior with unnatural chemical systems.
Studies of how self-replicating genetic material first arose on Earth involve paleogenetics, prebiotic chemistry, exploration, and synthetic biology. All currently support the “RNA first” hypothesis.
doi:10.1101/cshperspect.a003541
PMCID: PMC3249627  PMID: 20880988
2.  Labeled Nucleoside Triphosphates with Reversibly Terminating Aminoalkoxyl Groups 
Nucleosides, nucleotides & nucleic acids  2010;29(11):10.1080/15257770.2010.536191.
Nucleoside triphosphates having a 3′-ONH2 blocking group have been prepared with and without fluorescent tags on their nucleobases. DNA polymerases were identified that accepted these, adding a single nucleotide to the 3′-end of a primer in a template-directed extension reaction that then stops. Nitrite chemistry was developed to cleave the 3′-ONH2 group under mild conditions to allow continued primer extension. Extension-cleavage-extension cycles in solution were demonstrated with untagged nucleotides and mixtures of tagged and untagged nucleotides. Multiple extension-cleavage-extension cycles were demonstrated on an Intelligent Bio-Systems Sequencer, showing the potential of the 3′-ONH2 blocking group in “next generation sequencing”.
doi:10.1080/15257770.2010.536191
PMCID: PMC3858015  PMID: 21128174
triphosphate; aminooxy; mutant polymerases; sequencing technology; fluorescent nucleotide; oligonucleotide microarray; nitrous acid; primer elongation; alpha effect; synthesis; oxime
3.  Aesthetics in Synthesis and Synthetic Biology 
doi:10.1016/j.cbpa.2012.11.004
PMCID: PMC3805259  PMID: 23196149
4.  Resurrecting ancestral alcohol dehydrogenases from yeast 
Nature genetics  2005;37(6):630-635.
Modern yeast living in fleshy fruits rapidly convert sugars into bulk ethanol through pyruvate. Pyruvate loses carbon dioxide to produce acetaldehyde, which is reduced by alcohol dehydrogenase 1 (Adh1) to ethanol, which accumulates. Yeast later consumes the accumulated ethanol, exploiting Adh2, an Adh1 homolog differing by 24 (of 348) amino acids. As many microorganisms cannot grow in ethanol, accumulated ethanol may help yeast defend resources in the fruit. We report here the resurrection of the last common ancestor of Adh1 and Adh2, called AdhA. The kinetic behavior of AdhA suggests that the ancestor was optimized to make (not consume) ethanol. This is consistent with the hypothesis that before the Adh1-Adh2 duplication, yeast did not accumulate ethanol for later consumption but rather used AdhA to recycle NADH generated in the glycolytic pathway. Silent nucleotide dating suggests that the Adh1-Adh2 duplication occurred near the time of duplication of several other proteins involved in the accumulation of ethanol, possibly in the Cretaceous age when fleshy fruits arose. These results help to connect the chemical behavior of these enzymes through systems analysis to a time of global ecosystem change, a small but useful step towards a planetary systems biology.
doi:10.1038/ng1553
PMCID: PMC3618678  PMID: 15864308
5.  Amplification, Mutation, and Sequencing of a Six-Letter Synthetic Genetic System 
Journal of the American Chemical Society  2011;133(38):15105-15112.
The next goals in the development of a synthetic biology that uses artificial genetic systems will require chemistry-biology combinations that allow the amplification of DNA containing any number of sequential and non-sequential non-standard nucleotides. This amplification must ensure that the non-standard nucleotides are not unidirectionally lost during PCR amplification (unidirectional loss would cause the artificial system to revert to an all-natural genetic system). Further, technology is needed to sequence artificial genetic DNA molecules. The work reported here meets all three of these goals for a six-letter artificially expanded genetic information system (AEGIS) that comprises four standard nucleotides (G, A, C, and T) and two additional non-standard nucleotides (Z and P). We report polymerases and PCR conditions that amplify a wide range of GACTZP DNA sequences having multiple consecutive unnatural synthetic genetic components with low (0.2% per theoretical cycle) levels of mutation. We demonstrate that residual mutation processes both introduce and remove unnatural nucleotides, allowing the artificial genetic system to evolve as such, rather than revert to a wholly natural system. We then show that mechanisms for these residual mutation processes can be exploited in a strategy to sequence “six-letter” GACTZP DNA. None of these has yet been reported for any other synthetic genetic system.
doi:10.1021/ja204910n
PMCID: PMC3427765  PMID: 21842904
6.  The Natural History of Class I Primate Alcohol Dehydrogenases Includes Gene Duplication, Gene Loss, and Gene Conversion 
PLoS ONE  2012;7(7):e41175.
Background
Gene duplication is a source of molecular innovation throughout evolution. However, even with massive amounts of genome sequence data, correlating gene duplication with speciation and other events in natural history can be difficult. This is especially true in its most interesting cases, where rapid and multiple duplications are likely to reflect adaptation to rapidly changing environments and life styles. This may be so for Class I of alcohol dehydrogenases (ADH1s), where multiple duplications occurred in primate lineages in Old and New World monkeys (OWMs and NWMs) and hominoids.
Methodology/Principal Findings
To build a preferred model for the natural history of ADH1s, we determined the sequences of nine new ADH1 genes, finding for the first time multiple paralogs in various prosimians (lemurs, strepsirhines). Database mining then identified novel ADH1 paralogs in both macaque (an OWM) and marmoset (a NWM). These were used with the previously identified human paralogs to resolve controversies relating to dates of duplication and gene conversion in the ADH1 family. Central to these controversies are differences in the topologies of trees generated from exonic (coding) sequences and intronic sequences.
Conclusions/Significance
We provide evidence that gene conversions are the primary source of difference, using molecular clock dating of duplications and analyses of microinsertions and deletions (micro-indels). The tree topology inferred from intron sequences appears to more correctly represent the natural history of ADH1s, with the ADH1 paralogs in platyrrhines (NWMs) and catarrhines (OWMs and hominoids) having arisen by duplications shortly predating the divergence of OWMs and NWMs. We also conclude that paralogs in lemurs arose independently. Finally, we identify errors in database interpretation as the source of controversies concerning gene conversion. These analyses provide a model for the natural history of ADH1s that posits four ADH1 paralogs in the ancestor of Catarrhine and Platyrrhine primates, followed by the loss of an ADH1 paralog in the human lineage.
doi:10.1371/journal.pone.0041175
PMCID: PMC3409193  PMID: 22859968
7.  Planetary Organic Chemistry and the Origins of Biomolecules 
Organic chemistry on a planetary scale is likely to have transformed carbon dioxide and reduced carbon species delivered to an accreting Earth. According to various models for the origin of life on Earth, biological molecules that jump-started Darwinian evolution arose via this planetary chemistry. The grandest of these models assumes that ribonucleic acid (RNA) arose prebiotically, together with components for compartments that held it and a primitive metabolism that nourished it. Unfortunately, it has been challenging to identify possible prebiotic chemistry that might have created RNA. Organic molecules, given energy, have a well-known propensity to form multiple products, sometimes referred to collectively as “tar” or “tholin.” These mixtures appear to be unsuited to support Darwinian processes, and certainly have never been observed to spontaneously yield a homochiral genetic polymer. To date, proposed solutions to this challenge either involve too much direct human intervention to satisfy many in the community, or generate molecules that are unreactive “dead ends” under standard conditions of temperature and pressure. Carbohydrates, organic species having carbon, hydrogen, and oxygen atoms in a ratio of 1:2:1 and an aldehyde or ketone group, conspicuously embody this challenge. They are components of RNA and their reactivity can support interesting spontaneous chemistry as part of a “carbohydrate world,” but they also easily form mixtures, polymers and tars. We describe here the latest thoughts on this challenge, focusing on how it might be resolved using minerals containing borate, silicate, and molybdate, inter alia.
Borates, silicates, and other minerals may have promoted prebiotic chemical reactions in which organic molecules produced RNA, rather than “dead end” polymers and tars.
doi:10.1101/cshperspect.a003467
PMCID: PMC2890202  PMID: 20504964
8.  Experimental Evolution of a Facultative Thermophile from a Mesophilic Ancestor 
Experimental evolution via continuous culture is a powerful approach to the alteration of complex phenotypes, such as optimal/maximal growth temperatures. The benefit of this approach is that phenotypic selection is tied to growth rate, allowing the production of optimized strains. Herein, we demonstrate the use of a recently described long-term culture apparatus called the Evolugator for the generation of a thermophilic descendant from a mesophilic ancestor (Escherichia coli MG1655). In addition, we used whole-genome sequencing of sequentially isolated strains throughout the thermal adaptation process to characterize the evolutionary history of the resultant genotype, identifying 31 genetic alterations that may contribute to thermotolerance, although some of these mutations may be adaptive for off-target environmental parameters, such as rich medium. We undertook preliminary phenotypic analysis of mutations identified in the glpF and fabA genes. Deletion of glpF in a mesophilic wild-type background conferred significantly improved growth rates in the 43-to-48°C temperature range and altered optimal growth temperature from 37°C to 43°C. In addition, transforming our evolved thermotolerant strain (EVG1064) with a wild-type allele of glpF reduced fitness at high temperatures. On the other hand, the mutation in fabA predictably increased the degree of saturation in membrane lipids, which is a known adaptation to elevated temperature. However, transforming EVG1064 with a wild-type fabA allele had only modest effects on fitness at intermediate temperatures. The Evolugator is fully automated and demonstrates the potential to accelerate the selection for complex traits by experimental evolution and significantly decrease development time for new industrial strains.
doi:10.1128/AEM.05773-11
PMCID: PMC3255606  PMID: 22020511
9.  Expanded Genetic Alphabets in the Polymerase Chain Reaction 
Cleaning up polymerase chain reactions: Artificially expanded genetic information systems (AEGIS) add extra nucleotide "letters" to DNA alphabets; oligonucleotides containing AEGIS nucleotides do not bind to natural DNA. This "orthogonality" is exploited here by placing two AEGIS nucleotides (P and Z) in external tags for primers targeting three cancer genes in a nested PCR architecture. AEGIS tags support multiplexed PCR with fewer primer dimers and off-target amplicons than multiplexed PCR without AEGIS components.
doi:10.1002/anie.200905173
PMCID: PMC3155763  PMID: 19946925
PCR; DNA replication; genetic alphabets; polymerases; nucleobases
10.  Recognition of an expanded genetic alphabet by type-II restriction endonucleases and their application to analyze polymerase fidelity 
Nucleic Acids Research  2011;39(9):3949-3961.
To explore the possibility of using restriction enzymes in a synthetic biology based on artificially expanded genetic information systems (AEGIS), 24 type-II restriction endonucleases (REases) were challenged to digest DNA duplexes containing recognition sites where individual Cs and Gs were replaced by the AEGIS nucleotides Z and P [respectively, 6-amino-5-nitro-3-(1′-β-d-2′-deoxyribofuranosyl)-2(1H)-pyridone and 2-amino-8-(1′-β-d-2′-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one]. These AEGIS nucleotides implement complementary hydrogen bond donor–donor–acceptor and acceptor–acceptor–donor patterns. Results allowed us to classify type-II REases into five groups based on their performance, and to infer some specifics of their interactions with functional groups in the major and minor grooves of the target DNA. For three enzymes among these 24 where crystal structures are available (BcnI, EcoO109I and NotI), these interactions were modeled. Further, we applied a type-II REase to quantitate the fidelity of polymerases challenged to maintain in a DNA duplex C:G, T:A and Z:P pairs through repetitive PCR cycles. This work thus adds tools that are able to manipulate this expanded genetic alphabet in vitro, provides some structural insights into the working of restriction enzymes, and offers some preliminary data needed to take the next step in synthetic biology to use an artificial genetic system inside of living bacterial cells.
doi:10.1093/nar/gkq1274
PMCID: PMC3089450  PMID: 21245035
11.  Defining Life 
Astrobiology  2010;10(10):1021-1030.
Abstract
Any definition is intricately connected to a theory that gives it meaning. Accordingly, this article discusses various definitions of life held in the astrobiology community by considering their connected “theories of life.” These include certain “list” definitions and a popular definition that holds that life is a “self-sustaining chemical system capable of Darwinian evolution.” We then act as “anthropologists,” studying what scientists do to determine which definition-theories of life they constructively hold as they design missions to seek non-terran life. We also look at how constructive beliefs about biosignatures change as observational data accumulate. And we consider how a definition centered on Darwinian evolution might itself be forced to change as supra-Darwinian species emerge, including in our descendants, and consider the chances of our encountering supra-Darwinian species in our exploration of the Cosmos. Last, we ask what chemical structures might support Darwinian evolution universally; these structures might be universal biosignatures.
doi:10.1089/ast.2010.0524
PMCID: PMC3005285  PMID: 21162682
Evolution; Life; Life detection; Biosignatures
12.  Q&A: Life, synthetic biology and risk 
BMC Biology  2010;8:77.
doi:10.1186/1741-7007-8-77
PMCID: PMC2885331  PMID: 20594289
13.  Design of a novel molecular beacon: modification of the stem with artificially genetic alphabet 
A molecular beacon that incorporates components of an artificially expanded genetic information system (Aegis) in its stem is shown not to be opened by unwanted stem invasion by adventitious standard DNA; this should improve the “darkness” of the beacon in real-world applications.
doi:10.1039/b811159f
PMCID: PMC2763601  PMID: 18956044
14.  Lessons from comparative physiology: could uric acid represent a physiologic alarm signal gone awry in western society? 
Uric acid has historically been viewed as a purine metabolic waste product excreted by the kidney and gut that is relatively unimportant other than its penchant to crystallize in joints to cause the disease gout. In recent years, however, there has been the realization that uric acid is not biologically inert but may have a wide range of actions, including being both a pro- and anti-oxidant, a neurostimulant, and an inducer of inflammation and activator of the innate immune response. In this paper, we present the hypothesis that uric acid has a key role in the foraging response associated with starvation and fasting. We further suggest that there is a complex interplay between fructose, uric acid and vitamin C, with fructose and uric acid stimulating the foraging response and vitamin C countering this response. Finally, we suggest that the mutations in ascorbate synthesis and uricase that characterized early primate evolution were likely in response to the need to stimulate the foraging “survival” response and might have inadvertently had a role in accelerating the development of bipedal locomotion and intellectual development. Unfortunately, due to marked changes in the diet, resulting in dramatic increases in fructose- and purine-rich foods, these identical genotypic changes may be largely responsible for the epidemic of obesity, diabetes and cardiovascular disease in today’s society.
doi:10.1007/s00360-008-0291-7
PMCID: PMC2684327  PMID: 18649082
Uric acid; Fructose; Foraging; Metabolic syndrome; Obesity; Fasting; Hibernation
15.  The potential and challenges of nanopore sequencing 
Nature biotechnology  2008;26(10):1146-1153.
A nanopore-based device provides single-molecule detection and analytical capabilities that are achieved by electrophoretically driving molecules in solution through a nano-scale pore. The nanopore provides a highly confined space within which single nucleic acid polymers can be analyzed at high throughput by one of a variety of means, and the perfect processivity that can be enforced in a narrow pore ensures that the native order of the nucleobases in a polynucleotide is reflected in the sequence of signals that is detected. Kilobase length polymers (single-stranded genomic DNA or RNA) or small molecules (e.g., nucleosides) can be identified and characterized without amplification or labeling, a unique analytical capability that makes inexpensive, rapid DNA sequencing a possibility. Further research and development to overcome current challenges to nanopore identification of each successive nucleotide in a DNA strand offers the prospect of ‘third generation’ instruments that will sequence a diploid mammalian genome for ~$1,000 in ~24 h.
doi:10.1038/nbt.1495
PMCID: PMC2683588  PMID: 18846088
16.  The Planetary Biology of Ascorbate and Uric acid and their Relationship with the Epidemic of Obesity and Cardiovascular Disease 
Medical hypotheses  2008;71(1):22-31.
Humans have relatively low plasma ascorbate levels and high serum uric acid levels compared to most mammals due to the presence of genetic mutations in L-gulonolactone oxidase and uricase, respectively. We review the major hypotheses for why these mutations may have occurred. In particular, we suggest that both mutations may have provided a survival advantage to early primates by helping maintain blood pressure during periods of dietary change and environmental stress. We further propose that these mutations have the inadvertent disadvantage of increasing our risk for hypertension and cardiovascular disease in today’s society characterized by Western diet and increasing physical inactivity. Finally, we suggest that a “planetary biology” approach in which genetic changes are analyzed in relation to their biologic action and historical context may provide the ideal approach towards understanding the biology of the past, present and future.
doi:10.1016/j.mehy.2008.01.017
PMCID: PMC2495042  PMID: 18331782
17.  Multiplexed Genetic Analysis Using an Expanded Genetic Alphabet 
Clinical chemistry  2004;50(11):2019-2027.
Background
All states require some kind of testing for newborns, but the policies are far from standardized. In some states, newborn screening may include genetic tests for a wide range of targets, but the costs and complexities of the newer genetic tests inhibit expansion of newborn screening. We describe the development and technical evaluation of a multiplex platform that may foster increased newborn genetic screening.
Methods
MultiCode® PLx involves three major steps: PCR, target-specific extension, and liquid chip decoding. Each step is performed in the same reaction vessel, and the test is completed in ~3 h. For site-specific labeling and room-temperature decoding, we use an additional base pair constructed from isoguanosine and isocytidine. We used the method to test for mutations within the cystic fibrosis transmembrane conductance regulator (CFTR) gene. The developed test was performed manually and by automated liquid handling. Initially, 225 samples with a range of genotypes were tested retrospectively with the method. A prospective study used samples from >400 newborns.
Results
In the retrospective study, 99.1% of samples were correctly genotyped with no incorrect calls made. In the prospective study, 95% of the samples were correctly genotyped for all targets, and there were no incorrect calls.
Conclusions
The unique genetic multiplexing platform was successfully able to test for 31 targets within the CFTR gene and provides accurate genotype assignments in a clinical setting.
doi:10.1373/clinchem.2004.034330
PMCID: PMC1592527  PMID: 15319316
18.  Enzymatic incorporation of a third nucleobase pair 
Nucleic Acids Research  2007;35(13):4238-4249.
DNA polymerases are identified that copy a non-standard nucleotide pair joined by a hydrogen bonding pattern different from the patterns joining the dA:T and dG:dC pairs. 6-Amino-5-nitro-3-(1′-β-d-2′-deoxyribofuranosyl)-2(1H)-pyridone (dZ) implements the non-standard ‘small’ donor–donor–acceptor (pyDDA) hydrogen bonding pattern. 2-Amino-8-(1′-β-D-2′-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one (dP) implements the ‘large’ acceptor–acceptor–donor (puAAD) pattern. These nucleobases were designed to present electron density to the minor groove, density hypothesized to help determine specificity for polymerases. Consistent with this hypothesis, both dZTP and dPTP are accepted by many polymerases from both Families A and B. Further, the dZ:dP pair participates in PCR reactions catalyzed by Taq, Vent (exo−) and Deep Vent (exo−) polymerases, with 94.4%, 97.5% and 97.5%, respectively, retention per round. The dZ:dP pair appears to be lost principally via transition to a dC:dG pair. This is consistent with a mechanistic hypothesis that deprotonated dZ (presenting a pyDAA pattern) complements dG (presenting a puADD pattern), while protonated dC (presenting a pyDDA pattern) complements dP (presenting a puAAD pattern). This hypothesis, grounded in the Watson–Crick model for nucleobase pairing, was confirmed by studies of the pH-dependence of mismatching. The dZ:dP pair and these polymerases should be useful in dynamic architectures for sequencing, molecular-, systems- and synthetic-biology.
doi:10.1093/nar/gkm395
PMCID: PMC1934989  PMID: 17576683
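The per-round retention figures quoted in entry 18 compound over a PCR run. The following is a minimal sketch of that arithmetic, assuming retention is constant and independent from round to round (an assumption made here for illustration, not a claim from the abstract). The function name and the 20-round run length are hypothetical.

```python
def retained_fraction(per_round_retention: float, rounds: int) -> float:
    """Fraction of dZ:dP pairs still present after `rounds` rounds,
    assuming the same independent retention probability in every round."""
    if not 0.0 <= per_round_retention <= 1.0:
        raise ValueError("per_round_retention must be between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return per_round_retention ** rounds


# Per-round retention values quoted in the abstract.
REPORTED_RETENTION = {
    "Taq": 0.944,
    "Vent (exo-)": 0.975,
    "Deep Vent (exo-)": 0.975,
}

if __name__ == "__main__":
    rounds = 20  # hypothetical run length, chosen only for illustration
    for polymerase, retention in REPORTED_RETENTION.items():
        print(f"{polymerase}: {retained_fraction(retention, rounds):.3f}")
```

Under these assumptions, roughly 32% of the pairs would survive 20 rounds at 94.4% retention and roughly 60% at 97.5%, which shows how a few percentage points per round separate the polymerases over a full amplification.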
19.  Nucleoside alpha-thiotriphosphates, polymerases and the exonuclease III analysis of oligonucleotides containing phosphorothioate linkages 
Nucleic Acids Research  2007;35(9):3118-3127.
The use of DNA polymerases to incorporate phosphorothioate linkages into DNA, and the use of exonuclease III to determine where those linkages have been incorporated, are re-examined in this work. The results presented here show that exonuclease III degrades single-stranded DNA as a substrate and digests through phosphorothioate linkages having one absolute stereochemistry, assigned (assuming inversion in the polymerase reaction) as S, but not the other absolute stereochemistry. This contrasts with a general view in the literature that exonuclease III favors double-stranded nucleic acid as a substrate and stops completely at phosphorothioate linkages. Furthermore, not all DNA polymerases appear to accept exclusively the (R) stereoisomer of nucleoside alpha-thiotriphosphates [and not the (S) diastereomer], a conclusion inferred two decades ago by examination of five Family-A polymerases and a reverse transcriptase. This suggests that caution is appropriate when extrapolating the detailed behavior of one polymerase from the behaviors of other polymerases. Furthermore, these results provide constraints on how exonuclease III–thiotriphosphate–polymerase combinations can be used to analyze the behavior of the components of a synthetic biology.
doi:10.1093/nar/gkm168
PMCID: PMC1888802  PMID: 17452363
20.  Artificially expanded genetic information system: a new base pair with an alternative hydrogen bonding pattern 
Nucleic Acids Research  2006;34(21):6095-6101.
To support efforts to develop a ‘synthetic biology’ based on an artificially expanded genetic information system (AEGIS), we have developed a route to two components of a non-standard nucleobase pair, the pyrimidine analog 6-amino-5-nitro-3-(1′-β-D-2′-deoxyribofuranosyl)-2(1H)-pyridone (dZ) and its Watson–Crick complement, the purine analog 2-amino-8-(1′-β-D-2′-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one (dP). These implement the pyDDA:puAAD hydrogen bonding pattern (where ‘py’ indicates a pyrimidine analog and ‘pu’ indicates a purine analog, while A and D indicate the hydrogen bonding patterns of acceptor and donor groups presented to the complementary nucleobases, from the major to the minor groove). Also described is the synthesis of the triphosphates and protected phosphoramidites of these two nucleosides. We also describe the use of the protected phosphoramidites to synthesize DNA oligonucleotides containing these AEGIS components, verify the absence of epimerization of dZ in those oligonucleotides, and report some hybridization properties of the dZ:dP nucleobase pair, which is rather strong, and the ability of each to effectively discriminate against mismatches in short duplex DNA.
doi:10.1093/nar/gkl633
PMCID: PMC1635279  PMID: 17074747
21.  Dynamic assembly of primers on nucleic acid templates 
Nucleic Acids Research  2006;34(17):4702-4710.
A strategy is presented that uses dynamic equilibria to assemble in situ composite DNA polymerase primers, having lengths of 14 or 16 nt, from DNA fragments that are 6 or 8 nt in length. In this implementation, the fragments are transiently joined under conditions of dynamic equilibrium by an imine linker, which has a dissociation constant of ∼1 μM. If a polymerase is able to extend the composite, but not the fragments, it is possible to prime the synthesis of a target DNA molecule under conditions where two useful specificities are combined: (i) single nucleotide discrimination that is characteristic of short oligonucleotide duplexes (four to six nucleobase pairs in length), which effectively excludes single mismatches, and (ii) an overall specificity of priming that is characteristic of long (14 to 16mers) oligonucleotides, potentially unique within a genome. We report here the screening of a series of polymerases that combine an ability not to accept short primer fragments with an ability to accept the long composite primer held together by an unnatural imine linkage. Several polymerases were found that achieve this combination, permitting the implementation of the dynamic combinatorial chemical strategy.
doi:10.1093/nar/gkl625
PMCID: PMC1635275  PMID: 16963776
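Entry 21 contrasts short fragments with long composite primers that are "potentially unique within a genome". A back-of-the-envelope sketch of why length matters is given below. It assumes a genome of uniformly random, independent bases, which real genomes are not, and the 3-billion-base genome length is a hypothetical value used only for illustration.

```python
def expected_random_matches(primer_length: int, genome_length: float,
                            strands: int = 2) -> float:
    """Expected number of exact matches to one primer in a random genome.

    Assumes each base is drawn uniformly and independently from four
    letters, and approximates the number of start positions by the
    genome length itself.
    """
    if primer_length < 1:
        raise ValueError("primer_length must be at least 1")
    if genome_length < 0 or strands < 1:
        raise ValueError("genome_length must be >= 0 and strands >= 1")
    return strands * genome_length / 4 ** primer_length


if __name__ == "__main__":
    genome_length = 3_000_000_000  # hypothetical genome size
    for length in (6, 8, 14, 16):  # fragment and composite lengths from the abstract
        matches = expected_random_matches(length, genome_length)
        print(f"{length}-mer: about {matches:.3g} expected random matches")
```

In this simplified model the expected count falls by a factor of four with each added nucleotide, so a 6-mer or 8-mer matches at a very large number of sites while a 16-mer approaches a single expected match.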
22.  Analysis of transitions at two-fold redundant sites in mammalian genomes. Transition redundant approach-to-equilibrium (TREx) distance metrics 
Background
The exchange of nucleotides at synonymous sites in a gene encoding a protein is believed to have little impact on the fitness of a host organism. This should be especially true for synonymous transitions, where a pyrimidine nucleotide is replaced by another pyrimidine, or a purine is replaced by another purine. This suggests that transition redundant exchange (TREx) processes at the third position of conserved two-fold codon systems might offer the best approximation for a neutral molecular clock, serving to examine, within coding regions, theories that require neutrality, determine whether transition rate constants differ within genes in a single lineage, and correlate dates of events recorded in genomes with dates in the geological and paleontological records. To date, TREx analysis of the yeast genome has recognized correlated duplications that established new metabolic strategies in fungi, and supported analyses of functional change in aromatases in pigs. TREx dating has limitations, however. Multiple transitions at synonymous sites may cause equilibration and loss of information. Further, to be useful to correlate events in the genomic record, different genes within a genome must suffer transitions at similar rates.
Results
A formalism to analyze divergence at two-fold redundant codon systems is presented. This formalism exploits two-state approach-to-equilibrium kinetics from chemistry. This formalism captures, in a single equation, the possibility of multiple substitutions at individual sites, avoiding any need to "correct" for these. The formalism also connects specific rate constants for transitions to specific approximations in an underlying evolutionary model, including assumptions that transition rate constants are invariant at different sites, in different genes, in different lineages, and at different times. Therefore, the formalism supports analyses that evaluate these approximations.
Transitions at synonymous sites within two-fold redundant coding systems were examined in the mouse, rat, and human genomes. The key metric (f2), the fraction of those sites that holds the same nucleotide, was measured for putative ortholog pairs. A transition redundant exchange (TREx) distance was calculated from f2 for these pairs. Pyrimidine-pyrimidine transitions at these sites occur approximately 14% faster than purine-purine transitions in various lineages. Transition rate constants were similar in different genes within the same lineages; within a set of orthologs, the f2 distribution is only modestly overdispersed. No correlation between disparity and overdispersion is observed. In rodents, however, evidence was found for greater conservation of TREx sites in genes on the X chromosome, accounting for a small part of the overdispersion.
Conclusion
The TREx metric is useful to analyze the history of transition rate constants within these mammals over the past 100 million years. The TREx metric estimates the extent to which silent nucleotide substitutions accumulate in different genes, on different chromosomes, with different compositions, in different lineages, and at different times.
doi:10.1186/1471-2148-6-25
PMCID: PMC1435776  PMID: 16545144
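Entry 22 describes converting f2, the fraction of two-fold redundant sites holding the same nucleotide, into a distance using two-state approach-to-equilibrium kinetics. The abstract does not give the equation, so what follows is only a sketch of the generic first-order form, f2 = f_eq + (1 - f_eq)·exp(-kt). The equilibrium fraction of 0.5 is an assumption (equal forward and reverse rate constants), and the function names are hypothetical rather than the authors' own.

```python
import math

F_EQ_ASSUMED = 0.5  # assumed equilibrium fraction; not taken from the abstract


def f2_from_distance(kt: float, f_eq: float = F_EQ_ASSUMED) -> float:
    """Fraction of conserved sites after a dimensionless divergence kt,
    for a generic two-state approach to equilibrium."""
    if kt < 0:
        raise ValueError("kt must be non-negative")
    return f_eq + (1.0 - f_eq) * math.exp(-kt)


def trex_distance(f2: float, f_eq: float = F_EQ_ASSUMED) -> float:
    """Invert the relation above: kt = -ln((f2 - f_eq) / (1 - f_eq)).

    f2 at or below f_eq carries no recoverable information in this
    model, because the sites have fully equilibrated.
    """
    if not f_eq < f2 <= 1.0:
        raise ValueError("f2 must be greater than f_eq and at most 1")
    return -math.log((f2 - f_eq) / (1.0 - f_eq))


if __name__ == "__main__":
    for f2 in (0.95, 0.85, 0.75, 0.60):
        print(f"f2 = {f2:.2f} -> kt = {trex_distance(f2):.3f}")
```

Because the exponential form already accounts for repeated changes at the same site, no separate multiple-hit correction is applied, which mirrors the point made in the abstract. The distance diverges as f2 approaches the equilibrium value, reflecting the loss of information that the abstract lists as a limitation.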
23.  Application of DETECTER, an evolutionary genomic tool to analyze genetic variation, to the cystic fibrosis gene family 
BMC Genomics  2006;7:44.
Background
The medical community requires computational tools that distinguish missense genetic differences having phenotypic impact within the vast number of sense mutations that do not. Tools that do this will become increasingly important for those seeking to use human genome sequence data to predict disease, make prognoses, and customize therapy to individual patients.
Results
An approach, termed DETECTER, is proposed to identify sites in a protein sequence where amino acid replacements are likely to have a significant effect on phenotype, including causing genetic disease. This approach uses a model-dependent tool to estimate the normalized replacement rate at individual sites in a protein sequence, based on a history of those sites extracted from an evolutionary analysis of the corresponding protein family. This tool identifies sites that have higher-than-average, average, or lower-than-average rates of change in the lineage leading to the sequence in the population of interest. The rates are then combined with sequence data to determine the likelihoods that particular amino acids were present at individual sites in the evolutionary history of the gene family. These likelihoods are used to predict whether any specific amino acid replacements, if introduced at the site in a modern human population, would have a significant impact on fitness. The DETECTER tool is used to analyze the cystic fibrosis transmembrane conductance regulator (CFTR) gene family.
Conclusion
In this system, DETECTER retrodicts amino acid replacements associated with the cystic fibrosis disease with greater accuracy than alternative approaches. While this result validates this approach for this particular family of proteins only, the approach may be applicable to the analysis of polymorphisms generally, including SNPs in a human population.
doi:10.1186/1471-2164-7-44
PMCID: PMC1420294  PMID: 16522197
24.  Integrating protein structures and precomputed genealogies in the Magnum database: Examples with cellular retinoid binding proteins 
BMC Bioinformatics  2006;7:89.
Background
When accurate models for the divergent evolution of protein sequences are integrated with complementary biological information, such as folded protein structures, analyses of the combined data often lead to new hypotheses about molecular physiology. This represents an excellent example of how bioinformatics can be used to guide experimental research. However, progress in this direction has been slowed by the lack of a publicly available resource suitable for general use.
Results
The precomputed Magnum database offers a solution to this problem for ca. 1,800 full-length protein families with at least one crystal structure. The Magnum deliverables include 1) multiple sequence alignments, 2) mapping of alignment sites to crystal structure sites, 3) phylogenetic trees, 4) inferred ancestral sequences at internal tree nodes, and 5) amino acid replacements along tree branches. Comprehensive evaluations revealed that the automated procedures used to construct Magnum produced accurate models of how proteins divergently evolve, or genealogies, and correctly integrated these with the structural data. To demonstrate Magnum's capabilities, we asked for amino acid replacements requiring three nucleotide substitutions, located at internal protein structure sites, and occurring on short phylogenetic tree branches. In the cellular retinoid binding protein family a site that potentially modulates ligand binding affinity was discovered. Recruitment of cellular retinol binding protein to function as a lens crystallin in the diurnal gecko afforded another opportunity to showcase the predictive value of a browsable database containing branch replacement patterns integrated with protein structures.
Conclusion
We integrated two areas of protein science, evolution and structure, on a large scale and created a precomputed database, known as Magnum, which is the first freely available resource of its kind. Magnum provides evolutionary and structural bioinformatics resources that are useful for identifying experimentally testable hypotheses about the molecular basis of protein behaviors and functions, as illustrated with the examples from the cellular retinoid binding proteins.
doi:10.1186/1471-2105-7-89
PMCID: PMC1475641  PMID: 16504077
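The example query in this abstract (replacements requiring three nucleotide substitutions, located at internal structure sites, on short tree branches) can be illustrated with a minimal Python sketch. It does not use Magnum's actual schema or interface: the `Replacement` record, its field names, and the 0.05 branch-length cutoff are hypothetical, and the standard genetic code is assumed.

```python
from dataclasses import dataclass
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)


def min_nt_subs(aa_from: str, aa_to: str) -> int:
    """Smallest number of nucleotide differences between any codon pair."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in CODONS[aa_from]
        for c2 in CODONS[aa_to]
    )


@dataclass
class Replacement:
    """Hypothetical record for one replacement on one tree branch."""
    site: int
    ancestral: str        # one-letter amino acid code
    derived: str          # one-letter amino acid code
    buried: bool          # True if the site is internal in the structure
    branch_length: float  # assumed units: replacements per site


def interesting(records, max_branch=0.05):
    """Keep three-substitution replacements at buried sites on short branches."""
    return [
        r for r in records
        if r.buried
        and r.branch_length <= max_branch  # assumed cutoff for "short"
        and min_nt_subs(r.ancestral, r.derived) == 3
    ]


# Made-up records, for illustration only.
records = [
    Replacement(10, "M", "Y", True, 0.02),   # 3 substitutions, buried, short
    Replacement(11, "M", "I", True, 0.02),   # only 1 substitution
    Replacement(12, "W", "N", False, 0.02),  # 3 substitutions, but surface site
    Replacement(13, "W", "N", True, 0.40),   # 3 substitutions, but long branch
]
print([r.site for r in interesting(records)])
```

Taking the minimum over all codon pairs is a conservative choice: a replacement is counted as a three-substitution event only if no pair of synonymous codons could achieve it in fewer steps.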
25.  The use of thymidine analogs to improve the replication of an extra DNA base pair: a synthetic biological system 
Nucleic Acids Research  2005;33(17):5640-5646.
Synthetic biology based on a six-letter genetic alphabet that includes the two non-standard nucleobases isoguanine (isoG) and isocytosine (isoC), as well as the standard A, T, G and C, is known to suffer as a consequence of a minor tautomeric form of isoguanine that pairs with thymine, and therefore leads to infidelity during repeated cycles of the PCR. Reported here is a solution to this problem. The solution replaces thymidine triphosphate by 2-thiothymidine triphosphate (2-thioTTP). Because of the bulk and hydrogen bonding properties of the thione unit in 2-thioT, 2-thioT does not mispair effectively with the minor tautomer of isoG. To test whether this might allow PCR amplification of a six-letter artificially expanded genetic information system, we examined the relative rates of misincorporation of 2-thioTTP and TTP opposite isoG using affinity electrophoresis. The concentrations of isoCTP and 2-thioTTP were optimized to best support PCR amplification using thermostable polymerases of a six-letter alphabet that includes the isoC–isoG pair. The fidelity-per-round of amplification was found to be ∼98% in trial PCRs with this six-letter DNA alphabet. The analogous PCR employing TTP had a fidelity-per-round of only ∼93%. Thus, the A, 2-thioT, G, C, isoC, isoG alphabet is an artificial genetic system capable of Darwinian evolution.
doi:10.1093/nar/gki873
PMCID: PMC1236980  PMID: 16192575
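To see why the difference between ∼98% and ∼93% fidelity-per-round matters, the short Python sketch below compounds each figure over several rounds. It assumes that losses of the non-standard pair are independent from round to round, so the fraction of product retaining the pair after n rounds is the per-round fidelity raised to the power n. That model and the function name `fraction_retained` are illustrative assumptions, not taken from the paper.

```python
def fraction_retained(fidelity_per_round: float, rounds: int) -> float:
    """Fraction of product still carrying the non-standard pair,
    assuming independent loss at each round (illustrative model)."""
    if not 0.0 <= fidelity_per_round <= 1.0:
        raise ValueError("fidelity_per_round must be between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return fidelity_per_round ** rounds


# Compare the two per-round fidelities reported in the abstract.
for rounds in (1, 5, 10, 20):
    with_thio = fraction_retained(0.98, rounds)  # 2-thioTTP
    with_ttp = fraction_retained(0.93, rounds)   # TTP
    print(f"{rounds:2d} rounds: 2-thioTTP {with_thio:.3f}, TTP {with_ttp:.3f}")
```

Under this assumption, roughly 82% of the product retains the pair after 10 rounds at 98% fidelity, against roughly 48% at 93%.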
