1.  Evolution of extreme resistance to ionizing radiation via genetic adaptation of DNA repair 
eLife  2014;3:e01322.
By directed evolution in the laboratory, we previously generated populations of Escherichia coli that exhibit a complex new phenotype, extreme resistance to ionizing radiation (IR). The molecular basis of this extremophile phenotype, involving strain isolates with a 3-4 order of magnitude increase in IR resistance at 3000 Gy, is now addressed. Of 69 mutations identified in one of our most highly adapted isolates, functional experiments demonstrate that the IR resistance phenotype is almost entirely accounted for by only three of these nucleotide changes, in the DNA metabolism genes recA, dnaB, and yfjK. Four additional genetic changes make small but measurable contributions. Whereas multiple contributions to IR resistance are evident in this study, our results highlight a particular adaptation mechanism not adequately considered in studies to date: Genetic innovations involving pre-existing DNA repair functions can play a predominant role in the acquisition of an IR resistance phenotype.
eLife digest
X-rays and other forms of ionizing radiation can damage DNA and proteins inside cells. The radiation interacts with aqueous solutions to produce reactive forms of oxygen, which then cause the damage. A range of mechanisms exist to moderate and/or repair this damage, with certain species being able to tolerate extraordinary levels of radiation. The bacterium D. radiodurans, for example, can survive radiation levels that are over 1000 times higher than the levels that can kill human cells.
The molecular basis of high-level resistance to ionizing radiation is not well understood, and several mechanisms have been proposed. Recent work has focused on passive mechanisms that are based on changes in cellular levels of certain small molecules that prevent damage by reactive forms of oxygen molecules.
Now, based on experiments on E. coli, Byrne et al. demonstrate that active mechanisms, involving adaptations in the cellular DNA repair systems, can bring about dramatic increases in radiation resistance. The experiments were performed on populations of E. coli cells that had been subjected to an evolutionary selection for extremely high resistance to ionizing radiation. This involved exposing the E. coli cells to ionizing radiation that killed most of the population, and then growing up the survivors. Many repetitions of this process led to a population of cells with a resistance that was comparable to that of the bacterium D. radiodurans. The same evolution experiment was carried out four times, generating four separate populations of bacteria that were resistant to ionizing radiation.
Byrne et al. sequenced the genomes of the E. coli after 20, 40 or 50 rounds of the selection process, and compared mutations found in the four separate evolved populations. This showed that nine genes were particularly prone to mutations. Together, these genes had roles in repairing and copying DNA sequences, in decreasing damage caused by reactive forms of oxygen, and in manufacturing the molecular wall that shields cells.
To assess the importance of the mutations in the nine genes, Byrne et al. took Founder cells from the initial population of E. coli cells (which were not resistant to ionizing radiation) and introduced the very same mutations, one at a time. Then the mutations that had the largest positive effects on resistance to ionizing radiation were combined. Introducing particular mutations into three DNA repair genes resulted in the highest aggregate levels of resistance. Finally, evolved E. coli cells that were already resistant were made more sensitive to radiation by repairing the same individual mutations. Again, the biggest change was observed with the DNA repair genes. Indeed, repairing the mutations in just the three DNA repair genes completely removed the radiation resistance.
The next step is to determine how the properties of the mutated proteins change, and how those changes lead to radiation resistance. Also, there are clues in the work that suggest the presence of additional ways for cells to become radiation resistant, and these remain to be explored.
PMCID: PMC3939492  PMID: 24596148
DNA repair; ionizing radiation; evolution; extremophile; mutation; E. coli
2.  Changes in the Flexor Digitorum Profundus Tendon Geometry in the Carpal Tunnel Due to Force Production and Posture of Metacarpophalangeal Joint of the Index Finger: an MRI Study 
Carpal tunnel syndrome is a disorder caused by increased pressure in the carpal tunnel associated with repetitive, stereotypical finger actions. Little is known about in vivo geometrical changes in the carpal tunnel caused by motion at the finger joints and exerting a fingertip force.
The hands and forearms of five subjects were scanned using a 3.0T magnetic resonance imaging scanner. The metacarpophalangeal joint of the index finger was placed in flexion, neutral, and extension. For each joint posture, subjects either produced no active force (passive condition) or exerted a flexion force to resist a load (~4.0 N) at the fingertip (active condition). Changes in the radii of curvature, position and transverse plane area of the flexor digitorum profundus tendons at the carpal tunnel level were measured.
The radius of curvature of the flexor digitorum profundus tendons, at the carpal tunnel level, was significantly affected by posture of the index finger metacarpophalangeal joint (p<0.05), and the radii were significantly different between fingers (p<0.05). Actively producing force caused a significant shift (p<0.05) in the flexor digitorum profundus tendons in the ventral (palmar) direction. No significant change in the area of an ellipse containing the flexor digitorum profundus tendons was observed between conditions.
The results show that relatively small changes in the posture and force production of a single finger can lead to significant changes in the geometry of all the flexor digitorum profundus (FDP) tendons in the carpal tunnel. Additionally, voluntary force production at the fingertip increases the moment arm of the FDP tendons about the wrist joint.
PMCID: PMC3609902  PMID: 23219762
MRI; carpal tunnel syndrome; moment arm; flexor digitorum profundus
3.  Population Level Analysis of Evolved Mutations Underlying Improvements in Plant Hemicellulose and Cellulose Fermentation by Clostridium phytofermentans 
PLoS ONE  2014;9(1):e86731.
The complexity of plant cell walls creates many challenges for microbial decomposition. Clostridium phytofermentans, an anaerobic bacterium isolated from forest soil, directly breaks down and utilizes many plant cell wall carbohydrates. The objective of this research is to understand constraints on rates of plant decomposition by Clostridium phytofermentans and identify molecular mechanisms that may overcome these limitations.
Experimental evolution via repeated serial transfers during exponential growth was used to select for C. phytofermentans genotypes that grow more rapidly on cellobiose, cellulose and xylan. To identify the underlying mutations, an average of 13,600,000 paired-end reads were generated per population, resulting in ∼300-fold coverage of each site in the genome. Mutations with allele frequencies of 5% or greater could be identified with statistical confidence. Many mutations are in carbohydrate-related genes, including the promoter regions of glycoside hydrolases and amino acid substitutions in ABC transport proteins involved in carbohydrate uptake, signal transduction sensors that detect specific carbohydrates, proteins that affect the export of extracellular enzymes, and regulators of unknown specificity. Structural modeling of the ABC transporter complex proteins suggests that mutations in these genes may alter the recognition of carbohydrates by substrate-binding proteins and communication between the intercellular face of the transmembrane and the ATPase binding proteins.
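The 5% allele-frequency threshold at roughly 300-fold coverage can be illustrated with a small calculation. The sketch below is not the authors' variant-calling procedure. It assumes a simple binomial model and a hypothetical per-base sequencing error rate of 1% (neither is stated in the abstract), and asks how likely sequencing error alone would be to produce as many variant reads as a true 5% allele is expected to yield.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


COVERAGE = 300      # approximate per-site coverage reported in the abstract
MIN_FREQ = 0.05     # lowest allele frequency reported as detectable
ERROR_RATE = 0.01   # ASSUMPTION: hypothetical per-base error rate, not from the source

# Number of reads expected to carry a variant present at the 5% threshold.
expected_variant_reads = round(COVERAGE * MIN_FREQ)

# Chance that error alone yields at least that many variant reads at one site.
p_error_only = binomial_tail(COVERAGE, expected_variant_reads, ERROR_RATE)

print(f"expected variant reads at threshold: {expected_variant_reads}")
print(f"P(error alone reaches threshold): {p_error_only:.2e}")
```

Under these assumed values the error-only probability is very small, which is consistent with the statement that alleles at 5% or above can be called with confidence. A higher assumed error rate or lower coverage would weaken that margin.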
Experimental evolution was effective in identifying molecular constraints on the rate of hemicellulose and cellulose fermentation and selected for putative gain of function mutations that do not typically appear in traditional molecular genetic screens. The results reveal new strategies for evolving and engineering microorganisms for faster growth on plant carbohydrates.
PMCID: PMC3899296  PMID: 24466216
4.  Effects of the Index Finger Position and Force Production on the Flexor Digitorum Superficialis Moment Arms at the Metacarpophalangeal Joints: a Magnetic Resonance Imaging Study 
The purpose of this study was to use magnetic resonance imaging to measure the moment arm of the flexor digitorum superficialis tendon about the metacarpophalangeal joint of the index, middle, ring, and little fingers when the position and force production level of the index finger was altered. A secondary goal was to create regression models using anthropometric data to predict moment arms of the flexor digitorum superficialis about the metacarpophalangeal joint of each finger.
The hands of subjects were scanned using a 3.0T magnetic resonance imaging scanner. The metacarpophalangeal joint of the index finger was placed in flexion, neutral, and extension. For each joint configuration, subjects either produced no active force (passive condition) or exerted a flexion force to resist a load at the fingertip (active condition).
The following was found: (1) The moment arm of the flexor digitorum superficialis at the metacarpophalangeal joint of the index finger (a) increased with joint flexion and stayed unchanged with finger extension; and (b) decreased with increasing force at the neutral and extended finger postures and did not change at the flexed posture. (2) The moment arms of the flexor digitorum superficialis tendon of the middle, ring, and little fingers (a) did not change when the index metacarpophalangeal joint position changed (p > 0.20); and (b) for the middle and little fingers, increased when the index finger actively produced force at the flexed metacarpophalangeal joint posture. (3) The moment arms showed a high correlation with anthropometric measurements.
Moment arms of the flexor digitorum superficialis change due to both changes in joint angle and muscle activation; they scale with various anthropometric measures.
PMCID: PMC3328664  PMID: 22192658
MRI; moment arm; flexor digitorum superficialis; finger interaction
5.  Revised Sequence and Annotation of the Rhodobacter sphaeroides 2.4.1 Genome 
Journal of Bacteriology  2012;194(24):7016-7017.
The DNA sequences of chromosomes I and II of Rhodobacter sphaeroides strain 2.4.1 have been revised, and the annotation of the entire genomic sequence, including both chromosomes and the five plasmids, has been updated. Errors in the originally published sequence have been corrected, and ∼11% of the coding regions in the original sequence have been affected by the revised annotation.
PMCID: PMC3510577  PMID: 23209255
6.  À la Recherche du Temps Perdu: extracting temporal relations from medical text in the 2012 i2b2 NLP challenge 
An analysis of the timing of events is critical for a deeper understanding of the course of events within a patient record. The 2012 i2b2 NLP challenge focused on the extraction of temporal relationships between concepts within textual hospital discharge summaries.
Materials and methods
The team from the National Research Council Canada (NRC) submitted three system runs to the second track of the challenge: typifying the time-relationship between pre-annotated entities. The NRC system was designed around four specialist modules containing statistical machine learning classifiers. Each specialist targeted distinct sets of relationships: local relationships, ‘sectime’-type relationships, non-local overlap-type relationships, and non-local causal relationships.
The best NRC submission achieved a precision of 0.7499, a recall of 0.6431, and an F1 score of 0.6924, resulting in a statistical tie for first place. Post hoc improvements led to a precision of 0.7537, a recall of 0.6455, and an F1 score of 0.6954, giving the highest scores reported on this task to date.
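The reported F1 scores follow directly from the reported precision and recall, since F1 is their harmonic mean. The short sketch below recomputes them as a consistency check. The function name is illustrative and is not part of the NRC system.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values as reported in the abstract.
submitted = f1_score(0.7499, 0.6431)   # best official submission
post_hoc = f1_score(0.7537, 0.6455)    # after post hoc improvements

print(f"submitted F1: {submitted:.4f}")
print(f"post hoc F1:  {post_hoc:.4f}")
```

Rounded to four decimals, both values match the F1 scores quoted above (0.6924 and 0.6954).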
Discussion and conclusions
Methods for general relation extraction extended well to temporal relations, and gave top-ranked state-of-the-art results. Careful ordering of predictions within result sets proved critical to this success.
PMCID: PMC3756270  PMID: 23523875
information extraction; temporal reasoning; natural language processing; relation extraction; clinical text
7.  Comparative Genome Structure, Secondary Metabolite, and Effector Coding Capacity across Cochliobolus Pathogens 
PLoS Genetics  2013;9(1):e1003233.
The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP–encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveals a strong correlation with a role in virulence.
Author Summary
The filamentous ascomycete genus Cochliobolus includes highly aggressive necrotrophic and hemibiotrophic pathogens with particular specificity to their host plants, often associated with production of host selective toxins (HST) that allow necrotrophs to trigger host cell death. Hemibiotrophs must keep their hosts alive during initial infection stages and rely on subverting host defenses by secreting small protein effectors. Many Cochliobolus species have emerged rapidly as devastating pathogens due to HSTs. The genomes of Cochliobolus and related pathogens that differ in host preference, host specificity, and virulence strategies have been sequenced. Our comparative results, at the whole-genome level, and with a spotlight on core genes for secondary metabolism and small secreted proteins, touch on how pathogens develop and hone these tools, according to host or lifestyle. We suggest that, while necrotrophs and hemibiotrophs employ fundamentally contrasting mechanisms of promoting disease, the tools they utilize (HSTs and protein effectors) overlap. The suites of secondary metabolite and SSP genes that each possesses reflect astounding diversity among species, hinting that gene products, particularly those associated with unique genomic regions, are candidates for pathogenic lifestyle differences. Manipulations of strain-unique secondary metabolite genes associated with host-specific virulence provide tangible examples.
PMCID: PMC3554632  PMID: 23357949
8.  Comparative Genomics of a Plant-Pathogenic Fungus, Pyrenophora tritici-repentis, Reveals Transduplication and the Impact of Repeat Elements on Pathogenicity and Population Divergence 
G3: Genes|Genomes|Genetics  2013;3(1):41-63.
Pyrenophora tritici-repentis is a necrotrophic fungus that causes tan spot of wheat, a disease whose contribution to crop loss has increased significantly during the last few decades. Pathogenicity by this fungus is attributed to the production of host-selective toxins (HST), which are recognized by their host in a genotype-specific manner. To better understand the mechanisms that have led to the increase in disease incidence related to this pathogen, we sequenced the genomes of three P. tritici-repentis isolates. A pathogenic isolate that produces two known HSTs was used to assemble a reference nuclear genome of approximately 40 Mb composed of 11 chromosomes that encode 12,141 predicted genes. Comparison of the reference genome with those of a pathogenic isolate that produces a third HST, and a nonpathogenic isolate, showed the nonpathogen genome to be more diverged than those of the two pathogens. Examination of gene-coding regions has provided candidate pathogen-specific proteins and revealed gene families that may play a role in a necrotrophic lifestyle. Analysis of transposable elements suggests that their presence in the genome of pathogenic isolates contributes to the creation of novel genes, effector diversification, possible horizontal gene transfer events, identified copy number variation, and the first example of transduplication by DNA transposable elements in fungi. Overall, comparative analysis of these genomes provides evidence that pathogenicity in this species arose through an influx of transposable elements, which created a genetically flexible landscape that can easily respond to environmental changes.
PMCID: PMC3538342  PMID: 23316438
wheat (Triticum aestivum); copy number variation; histone H3 transduplication; ToxA; ToxB; anastomosis
9.  Phylogenetics links monster larva to deep-sea shrimp 
Ecology and Evolution  2012;2(10):2367-2373.
Mid-water plankton collections commonly include bizarre and mysterious developmental stages that differ conspicuously from their adult counterparts in morphology and habitat. Unaware of the existence of planktonic larval stages, early zoologists often misidentified these unique morphologies as independent adult lineages. Many such mistakes have since been corrected by collecting larvae, raising them in the lab, and identifying the adult forms. However, challenges arise when the larva is remarkably rare in nature and relatively inaccessible due to its changing habitats over the course of ontogeny. The mid-water marine species Cerataspis monstrosa (Gray 1828) is an armored crustacean larva whose adult identity has remained a mystery for over 180 years. Our phylogenetic analyses, based in part on recent collections from the Gulf of Mexico, provide definitive evidence that the rare, yet broadly distributed larva, C. monstrosa, is an early developmental stage of the globally distributed deepwater aristeid shrimp, Plesiopenaeus armatus. Divergence estimates and phylogenetic relationships across five genes confirm the larva and adult are the same species. Our work demonstrates the diagnostic power of molecular systematics in instances where larval rearing seldom succeeds and morphology and habitat are not indicative of identity. Larval–adult linkages not only aid in our understanding of biodiversity, they provide insights into the life history, distribution, and ecology of an organism.
PMCID: PMC3492765  PMID: 23145324
Cerataspis monstrosa; Decapoda; DNA barcoding; larval–adult linkage; phylogenetics
10.  Fine Mapping of the Bsr1 Barley Stripe Mosaic Virus Resistance Gene in the Model Grass Brachypodium distachyon 
PLoS ONE  2012;7(6):e38333.
The ND18 strain of Barley stripe mosaic virus (BSMV) infects several lines of Brachypodium distachyon, a recently developed model system for genomics research in cereals. Among the inbred lines tested, Bd3-1 is highly resistant at 20 to 25°C, whereas Bd21 is susceptible and infection results in an intense mosaic phenotype accompanied by high levels of replicating virus. We generated an F6∶7 recombinant inbred line (RIL) population from a cross between Bd3-1 and Bd21 and used the RILs, and an F2 population of a second Bd21 × Bd3-1 cross to evaluate the inheritance of resistance. The results indicate that resistance segregates as expected for a single dominant gene, which we have designated Barley stripe mosaic virus resistance 1 (Bsr1). We constructed a genetic linkage map of the RIL population using SNP markers to map this gene to within 705 Kb of the distal end of the top of chromosome 3. Additional CAPS and Indel markers were used to fine map Bsr1 to a 23 Kb interval containing five putative genes. Our study demonstrates the power of using RILs to rapidly map the genetic determinants of BSMV resistance in Brachypodium. Moreover, the RILs and their associated genetic map, when combined with the complete genomic sequence of Brachypodium, provide new resources for genetic analyses of many other traits.
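The claim that resistance "segregates as expected for a single dominant gene" is typically assessed with a chi-square goodness-of-fit test against a 3:1 resistant-to-susceptible ratio in an F2 population. The sketch below shows that test in general form. The plant counts are hypothetical, since the abstract reports no counts, and this is not claimed to be the authors' exact analysis.

```python
from math import erfc, sqrt
from typing import Sequence


def chi_square_gof(observed: Sequence[int], expected_ratio: Sequence[float]) -> float:
    """Chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_1df(chi2: float) -> float:
    """Upper-tail p-value for a chi-square statistic with one degree of freedom."""
    return erfc(sqrt(chi2 / 2))


# HYPOTHETICAL F2 counts (resistant, susceptible); not taken from the paper.
f2_counts = (152, 48)

chi2 = chi_square_gof(f2_counts, (3, 1))   # single dominant gene predicts 3:1
p = p_value_1df(chi2)

print(f"chi-square = {chi2:.3f}, p = {p:.3f}")
```

A p-value above the chosen significance level (commonly 0.05) means the counts do not differ detectably from 3:1. For highly inbred RILs, where few heterozygotes remain, the same function can be called with an expected ratio near 1:1.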
PMCID: PMC3366947  PMID: 22675544
11.  Coordination of Contact Forces During Multifinger Static Prehension 
This study investigated the effects of modifying contact finger forces in one direction—normal or tangential—on the entire set of the contact forces, while statically holding an object. Subjects grasped a handle instrumented with finger force-moment sensors, maintained it at rest in the air, and then slowly: (1) increased the grasping force, (2) tried to spread fingers apart, and (3) tried to squeeze fingers together. Analysis was mostly performed at the virtual finger (VF) level (the VF is an imaginable finger that generates the same force and moment as the four fingers combined). For all three tasks there were statistically significant changes in the VF normal and tangential forces. For finger spreading/squeezing the tangential force neutral point was located between the index and middle fingers. We conclude that the internal forces are regulated as a whole, including adjustments in both normal and tangential force, instead of only a subset of forces (normal or tangential). The effects of such factors as EFFORT and TORQUE were additive; their interaction was not statistically significant, thus supporting the principle of superposition in human prehension.
PMCID: PMC3235002  PMID: 21576716
prehension; grasping; motor control; occupational therapy
12.  Complete genome sequence of Paenibacillus sp. strain JDR-2 
Paenibacillus sp. strain JDR-2, an aggressively xylanolytic bacterium isolated from sweetgum (Liquidambar styraciflua) wood, is able to efficiently depolymerize, assimilate and metabolize 4-O-methylglucuronoxylan, the predominant structural component of hardwood hemicelluloses. A basis for this capability was first supported by the identification of genes and characterization of encoded enzymes and has been further defined by the sequencing and annotation of the complete genome, which we describe. In addition to genes implicated in the utilization of β-1,4-xylan, genes have also been identified for the utilization of other hemicellulosic polysaccharides. The genome of Paenibacillus sp. JDR-2 contains 7,184,930 bp in a single replicon with 6,288 protein-coding and 122 RNA genes. Uniquely prominent are 874 genes encoding proteins involved in carbohydrate transport and metabolism. The prevalence and organization of these genes support a metabolic potential for bioprocessing of hemicellulose fractions derived from lignocellulosic resources.
PMCID: PMC3368403  PMID: 22675593
aerobic; mesophile; Gram-positive; Paenibacillus; xylanolytic; xylan
13.  Binding and neutralization of vascular endothelial growth factor (VEGF) and related ligands by VEGF Trap, ranibizumab and bevacizumab 
Angiogenesis  2012;15(2):171-185.
Pharmacological inhibition of VEGF-A has proven to be effective in inhibiting angiogenesis and vascular leak associated with cancers and various eye diseases. However, little information is currently available on the binding kinetics and relative biological activity of various VEGF inhibitors. Therefore, we have evaluated the binding kinetics of two anti-VEGF antibodies, ranibizumab and bevacizumab, and VEGF Trap (also known as aflibercept), a novel type of soluble decoy receptor, with substantially higher affinity than conventional soluble VEGF receptors. VEGF Trap bound to all isoforms of human VEGF-A tested with subpicomolar affinity. Ranibizumab and bevacizumab also bound human VEGF-A, but with markedly lower affinity. The association rate for VEGF Trap binding to VEGF-A was orders of magnitude faster than that measured for bevacizumab and ranibizumab. Similarly, in cell-based bioassays, VEGF Trap inhibited the activation of VEGFR1 and VEGFR2, as well as VEGF-A induced calcium mobilization and migration in human endothelial cells more potently than ranibizumab or bevacizumab. Only VEGF Trap bound human PlGF and VEGF-B, and inhibited VEGFR1 activation and HUVEC migration induced by PlGF. These data differentiate VEGF Trap from ranibizumab and bevacizumab in terms of its markedly higher affinity for VEGF-A, as well as its ability to bind VEGF-B and PlGF.
Electronic supplementary material
The online version of this article (doi:10.1007/s10456-011-9249-6) contains supplementary material, which is available to authorized users.
PMCID: PMC3338918  PMID: 22302382
VEGF receptor; Aflibercept; Affinity; Age-related macular degeneration; Placental growth factor; Biomedicine; Cardiology; Biomedicine general; Ophthalmology; Cancer Research; Cell Biology; Oncology
16.  Rediscovery by Whole Genome Sequencing: Classical Mutations and Genome Polymorphisms in Neurospora crassa 
G3: Genes|Genomes|Genetics  2011;1(4):303-316.
Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. From thirteen to 137 nonsense mutations are present in each strain and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.
PMCID: PMC3276140  PMID: 22384341
single nucleotide polymorphism; SNP; indel; comparative genomics; classical mutant
17.  Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010 
As clinical text mining continues to mature, its potential as an enabling technology for innovations in patient care and clinical research is becoming a reality. A critical part of that process is rigid benchmark testing of natural language processing methods on realistic clinical narrative. In this paper, the authors describe the design and performance of three state-of-the-art text-mining applications from the National Research Council of Canada on evaluations within the 2010 i2b2 challenge.
The three systems perform three key steps in clinical information extraction: (1) extraction of medical problems, tests, and treatments, from discharge summaries and progress notes; (2) classification of assertions made on the medical problems; (3) classification of relations between medical concepts. Machine learning systems performed these tasks using large-dimensional bags of features, as derived from both the text itself and from external sources: UMLS, cTAKES, and Medline.
Performance was measured per subtask, using micro-averaged F-scores, as calculated by comparing system annotations with ground-truth annotations on a test set.
The systems ranked high among all submitted systems in the competition, with the following F-scores: concept extraction 0.8523 (ranked first); assertion detection 0.9362 (ranked first); relationship detection 0.7313 (ranked second).
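Micro-averaging pools true positives, false positives, and false negatives across all classes before computing a single F-score, so frequent classes weigh more than rare ones. The sketch below illustrates the calculation. The class names echo the concept types named above, but the counts are hypothetical and the function is not taken from the i2b2 evaluation scripts.

```python
from typing import Dict, Tuple

# Maps a class label to (true positives, false positives, false negatives).
Counts = Dict[str, Tuple[int, int, int]]


def micro_f1(counts: Counts) -> float:
    """Micro-averaged F1: pool counts over classes, then compute one F-score."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# HYPOTHETICAL per-class counts, for illustration only.
toy_counts: Counts = {
    "problem": (80, 10, 20),
    "test": (40, 10, 5),
    "treatment": (60, 5, 15),
}

score = micro_f1(toy_counts)
print(f"micro-averaged F1 = {score:.4f}")
```

Macro-averaging would instead compute an F-score per class and take the unweighted mean, which gives rare classes the same weight as common ones.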
For all tasks, we found that the introduction of a wide range of features was crucial to success. Importantly, our choice of machine learning algorithms allowed us to be versatile in our feature design, and to introduce a large number of features without overfitting and without encountering computing-resource bottlenecks.
PMCID: PMC3168309  PMID: 21565856
natural language processing; semantics; classification/*methods; computerized medical records systems; patient discharge/*statistics & numerical data; text mining; concept detection; relation extraction; document coding; machine learning; modeling physiologic and disease processes; linking the genotype and phenotype; identifying genome and protein structure and function; visualization of data and knowledge
18.  ExaCT: automatic extraction of clinical trial characteristics from journal publications 
Clinical trials are one of the most important sources of evidence for guiding evidence-based practice and the design of new trials. However, most of this information is available only in free text - e.g., in journal publications - which is labour intensive to process for systematic reviews, meta-analyses, and other evidence synthesis studies. This paper presents an automatic information extraction system, called ExaCT, that assists users with locating and extracting key trial characteristics (e.g., eligibility criteria, sample size, drug dosage, primary outcomes) from full-text journal articles reporting on randomized controlled trials (RCTs).
ExaCT consists of two parts: an information extraction (IE) engine that searches the article for text fragments that best describe the trial characteristics, and a web browser-based user interface that allows human reviewers to assess and modify the suggested selections. The IE engine uses a statistical text classifier to locate those sentences that have the highest probability of describing a trial characteristic. Then, the IE engine's second stage applies simple rules to these sentences to extract text fragments containing the target answer. The same approach is used for all 21 trial characteristics selected for this study.
We evaluated ExaCT using 50 previously unseen articles describing RCTs. The text classifier (first stage) was able to recover 88% of relevant sentences among its top five candidates (top5 recall) with the topmost candidate being relevant in 80% of cases (top1 precision). Precision and recall of the extraction rules (second stage) were 93% and 91%, respectively. Together, the two stages of the extraction engine were able to provide (partially) correct solutions in 992 out of 1050 test tasks (94%), with a majority of these (696) representing fully correct and complete answers.
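The task counts in this evaluation can be reproduced from figures given in the abstract. The sketch below assumes that each of the 50 test articles contributes one extraction task for each of the 21 trial characteristics, which matches the reported total of 1050. The variable names are illustrative.

```python
N_ARTICLES = 50                     # previously unseen RCT articles
N_CHARACTERISTICS = 21              # trial characteristics selected for the study
PARTIALLY_OR_FULLY_CORRECT = 992    # tasks with a (partially) correct solution
FULLY_CORRECT = 696                 # tasks with a fully correct, complete answer

# ASSUMPTION: one task per (article, characteristic) pair.
total_tasks = N_ARTICLES * N_CHARACTERISTICS

share_correct = PARTIALLY_OR_FULLY_CORRECT / total_tasks
share_full_of_correct = FULLY_CORRECT / PARTIALLY_OR_FULLY_CORRECT

print(f"total tasks: {total_tasks}")
print(f"(partially) correct: {share_correct:.1%}")
print(f"fully correct among those: {share_full_of_correct:.1%}")
```

The first ratio rounds to the 94% quoted above, and the second confirms that the fully correct answers form a majority of the correct ones.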
Our experiments confirmed the applicability and efficacy of ExaCT. Furthermore, they demonstrated that combining a statistical method with 'weak' extraction rules can identify a variety of study characteristics. The system is flexible and can be extended to handle other characteristics and document types (e.g., study protocols).
PMCID: PMC2954855  PMID: 20920176
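The two-stage design described in this abstract (rank sentences with a classifier, then apply simple rules to the top candidates) can be illustrated with a minimal sketch. It is not ExaCT's implementation: the keyword weights stand in for the trained statistical classifier, and the regular expression, function names, and sample sentences are all hypothetical.

```python
import re

# Hypothetical cue-word weights standing in for a trained statistical classifier.
SAMPLE_SIZE_CUES = {"randomized": 1.0, "enrolled": 1.0, "patients": 0.5, "participants": 0.5}

# Hypothetical "weak" rule: a number followed by a participant noun.
SAMPLE_SIZE_RULE = re.compile(
    r"\b(\d[\d,]*)\s+(?:patients|participants|subjects)\b", re.IGNORECASE
)


def score_sentence(sentence):
    """Stage 1 stand-in: sum the weights of the cue words found in the sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(SAMPLE_SIZE_CUES.get(word, 0.0) for word in words)


def top_candidates(sentences, k=5):
    """Return the k highest-scoring sentences, best first."""
    return sorted(sentences, key=score_sentence, reverse=True)[:k]


def extract_sample_size(sentences, k=5):
    """Stage 2: apply the rule to the ranked candidates; return the first match."""
    for sentence in top_candidates(sentences, k):
        match = SAMPLE_SIZE_RULE.search(sentence)
        if match:
            return int(match.group(1).replace(",", ""))
    return None


# Invented example sentences, for illustration only.
article = [
    "Background: hypertension is common in older adults.",
    "A total of 240 patients were enrolled and randomized to two arms.",
    "The primary outcome was systolic blood pressure at 12 weeks.",
]
result = extract_sample_size(article)
```

Restricting the rule to the top-ranked sentences is what lets a loose pattern stay reasonably precise: "12 weeks" is never tested against a sample-size rule because its sentence scores low in the first stage.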
19.  De-identification of primary care electronic medical records free-text data in Ontario, Canada 
Electronic medical records (EMRs) represent a potentially rich source of health information for research, but the free text in EMRs often contains identifying information. While de-identification tools have been developed for free text, none have been developed or tested for the full range of primary care EMR data.
We modified deid, an open-source de-identification program, for an Ontario context and for use on primary care EMR data. We developed the modified program on a training set of 1000 free-text records from one group practice and then tested it on two validation sets: a random sample of 700 free-text EMR records from 17 physicians in 7 practices in 5 cities, and 500 free-text records from a group practice in a different city from the practice used for the training set. We measured the sensitivity/recall, precision, specificity, accuracy and F-measure of the modified tool against manually tagged free-text records in removing patient and physician names, locations, addresses, and medical record, health card and telephone numbers.
On the training set, the modified program performed with a sensitivity of 88.3%, specificity of 91.4%, precision of 91.3%, accuracy of 89.9% and F-measure of 0.90. The validation sets had sensitivities of 86.7% and 80.2%, specificities of 91.4% and 87.7%, precisions of 91.1% and 87.4%, accuracies of 89.0% and 83.8% and F-measures of 0.89 and 0.84 for the first and second validation sets respectively.
The deid program can be modified to reasonably accurately de-identify free-text primary care EMR records while preserving clinical content.
PMCID: PMC2907300  PMID: 20565894
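The F-measures in this abstract can be checked against its precision and sensitivity (recall) figures. The sketch below assumes the F-measure is the balanced F1 score, the harmonic mean of precision and recall. The abstract does not state the weighting, but this assumption reproduces all three reported values after rounding.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, sensitivity/recall) pairs as reported in the abstract.
reported = {
    "training": (0.913, 0.883),
    "validation 1": (0.911, 0.867),
    "validation 2": (0.874, 0.802),
}

# Round to two decimals to match the precision of the reported F-measures.
scores = {name: round(f_measure(p, r), 2) for name, (p, r) in reported.items()}
```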
20.  Mutation in SHOC2 promotes aberrant protein N-myristoylation and underlies Noonan-like syndrome with loose anagen hair 
Nature genetics  2009;41(9):1022-1026.
N-myristoylation is a common form of co-translational protein fatty acylation resulting from the attachment of myristate to a required N-terminal glycine residue [1,2]. We show that aberrantly acquired N-myristoylation of SHOC2, a leucine-rich repeat-containing protein that positively modulates RAS-MAPK signal flow [3–6], underlies a clinically distinctive condition of the neuro-cardio-facial-cutaneous disorders family. Twenty-five subjects with a relatively consistent phenotype previously termed Noonan-like syndrome with loose anagen hair [OMIM 607721] [7] shared the 4A>G missense change (Ser2Gly) in SHOC2 that introduces an N-myristoylation site, resulting in aberrant targeting of SHOC2 to the plasma membrane and impaired translocation to the nucleus upon growth factor stimulation. Expression of SHOC2(S2G) in vitro enhanced MAPK activation in a cell type-specific fashion. Induction of SHOC2(S2G) in Caenorhabditis elegans engendered protruding vulva, a neomorphic phenotype previously associated with aberrant signaling. These results document the first example of an acquired N-terminal lipid modification of a protein causing human disease.
PMCID: PMC2765465  PMID: 19684605
21.  Directed Evolution of Ionizing Radiation Resistance in Escherichia coli 
Journal of Bacteriology  2009;191(16):5240-5252.
We have generated extreme ionizing radiation resistance in a relatively sensitive bacterial species, Escherichia coli, by directed evolution. Four populations of Escherichia coli K-12 were derived independently from strain MG1655, with each specifically adapted to survive exposure to high doses of ionizing radiation. D37 values for strains isolated from two of the populations approached that exhibited by Deinococcus radiodurans. Complete genomic sequencing was carried out on nine purified strains derived from these populations. Clear mutational patterns were observed that both pointed to key underlying mechanisms and guided further characterization of the strains. In these evolved populations, passive genomic protection is not in evidence. Instead, enhanced recombinational DNA repair makes a prominent but probably not exclusive contribution to genome reconstitution. Multiple genes, multiple alleles of some genes, multiple mechanisms, and multiple evolutionary pathways all play a role in the evolutionary acquisition of extreme radiation resistance. Several mutations in the recA gene and a deletion of the e14 prophage both demonstrably contribute to and partially explain the new phenotype. Mutations in additional components of the bacterial recombinational repair system and the replication restart primosome are also prominent, as are mutations in genes involved in cell division, protein turnover, and glutamate transport. At least some evolutionary pathways to extreme radiation resistance are constrained by the temporally ordered appearance of specific alleles.
PMCID: PMC2725583  PMID: 19502398
22.  Interaction of finger enslaving and error compensation in multiple finger force production 
Previous studies have documented two patterns of finger interaction during multi-finger pressing tasks, enslaving and error compensation, which do not agree with each other. Enslaving is characterized by a positive correlation between instructed (master) and non-instructed (slave) finger(s), while error compensation can be described as a pattern of negative correlation between master and slave fingers. We hypothesize that the pattern of finger interaction, enslaving or compensation, depends on the initial force level and the magnitude of the targeted force change. Subjects were instructed to press with four fingers (I - index, M - middle, R - ring, and L - little) from a specified initial force to a target force following a ramp target line. Force-force relations between the master and each of the three slave fingers were analyzed during the ramp phase of trials by calculating correlation coefficients within each master-slave pair, and a 2-factor ANOVA was then performed to determine the effect of initial force and force increase on the correlation coefficients. It was found that, as initial force increased, the value of the correlation coefficient decreased and in some cases became negative, i.e. the enslaving transformed into error compensation. Force increase magnitude had a smaller effect on the correlation coefficients. The observations support the hypothesis that the pattern of inter-finger interaction, enslaving or compensation, depends on the initial force level and, to a smaller degree, on the targeted magnitude of the force increase. They suggest that the controller views tasks with higher steady-state forces and smaller force changes as implying a requirement to avoid large changes in the total force.
PMCID: PMC2648126  PMID: 18985331
fingers; force; enslaving; error compensation; synergy
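The sign convention used in this abstract, with a positive master-slave correlation for enslaving and a negative one for error compensation, can be shown with a short sketch. It assumes a Pearson correlation coefficient, which the abstract does not specify. The force values are invented for illustration and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical force samples (arbitrary units) during the ramp phase.
master = [2.0, 3.0, 4.0, 5.0, 6.0]
slave_enslaved = [0.2, 0.3, 0.4, 0.5, 0.6]      # rises with the master finger
slave_compensating = [1.0, 0.9, 0.8, 0.7, 0.6]  # falls as the master rises

r_enslaving = pearson(master, slave_enslaved)
r_compensation = pearson(master, slave_compensating)
```

With these made-up traces, `r_enslaving` is positive and `r_compensation` is negative, matching the two interaction patterns the abstract distinguishes.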
23.  Automated Information Extraction of Key Trial Design Elements from Clinical Trial Publications 
Clinical trials are one of the most valuable sources of scientific evidence for improving the practice of medicine. The Trial Bank project aims to improve structured access to trial findings by including formalized trial information into a knowledge base. Manually extracting trial information from published articles is costly, but automated information extraction techniques can assist. The current study highlights a single architecture to extract a wide array of information elements from full-text publications of randomized clinical trials (RCTs). This architecture combines a text classifier with a weak regular expression matcher. We tested this two-stage architecture on 88 RCT reports from 5 leading medical journals, extracting 23 elements of key trial information such as eligibility rules, sample size, intervention, and outcome names. Results prove this to be a promising avenue to help critical appraisers, systematic reviewers, and curators quickly identify key information elements in published RCT articles.
PMCID: PMC2655966  PMID: 18999067
24.  Identifying Wrist Fracture Patients with High Accuracy by Automatic Categorization of X-ray Reports 
The authors performed this study to determine the accuracy of several text classification methods in categorizing wrist x-ray reports. We randomly sampled 751 textual wrist x-ray reports. Two expert reviewers rated the presence (n = 301) or absence (n = 450) of an acute fracture of the wrist. We developed two information retrieval (IR) text classification methods and a machine learning method using a support vector machine (TC-1). In cross-validation on the derivation set (n = 493), TC-1 outperformed the two IR-based methods and six benchmark classifiers, including Naive Bayes and a Neural Network. In the validation set (n = 258), TC-1 demonstrated consistent performance with 93.8% accuracy, 95.5% sensitivity, 92.9% specificity, and 87.5% positive predictive value. TC-1 was easy to implement and superior in performance to the other classification methods.
PMCID: PMC1656960  PMID: 16929046
25.  Evidence That a RecQ Helicase Slows Senescence by Resolving Recombining Telomeres  
PLoS Biology  2007;5(6):e160.
RecQ helicases, including Saccharomyces cerevisiae Sgs1p and the human Werner syndrome protein, are important for telomere maintenance in cells lacking telomerase activity. How maintenance is accomplished is only partly understood, although there is evidence that RecQ helicases function in telomere replication and recombination. Here we use two-dimensional gel electrophoresis (2DGE) and telomere sequence analysis to explore why cells lacking telomerase and Sgs1p (tlc1 sgs1 mutants) senesce more rapidly than tlc1 mutants with functional Sgs1p. We find that apparent X-shaped structures accumulate at telomeres in senescing tlc1 sgs1 mutants in a RAD52- and RAD53-dependent fashion. The X-structures are neither Holliday junctions nor convergent replication forks, but instead may be recombination intermediates related to hemicatenanes. Direct sequencing of examples of telomere I-L in senescing cells reveals a reduced recombination frequency in tlc1 sgs1 compared with tlc1 mutants, indicating that Sgs1p is needed for tlc1 mutants to complete telomere recombination. The reduction in recombinants is most prominent at longer telomeres, consistent with a requirement for Sgs1p to generate viable progeny following telomere recombination. We therefore suggest that Sgs1p may be required for efficient resolution of telomere recombination intermediates, and that resolution failure contributes to the premature senescence of tlc1 sgs1 mutants.
Author Summary
Because telomeres are situated at the ends of chromosomes, they are both essential for chromosome integrity and particularly susceptible to processes that lead to loss of their own DNA sequences. The enzyme telomerase can counter these losses, but there are also other means of telomere maintenance, some of which depend on DNA recombination. The RecQ family of DNA helicases process DNA recombination intermediates and also help ensure telomere integrity, but the relationship between these activities is poorly understood. Family members include yeast Sgs1p and human WRN and BLM, which are deficient in the Werner premature aging syndrome and the Bloom cancer predisposition syndrome, respectively. We have found that the telomeres of yeast cells lacking both telomerase and Sgs1p accumulate structures that resemble recombination intermediates. Further, we provide evidence that the inability of cells lacking Sgs1p to process these telomere recombination intermediates leads to the premature arrest of cell division. We predict that similar defects in the processing of recombination intermediates may contribute to telomere defects in human Werner and Bloom syndrome cells.
Yeast cells lacking the RecQ helicase Sgs1p show an accumulation of telomere recombination intermediates associated with premature senescence.
PMCID: PMC1885831  PMID: 17550308
