The purpose of this study was to use magnetic resonance imaging to measure the moment arm of the flexor digitorum superficialis tendon about the metacarpophalangeal joint of the index, middle, ring, and little fingers when the position and force production level of the index finger were altered. A secondary goal was to create regression models using anthropometric data to predict moment arms of the flexor digitorum superficialis about the metacarpophalangeal joint of each finger.
The hands of subjects were scanned using a 3.0T magnetic resonance imaging scanner. The metacarpophalangeal joint of the index finger was placed in three positions: flexion, neutral, and extension. For each joint configuration, subjects produced no active force (passive condition) and exerted a flexion force to resist a load at the fingertip (active condition).
The following was found: (1) The moment arm of the flexor digitorum superficialis at the metacarpophalangeal joint of the index finger (a) increased with joint flexion and stayed unchanged with finger extension; and (b) decreased with increasing force at the neutral and extended finger postures and did not change at the flexed posture. (2) The moment arms of the flexor digitorum superficialis tendon of the middle, ring, and little fingers (a) did not change when the index metacarpophalangeal joint position changed (p > 0.20); and (b) for the middle and little fingers, increased when the index finger actively produced force at the flexed metacarpophalangeal joint posture. (3) The moment arms showed a high correlation with anthropometric measurements.
Moment arms of the flexor digitorum superficialis change due to both changes in joint angle and muscle activation; they scale with various anthropometric measures.
MRI; moment arm; flexor digitorum superficialis; finger interaction
The DNA sequences of chromosomes I and II of Rhodobacter sphaeroides strain 2.4.1 have been revised, and the annotation of the entire genomic sequence, including both chromosomes and the five plasmids, has been updated. Errors in the originally published sequence have been corrected, and ∼11% of the coding regions in the original sequence have been affected by the revised annotation.
The genomes of five Cochliobolus heterostrophus strains, two Cochliobolus sativus strains, three additional Cochliobolus species (Cochliobolus victoriae, Cochliobolus carbonum, Cochliobolus miyabeanus), and closely related Setosphaeria turcica were sequenced at the Joint Genome Institute (JGI). The datasets were used to identify SNPs between strains and species, unique genomic regions, core secondary metabolism genes, and small secreted protein (SSP) candidate effector encoding genes with a view towards pinpointing structural elements and gene content associated with specificity of these closely related fungi to different cereal hosts. Whole-genome alignment shows that three to five percent of each genome differs between strains of the same species, while a quarter of each genome differs between species. On average, SNP counts among field isolates of the same C. heterostrophus species are more than 25× higher than those between inbred lines and 50× lower than SNPs between Cochliobolus species. The suites of nonribosomal peptide synthetase (NRPS), polyketide synthase (PKS), and SSP–encoding genes are astoundingly diverse among species but remarkably conserved among isolates of the same species, whether inbred or field strains, except for defining examples that map to unique genomic regions. Functional analysis of several strain-unique PKSs and NRPSs reveals a strong correlation with a role in virulence.
The filamentous ascomycete genus Cochliobolus includes highly aggressive necrotrophic and hemibiotrophic pathogens with particular specificity to their host plants, often associated with production of host selective toxins (HST) that allow necrotrophs to trigger host cell death. Hemibiotrophs must keep their hosts alive during initial infection stages and rely on subverting host defenses by secreting small protein effectors. Many Cochliobolus species have emerged rapidly as devastating pathogens due to HSTs. The genomes of Cochliobolus and related pathogens that differ in host preference, host specificity, and virulence strategies have been sequenced. Our comparative results, at the whole-genome level, and with a spotlight on core genes for secondary metabolism and small secreted proteins, touch on how pathogens develop and hone these tools, according to host or lifestyle. We suggest that, while necrotrophs and hemibiotrophs employ fundamentally contrasting mechanisms of promoting disease, the tools they utilize (HSTs and protein effectors) overlap. The suites of secondary metabolite and SSP genes that each possesses reflect astounding diversity among species, hinting that gene products, particularly those associated with unique genomic regions, are candidates for pathogenic lifestyle differences. Manipulations of strain-unique secondary metabolite genes associated with host-specific virulence provide tangible examples.
Pyrenophora tritici-repentis is a necrotrophic fungus that causes tan spot of wheat, a disease whose contribution to crop loss has increased significantly during the last few decades. Pathogenicity by this fungus is attributed to the production of host-selective toxins (HST), which are recognized by their host in a genotype-specific manner. To better understand the mechanisms that have led to the increase in disease incidence related to this pathogen, we sequenced the genomes of three P. tritici-repentis isolates. A pathogenic isolate that produces two known HSTs was used to assemble a reference nuclear genome of approximately 40 Mb composed of 11 chromosomes that encode 12,141 predicted genes. Comparison of the reference genome with those of a pathogenic isolate that produces a third HST, and a nonpathogenic isolate, showed the nonpathogen genome to be more diverged than those of the two pathogens. Examination of gene-coding regions has provided candidate pathogen-specific proteins and revealed gene families that may play a role in a necrotrophic lifestyle. Analysis of transposable elements suggests that their presence in the genome of pathogenic isolates contributes to the creation of novel genes, effector diversification, possible horizontal gene transfer events, identified copy number variation, and the first example of transduplication by DNA transposable elements in fungi. Overall, comparative analysis of these genomes provides evidence that pathogenicity in this species arose through an influx of transposable elements, which created a genetically flexible landscape that can easily respond to environmental changes.
wheat (Triticum aestivum); copy number variation; histone H3 transduplication; ToxA; ToxB; anastomosis
Mid-water plankton collections commonly include bizarre and mysterious developmental stages that differ conspicuously from their adult counterparts in morphology and habitat. Unaware of the existence of planktonic larval stages, early zoologists often misidentified these unique morphologies as independent adult lineages. Many such mistakes have since been corrected by collecting larvae, raising them in the lab, and identifying the adult forms. However, challenges arise when the larva is remarkably rare in nature and relatively inaccessible due to its changing habitats over the course of ontogeny. The mid-water marine species Cerataspis monstrosa (Gray 1828) is an armored crustacean larva whose adult identity has remained a mystery for over 180 years. Our phylogenetic analyses, based in part on recent collections from the Gulf of Mexico, provide definitive evidence that the rare, yet broadly distributed larva, C. monstrosa, is an early developmental stage of the globally distributed deepwater aristeid shrimp, Plesiopenaeus armatus. Divergence estimates and phylogenetic relationships across five genes confirm the larva and adult are the same species. Our work demonstrates the diagnostic power of molecular systematics in instances where larval rearing seldom succeeds and morphology and habitat are not indicative of identity. Larval–adult linkages not only aid in our understanding of biodiversity, they provide insights into the life history, distribution, and ecology of an organism.
Cerataspis monstrosa; Decapoda; DNA barcoding; larval–adult linkage; phylogenetics
The ND18 strain of Barley stripe mosaic virus (BSMV) infects several lines of Brachypodium distachyon, a recently developed model system for genomics research in cereals. Among the inbred lines tested, Bd3-1 is highly resistant at 20 to 25°C, whereas Bd21 is susceptible and infection results in an intense mosaic phenotype accompanied by high levels of replicating virus. We generated an F6∶7 recombinant inbred line (RIL) population from a cross between Bd3-1 and Bd21 and used the RILs, and an F2 population of a second Bd21 × Bd3-1 cross to evaluate the inheritance of resistance. The results indicate that resistance segregates as expected for a single dominant gene, which we have designated Barley stripe mosaic virus resistance 1 (Bsr1). We constructed a genetic linkage map of the RIL population using SNP markers to map this gene to within 705 Kb of the distal end of the top of chromosome 3. Additional CAPS and Indel markers were used to fine map Bsr1 to a 23 Kb interval containing five putative genes. Our study demonstrates the power of using RILs to rapidly map the genetic determinants of BSMV resistance in Brachypodium. Moreover, the RILs and their associated genetic map, when combined with the complete genomic sequence of Brachypodium, provide new resources for genetic analyses of many other traits.
This study investigated the effects of modifying contact finger forces in one direction (normal or tangential) on the entire set of contact forces while statically holding an object. Subjects grasped a handle instrumented with finger force-moment sensors, maintained it at rest in the air, and then slowly: (1) increased the grasping force, (2) tried to spread fingers apart, and (3) tried to squeeze fingers together. Analysis was mostly performed at the virtual finger (VF) level (the VF is an imaginary finger that generates the same force and moment as the four fingers combined). For all three tasks there were statistically significant changes in the VF normal and tangential forces. For finger spreading/squeezing the tangential force neutral point was located between the index and middle fingers. We conclude that the internal forces are regulated as a whole, including adjustments in both normal and tangential force, instead of only a subset of forces (normal or tangential). The effects of such factors as EFFORT and TORQUE were additive; their interaction was not statistically significant, thus supporting the principle of superposition in human prehension.
prehension; grasping; motor control; occupational therapy
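The virtual finger definition above (one imaginary finger producing the same force and moment as the four fingers combined) can be illustrated with a minimal sketch. The function name, the lever-arm layout, and the example numbers are assumptions made for illustration only; they are not the study's data or analysis code.

```python
from typing import Sequence, Tuple


def virtual_finger(normal_forces: Sequence[float],
                   lever_arms: Sequence[float]) -> Tuple[float, float, float]:
    """Reduce four finger normal forces to one statically equivalent VF force.

    normal_forces: force of each finger (I, M, R, L) in newtons.
    lever_arms: signed distance of each finger's contact point from a
        reference axis on the handle, in metres (hypothetical layout).
    Returns (VF force, VF moment about the reference axis, VF lever arm).
    """
    if len(normal_forces) != len(lever_arms):
        raise ValueError("one lever arm is needed per finger force")
    vf_force = sum(normal_forces)
    vf_moment = sum(f * d for f, d in zip(normal_forces, lever_arms))
    # The VF acts where a single force would reproduce the summed moment.
    vf_arm = vf_moment / vf_force if vf_force else 0.0
    return vf_force, vf_moment, vf_arm


# Made-up example values, not data from the study.
force, moment, arm = virtual_finger([4.0, 3.0, 2.0, 1.0],
                                    [0.045, 0.015, -0.015, -0.045])
print(force, round(moment, 4), round(arm, 4))
```

The same summation would apply to tangential forces; only the normal-force case is shown to keep the sketch short.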
Paenibacillus sp. strain JDR-2, an aggressively xylanolytic bacterium isolated from sweetgum (Liquidambar styraciflua) wood, is able to efficiently depolymerize, assimilate and metabolize 4-O-methylglucuronoxylan, the predominant structural component of hardwood hemicelluloses. A basis for this capability was first supported by the identification of genes and characterization of encoded enzymes and has been further defined by the sequencing and annotation of the complete genome, which we describe. In addition to genes implicated in the utilization of β-1,4-xylan, genes have also been identified for the utilization of other hemicellulosic polysaccharides. The genome of Paenibacillus sp. JDR-2 contains 7,184,930 bp in a single replicon with 6,288 protein-coding and 122 RNA genes. Uniquely prominent are 874 genes encoding proteins involved in carbohydrate transport and metabolism. The prevalence and organization of these genes support a metabolic potential for bioprocessing of hemicellulose fractions derived from lignocellulosic resources.
aerobic; mesophile; Gram-positive; Paenibacillus; xylanolytic; xylan
Pharmacological inhibition of VEGF-A has proven to be effective in inhibiting angiogenesis and vascular leak associated with cancers and various eye diseases. However, little information is currently available on the binding kinetics and relative biological activity of various VEGF inhibitors. Therefore, we have evaluated the binding kinetics of two anti-VEGF antibodies, ranibizumab and bevacizumab, and VEGF Trap (also known as aflibercept), a novel type of soluble decoy receptor, with substantially higher affinity than conventional soluble VEGF receptors. VEGF Trap bound to all isoforms of human VEGF-A tested with subpicomolar affinity. Ranibizumab and bevacizumab also bound human VEGF-A, but with markedly lower affinity. The association rate for VEGF Trap binding to VEGF-A was orders of magnitude faster than that measured for bevacizumab and ranibizumab. Similarly, in cell-based bioassays, VEGF Trap inhibited the activation of VEGFR1 and VEGFR2, as well as VEGF-A induced calcium mobilization and migration in human endothelial cells more potently than ranibizumab or bevacizumab. Only VEGF Trap bound human PlGF and VEGF-B, and inhibited VEGFR1 activation and HUVEC migration induced by PlGF. These data differentiate VEGF Trap from ranibizumab and bevacizumab in terms of its markedly higher affinity for VEGF-A, as well as its ability to bind VEGF-B and PlGF.
Electronic supplementary material
The online version of this article (doi:10.1007/s10456-011-9249-6) contains supplementary material, which is available to authorized users.
VEGF receptor; Aflibercept; Affinity; Age-related macular degeneration; Placental growth factor; Biomedicine; Cardiology; Biomedicine general; Ophthalmology; Cancer Research; Cell Biology; Oncology
Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. From thirteen to 137 nonsense mutations are present in each strain and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.
single nucleotide polymorphism; SNP; indel; comparative genomics; classical mutant
As clinical text mining continues to mature, its potential as an enabling technology for innovations in patient care and clinical research is becoming a reality. A critical part of that process is rigid benchmark testing of natural language processing methods on realistic clinical narrative. In this paper, the authors describe the design and performance of three state-of-the-art text-mining applications from the National Research Council of Canada on evaluations within the 2010 i2b2 challenge.
The three systems perform three key steps in clinical information extraction: (1) extraction of medical problems, tests, and treatments, from discharge summaries and progress notes; (2) classification of assertions made on the medical problems; (3) classification of relations between medical concepts. Machine learning systems performed these tasks using large-dimensional bags of features, as derived from both the text itself and from external sources: UMLS, cTAKES, and Medline.
Performance was measured per subtask, using micro-averaged F-scores, as calculated by comparing system annotations with ground-truth annotations on a test set.
The systems ranked high among all submitted systems in the competition, with the following F-scores: concept extraction 0.8523 (ranked first); assertion detection 0.9362 (ranked first); relationship detection 0.7313 (ranked second).
For all tasks, we found that the introduction of a wide range of features was crucial to success. Importantly, our choice of machine learning algorithms allowed us to be versatile in our feature design, and to introduce a large number of features without overfitting and without encountering computing-resource bottlenecks.
natural language processing; semantics; classification/*methods; computerized medical records systems; patient discharge/*statistics & numerical data; text mining; concept detection; relation extraction; document coding; machine learning; modeling physiologic and disease processes; linking the genotype and phenotype; identifying genome and protein structure and function; visualization of data and knowledge
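For readers unfamiliar with the micro-averaged F-score used in the evaluation above, the following sketch shows the usual definition: true positives, false positives, and false negatives are pooled across classes before precision and recall are computed. The function name and the example counts are hypothetical; the challenge's official scoring script may differ in details such as span-matching rules.

```python
from typing import Iterable, Tuple


def micro_f1(per_class_counts: Iterable[Tuple[int, int, int]]) -> float:
    """Micro-averaged F1 from per-class (tp, fp, fn) counts."""
    counts = list(per_class_counts)
    # Pool the counts over all classes first (this is the "micro" part).
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up counts for two classes, e.g. "problem" and "treatment".
print(micro_f1([(8, 2, 1), (1, 1, 3)]))
```

Because counts are pooled, frequent classes dominate the score, which is why micro-averaging is often preferred when class sizes are very uneven.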
Clinical trials are one of the most important sources of evidence for guiding evidence-based practice and the design of new trials. However, most of this information is available only in free text - e.g., in journal publications - which is labour intensive to process for systematic reviews, meta-analyses, and other evidence synthesis studies. This paper presents an automatic information extraction system, called ExaCT, that assists users with locating and extracting key trial characteristics (e.g., eligibility criteria, sample size, drug dosage, primary outcomes) from full-text journal articles reporting on randomized controlled trials (RCTs).
ExaCT consists of two parts: an information extraction (IE) engine that searches the article for text fragments that best describe the trial characteristics, and a web browser-based user interface that allows human reviewers to assess and modify the suggested selections. The IE engine uses a statistical text classifier to locate those sentences that have the highest probability of describing a trial characteristic. Then, the IE engine's second stage applies simple rules to these sentences to extract text fragments containing the target answer. The same approach is used for all 21 trial characteristics selected for this study.
We evaluated ExaCT using 50 previously unseen articles describing RCTs. The text classifier (first stage) was able to recover 88% of relevant sentences among its top five candidates (top5 recall) with the topmost candidate being relevant in 80% of cases (top1 precision). Precision and recall of the extraction rules (second stage) were 93% and 91%, respectively. Together, the two stages of the extraction engine were able to provide (partially) correct solutions in 992 out of 1050 test tasks (94%), with a majority of these (696) representing fully correct and complete answers.
Our experiments confirmed the applicability and efficacy of ExaCT. Furthermore, they demonstrated that combining a statistical method with 'weak' extraction rules can identify a variety of study characteristics. The system is flexible and can be extended to handle other characteristics and document types (e.g., study protocols).
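The ranking measures reported above (top5 recall and top1 precision) can be sketched as follows. This is one plausible reading of those terms, written under stated assumptions: each task supplies a ranked list of candidate sentence identifiers and a set of relevant identifiers, and all names and data here are hypothetical rather than taken from ExaCT.

```python
from typing import List, Sequence, Set, Tuple

Task = Tuple[Sequence[str], Set[str]]  # (ranked candidate ids, relevant ids)


def top_k_recall(tasks: List[Task], k: int) -> float:
    """Share of all relevant sentences that appear among the top k candidates."""
    found = sum(len(set(ranked[:k]) & relevant) for ranked, relevant in tasks)
    total = sum(len(relevant) for _, relevant in tasks)
    return found / total if total else 0.0


def top1_precision(tasks: List[Task]) -> float:
    """Share of tasks whose highest-ranked candidate is relevant."""
    if not tasks:
        return 0.0
    hits = sum(1 for ranked, relevant in tasks if ranked and ranked[0] in relevant)
    return hits / len(tasks)


# Made-up rankings for two tasks.
example = [(["s3", "s1", "s7"], {"s1", "s9"}), (["s2", "s4"], {"s2"})]
print(top_k_recall(example, k=2), top1_precision(example))

# The overall figure quoted in the text: 992 of 1050 test tasks.
print(round(100 * 992 / 1050))
```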
Electronic medical records (EMRs) represent a potentially rich source of health information for research, but the free text in EMRs often contains identifying information. While de-identification tools have been developed for free text, none have been developed or tested for the full range of primary care EMR data.
We used deid open source de-identification software and modified it for an Ontario context for use on primary care EMR data. We developed the modified program on a training set of 1000 free-text records from one group practice and then tested it on two validation sets: a random sample of 700 free-text EMR records from 17 different physicians from 7 different practices in 5 different cities, and 500 free-text records from a group practice in a different city from the practice used for the training set. We measured the sensitivity/recall, precision, specificity, accuracy and F-measure of the modified tool against manually tagged free-text records to remove patient and physician names, locations, addresses, medical record, health card and telephone numbers.
We found that, on the training set, the modified program performed with a sensitivity of 88.3%, specificity of 91.4%, precision of 91.3%, accuracy of 89.9% and F-measure of 0.90. The validation sets had sensitivities of 86.7% and 80.2%, specificities of 91.4% and 87.7%, precisions of 91.1% and 87.4%, accuracies of 89.0% and 83.8% and F-measures of 0.89 and 0.84 for the first and second validation sets respectively.
The deid program can be modified to reasonably accurately de-identify free-text primary care EMR records while preserving clinical content.
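The evaluation measures named above follow from the four cells of a confusion matrix. The sketch below gives the standard definitions; the function names and the example counts are illustrative assumptions, since the abstract reports only the resulting percentages and not the underlying counts.

```python
from typing import Dict


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F-measure)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard measures for a two-class task such as identifier tagging."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0   # also called recall
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_measure": f_measure(precision, sensitivity),
    }


# Made-up counts, not the study's data.
print(binary_metrics(tp=90, fp=10, tn=80, fn=20))

# The reported F-measures can be checked against the reported
# precision and sensitivity for each data set.
for p, r in [(0.913, 0.883), (0.911, 0.867), (0.874, 0.802)]:
    print(round(f_measure(p, r), 2))
```

Applied to the reported precision and sensitivity pairs, the harmonic mean rounds to 0.90, 0.89, and 0.84, which agrees with the F-measures stated in the text.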
N-myristoylation is a common form of co-translational protein fatty acylation resulting from the attachment of myristate to a required N-terminal glycine residue.1,2 We show that aberrantly acquired N-myristoylation of SHOC2, a leucine-rich repeat-containing protein that positively modulates RAS-MAPK signal flow,3–6 underlies a clinically distinctive condition of the neuro-cardio-facial-cutaneous disorders family. Twenty-five subjects with a relatively consistent phenotype previously termed Noonan-like syndrome with loose anagen hair [OMIM 607721]7 shared the 4A>G missense change (Ser2Gly) in SHOC2 that introduces an N-myristoylation site, resulting in aberrant targeting of SHOC2 to the plasma membrane and impaired translocation to the nucleus upon growth factor stimulation. Expression of SHOC2S2G in vitro enhanced MAPK activation in a cell type-specific fashion. Induction of SHOC2S2G in Caenorhabditis elegans engendered protruding vulva, a neomorphic phenotype previously associated with aberrant signaling. These results document the first example of an acquired N-terminal lipid modification of a protein causing human disease.
We have generated extreme ionizing radiation resistance in a relatively sensitive bacterial species, Escherichia coli, by directed evolution. Four populations of Escherichia coli K-12 were derived independently from strain MG1655, with each specifically adapted to survive exposure to high doses of ionizing radiation. D37 values for strains isolated from two of the populations approached that exhibited by Deinococcus radiodurans. Complete genomic sequencing was carried out on nine purified strains derived from these populations. Clear mutational patterns were observed that both pointed to key underlying mechanisms and guided further characterization of the strains. In these evolved populations, passive genomic protection is not in evidence. Instead, enhanced recombinational DNA repair makes a prominent but probably not exclusive contribution to genome reconstitution. Multiple genes, multiple alleles of some genes, multiple mechanisms, and multiple evolutionary pathways all play a role in the evolutionary acquisition of extreme radiation resistance. Several mutations in the recA gene and a deletion of the e14 prophage both demonstrably contribute to and partially explain the new phenotype. Mutations in additional components of the bacterial recombinational repair system and the replication restart primosome are also prominent, as are mutations in genes involved in cell division, protein turnover, and glutamate transport. At least some evolutionary pathways to extreme radiation resistance are constrained by the temporally ordered appearance of specific alleles.
Previous studies have documented two patterns of finger interaction during multi-finger pressing tasks, enslaving and error compensation, which do not agree with each other. Enslaving is characterized by positive correlation between instructed (master) and non-instructed (slave) finger(s), while error compensation can be described as a pattern of negative correlation between master and slave fingers. We hypothesize that the pattern of finger interaction, enslaving or compensation, depends on the initial force level and the magnitude of the targeted force change. Subjects were instructed to press with four fingers (I - index, M - middle, R - ring, and L - little) from a specified initial force to a target force following a ramp target line. Force-force relations between the master and each of the three slave fingers were analyzed during the ramp phase of trials by calculating correlation coefficients within each master-slave pair, and then a 2-factor ANOVA was performed to determine the effect of initial force and force increase on the correlation coefficients. It was found that, as initial force increased, the value of the correlation coefficient decreased and in some cases became negative, i.e. the enslaving transformed into error compensation. Force increase magnitude had a smaller effect on the correlation coefficients. The observations support the hypothesis that the pattern of inter-finger interaction, enslaving or compensation, depends on the initial force level and, to a smaller degree, on the targeted magnitude of the force increase. They suggest that the controller views tasks with higher steady-state forces and smaller force changes as implying a requirement to avoid large changes in the total force.
fingers; force; enslaving; error compensation; synergy
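The master-slave analysis described above rests on the sign of a correlation coefficient computed over the ramp phase. The sketch below computes a Pearson coefficient for one master-slave pair and labels the pattern by its sign. The use of Pearson's r specifically, the function names, and the force samples are assumptions for illustration; the abstract does not state which coefficient was used.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long force time series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with >= 2 samples")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def interaction_pattern(r: float) -> str:
    """Positive r reads as enslaving, negative r as error compensation."""
    if r > 0:
        return "enslaving"
    if r < 0:
        return "error compensation"
    return "no linear relation"


# Made-up ramp-phase samples (newtons) for one master and two slave fingers.
master = [1.0, 2.0, 3.0, 4.0]
slave_rising = [0.1, 0.2, 0.3, 0.4]
slave_falling = [0.4, 0.3, 0.2, 0.1]
print(interaction_pattern(pearson_r(master, slave_rising)))
print(interaction_pattern(pearson_r(master, slave_falling)))
```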
Clinical trials are one of the most valuable sources of scientific evidence for improving the practice of medicine. The Trial Bank project aims to improve structured access to trial findings by including formalized trial information into a knowledge base. Manually extracting trial information from published articles is costly, but automated information extraction techniques can assist. The current study highlights a single architecture to extract a wide array of information elements from full-text publications of randomized clinical trials (RCTs). This architecture combines a text classifier with a weak regular expression matcher. We tested this two-stage architecture on 88 RCT reports from 5 leading medical journals, extracting 23 elements of key trial information such as eligibility rules, sample size, intervention, and outcome names. Results prove this to be a promising avenue to help critical appraisers, systematic reviewers, and curators quickly identify key information elements in published RCT articles.
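The two-stage architecture described above (a text classifier followed by a weak regular expression matcher) can be sketched in miniature. This is a toy illustration under loud assumptions: a hand-weighted keyword scorer stands in for the trained statistical classifier, and the cue weights, the sample-size pattern, and all function names are hypothetical rather than the authors' implementation.

```python
import re
from typing import List, Optional, Tuple

# Stage 1 stand-in: hand-picked cue weights instead of a trained classifier.
CUE_WEIGHTS = {"randomized": 1.0, "patients": 1.0,
               "participants": 1.0, "enrolled": 2.0}

# Stage 2: a deliberately weak rule for one element (sample size).
SAMPLE_SIZE_RULE = re.compile(
    r"\b(\d[\d,]*)\s+(?:patients|participants|subjects)\b", re.IGNORECASE)


def score_sentence(sentence: str) -> float:
    """Score how likely a sentence is to describe the sample size."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(CUE_WEIGHTS.get(word, 0.0) for word in words)


def extract_sample_size(sentences: List[str],
                        top_k: int = 5) -> Optional[Tuple[str, str]]:
    """Rank sentences, then apply the rule to the top_k candidates only."""
    ranked = sorted(sentences, key=score_sentence, reverse=True)[:top_k]
    for sentence in ranked:
        match = SAMPLE_SIZE_RULE.search(sentence)
        if match:
            return match.group(1), sentence
    return None


# Made-up sentences, not taken from any trial report.
article = [
    "The primary outcome was mortality at 30 days.",
    "A total of 240 patients were enrolled and randomized.",
    "Funding was provided by the sponsor.",
]
print(extract_sample_size(article))
```

Restricting the rule to the highest-ranked sentences is what lets it stay weak: the classifier supplies the context, so the pattern does not need to guard against every stray number in the article.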
The authors performed this study to determine the accuracy of several text classification methods to categorize wrist x-ray reports. We randomly sampled 751 textual wrist x-ray reports. Two expert reviewers rated the presence (n = 301) or absence (n = 450) of an acute fracture of the wrist. We developed two information retrieval (IR) text classification methods and a machine learning method using a support vector machine (TC-1). In cross-validation on the derivation set (n = 493), TC-1 outperformed the two IR based methods and six benchmark classifiers, including Naive Bayes and a Neural Network. In the validation set (n = 258), TC-1 demonstrated consistent performance with 93.8% accuracy; 95.5% sensitivity; 92.9% specificity; and 87.5% positive predictive value. TC-1 was easy to implement and superior in performance to the other classification methods.
RecQ helicases, including Saccharomyces cerevisiae Sgs1p and the human Werner syndrome protein, are important for telomere maintenance in cells lacking telomerase activity. How maintenance is accomplished is only partly understood, although there is evidence that RecQ helicases function in telomere replication and recombination. Here we use two-dimensional gel electrophoresis (2DGE) and telomere sequence analysis to explore why cells lacking telomerase and Sgs1p (tlc1 sgs1 mutants) senesce more rapidly than tlc1 mutants with functional Sgs1p. We find that apparent X-shaped structures accumulate at telomeres in senescing tlc1 sgs1 mutants in a RAD52- and RAD53-dependent fashion. The X-structures are neither Holliday junctions nor convergent replication forks, but instead may be recombination intermediates related to hemicatenanes. Direct sequencing of examples of telomere I-L in senescing cells reveals a reduced recombination frequency in tlc1 sgs1 compared with tlc1 mutants, indicating that Sgs1p is needed for tlc1 mutants to complete telomere recombination. The reduction in recombinants is most prominent at longer telomeres, consistent with a requirement for Sgs1p to generate viable progeny following telomere recombination. We therefore suggest that Sgs1p may be required for efficient resolution of telomere recombination intermediates, and that resolution failure contributes to the premature senescence of tlc1 sgs1 mutants.
Because telomeres are situated at the ends of chromosomes, they are both essential for chromosome integrity and particularly susceptible to processes that lead to loss of their own DNA sequences. The enzyme telomerase can counter these losses, but there are also other means of telomere maintenance, some of which depend on DNA recombination. The RecQ family of DNA helicases process DNA recombination intermediates and also help ensure telomere integrity, but the relationship between these activities is poorly understood. Family members include yeast Sgs1p and human WRN and BLM, which are deficient in the Werner premature aging syndrome and the Bloom cancer predisposition syndrome, respectively. We have found that the telomeres of yeast cells lacking both telomerase and Sgs1p accumulate structures that resemble recombination intermediates. Further, we provide evidence that the inability of cells lacking Sgs1p to process these telomere recombination intermediates leads to the premature arrest of cell division. We predict that similar defects in the processing of recombination intermediates may contribute to telomere defects in human Werner and Bloom syndrome cells.
Yeast cells lacking the RecQ helicase Sgs1p show an accumulation of telomere recombination intermediates associated with premature senescence.
Alzheimer's disease (AD) is a complex disorder that involves multiple biological processes. Many genes implicated in these processes may be present in low abundance in the human brain. DNA microarray analysis identifies changed genes that are expressed at high or moderate levels. Complementary to this approach, we describe here a novel technology designed specifically to isolate rare and novel genes previously undetectable by other methods. We have used this method to identify differentially expressed genes in brains affected by AD. Our method, termed Subtractive Transcription-based Amplification of mRNA (STAR), is a combination of subtractive RNA/DNA hybridization and RNA amplification, which allows the removal of non-differentially expressed transcripts and the linear amplification of the differentially expressed genes.
Using the STAR technology we have identified over 800 differentially expressed sequences in AD brains, both up- and down- regulated, compared to age-matched controls. Over 55% of the sequences represent genes of unknown function and roughly half of them were novel and rare discoveries in the human brain. The expression changes of nearly 80 unique genes were further confirmed by qRT-PCR and the association of additional genes with AD and/or neurodegeneration was established using an in-house literature mining tool (LitMiner).
The STAR process significantly amplifies unique and rare sequences relative to abundant housekeeping genes and, as a consequence, identifies genes not previously linked to AD. This method also offers new opportunities to study the subtle changes in gene expression that potentially contribute to the development and/or progression of AD.
This paper examines how the adoption of a subject-specific library service has changed the way in which its users interact with a digital library. The LitMiner text-analysis application was developed to enable biologists to explore gene relationships in the published literature. The application features a suite of interfaces that enable users to search PubMed as well as local databases, to view document abstracts, to filter terms, to select gene name aliases, and to visualize the co-occurrences of genes in the literature. At each of these stages, LitMiner offers the functionality of a digital library. Documents that are accessible online are identified by an icon. Users can also order documents from their institution's library collection from within the application. In so doing, LitMiner aims to integrate digital library services into the research process of its users.
This integration of digital library services into the research process of biologists results in increased access to the published literature.
In order to make better use of their collections, digital libraries should customize their services to suit the research needs of their patrons.
The majority of experimentally verified molecular interaction and biological pathway data are present in the unstructured text of biomedical journal articles where they are inaccessible to computational methods. The Biomolecular interaction network database (BIND) seeks to capture these data in a machine-readable format. We hypothesized that the formidable task-size of backfilling the database could be reduced by using Support Vector Machine technology to first locate interaction information in the literature. We present an information extraction system that was designed to locate protein-protein interaction data in the literature and present these data to curators and the public for review and entry into BIND.
Cross-validation estimated the support vector machine's test-set precision, accuracy and recall for classifying abstracts describing interaction information at 92%, 90% and 92%, respectively. We estimated that the system would be able to recall up to 60% of all non-high throughput interactions present in another yeast-protein interaction database. Finally, this system was applied to a real-world curation problem and its use was found to reduce the task duration by 70%, thus saving 176 days.
Machine learning methods are useful as tools to direct interaction and pathway database back-filling; however, this potential can only be realized if these techniques are coupled with human review and entry into a factual database such as BIND. The PreBIND system described here is available to the public at . Current capabilities allow searching for human, mouse and yeast protein-interaction information.