Here we report a web server, the DelPhi web server, which uses the DelPhi program to calculate electrostatic energies and the corresponding electrostatic potential, ionic distributions, and dielectric map. The server provides extra services to fix structural defects, such as missing atoms in the structural file, and allows for the generation of missing hydrogen atoms. The hydrogen placement and the corresponding DelPhi calculations can be done with user-selected force field parameters: Charmm22, Amber98, or OPLS. Upon completion of the calculations, the user is given the option to download the fixed and protonated structural file, together with the parameter and DelPhi output files, for further analysis. Using the Jmol viewer, the user can view the corresponding structural file, manipulate it, and change its presentation. In addition, if a potential map is requested, the potential can be mapped onto the molecular surface. The DelPhi web server is available from http://compbio.clemson.edu/delphi_webserver.
DelPhi; electrostatics; proteins; continuum models; electrostatic potential; Finite-Difference Poisson-Boltzmann solver
Summary: A new edition of the DelPhi web server, DelPhi web server v2, is released to include atomic presentation of geometrical figures. These geometrical objects can be used to model nano-size objects together with real biological macromolecules. The position and size of the object can be manipulated by the user in real time until desired results are achieved. The server fixes structural defects, adds hydrogen atoms and calculates electrostatic energies and the corresponding electrostatic potential and ionic distributions.
Availability and implementation: The web server follows a client–server architecture built on PHP and HTML and uses the DelPhi software. The computation is carried out on a supercomputer cluster, and results are returned to the user via HTTP, including the ability to visualize the structure and the corresponding electrostatic potential with an embedded Jmol viewer. The DelPhi web server is available from http://compbio.clemson.edu/delphi_webserver.
Supplementary data are available at Bioinformatics online.
The Gauss-Seidel method is a standard iterative numerical method widely used to solve systems of equations and, in general, is more efficient than other iterative methods, such as the Jacobi method. However, the standard implementation of the Gauss-Seidel method restricts its use in parallel computing because it requires updated neighboring values (i.e., those from the current iteration) to be used as soon as they are available. Here we report an efficient and exact (not requiring assumptions) method to parallelize the iterations and to reduce the computational time as a linear or nearly linear function of the number of CPUs. In contrast to other existing solutions, our method does not require any assumptions and is equally applicable to solving linear and nonlinear equations. This approach is implemented in the DelPhi program, which is a finite-difference Poisson-Boltzmann equation solver used to model electrostatics in molecular biology. This development makes the iterative procedure for obtaining the electrostatic potential distribution in the parallelized DelPhi several-fold faster than that in the serial code. Further, we demonstrate the advantages of the new parallelized DelPhi by computing the electrostatic potential and the corresponding energies of large supramolecular structures.
electrostatics; DelPhi; Poisson-Boltzmann equation; Gauss-Seidel iteration; parallel computing
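The data dependence that makes Gauss-Seidel hard to parallelize can be seen in a small sketch. The code below is not DelPhi code and does not reproduce the parallelization scheme reported above; it is a toy 1D finite-difference Poisson problem (-u'' = f with zero boundary values), and the function names, grid size, and tolerance are assumptions chosen for illustration. It contrasts a Jacobi sweep, which reads only the previous iterate, with a Gauss-Seidel sweep, which reuses values updated earlier in the same sweep.

```python
def jacobi_sweep(u, f, h):
    """One Jacobi sweep: every update reads only the previous iterate u."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
    return new


def gauss_seidel_sweep(u, f, h):
    """One Gauss-Seidel sweep: new[i - 1] was already updated in this sweep."""
    new = u[:]
    for i in range(1, len(new) - 1):
        new[i] = 0.5 * (new[i - 1] + new[i + 1] + h * h * f[i])
    return new


def iterations_to_converge(sweep, n=33, tol=1e-8, max_iter=100000):
    """Count sweeps until the largest change between iterates drops below tol."""
    h = 1.0 / (n - 1)
    f = [1.0] * n   # toy source term (assumption for illustration)
    u = [0.0] * n   # initial guess; end points stay at the zero boundary value
    for k in range(1, max_iter + 1):
        new = sweep(u, f, h)
        change = max(abs(a - b) for a, b in zip(new, u))
        u = new
        if change < tol:
            return k, u
    return max_iter, u


if __name__ == "__main__":
    jac_iters, _ = iterations_to_converge(jacobi_sweep)
    gs_iters, _ = iterations_to_converge(gauss_seidel_sweep)
    print("Jacobi sweeps:      ", jac_iters)
    print("Gauss-Seidel sweeps:", gs_iters)
```

In the Jacobi loop every index can be updated independently, so it parallelizes trivially; in the Gauss-Seidel loop each update depends on the one before it, which is the restriction the abstract refers to. On this toy problem Gauss-Seidel needs fewer sweeps than Jacobi to reach the same tolerance.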
Many molecular events are associated with small or large conformational changes in the corresponding proteins. Modeling such changes is a challenge and requires a significant amount of computing time. From the point of view of electrostatics, these changes can be viewed as a reorganization of local charges and dipoles in response to changes in the electrostatic field when the cause is the insertion or deletion of a charged amino acid. Here we report a large-scale investigation that models the changes in folding energy due to single mutations involving a charged group. This allows the changes in folding energy to be considered mostly electrostatic in origin and to be calculated with DelPhi by assigning a residue-specific value of the internal dielectric constant of the protein. The predicted energy changes are benchmarked against experimentally measured changes in folding energy on a set of 257 single mutations. The best fit between experimental values and predicted changes is used to find the effective value of the internal dielectric constant for each type of amino acid. The predicted folding free energy changes obtained with the optimal, amino acid-specific dielectric constants are within RMSD = 0.86 kcal/mol of the experimentally measured changes.
DelPhi; protein electrostatics; dielectric constant; Poisson-Boltzmann equation; protein flexibility; energy calculations; single point mutations
There has been a consistent concern about the inadvertent disclosure of personal information through peer-to-peer file sharing applications, such as Limewire and Morpheus. Examples of personal health and financial information being exposed have been published. We wanted to estimate the extent to which personal health information (PHI) is being disclosed in this way, and compare that to the extent of disclosure of personal financial information (PFI).
After careful review and approval of our protocol by our institutional research ethics board, files were downloaded from peer-to-peer file sharing networks and manually analyzed for the presence of PHI and PFI. The geographic region of the IP addresses was determined, and classified as either USA or Canada.
We estimated the proportion of files that contain personal health and financial information for each region. We also estimated the proportion of search terms that return files with personal health and financial information. We identified and discuss the ethical issues related to this study.
Approximately 0.4% of Canadian IP addresses had PHI, as did 0.5% of US IP addresses. There was more disclosure of financial information, at 1.7% of Canadian IP addresses and 4.7% of US IP addresses. An analysis of search terms used in these file sharing networks showed that a small percentage of the terms would return PHI and PFI files (ie, there are people successfully searching for PFI and PHI on the peer-to-peer file sharing networks).
There is a real risk of inadvertent disclosure of PHI through peer-to-peer file sharing networks, although the risk is not as large as for PFI. Anyone keeping PHI on their computers should avoid installing file sharing applications on their computers, or if they have to use such tools, actively manage the risks of inadvertent disclosure of their, their family's, their clients', or patients' PHI.
Users of peer-to-peer (P2P) file-sharing networks risk the inadvertent disclosure of personal health information (PHI). In addition to potentially causing harm to the affected individuals, this can heighten the risk of data breaches for health information custodians. Automated PHI detection tools that crawl the P2P networks can identify PHI and alert custodians. While there has been previous work on the detection of personal information in electronic health records, there has been a dearth of research on the automated detection of PHI in heterogeneous user files.
To build a system that accurately detects PHI in files sent through P2P file-sharing networks. The system, which we call P2P Watch, uses a pipeline of text processing techniques to automatically detect PHI in files exchanged through P2P networks. P2P Watch processes unstructured texts regardless of the file format, document type, and content.
We developed P2P Watch to extract and analyze PHI in text files exchanged on P2P networks. We labeled texts as PHI if they contained identifiable information about a person (eg, name and date of birth) and specifics of the person’s health (eg, diagnosis, prescriptions, and medical procedures). We evaluated the system’s performance through its efficiency and effectiveness on 3924 files gathered from three P2P networks.
P2P Watch successfully processed 3924 P2P files of unknown content. A manual examination of 1578 randomly selected files marked by the system as non-PHI confirmed that these files indeed did not contain PHI, making the false-negative detection rate equal to zero. Of 57 files marked by the system as PHI, all contained both personally identifiable information and health information: 11 files were PHI disclosures, and 46 files contained organizational materials such as unfilled insurance forms, job applications by medical professionals, and essays.
PHI can be successfully detected in free-form textual files exchanged through P2P networks. Once the files with PHI are detected, affected individuals or data custodians can be alerted to take remedial action.
Privacy; personal health information; natural language processing; text data mining
Calculating the evolution of an open quantum system, i.e., a system in contact with a thermal environment, has presented a theoretical and computational challenge for many years. With the advent of supercomputers containing large amounts of memory and many processors, the computational challenge posed by previously intractable theoretical models can now be addressed. The hierarchy equations of motion present one such model and offer a powerful method that has remained underutilized so far because of its considerable computational expense. By exploiting concurrent processing on parallel computers, the hierarchy equations of motion can be applied to biological-scale systems. Herein we introduce the quantum dynamics software PHI, which solves the hierarchy equations of motion. We describe the integrator employed by PHI and demonstrate PHI’s scaling and efficiency on large parallel computers by applying the software to the calculation of inter-complex excitation transfer between light harvesting complexes 1 and 2 of purple photosynthetic bacteria, a 50-pigment system.
Over the past decade, nanopores have rapidly emerged as stochastic biosensors. This protocol describes the cloning, expression, and purification of the channel of bacteriophage phi29 DNA packaging nanomotor and its subsequent incorporation into lipid membranes for single-pore sensing of dsDNA and chemicals. The membrane-embedded phi29 nanochannels remain functional and structurally intact under a range of conditions. When ions and macromolecules translocate through these nanochannels, reliable fingerprint changes in conductance are observed. Compared with other well studied biological pores, the phi29 nanochannel has a larger cross-sectional area, which enables the translocation of dsDNA. Furthermore, specific amino acids can be introduced by site-directed mutagenesis within the large cavity of the channel to conjugate receptors that are able to bind specific ligands or analytes for desired applications. The lipid membrane embedded nanochannel system has immense potential nanotechnological and biomedical applications in bioreactors, environmental sensing, drug monitoring, controlled drug delivery, early disease diagnosis, and high-throughput DNA sequencing. The total time required for completing one round of this protocol is around one month.
bacteriophage phi29; connector; nanopore; liposomes; ion channel; single channel conductance; single pore; DNA packaging motor; membrane channel; bionanotechnology; nanobiotechnology; high throughput DNA sequencing
In recent years, striking discoveries have revealed that two-dimensional electron liquids (2DEL) confined at the interface between oxide band-insulators can be engineered to display high-mobility transport. The observation that only a few interfaces appear suitable for hosting a 2DEL is intriguing and challenges the understanding of these emergent properties, which do not exist in the bulk. Indeed, only the neutral TiO2 surface of (001)SrTiO3 has been shown to sustain a 2DEL. We show that this restriction can be overcome: (110) and (111) surfaces of SrTiO3 interfaced with epitaxial LaAlO3 layers, above a critical thickness, display 2DEL transport with mobilities similar to those of (001)SrTiO3. Moreover, we show that epitaxial interfaces are not a prerequisite: conducting (110) interfaces with amorphous LaAlO3 and other oxides can also be prepared. These findings open a new perspective both for materials research and for elucidating the ultimate microscopic mechanism of carrier doping.
Clinical records contain significant medical information that can be useful to researchers in various disciplines. However, these records also contain personal health information (PHI) whose presence limits the use of the records outside of hospitals.
The goal of de-identification is to remove all PHI from clinical records. This is a challenging task because many records contain foreign and misspelled PHI; they also contain PHI that are ambiguous with non-PHI. These complications are compounded by the linguistic characteristics of clinical records. For example, medical discharge summaries, which are studied in this paper, are characterized by fragmented, incomplete utterances and domain-specific language; they cannot be fully processed by tools designed for lay language.
Methods and Results
In this paper, we show that we can de-identify medical discharge summaries using a de-identifier, Stat De-id, based on support vector machines and local context (F-measure = 97% on PHI). Our representation of local context aids de-identification even when PHI include out-of-vocabulary words and even when PHI are ambiguous with non-PHI within the same corpus. Comparison of Stat De-id with a rule-based approach shows that local context contributes more to de-identification than dictionaries combined with hand-tailored heuristics (F-measure = 85%). Comparison with two well-known named entity recognition (NER) systems, SNoW (F-measure = 94%) and IdentiFinder (F-measure = 36%), on five representative corpora shows that when the language of documents is fragmented, a system with a relatively thorough representation of local context can be a more effective de-identifier than systems that combine (relatively simpler) local context with global context. Comparison with a Conditional Random Field De-identifier (CRFD), which uses global context in addition to the local context of Stat De-id, confirms this finding (F-measure = 88%) and establishes that strengthening the representation of local context may be more beneficial for de-identification than complementing local with global context.
automatic de-identification of narrative patient records; local lexical context; local syntactic context; dictionaries; sentential global context; syntactic information for de-identification
Since it was first defined in 1995, Public Health Informatics (PHI) has become a recognized discipline, with a research agenda, defined domain-specific competencies and a specialized corpus of technical knowledge. Information systems form a cornerstone of PHI research and implementation, representing significant progress for the nascent field. However, PHI does not advocate or incorporate standard, domain-appropriate design methods for implementing public health information systems. Reusable design is generalized design advice that can be reused in a range of similar contexts. We propose that PHI create and reuse information design knowledge by taking a systems approach that incorporates design methods from the disciplines of Human-Computer Interaction, Interaction Design and other related disciplines.
Although PHI operates in a domain with unique characteristics, many design problems in public health correspond to classic design problems, suggesting that existing design methods and solution approaches are applicable to the design of public health information systems. Among the numerous methodological frameworks used in other disciplines, we identify scenario-based design and participatory design as two widely-employed methodologies that are appropriate for adoption as PHI standards. We make the case that these methods show promise to create reusable design knowledge in PHI.
We propose the formalization of a set of standard design methods within PHI that can be used to pursue a strategy of design knowledge creation and reuse for cost-effective, interoperable public health information systems. We suggest that all public health informaticians should be able to use these design methods and the methods should be incorporated into PHI training.
SNUFER is software for the automatic localization of single nucleotide polymorphisms (SNPs) and the generation of tables used to present them. After input of a FASTA file containing the sequences to be analyzed, a multiple sequence alignment is generated using ClustalW run inside SNUFER. The ClustalW output file is then used to generate a table that displays the SNPs detected in the aligned sequences and their degree of similarity. This table can be exported to Microsoft Word, Microsoft Excel, or a single text file, permitting further editing for publication. The software was written in Delphi 7, with FireBird 2.0 used for sequence database management. It is freely available for noncommercial use and can be
software; single nucleotide polymorphisms; multiple sequence alignment; ClustalW
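The SNP-detection step described above (scanning an existing multiple sequence alignment for variable columns) can be sketched as follows. This is not SNUFER's Delphi code; the function names, the treatment of gaps, and the identity measure are assumptions made for illustration, and the alignment itself is taken as given rather than computed with ClustalW.

```python
def find_snps(aligned):
    """Return (1-based column, {name: base}) for every column that varies.

    `aligned` maps sequence names to equal-length aligned strings. Gap
    characters ('-') are ignored when deciding whether a column varies
    (an assumption; a real tool may report indels separately).
    """
    names = list(aligned)
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    snps = []
    for col in range(lengths.pop()):
        column = {name: aligned[name][col].upper() for name in names}
        bases = {base for base in column.values() if base != "-"}
        if len(bases) > 1:
            snps.append((col + 1, column))
    return snps


def percent_identity(a, b):
    """Percentage of gap-free aligned columns at which two sequences agree."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from the paper.
    alignment = {"s1": "ACGTACGT", "s2": "ACGAACGT", "s3": "ACGTAC-T"}
    for position, column in find_snps(alignment):
        print(position, column)
    print(percent_identity(alignment["s1"], alignment["s2"]))
```

Each reported column corresponds to one row of an SNP table, and the pairwise identity gives one simple way to express the degree of similarity between two aligned sequences.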
The Adaptive Poisson-Boltzmann Solver (APBS) is a state-of-the-art suite for performing Poisson-Boltzmann electrostatic calculations on biomolecules. The iAPBS package provides a modular programmatic interface to the APBS library of electrostatic calculation routines. The iAPBS interface library can be linked with a FORTRAN or C/C++ program, thus making all of the APBS functionality available from within the application. Several application modules for popular molecular dynamics simulation packages (Amber, NAMD, and CHARMM) are distributed with iAPBS, allowing users of these packages to perform implicit solvent electrostatic calculations with APBS.
Coupling nucleic acid processing enzymes to nanoscale pores allows controlled movement of individual DNA or RNA strands that is reported as an ionic current time series. Hundreds of individual enzyme complexes can be examined in single-file order at high bandwidth and spatial resolution. The bacteriophage phi29 DNA polymerase (phi29 DNAP) is an attractive candidate for this technology, due to its remarkable processivity and high affinity for DNA substrates. Here we show that phi29 DNAP-DNA complexes are stable when captured in an electric field across the α-hemolysin nanopore. DNA substrates were activated for replication at the nanopore orifice by exploiting the 3′-5′ exonuclease activity of wild-type phi29 DNAP to excise a 3′-H terminal residue, yielding a primer strand 3′-OH. In the presence of deoxynucleoside triphosphates, DNA synthesis was initiated, allowing real time detection of numerous sequential nucleotide additions that was limited only by DNA template length. Translocation of phi29 DNAP along DNA substrates was observed in real time at Angstrom scale precision as the template strand was drawn through the nanopore lumen during replication.
Peptide histidine isoleucine (PHI) and vasoactive intestinal peptide (VIP) are neuropeptides synthesized from a common precursor, prepro-VIP, and share structural similarity and biological functions in many systems. Within the central nervous system and peripheral tissues, PHI and VIP have overlapping distribution. PHI-mediated functions are generally via activation of VIP receptors; however, the potency and affinity of PHI for VIP receptors are significantly lower than VIP. In addition, several studies suggest distinct PHI receptors that are independent of VIP receptors. PHI receptors have been cloned and characterized in fish, but their existence in mammals is still unknown. This study focuses on the functional role of PHI in the thalamus because of the localization of both PHI and VIP receptors in this brain region.
Using extracellular multiple-unit recording techniques, we found that PHI strongly attenuated the slow intrathalamic rhythmic activity. Using intracellular recording techniques, we found that PHI selectively depolarized thalamic relay neurons via an enhancement of the hyperpolarization-activated mixed cation current, Ih. Further, the actions of PHI were occluded by VIP and dopamine, indicating that these modulators converge onto a common mechanism. In contrast to previous work, we found that PHI was more potent than VIP in producing excitatory actions on thalamic neurons. We next used transgenic mice lacking a specific VIP receptor, VPAC2, to identify its possible role in PHI-mediated actions in the thalamus. PHI depolarized all relay neurons tested from wild-type mice (VPAC2+/+); however, in knockout mice (VPAC2-/-), PHI produced no change in membrane potential in any of the neurons tested. Our findings indicate that the excitatory actions of PHI are mediated by VPAC2 receptors rather than by distinct PHI receptors, and that these excitatory actions clearly attenuate intrathalamic rhythmic activities and likely influence information transfer through thalamocortical circuits.
PHI; VIP; thalamus; vasoactive intestinal peptide; thalamocortical; electrophysiology; epilepsy
The intracellular presence of a recombinant plasmid containing the intercistronic region between the genes H and A of bacteriophage phi X174 strongly inhibits the conversion of infecting single-stranded phi X DNA to parental replicative-form DNA. Also, transfection with single-stranded or double-stranded phi X174 DNA of spheroplasts from a strain containing such a "reduction" plasmid shows a strong decrease in phage yield. This phenomenon, the phi X reduction effect, was studied in more detail by using the phi X174 packaging system, by which plasmid DNA strands that contain the phi X(+) origin of replication were packaged as single-stranded DNA into phi X phage coats. These "plasmid particles" can transduce phi X-sensitive host cells to the antibiotic resistance coded for by the vector part of the plasmid. The phi X reduction sequence in the resident plasmid strongly affected the efficiency of the transduction process, but only when the transducing plasmid depended on primosome-mediated initiation of DNA synthesis for its conversion to double-stranded DNA. The combination of these results led to a model for the reduction effect in which the phi X reduction sequence interacted with an intracellular component that was present in limiting amounts and that specified the site at which phi X174 replicative-form DNA replication takes place. The phi X reduction sequence functioned as a viral incompatibility element in a way similar to the membrane attachment site model for plasmid incompatibility. In the DNA of bacteriophage G4, a sequence with a similar biological effect on infecting phages was identified. This reduction sequence not only inhibited phage G4 propagation, but also phi X174 infection.
Electrostatic forces are one of the primary determinants of molecular interactions. They help guide the folding of proteins, increase the binding of one protein to another and facilitate protein-DNA and protein-ligand binding. A popular method for computing the electrostatic properties of biological systems is to numerically solve the Poisson-Boltzmann (PB) equation, and there are several easy-to-use software packages available that solve the PB equation for soluble proteins. Here we present a freely available program, called APBSmem, for carrying out these calculations in the presence of a membrane. The Adaptive Poisson-Boltzmann Solver (APBS) is used as a back-end for solving the PB equation, and a Java-based graphical user interface (GUI) coordinates a set of routines that introduce the influence of the membrane, determine its placement relative to the protein, and set the membrane potential. The software Jmol is embedded in the GUI to visualize the protein inserted in the membrane before the calculation and the electrostatic potential after completing the computation. We expect that the ease with which the GUI allows one to carry out these calculations will make this software a useful resource for experimenters and computational researchers alike. Three examples of membrane protein electrostatic calculations are carried out to illustrate how to use APBSmem and to highlight the different quantities of interest that can be calculated.
We show that a pH-sensitive derivative of the green fluorescent protein, designated ratiometric GFP, can be used to measure intracellular pH (pHi) in both gram-positive and gram-negative bacterial cells. In cells expressing ratiometric GFP, the excitation ratio (fluorescence intensity at 410 and 430 nm) is correlated to the pHi, allowing fast and noninvasive determination of pHi that is ideally suited for direct analysis of individual bacterial cells present in complex environments.
Objective. To identify pharmacy faculty members’ perceptions of psychological contract breaches that can be used to guide improvements in faculty recruitment, retention, and development.
Methods. A list of psychological contract breaches was developed using a Delphi procedure involving a panel of experts assembled through purposive sampling. The Delphi consisted of 4 rounds, the first of which elicited examples of psychological contract breaches in an open-ended format. The ensuing 3 rounds consisted of a survey and anonymous feedback on aggregated group responses.
Results. Usable responses were obtained from 11 of 12 faculty members who completed the Delphi procedure. The final list of psychological contract breaches included 27 items, after modifications based on participant feedback in subsequent rounds.
Conclusion. The psychological contract breach items generated in this study provide guidance for colleges and schools of pharmacy regarding important aspects of faculty recruitment, retention, and development.
psychological contract breach; Delphi; faculty recruitment; faculty retention; faculty development
Primary CNS lymphoma (PCNSL) is an aggressive lymphoma but clinically validated biologic markers that can predict natural history to tailor treatment according to risk are lacking. Several genetic changes including BCL6 rearrangements and deletion of 6q22, containing the putative tumor suppressor gene PTPRK, are potential risk predictors. Herein we determined the prevalence and survival impact of del(6)(q22) and BCL6, immunoglobulin heavy chain (IGH), and MYC gene rearrangements in a large PCNSL cohort treated in a single center.
Patients and Methods
Interphase fluorescence in situ hybridization was performed using two-color probes for BCL6, MYC, IGH-BCL6, and del(6)(q22) on thin sections of 75 paraffin-embedded samples from 75 HIV-negative, immunocompetent patients newly diagnosed with PCNSL. Survival data were analyzed using Kaplan-Meier survival curves, log-rank tests, and proportional hazards regression adjusting for age, deep structure involvement, and high-dose methotrexate (HDMTX) treatment.
The prevalence of del(6)(q22) and BCL6, IGH, and MYC translocations was 45%, 17%, 13%, and 3%, respectively. The presence of del(6)(q22) and/or a BCL6 translocation was associated with inferior overall survival (OS; P = .0097). The presence of either del(6)(q22) alone or a BCL6 translocation alone was also associated with inferior OS (P = .0087). Univariable results held after adjusting for age, deep structure involvement, and HDMTX.
Del(6)(q22) and BCL6 rearrangements are common in PCNSL and predict decreased OS independent of deep structure involvement and HDMTX. Unlike in systemic diffuse large B-cell lymphoma, del(6)(q22) is common and IGH translocations are infrequent and usually involve BCL6 rather than BCL2, suggesting a distinct pathogenesis.
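For readers unfamiliar with the Kaplan-Meier survival curves mentioned in the methods above, the product-limit estimate can be sketched in a few lines. This is a minimal illustration, not the analysis code used in the study; the function name and the follow-up data in the usage example are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with an observed event.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up times (months) and event indicators.
    for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(t, round(s, 3))
```

Censored patients leave the risk set without lowering the curve, which is what distinguishes this estimate from a simple fraction of patients still alive. Comparing two such curves (for example, patients with and without a given rearrangement) is what a log-rank test formalizes.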
An adaptive Cartesian grid (ACG) concept is presented for the fast and robust numerical solution of the 3D Poisson-Boltzmann Equation (PBE) governing the electrostatic interactions of large-scale biomolecules and highly charged multi-biomolecular assemblies such as ribosomes and viruses. The ACG offers numerous advantages over competing grid topologies such as regular 3D lattices and unstructured grids. For very large biological molecules and multi-biomolecule assemblies, the total number of grid-points is several orders of magnitude less than that required in a conventional lattice grid used in the current PBE solvers thus allowing the end user to obtain accurate and stable nonlinear PBE solutions on a desktop computer. Compared to tetrahedral-based unstructured grids, ACG offers a simpler hierarchical grid structure, which is naturally suited to multigrid, relieves indirect addressing requirements and uses fewer neighboring nodes in the finite difference stencils. Construction of the ACG and determination of the dielectric/ionic maps are straightforward, fast and require minimal user intervention. Charge singularities are eliminated by reformulating the problem to produce the reaction field potential in the molecular interior and the total electrostatic potential in the exterior ionic solvent region. This approach minimizes grid-dependency and alleviates the need for fine grid spacing near atomic charge sites. The technical portion of this paper contains three parts. First, the ACG and its construction for general biomolecular geometries are described. Next, a discrete approximation to the PBE upon this mesh is derived. Finally, the overall solution procedure and multigrid implementation are summarized. 
Results obtained with the ACG-based PBE solver are presented for: (i) a low dielectric spherical cavity, containing interior point charges, embedded in a high dielectric ionic solvent – analytical solutions are available for this case, thus allowing rigorous assessment of the solution accuracy; (ii) a pair of low dielectric charged spheres embedded in an ionic solvent to compute electrostatic interaction free energies as a function of the distance between sphere centers; (iii) surface potentials of proteins, nucleic acids and their larger-scale assemblies such as ribosomes; and (iv) electrostatic solvation free energies and their salt sensitivities – obtained with both the linear and nonlinear Poisson-Boltzmann equations – for a large set of proteins. These latter results, along with timings, can serve as benchmarks for comparing the performance of different PBE solvers.
Poisson-Boltzmann equation; biomolecular electrostatics; implicit solvent model; algorithm; finite difference methods; Cartesian grid; adaptive; electrostatic potential
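As an illustration of the kind of analytical reference mentioned for test case (i), the simplest special case is a single point charge at the center of a spherical cavity with no salt in the solvent, for which the Born expression gives the solvation free energy in closed form. The sketch below covers only that special case and is not taken from the paper: the function name, the units, and the approximate Coulomb constant are assumptions, and off-center charges or nonzero ionic strength require more general solutions.

```python
# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2) (assumed value).
COULOMB_KCAL = 332.06


def born_solvation_energy(charge, radius, eps_in, eps_out):
    """Born energy (kcal/mol) of a centered point charge in a spherical cavity.

    charge  : charge in units of the elementary charge
    radius  : cavity radius in Angstrom
    eps_in  : dielectric constant inside the cavity
    eps_out : dielectric constant of the surrounding solvent
    Assumes zero ionic strength and a charge exactly at the sphere center.
    """
    if radius <= 0 or eps_in <= 0 or eps_out <= 0:
        raise ValueError("radius and dielectric constants must be positive")
    return 0.5 * COULOMB_KCAL * charge ** 2 / radius * (1.0 / eps_out - 1.0 / eps_in)


if __name__ == "__main__":
    # Unit charge in a 2 Angstrom cavity, transferred from eps = 1 to eps = 80.
    print(born_solvation_energy(1.0, 2.0, 1.0, 80.0))
```

A numerical solver can be checked against this value by computing the reaction field energy of the same sphere and comparing the two as the grid is refined.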
The anonymization of medical records is of great importance in the human life sciences because de-identified text can be made publicly available to non-hospital researchers as well, facilitating research on human diseases. Here the authors have developed a de-identification model that can successfully remove personal health information (PHI) from discharge records to make them conform to the guidelines of the Health Insurance Portability and Accountability Act.
We introduce here a novel, machine learning-based iterative Named Entity Recognition approach intended for use on semi-structured documents like discharge records. Our method identifies PHI in several steps. First, it labels all entities whose tags can be inferred from the structure of the text and it then utilizes this information to find further PHI phrases in the flow text parts of the document.
Following the standard evaluation method of the first Workshop on Challenges in Natural Language Processing for Clinical Data, we used token-level Precision, Recall and Fβ=1 measure metrics for evaluation.
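The token-level metrics named above can be computed as in the following sketch. It is a simplified illustration rather than the challenge's scoring script: it assumes one binary label per token (a hypothetical "PHI" tag versus anything else), and the function name and example labels are made up for demonstration.

```python
def token_prf(gold, predicted, positive="PHI", beta=1.0):
    """Token-level precision, recall and F-beta for one positive label.

    gold, predicted : equal-length sequences of per-token labels
    positive        : label counted as PHI (assumed binary setting)
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = sum(g == positive and p == positive for g, p in zip(gold, predicted))
    fp = sum(g != positive and p == positive for g, p in zip(gold, predicted))
    fn = sum(g == positive and p != positive for g, p in zip(gold, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_beta = (1.0 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_beta


if __name__ == "__main__":
    # Hypothetical token labels: "O" marks a non-PHI token.
    gold = ["PHI", "O", "PHI", "PHI", "O", "O"]
    pred = ["PHI", "PHI", "O", "PHI", "O", "O"]
    print(token_prf(gold, pred))
```

With β = 1 the F-measure is the harmonic mean of precision and recall, so a system must do well on both to score highly.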
Our system achieved outstanding accuracy on the standard evaluation dataset of the de-identification challenge, with an F measure of 99.7534% for the best submitted model.
Our system is competitive with current state-of-the-art solutions, and we describe several techniques that can be beneficial in other tasks that need to handle structured documents such as clinical records.
The bacteriophage phi29 DNA packaging motor is a protein/RNA complex that can produce strong force to condense the linear-double stranded DNA genome into a pre-formed protein capsid. The RNA component, called the packaging RNA (pRNA), utilizes magnesium-dependent intermolecular base-pairing interactions to form ring-shaped complexes. The pRNA is a class of non-coding RNA, interacting with phi29 motor proteins to enable DNA packaging. Here, we report a 2-piece chimeric pRNA construct that is fully competent in interacting with partner pRNA to form ring-shaped complexes, in packaging DNA via the motor, and in assembling infectious phi29 virions in vitro. This is the first example of a fully functional pRNA assembled using two non-covalently interacting fragments. The results support the notion of modular pRNA architecture in the phi29 packaging motor.
Packaging RNA; packaging motor; DNA packaging; bacteriophage; phi29; RNA assembly; non-coding RNA
Electronic clinical documentation can be useful for activities such as public health surveillance, quality improvement, and research, but existing methods of de-identification may not provide sufficient protection of patient data. The general-purpose natural language processor MedLEE retains medical concepts while excluding the remaining text, so in addition to processing text into structured data, it may be able to provide a secondary benefit of de-identification. Without modifying the system, the authors tested the ability of MedLEE to remove protected health information (PHI) by comparing 100 outpatient clinical notes with the corresponding XML-tagged output. Of 809 instances of PHI, 26 (3.2%) were detected in the output as a result of processing and identification errors. However, PHI in the output was highly transformed, with much of it appearing as normalized terms for medical concepts, potentially making re-identification more difficult. The MedLEE processor may be a good enhancement to other de-identification systems, both removing PHI and providing coded data from clinical text.
Mutation sites that arise in human mitochondrial DNA as a result of oxidation by a rhodium photooxidant have been identified. HeLa cells were incubated with [Rh(phi)2bpy]Cl3 (phi = 9,10-phenanthrenequinone diimine), an intercalating photooxidant, to allow the complex to enter the cell and bind mitochondrial DNA. Photoexcitation of DNA-bound [Rh(phi)2bpy]3+ can promote the oxidation of guanine from a distance through DNA-mediated charge transport. After two rounds of photolysis and growth of cells incubated with the rhodium complex, DNA mutations in a portion of the mitochondrial genome were assessed via manual sequencing. The mutational pattern is consistent with dG to dT transversions in the repetitive guanine tracts. Significantly, the mutational pattern found overlaps oxidative damage hot spots seen previously. These mutations are found within conserved sequence block II, a critical regulatory element involved in DNA replication, and these have been identified as sites of low oxidation potential to which oxidative damage is funneled. Based upon this mutational analysis and its correspondence to sites of long range oxidative damage, we infer a critical role for DNA charge transport in generating these mutations and, thus, in regulating mitochondrial DNA replication under oxidative stress.